scrawler: School Course Information Crawler

The name scrawler is a contraction of school + crawler.

scrawler is a course information crawler for universities. Currently supported:

National Chung Hsing University (NCHU)

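After each course update, scrawler automatically imports the data into the database configured in Django (for example MySQL)
and builds every table that cal needs to query.
This removes the complicated manual steps and makes cal easier to maintain.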

The following tables are built:

timetable.models.Course in Django
CourseOfDept in MongoDB
CourseOfTime in MongoDB
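
The names CourseOfDept and CourseOfTime suggest lookup tables keyed by department and by time slot. The following is a minimal sketch of how such indexes could be derived from crawled course records. The record fields (code, dept, slots) and the key format are hypothetical and are not taken from scrawler's actual schema.

```python
from collections import defaultdict

# Hypothetical course records; field names are illustrative only,
# not the schema scrawler actually stores.
courses = [
    {"code": "1001", "dept": "CS", "slots": [(1, 3), (1, 4)]},
    {"code": "1002", "dept": "CS", "slots": [(2, 1)]},
    {"code": "2001", "dept": "EE", "slots": [(1, 3)]},
]


def build_course_of_dept(records):
    """Group course codes by department."""
    index = defaultdict(list)
    for record in records:
        index[record["dept"]].append(record["code"])
    return dict(index)


def build_course_of_time(records):
    """Group course codes by "day-period" slot (string keys suit a document store)."""
    index = defaultdict(list)
    for record in records:
        for day, period in record["slots"]:
            index[f"{day}-{period}"].append(record["code"])
    return dict(index)


course_of_dept = build_course_of_dept(courses)
course_of_time = build_course_of_time(courses)
```

Precomputing both indexes at crawl time means a consumer such as cal can answer "which courses does this department offer" or "which courses meet in this slot" with a single lookup.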

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

OS: Ubuntu or macOS is recommended
Environment: Python 3 is required: sudo apt-get update; sudo apt-get install python3 python3-dev
MongoDB is required:

Ubuntu: see this installation guide
macOS: see this installation guide

lxml dependencies: sudo apt-get install libxml2-dev libxslt-dev
cryptography dependencies: sudo apt-get install libssl-dev libffi-dev

Installing

git clone https://github.com/stufinite/scrawler
Use a virtual environment:
virtualenv venv
To activate it: 1. on Linux: . venv/bin/activate 2. on Windows: venv\Scripts\activate
pip install -r requirements.txt

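Before You Start

Make sure MongoDB is running. The command to start MongoDB is:

Ubuntu: sudo systemctl start mongod.service
macOS: not verified

On line 96 of scrawler/settings.py, update the absolute path of the cal project for your environment. Otherwise the crawler cannot save data to cal's database.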
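
For context, a common way to let a Scrapy project use a Django project's models is to put the Django project on sys.path and point DJANGO_SETTINGS_MODULE at its settings. The sketch below shows that general pattern only. It is not the actual content of scrawler/settings.py, and both the path and the settings module name are hypothetical placeholders.

```python
import os
import sys

# Hypothetical values: replace with the real absolute path of the cal
# project and the settings module its manage.py uses.
CAL_PROJECT_PATH = "/home/user/cal"
CAL_SETTINGS_MODULE = "cal.settings"


def configure_django(project_path, settings_module):
    """Make the Django project importable and select its settings module."""
    if project_path not in sys.path:
        sys.path.append(project_path)
    # setdefault keeps any value already exported in the shell.
    os.environ.setdefault("DJANGO_SETTINGS_MODULE", settings_module)


configure_django(CAL_PROJECT_PATH, CAL_SETTINGS_MODULE)

# Once the path is correct, Django can be initialised so that models such
# as timetable.models.Course can be imported:
#   import django
#   django.setup()
```

If the path is wrong, importing the Django settings fails, which would match the symptom described above of data not reaching cal's database.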

Running & Testing

Run

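scrapy crawl NCHU -a semester=<semester> (1061, 1062, ...): use this for local development

nohup python run.py <semester> (1061, 1062, ...) &: use this when deploying to a server
Because run.py is an infinite loop, simply leave it running in the background.
Expected result: if you see start sleep with no errors above it, the crawl worked.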

  (many lines omitted above)
  ...
 'item_scraped_count': 6,
 'log_count/DEBUG': 14,
 'log_count/INFO': 7,
 'response_received_count': 7,
 'scheduler/dequeued': 6,
 'scheduler/dequeued/memory': 6,
 'scheduler/enqueued': 6,
 'scheduler/enqueued/memory': 6,
 'start_time': datetime.datetime(2017, 2, 6, 13, 48, 44, 579855)}
2017-02-06 21:48:53 [scrapy.core.engine] INFO: Spider closed (finished)
-----------------------------------
start sleep
-----------------------------------
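
The crawl-then-sleep behaviour described above can be sketched as follows. This is an illustrative loop, not the actual contents of run.py: the one-day interval, the function names, and the max_rounds parameter (added so the loop can be exercised in a test) are all assumptions.

```python
import subprocess
import time


def build_command(semester):
    """Build the same crawl command used for local development."""
    return ["scrapy", "crawl", "NCHU", "-a", f"semester={semester}"]


def run_forever(semester, interval_seconds=24 * 60 * 60,
                runner=subprocess.run, sleeper=time.sleep, max_rounds=None):
    """Crawl, print a marker, sleep, and repeat.

    interval_seconds is an assumed value. max_rounds=None loops forever;
    a number stops after that many rounds.
    """
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        runner(build_command(semester), check=False)
        print("-----------------------------------")
        print("start sleep")
        print("-----------------------------------")
        sleeper(interval_seconds)
        rounds += 1


# Example (loops until the process is killed):
#   run_forever("1061")
```

Running the spider in a child process each round avoids restarting Scrapy's reactor inside a single long-lived Python process.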
