Weibocrawler

A Distributed Weibo Crawler supported by MongoDB

Welcome to WeiboCrawler GitHub Pages.

This is a web crawler for Sina Weibo that simulates a user login in order to fetch information from inside Weibo.

$ cd your_path
$ git clone https://github.com/charliemorning/weibocrawler.git

Requirements

Python 2.7
mongoengine >= 0.7 (a MongoDB ORM for Python)
MongoDB >= 2.0
BeautifulSoup
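
To check whether the Python-side dependencies listed above are installed, a small script along these lines may help. It is only a sketch and is not part of the project: the helper names missing_modules and python_version_ok are hypothetical, and the import names are assumptions (BeautifulSoup 3 is imported as "BeautifulSoup" and BeautifulSoup 4 as "bs4", and this page does not say which one the crawler uses). It does not check package versions or the MongoDB server version.

```python
import importlib
import sys


def missing_modules(names):
    """Return the subset of `names` that cannot be imported."""
    missing = []
    for name in names:
        try:
            importlib.import_module(name)
        except ImportError:
            missing.append(name)
    return missing


def python_version_ok(version_info=sys.version_info):
    """Return True when the interpreter is Python 2.7, as listed above."""
    return tuple(version_info[:2]) == (2, 7)


if __name__ == "__main__":
    # Assumed import names; adjust them to whatever the project imports.
    problems = missing_modules(["mongoengine"])
    # Accept either BeautifulSoup 4 ("bs4") or BeautifulSoup 3 ("BeautifulSoup").
    if missing_modules(["bs4"]) and missing_modules(["BeautifulSoup"]):
        problems.append("BeautifulSoup")
    if not python_version_ok():
        print("Warning: expected Python 2.7, found %d.%d" % sys.version_info[:2])
    print("Missing modules: %s" % (", ".join(problems) or "none"))
```

Run it with the same interpreter you plan to use for the crawler; any module it reports as missing can then be installed before the first run.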

Recommendation

PyCharm is a good IDE for general Python projects and Django projects, and we recommend it for working on this project. A 30-day free trial is available from the PyCharm website.

Authors and Contributors

Code that extracts information from web pages goes out of date quickly, because the pages' markup changes frequently. We therefore welcome contributions that update the parsing methods when ours stop working. You can contact me at any time at charliegithub@gmail.com; I will reply as soon as possible.