This is a community-supported Bobik SDK for web scraping in Python.
You can install the SDK using pip:
pip install bobik_python_sdk
If you want to install the latest development version, use:
pip install -e git+https://github.com/emirkin/bobik_python_sdk#egg=bobik_python_sdk
Here's a quick example to get you started:
from bobik import Bobik

# Replace YOUR_AUTH_TOKEN with your own Bobik auth token
bobik_api = Bobik(YOUR_AUTH_TOKEN, debug=True)

def success_handler(response):
    # Results are nested by site, then by query
    for site in response['results']:
        print('Results for %s' % site)
        for query in response['results'][site]:
            print('\tQuery: %s' % query)
            for result in response['results'][site][query]:
                print('\t\t%s' % result)

def error_handler(error_list):
    for err in error_list:
        print(err)

# Queries may be XPath expressions or JavaScript snippets
query = {
    'urls': ['amazon.com', 'zynga.com', 'http://finance.yahoo.com'],
    'queries': ['//th', '//img/@src', 'return document.title', "return $('script').length"]
}

bobik_api.scrape(query, success_handler, error_handler)
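If you would rather work with flat rows than nested loops, the response can be flattened first. The sketch below is not part of the SDK: `flatten_results` is a hypothetical helper, and `sample_response` is made-up data that only mirrors the site → query → results nesting the success handler walks through. It assumes each query maps to a list of results.

```python
def flatten_results(response):
    """Flatten response['results'] into (site, query, result) tuples.

    Assumes the nesting used by the success handler in the example:
    results -> site -> query -> list of results.
    """
    rows = []
    for site, queries in response.get('results', {}).items():
        for query, results in queries.items():
            for result in results:
                rows.append((site, query, result))
    return rows


# Made-up sample data; a real response comes from bobik_api.scrape().
sample_response = {
    'results': {
        'http://example.com': {
            'return document.title': ['Example Domain'],
            '//img/@src': ['/a.png', '/b.png'],
        }
    }
}

rows = flatten_results(sample_response)
for site, query, result in rows:
    print('%s\t%s\t%s' % (site, query, result))
```

A helper like this could be called from inside your own success handler, for example to write the rows to a CSV file.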
Documentation can be found at http://usebobik.com/sdk/python/index.html.
The docs are generated with Sphinx. To build them, enter the docs folder and run:
$ make html
The documentation will be generated inside the _build/html directory.
Write to [email protected] to become a collaborator.
Submit issues with the SDK here on GitHub.