
Saturday, May 28, 2011

Fetch Web Pages in Python using urllib2

urllib2 is an extensible library for opening URLs. Most of the time we use it to fetch data from URLs over HTTP. It can handle plain websites, HTTP authentication, web proxies, and more. urllib2 is included in the Python standard library.

Example:

import urllib2

# Normal usage: open a URL and read the response body
f = urllib2.urlopen('http://www.google.com')
print f.read()

# Use behind a proxy that requires basic authentication
# (assumes the proxy address itself is picked up from the http_proxy
#  environment variable; otherwise add a urllib2.ProxyHandler as well)
auth_handler = urllib2.ProxyBasicAuthHandler(urllib2.HTTPPasswordMgrWithDefaultRealm())
auth_handler.add_password(realm=None, uri='proxy-name', user='username', passwd='password')
opener = urllib2.build_opener(auth_handler)
urllib2.install_opener(opener)  # every later urlopen() call goes through this opener
f = urllib2.urlopen('http://www.bing.com')
print f.read()
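
urllib2 can also handle sites protected with HTTP Basic Authentication. Here is a minimal sketch; the URL, username, and password below are just placeholders:

import urllib2

# HTTP Basic Authentication (placeholder URL and credentials)
password_mgr = urllib2.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, 'http://example.com/', 'username', 'password')
auth_handler = urllib2.HTTPBasicAuthHandler(password_mgr)
opener = urllib2.build_opener(auth_handler)
f = opener.open('http://example.com/protected/')
print f.read()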

In the example, the requests are sent with GET. If you want to send a POST instead, pass the request body through the data argument.

The basic usage is "urllib2.urlopen(url[, data][, timeout])".
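
For instance, here is a minimal POST sketch; the URL and form fields are made up, and urllib is only used to encode the form data:

import urllib
import urllib2

# Passing a data argument makes urlopen() send a POST request
data = urllib.urlencode({'q': 'python', 'lang': 'en'})  # placeholder form fields
f = urllib2.urlopen('http://example.com/search', data)  # placeholder URL
print f.read()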

More details at: http://docs.python.org/library/urllib2.html
