Fetch HTML that is loaded dynamically (Python)

May 15, 2012

I am writing a crawler in Python that must extract the links to the PDFs listed on this page:

http://www.peekyou.com/barack_obama

(Scroll down; there is a "Documents" section with links to PDFs.)

The problem is that the "Documents" section is loaded in the background, after a few seconds, by JavaScript. As a result, the function I am using to fetch the HTML of the page does not fetch that section.
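To make the symptom concrete, here is a minimal sketch of why a plain fetch comes back without the links. It uses only the Python 3 standard library (not the Python 2 `urllib2` code in question), and both markup samples are invented for illustration; they are not PeekYou's actual HTML. The names `PdfLinkParser` and `extract_pdf_links` are likewise hypothetical.

```python
from html.parser import HTMLParser


class PdfLinkParser(HTMLParser):
    """Collect href values of <a> tags that point at .pdf files."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            # Ignore any query string when checking the file extension.
            if name == "href" and value and value.split("?")[0].lower().endswith(".pdf"):
                self.links.append(value)


def extract_pdf_links(html):
    """Return every PDF link found in the given HTML string."""
    parser = PdfLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical markup: what the server sends before any JavaScript runs.
STATIC_HTML = """
<html><body>
  <div id="documents"></div>
  <script src="load_documents.js"></script>
</body></html>
"""

# Hypothetical markup: the same section after a browser has run the script.
RENDERED_HTML = """
<html><body>
  <div id="documents">
    <a href="http://example.com/report.pdf">Report</a>
    <a href="/files/speech.PDF?dl=1">Speech</a>
    <a href="/about.html">About</a>
  </div>
</body></html>
"""

static_links = extract_pdf_links(STATIC_HTML)      # placeholder only, no links
rendered_links = extract_pdf_links(RENDERED_HTML)  # the two PDF links
print(static_links)
print(rendered_links)
```

A crawler that only downloads the initial response sees something like the first sample, so no parser can find links that are not there yet.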

To fetch the HTML, I have been given this code:

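        ...
        # Build the request with a randomly chosen User-Agent string
        req = urllib2.Request(url)
        req.add_header('User-Agent', random.choice(listagent))
        page = urllib2.urlopen(req)
        # Only read the body when the response is a text type
        if page.info().getmaintype() == "text":
            html = page.read()
            ...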

As I said, this does not fetch that section.

What is the proper way to deal with this problem? Is there an API I can use? Thank you.
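One common approach, sketched here under assumptions rather than as a confirmed fix for this site: content that appears after a delay usually arrives through a separate background HTTP request, which shows up in the browser's network panel. That URL can often be requested directly with the same header setup as above. In this Python 3 sketch, `BACKGROUND_URL` and `USER_AGENTS` are placeholders, and `build_request` and `fetch_text` are hypothetical helper names; the real endpoint has to be found by inspecting the page.

```python
import random
import urllib.request

# Placeholder values, not taken from the site in question.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64)",
    "Mozilla/5.0 (Windows NT 6.1; WOW64)",
]
BACKGROUND_URL = "http://example.com/ajax/documents?name=barack_obama"


def build_request(url, user_agents=USER_AGENTS):
    """Build a request that resembles the browser's background call."""
    req = urllib.request.Request(url)
    req.add_header("User-Agent", random.choice(user_agents))
    # Some JavaScript libraries (jQuery, for example) add this header
    # to background requests, and some servers check for it.
    req.add_header("X-Requested-With", "XMLHttpRequest")
    return req


def fetch_text(req, opener=urllib.request.urlopen):
    """Send the request and return the decoded response body."""
    page = opener(req)
    try:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = page.headers.get_content_charset() or "utf-8"
        return page.read().decode(charset, errors="replace")
    finally:
        page.close()
```

Calling `fetch_text(build_request(BACKGROUND_URL))` with the real endpoint would return whatever the script receives, typically an HTML fragment or JSON. If the page builds the section entirely in script with no separate request to copy, driving a real browser with a tool such as Selenium is the usual fallback.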
