html - Extracting JavaScript using Web::Scraper

June 15, 2013

I'm having trouble extracting JavaScript using Web::Scraper. Below is a test script:

#!/usr/bin/perl
use Modern::Perl;
use Web::Scraper;
use Data::Dumper;

# Slurp the HTML document from the __DATA__ section below
my $contents = do { local $/; <DATA> };

# Collect the text of every <script> element into an array
my $scraper = scraper { process "//script", "scripts[]" => 'TEXT'; };
my $res = $scraper->scrape($contents);

say Dumper $res;

exit;

__DATA__
<html><head><title>hello</title></head>
<body>
  <script type="text/javascript">
    var dummy = {}
  </script>
</body>
</html>

And the output:

$VAR1 = {
          'scripts' => [
                         ''
                       ]
        };

It seems to me that I'm finding the script tag but not saving the contents between the tags.

I found a solution after digging into XPath a bit.

Changing the scraper line from:

my $scraper = scraper { process "//script", "scripts[]" => 'TEXT'; };

to:

my $scraper = scraper {
    process "//script" => 'scripts[]' => scraper {
        process '//text()', 'script' => 'TEXT';
    };
};

outputs the JavaScript code:

$VAR1 = {
          'scripts' => [
                         {
                           'script' => '     var dummy = {}   '
                         }
                       ]
        };

I'm not convinced the process line is as concise as it could be, but it works.
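
A likely explanation for the empty string: with the default HTML::TreeBuilder backend, 'TEXT' is resolved through HTML::Element's as_text, which is documented to never include text under script or style elements. If that is what is happening here, a shorter variant may be to select the text nodes directly in the XPath expression, so no nested scraper is needed. This is an untested sketch that assumes '//script/text()' behaves the same way at the top level as '//text()' does inside the nested scraper above. It returns a flat list of strings rather than a list of hashes.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Web::Scraper;
use Data::Dumper;

# Slurp the HTML document from the __DATA__ section below
my $contents = do { local $/; <DATA> };

# Assumption: matching the text nodes themselves sidesteps as_text on the
# <script> element, so each match is returned as a plain string.
my $scraper = scraper {
    process '//script/text()', 'scripts[]' => 'TEXT';
};
my $res = $scraper->scrape($contents);

print Dumper $res;

__DATA__
<html><head><title>hello</title></head>
<body>
  <script type="text/javascript">
    var dummy = {}
  </script>
</body>
</html>
```

If the nested hash shape is needed elsewhere, the two-level scraper above remains the safer choice.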
