html - Extracting JavaScript using Web::Scraper


I'm having trouble extracting JavaScript using Web::Scraper. Below is my test script:

    #!/usr/bin/perl
    use Modern::Perl;
    use Web::Scraper;
    use Data::Dumper;

    # Slurp everything after __DATA__ into one string
    my $contents = do { local $/; <DATA> };
    my $scraper  = scraper { process "//script", "scripts[]" => 'TEXT'; };
    my $res      = $scraper->scrape($contents);

    say Dumper $res;

    exit;

    __DATA__
    <html><head><title>hello</title></head>
    <body>
      <script type="text/javascript">
        var dummy = {}
      </script>
    </body>
    </html>

And the output:

    $VAR1 = {
              'scripts' => [
                             ''
                           ]
            };

It seems to me that I'm finding the script tag but not saving the contents between the tags.

I found a solution after digging into XPath a bit.

Changing the scraper line from:

    my $scraper = scraper { process "//script", "scripts[]" => 'TEXT'; };

to:

    my $scraper = scraper {
        process "//script" => 'scripts[]' => scraper {
            process '//text()', 'script' => 'TEXT';
        };
    };

outputs the JavaScript code:

    $VAR1 = {
              'scripts' => [
                             {
                               'script' => '     var dummy = {}   '
                             }
                           ]
            };

I'm not convinced the process line is as concise as it could be, but it works.
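A possibly shorter variant, offered as a sketch rather than a verified answer: as far as I know, Web::Scraper also accepts a code reference as the value, and calls it with each matched element. The sketch assumes the default HTML::TreeBuilder::XPath backend, where that element is an HTML::Element and the script source is stored as plain-string children. The likely reason 'TEXT' comes back empty is that HTML::Element's as_text skips the content of script and style elements. The variable name $html and the heredoc are just for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Web::Scraper;
use Data::Dumper;

# Same test document as in the question, inlined as a heredoc.
my $html = <<'HTML';
<html><head><title>hello</title></head>
<body>
  <script type="text/javascript">
    var dummy = {}
  </script>
</body>
</html>
HTML

my $scraper = scraper {
    # The callback gets each matched <script> element as its first argument.
    process '//script', 'scripts[]' => sub {
        my $elem = shift;
        # content_list returns child elements (refs) and text (plain strings);
        # keep only the strings, which hold the script source.
        # Assumption: the element is an HTML::Element (default backend).
        return join '', grep { !ref } $elem->content_list;
    };
};

my $res = $scraper->scrape($html);
print Dumper($res);
```

With this, scripts should be a flat list of strings instead of a list of hashes, so there is no inner 'script' key to unwrap.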

