REST - R, GET and GZ compression


I am building clients onto RESTful APIs. Some links let me download attachments (files) from the server, and in the best case these are .txt files. I only mention the RESTful part since it means that I have to send some headers, and potentially a body, with each request, so the standard R 'filename'=URL logic won't work.

Sometimes people bundle many txt files into a zip. These are awkward since I don't know what they contain until I download many of them.

For the moment, I am unpacking these, gzipping the files (which adds a .gz extension) and re-uploading them. They can then be indexed and downloaded.

I'm using Hadley's cute httr package, but I can't see an elegant way to decompress the gz files.

When using read.csv or similar, any files with a gz ending are automatically decompressed (convenient!). What's the equivalent when using httr or curl?

    content(GET("http://glimmer.rstudio.com/alexbbrown/gz/sample.txt.gz"))
    [1] 1f 8b 08 08 4e 9e 9b 51 00 03 73 ...

That looks nice: a compressed byte stream with the correct header (1f 8b). Now I need the text contents, so I tried using memDecompress, which says it should do this:

    memDecompress(content(GET("http://glimmer.rstudio.com/alexbbrown/gz/sample.txt.gz")), type = "gzip")
    Error in memDecompress(content(GET("http://glimmer.rstudio.com/alexbbrown/gz/sample.txt.gz")),  :
      internal error -3 in memDecompress(2)

What's the proper solution here?

Also, is there a way to get R to pull the index of a remote .zip file without downloading all of it?

You can add a parser to handle the MIME type. Look at ?content and the line "You can add new parsers by adding appropriately named functions to httr:::parsers".

    ls(httr:::parsers)
    #[1] "application/json"                  "application/x-www-form-urlencoded" "image/jpeg"
    #[4] "image/png"                         "text/html"                         "text/plain"
    #[7] "text/xml"
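To illustrate the idea of a registry of parsers keyed by MIME type, here is a minimal sketch that uses an ordinary environment as a stand-in. It does not touch httr; the names `parsers` (a local environment) and `parse_body` are hypothetical and only mimic the lookup-by-name pattern described above.

```r
# A stand-in registry; this is a plain local environment, not httr's internal one.
parsers <- new.env(parent = emptyenv())

# Register a parser under the MIME type it should handle.
assign("text/plain", function(x, ...) rawToChar(x), envir = parsers)

# Hypothetical dispatcher: use the registered parser if there is one,
# otherwise hand back the raw bytes untouched.
parse_body <- function(x, type) {
  if (!exists(type, envir = parsers, inherits = FALSE)) {
    return(x)
  }
  parser <- get(type, envir = parsers)
  parser(x)
}

body <- charToRaw("hello")
parsed <- parse_body(body, "text/plain")                 # parser found
unparsed <- parse_body(body, "application/octet-stream") # no parser registered
```

Registering a function under a new name is all it takes for the dispatcher to start handling that type, which is the same move as calling assign() on the real registry.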

We can add one to handle gz content. I don't have a better answer at this point than the one you gave, but it can be incorporated into the function.

    assign("application/octet-stream", function(x, ...) {
        scan(gzcon(rawConnection(x)), "", , , "\n")
    }, envir = httr:::parsers)

    content(GET("http://glimmer.rstudio.com/alexbbrown/gz/sample.txt.gz"), as = "parsed")
    Read 1 item
    [1] "These are not the droids you are looking for"
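The decompression step on its own can be tried without any network access. The sketch below builds a small gzip byte stream locally and then decompresses it in memory with gzcon() over a rawConnection(). The helper name `gunzip_lines` and the sample text are made up for illustration, and readLines() is used here in place of scan().

```r
# Build a small gzip file locally so the example needs no network access.
src <- tempfile(fileext = ".txt.gz")
con <- gzfile(src, "w")
writeLines(c("first line", "second line"), con)
close(con)

# Read the compressed file back as raw bytes, similar to what a raw
# response body would look like.
gz_bytes <- readBin(src, what = "raw", n = file.size(src))

# Hypothetical helper: decompress a gzip raw vector into a character vector.
gunzip_lines <- function(x) {
  con <- gzcon(rawConnection(x))
  on.exit(close(con))
  readLines(con)
}

lines <- gunzip_lines(gz_bytes)
print(gz_bytes[1:2])  # the gzip magic bytes
print(lines)
```

A function like this could be the body of the parser registered above, so the temporary connection is always closed once the text has been read.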

EDIT: I hacked together an alternative:

    assign("application/octet-stream", function(x, ...) {
        f <- tempfile()
        writeBin(x, f)
        untar(f)
        readLines(f, warn = FALSE)
    }, envir = httr:::parsers)

    content(GET("http://glimmer.rstudio.com/alexbbrown/gz/sample.txt.gz"), as = "parsed")
    #[1] "These are not the droids you are looking for"

With regard to listing the files in an archive, maybe you can adjust the function somewhat. Suppose we try to get the httr source files. They have the MIME type "application/x-gzip":

    assign("application/x-gzip", function(x, ...) {
        f <- tempfile()
        writeBin(x, f)
        if (!is.null(list(...)$list)) {
            if (list(...)$list) {
                # list = TRUE: return the archive index only
                return(untar(f, list = TRUE))
            } else {
                untar(f, ...)
                readLines(f)
            }
        } else {
            untar(f, ...)
            readLines(f)
        }
    }, envir = httr:::parsers)

    content(GET("http://cran.r-project.org/src/contrib/httr_0.2.tar.gz"), as = "parsed", list = TRUE)

    # > head(content(GET("http://cran.r-project.org/src/contrib/httr_0.2.tar.gz"), as = "parsed", list = TRUE))
    #[1] "httr/"                 "httr/MD5"              "httr/tests/"
    #[4] "httr/tests/test-all.R" "httr/README.md"        "httr/R/"
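The listing branch can also be exercised offline. This is a rough sketch, assuming R's internal tar implementation is acceptable (`tar = "internal"`). It builds a tiny .tar.gz in a temporary directory, reads it back as raw bytes, and lists its contents. The helper name `list_tar_gz` and the file names `demo/a.txt` and `demo/b.txt` are invented for the example.

```r
# Create a small directory tree to archive.
work <- tempfile("tarwork")
dir.create(file.path(work, "demo"), recursive = TRUE)
writeLines("alpha", file.path(work, "demo", "a.txt"))
writeLines("beta", file.path(work, "demo", "b.txt"))

# Archive it with R's internal tar so no external tool is needed.
archive <- tempfile(fileext = ".tar.gz")
old <- setwd(work)
tar(archive, files = "demo", compression = "gzip", tar = "internal")
setwd(old)

# Raw bytes of the archive, standing in for a downloaded response body.
tgz_bytes <- readBin(archive, what = "raw", n = file.size(archive))

# Hypothetical helper: write the bytes to a temp file and return the index only.
list_tar_gz <- function(x) {
  f <- tempfile(fileext = ".tar.gz")
  on.exit(unlink(f))
  writeBin(x, f)
  untar(f, list = TRUE, tar = "internal")
}

listing <- list_tar_gz(tgz_bytes)
print(listing)
```

Note that this still needs the whole archive in memory before it can be listed, which is also true of the parser above: it lists the contents after the download, not instead of it.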
