Friday, August 5, 2011

Downloading market data from Stooq to R

Stooq provides a wide range of market data for free in CSV format. I often use it in my analyses in R.

However, there was a small problem with downloading market data from Stooq automatically.

The CSV files are generated dynamically, so no static URL is available. The name of the instrument to download is passed in a cookie.
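
To make the cookie mechanism concrete, here is a minimal sketch of how such a cookie value could be assembled. The helper name buildStooqCookie is hypothetical; the prefix is the hard-coded value that appears in the function further down, and it is only an assumption that Stooq still accepts it.

```r
# Hypothetical helper (not part of the original function): builds the
# Cookie header value that tells Stooq which instrument to serve.
buildStooqCookie <- function(asset_code) {
  # Prefix copied from the static cookie used in getStooqData();
  # whether Stooq still honours this value is an assumption.
  prefix <- "cookie_uu=p;cookie_user=%3F0001dllg000011500d1300%7C"
  # The instrument code is simply appended after the encoded "|" (%7C).
  paste(prefix, asset_code, sep = "")
}

cookie_value <- buildStooqCookie("es.f")
print(cookie_value)
```

Only the part after %7C changes between instruments, which is why a single static prefix can serve every asset code.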

Therefore, I've written some code that works around this limitation:


# http://www.rhinocerus.net/forum/lang-php/662887-how-read-download-attachment-uri.html - inspiration :)


library(RCurl)


getStooqData <- function(asset_code, static_cookie = TRUE) {

  data_tmp <- tempfile()      # temporary file for the downloaded CSV
  cookie_tmp <- "cookie.txt"  # cookie file used when static_cookie = FALSE

  # u1: instrument page that sets the cookie; u2: daily CSV download
  u1 <- paste("http://stooq.com/q/d/?s=", asset_code, sep = "")
  u2 <- paste("http://stooq.com/q/d/l/?s=", asset_code, "&i=d", sep = "")

  # Browser-like request headers shared by all requests.
  # libcurl builds the "GET ... HTTP/1.x" request line itself, so it is not listed here.
  h <- c(Accept = "image/gif", Accept = "image/x-xbitmap", Accept = "image/jpeg",
         Accept = "image/pjpeg", Accept = "application/x-shockwave-flash",
         Accept = "application/vnd.ms-excel", Accept = "application/msword",
         Accept = "*/*",
         'Accept-Language' = "pl, en-us;q=0.7",
         'User-Agent' = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1;SV1)",
         'Proxy-Connection' = "Keep-Alive")

  # One handle for both requests, so cookies received from u1 stay in memory for u2
  # even if the cookie file has not been written to disk yet.
  curl <- getCurlHandle()

  if (!static_cookie) {
    # Visit the instrument page first so that Stooq sets the cookie, then reuse it.
    u1Opts <- curlOptions(header = TRUE, httpheader = h, cookiejar = cookie_tmp)
    curlPerform(url = u1, .opts = u1Opts, verbose = TRUE, curl = curl)
    u2Opts <- curlOptions(header = TRUE, httpheader = h, cookiefile = cookie_tmp)
  } else {
    # Send a hard-coded cookie that ends with the instrument code.
    static <- paste("cookie_uu=p;cookie_user=%3F0001dllg000011500d1300%7C", asset_code, sep = "")
    u2Opts <- curlOptions(header = TRUE, httpheader = c(h, Cookie = static))
  }

  # Download the CSV, save it to the temporary file and read it back as a data frame.
  w <- getURLContent(url = u2, .opts = u2Opts, curl = curl)
  write(w, file = data_tmp)
  stooq_data <- read.csv(data_tmp)

  stooq_data
}


stooq_data <- getStooqData("es.f",static_cookie=TRUE)


In most cases, the function should work with static_cookie set to TRUE. However, the cookie may change, so I've added code that fetches a fresh cookie before downloading the data.
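
One way to combine the two modes is to try the static cookie first and fall back to fetching a fresh cookie only when the result does not look like price data. The sketch below is illustrative, not part of the original code: getStooqDataWithFallback, looksValid and fakeFetch are hypothetical names, and the check for a "Date" column assumes that is how the Stooq CSV labels its first column. The fake fetcher stands in for getStooqData so that the example runs without network access.

```r
# Hypothetical wrapper: try the static cookie, then retry with a downloaded one.
# `fetch` defaults to the getStooqData() function defined in this post.
getStooqDataWithFallback <- function(asset_code, fetch = getStooqData) {
  # Assumption: a successful download is a non-empty data frame with a "Date" column.
  looksValid <- function(x) {
    is.data.frame(x) && nrow(x) > 0 && "Date" %in% names(x)
  }
  # First attempt with the static cookie; treat any error as a failed attempt.
  result <- tryCatch(fetch(asset_code, static_cookie = TRUE),
                     error = function(e) NULL)
  if (looksValid(result)) {
    return(result)
  }
  # Second attempt: let the function download a fresh cookie first.
  fetch(asset_code, static_cookie = FALSE)
}

# Stand-in fetcher with dummy values, used only to exercise the fallback path offline.
fakeFetch <- function(asset_code, static_cookie = TRUE) {
  if (static_cookie) {
    data.frame()  # simulates a rejected static cookie
  } else {
    data.frame(Date = "2011-08-05", Close = 100)  # dummy row, not real market data
  }
}

res <- getStooqDataWithFallback("es.f", fetch = fakeFetch)
print(res)
```

With the real function loaded, the call would simply be getStooqDataWithFallback("es.f").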

1 comment:

Kacper Trzaskalski said...

Unfortunately, your code doesn't work now. I am not sure where the problem is. Maybe you will be able to find what is wrong.