Monday, May 30, 2011

Google Prediction API v1.2 for R

Google has recently released a new version (v1.2) of the Google Prediction API (see the announcement from Google I/O 2011).

As a result, the previously available implementation of the Google Prediction API for R has stopped working :(

I've spent some time adapting it to the new version of the API, and made some small extensions and modifications for Windows/R 2.13.0.

You can find the source code and R package at:

Sample usage of the package:

# install package
install.packages("googlepredictionapi_0.12.tar.gz", repos=NULL, type="source")
 #--- initialize
 # turn off SSL certificate check
options(RCurlOptions = list(capath = system.file("CurlSSL", "cacert.pem", package = "RCurl"), ssl.verifypeer = FALSE))
 # put your own email, password and API key below
myEmail <- "***"
myPassword <- "***"
myAPIkey <- "***"
 # put the path to python.exe on your computer and the path to the gsutil directory
myPython <- "c:/Python27/python.exe"
myGSUtilPath <- "c:/gsutil/"
myVerbose <- FALSE
 #--- work
 # upload a local CSV file to Google Storage and initiate training; the local file must be in the R working directory
my.model <- PredictionApiTrain(data="./language_id_pl.txt",remote.file="gs://prediction_example/prediction_models/languages")
 # alternative: initiate training of a model already uploaded to Google Storage
my.model <- PredictionApiTrain(data="gs://prediction_example/prediction_models/languages",tillDone=FALSE) # tillDone: if TRUE, keep checking until the model is trained
 # check whether the model is trained; not needed if tillDone=TRUE was set above
result <- PredictionApiCheckTrainingStatus("prediction_example","prediction_models/languages",verbose=TRUE)
 # you can convert the result returned by PredictionApiCheckTrainingStatus to the 'predictionapimodel' class used for predictions
my.model <- WrapModel(result)
 # check new data against model (I have added some Polish-language texts to the Google Prediction API 'Hello World' example)
predict(my.model,"'Prezydent Obama spotkał się z parlamentarzystami'")
 # please note: this package returns all labels and scores for the given input in the following format:
# [1] "Polish"   "French"   "Spanish"  "English"  "0.36195"  "0.26396"  "0.260067" "0.114022"
 # some other prediction request
predict(my.model,"'This is a test'")
 # list objects in a Google Storage bucket
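
Since predict() returns labels and scores in one flat character vector, it can be handy to split them apart. Below is a minimal sketch of how that might look. It assumes the result always has the layout shown in the sample output above (first half labels, second half scores); ParsePredictionResult is a hypothetical helper of my own naming, not a function from the package.

```r
# Hypothetical helper (not part of the googlepredictionapi package).
# Assumption: 'result' is a character vector whose first half holds the
# labels and whose second half holds the matching scores, as in the
# sample output shown earlier.
ParsePredictionResult <- function(result) {
  n <- length(result)
  stopifnot(n > 0, n %% 2 == 0)   # need an equal number of labels and scores
  half <- n / 2
  data.frame(
    label = result[seq_len(half)],
    score = as.numeric(result[half + seq_len(half)]),
    stringsAsFactors = FALSE
  )
}

# Example using the vector from the sample output
example.result <- c("Polish", "French", "Spanish", "English",
                    "0.36195", "0.26396", "0.260067", "0.114022")
parsed <- ParsePredictionResult(example.result)

# Label with the highest score
best <- parsed$label[which.max(parsed$score)]
```

With a data frame like this, picking the top label or sorting by score becomes a one-liner; if the package returns a different layout for other model types, the helper would need adjusting.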
