wiki:xQTLBioinformaticianScript

xQTL workbench - Add a new R script

  • To start adding an R script, go to Configure analysis in the main menu and select R script.
  • Click the Add new record button.
  • Fill in a name for the script (eg 'myScript') and the extension, 'r'. Also select the investigation this script is part of.
  • Upload the actual script through direct CSV input or by file.
  • After adding, you can have a look at the R api (click on R api at the top right corner of the screen) and confirm that your script is listed under #loading user defined scripts.
  • Your R script is wrapped in order to be used inside the computation framework of xQTL. Have a look at it. It should be located at /api/R/userscripts/myScript.R.

Example script

map <- getparameter("map",jobparams)
method <- getparameter("method",jobparams)
model <- getparameter("model",jobparams)
stepsize <- getparameter("stepsize",jobparams)
  
report(2,"LoadingCrossobject")
load(paste("./run",jobid,"/cross.RData",sep=""))
report(2,"FinishedLoading")
if(stepsize!=0){
	cross <- calc.genoprob(cross, step=stepsize)
	report(2,"FinishedMarkerImputation")
}
  
report(2,"Starting%20QTL")
if(map=="scanone"){
	results <- scanone(cross,pheno.col=item,method=method,model=model,verbose=TRUE)
}else{
	results <- mqmscan(cross,pheno.col=item,verbose=TRUE)
}
  
report(2,"Starting%20QTL%20profile%20plot")
imagefilename <- paste("P",jobid,"_",item,".fig",sep="")
xfig(file = imagefilename)
plot(results)
dev.off()

report(2,"PlotInTemp.fig")
postForm(paste(dbpath,"/uploadfile",sep=""),investigation_name=investigationname, name=imagefilename, type='InvestigationFile', file=fileUpload(filename=imagefilename), style='HTTPPOST')

report(2,"UploadedFIGtoDatabase")

ResultsToMolgenis(result=results,resultname=outname, Trait_num=(item-1),investigationname=investigationname)
report(3,"Job%20Finished")

Adding new / additional analysis (R scripts)

New R scripts can be easily added as a new analysis. To get started, take a look at Add new R script. This shows the basic example (template) of such a script. This is a most minimal job.

The script is written in R, but the execution of your code does not happen here. Instead, it will produce a file that is ran by the job execution framework.

Structure of the user function

All options defined in the parameter set and dataset configuration screens are passed to the custom script. From then on, you are in controll and can do what you want. Parameters can be retrieved by using:

message <- getparameter("message",jobparams)

The user supplied data and parameter sets are not the only parameters passed to the userscript. The most important one is the myanalysisfile in which code need to be generated specific for the subjob and item

dbpath - database URL 
subjob - Which subjob are we
item - whihc item does this job need to do
jobid - which analysis run do we belong to
outname - output name for file or database
jobparams - User supplied parameters
investigationname - Not used yet, but will be when we have multiple instances running
libraryloc - Location of the R libraries

There is one special object you can use: a pre builded R/qtl cross. This is available when one passes a 'genotype' and a 'phenotype' matrix to a certain analysis.

load(paste("./run",jobid,"/cross.RData"),"\n",sep="");

We can thus use R as an high level esperanto language. This enables us to execute for example Beagle or Plink one could create a statement like:

shell.exec("plink -job=", paste("myjob"+item));
shell.exec("beagle -job=", paste("myjob"+item));

Points of attention

The progression of the job should be reported to the database. Use 'report' to do this, this reports a statuscode and a short text describing what happened to the subjob:

report(3,\"JobFinished\")

Data produced during your custom analysis can be written back directly to the database, please look at some existing scripts to see how this is done. However we recommend using XGAP standard functions like: find.<your_xgap_datatype> or add.<your_xgap_datatype>

Setting up Plink analysis

Try the following to see if it works:

  • Install  Plink and WGET. ( Mac,  Windows)
  • Put plink on your path.
  • Upload  example data in your database. (just hapmap1.map and hapmap1.ped)
  • Create a new Analysis with target function 'PLINK'.
  • Add an R script with function 'run_PLINK'. Let the script download the MAP and PED example files. (use servlet /downloadfile)
  • Run Plink system call inside R: plink --noweb --file hapmap1 --assoc --out results
  • Upload results.assoc back in database. (use servlet /uploadfile)

User script for plink analysis

  inputname <- getparameter("inputname",jobparams)
  
  system(paste("wget ",paste(dbpath,"/downloadfile",sep=""),"?name=",inputname,"_ped -O ",inputname,".ped",sep=""))
  report(2,"PEDDownloaded")
  system(paste("wget ",paste(dbpath,"/downloadfile",sep=""),"?name=",inputname,"_map -O ",inputname,".map",sep=""))
  report(2,"MAPDownloaded")
  system(paste("plink --noweb --file ",inputname," --assoc --out ",outname,"",sep=""))
  report(2,"beforeUpload")
  postForm(paste(dbpath,"/uploadfile",sep=""),investigation_name=investigationname, name=paste(outname,".assoc",sep=""), type = 'InvestigationFile', file = fileUpload(filename=paste(outname,".assoc",sep="")), style='HTTPPOST')
  report(3,"PLINKfinished")