R in Kepler


What is R?

R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment which was developed at Bell Laboratories (formerly AT&T, now Lucent Technologies) by John Chambers and colleagues...R provides a wide variety of statistical (linear and nonlinear modelling, classical statistical tests, time-series analysis, classification, clustering, ...) and graphical techniques, and is highly extensible.

from the R Project Web page - http://www.r-project.org

Installing R

Several example workflows use the R system. For these examples to run, R must be installed on the computer running the Kepler application. R can be downloaded free of charge from links on the R Project web site (http://www.r-project.org); follow the installation instructions provided there. In addition, the R 'bin' directory must be added to the PATH variable on the host computer. A simple test of the installation is to open a command/terminal window and type the command 'R'. This should start the R system and display messages confirming that R has started.
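The PATH requirement can also be checked from inside an R session that was started some other way (for example from the R GUI or by giving the full path to the executable). The following is a minimal sketch using only base R functions; the variable names are illustrative and it is not part of the Kepler distribution.

```r
# Look up an executable named "R" on the PATH seen by this process.
# Sys.which() returns an empty string when the command cannot be found.
r_path <- unname(Sys.which("R"))
r_on_path <- nzchar(r_path)

if (r_on_path) {
  cat("R found on PATH at:", r_path, "\n")
} else {
  cat("R is not on the PATH; add the R 'bin' directory to PATH.\n")
}

# Report the version of the R session running this check.
cat(R.version.string, "\n")
```

If the first message reports that R is not on the PATH, programs that launch R by name are unlikely to find it either.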

R Example Workflows

There are several examples of using R in Kepler. Most use the RExpression actor, which is described more fully in "docs/user/RExpressionActor.pdf". Most of the examples can be found in the 'workflows' subdirectory.

Average Count Data by Species
This workflow accesses a table in the local filesystem and then averages count data by species.
EML Simple Plot
This example shows how to convert EML column data to arrays and then plot using R. This is an R version of the eml-simple-plot example.
Local File to DataFrame Example
This workflow shows how to read a local table file into R and then process the information in the table.
Ecogrid Data Source to R DataFrame (I)
This example uses an EML2 Datasource as the source for an R DataFrame, passing the data as a file name.
Ecogrid Data Source to R DataFrame (II)
This example uses an EML2 Datasource as the source for an R DataFrame, passing the data as a Kepler 'column record' token.
Sequence to Record Example
This workflow shows how sequences of values can be converted to arrays and then to a record, creating a table which is passed to an RExpression actor.
Linear Regression Example
Here, the RExpression actor is used to carry out a linear regression calculation and plot the results.
EML Pairs Plot (from arrays)
This workflow configures the EML2 DataSource to supply arrays for each column. The arrays are connected to RExpression.
Output from RExpression
This example shows how the RExpression actor can output Kepler arrays.
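To give a feel for the kind of R script behind the "Local File to DataFrame" and "Average Count Data by Species" workflows, here is a minimal sketch in plain R. It is not the script shipped with Kepler: the table contents, the column names `species` and `count`, and the temporary file are all hypothetical, chosen so the example is self-contained.

```r
# Hypothetical sample table, written to a temporary file so the sketch is
# self-contained; a real workflow would point at an existing local file.
table_file <- tempfile(fileext = ".txt")
writeLines(c(
  "species count",
  "sparrow 4",
  "sparrow 6",
  "finch 10",
  "finch 20",
  "wren 3"
), table_file)

# Read the whitespace-delimited table into a data frame.
df <- read.table(table_file, header = TRUE, stringsAsFactors = FALSE)

# Average the count column within each species.
species_means <- aggregate(count ~ species, data = df, FUN = mean)
print(species_means)
```

`aggregate()` returns one row per species with the mean of `count` in the second column.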
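The "Sequence to Record" and "Linear Regression" examples can be approximated outside Kepler as follows. This is a sketch under stated assumptions, not the workflows' actual scripts: the two vectors stand in for arrays arriving from upstream actors, and the data values and variable names are invented for illustration.

```r
# Two hypothetical sequences of values, standing in for arrays from Kepler.
x <- c(1, 2, 3, 4, 5)
y <- c(2.9, 5.2, 6.8, 9.1, 11.0)

# Combine the sequences into a table with one named column per sequence.
tbl <- data.frame(x = x, y = y)

# Fit a straight line y = a + b * x by ordinary least squares.
fit <- lm(y ~ x, data = tbl)
coefs <- coef(fit)
print(coefs)

# Plot the points and the fitted line. pdf(file = NULL) opens a device that
# writes no file, so the sketch runs headless; omit that line and dev.off()
# to see the plot in an interactive session.
pdf(file = NULL)
plot(tbl$x, tbl$y, xlab = "x", ylab = "y", main = "Linear regression")
abline(fit)
invisible(dev.off())
```

`coef(fit)` returns a named vector whose elements `(Intercept)` and `x` hold the fitted intercept and slope.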
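A pairs plot like the one in "EML Pairs Plot (from arrays)" can be sketched in plain R by treating each incoming array as a numeric vector. The column names and values below are hypothetical and do not come from any EML data set.

```r
# Hypothetical column arrays, one numeric vector per table column.
temperature <- c(12.1, 14.3, 15.8, 17.2, 19.5, 21.0)
rainfall    <- c(80, 72, 65, 50, 41, 30)
abundance   <- c(5, 8, 9, 14, 17, 21)

# Gather the vectors into a data frame so they can be plotted together.
cols <- data.frame(temperature, rainfall, abundance)

# Draw a scatterplot matrix with one panel per pair of columns.
# pdf(file = NULL) opens a device that writes no file; omit that line and
# dev.off() to see the plot in an interactive session.
pdf(file = NULL)
pairs(cols, main = "Pairs plot of column arrays")
invisible(dev.off())
```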