Daniel Smith, MBA CPA CMA
Principal Consultant
An example of R in TERR (Package “RinR”) using the Census REST API
In our previous post we installed R and downloaded the jsonlite package. Now we are going to use R and jsonlite to access US Census data directly within R, then use the TERR package RinR to bring the Census data into TIBCO Spotfire.
Our first step is figuring out how to use the Census API within R.
Manipulating the Census API in R:
The Census API is extremely easy to use:
1) Click here to learn about the API
3) Determine the dataset and parameters you wish to use
4) Paste your parameterized link including the key acquired in step two into your browsers address bar to see if it works
Example:
The following link returns the total population by state from the SF1 2010 Census:
https://spotfire.tibco.com[KEY]&get=P0010001,NAME&for=state:*
(Insert your own key at [KEY], but do not include brackets!)
The result will look like this:
I changed the key after entering the URL, but it gives an example of what the final product will look like.
With the connection confirmed, we can use R to read the data provided by the URL and turn it into a data table for analysis. This is why we downloaded the package jsonlite.
jsonlite uses another R Package RCurl to display content returned by a URL. Originally this package was for searching online text data like emails and forum posts as well as automatically completing online forms for manipulating sites through scripting; however, with the rise of RESTful data, RCurl has become a great way to access data.
jsonlite makes the complicated functions of RCurl much easier to use. In order to create a dataframe (R’s version of a flat file) from the Census API, all that’s needed is the fromJSON() function and a Census API URL.
To use the Census API URL we have just created in R, we need to create a script. Open up the R GUI if it isn’t already, then go to File > New Script.
A new window will open. This window will allow you to type multiple lines of R code which can be ran in a batch. Copy and paste the following, and make sure to use with your API key instead of [KEY]:
library(jsonlite)
data <- data.frame(
fromJSON("https://api.census.gov/data/2010/sf1?key=[KEY]&get=P0010001,NAME&for=state:*"
)
,stringsAsFactors = FALSE
)
data
Highlight everything, then hit F5 or Ctrl + R to run the script, the output should look like this:
A few quick notes on what that script is doing:
data.frame(): This tells R the data should be read as a dataframe which we can think of as a flat file or spread sheet.
stringsAsFactors = FALSE: An argument of data.frame() strings as factors tells R we do not want additional metadata attached to any strings in the dataframe. Using strings as factors is helpful in statistical analysis but it becomes a nuisance in Spotfire.
data <- : “<-” is the assignment operator in R. Here we are telling R to put the dataframe created by the operation into a variable called “data.” At the end of the script we call data so it is displayed in the console / terminal.
Using TERR Package “RinR” in Spotfire
We are pulling in data with the Census API, fantastic! Now we just copy and paste this into the register data function window and call it a day, right? Nope… turns out jsonlite is one of those packages that doesn’t work in TERR.
But all is not lost, TERR includes a package “RinR” pre-installed and “RinR” includes a function “REvaluate” that allows the use of a local R instance within TERR.
This deserves repeating – REvaluate allows the use of a local R instance within TERR. Why is this so important? It allows every CRAN package to be used locally in Spotfire! Every clustering, forecasting, and graphing function that doesn’t require external devices (like R Commander) can now be used locally in Spotfire.
But first things first, let’s use jsonlite in TERR. Please note, this will not work in R GUI, so keep an eye out for an upcoming – post describing how to create a TERR and R development environment in Rstudio.
Here’s the final script which we will run in Spotfire:
data_pre <- REvaluate(
{
library(jsonlite);
data_R <-data.frame(
fromJSON("https://api.census.gov/data/2010/sf1?key=[KEY]&get=P0010001,NAME&for=state:*")
,stringsAsFactors = FALSE)
data_R
}
)
colnames(data_pre) <- as.character((data_pre[1,, drop = TRUE]))
data <- data_pre[-1,]
Well that looks a little different than what we had before. Here is a breakdown of the changes.
Revaluate({}): The function contains the entire previous script we executed in R, the dataframe it returns is being assigned to the variable data_pre. The original “data” variable has been renamed “data_R” to reduce conflicts.
data_pre: We are using an intermediate variable to clean our dataframe before our final assignment. This is to eliminate any conflicts when assigning an output variable in Spotfire.
colnames(data_pre)… : Unfortunately fromJSON doesn’t read the first lines of the Census data as column names. The operations in this line convert the first row of data_pre (data_pre[1,]) into characters (as.character) then assign those characters to the column names of data_pre (colnames(data_pre)).
data <- data_pre[-1,]: Finally, we do not need the first row anymore, so we delete the first row of data_pre (data_pre[-1,]) and assign it to our ultimate data variable “data.”
Using the Census API in Spotfire
Now for the easy part! As long as you are using your local TERR engine in Spotfire, this is a piece of cake.
1) Open Spotfire and go to Tools > Register Data Functions
2) Select script type “R script – TIBCO Enterprise Runtime for R”
3) Indicate “RinR” as the only package you need to use
4) Insert the TERR script above into the “script” window
5) Assign the “data” variable to an output parameter with data type as Table
a. The script is pulling in data from the Census API only; therefore, no input parameter is required in this example
6) Tell Spotfire to create or refresh a table with the output parameter via an “Output Handler”
Final Script:
Output Parameter:
Now select “Run” at the top to try it out. If everything goes according to plan you will get a prompt telling you where your data needs to go in Spotfire and which Spotfire tables you are retrieving data from, if you had inputs.
As we only have data going to Spotfire, we only need to set an output. Let Spotfire know the data parameter is going to produce a table with the Output Handler, then select that you want a new table.
Click “OK” and you have now connected Spotfire to the Census API!
Now that we have the connection there is a lot more we can do with the Census API. In subsequent posts we will explore how to pass different values into the API URL via input parameters – creating a way to dynamically change the data pulled from the Census API without ever having to store files locally.
We will also explore as how to properly set up a development environment for TERR and R. RStudio allows us to create R Scripts referencing different R engines. This is very helpful for debugging the complex operations of which R and TERR are capable.
Want to learn more about Spotfire TERR?
Want to learn more about R and specifically R in Spotfire – TERR? Starting in June 2014, Syntelli will be offering classes. Take a look and let us know which ones you’d be interested in joining.
Have a general question about R? Ask our expert, Dan! Submit your inquiry and he’ll be sure to touch base with you as soon as possible.
Take your TERR knowledge to the next level with our latest Ebook!
[maxbutton id=”8″]