There is a common misconception about the costs associated with structuring data for financial filings to regulators. Companies assume that converting financial reports to XBRL a high effort / low value expense. But the lack of perceived value is due to their misunderstanding of the value of data automation and standardization, not XBRL.
In other words, accountants are manually creating reports then converting them to XBRL.
The true benefits of XBRL and other data standardization tools are seen when they are automated. And automation must begin before the reporting process – namely, by tagging data from the source.
Unfortunately, one article alone could never cover all the steps involved in full XBRL data automation. However, while the tutorial below doesn’t delve into the intricacies of tagging XBRL data, it does explain how to automate XBRL for financial analysis, after the tagging has occurred. Hopefully, this will help show the value of automated reporting – moving from compiling data to considering insights – as the end goal of a full automated data standardization pipeline.
This project is for the automation of XBRL financial analysis, it acts as an example of how easy automation can be just by learning a few key concepts.
We will use the XBRL API to grab financial statements, then use publically available libraries to QA and analyze the statements.
1. Installation
If you do not have dev tools already installed, do so, then install the remaining git packages
#install.packages("devtools")
library(devtools)
#devtools::install_github("bergant/xbrlus") #https://github.com/bergant/xbrlus
#devtools::install_github("bergant/finstr") #https://github.com/bergant/finstr
library(XBRL)
library(xbrlus)
library(finstr)
library(dplyr)
library(tidyr)
library(pander)
2. Getting Data
Initally we are going to use hard coded values, then create a micro service to get them online.
But first, will will need an API key from XBRL
You can get it from here: https://xbrl.us/home/use/howto/data-analysis-toolkit/
Then either create a system environment variable XBRLUS_API_KEY=APIValueHere or add a new R system variable to .Renviron in your R home (find by using normalizePath(“~”) in R Console)
Or use the code below each time you run the script, but putting an API key in code is never a good practice
Additional info here: https://xbrl.us/home/use/howto/data-analysis-toolkit/ on the api
##Uncomment below if seting API key in code
#api_key = "xxxxx"
#Sys.setenv(XBRLUS_API_KEY=api_key)
##
balanceSheetCompare <- function(c1 = "aapl",c2 = "goog",c3 = "fb", year =2016, return_data = FALSE){
#heavily lifted from example on https://github.com/bergant/xbrlus
cik <- xbrlCIKLookup(c(c1,c2,c3))
elements <- xbrlBaseElement(c(
"AssetsCurrent",
"AssetsNoncurrent",
"Assets",
"LiabilitiesCurrent",
"LiabilitiesNoncurrent",
"Liabilities",
"StockholdersEquity",
"MinorityInterest",
"StockholdersEquityIncludingPortionAttributableToNoncontrollingInterest",
"LiabilitiesAndStockholdersEquity"
))
values <- xbrlValues(
CIK = cik$cik,
Element = elements$elementName,
DimReqd = FALSE,
Period = "Y",
Year = year,
NoYears = 1,
Ultimus = TRUE,
Small = TRUE,
as_data_frame = TRUE
)
balance_sheet <-
elements %>%
left_join(values, by = "elementName") %>%
select(entity, standard.text, amount) %>%
mutate(amount = round(amount / 10e6,0)) %>%
spread(entity, amount)
if(ncol(balance_sheet == 5)) {balance_sheet[,5] <- NULL}
balance_sheet <- balance_sheet[
order(order(elements$elementName)),
!is.na(names(balance_sheet))]
row.names(balance_sheet) <- NULL
pandoc.table(
balance_sheet,
caption = "Balance Sheet Comparison",
big.mark = ",",
split.table = 200,
style = "rmarkdown",
justify = c("left", rep("right", 3)))
print(pandoc.table)
if(return_data){return(balance_sheet)}
}
This function allows me to compare three ticker symbols for any year I want, pretty cool and it was lifted right from the github of the author so it’s basically cut and paste!
It also returns the balance sheet value if I want to get more data
balanceSheetCompare()
Balance Sheet Comparison
standard.text | Alphabet Inc. | APPLE INC | FACEBOOK INC |
---|---|---|---|
Assets | 16,750 | 32,169 | 6,496 |
Assets, Current | 10,541 | 10,687 | 3,440 |
Assets, Noncurrent | 6,209 | NA | NA |
Liabilities | 2,846 | 19,344 | 577 |
Liabilities and Equity | 16,750 | 32,169 | 6,496 |
Liabilities, Current | 1,676 | 7,901 | 288 |
Liabilities, Noncurrent | NA | NA | NA |
Stockholders’ Equity Attributable to Noncontrolling Interest | NA | NA | NA |
Stockholders’ Equity Attributable to Parent | 13,904 | 12,825 | 5,919 |
Stockholders’ Equity, Including Portion Attributable to Noncontrolling Interest | NA | NA | NA |
3. Getting XBRL Ourselves
Next we want to look at parsing entire xbrl files we get from EDGAR: https://www.sec.gov/edgar/searchedgar/companysearch.html
Again, this is taken right from the github for the library:
getXbrlStatement <- function(url) {
old_o <- options(stringsAsFactors = FALSE)
fullStatementXbrl <- xbrlDoAll(url, verbose = TRUE)
options(old_o)
bullStatementDF <- xbrl_get_statements(fullStatementXbrl)
}
getBalanceSheet <- function(parsedXBRL){
if(!is.null(parsedXBRL$StatementOfFinancialPositionClassified)){
balanceSheet <- parsedXBRL$StatementOfFinancialPositionClassified
}
if(!is.null(parsedXBRL$ConsolidatedBalanceSheets)){
balanceSheet <- parsedXBRL$ConsolidatedBalanceSheets
}
return(balanceSheet)
}
getIncomeStatement <- function(parsedXBRL){
if(!is.null( parsedXBRL$StatementOfIncome)){
incomeStatement <- parsedXBRL$StatementOfIncome
}
if(!is.null( parsedXBRL$ConsolidatedStatementsOfComprehensiveIncome)){
incomeStatement <- parsedXBRL$ConsolidatedStatementsOfComprehensiveIncome
}
return(incomeStatement)
}
4. Financial Analysis
Here’s where it gets incredible, using built in calculation functions in the finstr library, we can take multiple years of data and completely automate the analysis!
applXBRL2016 <- "https://www.sec.gov/Archives/edgar/data/320193/000162828016020309/aapl-20160924.xml"
applXBRL2015 <- "https://www.sec.gov/Archives/edgar/data/320193/000119312515356351/aapl-20150926.xml"
appl201610K <- invisible(getXbrlStatement(applXBRL2016))
appl201510k <- invisible(getXbrlStatement(applXBRL2015))
appl2016Balance <- getBalanceSheet(appl201610K)
appl2015Balance <- getBalanceSheet(appl201510k)
Sweet, we have our statements, now remember in introductory accounting classes how much it work it was to close the books and verify there weren’t any errors?
…checking for errors:
check <- check_statement(appl2016Balance)
check
Number of errors: 0 Number of elements in errors: 0
… looking at specific calcuations:
check$elementId[3]
[1] “AssetsCurrent”
check$expression[3]
[1] “+ CashAndCashEquivalentsAtCarryingValue + AvailableForSaleSecuritiesCurrent + AccountsReceivableNetCurrent + InventoryNet + NontradeReceivablesCurrent + OtherAssetsCurrent”
check$original[3]
…and verification those calcuations are correct:
[1] 8.9378e+10
check$calculated[3]
[1] 8.9378e+10
hmmm…
Well, there goes a few jobs…
Let’s get rid of some more shall we!
We can merge the statements together and do whatever analysis we like, even custom calculations
balance_sheet <- merge( appl2015Balance, appl2016Balance )
ratios <- balance_sheet %>% calculate( digits = 2,
current_ratio = AssetsCurrent / LiabilitiesCurrent,
quick_ratio =
( CashAndCashEquivalentsAtCarryingValue +
AvailableForSaleSecuritiesCurrent +
AccountsReceivableNetCurrent
) / LiabilitiesCurrent
)
Year over year ratios:
pandoc.table(
ratios,
caption = "Year Over Year Ratios",
big.mark = ",",
split.table = 200,
style = "rmarkdown",
justify = c("left", rep("right", 2)))
date | current_ratio | quick_ratio |
---|---|---|
2014-09-27 | 1.08 | 0.67 |
2015-09-26 | 1.11 | 0.73 |
2016-09-24 | 1.35 | 1.05 |
5. Making a Plot
And last but not least a quick visual to show some things aren’t as easy in R 🙂
library(ggplot2)
library(reshape2)
First we have to make our data ‘long’ for it to work appropriately in ggplot:
long_df <- invisible(melt(ratios)) #invisible to hide log output
## Using date as id variables
pandoc.table(
long_df,
caption = "Ratios in Long Format",
big.mark = ",",
split.table = 200,
style = "rmarkdown",
justify = c("left", rep("right", 2)))
date | variable | value |
---|---|---|
2014-09-27 | current_ratio | 1.08 |
2015-09-26 | current_ratio | 1.11 |
2016-09-24 | current_ratio | 1.35 |
2014-09-27 | quick_ratio | 0.67 |
2015-09-26 | quick_ratio | 0.73 |
2016-09-24 | quick_ratio | 1.05 |
Then we have to use the unusual language of ggplot to make a chart, it looks cool, but the syntax is… less than obvious…
ratio_plot <- ggplot(data = long_df, aes(x=date, y=value, fill = variable))
ratio_plot <- ratio_plot + geom_bar(stat="identity", position=position_dodge())
ratio_plot
The complete (end to end) automation that XBRL provides can help organizations with data governance and maintaining a level of standardization that saves time and resources, as well as significantly reduce errors from manual touch points.