This is the third and concluding post in our series on achieving analytics maturity for finance and accounting professionals. Read the first part here, and the second part here.
As seen in the Data Science example in part two of this series, success in analytics does not start with an analytics product or platform; it begins with domain knowledge, motivated and capable employees, and an organization willing to invest time and resources into becoming more analytically mature.
Make no mistake, Business Intelligence and Data Science tools are very important to the success of an organization; however, they are vehicles which need to be driven by people within a facilitating organization. Just as in Formula One racing, the driver and the crew win races, not just the car. The people and the strategy grow an organization from descriptive, to diagnostic, and finally to the competitive advantage of predictive analytics.
So why do large-scale analytics implementations fail so often? This goes back to the concept of the tool trap. Most, if not all, analytics projects that I’ve seen have been focused on one or two particular products. It may be a Spotfire on Hortonworks engagement, or maybe an SSRS reporting initiative.
In my opinion, such tool-specific projects are somewhat backward. The tool is not what leads to success in analytics; it is the skill of the people involved. A qualified group is capable of succeeding with any platform. Thinking that a tool alone is going to solve your problems is like expecting a soufflé at a fast food restaurant. Just like a French bistro, it may have ovens, eggs, milk, flour, and cheese, but without the skill and motivation of the people, a soufflé is impossible to get at your local drive-thru.
In practice, skills are what lead to the success of analytics projects and therefore move a company forward in analytics growth. There are three core skill sets involved in effectively implementing analytics and moving up in analytics maturity: domain knowledge, technical ability, and analytical ability. That combination, those pillars, is what sets the stage for success.
Domain Knowledge
Domain knowledge is arguably the most important, initially. Domain knowledge is the knowledge of the industry, the typical metrics used to measure success, and the overall objectives of the analytics efforts.
A Domain Knowledge Expert will ask broader questions such as:
- What are we trying to accomplish, and how does it improve the business? And more importantly, how does it improve the lives of our customer base?
- Are we going to decrease the cost of our products, so that our customers are able to purchase more of our goods? (Efficiency)
- Are we providing information about the benefits of our goods, so that more people can be aware of them? (Marketing Effectiveness)
A broader perspective is imperative for successful analytics. Unfortunately, this is the area where companies often fall short in analytics initiatives. Far too often, domain expertise is an afterthought or an assumption, as opposed to an explicit inclusion as a skill set in the project requirements.
Technical Ability
Our next pillar is Technical Ability. Here is where the tools come into play. However, technical ability isn’t about knowing one specific tool or another; oftentimes it is about knowing which tool to use to achieve the project goals as efficiently as possible. For example, the fact that your organization spent $100,000 on a license for one particular platform doesn’t justify extending a project by an additional $500,000 worth of billable time or project costs.
If a desired outcome may be achieved using standard programming languages and standard programming skills, particularly with open source languages and tools, then why on earth would you spend so much money on an “Analytics Platform,” not to mention the premium required to get people with a very specific skill who can use said platform?
Don’t get me wrong, I am not some Pollyanna with my head in the sand. I understand that there are budgets and politics. People do not want to say, after purchasing some giant software package, that they are not going to use it. But in the end, what is going to make someone look better professionally: spending a ton of money getting this particular tool that they want to work, or delivering a project in scope and under budget?
So again, technical ability is not necessarily about knowing how to use a specific tool; it’s knowing what the best tool is for the job, and how to use said tool. These tools come in a few flavors. First, there is the data platform. A data platform can be Microsoft SQL Server, Postgres, or MongoDB, or one of the flavors of Hadoop, where the HDFS file system stores flat files that you can access with other tools such as Hive or HBase. The list goes on.
Next are your data interface tools. These are tools that are used to access the data platform and make transformations. Those transformations can be anything from data cleaning to neural network modeling. Data interface tools are typically data-focused programming or scripting languages for custom solutions, and ETL and integration applications for more industry-standard use cases. Examples of languages and frameworks are Spark, Scala, Python, Pig, R, or any of the SQL languages, and examples of integration tools are Boomi, BusinessWorks, SSIS, and Informatica.
Finally, the data needs to be presented to the end user; this is the application layer. The application layer is how your solution is distributed to the larger organization. These can be pre-built platforms like Spotfire, Tableau, Sisense, QlikView, etc. They can be one-stop-shop tools like SSRS, the SQL deployment layer, or the whole .NET stack. They may even be build-your-own tools, for example using Python not only to interface with the data, but also to create a data application. The same can be said for Scala and Java.
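To make the three flavors concrete, here is a minimal sketch in Python using only the standard library. It is an illustration under stated assumptions, not a recommended architecture: an in-memory SQLite database stands in for the data platform, a small query function plays the data interface, and a plain-text report plays the application layer. The table name, the customer names, and the amounts are all hypothetical.

```python
import sqlite3

# Data platform: an in-memory SQLite database stands in for SQL Server,
# Postgres, etc. The "invoices" table and its rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (customer TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO invoices VALUES (?, ?)",
    [("Acme", 1200.0), ("Acme", 800.0), ("Globex", 450.0)],
)


def total_by_customer(connection):
    """Data interface: query the platform and shape the result."""
    rows = connection.execute(
        "SELECT customer, SUM(amount) FROM invoices "
        "GROUP BY customer ORDER BY customer"
    ).fetchall()
    return dict(rows)


def render_report(totals):
    """Application layer: present the result to an end user as plain text."""
    return "\n".join(f"{name}: {amount:,.2f}" for name, amount in totals.items())


totals = total_by_customer(conn)
report = render_report(totals)
print(report)
```

Because each layer sits behind its own function, any one of them could be swapped for a different tool without touching the other two, which is the point of choosing tools per job rather than per license.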
Restricting yourself to a very specific set of tools is limiting, as you can see. I do understand the emphasis on having a consolidated and unified skill set across a team, even across a department. This straightforward approach was and is the model of choice for enterprise application development. Unfortunately, or fortunately, data science and analytics can be done quickly and with agility. There is no need to limit yourself to one specific platform or tool.
Analytics Skill Set
Finally, the analytic skill sets themselves. Here’s where we get into the heady mathematics. This is where we have the architects of the entire solution. Not necessarily why we’re doing it; that’s up to the domain experts. This is how we’re going to do it. That is, now that we know what behavior we want to influence, we need to figure out, for example, how to predict customer satisfaction, or how to predict customer satisfaction and then use that in a model to predict customer purchasing behavior. Or we need to figure out exactly how we’re going to structure that prediction. The list goes on!
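As a rough illustration of chaining one prediction into another, the sketch below fits two simple least-squares lines in plain Python: one predicting satisfaction from support wait time, and one predicting purchases from satisfaction. Everything here is an assumption made for the example: the variable names, the choice of wait time as the driver, the tiny made-up data set, and the use of simple linear regression. A real engagement would choose its own features and model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict(model, x):
    """Evaluate a fitted (intercept, slope) pair at x."""
    intercept, slope = model
    return intercept + slope * x


# Hypothetical observations, one entry per customer.
support_wait_days = [1, 2, 3, 4, 5]
satisfaction = [9.0, 8.0, 7.0, 6.0, 5.0]           # survey score, 1 to 10
monthly_purchases = [18.0, 16.0, 14.0, 12.0, 10.0]

# Stage 1: predict satisfaction from something we can influence.
satisfaction_model = fit_line(support_wait_days, satisfaction)

# Stage 2: predict purchasing behavior from satisfaction.
purchase_model = fit_line(satisfaction, monthly_purchases)


def predict_purchases(wait_days):
    """Chain the two models: wait time -> satisfaction -> purchases."""
    predicted_satisfaction = predict(satisfaction_model, wait_days)
    return predict(purchase_model, predicted_satisfaction)


print(predict_purchases(6))
```

How the stages are structured (one model or two, which inputs feed which) is exactly the kind of decision that belongs to the analytic skill set rather than to any particular tool.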
In short, the analytic skill set encompasses the framework for acquiring the data and how that data is shaped, through data ingestion, to data modeling, and finally the representation of that data.
Understanding how that data needs to be represented (the shape of the data, the appearance of the data) is different from what we use to represent that data. The what is a technical question. It is often the case that an analyst will be able to represent some information in a certain way, using a certain tool, but that might not be the best tool for the job. That’s for the technical skills. What’s important is that the analyst isn’t stuck trying to figure out how to use that tool. That analyst can say, “I want this represented in this manner. Here it is in my tool of choice. Here I’ve done it in R, but I understand that R may not be the most enterprise deployment friendly of applications. But I’m not going to spend the next 6 weeks learning a business intelligence tool. I’m going to talk to my technical liaison, and they will determine what the best platform is, given their knowledge and skill set, and have that analysis deployed in a fraction of the time, with the amount of rigor and double checks on the data required.”
As you can see, having three separate specializations (domain, technical, and analytical) in your analytics road map allows all the necessary skill sets to be covered. A project that is focused on only one specific tool, simply because that tool is available or because someone has used it before, limits the options, limits the resource pool from the get-go, and increases the probability that the project will not be successful.
Tying It All Together (The “Full Stack”)
Descriptive, Diagnostic, and Predictive Analytics, and their corresponding specializations of reporting, business intelligence, and data science, do not operate independently in an organization; at least not if they are to operate effectively.
A relatively new term emerging in our business vocabulary is “Full-Stack Data Science,” which describes how all layers of analytics and data must operate in concert to maximize organizational returns on analytical activities.
Let us break down the phrase “Full-Stack Data Science.” “Data Science” (as we’ve mentioned in previous articles) remains a nebulous term at best. As a quick refresher, data scientists tend to fall into two categories: one, PhDs in computer science who create programs leveraging statistics, and two, statisticians who understand how to incorporate programmatic solutions to accelerate data transformations and insights.
“Full-Stack” is a term appropriated from web development. In Web Dev a “Full Stack Developer” creates the user interface, data interface, and the data repository, as well as implements the technologies for users to access their applications via the web.
Full Stack Data Science is very similar to Full Stack Web Development. The Data Science stack requires some way to interface with data, a means to transform the data (i.e. mathematical operations), data storage, and an architecture to support the entire process.
Below we provide some high-level examples of Web Dev and Data Science Full Stack use cases, as well as some technologies often used. This is not an exhaustive list by any means; it is only designed to introduce the terminology and provide a frame of reference for those familiar with the more common web-based tools and technology.
Full Stack Terminology
- User Interface
  - Definition: How the end user accesses and interacts with the application
  - Examples:
    - Web Dev: User goes to department store website and is able to quickly find desired product
    - Data Science: Analyst wishes to see results of a model or report; user wishes to change parameters of existing model
  - Web Development Technologies:
    - HTML; CSS; JavaScript
  - Data Science Technologies:
- Data Interface
  - Definition: How the data entered into the user interface is moved to the data storage layer, and how data in the data layer is retrieved given user interface requests and the information required for the website
    - Data may be passed directly from interface to repository or it may be transformed in some capacity (e.g. aggregations, type changes, mathematical operations, modeling)
  - Examples:
    - Web Dev: User purchases a good from the website, the credit card number is transferred to the banking system, and the purchase is saved in the payment processing system. Shopping history may be retrieved by the user at a later date.
    - Data Science: Support Vector Machine Unsupervised Learning process generates product suggestions for potential online customers; high value customers identified via squared error cost function neural network model
  - Web Development Technologies:
    - PHP; Angular.js; Node.js
  - Data Science Technologies:
    - R; Spark; Python; SSIS; T-SQL; Data Modeling aspect of BI Tools (e.g. Tableau VizQL; Spotfire TERR and Information Designer)
- Data Repository (Web Dev and Data Science)
  - Long term storage of data required for websites, predictive modeling, reporting
  - Example: Long term storage of customer information; ERP; CMS
  - Technologies:
    - SQL; MongoDB; Cassandra; HBase; Delimited Text; XML; Unstructured
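The Data Interface entry above notes that data may be transformed on its way between layers, for example through type changes and aggregations. A minimal Python sketch of that idea follows, assuming a hypothetical delimited-text export with `customer_id` and `order_total` columns; the column names and values are invented for illustration only.

```python
import csv
import io
from collections import defaultdict

# Hypothetical delimited-text export from a data repository.
raw_orders = """customer_id,order_total
C001,19.99
C002,5.00
C001,30.01
"""


def load_orders(text):
    """Type change: parse delimited text and cast totals from str to float."""
    reader = csv.DictReader(io.StringIO(text))
    return [(row["customer_id"], float(row["order_total"])) for row in reader]


def spend_per_customer(orders):
    """Aggregation: sum order totals for each customer."""
    totals = defaultdict(float)
    for customer_id, order_total in orders:
        totals[customer_id] += order_total
    # Round to cents so floating-point noise does not leak into reports.
    return {cid: round(total, 2) for cid, total in totals.items()}


orders = load_orders(raw_orders)
print(spend_per_customer(orders))
```

The same two steps could just as easily be written in R, T-SQL, or an integration tool such as SSIS; the language is interchangeable as long as the transformation is well understood.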
This article concludes our series on how to achieve analytics maturity for your organization. While we focused on analytical maturity for accounting and finance professionals, all these concepts can be applied to any industry that seeks growth through data. Refer back to part one, where we discuss the conceptual framework of BI and Analytics, and part two, where we discuss in detail what Data Science is and why it’s important for your organization.