Wrestling with Big Data? Try This Tag Team: Hadoop and Spotfire

Companies seldom extract value from the ocean of largely unstructured data stored on their servers. Why? The volume of unused data is too large to handle in a timely way. Or, the tools that can handle it aren’t available or affordable. And analyzing the content? That job adds another layer of effort and cost to a complex problem.

Fortunately, tools that meet the Big Data challenge are available. Apache™ Hadoop, TIBCO™ Spotfire®, and TERR (TIBCO Enterprise Runtime for R, a scripting engine for the R language) make an especially effective combo. Together, they make short work of high-volume data discovery and analysis. In this post, we talk about Hadoop-Spotfire integration and how it can simplify data discovery.

What Does It Take to Beat Big Data Obstacles?

It takes staying out of the big data jungle: that is, not getting stuck on the Hadoop infrastructure, software, installation, and configuration problems that pop up at every step of the way. If you do get stuck, you can lose the path and never see the benefits of big data.

Even when you succeed in combining structured, semi-structured, and unstructured data, you’re not home free. You still have to clean it up and make it available to end users before they can use it for analytics and self-service reporting.

Ideally, an effective Big Data solution helps analysts:

  • Build connections into different forms of output such as visualizations and reports.
  • Distribute and share output.
  • Process huge volumes of data and extract it on demand for further analysis.

Spotfire uses Apache Drill to add an analytics layer on top of unstructured data. Drill supports many NoSQL databases and file systems, including:

  • Google Cloud Storage
  • MapR-FS and MapR-DB
  • Amazon S3
  • Azure Blob Storage
  • HBase

A single query can join data from multiple data stores. For example, you can join a user profile collection in MongoDB with a directory of event logs in Hadoop.
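The cross-store join described above can be sketched in Python against Drill's REST interface. This is a minimal illustration, not a reference setup: it assumes a Drill instance on its default web port (8047) with storage plugins named `mongo` and `dfs` enabled, and the database `app`, the collection `users`, the path `/logs/events`, and all column names are hypothetical.

```python
import json
import urllib.request

# Assumed endpoint: a local Drill instance on its default web port.
DRILL_URL = "http://localhost:8047/query.json"


def build_join_query(limit=10):
    """Build one SQL statement that joins a MongoDB collection with files in Hadoop.

    The `mongo` and `dfs` plugin names, the `app` database, the `users`
    collection, the `/logs/events` directory, and the column names are all
    hypothetical placeholders.
    """
    return (
        "SELECT u.name, e.event_type, e.event_time "
        "FROM mongo.app.`users` AS u "
        "JOIN dfs.`/logs/events` AS e "
        "ON u.user_id = e.user_id "
        f"LIMIT {int(limit)}"
    )


def build_request_body(sql):
    """Wrap the SQL text in the JSON body that Drill's REST API expects."""
    return json.dumps({"queryType": "SQL", "query": sql})


def run_query(sql, url=DRILL_URL):
    """POST the query to Drill and return the result rows (needs a live Drill)."""
    request = urllib.request.Request(
        url,
        data=build_request_body(sql).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["rows"]
```

Calling `run_query(build_join_query())` would return the joined rows as a list of dictionaries, provided both storage plugins point at real data. In a Spotfire deployment the connector issues queries on the analyst's behalf, so a script like this is mainly useful for checking that a join works before building visualizations on it.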

Serious Capabilities Deliver Serious Benefits

Hadoop-Spotfire integration can deliver these capabilities and more, quickly and with less effort. Here’s why:

  • Spotfire offers native support for Hadoop capabilities. Analysts can select data by grouping Hadoop integration points into Spotfire native data connectors. Spotfire Hadoop connections can be quickly configured into analytic workflows, dashboards, or reports, which can then be shared, reused, and consumed across organizations. KPIs based on Hadoop data can be pushed to virtually any device that uses TIBCO Spotfire Metrics.
  • Spotfire features a data service connector to Hadoop. The connector enables users to combine and analyze information from Hadoop clusters along with structured data from business applications such as an SAP or Oracle ERP system.
  • Data connectors enable users to access Hadoop data in-database, in-memory, on demand, or through a combination of these methods. In-database access enables analysis of huge data sets by pushing aggregations into Hadoop. These in-database master visualizations are often combined with detailed visualizations, where row-level data slices are extracted from Hadoop on demand and loaded into memory.
  • Spotfire connects to, models, and visualizes Hadoop data without the need for scripting or manual query editing. It also supports custom queries in all three access modes: in-database, in-memory, and on demand. This gives experienced data analysts the freedom to create optimized queries. Working in Spotfire, data analysts can extract data from Hadoop for further analysis, either by drilling down from visualizations (data on demand) or by posing custom queries.
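The master/detail pattern in the bullets above (aggregate inside the data source, then pull row-level slices on demand) can be illustrated with a small Python sketch. It is only an analogy: an in-memory SQLite database stands in for Hadoop, the `sales` table and its rows are made-up sample data, and the function names are hypothetical rather than part of any Spotfire API.

```python
import sqlite3

# Stand-in data source: an in-memory SQLite database plays the role of Hadoop.
# The table and its rows are invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("East", "Widget", 120.0),
        ("East", "Gadget", 80.0),
        ("West", "Widget", 200.0),
        ("West", "Gadget", 50.0),
        ("West", "Widget", 25.0),
    ],
)


def master_summary(connection):
    """'In-database' step: the source aggregates; only summary rows come back."""
    return connection.execute(
        "SELECT region, SUM(amount), COUNT(*) "
        "FROM sales GROUP BY region ORDER BY region"
    ).fetchall()


def detail_rows(connection, region):
    """'On demand' step: fetch row-level data only for the slice a user picks."""
    return connection.execute(
        "SELECT product, amount FROM sales "
        "WHERE region = ? ORDER BY amount DESC",
        (region,),
    ).fetchall()


summary = master_summary(conn)          # one row per region
west_detail = detail_rows(conn, "West") # rows loaded only after drill-down
```

The point of the split is that the summary query returns one row per group no matter how large the underlying table is, while the detail query moves only the rows for the selected slice into memory.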

That’s how Hadoop-Spotfire integration reduces the effort of data discovery. But what’s the end result?

Spotfire Delivers Value from Big Data Assets

Hadoop-Spotfire integration enables users to search for and select valuable nuggets from huge data sets. The rewards for their search include:

  • Faster results with less effort. Hadoop is built to process very large data sets in parallel across a cluster. You can query data, view visualizations, and quickly insert Spotfire-Hadoop connections into many kinds of output without time-consuming configuration.
  • Sophisticated results with or without a data science degree. Business users can select data, analyze it, and share results without knowing about customized queries, scripting, or the mysteries of Hadoop. Data analysts, meanwhile, have plenty of advanced tools that expedite results.
  • The ability to analyze and share the results. Spotfire Hadoop connections can be quickly configured into analytic workflows, dashboards, or reports, so users can pass results on for further analysis and distribute them quickly and easily.

 

Get Free Remote Access to a Fully Configured MapR/Drill/Spotfire Environment 

Our test drive helps users fire up a pre-built and ready-to-go environment to immediately see how to extract value from big data technologies. Learn more or get started right now with our free trial (no download needed)! 

About the Author

Shikha is a tech leader with deep expertise in emerging technologies such as Big Data analytics using Actian, Hortonworks, Tableau, and Spotfire. Her experience includes solution design, architecture, and project management for Fortune 500 companies. Shikha leads Technology for Syntelli and is passionate about non-profit causes and giving back to the community.

Shikha Kashyap

Chief Technology Officer