How To Create The Sentiment Analysis Data Viz

Seems like everyone’s agog about the national collegiate football championships in Dallas. The high-energy enthusiasm for the Ohio State Buckeyes and University of Oregon Ducks made me think of the power of visualization. (Yes, really.) After all, it’s one of the most potent tools we have to make very large or complex data sets easy to understand and use.

The game offered an excellent, informal way to show the power of visualization. We put Tableau through its paces by creating a visualization that captured viewer sentiment during the championship game. It was a great way to show how Tableau extracts value from social media information.

From Collegiate Buzz to Insight: Ducks versus Bucks

We used the R language and sentiment data from 330,000 tweets to create a snazzy Bucks versus Ducks visualization. It lights up a map of the U.S. by showing the geographic origin, timing and sentiment of tweets sent throughout the game. Here’s the step-by-step description of how we did it.

  1. Get oriented. When you engage in Twitter analyses, it’s most effective to start from a high-level (top-down) perspective. Then drill down into the details, using time, location and sentiment data.
  2. Get established. To access Twitter data, I created a Twitter developer account. As a Twitter developer, I was given special credentials to access Twitter data from a programming environment such as R (for statistical computing) or Python.
  3. Gather data. In R, I created a script to collect, wrangle, and score data from live tweets of the game. Then I pulled the tweets into Tableau. The tweets contained geographic and time series data, which I used as chart dimensions to display sentiment frequency and intensity: tweets with larger positive or negative scores contain more words from the sentiment algorithm’s positive or negative word lists.
  4. Make the map. I used geo-coordinates to create the map dimensions and time series data (measured in minutes) to create the line chart dimensions. Map point size, line chart y-axis, and bubble chart color all measure sentiment score.
    Sentiment is the amount of enthusiasm about a topic. Words used in each tweet are given a positive or negative score depending on their classification. Positive words like “good” or “Wow!” are given positive scores. Negative words like “bad” or “stupid” are given negative scores. In each case, the more intense the sentiment, the more positive or negative the score. The scores of the individual words are tallied into a single sentence score.
    [Image: sentiment example]
  5. Connect to the Tweeterverse. I then made a Tweeterverse: every tweet in the dataset, visualized in the bubble chart. Selections in the map and line charts filter the Tweeterverse down to show only tweets from a certain location, at a certain point in time, or with certain sentiment scores. Tweets were colored by sentiment. Viewers could use the map, line chart, and time filters to zoom in and out of the dataset.
  6. Run the visualization. When I ran the visualization, fan sentiment of varying intensity appeared at the tweet locations on the map.
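
For readers who want to see the scoring idea from step 4 in code, here is a minimal sketch in Python. It is not the R script used for this project. The word lists, their weights, and the `score_tweet` function are made-up placeholders for illustration, and a real sentiment lexicon would contain thousands of entries.

```python
import re

# Hypothetical, tiny word lists for illustration only.
# Larger absolute values stand for more intense sentiment.
POSITIVE_WORDS = {"good": 1, "great": 2, "wow": 2}
NEGATIVE_WORDS = {"bad": -1, "stupid": -2}


def score_tweet(text: str) -> int:
    """Tally the scores of every listed word found in the tweet."""
    # Lowercase and keep only letter runs so that "Wow!" matches "wow".
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(POSITIVE_WORDS.get(t, 0) + NEGATIVE_WORDS.get(t, 0) for t in tokens)


print(score_tweet("Wow! What a good drive by the Ducks"))  # 3
print(score_tweet("Stupid penalty, bad call"))             # -3
```

A tweet with no listed words scores 0, which is why neutral chatter drops out of the intensity measures in the charts.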

Besides being a lot of fun to create and watch, the visualization was a great way to show how easy it was to find, collect and display relatively complex sentiment data. It made it easy for an analyst (yours truly) to identify game events and connect them with intensity of response and locations of the viewers who sent tweets.
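
To give a rough idea of how game events can be matched to spikes in response, the sketch below groups scored tweets by minute and picks the minute with the strongest total sentiment. It is only an illustration in Python, not the Tableau workbook itself. The function names and the sample timestamps and scores are invented for the example.

```python
from collections import defaultdict
from datetime import datetime


def sentiment_by_minute(scored_tweets):
    """Group (timestamp, score) pairs into {minute: (tweet_count, total_score)}."""
    buckets = defaultdict(lambda: [0, 0])
    for timestamp, score in scored_tweets:
        # Truncate the timestamp to the start of its minute.
        minute = timestamp.replace(second=0, microsecond=0)
        buckets[minute][0] += 1
        buckets[minute][1] += score
    return {m: (count, total) for m, (count, total) in sorted(buckets.items())}


def most_intense_minute(per_minute):
    """Return the minute whose total score is largest in absolute value."""
    return max(per_minute, key=lambda m: abs(per_minute[m][1]))


# Made-up sample data: (time the tweet was sent, its sentiment score).
sample = [
    (datetime(2015, 1, 12, 20, 30, 5), 3),
    (datetime(2015, 1, 12, 20, 30, 40), 2),
    (datetime(2015, 1, 12, 20, 31, 10), -1),
]

per_minute = sentiment_by_minute(sample)
peak = most_intense_minute(per_minute)
print(peak, per_minute[peak])
```

Adding a location field to each tuple and grouping on it as well would give the same kind of summary per place.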

In business analytics, this connection might not be as fast or obvious. But it is possible to find, track and measure sentiment through time and space in exactly the same way.

More than ever, it’s important to stay competitive and know what customers are saying about your brand. Social media and sentiment analysis provide a very useful way to stay in touch. And they deliver results that are marketing gold.



Mario Carloni

Data Scientist
About Mario: Mario performs statistical analysis and reporting for Energy, Manufacturing, and Medical sectors. He is focused on analysing and visualising business logic to allow the discovery of systemic flaws. Prior to Syntelli, Mario was a Research Assistant at the Public Policy Center. While at PPC, Mario worked on social science research for evidence-based policymaking at the state, regional, and local levels for public, private, and nonprofit partners.
Mario received a B.A. in Political Science from the University of Massachusetts. While at UMass, Mario was International Orientation Leader, International Student Conversation Partner, and IT Assistant, gaining broad systems thinking at the technological and social level.
