Reshaping Data For Tableau With A Flat File Fix In 5 Steps
Working with survey data to play nicely with Tableau can be a challenge as Tableau (as well as most other data visualization tools) is structured to work most optimally with tall data sets. It can be the case the more often than not, survey data (as well as other types of data) is designed to be wide rather than tall. Here is an example below of what I mean by “tall” data and “wide” data.
note: click on image to view details
In the context of a survey, a wide data set will have a separate column for every question whereas a tall data set will have one “questions” column (column “I” in the picture above) and unique questions will be represented in different rows. If you find yourself working with wide data and need optimal utilization in Tableau, here is how you can reshape your data using tableau’s add-in for excel in 5 easy steps.
- Identify the columns on which you would like to be able to filter and copy and paste them to the front of a new excel workbook. This first step will allow all of your data to be linked and to filter on the columns identified. The data I’ll be using is restaurant survey data.
- Copy and paste all relevant data after your filter columns. Copy and paste all data that you want to be able to visualize in tableau (even if you also used it as a filter column).
- Download and install tableau’s excel add-in and use it to reshape your data. Follow the directions how to download and install the add-in here: https://community.tableau.com/s/question/0D54T00000G54lLSAR/installing-and-troubleshooting-the-tableau-addin-for-reshaping-data-in-excel
When installed, go to your excel file and click “Tableau”, “Reshape data”, and then select the first cell where your non-filter column data starts. In my case that starts in cell I2. Then click ok.
- Make a “survey helper file” (if needed). In our data example we were given data that was purely numerical, that is, questions “Q1” and “Q2” correspond to questions and groups of questions on a survey. In order to bring all these questions in automatically (without writing them out as aliases every time we make a viz) is to make a helper file that you can either use to blend with the primary data source (not recommended) or add this information to the reshaped data (recommended).
The reason I recommend adding this descriptive data to the reshaped data instead of using Tableau’s data blending feature is because I learned a valuable lesson on my last project: there are some limitations to using data blending. On 95% of your visualizations you will be able to effortlessly use Tableau’s data blending feature, but there was one functionality that I absolutely needed that data blending doesn’t allow: to sort on a field located in a different data source. In the next step I’ll show you to how to use a VLOOKUP function in excel to combine the data on these two files.
- Use the VLOOKUP function in excel to match question IDs with their respective wording in your survey helper file. Now that we have both of our files ready to merge, let’s do a VLOOKUP function to get all the helper file data into our primary survey data source.
- In the next data cell available (K2 in my case) type “=VLOOKUP(“ then click the value you want to lookup which is the Question ID, in my case, cell I2 and add a comma. So far it should look like this.
- In the next part of the function it wants us to define the table array we want to use for the lookup. So click on the excel workbook that has the helper file data and select all columns.
- Then add another comma, the column number from where you want to retrieve data, and then 0 for an exact match. In our case the column from where we want to retrieve data is the second one, so we’ll say “2”. It should look like this.
- Then double click the bottom right part of the cell with the lookup function and it will bring in the data all data from the secondary source into the primary.
Then copy and paste the contents of cell K2 into L2 and change the formula to this:
And then click the bottom right of the cell to bring down the data again. It should like this:
- The last step is to copy columns K and L and paste them as values. This will save it as text and not as lookup functions.
Now I can bring in this data into Tableau and it is fully optimized to work in Tableau. My “wording” automatically appears, I can filter the data, and do something that I wouldn’t be able to do if I blended the data- sort the wording of the question by the value.