Christopher Sardegna's Blog

Thoughts on technology, design, data analysis, and data visualization.


Using Excel and Tableau to Visualize Trends in Merit Badge Data

Using Excel and Tableau to Visualize Trends in Merit Badge Data

Excel is a powerful tool for transforming datasets for use in more powerful visualization engines. Tableau and Excel together make a dangerous team when leveraged correctly, allowing for clear and concise communication of actionable data efficiently.

Getting the Data

To begin, I used the dataset published on Scouting Magazine’s post about Merit Badge popularity. This post included some summary metrics but also included the raw data of how many merit badges were earned for the years 2013 through 2017. The data is presented in an HTML table, which we can copy and paste directly into an Excel workbook without too much hassle.

Sanitizing the Data

Once the data is in an Excel sheet, there are two additional dimensions we can pull from this webpage as well as some cleanup to do to the Merit Badge names.

Tagging Badge Types

The page lists for different asterisk levels denoting some classification about the requirements for Eagle:

To handle this, we can create a new “Eagle Required” column that leverages a simple formula to generate a dummy variable that allows us to differentiate between Eagle-Required and non-Eagle-Required merit badges:

=ISNUMBER(SEARCH("~*",A2))

This formula does two things:

  1. The SEARCH("~*",A2) part looks for the character string * inside of cell A2. Since by default Excel considers * to be a wildcard condition, we need to use ~* to escape it. This will return the position of * as a number if it exists in the string. If not, it returns #VALUE!
  2. The ISNUMBER() part checks if the result of the SEARCH is a number. If SEARCH returned a number, i.e., the position of an asterisk, the cell will display TRUE. If SEARCH does not return a number, i.e., there is no asterisk, the cell will display FALSE.

This means that We can now filter on TRUE/FALSE to see whether a merit badge is Eagle Required or not.

Cleaning Names

Since we do not want to see the Merit Badge names with asterisks next to them, we can take the column and paste it as values. Next, we can initiate a find and replace on the asterisk character on the range of Merit Badge names. Remember that we need to make the criteria to replace ~* instead of * due to the wildcard condition.

Tagging New Badges

The Scouting Magazine page uses colors to denote whether a merit badge was released in 2013 or later and considers these to be new badges. Excel cannot use a formula to check the color of a cell.1 Since there are only a handful, I tagged these manually in a new column.

Transforming the Data

Because the data has multiple value columns, it is not a simple task to pivot this in a useful way:

Badge Eagle Required New 2013 2014 2015 2016 2017
Personal Fitness True False 56,295 50,693 52,499 52,079 50,428
Programming False True 480 2,970 3,577 4,085 4,138

To pivot this data, we need to manipulate it such that each year column turns into a single column:

Badge Eagle Required New Year Value
Art False False 2013 28031
Art False False 2014 25438
Art False False 2015 24374
Art False False 2016 22990
Art False False 2017 21567

We can do this by using Get and Transform (previously called Power Query) in Excel. First, we need to make the source data we want to transform into a table, so select any cell with our Merit Badge data and hit CTRL+T to do so. Make sure “My table has headers” remains checked.

Next, use “From Table” in the “Get and Transform” section under “Data” in the ribbon. The key chord to do this is ALT+N → PT. This will open up the Query Editor in a new window.

Once this window is open, navigate to the “Transform” section in the Query Editor Ribbon. Select the five year columns by clicking on 2013 and then shift-clicking on 2017. Once the years are selected, we can select “Unpivot” in the ribbon under the “Any Column” section.

This will transform the data the way we need it, where there is only one value column with one row for each year.

To load this data into Excel where it is usable, in the Get and Transform ribbon, navigate to “Home” and select “Close and Load.” This will display the transformed data in a new sheet.

Now that the data is pivotable we can go ahead and run some analysis using a PivotTable or a PivotChart where the value for the years are separated longitudinally instead of categorically. However, this is also a perfect time to leverage Tableau’s visualization prowess.

Visualizing the Data

Tableau can connect to Excel documents to extract data from them. This extraction also supports queries like the one we just made to transform the data. In Tableau, select “Microsoft Excel” from the Connect menu and select the file we created above. We should see three entries here: two sheets (one for each sheet we created) and one Query (the query we created in Power Query). Drag the Query in and load that as the source.

Converting Dates

Tableau does not immediately understand that the “Year” column is a date and not a number, so we need to specify the data type. In the Data sidebar, right-click on the Year dimension and navigate to Change Data Type > Date.

To get these dates to display as just a year (2017) instead of MM/DD/YYYY, we need to drag it in as a variable. I put it in the column section so that it gets sorted left-to-right. Once the variable has been added, right-click and select “Year” to suppress the display of the interpolated month and day.

To compare Eagle-Required badge volume to normal badge volume, we can use a column chart. For columns, use Year and Eagle Required in that order. For Rows, put Value from the measures section.

To color code the columns, also drag the Eagle Required dimension onto the “Color” section of the Marks box.

This generates a nice bar graph of the volume earned over time for normal and Eagle-Required Merit Badges:

This evinces an interesting trend: While Eagle-Required badges seem to be more or less steady, non-Eagle required badges are declining year-over-year. This leads me to believe that more people are only doing the "bare minimum" instead of continuing to work after they make Eagle.

Charting Popularity of Badges Longitudinally

We can also chart the changes over time as a line graph. This allows us to more easily discern the differences in popularity of merit badges over time. To do this, we only need to add the Dimension Year as a Column and the Measure Value as a Row.

To color code the lines, also drag the Merit Badge dimension onto the “Color” section of the Marks box.

From here, we need to create some filters to make the chart more readable. Drag the Dimension Eagle Required and the Measure Value into the filters section. Right-click on each and select “Show filter” in the menu.

This will create two filters. One where we can specify if we want to include or exclude normal or Eagle-Required badges and one where we can restrict the number earned.

This generates a nice line chart. When filtered only on Eagle Required badges, we see the following trends:

This is interesting because it shows that most of the eagle required badges are relatively constant. Interestingly, Cooking spikes in 2014 to the highest point of any Merit Badge. This is because in 2014 it was added as a new requirement to make Eagle.

Also interesting is that First Aid is the most popular merit badge overall. This means that scouting leaders must push for this to be the first Eagle Required badge earned by scouts. Since other Eagle Required badges are all lower than this one, it makes sense to conclude that the added immediate importance of first aid over a badge like Citizenship in the Community influences the popularity.

Raw Data

The data comes from Scouting Magazine’s blog post entitled Presenting the 2017 Merit Badge Rankings: Which were the most and least popular?

You can access the Tableau file I generated here. I also uploaded the Excel file with the Query construction here. Note that since Get and Transform is a Windows-only feature, you cannot edit the query on a Mac2.


Discussion: r/DataIsBeautiful


  1. You can do this in VBA, but it is too complicated for a small project like this. Pasting from the website removes the formatting anyway.↩︎

  2. Please vote for this to be added here!↩︎