Christopher Sardegna's Blog

Thoughts on technology, design, data analysis, and data visualization.


Cleaning Team Liquid’s Historical Data

Cleaning Team Liquid’s Historical Data


My last post about the Counter-Strike Global Offensive Major in Columbus exploded, prompting more digging into better data. Along the way, there were several roadblocks that can be easily solved with a few simple tools in Excel.

Acquiring Data

Team Liquid, founded in 2000, is a leading worldwide professional esports organization. Analyzing their historical match data is a big task, but one that can be streamlined with several pieces of software. As before, HLTV.org’s brilliant searchable index of historical CS:GO matches becomes extremely useful. To narrow down the matches to the ones relevant to Liquid, filter on “Liquid” under Team and tick the “Has Demo” filter.

It should return about 93 matches1. Once downloaded, copy them to C:\Program Files (x86)\Steam\steamapps\common\Counter-Strike Global Offensive\csgo and open CSGO-Demos-Manager. Once analyzed, export the demos to a single file. This is where the fun begins.

Cleaning Data

As with nearly every raw data source, this one comes out messy and must be scrubbed before analysis can be done.

General Sheet

The winner column in the CSGO-Demos-Manager export has historically been wrong, so it is best to replace that column with an =IF that checks the Score team 1 and Score team 2 columns to find the proper winner. This formula can be used to find the winner at the half as well: simply add a new column and fill with the proper cell references to the Score 1st half team 1 and Score 1st half team 2 columns.

There is a far more pressing issue, however. All of the ESEA matches lack a team name. This is due to the nomenclature they use in their demo files. Instead of listing “Team 1” as “Liquid,” ESEA demos lost “Team 1” as “Team 1” and “Team 2” as “Team 2.” This is incredibly frustrating because the original method of linking the round winner to the Rounds sheet is now broken. To fix this, either omit the ESEA matches from the export and try again2 or manually find the proper team names.

The easiest way to do this is to select the ESEA matches in CSGO-Demos-Manager and export them all individually. Then, on the Players sheet of the individual match exports, manually match the team name with the player’s respective team and enter it in the proper cell on the General sheet of the group export.

For example, in the Players sheet of the individual match export above, the team name “Team 2” in the General sheet of the group export is actually “OpTic” and “Team 1” is actually “Liquid.” Here are the corrected cells, highlighted in green:

A lack of standardized nomenclature between leagues causes CSGO-Demos-Manager to list individual teams under multiple names, i.e. Liquid as “Liquid” and “Team Liquid.” To correct for this, a simple find-and-replace will suffice. This different team name nomenclature is a problem that affects almost all of the sheets and correcting the General sheet reduces the work later on.

Teams Sheet

Because teams have entries under different names, the Teams sheet is entirely useless. To see the proper statistics, replace the entire sheet with a PivotTable of the General sheet. Set the Row to focus on Winner and add the relevant columns as values.

This automates the model so that any more changes to the General sheet will be reflected by the Teams sheet.

Rounds Sheet

Finally, the Rounds sheet needs some attention. The Winner Clan Name column is incorrect due to the ESEA nomenclature3. The proper team names can be referenced from the General sheet with the new =IFS formula:

=IFS(E2="Team 2",VLOOKUP(A2,General!B:N,13,FALSE),E2="Team 1",VLOOKUP(A2,General!B:N,12,FALSE),E2<>"",E2)

The first part of the formula checks if the Winner column is “Team 2” and if it uses a VLOOKUP to pull Team 2’s name from the General sheet. If that case is not hit, it checks if the Winner column is “Team 1” and if it uses a VLOOKUP to pull Team 1’s name from the General sheet. If neither of these cases are hit it checks if the cell is empty. If the cell has data, it then copies whatever was in that cell. This way it will only alter the data if the string “Team 1” or “Team 2” is present.

To reference the match winner, add a new column at the end of the Rounds sheet like before. A simple formula like =VLOOKUP(A2,General!B:V,21,FALSE) will pull the proper match winner.

Summarizing Data

These changes allow us to quickly summarize Liquids historical performance. They have won 54% of the matches they play as well as 52% of the rounds. Of those round wins here is a simple breakdown by side using COUNTIFS(F:F,"Liquid",G:G,"T") and COUNTIFS(F:F,"Liquid",G:G,"CT"):

Data Count %
Wins as T 686 57%
Wins as CT 561 47%

This dataset is now ripe for expansion like the MLG Columbus analysis.


  1. I found 95 but 2 of the demos were corrupted. ↩︎

  2. This can be easily done by looking at the filename of the demos in column A and removing the respective files from the csgo folder where CSGO-Demos-Manager looks. ↩︎

  3. Assuming you did not omit the ESEA matches in a new export. If you ignored these matches, disregard these formulas. ↩︎

Discussion on r/GlobalOffensive