If you haven’t heard the term “big data” yet, don’t worry; it’s still a relatively new term to most pure sports fans, although this depends on how far these technologies have begun to change your favorite sports. As we will discuss below, hockey and soccer are two of the sports in which these techniques are being openly discussed, but there seems little doubt that big data is being used behind the scenes in almost every sport.
Sports & Big Data in the Media
The creators of “Worldwide Soccer Manager 2022” (also known as Football Manager 2022) made a huge deal this year of how their game supposedly allows players to use the very same large-data-set analysis techniques found in the real game to help make decisions about their starting line-up and formation against different teams, when to make substitutions, and other such critical choices.
We’ve heard a lot from those involved in preparing the US and Canadian hockey teams too, about how these same techniques are being used to help with their own decisions regarding who to take with them to Beijing for the forthcoming Winter Olympic Games, based on both individual and paired player performance.
Beyond the Field: How the Bookmakers are Using Data Analysis
Laws permitting online sports betting are popping up in every corner of the United States. On the first of January 2021, the first online betting site in Iowa went live, leading many to ask, once again: where do online sportsbooks (and indeed, all sportsbooks) get their odds from?
The true answer is becoming more and more complex all the time; in areas where many bookmakers operate on the same patch, such as in the United Kingdom, a kind of “cartel” situation arose, where all of the major names would trade data on what bets had been placed with them, allowing each of the members to adjust their odds accordingly.
When the bookmakers were called out on this practice, their answer was to combine into a smaller number of big names – you can’t be indicted for sharing information with your competitors once you and your competitors become parts of the same company, after all.
Back on the Field: Statistical Analysis
Today, analysis of previous seasons is combined with incredibly intricate data – not just that published in the league tables, but also very deep-level player analysis and a minute-by-minute evaluation and re-evaluation of previous games.
All this data is used to create models which, once again, are fostering competition across the betting industry – whoever’s analysis turns out to be the best will make the most money. This is exactly how it should be in a free-market economy such as the United States.
Statistical analysis techniques such as these rely on highly advanced math; this is surely the first time in history that many sports teams have seen fit to employ a team of mathematicians! Digging a little deeper into some of the complex scientific papers on the subject, it turns out that these techniques belong to a branch of “applied mathematics”, which allows you to infer specific recommendations from the “big data” set.
Statistical features of the data are considered: mean, variance, entropy, and minimum/maximum values, which, when combined, allow managers and coaches to look at player movement patterns in whole new ways which might not be obvious simply by watching video playback. Based on the results of this analysis, coaches can create more effective training plans than they ever could before.
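To make those features a little more concrete, here is a minimal sketch of how they might be computed for one player's tracking data. Everything in it is an assumption for illustration: the `speeds` values are made up, the helper names `shannon_entropy` and `summarize` are hypothetical, and entropy is taken here as the Shannon entropy of the values grouped into fixed-width bins, which is only one of several reasonable definitions.

```python
import math
from collections import Counter
from statistics import mean, pvariance


def shannon_entropy(values, bin_width=1.0):
    """Shannon entropy (in bits) of values grouped into fixed-width bins.

    Assumption: binning by bin_width is one simple way to turn continuous
    measurements into a distribution; real analyses may do this differently.
    """
    bins = Counter(math.floor(v / bin_width) for v in values)
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in bins.values())


def summarize(values, bin_width=1.0):
    """Return the statistical features named in the article for one series."""
    return {
        "mean": mean(values),
        "variance": pvariance(values),  # population variance
        "entropy": shannon_entropy(values, bin_width),
        "min": min(values),
        "max": max(values),
    }


# Hypothetical running speeds (metres per second) sampled during a match.
speeds = [2.0, 2.5, 6.0, 6.5, 2.2, 7.1, 2.4, 6.8]
features = summarize(speeds)
for name, value in features.items():
    print(f"{name}: {value:.3f}")
```

A player who alternates between jogging and sprinting will show a higher variance and entropy than one who moves at a steady pace, which is the kind of pattern that is hard to spot from video alone.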
Going Further: Factors Influencing Player Performance
Here is why these techniques are changing the games forever: whether we are talking about simple data analysis, such as the goal-line technology used in many major soccer tournaments today to determine whether the ball actually crossed the line, or the more advanced techniques we will discuss below, it is easy to look back at some of the most controversial games in sports history and draw conclusions about how things might have ended differently if these technologies had been available at the time of the game.
Sticking with soccer, Pappalardo et al. used a database of soccer logs which included close to 32 million unique events. These events covered 19,619 unique football matches involving 296 clubs and 21,361 players. Each logged event has a unique identifier, allowing it to be linked with other similar events as well as the teams, players, and match concerned, along with a timestamp, the position on the field at which the event took place, and tags for the type and subtype of the event.
Without going much deeper into this (you should have no problem finding the full report by searching for Pappalardo), it is again very easy to see how data this complex, this granular, could be used to great effect by managers. Not every relationship is easy to determine in the middle of fast-paced action on the field. Big data and statistical analysis allow managers, coaches, and even players, to break free of the restraints that traditional sports analysis has placed on them.
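For readers who like to see structure, a single logged event of the kind described above could be modeled roughly as follows. This is only a sketch: the field names, the 0 to 100 coordinate scale, and the sample values are all assumptions made for illustration, not the actual schema used in the study.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class MatchEvent:
    """One logged event; every field name here is hypothetical."""
    event_id: int       # unique identifier of the event
    match_id: int       # match in which the event took place
    team_id: int        # team of the player involved
    player_id: int      # player involved
    timestamp_s: float  # seconds since the start of the period (assumed unit)
    x: float            # position on the field, assumed 0-100 scale
    y: float
    event_type: str     # tag for the type, e.g. "pass"
    subtype: str        # tag for the subtype, e.g. "cross"


def events_by_player(events):
    """Group events by player, ordered by time, so they can be analyzed together."""
    grouped = defaultdict(list)
    for event in sorted(events, key=lambda e: e.timestamp_s):
        grouped[event.player_id].append(event)
    return dict(grouped)


# Made-up sample events for two players in the same match.
sample = [
    MatchEvent(3, 10, 100, 7, 95.0, 80.0, 30.0, "shot", "on target"),
    MatchEvent(1, 10, 100, 7, 12.5, 50.0, 50.0, "pass", "simple pass"),
    MatchEvent(2, 10, 200, 9, 14.0, 45.0, 55.0, "duel", "ground duel"),
]

grouped = events_by_player(sample)
for player_id, player_events in grouped.items():
    print(player_id, [e.event_type for e in player_events])
```

Because every event carries the match, team, and player identifiers, the same list can be regrouped by any of them with only a small change to the grouping key.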
Expect to hear more and more about this in every sport in the coming years – some corners of the NFL aren’t keen on big data getting too involved in their game, believing that the status quo is perfectly suitable, but unless the league bans such techniques outright, it is only a matter of time before some teams begin to use them. Just as in F1, once one team has an advantage the others have no choice but to copy them in order to stay competitive.