In the weeks preceding the Australian Open, we have decided to open up a new front and treat it with the calm that only the absence of the tournaments’ daily grind can give us. In this series of articles, we will talk about tennis data with the aim, a bit ambitious to be honest, of guiding you from the simplest to the most complex concepts in this field, to help you understand why the future of tennis, and its present too, will be strongly shaped by numbers and data.
Describing the history of a match in order to extract patterns and tactical indications is not a trivial process. It is even harder to step back from the perspective of a single match and draw trend lines that identify styles of play. Curiously, a binary sport such as tennis, with discrete scores and phases of play (and thus perfectly suited to statistical analysis), is among those lagging behind in this field. Despite a significant delay compared to sports such as basketball and baseball, which are rooted in sabermetrics-centric America, tennis has also taken a step forward in this direction in the last few years.
On the WTA Tour, the analysis tools provided by SAP allow coaches to receive detailed information on match performances in real time, and the new deal signed with Stats Perform promises to take the concept even further. On the ATP Tour, another IT giant, Infosys (an Indian company with over 200,000 employees), has recently renewed its agreement as Global Technology Services Partner and Digital Innovation Partner until 2023.
The use of data analysis techniques is becoming more and more common among top players, and it is also driving a cultural change: data analysis experts are needed to assess large amounts of data on the one hand and to present coaches with easy-to-read information on the other. An athlete’s team can use this information both to set tactical plans for individual matches and to guide the long-term development of his or her playing style.
Data collection and processing are handled with a degree of professionalism that ranges from semi-artisanal to more sophisticated approaches. At the same time, to respond to this growing demand, companies are starting to provide analysis and support services for statistical interpretation. The forerunner is probably Dartfish, a video tagging tool (it assigns ‘tags’ to the different events of a match, broken down point by point) first used by Craig O’Shannessy, the best-known analyst in the field, who works as a tactical analyst in Djokovic’s team.

At one end of the spectrum, we find the classic approach, in which statistical information is considered useful but is not built into a structured decision-making process: from time to time a piece of information reaches the coach, who, on the basis of his experience, interprets it from a layman’s perspective. An example is Nicolas Massù, whose contribution to Thiem’s growth is unquestionable and rests on his tennis wisdom.
At a higher level of awareness, we probably find the majority of coaches, who, like Massù, lack the time and skills needed to embark on a structured process of data analysis, but nonetheless feel the need to have this information funnelled to them in some way. The ideal for this type of coach would be a qualified counterpart who combines tennis and data analysis skills, can take part in decision-making processes, speaks the same ‘language’ as a tennis coach, and provides easy-to-understand insights. An example is Medvedev’s coach, Gilles Cervara (up to a certain point in his career), who until the summer of 2019 didn’t use this type of analysis but left the door open to possible collaborations.
The next step is a well-defined collaboration, in which statistical analysis finally finds a place in the player’s team. At this level we find collaborations with individual practitioners who combine tennis skills with a professional approach based on data collection and processing. In these situations tennis competence is predominant and, combined with artisanal (but effective) match charting techniques, makes it possible to obtain additional high-value information that can be successfully integrated into the tactical preparation of matches. An example is Gilles Cervara… 2.0, who began his collaboration with a Swiss consultant, Fabrice Sbarro, in the summer of 2019. By all accounts, this professional has brought significant added value. He is another example of the ‘craftsmen’ with tennis wisdom who have paved the way for tennis analysis.
Finally, the last step on this acceptance curve for statistical analysis is the inclusion in the coaching process of services offered by specialised companies, such as Golden Set Analytics (GSA), which assesses the performances of more than 150 players on the ATP circuit through the work of a team of experts. In addition to GSA, a leader in this sector, other companies provide advanced services, such as Data Driven Sports Analytics (DDSA) and Sportiii Analytics. The latter company, whose founder has discussed its work in an interview, mixes advanced big data and data representation techniques and is moving, just like DDSA, towards incorporating automatic capture techniques from video sources.
Referring back to the graphic representation at the beginning of this article, the tennis world is entering the phase of awareness on the part of the business domain (i.e., players and coaches) and is taking its first steps towards data analysis applications, which, enriched over time with advanced data processing features, will allow the full deployment of data science techniques.
In conclusion, the panorama is extremely varied. At one end are enthusiasts like Djokovic, who has started to use statistical reports thanks to his collaboration with O’Shannessy, in order to identify the game patterns to use in crucial moments. Another player who has said he uses O’Shannessy’s services is Berrettini, who, unlike Djokovic, prefers not to receive very granular information, but only “pills” of statistical data that can be of immediate help without the risk of confusing him. Zverev has also said he uses data analysis services: he spoke about it at the 2019 ATP Finals, describing it as an important help in framing his rivals’ playing styles. A further example of virtuous collaboration, this time in the women’s game, is the one enjoyed by Bianca Andreescu, who, thanks to the support of Tennis Canada, was able to receive ad hoc analytical reports for her own match preparation.
Federer’s position is apparently vaguer: he has repeatedly said that data are interesting but must be handled carefully so as not to be misleading. However, in addition to being a GSA client, according to a leak reported some time ago by the Telegraph, Federer has a privileged relationship with the company: for a higher price, he would have access to exclusive insights that are not available to his opponents.
Nadal’s position is much more conservative. He has repeatedly said that he chooses to rely on Moya’s tennis savvy for his match preparation (and given the masterpiece of the last final in Paris, we don’t dare to reproach him). According to Nadal, his use of analysis techniques is mostly limited to sensors for the bio-mechanical representation of his shots and to acquiring information about his own game, with no attempt at tactical comparison with his opponents.
Having concluded this first overview of the most prominent tennis players using a ‘data driven’ approach, we will see you in the next article in the series, in which we will look specifically at the most important metrics in the tennis world.
Article by Federico Bertelli; translated by Alice Nagni; edited by Tommaso Villa