Football’s data delusion

A new book by Rory Smith looks at why the English Premier League is still searching for its Moneyball moment.

By Simon Kuper

Liverpool FC’s research department has featured an astrophysicist, a chess champion and an alumnus of the Large Hadron Collider at Cern near Geneva, all led by a polymer physicist from Cambridge. In 2017 the department lobbied the club’s manager, Jürgen Klopp, to sign Egyptian forward Mo Salah from Roma.

The analysts used a statistical term, covariance, which measures the relationship between different elements. What was the covariance between, say, two forwards? Did they complement each other? The analysts concluded that Salah would combine well with Roberto Firmino, a striker already recruited largely because of his data. Klopp initially rejected Salah, but eventually the analysts persuaded him. In 2019 and 2020, Liverpool won the Champions League and then their first Premier League title since 1990.
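As a rough illustration of the statistic itself, and not of Liverpool's actual model, which the book does not spell out here, covariance can be computed in a few lines. The figures below are invented, and the names `forward_a` and `forward_b` are hypothetical stand-ins for any two players' match-by-match numbers.

```python
from statistics import mean


def sample_covariance(xs, ys):
    """Sample covariance of two equal-length sequences (divides by n - 1)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 values each")
    mean_x, mean_y = mean(xs), mean(ys)
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (len(xs) - 1)


# Invented figures: chances created per match by two hypothetical forwards.
forward_a = [1, 2, 3, 4, 5]
forward_b = [2, 4, 6, 8, 10]

cov = sample_covariance(forward_a, forward_b)
print(cov)  # 5.0
```

A positive value means the two sets of numbers tend to rise and fall together, a negative one that they move in opposite directions. Whether that amounts to two forwards complementing each other is a judgement the statistic alone cannot make.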

Data analytics in football has leaped ahead over the last decade, argues the New York Times’ chief soccer correspondent Rory Smith in his breezy, readable, useful introduction to the story. Analytics has made a historically stupid game more intelligent. Smith goes so far as to herald an “unspoken revolution”, though that’s not right: football analytics is much trumpeted and not quite a revolution. Rather than transforming the game, analytics has so far only improved it marginally.

The origins of data in football are conventionally timed to 3.50pm on 18 March 1950, when the RAF accountant Charles Reep (glamorously ranked “wing commander”) began logging match events during the second half of a Swindon Town game. In that half he recorded 147 attacks by Swindon. Extrapolating from this tiny sample, Reep calculated that 99.29 per cent of attacks in football failed.

But only this century did analytics seep into clubs. Michael Lewis’s 2003 book Moneyball, which showed how analytics had transformed American baseball, inspired many in football, especially in England. Video games like the FIFA and Pro Evolution Soccer series, which calculated complex ability ratings for players, also fed interest in data. And as clubs got richer, they began hiring statisticians to see if they might add any value.

Increased computing power produced data more telling than the useless old stats on tackles made, passes completed and kilometres run. The logging of match events was outsourced to cheap hires in the Philippines, Egypt or Russia, who not only documented where a player had passed from but also tried to quantify how much pressure he had been under. New “tracking data” helped show what each player did in the 89 minutes a game, on average, that he didn’t have the ball.

Gradually, modern football’s key metric emerged from the fog: “expected goals” (xG), a measure of how many goals a team would typically have scored based on the quantity and quality of its chances in a match. Once a team’s xG had been determined, you could break down each player’s contribution to that number.
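In its simplest form the arithmetic can be sketched as follows. This assumes each chance has already been given a probability of being scored (real models estimate that from shot location and other factors, which is not shown here). The shots, the probabilities and the player labels are all invented for illustration.

```python
from collections import defaultdict

# Invented shots: (hypothetical player, assumed probability the chance is scored).
shots = [
    ("Forward A", 0.25),
    ("Forward B", 0.5),
    ("Forward A", 0.125),
    ("Midfielder C", 0.125),
]


def team_xg(shots):
    """Team xG: the sum of the scoring probabilities of all its chances."""
    return sum(probability for _, probability in shots)


def player_xg(shots):
    """Break the team total down by the player who took each chance."""
    totals = defaultdict(float)
    for player, probability in shots:
        totals[player] += probability
    return dict(totals)


team_total = team_xg(shots)
by_player = player_xg(shots)
print(team_total)  # 1.0
print(by_player)
```

On these made-up numbers the team "should" have scored one goal, of which 0.375 is credited to Forward A. Crediting only the shooter is the crudest possible breakdown; a fuller one would also give weight to the pass that created the chance.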

An iron rule of football is that people from outside the sport are cleverer than insiders. Perhaps to combat this, insiders have produced a dogma which holds that only ex-players truly understand the game. Many managers were suspicious of data analysts whose rival form of expertise could make them redundant. Often, the geek would be shut away in a distant office where nobody could hear him scream.

About three quarters of the book is a story of disappointment. Smith follows analytics enthusiasts who are employed by clubs, and ultimately depart without having made much difference. They tell him about their brilliant data-driven signings of players, but – as he acknowledges – less about their bad data-driven signings. Often, even when the analytics evangelist identifies a gem – like Hendrik Almstadt urging Arsenal to buy the young Kevin De Bruyne – the club decides against. The temptation for Smith in writing a book such as this is to overclaim for data, and it’s to his credit that he rarely does. But the cumulative effect is anticlimactic.

Most clubs today use data primitively and only as a plug-in extra. In an urgent industry, they lack time and expertise: often, nobody on staff has a maths degree. A club that wants to sign a player might cherry-pick data points to justify its predetermined decision. Some clubs monitor sprint data to identify shirkers. Footballers increasingly understand this, and find ways to game the analytics. When matches stopped for injury breaks, Manchester City’s veteran right-back Pablo Zabaleta would boost his stats by running sprints, writes Smith.

But data hasn’t particularly powered the game’s ceaseless evolution. Football’s most influential coach, Pep Guardiola, spends much more time watching videos than studying spreadsheets.

Only late on does Smith identify the football clubs doing a Moneyball: Brighton, owned by professional gambler Tony Bloom, and Brentford and FC Midtjylland in Denmark, both owned by Bloom’s former employee Matthew Benham. Armed with numeracy and oodles of data, these men have lifted relatively small clubs into the top division. They’ve identified undervalued players and found data-led ways to score from the set pieces – free kicks, corners, even throw-ins – that other clubs waste. (The convention is that the star player is allowed to line up a free kick, go into meditation mode and then blam it into the crowd when he should have passed.)

“Midtjylland are now regularly the most effective team at set pieces… in Europe,” writes Smith. Benham’s long-time point man Rasmus Ankersen says: “Midtjylland is further ahead of the team in second than the team in second is ahead of the team in 73rd.”

But it’s easier to achieve such a transformation at their level, football’s middle reaches, where undervalued players abound. At the top, where players are closely scrutinised, talent tends to be already recognised.

Smith needs an elite example of a Moneyball team and he picks Liverpool, owned by the American commodities trader John Henry. Ian Graham, the club’s director of research, believes the main impact analytics can make is on recruitment. Almost all of football’s money goes on talent. If a club can find good players, they will generally do the rest themselves. And Liverpool does indeed seem to recruit efficiently. In the new edition of our book Soccernomics, Stefan Szymanski and I show that from 2011 through to 2021 the club’s net spend on wages plus transfers was lower than that of its rivals Manchester City, Chelsea and Manchester United.

But Smith misses the true recruitment star: Tottenham Hotspur. Spurs in that decade regularly challenged competitors at the top of the table while averaging an annual net spend of just £283m, half that of Chelsea and less than half that of the two Manchester clubs. Smith does discuss Spurs’ early use of analytics: the data firm Decision Technology was contracted to the club until 2018. But even its staff don’t know whether anyone at Spurs so much as read its stream of reports. What happens inside Daniel Levy’s Spurs is a black box. I’m guessing, though, that they have replaced Decision Technology with world-class in-house analytics.

Smith’s focus on Liverpool over Spurs illustrates the book’s limitations. The great barrier in football writing is access: clubs rarely give it. So when insiders do talk, the temptation is to exaggerate their roles. Smith ends up apparently unsure of what he thinks. The book’s subtitle is “The Story of How Data Conquered Football and Changed the Game Forever”, but his conclusion is more downbeat. He quotes Daryl Morey, pioneer of analytics in basketball, as saying that soccer’s “data sucks, no one cares, and no one should”. Many analysts inside clubs sympathise with that view, Smith admits.

There are new frontiers ahead, though. In 2017, AlphaZero, an artificial intelligence program created by Google’s research lab DeepMind, beat the world’s best computer chess engine, having taught itself chess in a matter of hours. AlphaZero, blind to tradition, reasons out the best way to play. Liverpool are now reportedly investigating whether AI can reveal the secrets of football. It may show, for instance, that you should never shoot from a free kick. Perhaps football analytics has barely got started.

Rory Smith has seen inside parts of football, and has good stories. But the whole feels a touch shallow, as reflected in hurried, magazine-feature prose: “Mayfair, with its grand Georgian façades, its private members’ clubs, its dazzling veneer of money”; “the fast-talking, somewhat hot-headed Bostonian”; “park its tanks on its old rival’s lawn”; “caught the gimlet eyes of teams higher up the food chain”; and so on. Like most of us outsiders, he has run into football’s locked doors.

Expected Goals: The Story of How Data Conquered Football and Changed the Game Forever
Rory Smith
Mudlark, 304pp, £20

Simon Kuper writes for the FT

This article appears in the 28 Sep 2022 issue of the New Statesman, The Truss Delusion