Despite the rise of big data and data analysis in sports, few people think of the NBA as a treasure trove for analytics. Perhaps in response to this preconception, the NBA hosted its second annual hackathon on Sept. 23-24 in New York City.

This year’s winners were a pair of former Princeton students, Harold Li ’15 and Barbara Zhan ’16. Competing as “Team Data Buckets,” named after a blog the pair had started while still at Princeton, the duo created a winning submission that featured a decision-making tool for scoring basketball games according to entertainment value.

Perhaps the most difficult part of the project was defining broad, abstract terms such as “entertainment value” and working out how to translate that ineffable quality into something tangible, such as value for the league.

“We were forecasting TV and web popularity, and attendance of actual games to inform a popularity metric and how much that meant in dollars,” explained Zhan. “For example, if a game is more watched by people, then it's going to rake in more revenue.”

The two found several leading indicators that forecasted a game’s success. Early ticket sales were often a good sign for games, while the prevalence of players with high jersey sales also factored into a match’s ultimate entertainment value. In addition, certain teams seemed to bring high ratings, such as the Golden State Warriors, Boston Celtics, Cleveland Cavaliers, and Oklahoma City Thunder. Even less successful teams, like the New York Knicks, often managed to benefit from their geographical associations.

Li and Zhan competed against undergraduate and graduate students from across the United States and Canada. Teams had 24 hours to complete their projects before presenting to a panel of judges, NBA executives, and members of the media. This year’s competitors had the option to choose between two competitive categories: basketball analytics and business analytics. Li and Zhan won first prize in the business analytics track.

Last year’s winning submission featured an analysis of “Hero Ball,” exploring whether star players attempt to carry too much of the offensive load in the postseason. Unsurprisingly, the answer was yes: star players try to carry the load too often, to the detriment of their teams.

While this analytical framework might be new for a lot of sports fans, Li and Zhan have been working on many similar problems for their blog, Data Buckets.

“We started it when we were students at Princeton, learning the theory of data science,” Li recounted. “We wanted to spend more time applying analytics to the problems we were interested in. The big distinction between our blog and our coursework was that we had to formulate the problems and scrape the data ourselves.”

A recent post tackled the question of which player was the most “clutch” in the 2014-15 NBA season. Their answer, by the way, was Houston Rockets player James Harden.

While basketball certainly lends itself to statistical analysis, the two have not restricted themselves to just that sport, tackling similarly interesting questions in tennis, soccer, and other games. In fact, the two are huge tennis fans and Nadal supporters.

As part of the top prize, the team will have lunch with NBA commissioner Adam Silver and receive an all-expenses-paid trip to the NBA 2018 All-Star Game in Los Angeles. The duo will also receive tickets for a game in an arena of their choosing.

For these two alumni, it seems that applying data analytics to the court is nothing but net. 
