Super Bowl XLIX – data visualization using TorQ and d3.js

5 Feb 2015

Matt Doherty

Rumour has it some sports happened in between Katy Perry and commercials

 

This weekend saw the New England Patriots face the Seattle Seahawks in the annual American sports-travaganza that is the Super Bowl. It was the most closely matched and exciting Super Bowl in years, but what does this have to do with data visualization and analytics? Sports – in particular the big American sports of baseball, basketball and football – are increasingly data driven. Real-time statistics, biometrics and numbers in general play an ever-increasing role in watching and coaching these games. Most of these datasets are fiercely guarded and proprietary. However, there is one source of sports data that is freely accessible and full of rich insight: sports exchange data. In-play odds on Super Bowl XLIX provide the perfect dataset to showcase a cool use for some of the technology developed here at AquaQ Analytics.

kdb+ is a technology primarily designed for the capture, storage and analysis of financial data, typically equities, fixed income, foreign exchange etc. The data produced by trading activity on sports exchanges is remarkably similar to financial data. On these exchanges users trade with each other rather than against the house. Sports exchanges operate in a very similar manner to stock exchanges such as the LSE or the NASDAQ: they match people wanting to back or lay a particular outcome, i.e. buyers and sellers. As such, kdb+ combined with our TorQ framework is the ideal technology platform.

TorQ forms the basis of a production kdb+ system by implementing some core functionality and utilities on top of kdb+, allowing developers to concentrate on the application business logic. We have developed an extension to TorQ that allows us to capture and store real-time odds from a popular sports exchange for any event and market, and to leverage kdb+’s extensive analytical capabilities to probe this data.

So, let’s have a look…

Super Bowl XLIX

[Interactive visualization: https://aquaq.co.uk/wp-content/uploads/datavis/superbowl/vis.html]

Live odds data visualized

Data is always much easier to understand when displayed visually in graphs and charts than as numbers in a spreadsheet. In the graphic above we’ve made use of Mike Bostock’s d3.js, which allows us to attach data to HTML5 elements. So what are we looking at here? The visualization compresses over 300,000 data points collected during the game into something much more easily digestible. It’s really three separate charts, displaying three different sets of data, so let’s go through each in turn:

  • The first plot shows the live odds based on matched trades from the exchange. The implied win probability is 1 divided by the decimal odds. So for example if the average odds for a given outcome were 2 (or 1 to 1; even odds), then the implied win probability is 50%. If the odds were 3 (2 to 1), then the implied win probability is 33%, and so on. Here the odds for the two outcomes are displayed in a stacked graph, so we can see that the total probability is always 100% (minus some noise). In other words, there must be a winner!
  • The middle plot shows the score and important in-game events. Hovering over the bubbles and boxes here will display a tooltip saying what happened or the score at that time. For example the green circle near the middle indicates the half-time break. You can see the trading volume dropped off as everyone was too busy watching Katy Perry.
  • The third and final plot shows the volume traded against time, with the colour of the shading indicating the market the volume was traded in. In a sports exchange each outcome has its own market, which can be backed or laid, i.e. you bet on whether you think this outcome will happen or not. In this case – since there are only two possible winners – backing one is the same as laying the other.
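To make the arithmetic behind the first and third plots concrete, here is a minimal Python sketch of the conversion from decimal odds to implied probability, and of the back/lay equivalence in a two-outcome market. It is an illustration only, not the code behind the visualization (which runs on kdb+ and TorQ). The function names and the sample odds are hypothetical, chosen purely for the example.

```python
def implied_probability(decimal_odds: float) -> float:
    """Implied win probability for decimal odds, e.g. 2.0 -> 0.5."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return 1.0 / decimal_odds


def normalise(probabilities: dict) -> dict:
    """Rescale implied probabilities so they sum to exactly 1.

    Raw implied probabilities from matched trades rarely sum to exactly
    100%, which is the "noise" visible in the stacked chart.
    """
    total = sum(probabilities.values())
    return {outcome: p / total for outcome, p in probabilities.items()}


def equivalent_back_odds(lay_odds: float) -> float:
    """Decimal odds of backing the other outcome in a two-outcome market.

    Laying at odds o risks (o - 1) to win 1, which is the same payoff
    as backing the opposite outcome at odds o / (o - 1).
    """
    if lay_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    return lay_odds / (lay_odds - 1.0)


if __name__ == "__main__":
    # Hypothetical odds, NOT taken from the real market data.
    sample_odds = {"Patriots": 1.8, "Seahawks": 2.2}
    raw = {team: implied_probability(o) for team, o in sample_odds.items()}
    for team, p in normalise(raw).items():
        print(f"{team}: {p:.1%}")
    print("Lay at 3.0 is equivalent to backing the other side at",
          equivalent_back_odds(3.0))
```

With only two possible winners, the two normalised probabilities are complements of each other, which is why the stacked chart always fills the full height.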

Visual match report

So from this market data we can really see the story of the game from a whole different perspective, and a number of things really jump out:

  • We can clearly see both teams are very evenly matched to begin with. Through the first half the Patriots score twice, and the Seahawks answer both times. In the 3rd quarter the Seahawks take control, only to have the Patriots make a heroic comeback in the 4th, just in time for the dramatic finish in the final seconds where the odds go crazy.
  • We can clearly see the odds tracking the score, and in most cases the market seems to anticipate scores happening. This is because of the teams’ field position (as a team gets closer to the end zone, the odds shift because it is more likely to score soon).
  • There are a number of points in the game where the odds shift massively very quickly. In particular, in the dramatic final minutes of the game you can first see a big shift to Seattle when Russell Wilson makes a big pass to put the Seahawks close to the end zone. There is then an even more dramatic shift back when his next pass is intercepted at the goal line to seal the victory for New England.

In many ways this is the visual analogue of a match report, where instead of words we’re using colours and shapes to tell the story of the game. A picture is worth a thousand words! The features above are just a few quick observations based on this one visual. I’m sure there’s plenty more insight waiting to be discovered in such a rich dataset.

Until next time!

At AquaQ our experience to date is predominantly in the capital markets industry. However, as you can see, the TorQ framework can easily be extended across domains into other sectors. We would be happy to engage with you either in implementing and customizing TorQ, or in bespoke development and support of incumbent systems.
