Saturday, September 14, 2013

Funny statistics in stock market data

Market data structure functions

I recently got some minute-level stock market data from The Bonnot Gang for some data analysis (and stat arb design) purposes, and I noticed some funny behavior in the structure functions of the data. The concept of a structure function may not be very widely known among quants, data analysts and economists, so here's a definition:

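Suppose there's a time series $X_t$. The structure function of $X_t$ is defined as

$$S_n(\tau) = E\left(\left|X_{t+\tau} - X_t\right|^n\right),$$

where for a given sample of data you just replace the ensemble expectation $E(\cdot)$ by the sample mean, $\frac{1}{N}\sum_{t=0}^{N}(\cdot)$.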
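To make the definition concrete, here's a minimal sketch of the sample structure function in plain Python. It assumes the usual definition with absolute increments, $S_n(\tau) = E(|X_{t+\tau} - X_t|^n)$, and it averages over the increments that actually exist for a given lag (so it divides by the number of available increments rather than by the full sample length). The function name `structure_function` and the toy series are made up for illustration; this isn't the code behind the plots below.

```python
def structure_function(x, tau, n):
    """Sample estimate of S_n(tau): the mean of |x[t + tau] - x[t]|**n."""
    if not 1 <= tau < len(x):
        raise ValueError("tau must satisfy 1 <= tau < len(x)")
    # All increments at lag tau that fit inside the sample
    increments = [x[t + tau] - x[t] for t in range(len(x) - tau)]
    return sum(abs(d) ** n for d in increments) / len(increments)


# Toy series (hypothetical values, not market data)
toy = [0.0, 1.0, 3.0, 6.0, 10.0]

# Lag-1 increments are 1, 2, 3, 4, so S_2(1) is the mean of 1, 4, 9, 16
s2 = structure_function(toy, tau=1, n=2)
print(s2)  # 7.5
```

For price data you'd typically feed in log prices, so the increments are log returns over $\tau$ minutes.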

These types of structure functions have been studied in finance for some time now, in the context of similarities between financial markets and hydrodynamic turbulence. I think it all started in 1996 with the paper Turbulent cascades in foreign exchange markets by Ghashghaie et al. They computed the structure functions for some FX market data and found a scaling relation $S_n(\tau) \propto \tau^{\xi_n}$, where $\xi_n$ is a concave function of $n$, implying multiscaling in FX markets, similar to hydrodynamic turbulence (BTW their conclusions about the result were a bit out there, but I guess the data analysis is still good).
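If you want to estimate a scaling exponent yourself, one simple way (a sketch, not necessarily what Ghashghaie et al. did) is a least-squares fit of $\log S_n(\tau)$ against $\log \tau$; the slope is the estimate of $\xi_n$. The example below uses a simulated Gaussian random walk as a stand-in for real data, because there the answer is known: $\xi_n = n/2$, so $\xi_2$ should come out close to 1. The function names, the lag grid and the seed are all arbitrary choices for illustration.

```python
import math
import random


def structure_function(x, tau, n):
    """Sample estimate of S_n(tau): the mean of |x[t + tau] - x[t]|**n."""
    increments = [x[t + tau] - x[t] for t in range(len(x) - tau)]
    return sum(abs(d) ** n for d in increments) / len(increments)


def scaling_exponent(x, taus, n):
    """Slope of an ordinary least-squares fit of log S_n(tau) on log tau."""
    log_tau = [math.log(tau) for tau in taus]
    log_s = [math.log(structure_function(x, tau, n)) for tau in taus]
    mean_lt = sum(log_tau) / len(log_tau)
    mean_ls = sum(log_s) / len(log_s)
    cov = sum((a - mean_lt) * (b - mean_ls) for a, b in zip(log_tau, log_s))
    var = sum((a - mean_lt) ** 2 for a in log_tau)
    return cov / var


# Simulated Gaussian random walk (assumption: a stand-in, not market data)
random.seed(0)
walk = [0.0]
for _ in range(20000):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))

taus = [1, 2, 4, 8, 16, 32, 64]
xi_2 = scaling_exponent(walk, taus, n=2)
print(f"estimated xi_2 = {xi_2:.3f}")  # should be close to 1 for a random walk
```

Repeating this for several values of $n$ and plotting $\xi_n$ against $n$ shows whether the exponents bend away from a straight line, which is what multiscaling means in practice.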

So I did some of my own data analysis with the Bonnot Gang data (I hope it's not bad data!). Here are a few plots of the structure functions, first for n=1:


Then n=3:


This is close to linear, i.e. $\xi_3 \approx 1$, as in turbulence. Then n=10:


Clearly you can't fit a single power law to all of this, but there seem to be clear power-law regimes, with breaks at about 6, 18, 60 and 180 minutes! I don't know the reason for this, but if I had to guess, I'd say it's because of traders/algorithms operating on different data timeframes... or maybe it's because of the finite tick size...
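One quick way to look for regime breaks like these without eyeballing a plot is to compute the local log-log slope between neighbouring lags: within a power-law regime the slope stays roughly constant, and it jumps where the regime changes. Here's a rough sketch on a synthetic piecewise power law. The break at 6 and the exponents 0.5 and 1.0 are invented for the example, not measured from the data, and `local_slopes` is just a hypothetical helper name.

```python
import math


def local_slopes(taus, values):
    """Slope of log(values) against log(taus) between each pair of neighbours."""
    slopes = []
    for i in range(len(taus) - 1):
        d_log_tau = math.log(taus[i + 1]) - math.log(taus[i])
        d_log_s = math.log(values[i + 1]) - math.log(values[i])
        slopes.append(d_log_s / d_log_tau)
    return slopes


# Synthetic structure function (made-up numbers): exponent 0.5 up to tau = 6,
# exponent 1.0 above it, continuous at the break
taus = [1, 2, 3, 6, 12, 18, 36]
values = [t ** 0.5 if t <= 6 else 6 ** 0.5 * (t / 6) ** 1.0 for t in taus]

slopes = local_slopes(taus, values)
for left, right, slope in zip(taus, taus[1:], slopes):
    print(f"tau {left:>2} -> {right:>2}: local slope {slope:.2f}")
```

On real data the local slopes will be noisy, so you'd probably want to smooth them or fit each suspected regime separately before reading anything into the break points.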

Anyway, I don't have time to get to the bottom of this, but maybe someone else will... so if you see this stuff in a paper someday, you saw it here first!! ;)

Written with StackEdit. Try it out, it's awesome!! You can do MathJax, sync everything with Google Drive or Dropbox, and publish directly to Blogger, Wordpress, Tumblr etc.!