Wednesday, November 30, 2005

Technical Trading, Neural Networks and the All-Mighty Bid-Ask Spread

I haven't posted in a while. In the evenings I either write on my blog or I write code, and lately it has mostly been code. I hope to post results from some of these experiments later, but in the meantime here is a summary of what I've been working on over the past month or so.

* I finally got my option data into a usable form. I used it to backtest various simple volatility trading strategies involving dynamic hedging. I was mainly interested in finding some useful long-gamma scalping strategies. Nothing really interesting came of this. The main value of these experiments was as a good practical test of my data access and cleaning routines. Bad data points were a big problem. This project forced me to write more robust data-scrubbing routines, which will be invaluable later on.

Generally the long-gamma trading ideas I tested showed piss-poor results. I was surprised at how poorly hedging the options with stock seemed to work. One revealing experiment was my first one. I tested buying an ATM option and replicating it with Black-Scholes, hedging the delta with stock using the end-of-day prices for both the options and the stock. One would expect such a basic strategy not to make significant profits or losses (assuming no transaction costs). Instead, going long options and hedging the delta was almost always a big loser on the stocks I tested. This was the case even if I simulated buying the option at the bid price and closing it out at the offered price, as a market maker would. That replication via Black-Scholes works so poorly using end-of-day data suggests to me that maybe I need to be testing long-gamma volatility trading strategies with intra-day (high-frequency) data.

Now my data is from 2003 to 2005, a period over which volatility has generally trended down, so I would not expect a straight, unfiltered option-buying strategy to perform well in general, but I also would not expect it to fare as poorly as it did. I can't believe that options trade at such a high premium to actual realized volatility as these results would suggest. I think I read in Taleb's Dynamic Hedging that historical volatilities measured at higher frequencies tend to be higher. This would explain the results I was getting from these experiments with end-of-day data. Presumably, if historical volatility really is higher when it is measured at higher frequencies, then adjusting the delta hedge at higher frequencies would be more profitable.
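For illustration, here is a minimal sketch of the kind of end-of-day replication test described above. It is not the code used for the backtests; the function names (bs_call, hedged_long_call_pnl) and all parameters are hypothetical. It assumes a European call with no dividends, a constant implied volatility, no transaction costs, an option bought at its model price, and expiry at the last close in the series.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call(spot, strike, vol, rate, t):
    """Black-Scholes price and delta of a European call (no dividends)."""
    if t <= 0.0:
        return max(spot - strike, 0.0), (1.0 if spot > strike else 0.0)
    sqrt_t = math.sqrt(t)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol * vol) * t) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    price = spot * norm_cdf(d1) - strike * math.exp(-rate * t) * norm_cdf(d2)
    return price, norm_cdf(d1)


def hedged_long_call_pnl(closes, strike, implied_vol, rate=0.0,
                         days_per_year=252, rehedge_every=1):
    """P&L of buying one call and delta hedging it with stock at the closes.

    Assumptions (illustrative only): the call is bought at its model price on
    the first close, expires on the last close, and the hedge is rebalanced
    every `rehedge_every` closes with no transaction costs.
    """
    n = len(closes) - 1                      # number of trading days to expiry
    dt = 1.0 / days_per_year
    premium, delta = bs_call(closes[0], strike, implied_vol, rate, n * dt)
    shares = -delta                          # short stock against the long call
    cash = -premium - shares * closes[0]     # pay premium, receive short proceeds
    for i in range(1, n):
        cash *= math.exp(rate * dt)          # interest on the cash balance
        if i % rehedge_every == 0:
            _, delta = bs_call(closes[i], strike, implied_vol, rate, (n - i) * dt)
            target = -delta
            cash -= (target - shares) * closes[i]
            shares = target
    cash *= math.exp(rate * dt)
    payoff = max(closes[-1] - strike, 0.0)
    return cash + shares * closes[-1] + payoff
```

Running the same price series with different values of rehedge_every is one rough way to see how much the hedging frequency matters. On a path that never moves, the sketch simply loses the premium, which is the long-gamma position bleeding time decay with nothing to scalp.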

* I then experimented with a Bollinger band system based loosely on the one mentioned in Altucher's Trade Like a Hedge Fund. In the past I had experimented heavily with counter-trend trading systems based on Bollinger bands (by experiment I mean mostly backtests, and a little bit of actual trading). The systems I tested are generally characterized by a high probability of making a small return and a small probability of taking a large loss every now and then. Sounds like selling options short, right? It isn't surprising that such systems have a risk profile similar to a short option, because trading counter-trend essentially replicates a short option position. Anyway, I was interested in experimenting with trading counter-trend and buying long options or spreads to limit the risk. Another idea I had was using Bollinger bands to choose entry points for legging into butterflies. Some of the schemes I looked at showed some promise. What I mainly took away from these experiments was a greater appreciation for the market maker's edge in option markets. It was pretty incredible how much more profitable a system would become if I simulated it by buying the options on the bid and selling on the offer like a market maker. It does seem, though, that the bid/offer spreads in my end-of-day option data are a little wider than what I usually see in options markets. I will need to look into this later. It may be that the end-of-day option price data I have is nearly worthless for testing any spreading strategy.
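As a rough sketch of what a counter-trend Bollinger band entry rule can look like (this is not Altucher's system, nor the exact rules tested above): the function names, the 20-day window, and the 2-standard-deviation width are assumptions chosen for illustration, and the population standard deviation is used for the band width.

```python
import statistics


def bollinger_bands(closes, window=20, num_std=2.0):
    """Return (middle, upper, lower) per close; None until a full window exists."""
    bands = []
    for i in range(len(closes)):
        if i + 1 < window:
            bands.append(None)
            continue
        recent = closes[i + 1 - window:i + 1]   # window ending at this close
        mid = statistics.mean(recent)
        sd = statistics.pstdev(recent)
        bands.append((mid, mid + num_std * sd, mid - num_std * sd))
    return bands


def counter_trend_signals(closes, window=20, num_std=2.0):
    """+1 = buy (close below lower band), -1 = sell (close above upper band)."""
    signals = []
    for close, band in zip(closes, bollinger_bands(closes, window, num_std)):
        if band is None:
            signals.append(0)
            continue
        _, upper, lower = band
        if close < lower:
            signals.append(1)
        elif close > upper:
            signals.append(-1)
        else:
            signals.append(0)
    return signals
```

A signal of +1 fades a sharp drop and -1 fades a sharp rally, which is where the short-option-like payoff comes from: many small wins when the price reverts, and an occasional large loss when it keeps going.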

* Currently, I am experimenting with neural networks. In my most basic experiment I attempted to predict the next day's return using a basic back-propagation neural network trained on daily open, high, low, close and volume time-series data. I didn't really expect to get any practical trading systems out of this. Mainly it was just to get my feet wet with neural networks and to build a toolset for future experimentation. Interestingly enough, the simple network described above does generally seem to show *some* statistically significant predictive ability for next-day returns on SPY, but nothing that is tradeable. (SPY is the only stock I've worked with much so far using neural networks.) I hope to write a blog post presenting my results so far a little later on.

My real goal with the neural networks is to explore predicting high-frequency returns. Some of the literature I've read suggests that they work pretty well on high-frequency data, and I'd like to check that out. I also want to try neural networks trained on option open interest and volume data across a range of strikes and expirations. This would require a lot of data preparation and structuring. I'm still thinking about good ways to handle the data for an experiment like that.

Currently I'm writing new neural network code to handle more advanced network structures, mainly networks that incorporate some sort of feedback loop. I've read at least two articles where the authors use networks with feedback loops on financial time-series data. Other articles I've read about time-series forecasting in general mention these types of neural networks as a standard tool.
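To make the basic next-day-return experiment above concrete, here is a minimal sketch of a feed-forward network trained by plain back-propagation. It is not the code behind the SPY results. The class name TinyMLP, the helper make_samples, the choice of four input features, the single tanh hidden layer and the learning rate are all assumptions made for illustration, and the feedback (recurrent) structures mentioned above are not attempted here.

```python
import math
import random


def make_samples(bars):
    """Build (features, target) pairs from (open, high, low, close, volume) bars.

    Features come from day t; the target is the close-to-close return of
    day t+1. The feature choice is an illustrative assumption.
    """
    samples = []
    for t in range(1, len(bars) - 1):
        o, h, l, c, v = bars[t]
        prev_close, prev_volume = bars[t - 1][3], bars[t - 1][4]
        features = [c / prev_close - 1.0,    # today's return
                    (h - l) / c,             # relative daily range
                    (c - o) / o,             # open-to-close move
                    v / prev_volume - 1.0]   # change in volume
        samples.append((features, bars[t + 1][3] / c - 1.0))
    return samples


class TinyMLP:
    """One tanh hidden layer and a linear output, trained by SGD backprop."""

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def _hidden(self, x):
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.w1, self.b1)]

    def predict(self, x):
        h = self._hidden(x)
        return sum(w * hj for w, hj in zip(self.w2, h)) + self.b2

    def train_step(self, x, target, lr=0.05):
        """One gradient step on 0.5 * (prediction - target)^2; returns that loss."""
        h = self._hidden(x)
        err = sum(w * hj for w, hj in zip(self.w2, h)) + self.b2 - target
        for j, hj in enumerate(h):
            # Back-propagate through tanh using the pre-update output weight.
            grad_h = err * self.w2[j] * (1.0 - hj * hj)
            self.w2[j] -= lr * err * hj
            self.b1[j] -= lr * grad_h
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * grad_h * xi
        self.b2 -= lr * err
        return 0.5 * err * err
```

In practice the inputs would need scaling (daily returns are tiny compared with volume changes) and the network would have to be judged on data it was not trained on; neither step is shown here.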

My main take-away from what I've worked on recently is that I want to obtain some high-frequency data. Unfortunately, high-frequency data is very expensive to buy. My plan is to write a program to collect this data myself for a few stocks over a month or so and use that as a starting point. There are two major challenges in working with high-frequency market data. The first is that there is a lot of it. Saving tick data for even one stock could easily come to many megabytes a day. The second is that, from what I've read, high-frequency data needs to be filtered to be useful. It tends to have many outlying (bad) data points.
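One common way to screen out bad ticks is to compare each price with a rolling median of recent ticks. The sketch below is only one possible approach under stated assumptions: the function name filter_ticks, the window length, the rejection threshold and the relative scale floor are all hypothetical values that would need tuning against real data.

```python
import statistics
from collections import deque


def filter_ticks(prices, window=20, max_dev=5.0, min_rel_scale=0.001, warmup=5):
    """Drop ticks that sit too far from the rolling median of recent ticks.

    A tick is rejected when its distance from the median of the last `window`
    raw ticks exceeds `max_dev` times a robust scale (the median absolute
    deviation, floored at `min_rel_scale` times the median so a quiet market
    does not make the filter reject ordinary moves). The first `warmup`
    ticks are accepted unchecked.
    """
    recent = deque(maxlen=window)   # raw ticks, including rejected ones
    kept = []
    for price in prices:
        accept = True
        if len(recent) >= warmup:
            med = statistics.median(recent)
            mad = statistics.median(abs(p - med) for p in recent)
            scale = max(mad, min_rel_scale * med)
            accept = abs(price - med) <= max_dev * scale
        # Rejected ticks still enter the window: the median shrugs off a few
        # bad prints, but a genuine level shift eventually moves it.
        recent.append(price)
        if accept:
            kept.append(price)
    return kept
```

Keeping rejected ticks in the window is a deliberate choice here: an isolated bad print barely moves the median, while a real gap in the price gets accepted again once about half the window has caught up with it.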