## Data Mining

What will the market do today? Each day we ask our fellow trader friends and ourselves the same old question, and each day the market answers us at the end of the day. One would think that after having asked the question and observed the answer, say, a couple of thousand times or so, we either would learn the answer or stop wasting time pondering the question. Yet, we keep on asking and each day the market baffles us, by always behaving more or less out of sync with what we had expected.

One day the stock market is behaving exactly as anticipated, except for that little dip after lunch that had us stopped out with a loss. Which, by the way, reminded almost everybody about how the T-bond used to behave a couple of years ago, when the typical characteristics of the European currency crisis were casting their shadow over the financial world. Then, the next day, the very same market behaves in a way never seen before. Except for your buddy Joel that is, who traded the grain and meat markets back in the 1970s and therefore recognizes the similarities to how the corn market used to behave back then.

The point I am trying to make here is that it's very difficult to distinguish the behavior of one market from that of another, and the very same second that we think we've nailed it, it all changes and the experience gained loses almost all its value. Let's face it: can you really say that the stock market today behaved exactly as the stock market usually does, or was it more like the recent coffee market or the lumber market in the 1980s? And, if it behaved like the coffee market, is the recent behavior for the coffee market really typical for coffee at this time, and on and on and on? Is there really a specific, consistent behavior for each and every market, and, if so, can it be traded profitably? To find out you must examine each and every market very carefully and ask yourself an abundance of questions:

For how long does a typical trend last? What are the typical characteristics of any corrections? When can a move be considered to have gone too far? What is the likelihood of a certain number of days in a row in the same direction? And most important: how can you benefit from this information and implement it in your arsenal of existing trading tools? Those are only some of the questions to ask when you start mining your data to come up with high-probability trading opportunities, or at least to avoid the most obvious bad ones.

For example, if the market has been in a down trend for four months straight but you know that a typical down trend should last for only two months, it probably isn't too good an idea to add to your position when your breakout system signals that you once again should go short. Granted, the market might have changed from what is typical for one market to what is typical for another, but at least you still have the long-term statistics on your side. Alternatively, if you already are short, perhaps it is a good idea to start scaling back, no matter what your system is telling you. Or, if you know that only 22% of all down moves, measured on daily data, last for more than two days, you could speculate on the high statistical likelihood for a short up trend to follow, set up a contrarian trading system, and probe the market with a small position whenever the market has fallen for two days or more. Similarly, if you know that only 9% of all down moves result in a decline of 8% or more, measured on monthly data, you could take a long position, betting on an extended up move to follow, as soon as the market has declined that amount.
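Percentages like the 22% and 9% quoted above are plain frequency counts over a list of historical moves. A minimal Python sketch, assuming the lengths of past down moves are already available as a list (the function name and the sample numbers are made up for illustration and are not the S&P 500 data):

```python
def share_longer_than(lengths, n):
    """Fraction of moves that lasted for more than n periods."""
    if not lengths:
        raise ValueError("need at least one move")
    return sum(1 for length in lengths if length > n) / len(lengths)


# Hypothetical lengths (in days) of ten historical down moves.
down_move_lengths = [1, 1, 2, 1, 3, 1, 2, 4, 1, 2]

share = share_longer_than(down_move_lengths, 2)
print(f"{share:.0%} of the sample down moves lasted more than two days")
```

The same count, run over amplitudes instead of lengths, gives the kind of figure used for the monthly example.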

To get a feel for relationships like these, it is a good idea to put together a set of tables like Table 5.1 through Table 5.3, which serve as good starting points for experimentation. All tables have been put together using the RAD contract for the S&P 500 futures market, with data spanning January 1985 to December 1994 and January 1995 to October 1999. (The most recent data, from January 1995 to October 1999, were used for out-of-sample testing and are placed within parentheses.) From Table 5.1 you can see that during the period January 1, 1985, to December 31, 1994, the average up move for the S&P 500, measured on weekly data, went on for 2.29 weeks, for an average total gain of 3.30%, while the average down move, measured on monthly data, went on for 1.35 months with an average total decline of 4.82%. From Table 5.2 you can see that, measured with daily data, 25% of all up moves lasted for longer than two days, while, measured with weekly data, only 6% of all down moves lasted for longer than three weeks. In Table 5.3 the monthly amplitudes are within the parentheses in the headers. Measured with weekly data, 42% of all up moves resulted in an increase of the market value of more than 3%. Measured with monthly data, 46% of all up moves resulted in an increase of the market value of more than 6%.
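The averages in a summary like Table 5.1 can also be derived outside a spreadsheet. The following is a minimal Python sketch, assuming the closing prices are available as a plain list; the function names and the sample prices are hypothetical, and an unchanged close is treated as extending the current move, which is a simplifying assumption rather than the author's stated rule.

```python
def find_moves(closes):
    """Split a close series into consecutive up and down moves.

    Returns a list of (direction, periods, amplitude) tuples, where
    direction is +1 or -1 and amplitude is the fractional change from
    the close before the move started to the close where it ended.
    The last move in the list may still be in progress.
    """
    moves = []
    direction = 0            # 0 until the first non-flat change
    length = 0
    start_price = closes[0]
    for prev, cur in zip(closes, closes[1:]):
        step = (cur > prev) - (cur < prev)   # +1, -1 or 0
        if step != 0 and direction != 0 and step != direction:
            # Direction flipped: record the move that just ended.
            moves.append((direction, length, prev / start_price - 1))
            start_price = prev
            length = 0
        if step != 0:
            direction = step
        length += 1
    if length and direction:
        moves.append((direction, length, closes[-1] / start_price - 1))
    return moves


def summarize(moves, direction):
    """Average number of periods and average amplitude for one direction."""
    selected = [m for m in moves if m[0] == direction]
    if not selected:
        raise ValueError("no moves in that direction")
    avg_periods = sum(m[1] for m in selected) / len(selected)
    avg_amplitude = sum(m[2] for m in selected) / len(selected)
    return avg_periods, avg_amplitude


# Hypothetical closes: up two periods, down three, up one.
closes = [100, 101, 103, 102, 100, 99, 104]
moves = find_moves(closes)
up_periods, up_amplitude = summarize(moves, +1)
print(moves)
print(up_periods, up_amplitude)
```

Feeding the function weekly or monthly closes instead of daily ones gives the corresponding rows of the summary.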

To put together Tables 5.1 through 5.3 you must be able to export your data into a text file, which can be opened with the help of MS Excel or any other spreadsheet program. Once in Excel, you first must calculate the percentage change of the closing price with the following formula:

TABLE 5.1

Data mining summary.

| | Move per period | Periods in move | Amplitude |
|---|---|---|---|
| **Daily data** | | | |
| All moves | 0.68% | 1.95 | 1.34% |
| Up moves | 0.69% (0.77%) | 2.04 (2.17) | 1.41% (1.68%) |
| Down moves | 0.68% (0.71%) | 1.86 (1.93) | 1.26% (1.36%) |
| **Weekly data** | | | |
| All moves | 1.52% | 1.97 | 2.99% |
| Up moves | 1.44% (1.78%) | 2.29 (2.21) | 3.30% (3.98%) |
| Down moves | 1.64% (1.67%) | 1.65 (1.52) | 2.70% (2.52%) |
| **Monthly data** | | | |
| All moves | 3.32% | 1.72 | 5.73% |
| Up moves | 3.22% (3.47%) | 2.09 (3.31) | 6.72% (11.93%) |
| Down moves | 3.56% (3.50%) | 1.35 (1.25) | 4.82% (4.36%) |

TABLE 5.2

Periods of moves of certain length.

| | 1 | >1 | >2 | >3 |
|---|---|---|---|---|
| **Daily data** | | | | |
| All moves | 50% | n/a | 23% | 11% |
| Up moves | 46% (43%) | n/a | 25% (33%) | 13% (15%) |
| Down moves | 53% (53%) | n/a | 22% (25%) | 10% (12%) |
| **Weekly data** | | | | |
| All moves | 50% | n/a | 25% | 11% |
| Up moves | 41% (43%) | n/a | 33% (28%) | 17% (15%) |
| Down moves | 60% (67%) | n/a | 17% (13%) | 6% (4%) |
| **Monthly data** | | | | |
| All moves | 57% | 43% | 16% | 6% |
| Up moves | 37% (38%) | 63% (61%) | 23% (46%) | 9% (46%) |
| Down moves | 76% (83%) | 24% (17%) | 9% (8%) | 3% (0%) |

TABLE 5.3

Percentage of moves of certain amplitude.

| | ≤1 (2)% | >1 (2)% | >2 (4)% | >3 (6)% | >4 (8)% | >5 (10)% |
|---|---|---|---|---|---|---|
| **Daily data** | | | | | | |
| All moves | 55% | 45% | 20% | 9% | 4% | 2% |
| Up moves | 51% (42%) | 49% (58%) | 22% (28%) | 11% (18%) | 4% (11%) | 2% (4%) |
| Down moves | 59% (57%) | 41% (43%) | 18% (25%) | 8% (11%) | 4% (6%) | 2% (3%) |
| **Weekly data** | | | | | | |
| All moves | 26% | 74% | 53% | 36% | 25% | 16% |
| Up moves | 23% (10%) | 77% (90%) | 61% (75%) | 42% (59%) | 28% (41%) | 18% (24%) |
| Down moves | 28% (28%) | 72% (72%) | 45% (43%) | 31% (27%) | 22% (21%) | 14% (18%) |
| **Monthly data** | | | | | | |
| All moves | 25% | 75% | 45% | 35% | 19% | 12% |
| Up moves | 26% (0%) | 74% (100%) | 57% (92%) | 46% (62%) | 29% (46%) | 17% (46%) |
| Down moves | 24% (25%) | 76% (75%) | 32% (33%) | 24% (17%) | 9% (8%) | n/a |
In the percentage-change formula, E denotes the column where the data is stored. To calculate how many periods in a row a certain move lasts, first use the following formula in the adjacent column:

    =IF(OR(AND(F3>1;F2<1);AND(F3<1;F2>1));1;G2+1)

Then, continue the calculation in the next column: [formula missing]

Finally, in the last column, use the following formula to calculate the percentage move:

    =IF(H3<>"";PRODUCT(INDEX(F:F;ROW()-ABS(H3)+1;1):INDEX(F:F;ROW();1))-1;"")

After you have filled in all the calculations, type in the following sets of formulae at the bottom of the spreadsheet to derive the necessary numbers for the tables. (For the down moves, simply change ">" to "<=".)

To calculate the total number of periods up: [formula missing]

To calculate the total number of moves up: [formula missing]

To calculate the average number of periods in an up move:

    =H4431/H4432

To calculate the average percentage amplitude for an up move:

    =SUMIF(I$3:I4429;">0")/COUNTIF(I$3:I4429;">0")

To calculate the average percentage amplitude for each period within the up move: [formula missing]

To calculate the likelihood for an up move to last for two periods or more:

    =COUNTIF(H$3:H4429;">1")/H4432

To calculate the likelihood for the amplitude of an up move to be greater than [text and formula missing]

To test a simple trading system that makes use of information like this, you can, for instance, set up a system that only goes long as soon as you have a down day that follows a down week that follows a down month. Because of the natural upward drift in the stock market and because of what the data mining has shown about up moves going on for a longer period of time, the requirements for a short position could be two up days, two up weeks, and two up months.
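The entry rules just described come down to a handful of comparisons against the latest close. A minimal Python sketch of one reading of those rules, in which each period is measured against today's close; the function names, the argument layout, and the sample prices are assumptions made for illustration:

```python
def long_signal(daily, weekly, monthly):
    """Down day following a down week following a down month.

    daily   -- daily closes, most recent last (daily[-1] is today's close)
    weekly  -- closes of completed weeks, most recent last
    monthly -- closes of completed months, most recent last
    """
    c = daily[-1]
    return monthly[-1] > c and weekly[-1] > c and daily[-2] > c


def short_signal(daily, weekly, monthly):
    """Two up days, two up weeks, and two up months in a row."""
    c = daily[-1]
    return (monthly[-2] < monthly[-1] < c
            and weekly[-2] < weekly[-1] < c
            and daily[-3] < daily[-2] < c)


# Hypothetical closes: today is below yesterday, last week, and last month.
falling = dict(daily=[1340, 1352, 1348], weekly=[1330, 1360], monthly=[1300, 1370])
# Hypothetical closes: rising on all three time frames.
rising = dict(daily=[1380, 1390, 1400], weekly=[1350, 1370], monthly=[1300, 1340])

print(long_signal(**falling), short_signal(**falling))
print(long_signal(**rising), short_signal(**rising))
```

The short side needs two rising periods on each time frame, so it is deliberately harder to trigger than the long side.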
The TradeStation code for this simple system, Gold Digger I, looks something like this:

    Condition1 = CloseM(1) > C and CloseW(1) > C and C[1] > C;
    Condition2 = CloseM(2) < CloseM(1) and CloseM(1) < C and
       CloseW(2) < CloseW(1) and CloseW(1) < C and
       C[2] < C[1] and C[1] < C;
    If Condition1 = True and MarketPosition = 0 Then
       Buy ("Go long") at open;
    If C[2] < C[1] and C[1] < C Then
       ExitLong ("Exit long") at close;
    If Condition2 = True and MarketPosition = 0 Then
       Sell ("Go short") at open;
    { The exit condition below is missing from the listing; it is inferred
      from the text, which says a short trade is exited after a down day. }
    If C[1] > C Then
       ExitShort ("Exit short") at close;

Using the trade-by-trade export function from Part 1, together with the RAD contract, to export the results into a spreadsheet program, and then using the Excel formulae we also derived in Part 1, you can put together a performance summary table like those in Tables 5.4 and 5.5.

TABLE 5.4

Performance summary for Gold Digger I, January 1985-December 1994.
Over the period January 1, 1985, to December 31, 1994, Gold Digger I produced 251 trades, of which close to 64% were profitable, for an average gain per trade of 0.18% (or $597 in today's market value, with the S&P 500 trading around 1,350). A fairly low drawdown and standard deviation also make us want to continue the research. When tested on out-of-sample data, the average profit increased to $1,467, while the largest loss decreased to 4.20% (or $14,175 in today's market value). A still-low standard deviation and a high percentage of profitable trades keep us interested. No money was deducted for slippage and commission, but with an expected $75 in slippage and commission, you can expect this system to generate approximately $1,392 per contract traded ($1,467 - $75) in the immediate future.

Table 5.5 also shows that the drawdown decreased substantially, from close to 27% for the in-sample period to 7% for the out-of-sample period. For drawdown and cumulative profit values to be correct, however, we are assuming that the entire equity, including previously made profits, was reinvested at each new trade. As mentioned in Part 1, this is seldom possible, especially not in the futures markets, but it could be so in the stock market, provided that you can buy fractions of a share. What this does allow you to do, however, is to compare different systems and markets with each other on an equal basis, or compare how your system would have held up against a buy-and-hold strategy over the same time period.

TABLE 5.5

Performance summary for Gold Digger I, January 1995-October 1999.

Despite the fairly good results, it is important to remember that Gold Digger I is put together merely to illustrate a simple system for taking advantage of a market's statistical characteristics. This particular system is not a good one to trade, as becomes clear if you compare the statistical characteristics in Tables 5.1 to 5.3 for the market during the first 10-year period with the characteristics for the most recent period, January 1995 to October 1999. Comparing the in- and out-of-sample periods with each other shows that ever-so-small differences between the daily and weekly data soon add up to quite large differences in the monthly time perspective.

For instance, the average up move, measured on monthly data, has grown from 2.09 months and 6.72% to 3.31 months and 11.93%. Also, in the table describing the likelihood for periods of moves of certain length, for the period up until 1994 there was only a 9% chance for an up move to last for more than three months, but for the period spanning January 1995 to October 1999, there was a 46% chance for the same type of move. In the same table, the most recent study suggests that 57% of up moves measured on weekly data go on for more than one week. This is two percentage points less than what could be expected looking at the period ending in 1994. Hence, while the number of persistent up trends measured on weekly data has decreased, the number of persistent up trends measured on monthly data has skyrocketed.

The same phenomenon also can be seen in the table for percentage of moves of certain amplitude, where the percentage of up moves with an amplitude of more than 5%, measured on weekly data, has only increased by 33% (from 18% to 24%), while the percentage of up moves with an amplitude of more than 10%, measured on monthly data, has skyrocketed by 170% (from 17% to 46%). This suggests that when the market changes its characteristics or mode, these changes can be very hard to detect when looking at a shorter time frame within the bigger picture. Or, put differently: no matter what long-term mode the market is in, the short-term statistical characteristics are likely to still look the same and be close to stationary in nature.
This is a very important conclusion, because if this is true, the only way to build a mechanical trading system that holds up and behaves the same way in the future as it does during testing, no matter what the longer-term underlying trend looks like, is to focus on the shorter time perspective, with trades lasting for no longer than approximately a week and using as little historical data for the signals as possible. To test if this could be true, a modified version of the original system, Gold Digger II, considers only daily and weekly data, as suggested by the following TradeStation code, which also does not differentiate between up trends and down trends:

    Condition1 = CloseW(2) > CloseW(1) and CloseW(1) > C and
       C[2] > C[1] and C[1] > C;
    Condition2 = CloseW(2) < CloseW(1) and CloseW(1) < C and
       C[2] < C[1] and C[1] < C;
    If Condition1 = True and MarketPosition = 0 Then
       Buy ("Go long") at open;
    If C[2] < C[1] and C[1] < C Then
       ExitLong ("Exit long") at close;
    If Condition2 = True and MarketPosition = 0 Then
       Sell ("Go short") at open;
    { The exit condition below is missing from the listing; it is inferred
      from the text, which says a short trade is exited after a down day. }
    If C[1] > C Then
       ExitShort ("Exit short") at close;

The results for this version of the system can be seen in Table 5.6. Tested on the out-of-sample period, this system had 107 trades, with 63% profitable trades and an average profit of 0.31% (or $1,045 in today's market value). No money was deducted for slippage and commission, but this is easily accounted for, for trades in the immediate future, by deducting an appropriate amount from the expected value of the average trade. Because Gold Digger II is a simpler system (less curve-fitted) than Gold Digger I, its performance is not quite as good, as judged by a slightly lower profit factor and percentage of profitable trades and a much lower cumulative profit. It does, however, seem to be a little more robust, as judged by a lower standard deviation.
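The dollar figures quoted for both systems follow from the percentage result, the index level, and the contract's point value. A minimal Python sketch, assuming a point value of $250 per index point; that number is not stated in the text, but it is consistent with 4.20% corresponding to $14,175 at an index level of 1,350. The function names are hypothetical, and the $75 cost is the slippage and commission estimate used earlier for Gold Digger I.

```python
POINT_VALUE = 250.0    # assumed dollars per index point (not given in the text)
INDEX_LEVEL = 1350.0   # index level used for "today's market value"


def pct_to_dollars(pct, index_level=INDEX_LEVEL, point_value=POINT_VALUE):
    """Convert a percentage move into dollars per contract."""
    return pct / 100.0 * index_level * point_value


def net_of_costs(avg_trade_dollars, costs_dollars):
    """Expected average trade after slippage and commission."""
    return avg_trade_dollars - costs_dollars


largest_loss = pct_to_dollars(4.20)   # Gold Digger I, out-of-sample largest loss
avg_trade_ii = pct_to_dollars(0.31)   # Gold Digger II, out-of-sample average
net_avg_i = net_of_costs(1467, 75)    # Gold Digger I, out-of-sample average

print(round(largest_loss), round(avg_trade_ii, 2), net_avg_i)
```

The 0.31% average converts to roughly $1,046 under these assumptions, close to the $1,045 quoted; the small gap presumably comes from the percentage being rounded.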
Although the average profit and profit factor are slightly lower for Gold Digger II when compared to Gold Digger I, the lower standard deviation, in combination with Gold Digger II's being more symmetrical in nature (less curve-fitted), makes it clear that Gold Digger II is the preferable system. From the TradeStation code, however, you can see that the only exit criterion we have on the long side is two up closes in a row, while we will exit a short trade as soon as we have a down day. Because the market does not always behave as we want it to, the losses can be quite severe before we are allowed to exit the trade. This can, for instance, happen if the market immediately takes off in the wrong direction or if, in a long position, up days and down days alternate, but with down days of greater magnitude than up days. Therefore, the lack of properly developed stops still makes this a dangerous system, as can be seen from the values for the largest and average losing trades. (We will talk more about stops and exits in Part 3.)
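The alternating-day scenario is easy to reproduce. A minimal Python sketch, assuming a made-up price path that alternates between a 1% gain and a 2% loss, a long position held from the first close, and the long exit rule of two up closes in a row; the function name and the numbers are illustrative only:

```python
def first_long_exit(closes):
    """Index of the first bar with two up closes in a row, or None."""
    for i in range(2, len(closes)):
        if closes[i - 2] < closes[i - 1] < closes[i]:
            return i
    return None


# Hypothetical path: +1% one day, -2% the next, repeated for ten days.
closes = [100.0]
for day in range(10):
    change = 1.01 if day % 2 == 0 else 0.98
    closes.append(closes[-1] * change)

exit_bar = first_long_exit(closes)
open_loss = closes[-1] / closes[0] - 1

print(exit_bar)            # the exit rule never fires on this path
print(f"{open_loss:.2%}")  # loss on the still-open long position
```

On this path the exit never triggers, and the open loss keeps growing for as long as the pattern continues, which is why a stop is needed.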
