The center or central tendency of a data series is not a sufficient description for price analysis. The manner in which the data are scattered about a given point, their dispersion and skewness, is also needed to describe them. The mean deviation is a basic method for measuring dispersion and may be calculated about any measure of central location, for example, the arithmetic mean. It is found by computing

$$MD = \frac{\sum_{i=1}^{n} |P_i - \bar{P}|}{n}$$

where MD is the mean deviation, the average of the differences between each price $P_i$ and the arithmetic mean of the prices $\bar{P}$, or other measure of central location, with signs ignored.
The standard deviation is a special form of measuring average deviation from the mean, which uses the root-mean-square

$$\text{Standard deviation} = \sqrt{\frac{\sum_{i=1}^{n} (P_i - \bar{P})^2}{n}}$$
where the differences between the individual prices and the mean are squared to emphasize the significance of extreme values, and then the final value is scaled back using the square root function. This popular measure, found throughout this book, is available in all spreadsheets and software programs as @Std or @Stdev. For n prices, the standard deviation is simply @Std(price,n).
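As a rough illustration of the two measures described above, the following Python sketch computes the mean deviation and the standard deviation about the arithmetic mean. The function names and the sample prices are made up for this example, and the divisor n (rather than n - 1) is assumed to match the root-mean-square description in the text.

```python
import math


def mean_deviation(prices):
    """Average absolute difference between each price and the arithmetic mean."""
    n = len(prices)
    mean = sum(prices) / n
    return sum(abs(p - mean) for p in prices) / n


def standard_deviation(prices):
    """Root-mean-square deviation from the mean (divisor n, an assumption)."""
    n = len(prices)
    mean = sum(prices) / n
    return math.sqrt(sum((p - mean) ** 2 for p in prices) / n)


# Illustrative values only, not market data.
prices = [10.0, 12.0, 11.0, 15.0, 12.0]
md = mean_deviation(prices)
sd = standard_deviation(prices)
print(f"mean deviation = {md:.4f}, standard deviation = {sd:.4f}")
```

Because squaring gives extra weight to the price farthest from the mean, the standard deviation here comes out larger than the mean deviation.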
The standard deviation is the most popular way of measuring the degree of dispersion of the data. The value of one standard deviation about the mean represents a clustering of about 68% of the data, two standard deviations from the mean include 95.5% of all data, and three standard deviations encompass 99.7%, nearly all the data. These values represent the groupings of a perfectly normal set of data, shown in Figure 2-4.
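The groupings quoted above can be checked numerically. This short Python sketch, which assumes a perfectly normal distribution, uses the standard-library error function to find the fraction of data within k standard deviations of the mean; the function name is chosen for this example.

```python
import math


def coverage_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2.0))


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {coverage_within(k):.2%}")
```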
Probability of Achieving a Return
If we look at Figure 2-4 as the annual returns for the stock market over the past 50 years, then the mean is about 8% and one standard deviation is 16%. In any one year we can expect the compounded rate of return to be 8%; however, there is a 32% chance that it will be either greater than 24% (the mean plus one standard deviation) or less than -8% (the mean minus one standard deviation). If you would like to know the probability of a return of 20% or greater, you must first rescale the values,

$$\text{Rescaled objective} = \frac{\text{objective} - \text{mean}}{\text{standard deviation}}$$

If your objective is 20%, we calculate

$$\frac{20\% - 8\%}{16\%} = 0.75$$
FIGURE 2-4 Normal distribution showing the percentage area included within one standard deviation about the arithmetic mean.
We look in Appendix A1 under the probability for normal curves, and find that a standard deviation of 0.75 gives 27.34% on one side of the mean, a grouping of 54.68% of the data on both sides. That leaves one-half of the remaining 45.32% of the data, or 22.66%, above the target of 20%.
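The table lookup can also be reproduced in a few lines of Python. This is a minimal sketch that assumes normally distributed returns, a mean of 8%, and the 16% standard deviation implied by the 24% and -8% bounds in the example; the function names are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative probability of a standard normal variable up to z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_return_at_least(objective, mean, std_dev):
    """Probability of a return at or above the objective, assuming normality."""
    z = (objective - mean) / std_dev  # rescale the objective into standard deviations
    return 1.0 - normal_cdf(z)


p = prob_return_at_least(objective=0.20, mean=0.08, std_dev=0.16)
print(f"probability of a return of 20% or greater: {p:.2%}")
```

The result agrees with the 22.66% found from the table, to within rounding.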
Most price data, however, are not normally distributed. For physical commodities, such as gold, grains, and interest rates (yield), prices tend to spend more time at low levels and much less time at extreme highs; while gold peaked at $800 per ounce for one day, it has remained between $375 and $400 per ounce for most of the past 10 years. A fall below $400 by the same amount as the rise to $800 is impossible, unless you believe that gold can go to zero. This relationship of price versus time, in which markets spend more time at lower levels, can be measured as skewness: the amount of distortion from a symmetric shape that makes the curve appear to be short on one side and extended on the other. In a perfectly normal distribution, the mean, median, and mode coincide. As prices become extremely high, which often happens for short intervals of time, the mean will show the greatest change and the mode will show the least. The difference between the mean and the mode, adjusted for dispersion using the standard deviation of the distribution, gives a good measure of skewness

$$SK = \frac{\text{mean} - \text{mode}}{\text{standard deviation}}$$
Because the distance between the mean and the mode, in a moderately skewed distribution, is three times the distance between the mean and the median, the relationship can also be written as:
$$SK = \frac{3 \times (\text{mean} - \text{median})}{\text{standard deviation}}$$
This last formula may be more practical for computer applications, because the mode requires dividing the data into groups and counting the number of occurrences in each bar. When interpreting the value of SK, the distribution is skewed to the right, with its longer tail toward higher values, when SK is positive (the mean is greater than the median), and it is skewed left when SK is negative.
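The median-based form is easy to compute. The Python sketch below applies it to a made-up price series that spends most of its time at low levels and only briefly at high ones; the function name and the data are assumptions for illustration, and the standard deviation uses the divisor n.

```python
import statistics


def skewness_sk(data):
    """SK = 3 x (mean - median) / standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    std_dev = statistics.pstdev(data)  # population form, divisor n (an assumption)
    return 3 * (mean - median) / std_dev


# Hypothetical prices: many observations near the lows, a few brief highs.
prices = [375, 380, 385, 390, 395, 400, 450, 600, 800]
sk = skewness_sk(prices)
print(f"SK = {sk:.3f}")  # positive: the mean lies above the median
```

A symmetric series such as 1, 2, 3, 4, 5 gives SK = 0, because its mean and median are equal.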
One last measurement, that of kurtosis, should be familiar to analysts. Kurtosis is the "peakedness" of a distribution, the analysis of "central tendency." For most cases a smaller standard deviation means that prices are clustered closer together; however, this does not always describe the distribution clearly. Because so much of identifying a trend comes down to deciding whether a price change is normal or likely to be a leading indicator of a new direction, deciding whether prices are closely grouped or broadly distributed may be useful. Kurtosis measures the height of the distribution.
The skewness of a data series can sometimes be corrected using a transformation on the data. Price data may be skewed in a specific pattern. For example, if there are 1/4 of the occurrences at twice the price and 1/9 of the occurrences at three times the price, the original data can be transformed into a normal distribution by taking the square root of each data item. The characteristics of price data often show a logarithmic, power, or square-root relationship.
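To see the effect of a transformation, the following Python sketch measures skewness before and after taking the square root and the logarithm of each item. The series is artificial, built from squared values so that the square root is exactly the right correction, and the median-based skewness measure from earlier in the section is assumed.

```python
import math
import statistics


def skewness_sk(data):
    """SK = 3 x (mean - median) / standard deviation (divisor n)."""
    return 3 * (statistics.mean(data) - statistics.median(data)) / statistics.pstdev(data)


# Artificial right-skewed data: the squares of 1 through 7.
raw = [1.0, 4.0, 9.0, 16.0, 25.0, 36.0, 49.0]
sqrt_data = [math.sqrt(x) for x in raw]
log_data = [math.log(x) for x in raw]

sk_raw = skewness_sk(raw)
sk_sqrt = skewness_sk(sqrt_data)
sk_log = skewness_sk(log_data)

print(f"raw: {sk_raw:.3f}, square root: {sk_sqrt:.3f}, log: {sk_log:.3f}")
```

For this particular series the square root removes the skew, while the logarithm overcorrects and leaves the data skewed the other way, which is why the choice of transformation should follow the pattern actually seen in the prices.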
Because the lower price levels of most commodities are determined by production costs, price distributions show a clear boundary of resistance in that direction. At the high levels, prices can have a very long tail of low frequency. Figure 2-5 shows the change in the distribution of prices as the mean price (over shorter intervals) changes. This pattern indicates that a normal distribution is not appropriate for commodity prices, and that a log distribution would only apply to overall long-term distributions.
Choosing between Frequency Distribution and Standard Deviation
You should note that it is more likely that unreliable probabilities will result from using too little data than from the choice of method. For example, we might choose to look at the distribution of one month of daily data, about 23 days; however, it is not much of a sample. The price or equity changes being measured might be completely different during the next month. Even the most recent five years of S&P data will not show a drop as large as October 1987.
FIGURE 2-5 Changing distribution at different price levels. A, B, and C are increasing mean values of three shorter-term distributions, shown against the long-term distribution.
Although we can identify and measure skewness, it is difficult to get meaningful probabilities using a standard deviation taken on very distorted distributions. It is simpler to use a frequency distribution for data with long tails on one side and truncated results on the other. To find the likelihood of returns using a trend system with a stop-loss, you can simply sort the data in ascending order using a spreadsheet, then count from each end to find the extremes. You will notice that the largest 10% of the profits cover a wide range, while the largest 10% of the losses are clustered together.
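The sort-and-count approach can be sketched in Python as follows. The trade results are invented to resemble a trend system with a stop-loss (many similar small losses, a few large profits), and the function name is hypothetical.

```python
def tail_ranges(results, fraction=0.10):
    """Sort results and return the (low, high) range of the worst and best tails."""
    ordered = sorted(results)
    k = max(1, round(len(ordered) * fraction))  # number of items in each tail
    worst = ordered[:k]
    best = ordered[-k:]
    return (worst[0], worst[-1]), (best[0], best[-1])


# Hypothetical profit/loss per trade.
trades = [-50, -48, -52, -49, -51, -47, -50, -45, -30, -20,
          10, 25, 40, 60, 90, 150, 300, 450, 700, 1200]

loss_range, profit_range = tail_ranges(trades)
print("largest 10% of losses span:", loss_range)
print("largest 10% of profits span:", profit_range)
```

With these numbers the worst losses fall within a single point of each other, while the best profits span several hundred points.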
A standard deviation is very helpful for giving some indication that a price move, larger than any we have seen in the data, is possible. Because it assumes a normally shaped curve, a large clustering of data toward one end will force the curve to extend further. Although the usefulness of the exact probabilities is questionable, there is no doubt that, given enough time, we will see price moves, profits, and losses that are larger than we have seen in the past.
Throughout the development and testing of a trading system, we will want to know if the results we are seeing are as expected. The answer will keep referring back to the size of the sample and the amount of variance that is typical of the data during this period. Readers are encouraged to refer to other sections in the book on sample error and the chi-square test. Another popular method for measuring whether the average price change of the data is significantly different from zero, that is, if there is an underlying trend bias or if the pattern exhibits random qualities, is the Student t-test,

$$t = \frac{\text{average of price changes}}{\text{standard deviation of price changes}} \times \sqrt{\text{No. of data items}}$$

where degrees of freedom = number of data items - 1. The more trades in the sample, the more reliable the results. The values of t needed to be significant can be found in Appendix 1, Table A1.2, "T-Distribution." The column headed ".10" gives the 90% confidence level, ".05" is 95%, and ".005" is 99.5% confidence.
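A minimal Python sketch of this calculation is shown below. It assumes the sample standard deviation (divisor n - 1), which is the usual convention for the t-test and differs from the divisor-n form used earlier in the section; the function name and the price changes are made up for illustration.

```python
import math
import statistics


def t_statistic(changes):
    """Return (t, degrees of freedom) for testing whether the mean change differs from zero."""
    n = len(changes)
    average = statistics.mean(changes)
    std_dev = statistics.stdev(changes)  # sample form, divisor n - 1 (an assumption)
    t = average / std_dev * math.sqrt(n)
    return t, n - 1


# Hypothetical daily price changes.
changes = [1.0, -0.5, 2.0, 0.5, -1.0, 1.5, 0.5, 1.0]
t, df = t_statistic(changes)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The resulting t is then compared with the table value for the same degrees of freedom at the chosen confidence level.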
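If we separate data into two periods and compare the averages of the two periods for consistency, we can decide whether the data has changed significantly. This is done with a 2-sample t-test:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{v_1}{n_1} + \dfrac{v_2}{n_2}}}$$

where $\bar{x}_1$ and $\bar{x}_2$ are the averages of data periods 1 and 2, $v_1$ and $v_2$ are the variances of periods 1 and 2, and $n_1$ and $n_2$ are the number of data items in periods 1 and 2. The degrees of freedom, df, needed to find the confidence levels in Table A1.2, can be calculated as:

$$df = \frac{\left(\dfrac{v_1}{n_1} + \dfrac{v_2}{n_2}\right)^2}{\dfrac{(v_1/n_1)^2}{n_1 - 1} + \dfrac{(v_2/n_2)^2}{n_2 - 1}}$$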
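The comparison of two periods might be sketched in Python as below. This assumes the unequal-variance (Welch) form of the two-sample test with the Satterthwaite approximation for the degrees of freedom, built from the averages, variances, and counts defined above; the function name and data are hypothetical, and the sample variance (divisor n - 1) is an assumption.

```python
import math
import statistics


def two_sample_t(period1, period2):
    """Return (t, df) comparing the averages of two data periods."""
    n1, n2 = len(period1), len(period2)
    x1, x2 = statistics.mean(period1), statistics.mean(period2)
    v1, v2 = statistics.variance(period1), statistics.variance(period2)  # divisor n - 1
    se_squared = v1 / n1 + v2 / n2
    t = (x1 - x2) / math.sqrt(se_squared)
    df = se_squared ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Hypothetical values for two periods.
period_1 = [1.0, 2.0, 3.0, 4.0, 5.0]
period_2 = [3.0, 4.0, 5.0, 6.0, 7.0]
t, df = two_sample_t(period_1, period_2)
print(f"t = {t:.3f}, df = {df:.1f}")
```

The degrees of freedom are generally not a whole number in this form, so the nearest lower row of the table is a conservative choice.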
The Student t-test can also be used to compare the profits and losses generated by a trading system to show that the underlying system process is sound. Simply replace the data items by the average profit or loss of the system, the number of data items by the number of trades, and calculate all other values using the profit/loss to get the Student t-test value for the trading performance.