## Kernel Density Estimation

The empirical cdf, with $p_j$ increasing in steps of $1/T$ from 0 to 1, is too jagged in appearance to compare directly with a smooth theoretical distribution. Statisticians have tried to remove these jump discontinuities by smoothing and averaging nearby values (Silverman, 1986). If we choose a parametric density such as $N(\mu, \sigma^2)$ or the logistic (4.3.2), we place the observed data into a straightjacket of the particular form of the chosen density. Observed financial data (e.g., returns) may not fit into any parametric form, and a parametric density may be a poor approximation.

Figure 4.3.3 Q-Q plot for AAAYX mutual fund (horizontal axis: theoretical quantiles)
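The jagged appearance of the empirical cdf can be seen in a minimal sketch. The function name `ecdf` and the return values are made up for illustration; they are not the AAAYX data.

```python
from bisect import bisect_right

def ecdf(data):
    """Return F, where F(x) = (number of observations <= x) / T."""
    xs = sorted(data)
    T = len(xs)

    def F(x):
        # bisect_right counts how many sorted observations are <= x
        return bisect_right(xs, x) / T

    return F

# Hypothetical returns, for illustration only
returns = [-2.1, 0.4, 1.3, -0.7, 0.9]
F = ecdf(returns)

# Evaluated at the sorted observations, F rises in jumps of 1/T
steps = [F(x) for x in sorted(returns)]
print(steps)
```

Between two adjacent observations the function is flat, and at each observation it jumps by $1/T$. These are the discontinuities that kernel smoothing is meant to remove.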

Sometimes this problem is solved by choosing a parametric family of distributions (e.g., Pearson family), rather than only one member (e.g., normal) of the family. The appropriate family member is chosen by considering Pearson's plot of skewness versus kurtosis (see Figure 2.1.1). The problem of a parametric straightjacket may still persist and a flexible nonparametric density may be worth considering. Kernel density estimation is a popular method for this purpose.
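The two coordinates needed to place a dataset on a skewness-kurtosis plot can be computed from sample moments. The sketch below assumes the simple moment-based definitions (third and fourth central moments scaled by powers of the variance); the function name and the data are hypothetical.

```python
def skewness_kurtosis(data):
    """Moment-based sample skewness and kurtosis (normal: 0 and 3)."""
    n = len(data)
    mean = sum(data) / n
    # Central moments with divisor n (an assumption; other divisors exist)
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2

# Hypothetical sample with one large loss, i.e. a long left tail
sample = [-10.0, 1.0, 2.0, 3.0, 4.0]
skew, kurt = skewness_kurtosis(sample)
print(skew, kurt)
```

A negative first coordinate indicates a longer left tail. A family member whose theoretical skewness and kurtosis lie close to the sample point would be the natural candidate.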

Kernel estimation begins by choosing a kernel weighting function to smooth the empirical cdf of the data. This weighting should yield a higher probability density for ranges where there are relatively more data points, and a lower probability density where observations are sparse.

Let $K$ denote a kernel function whose integral equals unity, so that its values can serve as weights. Often this is a density function, since a density has a total area of 100%. The biweight kernel is defined for $y \in [-1, 1]$ by $K_{biw}(y) = (15/16)(1 - y^2)^2$. For the same range, the Epanechnikov kernel is $K_{epa}(y) = (3/4)(1 - y^2)$. The Gaussian kernel is defined over the entire real line, $y \in (-\infty, \infty)$, and equals the density of $N(0,1)$, or $K_{gau}(y) = (1/\sqrt{2\pi}) \exp(-y^2/2)$. These kernels provide the weight with which to multiply each observation to achieve smoothing of the data. Clearly, the weight suitable for a point $x$, whether observed or interpolated, should depend on the distance between $x$ and each observed data point $x_t$ for $t = 1, 2, \ldots, n$. The density at $x$, $f(x)$, is then given by

$$f(x) = \frac{1}{nc} \sum_{t=1}^{n} K(y_t), \qquad y_t = \frac{x - x_t}{c},$$

where the kernel function can be any suitable function similar to the Gaussian or biweight and where $c$ is a smoothing bandwidth parameter. Since all distances of $x$ from the observed data points $x_t$ are measured in units of this smoothing bandwidth parameter $c$, it appears in the denominator of the argument $y_t$ of the kernel function. See Silverman (1986) for further details on kernel estimation. Sheather and Jones (1991) propose an automatic method for bandwidth selection.

Figure 4.3.4 Nonparametric kernel density for the mutual fund AAAYX
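The three kernels and the resulting density estimate can be sketched in a few lines. This assumes the usual kernel estimator $f(x) = (1/(nc)) \sum_t K((x - x_t)/c)$; the function names, the data, and the bandwidth value are hypothetical, and the midpoint-rule helper is only there to check numerically that each kernel integrates to one.

```python
import math

def k_biweight(y):
    # (15/16)(1 - y^2)^2 on [-1, 1], zero elsewhere
    return (15 / 16) * (1 - y * y) ** 2 if abs(y) <= 1 else 0.0

def k_epanechnikov(y):
    # (3/4)(1 - y^2) on [-1, 1], zero elsewhere
    return 0.75 * (1 - y * y) if abs(y) <= 1 else 0.0

def k_gaussian(y):
    # Standard normal density N(0, 1)
    return math.exp(-0.5 * y * y) / math.sqrt(2 * math.pi)

def kde(x, data, c, kernel=k_gaussian):
    """Kernel density at x: average of K((x - x_t)/c), divided by c."""
    n = len(data)
    return sum(kernel((x - xt) / c) for xt in data) / (n * c)

def integrate(f, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))

# Hypothetical observations and bandwidth, for illustration only
data = [-2.1, -0.7, 0.4, 0.9, 1.3]
c = 1.0

areas = {
    "biweight": integrate(k_biweight, -1, 1),
    "epanechnikov": integrate(k_epanechnikov, -1, 1),
    "gaussian": integrate(k_gaussian, -8, 8),
}
total_area = integrate(lambda x: kde(x, data, c), -30, 30)
print(areas, total_area)
```

Because each kernel has unit area and the sum is divided by $nc$, the estimate itself integrates to one. A larger `c` spreads each observation's weight over a wider range and gives a smoother curve.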

Consider our AAAYX mutual fund data from Chapter 2. Rose and Smith's (2002) software "mathStatica" can be used for nonparametric kernel density estimation in two steps. First, we specify the kernel as the Gaussian kernel. Second, we choose the bandwidth, denoted here by c = 1.104, based on the Sheather-Jones method (see Chapter 9). The nonparametric density for this dataset, given in Figure 4.3.4, clearly indicates that the density is nonnormal with a long left tail, suggesting skewness or the fact that downside risks are different from upside potential for growth.
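The two steps can be imitated in plain Python. This sketch does not implement the Sheather-Jones method. As a simpler stand-in it uses Silverman's (1986) rule-of-thumb bandwidth, $0.9 \min(s, \mathrm{IQR}/1.34)\, n^{-1/5}$, so the resulting $c$ will generally differ from a Sheather-Jones value. The data are made up and are not the AAAYX returns.

```python
import math
import statistics

def silverman_bandwidth(data):
    """Rule-of-thumb bandwidth: 0.9 * min(sd, IQR/1.34) * n^(-1/5)."""
    n = len(data)
    sd = statistics.stdev(data)
    # Quartiles use the statistics module's default interpolation method
    q1, _, q3 = statistics.quantiles(data, n=4)
    spread = min(sd, (q3 - q1) / 1.34)
    return 0.9 * spread * n ** (-1 / 5)

def gaussian_kde(x, data, c):
    """Step 1: Gaussian kernel. Density at x with bandwidth c."""
    n = len(data)
    total = sum(math.exp(-0.5 * ((x - xt) / c) ** 2) for xt in data)
    return total / (n * c * math.sqrt(2 * math.pi))

# Hypothetical returns with a few large losses (long left tail)
returns = [-9.5, -4.2, -1.8, -0.6, 0.3, 0.9, 1.4, 1.9, 2.3, 2.8]

# Step 2: choose the bandwidth (stand-in rule, not Sheather-Jones)
c = silverman_bandwidth(returns)

# Evaluate the density on a grid from -12 to 4 in steps of 0.5
grid = [-12 + 0.5 * i for i in range(33)]
density = [gaussian_kde(x, returns, c) for x in grid]
print(round(c, 3))
```

Plotting `density` against `grid` for data of this shape gives a curve whose left tail decays more slowly than its right tail.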

### 4.4 ALTERNATIVE DISTRIBUTIONS

This section discusses some probability distributions that are potentially useful for studying financial data, specifically those that can account for downside risk and skewness. We provide graphics for these distributions so that the reader can assess their applicability in any particular situation. Recent finance literature uses many of these nonnormal distributions and their generalizations. For example, Rachev et al. (2001) recommend VaR calculations based on a stable Pareto variable, because these can be readily decomposed into the mean or centering part, skewness part, and dependence (autocorrelation) structure. Research papers in finance often assume that the reader is familiar with many nonnormal distributions. Hence we provide some of the basic properties of the distributions listed above, with an emphasis on properties for which analytical expressions are available.
