With the Histogram analysis, you can view the distribution of values of one or more series. It’s typically presented as a report containing the main distribution statistics and as a category chart displaying the distribution among bins. Use it if you want to know whether values are within normal limits or if they’re skewed in one direction.
Density vs. Cumulative distribution
This setting determines if the density or the cumulative distribution should be estimated.
- Density: measures the items that fall into each individual bin.
- Cumulative distribution: measures the items that fall into each bin plus all previous bins.
Relative vs. Count
This setting determines the unit of the histogram.
- Relative: the unit is the proportion of items per bin.
- Count: the unit is the absolute number of items per bin.
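The two settings above combine into four possible outputs. As a minimal sketch (not the application's own implementation), the following Python example bins a small made-up sample and derives all four variants; the function name `bin_counts` and the sample values are illustrative assumptions.

```python
from itertools import accumulate

def bin_counts(values, start, bin_size, n_bins):
    """Count how many values fall into each of n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        idx = int((v - start) // bin_size)  # index of the bin containing v
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

values = [0.1, 0.4, 0.6, 1.2, 1.3, 1.9, 2.5, 2.6]  # made-up sample

# Density + Count: items in each individual bin
density_count = bin_counts(values, start=0.0, bin_size=1.0, n_bins=3)
# Cumulative distribution + Count: each bin plus all previous bins
cumulative_count = list(accumulate(density_count))
# Relative: proportions instead of absolute numbers
density_relative = [c / len(values) for c in density_count]
cumulative_relative = [c / len(values) for c in cumulative_count]
```

With a relative cumulative distribution, the last bin always ends at 1 when every observation falls inside the binned range.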
Methods available when density is selected
- Uniform kernel: measures the number of items that fall exactly into each bin.
- Normal kernel: uses a smooth windowing function based on the standard normal density function when calculating the value for each bin.
Methods available when cumulative distribution is selected
- Uniform: measures the number of items until the end of each bin.
- Empirical: is calculated not by dividing the range into bins, but by using the actual points of observations.
Start & End
Here, you can specify a data sample range. If left blank, the whole series is used.
If the automatic bin size option is checked, the application calculates the bin size as described in the theory section below.
If the automatic bin size is turned off, you can specify the bin size yourself.
The normal distribution options control whether a normal distribution series is added to the output.
- None: no normal distribution series is calculated.
- Automatic: calculates and outputs a normal distribution series with the same mean and standard deviation as the empirical data.
- Manual: calculates and outputs a normal distribution series with the specified mean and standard deviation.
The normal distribution is calculated by using the same function we use for NormDist(value, mean, stddev) in the Macrobond formula language.
If relative output is not selected, then the result is multiplied by n, the number of elements in the series.
The histogram analysis automatically generates a report. This report contains the following statistical measurements:
- excess kurtosis
- percentiles (10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%)
- tail expectations, e.g. the extreme percentiles 1% and 99%
- averages below and above these levels
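To make these measurements concrete, here is a rough Python sketch using only the standard library. The function names are hypothetical, and the estimator choices (population form of the kurtosis, linear interpolation for percentiles, tails including the boundary value) are assumptions; the report may use different conventions.

```python
import statistics

def excess_kurtosis(values):
    """Fourth standardized moment minus 3 (population form, an assumption)."""
    n = len(values)
    mean = statistics.fmean(values)
    m2 = sum((v - mean) ** 2 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 ** 2) - 3.0

def deciles(values):
    """The 10%, 20%, ..., 90% percentiles, linearly interpolated."""
    return statistics.quantiles(values, n=10, method="inclusive")

def tail_expectations(values):
    """1% and 99% percentiles plus the averages at or beyond those levels."""
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    low, high = cuts[0], cuts[-1]
    below = statistics.fmean([v for v in values if v <= low])
    above = statistics.fmean([v for v in values if v >= high])
    return low, below, high, above
```

A normal distribution has an excess kurtosis of 0, so positive values indicate heavier tails than normal and negative values indicate lighter tails.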
Automatic bin size
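The automatic bin size, h, is calculated as follows:
$$h = \left(\frac{4\sigma^5}{3n}\right)^{1/5} \approx 1.06\,\sigma\,n^{-1/5}$$
where σ is the standard deviation of the series and n is the number of observations.
When a normal kernel is used and the underlying data is normally distributed, this bin size minimizes the mean integrated squared error of the estimate. It’s often called the normal distribution approximation or Silverman's rule of thumb.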
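As an illustration, the sketch below computes a bin size with the commonly cited normal-reference form of Silverman's rule, h = (4σ⁵ / 3n)^(1/5) ≈ 1.06·σ·n^(-1/5). That the application uses exactly this form, and the sample (n - 1) standard deviation, are assumptions; the function name `silverman_bin_size` is hypothetical.

```python
import statistics

def silverman_bin_size(values):
    """Normal-reference rule of thumb: h = (4 * sigma^5 / (3 * n)) ^ (1/5)."""
    n = len(values)
    sigma = statistics.stdev(values)  # sample standard deviation (assumption)
    return (4.0 * sigma ** 5 / (3.0 * n)) ** 0.2

h = silverman_bin_size([1.0, 2.0, 3.0, 4.0, 5.0])
```

Because h shrinks only with the fifth root of n, the sample has to grow 32-fold before the bin size halves.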
The density function measures the proportion of items that fall into each bin. One series contains the estimated density and the other the midpoints of the bins. We offer two ways of measuring this.
Uniform kernel: if y is the midpoint of a bin, then the value of that bin is the number of values in the bin, that is, the values that lie in the interval of width h (the bin size) centered on y.
If relative output is selected, then the result is divided by n and by the bin size.
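The following Python sketch shows one way to read this rule. The function name is hypothetical, and treating each bin as half-open (lower edge included, upper edge excluded) is an assumption made so that no value is counted twice.

```python
def uniform_kernel_histogram(values, midpoints, bin_size, relative=False):
    """Count the values in the bin around each midpoint.

    With relative=True the count is divided by n and by the bin size,
    so that the bars integrate to 1 over the binned range.
    """
    half = bin_size / 2.0
    n = len(values)
    result = []
    for y in midpoints:
        # Half-open bin [y - h/2, y + h/2): an assumption about edge handling
        count = sum(1 for x in values if y - half <= x < y + half)
        result.append(count / (n * bin_size) if relative else count)
    return result

values = [0.1, 0.2, 0.6, 0.9, 1.4]  # made-up sample
counts = uniform_kernel_histogram(values, [0.25, 0.75, 1.25], 0.5)
density = uniform_kernel_histogram(values, [0.25, 0.75, 1.25], 0.5, relative=True)
```

Dividing by the bin size as well as by n is what makes the relative output comparable with a density curve such as the normal distribution series.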
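Normal kernel: if y is the midpoint of a bin, then the value of that bin is a sum over all observations, in which each observation $x_i$ is weighted by the standard normal density function of its distance from y measured in bin sizes, $(y - x_i)/h$.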
If relative output is selected, then the result is divided by n.
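A smooth windowing function of this kind can be sketched as follows. This is an interpretation, not the application's exact formula: the function names are hypothetical, each observation is assumed to contribute the standard normal density of its distance to the midpoint measured in bin sizes, and only the division by n described above is applied. A textbook kernel density estimate would also divide by the bin size.

```python
import math

def std_normal_pdf(z):
    """Standard normal density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_kernel_histogram(values, midpoints, bin_size, relative=False):
    """Sum a normal-shaped weight for every observation at each midpoint."""
    n = len(values)
    result = []
    for y in midpoints:
        # Every observation contributes, with a weight that decays with distance
        total = sum(std_normal_pdf((y - x) / bin_size) for x in values)
        result.append(total / n if relative else total)
    return result
```

Unlike the uniform kernel, an observation near a bin edge influences both neighboring bins, which is what produces the smoother shape.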
Normal density function
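The normal density for the midpoint y is calculated for each bin as follows:
$$f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(y-\mu)^2}{2\sigma^2}\right)$$
where µ is the estimated or specified mean value of the series and σ is the estimated or specified standard deviation.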
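For reference, the standard normal density formula with mean µ and standard deviation σ can be evaluated at the bin midpoints as in this sketch. The function names are hypothetical; the scaling by n when relative output is not selected follows the description of the normal distribution setting.

```python
import math

def normal_density(y, mean, stddev):
    """Normal density: exp(-(y - mean)^2 / (2 * stddev^2)) / (stddev * sqrt(2 * pi))."""
    z = (y - mean) / stddev
    return math.exp(-0.5 * z * z) / (stddev * math.sqrt(2.0 * math.pi))

def normal_series(midpoints, mean, stddev, n, relative=True):
    """Normal density at each midpoint, multiplied by n for count output."""
    scale = 1.0 if relative else float(n)
    return [scale * normal_density(y, mean, stddev) for y in midpoints]
```

Python's `statistics.NormalDist(mean, stddev).pdf(y)` returns the same density and can serve as a cross-check.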
Cumulative distribution methods
The estimated cumulative distribution function is calculated in one of two ways.
Uniform: if y is the midpoint of a bin, then the value of that bin is the number of values that lie at or before the end of that bin, $y + h/2$, where h is the bin size.
If relative output is selected, the result is divided by n.
The other series contains the end points of the bins.
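A minimal sketch of this method in Python is shown below. The function name is hypothetical, and counting a value that lies exactly on a bin's end point as belonging to that bin is an assumption.

```python
def cumulative_uniform(values, midpoints, bin_size, relative=False):
    """Return (end_points, cumulative) for the given bin midpoints."""
    n = len(values)
    end_points = [y + bin_size / 2.0 for y in midpoints]
    cumulative = []
    for end in end_points:
        count = sum(1 for x in values if x <= end)  # all values up to the bin end
        cumulative.append(count / n if relative else count)
    return end_points, cumulative

values = [0.1, 0.2, 0.6, 0.9, 1.4]  # made-up sample
ends, cum = cumulative_uniform(values, [0.25, 0.75, 1.25], 0.5)
```

The result is paired with the end points rather than the midpoints because each value describes everything up to the end of its bin.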
The empirical distribution is calculated not by dividing the range into bins, but by using the actual points of observations. The first series simply contains the number of each observation, starting at 1. The second series contains the observed values in ascending order. If relative output is selected, each observation number is divided by n, so that the first series runs from 1/n to 1.
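The empirical method needs no bins at all, as this short Python sketch illustrates. The function name is hypothetical, and dividing the observation numbers (rather than the observed values) by n for relative output is an interpretation of the description above.

```python
def empirical_distribution(values, relative=False):
    """Return (observation_numbers, sorted_values) for the empirical method."""
    n = len(values)
    ordered = sorted(values)  # observed values in ascending order
    # Observation numbers 1..n, or 1/n..1 for relative output
    numbers = [i / n if relative else i for i in range(1, n + 1)]
    return numbers, ordered

numbers, ordered = empirical_distribution([0.9, 0.1, 1.4, 0.2])
```

Plotting the numbers against the sorted values gives a step-like curve that rises by one observation (or by 1/n) at every observed value.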
In this example, we looked at the distribution of the S&P 500 daily performances. We used the uniform kernel method and set the bin size to 0.5.