Variable-width (diagonally cut) histogram

dhist(x, a = 5 * iqr(x), nbins = grDevices::nclass.Sturges(x),
  rx = range(x, na.rm = TRUE), eps = 0.15, xlab = "x", plot = TRUE,
  lab.spikes = TRUE)

Arguments

x

is a numeric vector (the data)

a

is the scaling factor, default is 5 * IQR

nbins

is the number of bins, default is assigned by the Sturges method

rx

is the range covered by the bins, from the left edge of the left-most bin to the right edge of the right-most bin

eps

is used to set an artificial bound on the minimum width / maximum height of bins, as described on page 24 of Denby and Mallows (2009)

xlab

is the label for the x axis

plot

if TRUE, produces the plot; if FALSE, returns the heights, breaks and counts

lab.spikes

if TRUE, labels the percentage of data in the spikes
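
The defaults for a, nbins and rx can be inspected before calling dhist. The following is a minimal sketch using base R only. The sample x is hypothetical, and stats::IQR() is assumed to be an adequate stand-in for the iqr() call shown in the usage line.

```r
set.seed(42)
x <- rexp(100)  # hypothetical sample, for illustration only

# Default scaling factor: five times the interquartile range.
# Assumption: stats::IQR() stands in for the iqr() used in the signature.
a_default <- 5 * IQR(x, na.rm = TRUE)

# Default number of bins by Sturges' rule: ceiling(log2(n) + 1).
nbins_default <- grDevices::nclass.Sturges(x)

# Default range: left edge of the first bin to right edge of the last bin.
rx_default <- range(x, na.rm = TRUE)

a_default
nbins_default
rx_default
```

Passing a different value for any of these arguments overrides the corresponding default.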

Value

A list with two elements: heights, of length n, and breaks, of length n+1, giving the heights and break points of the histogram bars.

Details

When constructing a histogram, it is common to make all bars the same width. One could also choose to make them all have the same area. These two options have complementary strengths and weaknesses; the equal-width histogram oversmooths in regions of high density, and is poor at identifying sharp peaks; the equal-area histogram oversmooths in regions of low density, and so does not identify outliers. We describe a compromise approach which avoids both of these defects. We regard the histogram as an exploratory device, rather than as an estimate of a density.
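
The two extremes described above can be compared directly in base R. This is only an illustrative sketch, not the dhist algorithm: the data are hypothetical, equal-width breaks come from seq(), and equal-area breaks are approximated with sample quantiles so that each bin holds about the same number of observations.

```r
set.seed(1)
# Hypothetical data: a tight cluster plus three distant outliers.
x <- c(rnorm(200, mean = 0, sd = 0.1), 8, 9, 10)
nbins <- 8

# Equal-width breaks: the same bin width across the whole range.
breaks_width <- seq(min(x), max(x), length.out = nbins + 1)

# Equal-area breaks: quantiles, so each bin holds roughly n / nbins points.
breaks_area <- unname(quantile(x, probs = seq(0, 1, length.out = nbins + 1)))

h_width <- hist(x, breaks = breaks_width, plot = FALSE)
h_area  <- hist(x, breaks = breaks_area,  plot = FALSE)

h_width$counts       # the whole cluster falls into the first bin
h_area$counts        # roughly equal counts in every bin
diff(h_area$breaks)  # the last bin is very wide and absorbs the outliers
```

In this sketch the equal-width version hides the shape of the cluster inside a single bar, while the equal-area version merges the outliers into one wide, low bar. These are the two defects the compromise approach is meant to avoid.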

References

Lorraine Denby, Colin Mallows. Variations on the Histogram. Journal of Computational and Graphical Statistics. March 1, 2009, 18(1): 21-31. doi:10.1198/jcgs.2009.0002.