There are two ways to represent a distribution. Normally you do it as on the left diagram, with vertical bars (a histogram of the p.d.f.): each bar stands on its value along the X-axis, and its height is proportional to the likelihood/probability of that value. This form lends itself nicely to integrating the probabilities of falling into a range of values. It is, however, not suitable for finding the average. To find the average, you need to rotate the plot 90 degrees, so that bar height is proportional to the value and bar width to the probability of hitting it, as on the right.
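A minimal sketch of the two readings above, assuming a small discrete distribution given as hypothetical [value, probability] pairs: the left diagram sums probabilities over a range of values, while the right diagram reads the average as the total area of bars whose width is the probability and whose height is the value.

```python
# Hypothetical example distribution as [value, probability] pairs.
dist = [(1, 0.2), (2, 0.5), (3, 0.3)]

# Left diagram: integrate probability to get into a range of values.
p_in_range = sum(p for v, p in dist if 1 <= v <= 2)   # 0.2 + 0.5 = 0.7

# Right diagram: bars of width p and height v; the average is the
# total area under them, i.e. the sum of value * probability.
average = sum(v * p for v, p in dist)                  # 0.2 + 1.0 + 0.9 = 2.1

print(p_in_range, average)
```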

Specify the distribution in [value, likelihood] format below:

I started to think, however, that if we make the height of each bar proportional to the density on its interval, that is, the count of occurrences per unit of interval length, then the histogram becomes suitable for average computation. Making the bars wider averages the count over the interval length. One large interval which spans the whole histogram width gives the average density, the number of counts per full range of values. What is the difference from the right diagram then? Here we compute the average density, how many counts are added with every value, whereas the right diagram still computes the average value.
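A sketch of that observation, with hypothetical bin edges and counts: when bar height is density (count divided by bin width), merging all bins into one interval spanning the full range yields the average density, and that merged density is the width-weighted average of the per-bin densities.

```python
# Hypothetical histogram: bin edges and the count of occurrences per bin.
edges  = [0.0, 1.0, 2.0, 4.0]
counts = [10, 30, 20]

# Height of each bar = density = count per unit of interval length.
densities = [c / (edges[i + 1] - edges[i]) for i, c in enumerate(counts)]
# -> [10.0, 30.0, 10.0]

# One wide bin over the whole range gives the average density:
avg_density = sum(counts) / (edges[-1] - edges[0])    # 60 / 4 = 15.0

# which equals the width-weighted average of the per-bin densities.
widths = [edges[i + 1] - edges[i] for i in range(len(counts))]
weighted = sum(d * w for d, w in zip(densities, widths)) / sum(widths)

print(densities, avg_density, weighted)
```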
I recall that the probabilities (the heights) turned horizontal, as on the right figure, provide a convenient way to turn the uniform distribution on [0, 1] into an arbitrary one. Note that the total width of the right diagram is 1, so a Math.random draw, available in any computer, thrown at the diagram falls into one of the intervals. The height of the corresponding bar, i.e. the value, is reported as the random generator's output.
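This trick is a form of inverse-transform sampling, and can be sketched as follows (assuming the same hypothetical [value, probability] pairs; `random.random()` plays the role of Math.random):

```python
import random

# Hypothetical distribution as [value, probability] pairs; the
# probabilities partition [0, 1) into consecutive intervals.
dist = [(1, 0.2), (2, 0.5), (3, 0.3)]

def sample(u=None):
    """Map a uniform draw u in [0, 1) to a value: walk the cumulative
    probabilities and return the value of the interval u falls into."""
    if u is None:
        u = random.random()        # Python's analogue of Math.random
    acc = 0.0
    for value, p in dist:
        acc += p
        if u < acc:
            return value
    return dist[-1][0]             # guard against floating-point round-off

# sample(0.1) -> 1, sample(0.5) -> 2, sample(0.9) -> 3
```

Each value is returned with frequency proportional to its interval's width, i.e. its probability.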

[Interactive widget: p.d.f. plot of f(x) over a given range, with a selectable number of bins]
