We get used to the idea that as our sample size increases, our model becomes more reliable.
We all ‘know’ that the sample average of most distributions is asymptotically Normal (by the central limit theorem) and that the sample average gets closer to the population mean and such.
This corresponds to a specific type of randomness, let us call it ‘mild’ randomness, as in a sense nothing wild is going on – although a random (stochastic) process is underneath everything, with more data comes convergence and more reliability to our claims. Our models come from this – linear regression and so on. They are not exact and they accept being approximations, but they still do a decent job.
However, what if they were totally wrong?
Cauchy has a different idea of probability and as a result a different idea of randomness, call it ‘wild’ randomness.
Cauchy’s idea is as follows.
Consider an archer that is blindfolded and has a bow and arrow. He is to shoot at a target located on a wall that is infinite in height and length. We are to measure how far his arrow is away the target. We assume that he always shoots at the wall, somewhere.
For example. if he hits the target, we record as he is exactly units away from the target.
We can formulate a probability distribution based on this example and without loss of generality, our assumptions need not hold.
Deriving Cauchy’s distribution
We can represent the idea about the archer above as a right angled triangle, as seen below.
This makes sense when we look at it: we are at some location (labelled archer), say center to the target (labelled target) and some distance (adjacent to the angle ) away. Then our actual shot (labelled actual shot) is just a line segment from the target. Wherever the bow lands – above the target, to the right, left, below the target, etc, without loss of generality we can represent it as a right angled triangle, assuming that our shot hits the wall.
Then say we are interested in how far we are away from the target (the line segment actual shot – target) and how are we ourselves (the archer) are away from the target (the line segment archer- target). This can be represented by trigonometry and the angle can be computed. We have
Let the ratio between the line segment between actual shot and target} and the line segment between archer and target be called , which is fine as it is just a real number. Then we have
To measure how far his arrow is away from the target, we are interested in varying by the ratio . This makes sense – we do not vary with respect to as we are already fixing what we are varying with. We take the inverse tangent to get
Varying with respect to corresponds to our problem – how we change the ratio answers the question of how far away we are from the target. This is just the derivative!The derivative of with respect to is
This derivative defines our distribution – the term is our probability density function of the random variable that measures how far away we are from the target, with taking all real number values, which corresponds to us asking, are we away from the target?
We then get the probability distribution function defined by integrating
over all real values of and finding a constant to make this integral equal to . The constant is and we have the probability density function to be
where is an real number.
This is the Cauchy distribution.
Consider a real life process that conforms to ‘mild’ randomness – the heights of humans, for example.
If I collect heights of say five humans, it may not be close to the average. As I collect more heights I should get closer to the average, assuming that I am picking people randomly and not based on geographical location and other factors.
I get an expected value of what the height should be. I also get an idea of how far away from the mean I expect to be – this is the variance .
Do these ideas hold for the blinded archer? Well.. not really.
We can have a sequence of shots that are close to the target but if the archer’s next shot is miles away, all that ‘work’ is wiped out in the sense that the average from those previous shots, now considering this shot, will be totally different.
Although I have a ‘bare’ idea on what I expect to get – units away from the target, I can go anywhere. This type of randomness is far more wild – I am not building on my earlier, smaller samples. I do not have any expectation of how far away my shots will be, nor do I know how far I fluctuate from my expectation (which I do not know in the first place). This corresponds to me not knowing or .
Formally this can be shown as the expectation of is not finite, nor is the variance of finite.
The Central Limit Theorem and the Law of Large Numbers do not hold here. Taking the sample average and using it to infer information about this distribution is useless, because the next shot can change all of what we are working with. With the Normal distribution, this is not the case.
Difference between ‘mild’ and ‘wild’ randomness
Perhaps the difference between these types of randomness (mild and wild) can be seen in the plots. Consider the plot below of the probability density function of the Cauchy distribution for between $ latex-10$ and in intervals (which is good enough to get a measure of what the distribution looks like)
It does not look so much different to the plot of a probability density function of a (standard) Normal distribution, which is plotted below.
The Cauchy distribution has heavier tails – they do not dip as quickly as they do for the Normal distribution. This corresponds to having a higher chance of an arrow being incredibly away from the target to be more significant in Cauchy’s distribution than in the Normal – this also makes sense. Yet the Cauchy distribution has no (finite) expected value, no (finite) variance and various intuition about it fails – say bye to the Central Limit Theorem and the Law of Large Numbers.