The sample is not equal to the population! A brief introduction to the law of large numbers.

I'm PLUS
4 min read · Sep 10, 2021

We often come across strange infographics on social media. In recent years especially, there have been many reports about COVID-19 and vaccines. (You can click here to check the status of COVID-19 vaccination in Taiwan.) Perhaps because of my background in mathematics and statistics, some news stories and articles are extremely painful for me to read.

The sample is not equal to the population! (Photo by Jeswin Thomas on Unsplash)

Why is the sample not equal to the population?

More and more people have heard of the law of large numbers (LLN), but many take it to mean “this sample size is large enough to represent the population”.

The following is quoted from Wikipedia (law of large numbers):

In probability theory, the law of large numbers (LLN) is a theorem that describes the result of performing the same experiment a large number of times. According to the law, the average of the results obtained from a large number of trials should be close to the expected value and will tend to become closer to the expected value as more trials are performed.
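To make the quoted statement concrete, here is a minimal Python sketch that rolls a simulated fair six-sided die and tracks the running average. The function name `running_means`, the seed, and the choice of a die are illustrative assumptions and are not part of the Wikipedia text.

```python
import random

EXPECTED_VALUE = 3.5  # (1 + 2 + 3 + 4 + 5 + 6) / 6 for a fair die


def running_means(n_rolls, seed=0):
    """Roll a fair six-sided die n_rolls times and return the sample mean after each roll."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0
    means = []
    for i in range(1, n_rolls + 1):
        total += rng.randint(1, 6)  # randint includes both endpoints
        means.append(total / i)
    return means


means = running_means(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    gap = abs(means[n - 1] - EXPECTED_VALUE)
    print(f"n={n:>6}  sample mean={means[n - 1]:.4f}  gap={gap:.4f}")
```

The gap to 3.5 tends to shrink as n grows, although not necessarily at every step. The convergence relies on every roll coming from the same fair die; a poll has to meet the equivalent condition before the LLN says anything about it.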

For example, when polling for an election, people may think that a “large enough sample” is achieved by conducting telephone interviews with 2,000 or 20,000 voters. But they often overlook what the law of large numbers actually states, the conditions that must be met for a sample to be representative of the population, and even whether their “sample” was drawn with a reasonable sampling method.

The picture shows the results of a “Presidential Primary Poll”. (Source: TVBS News)

The picture above shows the results of surveys conducted by Taiwanese pollsters on the “presidential primary election”. Interestingly, the left and right sides of the picture show pollsters with different stances, and their survey results are quite different. Supporters of both sides may think that the sample size is large and that, according to the LLN, the survey results should be representative of the entire Taiwanese population. So when the actual election results came out, they were very dismayed. With this in mind, we should avoid overstating what a sample can tell us when interpreting data in general.
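The effect of a skewed sampling method can be sketched with a small simulation. Everything here is hypothetical: the true support level, the two reach probabilities, and the function name `poll` are made-up values for illustration and do not come from the TVBS data.

```python
import random

TRUE_SUPPORT = 0.50  # hypothetical share of the population backing candidate A
REACH_A = 0.6        # hypothetical chance that a supporter of A ends up in the sample
REACH_OTHER = 0.3    # hypothetical chance that anyone else ends up in the sample


def poll(n, biased, rng):
    """Return the share of A supporters among n collected respondents."""
    hits = 0
    collected = 0
    while collected < n:
        supports_a = rng.random() < TRUE_SUPPORT
        if biased:
            reach = REACH_A if supports_a else REACH_OTHER
            if rng.random() >= reach:
                continue  # this voter never makes it into the sample
        collected += 1
        hits += supports_a  # True counts as 1, False as 0
    return hits / n


rng = random.Random(42)
for n in (200, 2_000, 20_000, 200_000):
    fair_share = poll(n, False, rng)
    skewed_share = poll(n, True, rng)
    print(f"n={n:>7}  random sample={fair_share:.3f}  skewed sample={skewed_share:.3f}")
```

Under these assumptions the skewed poll settles near 0.3 / (0.3 + 0.15) = 2/3 rather than the true 0.50. A larger n only makes the skewed estimate more stable around the wrong value, because the LLN describes convergence to the expected value of whatever is actually being sampled.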

Photo by ThisisEngineering RAEng on Unsplash

Simply put, you cannot take a sample size that you "think" is large enough and treat the sample as the population. And leaving aside whether your sampling method is correct, if you want the LLN to endorse your conclusion, you should at least have a basic understanding of it. Then you will see how serious the issue is. The following is a very brief introduction to the LLN.

The LLN is very important in many fields: it guarantees stable long-term results for the averages of some random events. Here is the form of the LLN. (If you are interested in the detailed theory, you can click here.)

Source: Wikipedia, law of large numbers
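For readers who cannot see the formula image, a standard way to write the statement is sketched below. It assumes that $X_1, X_2, \dots$ are independent, identically distributed random variables with finite expected value $\mu$; the notation is a common textbook choice and may differ from the original image.

```latex
% Sample average of the first n observations
\bar{X}_n = \frac{1}{n}\left(X_1 + X_2 + \cdots + X_n\right)

% LLN: the sample average converges to the expected value
\bar{X}_n \to \mu \quad \text{as } n \to \infty
```

Note what the assumptions require: every observation must follow the same distribution as the population quantity being estimated. A sample drawn by a skewed method does not meet that requirement.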

There are two versions of the LLN, called the strong law and the weak law. They differ in the mode of convergence.

Weak law: the sample average converges in probability to the expected value.

Source: Wikipedia, law of large numbers
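In symbols, the weak law is commonly written as follows. This is a sketch in standard notation, assuming $\bar{X}_n$ denotes the average of the first $n$ observations and $\mu$ their common expected value; it may not match the original image exactly.

```latex
% Weak law: convergence in probability
\lim_{n \to \infty} \Pr\left( \left| \bar{X}_n - \mu \right| < \varepsilon \right) = 1
\quad \text{for every } \varepsilon > 0
```

In words: for any tolerance $\varepsilon$, however small, the chance that the sample average lies within $\varepsilon$ of $\mu$ approaches 1 as $n$ grows.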

Strong law: the sample average converges almost surely to the expected value.

Source: Wikipedia, law of large numbers
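In symbols, the strong law is commonly written as follows. Again this is a sketch in standard notation, assuming $\bar{X}_n$ denotes the average of the first $n$ observations and $\mu$ their common expected value.

```latex
% Strong law: almost sure convergence
\Pr\left( \lim_{n \to \infty} \bar{X}_n = \mu \right) = 1
```

Here the probability is taken over entire infinite sequences of observations: with probability 1, the sequence of sample averages itself has the limit $\mu$. Almost sure convergence implies convergence in probability, which is why this version is called "strong".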

So the key point to notice is "convergence of random variables". That is another very informative and mathematical topic. You can start with the references below:

I'm PLUS

Share experience and notes on marketing technology, data analysis, and data visualization. My email address: jliommm.jil@gmail.com