# Normal approximation to the binomial: rule of thumb

By | 04/12/2020

Example 1: Let X ∼ N(μ, σ²) represent the lifetime of light bulbs with μ = 1000 and σ = 200 (in hours). We can calculate the exact probability using the binomial table in the back of the book with $$n=10$$ and $$p=\frac{1}{2}$$.

The general rule of thumb for using the normal approximation to the binomial distribution is that the sample size n is sufficiently large if np ≥ 5 and n(1 − p) ≥ 5. In Example 1: 42% is the parameter and 39.6% is a statistic. As for the spread of all sample means, theory dictates the behavior much more precisely than saying that there is less spread for larger samples. Unfortunately, the approximated probability, 0.1867, is quite a bit different from the actual probability, 0.2517.

Since 5 is a small sample size, and the Central Limit Theorem does not guarantee that the sample mean coming from a skewed population is approximately normal unless the sample size is larger, we do not have enough information to solve the problem. There is roughly a 95% chance that $$\hat{p}$$ falls in the interval (0.58, 0.62). Similarly, the mean and variance for the approximately normal distribution of the sample proportion are p and p(1 − p)/n. For which of the following sample sizes is a normal model a good fit for the sampling distribution of sample proportions? Here again, we are working with a random variable, since random samples will have means that vary unpredictably in the short run but exhibit patterns in the long run.

... As a rule of thumb, the interval [μ − 5σ, μ + 5σ] should be completely inside the interval [0, N]. P(X ≥ 31) ≈ (normal approximation + continuity correction) ≈ P(Y ≥ 30.5) = P(Z ≥ (30.5 − 22.5) / 4.5) = P(Z ≥ 1.78) = (symmetry) = P(Z ≤ −1.78) = (table) = 0.0375. Roughly 10% of all college students in the United States are left-handed.
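The continuity-corrected calculation of P(X ≥ 31) can be checked numerically. The sketch below assumes X is binomial with n = 225 and p = 0.1 (which gives the mean 22.5 and standard deviation 4.5 used above); the helper names are illustrative, not taken from any source.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_tail_exact(n: int, p: float, k: int) -> float:
    """Exact P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def binom_tail_normal(n: int, p: float, k: int, correction: bool = True) -> float:
    """Normal approximation to P(X >= k).

    With correction=True the cutoff is moved from k to k - 0.5,
    which is the continuity correction described in the text.
    """
    mu = n * p
    sigma = sqrt(n * p * (1 - p))
    cutoff = k - 0.5 if correction else k
    return 1 - NormalDist(mu, sigma).cdf(cutoff)


# Assumed setting: n = 225 students, p = 0.1 left-handed, at least 31 successes.
n, p, k = 225, 0.1, 31
approx = binom_tail_normal(n, p, k)
uncorrected = binom_tail_normal(n, p, k, correction=False)
exact = binom_tail_exact(n, p, k)

print(f"normal approximation with correction:    {approx:.4f}")
print(f"normal approximation without correction: {uncorrected:.4f}")
print(f"exact binomial tail:                     {exact:.4f}")
```

The corrected value differs slightly from the table-based 0.0375 because the code does not round z to two decimals.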
Find the approximate probability of at least 27 in 225 (proportion 0.12) being left-handed. Just a couple of comments before we close our discussion of the normal approximation to the binomial. In the last part of the course, statistical inference, we will learn how to use a statistic to draw conclusions about an unknown parameter, either by estimating it or by deciding whether it is reasonable to conclude that the parameter equals a proposed value.

First note that the distribution of $$\hat{p}$$ has mean p = 0.6, standard deviation $$\sqrt{\frac{p(1-p)}{n}} = \sqrt{\frac{0.6(1-0.6)}{100}} \approx 0.05$$, and a shape that is close to normal, since np = 100(0.6) = 60 and n(1 − p) = 100(0.4) = 40 are both greater than 10. When p = 0.10, these conditions are not met for n = 20 or n = 50. With p = 0.1, the two conditions are met if: $$np=n(0.1)\ge 5$$ and $$n(1-p)=n(0.9)\ge 5$$. $$\frac{\sigma}{\sqrt{n}} = \frac{500}{\sqrt{36}} = 83.3$$. What is the probability that at least 2, but less than 4, of the ten people sampled approve of the job the President is doing?

Already on several occasions we have pointed out the important distinction between a population and a sample. If a variable is skewed in the population and we draw small samples, the distribution of sample means will be likewise skewed. Shape: Sample means closest to 3,500 will be the most common, with sample means far from 3,500 in either direction progressively less likely. When the sample size increased by a factor of four, the standard deviation decreased to about half of what it was previously.

In the previous problem, we determined that there is roughly a 99.7% chance that a sample proportion will fall between 0.04 and 0.16. Note that 0.12 is exactly 1 standard deviation above the mean. In other words, the mean of the distribution of $$\hat{p}$$ should be p. Spread: For samples of 100, we would expect sample proportions of females not to stray too far from the population proportion 0.6.
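The left-handed example can be worked through in a few lines. This is a sketch under the assumption that the population proportion is p = 0.10 and the sample size is n = 225, as the surrounding text suggests; like the text, it standardizes the sample proportion directly and does not apply a continuity correction.

```python
from math import sqrt
from statistics import NormalDist

# Assumed setting: 10% of students are left-handed, sample of 225.
p, n = 0.10, 225

# Standard deviation of the sample proportion: sqrt(p(1 - p) / n).
sd = sqrt(p * (1 - p) / n)

# Approximate sampling distribution of the sample proportion.
phat_dist = NormalDist(mu=p, sigma=sd)

# "At least 27 in 225" corresponds to a sample proportion of 0.12,
# which sits one standard deviation above the mean.
observed = 27 / n
prob_at_least_27 = 1 - phat_dist.cdf(observed)

# Interval that contains the sample proportion with probability about 0.997.
low, high = p - 3 * sd, p + 3 * sd

print(f"sd of sample proportion: {sd:.4f}")
print(f"P(p-hat >= 0.12) is about {prob_at_least_27:.4f}")
print(f"99.7% interval: ({low:.2f}, {high:.2f})")
```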
Some books suggest $np(1-p)\geq 5$ instead. Since the square root of sample size n appears in the denominator, the standard deviation does decrease as sample size increases. X is binomial with n = 225 and p = 0.1. Let $$X_i$$ denote whether or not a randomly selected individual approves of the job the President is doing.

X is binomial with n = 100 and p = 0.75, and would therefore be approximated by a normal random variable having mean μ = 100 × 0.75 = 75 and standard deviation σ = sqrt(100 × 0.75 × 0.25) = sqrt(18.75) ≈ 4.33. Below is a histogram representing the probability distribution of a binomial random variable (below the histogram you can see which binomial distribution it is). Sample means lower than 3,000 or higher than 4,000 might be surprising. We do not have enough information to solve this problem.

In Exploratory Data Analysis, we learned to summarize and display values of a variable for a sample, such as displaying the blood types of 100 randomly chosen adults using a pie chart, or displaying the heights of 150 males using a histogram and supplementing it with the sample mean $$\overline{X}$$ and sample standard deviation (S). According to the official M&M website, 20% of the M&M's produced by the Mars Corporation are orange. (a) There is a 95% chance that the sample proportion $$\hat{p}$$ falls between what two values? (a) Show that this setting satisfies the rule of thumb for the use of the Normal approximation (just barely).

Specifically, when we multiplied the sample size by 25, increasing it from 100 to 2,500, the standard deviation was reduced to 1/5 of the original standard deviation. Households of more than 3 people are, of course, quite common, but it would be extremely unusual for the mean size of a sample of 100 households to be more than 3. Verify whether n is large enough to use the normal approximation by checking the two appropriate conditions.
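Two of the calculations above are easy to reproduce: the normal parameters that stand in for a binomial with n = 100 and p = 0.75, and the way the standard deviation of a sample proportion shrinks with the square root of n. In this sketch the proportion p = 0.6 for the second calculation is an assumption borrowed from the part-time student example; the function names are illustrative.

```python
from math import sqrt


def binomial_normal_params(n: int, p: float) -> tuple[float, float]:
    """Mean and standard deviation of the normal curve matching Binomial(n, p)."""
    return n * p, sqrt(n * p * (1 - p))


def sd_sample_proportion(p: float, n: int) -> float:
    """Standard deviation of the sample proportion for samples of size n."""
    return sqrt(p * (1 - p) / n)


mu, sigma = binomial_normal_params(100, 0.75)
print(f"mean = {mu}, sd = {sigma:.2f}")

# Multiplying n by 25 divides the standard deviation by sqrt(25) = 5.
sd_100 = sd_sample_proportion(0.6, 100)
sd_2500 = sd_sample_proportion(0.6, 2500)
ratio = sd_100 / sd_2500
print(f"sd for n=100: {sd_100:.4f}, sd for n=2500: {sd_2500:.4f}, ratio: {ratio:.1f}")
```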
For the above coin-flipping question, the conditions are met because n × p = 100 × 0.50 = 50, and n × (1 − p) = 100 × (1 − 0.50) = 50, both of which are at least 10. So go ahead with the normal approximation. Doing this by hand using the binomial distribution formula is very tedious, and requires us to do 9 complex calculations. Explain why we can use the normal approximation in this case, and state which normal distribution you would use for the approximation. The accompanying statistic is the sample proportion of selections resulting in the number 7, which is $$\hat{p} = 3/15 = 0.2$$.

So, in summary, when $$p=0.5$$, a sample size of $$n=10$$ is sufficient. Because our sample size was at least 10 (well, barely!), we now see why our approximations were quite close to the exact probabilities. Shape: Theory tells us that if np ≥ 10 and n(1 − p) ≥ 10, then the sampling distribution is approximately normal. The continuity correction in this case would be: $$P(X_B \geq 13) \approx P(X_N \geq 12.5) = P\left(Z \geq \frac{12.5 - 10}{2.24}\right) = P(Z \geq 1.12) = P(Z \leq -1.12) = 0.1314$$.

(1) First, we have not yet discussed what "sufficiently large" means in terms of when it is appropriate to use the normal approximation to the binomial. Parameters are usually unknown, because it is impractical or impossible to know exactly what values a variable takes for every member of the population. To find P($$\hat{p} \leq 0.56$$), we standardize 0.56 to z = (0.56 − 0.60) / 0.05 = −0.80: P($$\hat{p} \leq 0.56$$) = P(Z ≤ −0.8) = 0.2119. In general, the farther $$p$$ is from 0.5, the larger the sample size $$n$$ needs to be. What would you expect to see in terms of the behavior of a sample proportion of females $$\hat{p}$$ if random samples of size 100 were taken from the population of all part-time college students?
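The continuity correction for P(X ≥ 13) can be compared against the exact binomial answer. This sketch assumes the underlying variable is binomial with n = 20 and p = 0.5, which matches the mean of 10 and standard deviation of about 2.24 in the formula above; it also evaluates P(X ≤ 8) for the same distribution, with and without the correction.

```python
from math import comb, sqrt
from statistics import NormalDist


def binom_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Assumed setting: 20 questions, each answered correctly with probability 0.5.
n, p = 20, 0.5
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# P(X >= 13): exact, then normal with the cutoff moved down to 12.5.
exact_ge_13 = 1 - binom_cdf(12, n, p)
corrected_ge_13 = 1 - normal.cdf(12.5)

# P(X <= 8): exact, plain normal (cutoff 8), corrected normal (cutoff 8.5).
exact_le_8 = binom_cdf(8, n, p)
plain_le_8 = normal.cdf(8)
corrected_le_8 = normal.cdf(8.5)

print(f"P(X >= 13): exact {exact_ge_13:.4f}, corrected normal {corrected_ge_13:.4f}")
print(f"P(X <= 8):  exact {exact_le_8:.4f}, plain normal {plain_le_8:.4f}, "
      f"corrected normal {corrected_le_8:.4f}")
```

The normal values differ in the third or fourth decimal from table-based figures because the code does not round z to two decimals.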
The Standard Deviation Rule applies: the probability is approximately 0.95 that $$\hat{p}$$ falls within 2 standard deviations of the mean, that is, between 0.6 − 2(0.05) and 0.6 + 2(0.05). Let's try a few more approximations. Here in the graph I have marked one standard deviation below and one standard deviation above the mean of 0.6. (In other words, the population proportion of females among part-time college students is p = 0.6.)

Question: We provided the rule of thumb that the normal approximation to the binomial distribution is adequate if $$p \pm 3\sqrt{pq/n}$$ lies in the interval (0, 1), that is, if $$n > 9\,\frac{\text{larger of } p \text{ and } q}{\text{smaller of } p \text{ and } q}$$. (a) For what values of n will the normal approximation to the binomial distribution be adequate if p = 0.5?

Therefore, as long as $$n$$ is sufficiently large, we can use the Central Limit Theorem to calculate probabilities for $$Y$$. Note: Because the normal approximation is not accurate for small values of n, a good rule of thumb is to use the normal approximation … What is the probability of getting no more than 8 correct? Which bag is more likely to have more than 40% blue M&M's? Since the scores on the SAT-M in the population follow a normal distribution, the sample mean automatically also follows a normal distribution, for any sample size.

To find P($$\hat{p}$$ ≤ 0.56), we standardize 0.56 to z = (0.56 − 0.60) / 0.01 = −4.00: P($$\hat{p}$$ ≤ 0.56) = P(Z ≤ −4.0) = 0, approximately. One option that we have is to use statistical software, which will provide the answer. Consider the appearance of the probability histogram for the distribution of X: Clearly, the shape of the distribution of X for n = 20, p = 0.5 has a normal appearance: symmetric, bulging at the middle, and tapering at the ends.
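The Standard Deviation Rule intervals quoted in this article, (0.5, 0.7) and (0.58, 0.62), can be reproduced as follows. The sketch assumes p = 0.6 with sample sizes 100 and 2,500; it uses the unrounded standard deviation, so the endpoints agree with the text only after rounding to two decimals.

```python
from math import sqrt


def two_sd_interval(p: float, n: int) -> tuple[float, float]:
    """Interval p +/- 2 standard deviations of the sample proportion.

    By the Standard Deviation Rule, the sample proportion falls in this
    interval with probability of roughly 0.95 when the normal
    approximation is adequate.
    """
    sd = sqrt(p * (1 - p) / n)
    return p - 2 * sd, p + 2 * sd


for n in (100, 2500):
    low, high = two_sd_interval(0.6, n)
    print(f"n = {n:>4}: roughly 95% of sample proportions in ({low:.2f}, {high:.2f})")
```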
(b) Answer the question in part (a) if p = 0.7, 0.4, 0.6, 0.1, 0.96, and 0.005. Does that mean all of our discussion here is for naught? Guidance: Note that 0.12 is exactly 1 standard deviation (0.02) above the mean (0.1). That is, there is a 24.6% chance that exactly five of the ten people selected approve of the job the President is doing. The Normal distribution can be used to approximate Binomial probabilities when n is large and p is close to 0.5.

We are trying to determine the probability that the mean annual salary of a random sample of 64 teachers from this state is less than $52,000. The normal approximation scheme works well if σ = √(npq) ≥ 3. According to the National Postsecondary Student Aid Study conducted by the U.S. Department of Education in 2008, 62% of graduates from public universities had student loans. Statistics are computed from the sample, and vary from sample to sample due to sampling variability. We see in the simulation that the standard deviation is 0.0675, which is very close to the predicted value. Our normal approximation only included the area up to 8. It is stated formally as the Central Limit Theorem.

In repeated sampling, we might expect that the random samples will average out to the underlying population mean of 3,500 g. In other words, the mean of the sample means will be µ, just as the mean of sample proportions was p. Spread: For large samples, we might expect that sample means will not stray too far from the population mean of 3,500. The SAT-Verbal scores of a sample of 300 students at a particular university had a mean of 592 and standard deviation of 73.
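The list of p values in part (b) can be run through a small calculator. This sketch assumes the rule in question is that p ± 3√(pq/n) must lie inside (0, 1), with q = 1 − p; solving both inequalities for n gives n > 9 · max(p, q) / min(p, q). Exact fractions are used so that boundary cases are not disturbed by floating-point rounding. The function name is illustrative.

```python
from fractions import Fraction
from math import floor


def min_n_three_sigma(p: str) -> int:
    """Smallest integer n with p +/- 3*sqrt(pq/n) strictly inside (0, 1).

    p is passed as a decimal string (for example "0.1") so that it can be
    converted to an exact Fraction.
    """
    prob = Fraction(p)
    q = 1 - prob
    # Both inequalities reduce to n > 9 * (larger of p, q) / (smaller of p, q).
    bound = 9 * max(prob, q) / min(prob, q)
    # Smallest integer strictly greater than the bound.
    return floor(bound) + 1


for p in ("0.5", "0.7", "0.4", "0.6", "0.1", "0.96", "0.005"):
    print(f"p = {p:>5}: n must be at least {min_n_three_sigma(p)}")
```

The required n grows quickly as p moves toward 0 or 1, which is the same pattern the np and n(1 − p) rules show.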
Sarah buys a small fun-size bag. Similarly, suppose I wanted to answer: What is the probability that the student gets at least 13 questions right? There is roughly a 95% chance that $$\hat{p}$$ falls in the interval (0.5, 0.7).

The general rule of thumb for using the normal approximation to the binomial distribution is that the sample size $n$ is sufficiently large if $np \geq 5$ and $n(1-p)\geq 5$. For example, suppose $$p=0.1$$. Probabilities for a binomial random variable X with n and p may be approximated by those for a normal random variable having the same mean and standard deviation as long as the sample size n is large enough relative to the proportions of successes and failures, p and 1 − p. Our Rule of Thumb will be to require that $$np \geq 10; n(1-p) \geq 10$$. The rule of thumb is that a sample size $$n$$ of at least 30 will usually suffice if the basic distribution is not too weird; although for many distributions smaller $$n$$ will do.

Annie buys a large family-size bag of M&M's. Note that if you look at the histogram, this makes sense. Suppose a random sample of 225 people is observed. In other words, rather than approximating P(X ≥ 31) by P(Y ≥ 31), approximate it by P(Y ≥ 30.5). The figure below illustrates this. The approximation can be improved upon by making the continuity correction: $$P(X_B \leq 8) \approx P(X_N \leq 8.5) = P\left(Z \leq \frac{8.5 - 10}{2.24}\right) = P(Z \leq -0.67) = 0.2514$$.

First, recognize in our case that the mean is $$\mu=np=10\left(\dfrac{1}{2}\right)=5$$ and the variance is $$\sigma^2=np(1-p)=10\left(\dfrac{1}{2}\right)\left(\dfrac{1}{2}\right)=2.5$$. The mean of the sample means is the population mean; therefore, the mean of the sample means or the sampling distribution of the mean is 100. The accompanying statistics are sample mean $$\overline{x} = 270$$ and sample standard deviation s = 14.
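The two thresholds mentioned here (5 and 10) translate directly into a minimum sample size for a given p. The sketch below is one way to compute it; the function name and the choice to pass p as a decimal string (so exact fractions can be used) are illustrative assumptions.

```python
from fractions import Fraction
from math import ceil


def min_sample_size(p: str, threshold: int = 10) -> int:
    """Smallest n with n*p >= threshold and n*(1 - p) >= threshold.

    The binding condition is always the one involving the rarer outcome,
    so n = ceil(threshold / min(p, 1 - p)).
    """
    prob = Fraction(p)
    rarer = min(prob, 1 - prob)
    return ceil(Fraction(threshold) / rarer)


for p in ("0.5", "0.1"):
    for threshold in (5, 10):
        n = min_sample_size(p, threshold)
        print(f"p = {p}, threshold {threshold}: need n >= {n}")
```

With p = 0.5 and the threshold of 5, the result is n = 10; with p = 0.1 and the threshold of 10, it is n = 100, so samples of 20 or 50 fall short.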
The purpose of this next activity is to give guided practice in finding the sampling distribution of the sample proportion. I can still find the mean of the distribution and the SD. We then conducted four simulations, drawing random samples of different sizes from this collection. The mean of the sampling distribution stayed at 0.6. In the simulation, when we are building a sampling distribution, what does each dot represent in the graph? What is the mean of the sample means? What interval is almost certain (probability 0.997) to contain the sample proportion of left-handed people? Here you see the resulting sampling distributions and corresponding summary tables: Explain how these simulations illustrate the theory discussed above.
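A simulation along these lines can be sketched in a few lines of Python. The population proportion p = 0.6 comes from the part-time student example; the four sample sizes, the number of repetitions, and the random seed are arbitrary choices made for illustration and are not the values behind the simulations described above.

```python
import random
from math import sqrt
from statistics import mean, stdev


def simulate_sample_proportions(p: float, n: int, reps: int, rng: random.Random) -> list[float]:
    """Draw `reps` random samples of size n and return each sample proportion."""
    return [sum(rng.random() < p for _ in range(n)) / n for _ in range(reps)]


rng = random.Random(2020)   # fixed seed so repeated runs give the same numbers
p, reps = 0.6, 1000
results = {}

for n in (25, 100, 400, 1600):
    proportions = simulate_sample_proportions(p, n, reps, rng)
    # Each entry of `proportions` plays the role of one dot in the graph:
    # the proportion observed in a single random sample.
    results[n] = (mean(proportions), stdev(proportions))
    predicted_sd = sqrt(p * (1 - p) / n)
    print(f"n = {n:>4}: mean {results[n][0]:.3f}, "
          f"sd {results[n][1]:.4f}, predicted sd {predicted_sd:.4f}")
```

The simulated means should stay near 0.6 for every sample size, while the simulated standard deviations should track sqrt(p(1 − p)/n) and halve each time n is multiplied by four.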