Hypothesis testing and p-values (video)

Want to join the conversation?

robbimj
Posted 13 years ago
Starting at 4:22, why do you need to estimate the sample standard deviation when you already have it (0.5)? He goes on to say that you put a hat on it to show that you estimated the population standard deviation by using the sample, but why does the sigma have a hat for the population estimate and an x bar for the sample? Is the notation correct in that section?
(69 votes)
- byslkwd
  Posted 11 years ago
  Don't forget, we don't really care about the standard deviation of the sample itself; we care about its relationship to the population. So we have to take measures that involve the actual population. You should first watch the video "standard error of the mean" to follow this one.
  (8 votes)
MLKandigian
Posted 12 years ago
Why are you not using a t-distribution to find the probability of getting the sample result? I know that when the sample size is large (n = 100), a t-distribution is essentially the same as a normal distribution, but I think this lesson can be misleading when we are taught to use a t-distribution in the common case when the population standard deviation is not known and we are estimating it from the sample.
(65 votes)
- Cathy Antonakos
  Posted 12 years ago
  The t-test is more conservative, if the sample size is small. I think you would opt for the more conservative test, knowing that with a larger sample size, there is essentially no difference between t and z. In general, when comparing two means, the t-test is used. Note from the results given above by ericp, that the conclusion from either test is the same. The two groups differ significantly. In scientific reports, p-value is reported to 2 decimal places. So using either the z or t test, you would report a significant difference "with p < .01".
  (22 votes)
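To make the z versus t comparison in this thread concrete, here is a minimal sketch that computes the two-tailed p-value for a test statistic of 3 both ways. It assumes n = 100 (so 99 degrees of freedom for the t-test) and uses only the Python standard library; the helper names `t_pdf`, `t_two_tailed_p`, and `z_two_tailed_p` are illustrative, and the t-distribution tail is obtained by simple numerical integration rather than a statistics package.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def t_two_tailed_p(t_stat: float, df: int, steps: int = 2000) -> float:
    """Two-tailed p-value via Simpson's rule on [0, |t|] (steps must be even)."""
    upper = abs(t_stat)
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_half = total * h / 3      # P(0 < T < |t|)
    return 1.0 - 2.0 * central_half   # area left over in both tails


def z_two_tailed_p(z_stat: float) -> float:
    """Two-tailed p-value under the standard normal distribution."""
    return 2.0 * NormalDist().cdf(-abs(z_stat))


p_z = z_two_tailed_p(3.0)          # normal approximation, as in the video
p_t = t_two_tailed_p(3.0, df=99)   # t-distribution with n - 1 = 99 degrees of freedom
print(f"z-test p = {p_z:.4f}, t-test p = {p_t:.4f}")
```

The t-based p-value comes out slightly larger (the more conservative test), but with a sample this large both fall well below 0.01, which matches the point made in the reply above.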
aayman.farzand
Posted 13 years ago
Shouldn't it be the other way around when calculating the Z value?
(1.05-1.2)/0.05 instead of (1.2-1.05)/0.05?
My professor always told me to do it that way. The final conclusion doesn't change in this case, but I just wanted to make sure that's the proper way.
(60 votes)
- ahmet
  Posted 11 years ago
  Since the normal probability distribution (bell curve) is symmetric around the mean, it doesn't matter. It gives the same result in terms of area under the curve; that's why the prof. wanted to make it less complex by saying that. But if we were dealing with a non-symmetric probability distribution, like the F distribution, then it would matter.
  Hope that helps.
  (16 votes)
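A small sketch of the symmetry argument, using the numbers from the video: swapping the order of subtraction only flips the sign of the z-score, and the two-tailed area is unchanged. The function name `two_tailed_p` is illustrative, not something from the video.

```python
from statistics import NormalDist


def two_tailed_p(z: float) -> float:
    """Area in both tails beyond |z| under the standard normal curve."""
    return 2.0 * NormalDist().cdf(-abs(z))


z_sample_minus_null = (1.05 - 1.2) / 0.05   # about -3
z_null_minus_sample = (1.2 - 1.05) / 0.05   # about +3

print(z_sample_minus_null, two_tailed_p(z_sample_minus_null))
print(z_null_minus_sample, two_tailed_p(z_null_minus_sample))
```

The convention (sample mean minus hypothesized mean) still matters for a one-tailed test, where the sign tells you which tail the result falls in.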
Brandon French
Posted 13 years ago
Is it valid to assume the sample SD is close to the population SD? Even if the sample size is high, the rats in the sample have been injected, how do we know that doesn't affect the sample SD?
(15 votes)
- w_mhs
  Posted 12 years ago
  It is an assumption you are making, justified by the fact that your Ho is that the drug has no effect, and that the populations (drug vs. no drug) will actually be identical. If the drug has no effect, then the standard deviation of drug and no drug rats should be the same. It is an assumption, justified with some logic, but not proven.
  In a research paper, this would be recognized as a weakness, but an unavoidable one, because it is impossible to know the true standard deviation of either population - you only know the samples.
  (11 votes)
André Kovac
Posted 9 years ago
I have a very fundamental question:
Short formulation of the question: Why is the hypothesis test designed the way it is? I want to know exactly why we can't calculate the probability of the alternative hypothesis given the sample directly and why we have to assume the null hypothesis is true?
Long formulation of the question: When conducting an experiment and setting up hypotheses about its outcome, what we actually want to know is whether our alternative hypothesis is true or at least how likely it is (i.e. the probability of the alternative hypothesis to be true), right?
The hypothesis test presented here only gives us the probability of the sample mean being this extreme, but not the probability of the real underlying population mean being extreme.
So why do we have to go through this process of calculating the probability of the sample given the null-hypothesis is true and then use this result to infer the likelihood of the alternative hypothesis?
Unfortunately I have not found one textbook on statistics yet which answers this fundamental question but I was reading so many enlightening answers here and hope to get an answer!
From my understanding of the hypothesis test I would answer my own question like this:
Since we don't know anything about the underlying population except the tested sample, we just are not able to do any calculations on it. This includes calculating the probability of the alternative hypothesis, because it is a hypothesis about the population.
We have to work under the assumption that the null hypothesis is true because otherwise we cannot really do anything: we wouldn't know where to center the normally distributed curve which we use to calculate significance... but somehow I am not entirely convinced by my own answer.
Even if my own answer to the question happens to be not far away from the truth, I would appreciate it very much if someone could elaborate a bit.
Thank you!
(12 votes)
- deka
  Posted 2 years ago
  H_0: the population mean that someone (including you) insists is true
  H_A: the population mean that someone else (including you, again) insists on instead, claiming H_0 can't be right
  sample_mean (and sample_std): the only evidence either side has for checking which is right
  In short, what you're doing with a significance test is attacking someone's mean with a different mean, based on gathered data.
  If the evidence is good enough to support you, you can kill H_0 and insist on your own H_A as the next H_0 (that's how scientific theories have been developed and challenged, and so forth).
  If not, you can't kill it. That's it, no more, no less. (What about your precious H_A? Just forget about it: not enough evidence.)
  (1 vote)
7speter
Posted 12 years ago
I don't understand where Sal got 99.7%... can anyone explain? (8:50)
(8 votes)
- Richard Haans
  Posted 12 years ago
  He mentioned this a couple of videos ago, but he is using the empirical rule, which states that, for a normal distribution, 99.7% of all values lie within 3 standard deviations of the mean. Similarly, 68.27% lie within 1 standard deviation and 95.45% within 2. See: http://en.wikipedia.org/wiki/68-95-99.7_rule
  (9 votes)
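The empirical-rule percentages quoted above can be checked directly from the standard normal cumulative distribution function. This is a minimal sketch using Python's standard library; `within_k_sd` is an illustrative helper name.

```python
from statistics import NormalDist


def within_k_sd(k: float) -> float:
    """Probability that a normal value lies within k standard deviations of its mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3):
    inside = within_k_sd(k)
    print(f"within {k} SD: {inside:.4%}, outside: {1 - inside:.4%}")
```

Note that the area outside 3 standard deviations is closer to 0.27% than to 0.3%; the video rounds it to 0.3%.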
Charis Apostolidis
Posted 11 years ago
Shouldn't we say that the alternative hypothesis is just μ < 1.2 s, and not in both directions?
(8 votes)
- Matthew Daly
  Posted 11 years ago
  That's an important question. In the end, it gets down to the reason that you are conducting the experiment. In this case, the null hypothesis is that the drug doesn't have an effect on response time, so you want to measure both tails. If your null hypothesis was that the drug doesn't have a negative effect on response time, then you would only measure one tail.
  (8 votes)
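As a rough illustration of the one-tailed versus two-tailed distinction discussed here, the sketch below computes both p-values for the video's result, taking z = -3 (sample mean 3 standard errors below the hypothesized mean). The variable names are illustrative.

```python
from statistics import NormalDist

z = -3.0  # sample mean is 3 standard errors below the hypothesized mean of 1.2 s

# Two-tailed: alternative hypothesis is "mean != 1.2", so both tails count.
p_two_tailed = 2.0 * NormalDist().cdf(-abs(z))

# One-tailed (lower): alternative hypothesis is "mean < 1.2", so only the left tail counts.
p_lower_tail = NormalDist().cdf(z)

print(f"two-tailed p = {p_two_tailed:.5f}, lower-tail p = {p_lower_tail:.5f}")
```

The one-tailed p-value is half the two-tailed one here, which is why the choice of alternative hypothesis should be made before looking at the data.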
theoriginalkista
Posted 12 years ago
How do you calculate the critical value? I can't find an explanation for it in your video list. Thank you!
(6 votes)
- Ricardo Saporta
  Posted 12 years ago
  Short answer: Critical values are generally chosen or looked up in a table (based on a chosen alpha).
  Longer answer:
  --------------------
  In this video there was no critical value set for the experiment. In the last seconds of the video, Sal briefly mentions a p-value threshold of 5% (0.05), which for a two-tailed test corresponds to a critical value of z = ±1.96. Since the experiment produced a z-score of magnitude 3, which is more extreme than 1.96, we reject the null hypothesis.
  Generally, one would choose an alpha (a percentage) which represents the "tolerance level for making a mistake.*" Then the corresponding critical value can be looked up in a table. [* The "mistake" being to incorrectly reject the null hypothesis. In other words, we made the error of claiming that the experiment had an effect when it did not.]
  The critical value is the cut-off point that corresponds to that alpha: if the null hypothesis is true, a value beyond the critical value occurs by chance less than alpha (%) of the time.
  See the Wikipedia page for z-tables and how to read them:
  http://en.wikipedia.org/wiki/Standard_normal_table
  Note that for an alpha of 5%, in a cumulative table, you would first divide your alpha in half for a two-tailed test, then subtract that from 1. That is the value you are looking for in the table. So we get 1 - (0.05/2) = 1 - 0.025 = 0.9750.
  We find 0.9750 in our table, look at the row: 1.9; look at the column: 0.06; add the two together to get the corresponding z-score: 1.96.
  (6 votes)
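The table lookup described above can also be done numerically with the inverse of the standard normal CDF. The following is a minimal sketch using Python's standard library; `two_tailed_critical_z` is an illustrative helper name, and the z-score of -3 is the one from the video.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha: float) -> float:
    """z cut-off such that the area beyond +/- z under the standard normal is alpha."""
    # Same recipe as the table lookup: find the z whose cumulative area is 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


critical = two_tailed_critical_z(0.05)
observed_z = -3.0
reject_null = abs(observed_z) > critical

print(f"critical z = {critical:.4f}, reject H0: {reject_null}")
```

Comparing |z| against the critical value and comparing the p-value against alpha are two views of the same decision.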
Lauren Gilbert
Posted 11 years ago
If we assume that the null hypothesis is true, then why do we assume that the sample mean is 1.2 sec? We already know that it's 1.05 sec.
(5 votes)
- tyersome
  Posted 11 years ago
  Because that is the null hypothesis (H0).
  What we are testing is how likely we are to have seen the data, under the assumption that H0 is true. Null hypothesis testing follows a somewhat backward seeming logic, but this is apparently pretty standard in mathematics.
  1) We calculate how probable it is that we would have seen the observed data if H0 is true.
  2) We then either reject H0 (or fail to reject it) depending on how often we are willing to wrongly reject H0 (this is the Type I error rate).
  3) If we reject H0 then we provisionally conclude that our alternate hypothesis could be true ...
  (7 votes)
Tombentom
Posted 8 years ago
Sal said: "Assuming the null hypothesis was true, if the probability of getting the result from the sample is very small, then we reject the null!"
But how can this be? If this probability is very small, common sense suggests that the null is indeed true. This is somewhat counterintuitive and I'm really confused!
Could someone please explain this in language a newbie like me can understand? I don't see the relationship here. It seems he took for granted that we already understood something, but many of us actually don't! So please point out the logic relating the null, the z-score, and the rejection. Thank you!
(2 votes)
- Dr C
  Posted 8 years ago
  I think that your comment shows you came to the proper realization. Maybe this will help clarify or solidify the ideas, or in case others don't fully see it as you did:
  First: There is some population (of rats injected with this drug), and this population has a mean, µ. We don't know the value of µ, but we can use a hypothesis test to give us some information about it (if we knew µ, we wouldn't have to do a hypothesis test at all).
  Second: We form a null hypothesis, in this case it is Ho: µ = 1.2. The 1.2 is just some value of interest to which we want to compare. In this case, it is the "status quo", the value of µ for rats NOT injected with the drug.
  Third: We calculate our test statistic (the z-score) and p-value, assuming that Ho is correct. That is, we are assuming that the rats injected with the drug have the same value of µ as the rats not injected with the drug. It's important to remember that this is an assumption, and it may not accurately reflect reality. The data values will follow the real value of µ, because that is reality. So if µ=1, then the values will tend to group around 1. If µ=1.2, then the values will tend to group around 1.2.
  So if the null hypothesis is wrong, the data will tend to group around a point that is not the value in the null hypothesis (1.2), and our p-value will tend to be very small. If the null hypothesis is correct, or close to being correct, then the p-value will tend to be larger, because the data values will group around the value we hypothesized.
  (6 votes)
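The logic in this reply can be sketched with a small simulation. It is only an illustration under stated assumptions: response times are taken to be normally distributed with a standard deviation of 0.5 s, each simulated experiment uses 100 rats, and the function names are hypothetical. The simulation counts how often a 5% two-tailed z-test rejects H0: µ = 1.2 when the true mean really is 1.2, and when it is actually 1.05.

```python
import math
import random
from statistics import NormalDist, mean, stdev


def z_test_p_value(sample, null_mean):
    """Two-tailed p-value, estimating the population SD with the sample SD."""
    standard_error = stdev(sample) / math.sqrt(len(sample))
    z = (mean(sample) - null_mean) / standard_error
    return 2.0 * NormalDist().cdf(-abs(z))


def rejection_rate(true_mean, trials=500, n=100, sd=0.5,
                   null_mean=1.2, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which H0 is rejected."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        if z_test_p_value(sample, null_mean) < alpha:
            rejections += 1
    return rejections / trials


rate_null_true = rejection_rate(true_mean=1.2)    # H0 matches reality
rate_null_false = rejection_rate(true_mean=1.05)  # H0 is wrong
print(f"rejected when H0 true:  {rate_null_true:.1%}")
print(f"rejected when H0 false: {rate_null_false:.1%}")
```

When H0 is true, small p-values are rare (roughly the 5% allowed by alpha); when the true mean is 1.05, small p-values are the typical outcome. That asymmetry is what justifies rejecting H0 after seeing a small p-value.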

Video transcript

A neurologist is testing the effect of a drug on response time by injecting 100 rats with a unit dose of the drug, subjecting each to neurological stimulus and recording its response time. The neurologist knows that the mean response time for rats not injected with the drug is 1.2 seconds. The mean of the 100 injected rats' response times is 1.05 seconds with a sample standard deviation of 0.5 seconds. Do you think that the drug has an effect on response time?
So to do this we're going to set up two hypotheses. We're going to say, one, the first hypothesis is we're going to call it the null hypothesis, and that is that the drug has no effect on response time. And your null hypothesis is always going to be-- you can view it as a status quo. You assume that whatever you're researching has no effect. So drug has no effect. Or another way to think about it is that the mean of the rats taking the drug should be the mean with the drug-- let me write it this way-- with the mean is still going to be 1.2 seconds even with the drug. So that's essentially saying it has no effect, because we know that if you don't give the drug the mean response time is 1.2 seconds.
Now, what you want is an alternative hypothesis. The hypothesis is no, I think the drug actually does do something. So the alternative hypothesis, right over here, that the drug has an effect. Or another way to think about it is that the mean does not equal 1.2 seconds when the drug is given.
So how do we think about this? How do we know whether we should accept the alternative hypothesis or whether we should just default to the null hypothesis because the data isn't convincing? And the way we're going to do it in this video, and this is really the way it's done in pretty much all of science, is you say OK, let's assume that the null hypothesis is true. If the null hypothesis was true, what is the probability that we would have gotten these results with the sample?
And if that probability is really, really small, then the null hypothesis probably isn't true. We could probably reject the null hypothesis and we'll say well, we kind of believe in the alternative hypothesis.
So let's think about that. Let's assume that the null hypothesis is true. So if we assume the null hypothesis is true, let's try to figure out the probability that we would have actually gotten this result, that we would have actually gotten a sample mean of 1.05 seconds with a standard deviation of 0.5 seconds. So I want to see if we assumed the null hypothesis is true, I want to figure out the probability-- and actually what we're going to do is not just figure out the probability of this, but the probability of getting something like this or even more extreme than this. So how likely of an event is that?
To think about that let's just think about the sampling distribution if we assume the null hypothesis. So the sampling distribution is like this. It'll be a normal distribution. We have a good number of samples, we have 100 samples here. So this is the sampling distribution. It will have a mean. Now if we assume the null hypothesis, that the drug has no effect, the mean of our sampling distribution will be the same thing as the mean of the population distribution, which would be equal to 1.2 seconds.
Now, what is the standard deviation of our sampling distribution? The standard deviation of our sampling distribution should be equal to the standard deviation of the population distribution divided by the square root of our sample size, so divided by the square root of 100. We do not know what the standard deviation of the entire population is. So what we're going to do is estimate it with our sample standard deviation. And it's a reasonable thing to do, especially because we have a nice sample size. The sample size is 100. So this is going to be a pretty good approximator. This is going to be a pretty good approximator for this over here.
So we could say that this is going to be approximately equal to our sample standard deviation divided by the square root of 100, which is going to be equal to our sample standard deviation is 0.5, 0.5 seconds, and we want to divide that by square root of 100 is 10. So 0.5 divided by 10 is 0.05. So the standard deviation of our sampling distribution is going to be-- and we'll put a little hat over it to show that we approximated it with-- we approximated the population standard deviation with the sample standard deviation. So it is going to be equal to 0.5 divided by 10. So 0.05.
So what is the probability-- so let's think about it this way. What is the probability of getting 1.05 seconds? Or another way to think about it is how many standard deviations away from this mean is 1.05 seconds, and what is the probability of getting a result at least that many standard deviations away from the mean. So let's figure out how many standard deviations away from the mean that is.
Now essentially we're just figuring out a Z-score, a Z-score for this result right over there. So let me pick a nice color-- I haven't used orange yet. So our Z-score-- you could even do the Z-statistic. It's being derived from these other sample statistics. So our Z-statistic, how far are we away from the mean? Well the mean is 1.2. And we are at 1.05, so I'll put that less just so that it'll be a positive distance. So that's how far away we are. And if we wanted it in terms of standard deviations, we want to divide it by our best estimate of the sampling distribution's standard deviation, which is this 0.05. So this is 0.05, and what is this going to be equal to? This result right here, 1.05 seconds. 1.2 minus 1.05 is 0.15. So this is 0.15 in the numerator divided by 0.05 in the denominator, and so this is going to be 3. So this result right here is 3 standard deviations away from the mean.
So let me draw this. This is the mean.
If I did 1 standard deviation, 2 standard deviations, 3 standard deviations-- that's in the positive direction. Actually let me draw it a little bit different than that. This wasn't a nicely drawn bell curve, but I'll do 1 standard deviation, 2 standard deviations, and then 3 standard deviations in the positive direction. And then we have 1 standard deviation, 2 standard deviations, and 3 standard deviations in the negative direction. So this result right here, 1.05 seconds that we got for our 100 rat sample is right over here. 3 standard deviations below the mean.
Now what is the probability of getting a result this extreme by chance? And when I talk about this extreme, it could be either a result less than this or a result of that extreme in the positive direction. More than 3 standard deviations. So this is essentially, if we think about the probability of getting a result more extreme than this result right over here, we're thinking about this area under the bell curve, both in the negative direction or in the positive direction.
What is the probability of that? Well we go from the empirical rule that 99.7% of the probability is within 3 standard deviations. So this thing right here-- you can look it up on a Z-table as well, but 3 standard deviations is a nice clean number that doesn't hurt to remember. So we know that this area right here, I'm doing it in just reddish-orange, that area right over here is 99.7%. So what is left for these two magenta or pink areas? Well if this is 99.7%, then both of these combined are going to be 0.3%. So both of these combined are 0.3-- I should write it this way or exactly-- are 0.3%. 0.3%. Or if we wrote it as a decimal it would be 0.003 of the total area under the curve.
So to answer our question, if we assume that the drug has no effect, the probability of getting a sample this extreme or actually more extreme than this is only 0.3%. Less than 1 in 300.
So if the null hypothesis was true, there's only a 1 in 300 chance that we would have gotten a result this extreme or more. So at least from my point of view this result seems to favor the alternative hypothesis. I'm going to reject the null hypothesis. I don't know 100% for sure. But if the null hypothesis was true there's only a 1 in 300 chance of getting this. So I'm going to go with the alternative hypothesis.
And just to give you a little bit of some of the names or the labels you might see in some statistics or in some research papers, this value, the probability of getting a result more extreme than this given the null hypothesis is called a P-value. So the P-value here, and that really just stands for probability value, the P-value right over here is 0.003. So there's a very, very small probability that we could have gotten this result if the null hypothesis was true, so we will reject it.
And in general, most people have some type of a threshold here. If you have a P-value less than 5%, which means less than a 1 in 20 shot, let's say, you know what, I'm going to reject the null hypothesis. There's less than a 1 in 20 chance of getting that result. Here we got much less than 1 in 20. So this is a very strong indicator that the null hypothesis is incorrect, and the drug definitely has some effect.
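The whole calculation walked through in the transcript can be sketched in a few lines. This is a minimal illustration, not part of the video: it uses the normal approximation described there, the variable names are illustrative, and the 5% threshold is the one mentioned at the end.

```python
import math
from statistics import NormalDist

null_mean = 1.2     # mean response time without the drug (seconds)
sample_mean = 1.05  # mean response time of the 100 injected rats
sample_sd = 0.5     # sample standard deviation, used to estimate the population SD
n = 100

# Estimated standard deviation of the sampling distribution of the mean.
standard_error = sample_sd / math.sqrt(n)

# How many standard errors the sample mean lies from the hypothesized mean.
z = (sample_mean - null_mean) / standard_error

# Probability of a result at least this extreme in either direction, assuming H0.
p_value = 2.0 * NormalDist().cdf(-abs(z))

alpha = 0.05
reject_null = p_value < alpha

print(f"standard error = {standard_error:.3f}")
print(f"z = {z:.2f}")
print(f"p-value = {p_value:.4f}")
print(f"reject H0 at alpha = {alpha}: {reject_null}")
```

The exact two-tailed area is slightly below the 0.003 obtained from the empirical rule, which does not change the conclusion.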

Hypothesis testing and p-values (video) | Khan Academy (2024)
