# INFERENTIAL STATISTICS

Rewrite the paper in your own words and answer the last two questions, 6 and 7.
1. What does p = .05 mean? What are some misconceptions about the meaning of p = .05? Why are they wrong? Should all research adhere to the p = .05 standard for significance? Why or why not?
The p-value is the probability of obtaining a result at least as extreme as the one observed (i.e., of rejecting the null hypothesis) by chance alone, given that the null hypothesis is actually true. By convention, a p-value < 0.05 is often considered significant. Thus p = .05 means that if the study were repeated many times, and if the results were purely random and not due to the intervention, then a reduction this large (or larger) would occur 5% of the time.
People interpret the p-value in many incorrect ways and try to draw conclusions from p-values that do not follow. There are several common misunderstandings about p-values:
1. The p-value is not the probability that the null hypothesis is true, nor is it the probability that the alternative hypothesis is false – it is not connected to either of these. Frequentist statistics does not, and cannot, attach probabilities to hypotheses.
2. The p-value is not the probability that a finding is “merely a fluke.” The calculation of the p-value is based on the assumption that every finding is a fluke, that is, the product of chance alone; so under that assumption, the probability that the result is due to chance is, trivially, one.
3. The p-value is not the probability of falsely rejecting the null hypothesis. This error is a version of the so-called prosecutor’s fallacy.
4. The p-value is not the probability that replicating the experiment would yield the same conclusion. Quantifying the replicability of an experiment was attempted through the concept of p-rep.
Source(s):
www.musc.edu/dc/icrebm/statisticalsignif…
http://en.wikipedia.org/wiki/P-value
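The repeated-study interpretation above can be made concrete with a small simulation. This is an illustrative sketch only – the coin-flip scenario and all counts are hypothetical, not taken from any cited source. We assume the null hypothesis is true, simulate many replications, and count how often chance alone produces a result at least as extreme as the one observed:

```python
import random

random.seed(0)

# Hypothetical example: we observed 60 heads in 100 coin flips.
# Under the null hypothesis (a fair coin), how often does chance alone
# produce a result at least this far from the expected 50 heads?
observed_heads = 60
n_flips = 100
n_sims = 100_000

extreme = 0
for _ in range(n_sims):
    heads = sum(random.random() < 0.5 for _ in range(n_flips))
    # two-sided: count results at least as far from 50 as the observed 60
    if abs(heads - 50) >= abs(observed_heads - 50):
        extreme += 1

p_value = extreme / n_sims
print(f"simulated p-value: {p_value:.3f}")
```

The simulated value lands near the exact two-sided binomial p-value (about 0.057), illustrating the definition: the p-value is computed entirely under the assumption that the null hypothesis is true.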
2. Compare and contrast the concepts of effect size and statistical significance.
http://meera.snre.umich.edu/plan-an-evaluation/related-topics/power-analysis-statistical-significance-effect-size
Effect size is estimated as the ratio of the mean difference between the two groups to the standard deviation of the control group. Testing for statistical significance, by contrast, helps you learn how likely it is that the observed changes occurred randomly and do not represent real differences due to the program.
Statistical tests look for evidence that you can reject the null hypothesis and conclude that your program had an effect. When a difference is statistically significant, that does not necessarily mean it is big, important, or helpful in decision-making; it simply means you can be confident that there is a difference. Effect size measures how large that difference actually is.
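The effect-size formula described above (mean difference divided by the control group's standard deviation, sometimes called Glass's delta) can be sketched directly. The scores below are made up for illustration:

```python
import statistics

# Hypothetical program-evaluation scores; the values are illustrative only.
control   = [70, 72, 68, 75, 71, 69, 73, 70]
treatment = [78, 80, 76, 82, 79, 77, 81, 78]

# Effect size as described above: the mean difference between the groups
# divided by the standard deviation of the control group.
mean_diff = statistics.mean(treatment) - statistics.mean(control)
effect_size = mean_diff / statistics.stdev(control)
print(f"effect size: {effect_size:.2f}")
```

Note that this number says nothing about statistical significance; it answers the separate question of how large the difference is, in standard-deviation units.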
3. What is the difference between a statistically significant result and a clinically or “real world” significant result? Give examples of both.
http://pmj.bmj.com/content/77/905/201.full
Statistical significance measures how likely it is that any apparent differences in outcome between treatment and control groups are real and not due to chance. p values and confidence intervals (CI) are the most commonly used measures of statistical significance. The p value gives the probability that any particular outcome would have arisen by chance, under the null hypothesis that the new and control treatments are equally effective. Confidence intervals estimate the range within which the real results would fall if the trial were conducted many times. Hence, a 95% CI of the difference in treatment outcomes between the two groups indicates the range within which the difference between the two treatments would fall on 95% of occasions, if the trial were carried out many times.
Clinical significance measures how large the differences in treatment effects are in clinical practice. Different measures have been devised. Relative risk is the ratio of the event rate in the treatment group to the event rate in the control group; it is independent of the prevalence of the disease and can be applied to populations with different prevalences. However, patients may not consider this measure relevant to them, as it does not specify the size of the absolute risk. The measures absolute risk reduction (ARR) and number needed to treat (NNT) vary with the prevalence of the disease. ARR is simply the difference in absolute risks between the treatment group and the control group. NNT is the number of patients who need to be treated to prevent one adverse event, and is numerically equal to 1/ARR. NNT has been highlighted as a meaningful measure of clinical significance. The level of treatment effect regarded as clinically significant also depends on the severity of the disease and any potential side effects of the treatment.
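The clinical-significance measures described above follow directly from the event rates. A minimal sketch, using hypothetical trial counts rather than data from any cited study:

```python
# Hypothetical trial: "event" = adverse outcome; counts are illustrative only.
control_events, control_n = 20, 100      # 20% event rate in controls
treatment_events, treatment_n = 15, 100  # 15% event rate with treatment

risk_control = control_events / control_n
risk_treatment = treatment_events / treatment_n

# Relative risk: ratio of event rates (independent of disease prevalence).
relative_risk = risk_treatment / risk_control

# Absolute risk reduction: difference in event rates between the groups.
arr = risk_control - risk_treatment

# Number needed to treat to prevent one adverse event: 1/ARR.
nnt = 1 / arr

print(f"RR  = {relative_risk:.2f}")
print(f"ARR = {arr:.2%}")
print(f"NNT = {nnt:.0f}")
```

Here the relative risk of 0.75 sounds like a sizeable effect, but the ARR of 5 percentage points means 20 patients must be treated to prevent one event – which is why NNT is often considered the more clinically meaningful figure.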
4. What is NHST? Describe the assumptions of the model.
Null hypothesis significance testing (NHST) is the researcher’s workhorse for making inductive inferences.
5. Describe and explain three criticisms of NHST.
http://www.andrews.edu/~rbailey/Chapter%20two/7217331.pdf
Why was there a discrepancy between the many articles acknowledging problems with NHST and the failure to recognize these problems by these research and statistics textbooks? We suggest three explanations for this apparent discrepancy. The first concerns textbook revisions. Most of these texts, especially the research texts, were in their third to sixth edition, and they had originally been published before 1990, when NHST was less controversial. Our speculation is that the authors of research textbooks in which we found little or no statement of the NHST controversy focused their revisions on updating the content literature about the studies and methods they cited. They may not have updated the statistics chapters much, perhaps assuming that statistics do not change. In addition, adding information about how to compute and report effect size and confidence intervals would change too many sections of the text. Authors have to be practical, considering publishing company deadlines and competition from other newer texts. Thus, textbook revisions in this area have generally been limited.
A second possible explanation for the failure to include NHST issues in textbooks concerns the level of depth, difficulty of concepts, and students’ prior knowledge. The textbooks that we reviewed were intended for master’s- or doctoral-level students in education, often as a first course in research or statistics or one taken many years after having an earlier such course. The logic of hypothesis testing is relatively difficult to understand, especially if students are not familiar with research design and statistics.
Our third possible explanation relates to best practice. Although there is general acknowledgement that each of the topics we explored should be covered in research and statistics textbooks, there is not general agreement about how it should be covered. This is especially the case with regard to how to decide whether a statistically significant finding has practical importance. There is also controversy about best practice for hypothesis significance testing (Harlow, Mulaik, & Steiger, 1997) and effect size reporting (Levin & Robinson, 2000; Robinson & Levin, 1997). Textbook authors (and publishers) are often reluctant to put best practices in print that may be changed in the next few years.
6. Describe and explain two alternatives to NHST. What do their proponents consider to be their advantages?
7. Which type of analysis would best answer the research question you stated in Activity 1? Justify your answer.