Non Parametric Tests

Submitted By raja9899
6 - NONPARAMETRIC TESTS FOR COMPARING TWO POPULATIONS

In situations where the normality of the population(s) is suspect or the sample sizes are so small that checking normality is not really feasible, it is sometimes preferable to use nonparametric tests to make inferences about “average” value.

Wilcoxon Rank Sum Test (a.k.a. Mann-Whitney U Test)
This test is an alternative to the two-sample t-test for comparing the “average” value of two populations where the samples from each population are taken independently.

The hypotheses tested can be stated as follows:

$H_0$: The distributions of population 1 and population 2 are identical. If the populations are symmetric (but not necessarily normal), the null hypothesis can be expressed in terms of the population medians as $H_0: M_1 = M_2$.

$H_a$: The distributions of population 1 and population 2 are different. (two-tailed) $H_a: M_1 \neq M_2$

or

$H_a$: The distribution of population 1 is shifted to the right of the distribution for population 2, i.e. the population 1 values are generally larger than the population 2 values. (right-tailed) $H_a: M_1 > M_2$

or

$H_a$: The distribution of population 1 is shifted to the left of the distribution for population 2, i.e. the population 1 values are generally smaller than the population 2 values. (left-tailed) $H_a: M_1 < M_2$

The test statistic is based on the sum of the ranks assigned to the observed data from each population when the combined sample is ranked from smallest to largest. We will always assume that the sample size (m) for population 1 is less than or equal to the sample size (n) for population 2.

Example: Anticipated Length of Office Visit and Weight Status of Patients
Researchers wanted to compare the anticipated office visit time for patients whose BMI indicates normal weight vs. those whose BMI indicates the patient is overweight. It is hypothesized that doctors will report a shorter anticipated office visit time for patients who are classified as overweight. Stated in terms of medians, the research hypothesis is that the median anticipated office visit time for normal weight patients is greater than that for overweight patients.

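$H_0: M_N = M_O$ vs. $H_a: M_N > M_O$, where $M_N$ and $M_O$ denote the median anticipated office visit times for normal weight and overweight patients.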

The data below are the anticipated office visit time (min) for these two groups of patients.

Normal: 20 25 30 35 40 45 50

Overweight: 5 10 15 15 20 30

The sum of the ranked appointment lengths for normal weight patients is: _________.

The sum of the ranked appointment lengths for overweight patients is: ____________.
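The ranking step can be sketched in Python as follows. This is a minimal illustration rather than part of the original handout: the helper name `midranks` is hypothetical, and the sketch assumes that tied values share the average of the ranks they occupy (the usual midrank convention), which matters here because 15, 20 and 30 each appear twice in the combined sample.

```python
def midranks(values):
    """Return ranks (1 = smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + 1 + j + 1) / 2  # mean of the ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average
        i = j + 1
    return ranks


normal = [20, 25, 30, 35, 40, 45, 50]
overweight = [5, 10, 15, 15, 20, 30]

# Rank the combined sample, then split the ranks back into the two groups.
ranks = midranks(normal + overweight)
rank_sum_normal = sum(ranks[:len(normal)])
rank_sum_overweight = sum(ranks[len(normal):])

print(rank_sum_normal, rank_sum_overweight)
```

A useful check on any hand calculation is that the two rank sums must add up to 13(14)/2 = 91, the sum of the ranks 1 through 13.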

The sum of the ranks for the overweight patients is smaller than the rank sum for the normal weight patients but this would be expected even if the null hypothesis were true. Why?

The test statistic, $W$, is the sum of the ranks for population O, the overweight patients. We can use the Wilcoxon rank sum table to decide whether or not to reject the null hypothesis. Intuitively, we will reject the null hypothesis if the sum of the ranked appointment lengths for the overweight patients is “small”. The table tells us what “small” is for a given significance level ($\alpha$).

For m = 6 and n = 7 we find the following from the table:

From Wilcoxon Rank Sum Table (the W columns give the lower and upper critical values):

|m |n |W (1-tail α = .025, 2-tail α = .050) |d |P |W (1-tail α = .050, 2-tail α = .100) |d |P |
|6 |7 |27, 57 |7 |.0175 |29, 55 |9 |.0367 |

The table says we will reject the null at the $\alpha = .05$ level if:
$W \le 29$ for $H_a: M_O < M_N$ (the case in this example)
$W \ge 55$ for $H_a: M_O > M_N$
$W \le 27$ or $W \ge 57$ for $H_a: M_O \neq M_N$

We have evidence to conclude that the anticipated office visit times for overweight patients are generally shorter than the anticipated office visit times for patients whose BMI is considered normal (p < .05).
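The tail probabilities in the table row for m = 6 and n = 7 can be checked by brute force. Under the null hypothesis, every choice of 6 of the 13 ranks for the smaller sample is equally likely, so the probability of a rank sum at or below a given value is a simple count. The sketch below assumes no ties (ranks 1 through 13), as the table does; `lower_tail` is a hypothetical helper name, not something from the handout or from JMP.

```python
from itertools import combinations
from math import comb


def lower_tail(w, m, n):
    """Exact P(W <= w) under H0, where W is the rank sum of the sample of size m."""
    all_ranks = range(1, m + n + 1)
    hits = sum(1 for subset in combinations(all_ranks, m) if sum(subset) <= w)
    return hits / comb(m + n, m)


p_27 = lower_tail(27, 6, 7)  # 1-tail alpha = .025 column
p_29 = lower_tail(29, 6, 7)  # 1-tail alpha = .050 column
print(round(p_27, 4), round(p_29, 4))
```

The two results agree with the P entries in the table (.0175 and .0367). Because the null distribution is symmetric about m(m + n + 1)/2 = 42, the upper critical values 57 and 55 carry the same tail probabilities as 27 and 29.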

WILCOXON RANK SUM TEST IN JMP
[pic]

Wilcoxon Signed Rank Test
This test is an alternative to the paired t-test, used when we do not wish to assume that the population of paired differences is normally distributed. As with the Mann-Whitney U test, the Wilcoxon Signed-Rank Test uses ranks, here the ranks of the paired differences, rather than the actual values.

Example: Effect of Togetherness on the Heart Rate of Rats

|Rat |Alone Rate |Together Rate |$d_i = T_i - A_i$ |Sign |$\lvert d_i \rvert$ |Rank of $\lvert d_i \rvert$ |Signed Rank |
|1 |463 |523 |60 | | | | |
|2 |462 |494 |32 | | | | |
|3 |462 |461 |-1 | | | | |
|4 |456 |535 |79 | | | | |
|5 |450 |476 |26 | | | | |
|6 |426 |454 |28 | | | | |
|7 |418 |448 |30 | | | | |
|8 |415 |408 |-7 | | | | |
|9 |409 |470 |61 | | | | |
|10 |402 |437 |35 | | | | |

We then calculate $W_+$ = the sum of the positive signed ranks = _______________

and $W_-$ = the sum of the negative signed ranks = _______________
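The columns of the table and the two signed-rank sums can be worked out with a short Python sketch. It is an illustration under stated assumptions, not the handout's own procedure: the variable names are hypothetical, the sum for the negative differences is reported as a positive number, and no tie or zero handling is included because this data set has no equal absolute differences and no zero differences.

```python
alone = [463, 462, 462, 456, 450, 426, 418, 415, 409, 402]
together = [523, 494, 461, 535, 476, 454, 448, 408, 470, 437]

# Paired differences d_i = T_i - A_i (Together minus Alone).
diffs = [t - a for t, a in zip(together, alone)]

# Order the differences by absolute size; position in this order is the rank.
by_size = sorted(diffs, key=abs)

# Attach the sign of each difference to its rank.
# Note: this list is in order of |d_i|, not in the original rat order.
signed_ranks = [rank if d > 0 else -rank for rank, d in enumerate(by_size, start=1)]

w_plus = sum(r for r in signed_ranks if r > 0)    # sum of positive signed ranks
w_minus = -sum(r for r in signed_ranks if r < 0)  # negative signed ranks, as a positive total
print(w_plus, w_minus)
```

As a check, the two sums must add up to n(n + 1)/2 = 55 for n = 10 pairs.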

Our hypotheses can be stated in terms of the median of the paired differences. Listed below are the hypotheses, along with the test statistic based on the signed rank sums used to test each one.

Intuitively we will conclude there has been a heart rate increase if….

Details of the Wilcoxon Signed Rank Test

Let $M_d$ denote the median of the population of paired differences.

$H_0: M_d = 0$ vs. $H_a: M_d \neq 0$ (two-tailed) Test statistic [pic]
In practical terms the null says there is no change in the rats' heart rate after the change in environment; the alternative says there is a shift up or down in their heart rate.

$H_0: M_d = 0$ vs. $H_a: M_d > 0$ (right-tailed) Test statistic [pic]
In practical terms the alternative says that there is an increase or shift up in their heart rate, as the difference is defined to be Together – Alone.

$H_0: M_d = 0$ vs. $H_a: M_d < 0$ (left-tailed) Test statistic [pic]
In practical terms the alternative says that there is a decrease or shift down in their heart rate, as the difference is defined to be Together – Alone.

For this example, if we had originally hypothesized that the heart rate of a rat will increase when it is placed in a social environment, then we have the right-tailed alternative and our test statistic W = _______.

The Wilcoxon Signed Rank Test Table (handed out) gives p-values associated with an observed test statistic value w for a given sample size, i.e. number of pairs, n.
[pic]

Here our p-value = ____________, thus we reject the null hypothesis and conclude that the heart rate of a rat will generally increase when it is taken from solitary confinement and placed in a social environment with other rats.
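For readers without the printed table, the exact tail probability can be obtained by enumeration. Under the null hypothesis each of the ranks 1, ..., n is equally likely to belong to a positive or a negative difference, which gives 2^n equally likely sign patterns. The sketch below assumes no ties and no zero differences; `upper_tail` is a hypothetical helper name, and whether the handed-out table is indexed by this tail or by the smaller rank sum is not confirmed by the text. For the rat data, the two negative differences (-1 and -7) have the two smallest absolute values, so the positive signed ranks sum to 55 - (1 + 2) = 52.

```python
from itertools import product


def upper_tail(w, n):
    """Exact P(W+ >= w) under H0 for n untied, nonzero paired differences."""
    hits = 0
    for signs in product((0, 1), repeat=n):
        # signs[k] == 1 means rank k + 1 belongs to a positive difference.
        w_plus = sum(rank for rank, s in enumerate(signs, start=1) if s)
        if w_plus >= w:
            hits += 1
    return hits / 2 ** n


p_value = upper_tail(52, 10)
print(p_value)
```

Only 5 of the 1024 sign patterns give a positive rank sum of 52 or more, so the right-tailed p-value is 5/1024, or roughly .005.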

WILCOXON SIGNED RANK TEST IN JMP

We first use JMP to form the paired differences as we did for the paired t-test.
[pic]

Select Distribution > Test Mean > Enter 0 for the hypothesized value and check the nonparametric test box. [pic]

The results of the test are shown below.
[pic]
Conclusion:
TABLE FOR WILCOXON RANK SUM TEST (Page 1)
[pic]

TABLE FOR WILCOXON RANK SUM TEST (Page 2)
[pic]

TABLE FOR WILCOXON SIGNED RANK TEST (Page 1)
[pic]

TABLE FOR WILCOXON SIGNED RANK TEST (Page 2)
[pic]

-----------------------
Data Table
[pic]

Select Nonparametric > Wilcoxon Test
[pic]

The p-values for the upper-tailed t-Test and the Wilcoxon signed-rank test have been highlighted. The test statistic reported by JMP for Wilcoxon test =[pic]
Why? I don’t know, but we only need the p-value anyway.
