Premium Essay

In: Science

Submitted By hanashin

Words 66826

Pages 268

Larry Winner
Department of Statistics
University of Florida
February 23, 2009

Contents

1 Introduction
  1.1 Populations and Samples
  1.2 Types of Variables
    1.2.1 Quantitative vs Qualitative Variables
    1.2.2 Dependent vs Independent Variables
  1.3 Parameters and Statistics
  1.4 Graphical Techniques
  1.5 Basic Probability
    1.5.1 Diagnostic Tests
  1.6 Exercises

2 Random Variables and Probability Distributions
  2.1 The Normal Distribution
    2.1.1 Statistical Models
  2.2 Sampling Distributions and the Central Limit Theorem
    2.2.1 Distribution of Y
  2.3 Other Commonly Used Sampling Distributions
    2.3.1 Student’s t-Distribution
    2.3.2 Chi-Square Distribution
    2.3.3 F-Distribution
  2.4 Exercises
...

Premium Essay

... Cases Used: All non-missing data are used.
Syntax: DESCRIPTIVES VARIABLES=Income /STATISTICS=MEAN STDDEV VARIANCE RANGE MIN MAX SKEWNESS.
Resources: Processor Time 00:00:00.00, Elapsed Time 00:00:00.02

[DataSet0]

Descriptive Statistics
Statistic | Three-Year-Average Median Income (2008-2010)
N | 51
Range | $29,453
Minimum | $36,850
Maximum | $66,303
Mean | $50,734.18
Std. Deviation | $7,555.310
Variance | 57082705.308
Skewness | .389
Skewness Std. Error | .333
Valid N (listwise) | 51

EXAMINE VARIABLES=Income /PLOT BOXPLOT STEMLEAF /COMPARE GROUPS /PERCENTILES(5,10,25,50,75,90,95) HAVERAGE /STATISTICS DESCRIPTIVES EXTREME /CINTERVAL 95 /MISSING LISTWISE /NOTOTAL.

Explore

Notes
Output Created: 05-SEP-2012 16:32:55
Comments:
Input: Active Dataset DataSet0; Filter; Weight; Split File; N of Rows in Working Data File 51
Missing Value Handling: Definition of Missing: User-defined missing values for dependent variables are treated as missing. Cases Used: Statistics are based on cases with no missing values for any dependent variable or factor used.
Syntax: EXAMINE VARIABLES=Income /PLOT BOXPLOT STEMLEAF /COMPARE GROUPS /PERCENTILES(5,10,25,50,75,90,95) HAVERAGE /STATISTICS DESCRIPTIVES EXTREME /CINTERVAL 95 /MISSING LISTWISE ......
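Two of the reported figures can be checked from the others. The sketch below recomputes the range from the minimum and maximum and the standard error of skewness from N alone. It assumes the commonly used formula SE = sqrt(6n(n-1) / ((n-2)(n+1)(n+3))); the function name `skewness_std_error` is illustrative and is not part of SPSS.

```python
import math


def skewness_std_error(n: int) -> float:
    """Standard error of sample skewness for a sample of size n (assumed formula)."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    return math.sqrt(6.0 * n * (n - 1) / ((n - 2) * (n + 1) * (n + 3)))


n = 51                        # N from the output above
se_skew = skewness_std_error(n)
value_range = 66303 - 36850   # Maximum minus Minimum from the output above

print(f"skewness std. error: {se_skew:.3f}")
print(f"range: {value_range}")
```

Both values agree with the Std. Error and Range entries in the output when rounded the same way.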

Words: 519 - Pages: 3

Premium Essay

...To investigate whether the mean JSL differs between the branches of the company.

The data set used for the analysis:

Variable | How the variable is measured
Branch | Branches of the company: 1 = TESS-Nizhnevartovsk, TESS-Kogalym; 2 = TESS Head Office, TESS-Surgut; 3 = TESS-Tyumen, TESS-Khanty-Mansiysk
Number | Number of the respondent
Work_Exp | Work experience in JSC “TESS”: 1 = 2 years or less; 2 = more than 2 years
JSL | Job Satisfaction Level: ratings from 1 to 5, where 1 = very unsatisfied, 5 = very satisfied, and 0 = no answer/blank

1.2. Revised Data. Test for Normal Distribution

To proceed with the analysis it is necessary to determine whether the data are normally distributed. The histogram below and the Descriptive Statistics (Appendix 1, Table 1b) show that the distribution is leptokurtic (kurtosis 2.021) and negatively skewed (skewness -0.240). Several outliers with extreme values can be identified (Appendix 1, Table 1c, Table 1d). In cases #46 and #178, JSL is higher than the highest option provided in the questionnaire. This could be a data-entry mistake, or the respondent may have wanted to emphasise his/her satisfaction level. These cases were excluded. Cases with “0” responses are to...

Words: 2253 - Pages: 10

Premium Essay

...Question catalogue: Statistics
Self-Study Module
Master's programme Media and Communication Science

If you are a master's student in the programme “Media and Communication Science” and have to fulfill the additional requirement Self-Study Module Statistics, you have to answer this list of 42 questions. Please answer the following questions concerning statistical methods in social science briefly. Helpful information concerning the questions can be found in the Reader: “Statistics”. Enjoy yourself while answering the questions.

Chapter 1

1. A client rates her satisfaction with her vocational counselor on a 4-point scale from 1 = not at all satisfied to 4 = very satisfied. What is the (a) variable, (b) possible values, and (c) score?

2. Give the level of measurement for each of the following variables: (a) ethnic group to which a person belongs, (b) number of times an animal makes a wrong turn in a maze, and (c) position one finishes in a race.

3. Fifty students were asked how many hours they had studied this weekend. Here are their answers:
11, 2, 0, 13, 5, 7, 1, 8, 12, 11, 7, 8, 9, 10, 7, 4, 6, 10, 4, 7, 8, 6, 7, 10, 7, 3, 11, 18, 2, 9, 7, 3, 8, 7, 3, 13, 9, 8, 7, 7, 10, 4, 15, 3, 5, 6, 9, 7, 10, 6
Make (a) a frequency table and (b) a frequency polygon. (c) Make a grouped frequency table using intervals of 0-5, 6-10, 11-15, 16-20. Based on the grouped frequency table, (d) make a histogram and (e) describe the general shape of the distribution.

4. Below are the number......
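As an illustration of parts (a) and (c) of question 3, the following sketch tallies the fifty answers into a frequency table and a grouped frequency table. It is one possible way to do the counting, not a model answer from the Reader; the variable names are illustrative.

```python
from collections import Counter

# The fifty answers listed in question 3.
hours = [
    11, 2, 0, 13, 5, 7, 1, 8, 12, 11, 7, 8, 9, 10, 7, 4, 6, 10, 4, 7,
    8, 6, 7, 10, 7, 3, 11, 18, 2, 9, 7, 3, 8, 7, 3, 13, 9, 8, 7, 7,
    10, 4, 15, 3, 5, 6, 9, 7, 10, 6,
]

# (a) Frequency table: how often each value occurs.
freq = Counter(hours)

# (c) Grouped frequency table with the intervals given in the question.
intervals = [(0, 5), (6, 10), (11, 15), (16, 20)]
grouped = {
    f"{low}-{high}": sum(1 for h in hours if low <= h <= high)
    for low, high in intervals
}

for value in sorted(freq):
    print(f"{value:>2} hours: {freq[value]}")
for label, count in grouped.items():
    print(f"{label:>5}: {count}")
```

The grouped counts can then be drawn as the histogram asked for in part (d).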

Words: 3576 - Pages: 15

Premium Essay

...approximately equal to the variance of the population divided by each sample's size. This statistical theory is very useful when examining returns for a given stock or index because it simplifies many analysis procedures. An appropriate sample size depends on the data available, but generally speaking, a sample size of at least 50 observations is sufficient. Because financial data are relatively easy to generate, it is often easy to produce much larger sample sizes.

• Null Hypothesis: States the (numerical) assumption to be tested. For example: the average number of TV sets in U.S. homes is at least three (H0: μ ≥ 3).
  - Is always about a population parameter, not about a sample statistic: ✓ H0: μ ≥ 3, ✗ H0: x̄ ≥ 3.
  - Always begins with the assumption that the null hypothesis is true, similar to the notion of innocent until proven guilty.
  - Refers to the status quo.
  - Always contains an “=”, “≤” or “≥” sign.
  - May or may not be rejected.
• The Alternate Hypothesis: Is the opposite of the null hypothesis, e.g. the average number of TV sets in U.S. homes is less than 3 (HA: μ < 3). Challenges the status quo...
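The statement that the variance of the sample means is approximately the population variance divided by the sample size can be checked with a small simulation. This is a minimal sketch under stated assumptions: the population is Uniform(0, 1) (variance 1/12), a stand-in chosen for simplicity rather than real return data, and the sample size of 50 and the 2,000 repetitions are arbitrary illustrative choices.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n = 50                           # observations per sample
num_samples = 2000               # number of repeated samples
population_variance = 1.0 / 12   # variance of Uniform(0, 1)

# Draw many samples and record each sample's mean.
sample_means = []
for _ in range(num_samples):
    sample = [random.random() for _ in range(n)]
    sample_means.append(statistics.fmean(sample))

observed = statistics.variance(sample_means)   # variance of the sample means
expected = population_variance / n             # sigma^2 / n

print(f"observed variance of sample means: {observed:.6f}")
print(f"population variance / n:           {expected:.6f}")
```

The two printed values should be close; increasing `num_samples` tightens the agreement.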

Words: 1168 - Pages: 5

Premium Essay

...the following variables (all measured in billions USD) and estimate the corresponding model (Model 1): (Use α=0.05 for references)

Yt: Defense budget outlay for year t
X2t: GNP for year t
X3t: US military sales in year t
X4t: Aerospace industry sales in year t
D1t: Dummy variable representing a military conflict involving more than 100,000 troops; D1t = 1 if more than 100,000 troops are involved and 0 if fewer than 100,000 troops are involved.

Dependent Variable: Y
Method: Least Squares
Sample: 1962 1981
Included observations: 20

|Variable |Coefficient |Std. Error |t-Statistic |Prob. |
|C |21.40251 |1.496947 |14.29744 |0.0000 |
|D1 |-48.21987 |6.871544 |-7.017328 |0.0000 |
|X2 |0.013879 |0.003207 |4.328062 |0.0008 |
|X3 |0.073146 |0.203805 |0.358902 |0.7254 |
|X4 |1.389753 |0.130197 |10.67423 |0.0000 |
|X4*D1 |1.540792 |0.325005 |4.740818 |0.0004 |
|X2*D1 |0.022406 |0.005781 |3.876038 ......

Words: 636 - Pages: 3

Free Essay

...Statistical Information Paper

I will describe the use of statistics at the Veterans hospital in Loma Linda, which has 142 hospital beds and 108 Community Living Center beds and employs 2,436 staff. The VA hospital provided 546,017 outpatient visits in 2008. In 2010 it provided 584,028 outpatient visits, an increase of 38,011, or about 7% (a ratio of 1.07). Statistics are data used for comparison and analysis. Hospital statistics include current and historical data on utilization, revenue, expenses, personnel, and much more. I will describe numerical data, numerical counts, statistical analysis, and the four levels of measurement.

Numerical data. Bennett, Briggs, and Troika (2009). Numerical data are identified and measured on a numerical scale. Numerical data can be displayed using charts, tables, and graphs. For example, I work on a medical floor, which is a busy floor. The physician always orders many tests for a newly admitted patient, such as X-ray, EKG, CAT scan, GI lab, and so on. If a patient comes back from the GI lab, the nurse has to take vital signs every 15 minutes four times, every 30 minutes two times, and every hour one time. These vital signs are taken to compare how they differ between readings. If a vital sign drops too low or rises too high, that alerts the nurse to check the patient and report to the physician right away. The nurse has to record all of these vital signs in the computer, where they are shown in a line graph. The line graph is easy to...

Words: 813 - Pages: 4

Premium Essay

...1. Introduction

Poverty, measured as household income below the poverty line, is the dependent variable in this project. It is important to know which elements are associated with poverty. The purpose of this paper is to evaluate the key determinants of American household poverty in 1980. Four possible determinants are analyzed: the average family size (FAMSIZE), the percentage of people living in urban areas (URB), the level of unemployment among people over 16 years old (UR), and the median family income in US dollars (INCOME). Descriptive statistics, correlation and regression are used in this project.

2. Descriptive statistics

Variable | Mean | Median | Mode | VAR | STDEV
URB | 58.76034483 | 66.15 | 0 | 1012.828049 | 31.82495953
FAMSIZE | 3.140172414 | 3.135 | 2.93 | 0.033377163 | 0.182694178
UR | 9.293103448 | 8.95 | 5.8 | 10.92696915 | 3.30559664
INCOME | 19240.43103 | 18512 | N/A | 10889936.04 | 3299.990309
POV | 9.120689655 | 9.05 | 8.8 | 6.230792498 | 2.496155544

3. Correlation

Correlation and regression are techniques for investigating the statistical relationship between two, or more, variables (Barrow, 2013, p. 238).
* Correlation defines the degree to which there is a linear relationship between pairs of variables.
Firstly, it is useful to graph the variables to see if anything useful is revealed. In this case, XY graphs are the most suitable and they are shown in......
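The degree of linear relationship described above is usually summarised by the Pearson correlation coefficient. The sketch below computes it from scratch. The data values are made up for illustration only (the essay's data set is not shown here), and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values, NOT taken from the essay's data set.
unemployment = [5.8, 8.9, 9.5, 12.0, 14.1]
poverty = [7.0, 8.8, 9.1, 11.5, 12.9]

r = pearson_r(unemployment, poverty)
print(f"r = {r:.3f}")
```

A value of r near +1 or -1 indicates a strong linear relationship; a value near 0 indicates little linear relationship, which is why plotting the XY graph first is a sensible check.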

Words: 1666 - Pages: 7

Premium Essay

...Unit 1 - Fundamentals of Statistics
ReneeCarina Benavente
American InterContinental University
BUSN311-12005B-11

Abstract
In many organizations, surveys are done to determine the job satisfaction of their employees. Job satisfaction is important for these organizations, large or small, because it makes the aspects of the job easy for employees. The data within these surveys are analyzed to find the overall job satisfaction using qualitative and quantitative variables.

Introduction
A worldwide study of job satisfaction has been assembled by a large organization called American Intellectual Union (AIU). I have been chosen to be a part of this massive global undertaking. I will be analyzing the data from this study and the survey results using AIU’s data set.

Chosen Variables
In examining the data set and results of AIU’s employees, I chose to analyze the positions of the employees as my qualitative variable and the intrinsic job satisfaction as my quantitative variable. I chose to analyze these two specific variables because, for an hourly or salaried employee, internal job satisfaction is very important to know. It is best to understand job satisfaction by employee position within the organization to better the work environment.

Qualitative and Quantitative Variables
When using qualitative and quantitative variables, you have to know and understand the difference between the two kinds of variable, or the results will not add up. Quantitative data is data......

Words: 1010 - Pages: 5

Premium Essay

...Quality Associates: Case 1

Introduction: It is a case of a consulting firm that advises its clients on statistical procedures used to control the production process. In this case, Quality Associates has taken 4 random samples of size 30 each, i.e. 120 of the 800 given observations, to explain the quality control process.

Hypothesis: H0: µ = 12; Ha: µ ≠ 12; level of significance = 0.01
Z test: z =
Z values: Test statistic (z value) for all the samples
P value: P values (2*(1-z score)) for all the samples
Rejection of null hypothesis: Rejection rule for two-tailed test using p-value approach: reject H0 if p-value ≤ α
Standard Deviation: Computed standard deviation for each of the samples......
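One way to write out the calculation the slides describe is given below. It is a sketch under assumptions: the population standard deviation σ is treated as known (its value does not appear in the transcript), each sample has n = 30, and Φ denotes the standard normal cumulative distribution function.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}, \qquad \mu_0 = 12, \quad n = 30

p\text{-value} = 2\,\bigl(1 - \Phi(|z|)\bigr), \qquad
\text{reject } H_0 \text{ if } p\text{-value} \le \alpha = 0.01
```

Under this reading, the "(1-z score)" in the slide refers to one minus the cumulative probability at |z|, doubled because the test is two-tailed.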

Words: 332 - Pages: 2

Premium Essay

...Exercise: 11

1. What demographic variables were measured at least at the interval level of measurement?
Number of hours working per week and length of labor.

2. What statistics were used to describe the length of labor in this study? Were these appropriate?
Descriptive statistics. Yes, frequency (30) and mean (14.63) are used to describe the data.

3. What other statistic could have been used to describe the length of labor? Provide a rationale for your answer.
Length of labor was described for both the experimental and control groups using means (14.63) and standard deviations (7.78). The exact length of labor was obtained, providing ratio level data that are descriptively analyzed with means and standard deviations.

4. Were the distributions of scores similar for the experimental and control groups for the length of labor? Provide a rationale for your answer.
No, the distributions of scores were not similar for the two groups. The experimental group has slightly higher dispersion (n = 30, SD = 7.78) than the control group (n = 33, SD = 7.2). Standard deviation decreases with larger sample sizes.

5. Were the experimental and control groups similar in their type of feeding? Provide a rationale for your answer.
Yes. Bottle-feeding was the mode for the experimental (53.1%) and the control (50%) groups, since it was the most frequent type of feeding used by both groups.

6. What was the marital status mode for the subjects in the experimental and control groups? Provide both the......

Words: 792 - Pages: 4

Premium Essay

...Statistics: Q # 4

I used the Wages data set.

Hypothesis Test: Independent Groups (t-test, pooled variance)

Married Age | No Married Age |
42.31 | 32.61 | mean
11.84 | 11.61 | std. dev.
67 | 33 | n

98 | df
9.707 | difference (Married Age - No Married Age)
138.411 | pooled variance
11.765 | pooled std. dev.
2.502 | standard error of difference
0 | hypothesized difference
3.880 | t
.0002 | p-value (two-tailed)

The quantitative variable is Age in years. The qualitative variable is Married, which is split into two categories: 1 = yes, 0 = no. These are independent samples, because they are not the same people, and the alternative hypothesis is "not equal" (two-tailed).

H0: µM = µn/M
H1: µM ≠ µn/M
α = 0.05 (significance level)

There are 98 degrees of freedom. The critical t-values are -1.984 and 1.984, because the test is two-tailed with α = 0.05 (using the t-distribution table). The p-value is less than the significance level: 0.0002 < 0.05.

The decision rule is: reject the null hypothesis if the computed t is not between -1.984 and 1.984. Here t = 3.880, which is outside that interval, and the p-value = 0.0002 < 0.05. Therefore, reject the null hypothesis (H0) and accept the alternate hypothesis (H1).

Interpretation: there is a difference in the mean age of married people and unmarried people. It is reasonable to conclude that......
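The pooled-variance figures in the output can be reproduced from the summary statistics alone. The sketch below is a minimal illustration, with a hypothetical helper name `pooled_t`; because it starts from the rounded means and standard deviations shown above, its results differ from the printed output in the last decimal or two.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    t = (mean1 - mean2) / std_err
    return t, df, pooled_var, std_err


# Summary statistics from the output above (married vs. not married).
t, df, pooled_var, std_err = pooled_t(42.31, 11.84, 67, 32.61, 11.61, 33)

print(f"df = {df}")
print(f"pooled variance = {pooled_var:.3f}")
print(f"standard error of difference = {std_err:.3f}")
print(f"t = {t:.3f}")

# Decision rule from the text: reject H0 if |t| exceeds the critical value 1.984.
print("reject H0" if abs(t) > 1.984 else "fail to reject H0")
```

With these inputs the computed t is well outside the interval from -1.984 to 1.984, matching the decision reached in the text.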

Words: 356 - Pages: 2

Premium Essay

...Statistics
Name
Institution

Question 1 of 20 | 5.0 Points
When comparing two population means with an unknown standard deviation you use a t test and you use N-2 degrees of freedom.
A. True
B. False

Question 2 of 20 | 5.0 Points
Pretend you want to determine whether the mean weekly sales of soup are the same when the soup is the featured item and when it is a normal item on the menu. When it is the featured item the sample mean is 66 and the population standard deviation is 3 with a sample size of 23. When it is a normal item the sample mean is 53 with a population standard deviation of 4 and a sample size of 7. Given this information we could use a t test for two independent means.
A. True
B. False

Question 3 of 20 | 5.0 Points
The alternative hypothesis can be proven if the alternative hypothesis is rejected.
A. True
B. False

Question 4 of 20 | 5.0 Points
You want to determine if your widgets from machine 1 are the same as machine 2. Machine 1 has a sample mean of 50, a population standard deviation of 5, and a sample size of 100. Machine 2 has a sample mean of 52, a population standard deviation of 6, and a sample size of 36. With an alpha of .10, can we claim that there is a difference between the output of the two machines? Which of the following statements are true?
A. We will reject the null hypothesis and prove there is a difference between...

Words: 1999 - Pages: 8

Premium Essay

...Name
Instructor’s name
Course
Date

Statistics

1. a. P(red ∩ rugged) = P(red) × P(rugged) = 40/200 × 85/200 = 17/200
   b. P(standard) = 46/200
      P(not standard) = 1 − 46/200 = 77/100
      P(not standard) = P(deluxe ∪ rugged) = 69/200 + 85/200 = 77/100

2. P(A) = 0.3, P(S) = 0.39, P(M) = 0.63
   P(A ∩ S ∩ M) = 0.3 × 0.39 × 0.63 = 0.07371
   Assumption: the events are all independent of each other.

3. a. P(X = 7): 1 − (1/8)(7/8)^7 = 0.95
   b. P(X > 7): 1 − [(1/8)(7/8)^7 + (1/8)^2(7/8)^6] = 0.944

5. a. z = (x − µ)/σ, where the absolute value of z represents the distance between the raw score and the population mean in units of standard deviation.
   b. z = (42 − 37)/2 = 2.5, and P(Z < 2.5) = 0.9938, so P(Z > 2.5) = 0.0062. A baking time of 42 minutes is 2.5 standard deviations above the mean baking time of 37 minutes for a lemon drizzle cake made using this recipe.

6. a. σm = σ/√N = 3.5/√48 = 0.5052
   b. µ = 0.5052 × 48 = 24.2496 kg

7. a. Scientific hypothesis
   b. H0: the maximum weight that can be suspended using each adhesive is not different
      H1: the maximum weight that can be suspended using each adhesive is different
   c. S.E. = √(σ1²/n1 + σ2²/n2) = √(16.6²/38 + 19.2²/46) = 3.907
   d. z = (statistic − hypothesized mean)/estimated standard error, where the hypothesized mean is 0:
      z = (63.8 − 76.4 − 0)/3.907 = −3.23
      P(Z > −3.23) = 0.9994
   e. Because |z| = 3.23 is large (P(Z > −3.23) = 0.9994, so the two-tailed p-value is about 0.001), we reject the null hypothesis and conclude that the maximum weight that can be suspended using each adhesive is different.

8. | Regularly watch...
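The standard error and z statistic for the adhesive comparison in question 7 can be recomputed as follows. This is a sketch, not the original worked solution: it assumes the first adhesive has mean 63.8, SD 16.6, n = 38 and the second has mean 76.4, SD 19.2, n = 46 (the pairing is inferred from the order of the numbers in the text), and the helper names are illustrative. The same normal CDF helper also checks the z = 2.5 probability from question 5.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function, P(Z < z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """Standard error and z statistic for a difference of two means."""
    std_err = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    z = (mean1 - mean2 - 0.0) / std_err   # hypothesized difference is 0
    return std_err, z


std_err, z = two_sample_z(63.8, 16.6, 38, 76.4, 19.2, 46)
upper_tail = 1.0 - normal_cdf(z)                # P(Z > z)
p_two_tailed = 2.0 * (1.0 - normal_cdf(abs(z)))

print(f"standard error = {std_err:.3f}")
print(f"z = {z:.2f}")
print(f"P(Z > z) = {upper_tail:.4f}")
print(f"two-tailed p-value = {p_two_tailed:.4f}")

# Question 5(b): probability below z = 2.5.
print(f"P(Z < 2.5) = {normal_cdf(2.5):.4f}")
```

The two-tailed p-value is the quantity to compare with the chosen significance level when the alternative hypothesis is that the two adhesives differ.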

Words: 817 - Pages: 4

Premium Essay

...from Empowerment Intervention in the future.

5. Which group’s score had the least variability or dispersion? Provide a rationale for your answer.
The control group had the least amount of variability or dispersion. The control group had only one area of dispersion, self-care self-efficacy, at baseline and posttest.

6. Did the empowerment variable or self-care self-efficacy variable demonstrate the greatest amount of dispersion? Provide a rationale for your answer.
Self-care self-efficacy: SD baseline 14.02, posttest 12.24. Empowerment: SD baseline 9.02, posttest 8.91.

7. The mean is a measure of central tendency of a distribution, while the SD is a measure of the dispersion of its scores. Both X̄ and SD are descriptive statistics.

8. What was the mean severity for renal disease for the research subjects? What was the dispersion or variability of the renal disease severity scores? Did the severity score vary significantly between the control group and the experimental group? Is this important? Provide a rationale for your answer.
The mean severity was moderately severe (mean = 6.74, SD = 2.97, range 0-10). This study found that there were...

Words: 448 - Pages: 2

Premium Essay

...BUSINESS STATISTICS ASSIGNMENT

Project Title: Employee retention at D&Y consulting firm

Section E, Group 2:
Anshul Garg (11FN-015), Finance
Gokul Sudhakaran (11DM-039), Marketing
Kaviya A. (11DM-057), Marketing
Nikhil Gagrani (11DM-089), Marketing
Sheth Dharmil Nirupam (11DM-147), Marketing
Taru (11IB-061), International Business

Submission Date: 9th September, 2011

TABLE OF CONTENTS
1. Case
2. Objective of the problem
3. Methodology used
4. Analysis
5. Excel output
6. Conclusion
7. Managerial implications

CASE: EMPLOYEE RETENTION AT D&Y CONSULTING FIRM

Demand for systems analysts in the consulting industry is very strong. Graduates with experience in the consulting business and those who have extensive computer knowledge are getting great offers from consulting companies. Once these people are hired, they frequently switch from one company to another as competing companies lure them away with even better offers. One consulting company, D&Y, has collected data on a sample of systems analysts it hired with an undergraduate degree several years ago. Following are the variables in the attached Excel file:

StartSal: Employee's starting salary at D&Y.
OnRoadPct: Percentage of time the employee has spent on the road with clients.
StateU: Whether the employee graduated from the State University.
CISDegree: Whether the employee majored in computer......

Words: 1889 - Pages: 8