
# Statistics Hw


Submitted By scot4999
11.

| Stem | Leaves |
| --- | --- |
| 6l | 034 |
| 6h | 667899 |
| 7l | 00122244 |
| 7h | |
| 8l | 001111122344 |
| 8h | 5557899 |
| 9l | 03 |
| 9h | 58 |

Stem = tens, leaf = ones.

This display brings out the gap in the data: there are no scores in the high 70s.
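As a rough illustration, a split-stem display like the one in exercise 11 can be rebuilt in a few lines of Python. The `scores` list below is read back off the display itself (the raw data are not reproduced in this solution), and the function name `split_stem_and_leaf` is purely illustrative.

```python
from collections import defaultdict

# Scores read back off the display (stem = tens, leaf = ones).
scores = [
    60, 63, 64, 66, 66, 67, 68, 69, 69,
    70, 70, 71, 72, 72, 72, 74, 74,
    80, 80, 81, 81, 81, 81, 81, 82, 82, 83, 84, 84,
    85, 85, 85, 87, 88, 89, 89,
    90, 93, 95, 98,
]


def split_stem_and_leaf(values):
    """Return {row label: leaves}; 'l' rows hold leaves 0-4, 'h' rows hold 5-9."""
    rows = defaultdict(str)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        rows[f"{stem}{'l' if leaf < 5 else 'h'}"] += str(leaf)
    lo, hi = min(values) // 10, max(values) // 10
    # List every row between the extremes so that empty rows (gaps) stay visible.
    return {
        f"{s}{half}": rows.get(f"{s}{half}", "")
        for s in range(lo, hi + 1)
        for half in "lh"
    }


display = split_stem_and_leaf(scores)
for label, leaves in display.items():
    print(f"{label} | {leaves}")
```

Listing the empty `7h` row explicitly is what makes the gap in the high 70s stand out.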
13.

a.

| Stem | Leaves |
| --- | --- |
| 12 | 2 |
| 12 | 445 |
| 12 | 6667777 |
| 12 | 889999 |
| 13 | 00011111111 |
| 13 | 2222222222333333333333333 |
| 13 | 44444444444444444455555555555555555555 |
| 13 | 6666666666667777777777 |
| 13 | 888888888888999999 |
| 14 | 0000001111 |
| 14 | 2333333 |
| 14 | 444 |
| 14 | 77 |

Stem = tens, leaf = ones.

The observations are highly concentrated at 134 – 135, where the display suggests the typical value falls.

b.

The histogram is symmetric and unimodal, with the point of symmetry at approximately 135.
15.

| Crunchy | Stem | Creamy |
| ---: | :---: | :--- |
| | 2 | 2 |
| 644 | 3 | 69 |
| 77220 | 4 | 145 |
| 6320 | 5 | 3666 |
| 222 | 6 | 258 |
| 55 | 7 | |
| 0 | 8 | |

Both sets of scores are reasonably spread out. There appear to be no outliers. The three highest scores are for the crunchy peanut butter, the three lowest for the creamy peanut butter.

17. a.

| Number nonconforming | Frequency | Relative frequency (freq/60) |
| --- | --- | --- |
| 0 | 7 | 0.117 |
| 1 | 12 | 0.200 |
| 2 | 13 | 0.217 |
| 3 | 14 | 0.233 |
| 4 | 6 | 0.100 |
| 5 | 3 | 0.050 |
| 6 | 3 | 0.050 |
| 7 | 1 | 0.017 |
| 8 | 1 | 0.017 |

The relative frequencies sum to 1.001 rather than exactly 1 because they have been rounded.

b. The number of batches with at most 5 nonconforming items is 7+12+13+14+6+3 = 55, which is a proportion of 55/60 = .917. The proportion of batches with (strictly) fewer than 5 nonconforming items is 52/60 = .867. Notice that these proportions could also have been computed by using the relative frequencies: e.g., proportion of batches with 5 or fewer nonconforming items = 1 - (.05+.017+.017) = .916; proportion of batches with fewer than 5 nonconforming items = 1 - (.05+.05+.017+.017) = .866.
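The arithmetic in parts (a) and (b) can be checked with a short Python sketch. The frequencies are taken from the table in part (a); the variable names are illustrative only.

```python
# Frequency of batches by number of nonconforming items (from the table in part a).
freq = {0: 7, 1: 12, 2: 13, 3: 14, 4: 6, 5: 3, 6: 3, 7: 1, 8: 1}
n = sum(freq.values())  # total number of batches

# Relative frequencies rounded to three decimals, as in the table.
rel = {k: round(count / n, 3) for k, count in freq.items()}
rounded_total = round(sum(rel.values()), 3)  # rounding error pushes this past 1

# Proportions computed directly from the counts.
at_most_5 = sum(count for k, count in freq.items() if k <= 5) / n
fewer_than_5 = sum(count for k, count in freq.items() if k < 5) / n

print(n, rounded_total)
print(round(at_most_5, 3), round(fewer_than_5, 3))
```

Working from the counts gives .917 and .867, while working from the rounded relative frequencies gives .916 and .866; the small discrepancy is purely a rounding effect.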

c. The following is a Minitab histogram of this data. The center of the histogram is somewhere around 2 or 3 and it shows that there is some positive skewness in the data. Using the rule of thumb in Exercise 1, the histogram also shows that there is a lot of spread/variation in this data.

19. a. From this frequency distribution, the proportion of wafers that contained at least one particle is (100 - 1)/100 = .99, or 99%. Note that it is much easier to subtract 1 (the number of wafers that contain 0 particles) from 100 than it would be to add all the frequencies for 1, 2, 3, … particles. In a similar fashion, the proportion containing at least 5 particles is (100 - 1 - 2 - 3 - 12 - 11)/100 = 71/100 = .71, or 71%.

b. The proportion containing between 5 and 10 particles is (15+18+10+12+4+5)/100 = 64/100 = .64, or 64%. The proportion that contain strictly between 5 and 10 (meaning strictly more than 5 and strictly less than 10) is (18+10+12+4)/100 = 44/100 = .44, or 44%.
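The complement trick from part (a) and the two readings of "between" in part (b) can be sketched in Python as follows. Only the frequencies that appear in the calculations above (0 through 10 particles) are used, together with the stated total of 100 wafers; the names are illustrative.

```python
total = 100  # number of wafers, as stated in part (a)

# Frequencies for 0..10 particles, read off the sums in parts (a) and (b).
freq = {0: 1, 1: 2, 2: 3, 3: 12, 4: 11, 5: 15, 6: 18, 7: 10, 8: 12, 9: 4, 10: 5}

# Complement rule: subtract the small counts instead of adding the large ones.
at_least_1 = (total - freq[0]) / total
at_least_5 = (total - sum(freq[k] for k in range(5))) / total

# "Between 5 and 10" inclusive versus strictly between.
between_inclusive = sum(freq[k] for k in range(5, 11)) / total
between_strict = sum(freq[k] for k in range(6, 10)) / total

print(at_least_1, at_least_5, between_inclusive, between_strict)
```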

c. The following histogram was constructed using Minitab. The data was entered using the same technique mentioned in the answer to exercise 8(a). The histogram is almost symmetric and unimodal; however, it has a few relative maxima (i.e., modes) and has a very slight positive skew.

21. a. A histogram of the y data appears below. From this histogram, the proportion of subdivisions having no cul-de-sacs (i.e., y = 0) is 17/47 = .362, or 36.2%. The proportion having at least one cul-de-sac (y ≥ 1) is (47-17)/47 = 30/47 = .638, or 63.8%. Note that subtracting the number of subdivisions with y = 0 from the total, 47, is an easy way to find the number of subdivisions with y ≥ 1.

b. A histogram of the z data appears below. From this histogram, the proportion of subdivisions with at most 5 intersections (i.e., z ≤ 5) is 42/47 = .894, or 89.4%. The proportion having fewer than 5 intersections (z < 5) is 39/47 = .830, or 83.0%.

29.

| Complaint | Frequency | Relative Frequency |
| --- | --- | --- |
| B | 7 | 0.1167 |
| C | 3 | 0.0500 |
| F | 9 | 0.1500 |
| J | 10 | 0.1667 |
| M | 4 | 0.0667 |
| N | 6 | 0.1000 |
| O | 21 | 0.3500 |
| Total | 60 | 1.0000 |

37. The median or the trimmed mean would be good choices because of the outlier 21.9.

38. a. The reported values are (in increasing order) 110, 115, 120, 120, 125, 130, 130, 135, and 140. Thus the median of the reported values is 125.

b. 127.6 is reported as 130, so the median is now 130, a very substantial change. When there is rounding or grouping, the median can be highly sensitive to small changes in the data.
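A quick Python check of this sensitivity is sketched below. It assumes, as the answer to part (b) implies, that the observation affected is the one originally reported as 125; that inference and the variable names are not stated in the solution itself.

```python
from statistics import median

# Reported (rounded) values from part (a).
reported = [110, 115, 120, 120, 125, 130, 130, 135, 140]

# Assumption: the value now reported as 130 is the one previously reported as 125.
changed = [130 if v == 125 else v for v in reported]

median_before = median(reported)
median_after = median(changed)
print(median_before, median_after)
```

A change of a fraction of a unit in one raw observation moves the reported median by a full rounding step of 5.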

39

a. so ;

b. 1.394 can be decreased until it reaches 1.011 (the larger of the two middle values), i.e., by 1.394 - 1.011 = .383. If it is decreased by more than .383, the median will change.

45. a. $\bar{x} = \frac{1}{n}\sum x_i = 577.9/5 = 115.58$. Deviations from the mean: 116.4 - 115.58 = .82, 115.9 - 115.58 = .32, 114.6 - 115.58 = -.98, 115.2 - 115.58 = -.38, and 115.8 - 115.58 = .22.

b. $s^2 = [(.82)^2 + (.32)^2 + (-.98)^2 + (-.38)^2 + (.22)^2]/(5-1) = 1.928/4 = .482$, so $s = .694$.

c. $\sum x_i^2 = 66{,}795.61$, so $s^2 = \left[\sum x_i^2 - \left(\sum x_i\right)^2/n\right]/(n-1) = [66{,}795.61 - (577.9)^2/5]/4 = 1.928/4 = .482$.

d. Subtracting 100 from all values gives a mean of 15.58; all deviations are the same as in part b, and the transformed variance is identical to that of part b.
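The four parts of exercise 45 can be verified numerically with a short Python sketch. The five observations are recovered from the deviations listed in part (a); the variable names are illustrative.

```python
from math import sqrt

# Observations recovered from the deviations listed in part (a).
x = [116.4, 115.9, 114.6, 115.2, 115.8]
n = len(x)

# (a) Sample mean and deviations from the mean.
mean = sum(x) / n
deviations = [xi - mean for xi in x]

# (b) Sample variance from the definition, and the standard deviation.
s2_definition = sum(d * d for d in deviations) / (n - 1)
s = sqrt(s2_definition)

# (c) Sample variance from the computational (shortcut) formula.
s2_shortcut = (sum(xi * xi for xi in x) - sum(x) ** 2 / n) / (n - 1)

# (d) Subtracting a constant shifts the mean but leaves the variance unchanged.
y = [xi - 100 for xi in x]
mean_y = sum(y) / n
s2_shifted = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)

print(round(mean, 2), round(s2_definition, 3), round(s, 3))
print(round(s2_shortcut, 3), round(mean_y, 2), round(s2_shifted, 3))
```

The shortcut formula subtracts two large, nearly equal numbers, so in floating-point work the definitional form (or the shifted data from part d) tends to be the numerically safer choice.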

57. a. 1.5(IQR) = 1.5(216.8-196.0) = 31.2 and 3(IQR) = 3(216.8-196.0) = 62.4. Mild outliers: observations below 196-31.2 = 164.8 or above 216.8+31.2 = 248. Extreme outliers: observations below 196-62.4 = 133.6 or above 216.8+62.4 = 279.2. Of the observations given, 125.8 is an extreme outlier and 250.2 is a mild outlier.

b. A boxplot of this data appears below. There is a bit of positive skew to the data but, except for the two outliers identified in part (a), the variation in the data is relatively small.
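The outlier rule used in part (a) can be written as a small Python function. The quartile values 196.0 and 216.8 come from the solution; the function name `classify` and the sample value 200.0 are only for illustration.

```python
# Quartiles quoted in part (a).
lower_q, upper_q = 196.0, 216.8
iqr = upper_q - lower_q

# Fences: 1.5 IQR beyond the quartiles for mild outliers, 3 IQR for extreme ones.
mild_low, mild_high = lower_q - 1.5 * iqr, upper_q + 1.5 * iqr
extreme_low, extreme_high = lower_q - 3 * iqr, upper_q + 3 * iqr


def classify(value):
    """Label a value as 'extreme', 'mild', or 'not an outlier'."""
    if value < extreme_low or value > extreme_high:
        return "extreme"
    if value < mild_low or value > mild_high:
        return "mild"
    return "not an outlier"


print(round(mild_low, 1), round(mild_high, 1))
print(round(extreme_low, 1), round(extreme_high, 1))
for value in (125.8, 250.2, 200.0):
    print(value, classify(value))
```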
