
# Statistics

Submitted By blastumsun
Words 1067
Pages 5
Nyke- The Final Project
Yvonne Pearce
Thomas Edison State College
Principles of Statistics (2014Feb STA-201)
April 20, 2014

Thesis
People are broadly alike. We share similar body characteristics: most of us have two eyes, two hands and two feet, and most of us wear shoes. But producing shoes in many sizes is expensive and time-consuming. Nyke is proposing to cut its production costs by making only one universal shoe size for its customer base. We will look at a sample of data obtained by the company and determine whether this is, statistically, the best plan moving forward.
Data Observations
A statistical study is only as good as its raw data. Nyke has provided a sample of thirty-five customers' shoe sizes, heights and genders. This is a very small random sample relative to the volume of shoes Nyke already produces; however, we can still make some recommendations based on these numbers. We will assume that the numbers are correct and that the data points were picked at random.
Statistical Testing
There are many tests we can run on this sample of data to determine whether it is viable. First, we need to turn the data into numbers that we can use in our equations: the mean, median, mode, range of values, x, x² and the standard deviation of the data. Second, we will use the data to construct several graphs, such as a boxplot, histogram, stem-and-leaf plot and scatter plot, to determine whether and how the data are skewed and whether there are any outliers. If we identify any outliers, we will determine whether they are errors. Third, we will use a t-test, a chi-square test and a binomial approximation to test hypotheses about the data. Fourth, we will construct a linear regression and look for correlation. The results of these tests should give us an idea of whether a universal shoe size is the right plan for the Nyke company.
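As a rough illustration of the summary statistics named above, the sketch below computes them with Python's standard `statistics` module. The list `sizes` is a hypothetical sample invented for this example; it is not Nyke's data, which the report does not list point by point.

```python
import statistics

# Hypothetical shoe sizes, invented for illustration only (not Nyke's data).
sizes = [6.5, 7, 7, 7.5, 9, 10.5, 11, 12]

mean = statistics.mean(sizes)            # arithmetic average
median = statistics.median(sizes)        # middle value of the sorted data
modes = statistics.multimode(sizes)      # every most-frequent value, as a list
value_range = max(sizes) - min(sizes)    # largest minus smallest
sample_stdev = statistics.stdev(sizes)   # sample standard deviation (n - 1 denominator)

print(f"mean={mean}, median={median}, modes={modes}")
print(f"range={value_range}, sample stdev={sample_stdev:.4f}")
```

`multimode` is used rather than `mode` because a group can have several modes, as the women-only and men-only shoe sizes in this report do.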
Converting data into usable numbers
First I put the data into an Excel spreadsheet for ease of use. Then I broke the data up into the following categories: all shoe sizes, all heights, women-only shoe sizes, women-only heights, men-only shoe sizes, and men-only heights. Now we can calculate the mean, median, and mode of each category. We can also calculate the range, the standard deviation, the deviation from the mean, and the squared deviation from the mean. This allows us to plug these calculated numbers into various equations. One calculation I was now able to make was the interquartile range (IQR), which lets me see how the data are skewed and also gives me the lower and upper limits (LL and UL). These limits let me see whether there are any potential outliers. This data set had only one potential outlier: one data point of a women’s shoe size 10. Everything else fell between the limits.

| Shoe size - Whole Group | |
| --- | --- |
| Sample Mean | 9.14285 |
| Median | 9 |
| Mode | 7 |
| Range | 9 |
| Sample Stdev | 2.58266685 |
| Number of observations | 35 |

| Shoe size - Women Only | |
| --- | --- |
| Sample Mean | 7.11111 |
| Median | 7 |
| Mode | 6.5, 7, 7.5 |
| Range | 4 |
| Sample Stdev | 1.13183 |
| Number of observations | 18 |

Figure 1

| Shoe size - Men Only | |
| --- | --- |
| Sample Mean | 11.29411 |
| Median | 10.5 |
| Mode | 11, 12 |
| Range | 6.5 |
| Sample Stdev | 1.803285404 |
| Number of observations | 17 |

Figure 1 (continued)

5 Number Summary

| Data Points | Min # | Q1 | Q2 | Q3 | Max | IQR | LL | UL | Outliers? |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Shoe Size - All | 5 | 7 | 9 | 11 | 14 | 4 | 1 | 17 | No |
| Height - All | 60 | 66 | 70 | 72 | 77 | 6 | 57 | 81 | No |
| Shoe Size - Women Only | 5 | 6.5 | 7 | 7.5 | 10 | 1 | 5 | 9 | Yes (10) |
| Height - Women Only | 60 | 63.5 | 66.5 | 70 | 72 | 6.5 | 53.75 | 79.75 | No |
| Shoe Size - Men Only | 7 | 10.25 | 11 | 12.5 | 14 | 2.25 | 6.875 | 15.875 | No |
| Height - Men Only | 64 | 69 | 72 | 73 | 77 | 4 | 63 | 79 | No |
Figure 2
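The lower and upper limits in the five-number summary appear to follow the common 1.5 × IQR rule (LL = Q1 − 1.5·IQR, UL = Q3 + 1.5·IQR). The report never states the multiplier, so the 1.5 here is an assumption inferred from the tabulated values. A minimal sketch that recomputes the limits from the quartiles in the table and flags any group whose minimum or maximum falls outside them:

```python
# (min, Q1, Q3, max) copied from the five-number summary table.
summary = {
    "Shoe Size - All": (5, 7, 11, 14),
    "Height - All": (60, 66, 72, 77),
    "Shoe Size - Women Only": (5, 6.5, 7.5, 10),
    "Height - Women Only": (60, 63.5, 70, 72),
    "Shoe Size - Men Only": (7, 10.25, 12.5, 14),
    "Height - Men Only": (64, 69, 73, 77),
}


def fences(q1, q3, k=1.5):
    """Return (lower limit, upper limit); k = 1.5 is the assumed multiplier."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


results = {}
for name, (smallest, q1, q3, largest) in summary.items():
    lower, upper = fences(q1, q3)
    # A group has a potential outlier if its extreme values cross a limit.
    has_outlier = smallest < lower or largest > upper
    results[name] = (lower, upper, has_outlier)
    print(f"{name}: LL={lower}, UL={upper}, potential outlier={has_outlier}")
```

Under that assumption, only the women-only shoe sizes are flagged, because the maximum of 10 exceeds the upper limit of 9.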
Graphing the data
We now have a set of numbers associated with our data. We can use those, along with our data points, to draw graphs that help us understand the numbers better. I am going to make a box plot, a histogram, a stem plot and a scatter plot. All four of these graphs will help point out potential outliers and give us an approximate shape of the distribution of the data. The graphs of “all shoes” and “women only shoes” were both skewed slightly to the left. The graphs of “all heights” and “men only heights” were both skewed slightly to the right. The “women height” and “men shoes only” graphs were both bell-shaped.

Figure 3

Figure 4

Figure 5
Relative frequency
Next I looked at how frequently each shoe size occurred across all the data. The data show a cluster for female shoes around a mode of size 7, and a cluster for male shoes around size 11.5, between the modes of 11 and 12.
Conclusion
The data provided do not support the manufacture of just one shoe size. There is no common shoe size that would serve a sizeable portion of both male and female customers. However, there is sufficient data to show that manufacturing one shoe size for females and one shoe size for males could reduce production costs with the least effect on profitability. A female shoe cut loosely at size 7, so that it could also be worn by women who wear sizes 6.5 and 7.5, would meet the needs of approximately 66% of the female population based on the sample data. A male shoe cut loosely at size 11.5, so that it could be worn by men who wear sizes 11 and 12, would meet the needs of approximately 41% of the male population. Together, these two shoes would meet the needs of 54% of the customer base while reducing the number of shoe sizes manufactured by 87.5% (from 16 sizes down to 2 sizes). This figure would differ if Nyke currently manufactures other shoe sizes that are not listed in the provided data.
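The percentages in the conclusion can be cross-checked with a few lines of arithmetic. The sketch below assumes that the number of customers covered in each group can be recovered by rounding the stated share times the group size; the report gives only the percentages, so these head counts are back-calculated and not taken from the data.

```python
# Group sizes from the summary tables; coverage shares as stated in the conclusion.
n_women, n_men = 18, 17
share_women, share_men = 0.66, 0.41

# Assumption: recover whole-customer counts by rounding share * group size.
covered_women = round(share_women * n_women)
covered_men = round(share_men * n_men)

# Combined coverage across the whole sample of 35 customers.
combined_share = (covered_women + covered_men) / (n_women + n_men)

# Reduction in the number of sizes manufactured (16 sizes down to 2).
sizes_before, sizes_after = 16, 2
reduction = (sizes_before - sizes_after) / sizes_before

print(f"covered women: {covered_women} of {n_women}")
print(f"covered men: {covered_men} of {n_men}")
print(f"combined coverage: {combined_share:.1%}")
print(f"reduction in sizes: {reduction:.1%}")
```

Under that rounding assumption, the combined coverage and the size reduction agree with the 54% and 87.5% quoted above.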
