STAT 2000 Midterm Exam #1 Review Sheet

Descriptive Statistics

1. Data, categorical variable, quantitative variable, identifier
2. Population, sample, census, parameter, statistic
3. Sampling designs: simple random, stratified, and cluster sampling
4. Data presentation
   (a) Categorical variable: frequency tables, bar chart, pie chart
   (b) Quantitative variable: histogram, five-number summary, boxplots
5. Descriptive statistics: mean $\bar{x}$, median, mode, range, interquartile range (IQR), variance $s^2$, standard deviation $s$. They are used to describe the shape, center, and spread of the distribution.
6. Percentiles: 25th = $Q_1$, 50th = median, 75th = $Q_3$, etc.

Correlation and Linear Regression

1. Scatterplot, $y$ (dependent, response) variable, $x$ (independent, explanatory) variable
2. Correlation (linear association between 2 variables) and correlation coefficient $r = \frac{\sum (x - \bar{x})(y - \bar{y})}{(n - 1) s_x s_y}$: interpretations, properties, conditions, lurking variables, effect of outliers
3. Linear model: $\hat{y} = b_0 + b_1 x$, where $b_1 = r (s_y / s_x)$ and $b_0 = \bar{y} - b_1 \bar{x}$.
4. Interpretations of predicted value $\hat{y}$, slope $b_1$, and intercept $b_0$ in problem context.
5. $R^2 = r^2$: fraction of $y$'s variability accounted for by linear regression on $x$

Probability

1. Trial, sample space, sample points, events
2. Three types of probability: theoretical, empirical, and personal
3. Contingency table: joint probability and marginal probability
4. Notation: $A$, $A^c$, $A \cap B$, $A \cup B$, Venn diagram
5. Complement rule: $P(A^c) = 1 - P(A)$
6. Addition rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
7. Conditional probability: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
8. Multiplicative rule: $P(A \cap B) = P(A) P(B \mid A)$ or $P(B) P(A \mid B)$
9. Independence: $P(A \cap B) = P(A) P(B)$ or $P(A) = P(A \mid B)$ or $P(B) = P(B \mid A)$
10. Mutually exclusive (disjoint) events: $P(A \cap B) = 0$
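
As a quick check on the Probability rules listed above, here is a minimal sketch that works them through on a 2x2 contingency table. The counts are made up purely for illustration and do not come from the course; the names `counts`, `p_a`, `p_b`, and so on are likewise just labels chosen for this example.

```python
from fractions import Fraction

# Hypothetical 2x2 contingency table of counts (invented for illustration).
# First label: A or its complement Ac; second label: B or its complement Bc.
counts = {
    ("A", "B"): 20,
    ("A", "Bc"): 30,
    ("Ac", "B"): 10,
    ("Ac", "Bc"): 40,
}
total = sum(counts.values())

# Joint probability: one cell divided by the grand total.
p_a_and_b = Fraction(counts[("A", "B")], total)

# Marginal probabilities: a row or column total divided by the grand total.
p_a = Fraction(counts[("A", "B")] + counts[("A", "Bc")], total)
p_b = Fraction(counts[("A", "B")] + counts[("Ac", "B")], total)

# Complement rule: P(Ac) = 1 - P(A)
p_not_a = 1 - p_a

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_a_or_b = p_a + p_b - p_a_and_b

# Conditional probability: P(A | B) = P(A and B) / P(B)
p_a_given_b = p_a_and_b / p_b
p_b_given_a = p_a_and_b / p_a

# Multiplicative rule: both factorizations recover the joint probability.
assert p_a_and_b == p_a * p_b_given_a == p_b * p_a_given_b

# Independence holds only if P(A and B) = P(A) P(B).
independent = p_a_and_b == p_a * p_b

# Mutually exclusive (disjoint) events have P(A and B) = 0.
disjoint = p_a_and_b == 0

print("P(A) =", p_a, " P(B) =", p_b, " P(A and B) =", p_a_and_b)
print("P(Ac) =", p_not_a, " P(A or B) =", p_a_or_b)
print("P(A | B) =", p_a_given_b, " P(B | A) =", p_b_given_a)
print("independent:", independent, " disjoint:", disjoint)
```

Exact fractions are used instead of floats so that the equality checks for the multiplicative rule and for independence are not affected by rounding. With these particular counts, $P(A \mid B) \neq P(A)$, so the two events are neither independent nor disjoint.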

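The Correlation and Linear Regression formulas can be traced by hand on a tiny data set. The sketch below uses five invented $(x, y)$ pairs (not course data) and computes $r$, $b_1$, $b_0$, and $R^2$ exactly as the review items define them; the helper name `predict` is an assumption made for this example.

```python
from statistics import mean, stdev

# Hypothetical data, invented for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar, y_bar = mean(x), mean(y)
s_x, s_y = stdev(x), stdev(y)  # sample standard deviations (divide by n - 1)

# Correlation coefficient: r = sum((x - x_bar)(y - y_bar)) / ((n - 1) s_x s_y)
r = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / ((n - 1) * s_x * s_y)

# Slope and intercept of the linear model y_hat = b0 + b1 * x
b1 = r * s_y / s_x
b0 = y_bar - b1 * x_bar

# Fraction of y's variability accounted for by the regression on x
r_squared = r ** 2


def predict(x_new):
    """Predicted value y_hat for a given x under the fitted line."""
    return b0 + b1 * x_new


print(f"r = {r:.4f}, b1 = {b1:.4f}, b0 = {b0:.4f}, R^2 = {r_squared:.4f}")
print(f"predicted y at x = 6: {predict(6):.4f}")
```

In problem context, the slope $b_1$ is read as the predicted change in $y$ for a one-unit increase in $x$, and $b_0$ as the predicted $y$ when $x = 0$. Note that `predict(6)` extrapolates just beyond the observed $x$ values, which should be treated with caution.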

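For the Descriptive Statistics items, a short sketch on eleven invented values shows how the summaries fit together. One assumption to flag: textbooks differ on how quartiles are computed, and this example relies on the default ("exclusive") method of Python's `statistics.quantiles`, which may not match the convention used in the course for every data set.

```python
from statistics import mean, median, mode, quantiles, stdev, variance

# Hypothetical data, invented for illustration only.
data = [4, 7, 7, 8, 10, 11, 12, 14, 15, 18, 26]

x_bar = mean(data)
med = median(data)
most_common = mode(data)
data_range = max(data) - min(data)

# Quartiles via the default "exclusive" method (an assumed convention).
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1

# Sample variance s^2 and standard deviation s (both divide by n - 1).
s2 = variance(data)
s = stdev(data)

# Five-number summary: minimum, Q1, median, Q3, maximum.
five_number = (min(data), q1, med, q3, max(data))

print("mean:", x_bar, " median:", med, " mode:", most_common)
print("range:", data_range, " IQR:", iqr)
print("variance:", s2, " standard deviation:", round(s, 3))
print("five-number summary:", five_number)
```

Here the mean (12) sits slightly above the median (11), which is consistent with the single large value 26 pulling the distribution's right tail outward.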