Introduction to Statistics
Statistical Problems
1. A pharmaceutical company wants to know whether a new drug is superior to existing drugs, and whether it has possible side effects.
2. How fuel-efficient is a certain car model?
3. Is there any relationship between your GPA and employment opportunities?
4. If you answer all questions on a true/false (or multiple-choice) examination completely at random, what are your chances of passing?
5. What is the effect of package design on sales?
6. …………………..

Questions
1. What is statistics?
2. Why do we study statistics?

Larson & Farber, Elementary Statistics: Picturing the World, 3e


STA 13- SYLLABUS
Instructor: MSc. Pham Thanh Hieu
Phone: mobile 0917.522.383, email: hieuphamthanh@gmail.com

Goals of the course
- To learn how to interpret statistical summaries appearing in journals, newspaper reports, the internet, television, and many real-world problems.
- To learn about the concepts of probability and probabilistic reasoning.
- To understand variability and sampling distributions.
- To learn how to interpret and analyze data arising in your own work (coursework and research).

STA 13- SYLLABUS
Grading:
- One midterm: 30% of the total; multiple-choice, closed-book exam; one sheet of handwritten notes (no larger than 9 ½ x 11, two-sided) is allowed.
- Final exam: 50%; comprehensive (multiple choice + short answer), closed-book exam; two sheets of handwritten notes (no larger than 9 ½ x 11, two-sided) are allowed.
- Homework: 20%. Submit homework in discussion sessions. Please print your name on your homework.

STA 13- SYLLABUS
Course requirements
- Attend all class meetings (lectures and discussion sessions): exams include material covered in class whether or not it is in the textbook.
- Do homework: some of the exam questions will be closely related to the homework problems.

Textbooks:
1. Introduction to Probability and Statistics
Authors: Mendenhall, Beaver and Beaver
2. A First Course in Statistics
Author: James T. McClave

SYLLABUS
Lecture 1: Chapter 1. Introduction to Statistics
Lecture 2: Chapter 2. Descriptive Statistics
Lecture 3: Discussion 1
Lecture 4, 5: Chapter 3+4. Probability
Lecture 6: Chapter 5. Normal Probability Distribution
Lecture 7: Discussion 2 + Review for Midterm Exam
Lecture 8: Midterm Exam
Lecture 9: Chapter 6. Confidence Interval
Lecture 10: Chapter 7. Hypothesis Testing
Lecture 11: Discussion 3 + Review for Final Exam
Lecture 12: Final Exam

Chapter 1

Introduction to Statistics
1.1. The science of statistics
1.2. Types of statistical application
1.3. Fundamental elements of statistics
1.4. Types of data
1.5. Methods of data collection

1.1. The Science of Statistics
Statistics is the science of data. It involves the collection, organization, analysis, and interpretation of data.
Data consists of information coming from observations, counts, measurements, or responses.

Statistics

Data

Information

Populations & Samples
A population is a set of units (people, objects, …) that we are interested in studying.
A sample consists only of observations/elements drawn from the population.

E.g. In a recent survey, 100 TUAF students were asked if they smoked cigarettes regularly. 35 of the students said yes. Identify the population and the sample.

Population: all students at TUAF
Sample: the 100 students in the survey
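
The survey result can be summarized by one number, the sample proportion. A minimal sketch in Python using only the counts given in the example (the variable names are illustrative):

```python
# Counts taken from the TUAF smoking survey example.
sample_size = 100   # students surveyed (the sample)
smokers = 35        # students who answered "yes"

# Sample proportion: the fraction of the sample with the characteristic.
sample_proportion = smokers / sample_size
print(f"Sample proportion of regular smokers: {sample_proportion:.2f}")
```

This number describes the sample only. How well it reflects all students at TUAF depends on how the 100 students were chosen.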

Studying the whole population is usually impractical. Only part of it can be examined, and this subset of the population is called the sample.
- The sample needs to be representative.
- Therefore "how to sample" is important.

1.2. Types of Statistical Applications
The study of statistics has two major branches: descriptive statistics and inferential statistics.
Statistics

Descriptive statistics involves numerical and graphical methods to organize, summarize, and display the data contained in a sample.
E.g. frequency histogram, sample mean, sample median, …

Inferential statistics involves using sample data to make estimates or predictions, or to draw conclusions, about a population.
E.g. estimating the population mean from the sample mean, …
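
The two branches can be contrasted in a few lines of Python. This is only a sketch: the height values are hypothetical, invented for illustration, and the "inference" shown is the simplest possible one (a point estimate).

```python
import statistics

# Hypothetical sample: heights (in metres) of 8 students. Illustrative values only.
sample_heights = [1.52, 1.60, 1.65, 1.68, 1.70, 1.71, 1.75, 1.80]

# Descriptive statistics: summarize the sample itself.
sample_mean = statistics.mean(sample_heights)
sample_median = statistics.median(sample_heights)

# Inferential statistics (simplest form): use the sample mean as a
# point estimate of the unknown population mean.
estimated_population_mean = sample_mean

print(f"Sample mean:   {sample_mean:.3f} m")
print(f"Sample median: {sample_median:.3f} m")
```

The estimate is only as good as the sample; later chapters (confidence intervals, hypothesis testing) attach a measure of reliability to it.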

E.g. Descriptive Statistics
In a survey concerning public education, 400 school administrators were asked to rate the quality of education in the United States. Their responses are summarized in Table 1.1. Construct a bar chart and a pie chart for this set of data.

Table 1.1


E.g. Inferential Statistics
Animal-Assisted Therapy: A team from the Medical Center and School of Nursing conducted a study to gauge whether animal-assisted therapy can improve the physiological responses of heart failure patients. Researchers studied 76 heart failure patients, randomly divided into 3 groups:
- Visited by a volunteer with a trained dog
- Visited by a volunteer only
- No visits

E.g. Inferential Statistics
Results: Those patients with animal-assisted therapy had significantly greater drops in levels of anxiety, stress and blood pressure.
Conclusion: The researchers concluded that "pet therapy" has the potential to be an effective treatment for patients hospitalized with heart failure.


1.3. Fundamental elements of statistics
1. Experimental unit
2. Measurement
3. Variable
4. Population
5. Sample
6. Measure of reliability


1.3. Fundamental elements of statistics
An experimental unit (also called a statistical unit, sampling unit, or observation unit) is an object (person, thing, …) about which we collect data.
- Any two experimental units must be capable of receiving different treatments.
- An experimental unit can be an individual object (person, animal, plant) or a group of objects (a cage of animals, a plot of land, …).

A measurement is the measured value of a variable on an experimental unit. A set of measurements is called data.

Practice: Experimental unit
- To study the relationship between the height and weight of students in a statistics class, 12 students are randomly selected and measured.
Experimental unit? An individual student
- Counting the number of apples on each tree in a garden.
Experimental unit? An individual tree
- Studying the effect of different doses of a drug on mice, where all mice in a cage are treated with the same dose.
Experimental unit? A cage of mice
- The same experiment as the last one, but with the treatments given to individual mice.
Experimental unit? An individual mouse

Variable
A variable is a characteristic or property of an individual population unit (a person, place, thing, or idea, …).

E.g. age, weight, height, gender, marital status, annual income, …

Two characteristics of a variable
(1) It is a characteristic/property of a person/object.
(2) The value of the variable can "vary" from one person/object to another.
Examples: Population size of Vietnam (changes over time)
Hair color (varies from person to person)
Blood pressure (varies over time and from person to person)

Variable
A variable has values:
Age: 10 years, 50 years
Gender: male, female
Height: 1.5 m, 1.8 m
Hair color: blond, black

Population
A population is a set of experimental units that we are interested in studying.
E.g.
1. All employed workers in Vietnam.
2. All registered voters in New York.
3. Everyone who is afflicted with AIDS.
4. All cans of milk produced in a year.
5. All accidents occurring on a particular highway during a holiday period.


Sample
A sample is a subset of the units of a population.


Measure of Reliability
A measure of reliability is a statement (usually quantitative) about the degree of uncertainty associated with the statistical inference.


1.4. Types of Data
Data (variables) are of two types:

Qualitative (categorical): the values are words.
E.g. eye color (black, brown, blue, hazel, …), gender (male, female), place of birth (California, …)

Quantitative (numerical): the values are numbers.
E.g. height, weight, length, …

Identify the following variables
- The color of fruit candies selected at random from a bag: categorical
- The TOEFL test scores of students at TUAF: numerical
- The place of birth of an individual student: categorical
- The birth weights of female babies born at a large hospital over the course of a year: numerical

Qualitative or Quantitative Variables
Height
- Qualitative: short, tall
- Quantitative: 1.5 m, 1.8 m

Types of Quantitative Variable
Quantitative variables are of two types:
- Continuous: any value
- Discrete: integers

Continuous or Discrete Variables
Weight
- Continuous: e.g. 45 kg, 65.2 kg
- Discrete: e.g. 40, 45, 80

Continuous or Discrete Variables
Population: continuous or discrete?

Summary
Variables
• Variables are attributes of things.

Qualitative and quantitative
• Qualitative variables use words.
• Quantitative variables use numbers.

Continuous and discrete
• Continuous variables can take any value between the minimum and the maximum.
• Discrete variables take only separated values, leaving gaps between the minimum and the maximum.

Qualitative and Quantitative
Example:
The grade point averages of five students are listed in the table. Which data are qualitative data and which are quantitative data?

Student   GPA
Sally     3.22
Bob       3.98
Cindy     2.75
Mark      2.24
Kathy     3.84

Student: qualitative variable, qualitative data
GPA: quantitative variable, quantitative data

How Many Variables in Your Study?
Univariate data: data that describe a single characteristic of the population (one variable is measured on a single experimental unit).
- Example: conducting a survey to estimate the weight of high school students involves only one variable (weight), so the data are univariate.
Bivariate data: data that describe two characteristics of the population (two variables are measured on a single experimental unit).
- Example: conducting a study of the relationship between the height and weight of high school students involves two variables (height and weight), so the data are bivariate.
Multivariate data: data that describe more than two characteristics, e.g. height, weight, hair color, … (more than two variables are measured on a single experimental unit).

Summarize Data in a Statistical Table

- What values of the variable have been observed in your data?
- How often has each value occurred?
"How often" can be answered in terms of:
Frequency = exact number of occurrences
Relative frequency = Frequency / Total number of measurements
Percent = Relative frequency × 100%
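
The three quantities above can be computed directly. A small Python sketch follows; the list of ratings is hypothetical data made up for illustration, and `frequency_table` is just an illustrative helper name.

```python
from collections import Counter

# Hypothetical ratings given by 10 respondents (illustrative data only).
ratings = ["A", "B", "B", "C", "A", "B", "D", "C", "B", "A"]

def frequency_table(values):
    """Return rows of (value, frequency, relative frequency, percent)."""
    counts = Counter(values)
    total = len(values)
    rows = []
    for value in sorted(counts):
        frequency = counts[value]
        relative = frequency / total            # Frequency / Total
        rows.append((value, frequency, relative, relative * 100))
    return rows

table = frequency_table(ratings)
for value, frequency, relative, percent in table:
    print(f"{value}  {frequency:>3}  {relative:.2f}  {percent:.0f}%")
```

The frequencies always add up to the total number of measurements, and the relative frequencies add up to 1.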

Parameters & Statistics
A parameter is a numerical description of a population characteristic. A statistic is a numerical description of a sample characteristic.

                      Population     Sample
                      (parameters)   (statistics)
Mean                  μ              x̄
Number of values      N              n
Standard deviation    σ              s
Probability           P              p
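
To make the notation concrete, the sketch below computes both sets of quantities in Python. The ten income values are a hypothetical toy population, not data from the course; in practice the population is rarely fully known, which is why statistics are used to estimate parameters.

```python
import random
import statistics

# Hypothetical population: weekly incomes (in dollars) of N = 10 students.
population = [310, 325, 340, 355, 400, 405, 410, 420, 450, 485]

# Parameters describe the whole population.
N = len(population)
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation

# Statistics describe a sample drawn from the population.
rng = random.Random(1)                  # fixed seed, for repeatability only
sample = rng.sample(population, 4)
n = len(sample)
x_bar = statistics.mean(sample)         # sample mean
s = statistics.stdev(sample)            # sample standard deviation (n - 1 divisor)

print(f"Parameters: N={N}, mu={mu}, sigma={sigma:.2f}")
print(f"Statistics: n={n}, x_bar={x_bar}, s={s:.2f}")
```

Note that `statistics.pstdev` divides by N while `statistics.stdev` divides by n - 1, matching the usual population and sample definitions.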

Parameters & Statistics
Example:
Decide whether the numerical value describes a population parameter or a sample statistic.
a.) A recent survey of a sample of 450 college students reported that the average weekly income for students is $325.
Because the average of $325 is based on a sample, this is a sample statistic.
b.) The average weekly income for all students is $405.
Because the average of $405 is based on a population, this is a population parameter.

Descriptive and Inferential Statistics
Example:
In a recent study, volunteers who had less than 6 hours of sleep were four times more likely to answer incorrectly on a science test than were participants who had at least 8 hours of sleep. Decide which part is the descriptive statistic and what conclusion might be drawn using inferential statistics.
The statement “four times more likely to answer incorrectly” is a descriptive statistic.
An inference drawn from the sample is that all individuals who sleep less than 6 hours are more likely to answer science questions incorrectly than individuals who sleep at least 8 hours.

1.5. Methods of Data Collection
Sources of data
1. From a published source
2. From an observational study
3. From a survey
4. From a designed experiment

Data collection
1. Published sources: books, journals, newspapers, …
2. Observational study: the researcher observes the experimental units in their natural setting and records the variable(s) of interest. (The researcher makes no attempt to control any aspect of the experimental units.)
Example: a doctor may observe and measure the weights of newborn male and female babies.

3. Survey: the researcher samples a group of people, asks one or more questions, and records the responses.
Example: a survey on household income in Thai Nguyen province.

4. Designed experiment: the researcher exerts strict control (treatment) over the units.
E.g. studying the effect of fertilizer doses on the growth of corn.

Sampling method
A sampling method is a procedure for selecting sample elements from a population.
A representative sample exhibits characteristics typical of those possessed by the target population.

Requirements of a sample:

1) It must be representative.
2) It must be a random sample.

Random Samples
A random sample of n experimental units is a sample selected from the population in such a way that every different sample of size n has an equal chance of selection.
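
A simple random sample can be drawn with Python's standard library. This is a sketch only: the population of 20 numbered units is hypothetical, and the seed is fixed purely so the run is repeatable.

```python
import random

# Hypothetical population: ID numbers of 20 experimental units.
population = list(range(1, 21))

rng = random.Random(42)  # seed fixed only for repeatability

# random.sample draws without replacement, so every subset of size n
# has the same chance of being selected.
simple_random_sample = rng.sample(population, 5)
print(simple_random_sample)
```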

Stratified Samples
A stratified sample has members from each segment of a population.
This ensures that each segment from the population is represented.

E.g. segments (strata) of a student population: first-year students, second-year students, third-year students, fourth-year students.
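
One way to sketch stratified sampling in Python is to draw a simple random sample inside every stratum. The student IDs, stratum sizes, and the helper name `stratified_sample` are assumptions made for illustration; taking the same number from each stratum is one possible choice, not the only one.

```python
import random

# Hypothetical strata: student IDs grouped by year of study.
strata = {
    "first year": [f"Y1-{i}" for i in range(1, 41)],
    "second year": [f"Y2-{i}" for i in range(1, 36)],
    "third year": [f"Y3-{i}" for i in range(1, 31)],
    "fourth year": [f"Y4-{i}" for i in range(1, 26)],
}

def stratified_sample(strata, per_stratum, rng):
    """Draw a simple random sample of fixed size from every stratum."""
    return {name: rng.sample(members, per_stratum)
            for name, members in strata.items()}

sample = stratified_sample(strata, per_stratum=3, rng=random.Random(7))
for name, members in sample.items():
    print(name, members)
```

Every stratum appears in the result, which is what guarantees that each segment of the population is represented.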

Cluster Samples
A cluster sample has all members from randomly selected segments of a population. This is used when the population falls into naturally occurring subgroups.

E.g. Hanoi city is divided into city blocks. Some blocks are selected at random, and all members in each selected block are used.
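
Cluster sampling can be sketched in the same style: choose whole clusters at random, then keep every member of the chosen clusters. The block and household labels below are hypothetical, and `cluster_sample` is an illustrative helper name.

```python
import random

# Hypothetical clusters: 8 city blocks, each containing 5 households.
clusters = {
    f"block {b}": [f"B{b}-H{h}" for h in range(1, 6)]
    for b in range(1, 9)
}

def cluster_sample(clusters, n_clusters, rng):
    """Randomly select whole clusters and keep ALL of their members."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

sample = cluster_sample(clusters, n_clusters=2, rng=random.Random(11))
for name, members in sample.items():
    print(name, members)
```

Compare this with stratified sampling: there, every group contributes some members; here, only some groups are used, but each selected group is used completely.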

Systematic Samples
A systematic sample is a sample in which each member of the population is assigned a number. A starting number is randomly selected and sample members are selected at regular intervals.

Every fourth member is chosen.
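
A systematic sample with interval k can be sketched as follows. The numbered population of 20 members is hypothetical, and `systematic_sample` is an illustrative helper name; k = 4 reproduces the "every fourth member" case.

```python
import random

def systematic_sample(population, k, rng):
    """Pick a random start among the first k members, then every k-th member."""
    start = rng.randrange(k)      # random starting index: 0, 1, ..., k - 1
    return population[start::k]

# Hypothetical population: members numbered 1 to 20.
population = list(range(1, 21))
sample = systematic_sample(population, k=4, rng=random.Random(3))
print(sample)
```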

Convenience Samples
A convenience sample consists only of available members of the population.
Example:
You are doing a study to determine the number of years of education each teacher at your college has. Identify the sampling technique used if you select the samples listed.
1.) You randomly select two different departments and survey each teacher in those departments.
2.) You select only the teachers you currently have this semester.
3.) You divide the teachers up according to their department and then choose and survey some teachers in each department.

Identifying the Sampling Technique
Example continued:
1.) This is a cluster sample because each department is a naturally occurring subdivision.
2.) This is a convenience sample because you are using the teachers that are readily available to you.
3.) This is a stratified sample because the teachers are divided by department and some from each department are randomly selected.

Example
Example 1: Trout
Study goal: investigate the weight of trout in Lake Tahoe; 100 trout are captured and measured.
Variable: weight (varies from fish to fish)
Population: all trout in Lake Tahoe

Sample: the 100 captured trout
An experimental unit: one trout from the catch
A measurement: the weight of that trout
Data: weights of all 100 captured trout
Representative? Depends on how these 100 trout were captured.

Objectives
1. Variable

2. Population and sample
3. Central tendency
4. Variability
5. Position

Designing a Statistical Study
GUIDELINES
1. Identify the variable(s) of interest (the focus) and the population of the study.
2. Develop a detailed plan for collecting data. If you use a sample, make sure the sample is representative of the population.
3. Collect the data.
4. Describe the data.
5. Interpret the data and make decisions about the population using inferential statistics.
6. Identify any possible errors.

