Chapter 1: Populations, Samples and Processes
Applied Probability and Statistics for Engineering and Science
Outline of Chapter 1
1.1 Populations, Samples and Variables
1.2 Visual Displays for Univariate Data
1.3 Describing Distributions
1.4 The Normal Distribution
1.5 Other Continuous Distributions
1.6 Several Useful Discrete Distributions
Introduction
Statistical theory and techniques are powerful, indispensable tools for understanding the world around us.
They help us make intelligent judgments and decisions in the presence of uncertainty and variation.
Without uncertainty or variation, there would be little need for statistical techniques or statisticians.
Populations
Engineers and scientists are constantly exposed to collections of facts (data) in their work.
A population is a well-defined collection of objects.
Examples:
Students in Class ECE08
People in Vietnam
...
Sample
When desired information is available for all objects in the
population, we have what is called a census.
Practical constraints (e.g., money, time and other limited
resources) usually make a census impractical or infeasible.
Sample: a (random) subset of the population.
For instance, we might select a sample of last year’s
engineering graduates to obtain feedback about the
quality of the engineering curricula.
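The idea of drawing a random subset can be sketched in a few lines of Python. This is only an illustration: the population of 500 graduate ID numbers, the sample size of 25, and the seed are all assumptions made for the example, not values from the course.

```python
import random

# Hypothetical population: ID numbers of 500 engineering graduates.
population = list(range(1, 501))

# Fixed seed so that the sketch gives the same sample on every run.
rng = random.Random(42)

# Simple random sample: 25 distinct graduates, chosen without replacement.
sample = rng.sample(population, k=25)

print(sorted(sample))
```

Because `sample` draws without replacement, no graduate can appear twice, which matches the idea of a sample as a subset of the population.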
Sample: variable
Variable: any characteristic whose value may change from one object to another in the population. Examples:
X = gender of a graduating engineer, Y = age of a graduating engineer, Z = temperature at a given time of day.
Univariate data: observations are made on a single variable.
Bivariate data: observations are made on each of two variables.
Multivariate data: observations are made on more than two variables.
Branches of statistics
Descriptive Statistics: methods to summarize and describe important features of the data. Examples:
Graphical: constructing histograms, stem-and-leaf displays, and dotplots
Numerical: calculating summary measures such as means, variances, correlations, ...
Inferential Statistics: techniques for generalizing from a sample to a population
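As a small illustration of the numerical side of descriptive statistics, the sketch below computes a mean, a sample variance, and a sample correlation coefficient. The paired data `x` and `y` are invented for the example, and the correlation is computed directly from its definition rather than through any particular library routine.

```python
import math
import statistics

# Hypothetical paired observations (values invented for illustration).
x = [22, 23, 23, 24, 25, 27]
y = [3.1, 3.4, 3.3, 3.9, 4.2, 4.8]

mean_x = statistics.mean(x)
var_x = statistics.variance(x)   # sample variance (divisor n - 1)
mean_y = statistics.mean(y)
var_y = statistics.variance(y)

# Sample correlation coefficient, computed from its definition.
sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
sxx = sum((a - mean_x) ** 2 for a in x)
syy = sum((b - mean_y) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)

print(mean_x, var_x, round(r, 3))
```

These numbers only describe the six observations at hand; saying anything about a larger population from them is the job of inferential statistics.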
Stem-and-leaf displays
Stem-and-leaf display: an effective way to organize numerical data by splitting each observation into two parts:
Stem: one or more leading digits
Leaf: the remaining digits
The display can provide the following information:
Identification of a typical or representative value
Extent of spread about the typical value
Presence of any gaps in the data
Extent of symmetry in the distribution of values
Number and location of peaks
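The construction can be sketched as follows. This is a minimal illustration, assuming non-negative integer data, with the tens digits taken as the stem and the units digit as the leaf; the function name `stem_and_leaf` and the data values are hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Return display lines with the tens digits as stem and the units digit as leaf."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        groups[stem].append(leaf)
    lines = []
    # Empty stems are listed as well, so that gaps in the data stay visible.
    for stem in range(min(groups), max(groups) + 1):
        leaves = "".join(str(leaf) for leaf in groups.get(stem, []))
        lines.append(f"{stem:>2} | {leaves}")
    return lines

# Hypothetical data set (e.g., exam scores).
data = [52, 55, 61, 63, 63, 64, 67, 70, 72, 75, 78, 91]
lines = stem_and_leaf(data)
for line in lines:
    print(line)
```

In the printed display, the longest row marks the typical values (the 60s here), and the empty stem 8 shows a gap before the single observation in the 90s.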
Dotplots
Dotplot: a useful summary of numerical data when the data set is reasonably small or there are relatively few distinct data values.
Each observation is represented by a dot above the corresponding location on a horizontal measurement scale.
When a value occurs more than once, there is a dot for each occurrence, and these dots are stacked vertically.
A dotplot provides information about location, spread, extremes, and gaps.
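The stacking rule can be sketched as a small text-based plot. This is a rough illustration, assuming non-negative integer data; the function name `dotplot` and the data values are hypothetical.

```python
from collections import Counter

def dotplot(values):
    """Return text rows with one '*' per observation, stacked above each value."""
    counts = Counter(values)
    axis = list(range(min(values), max(values) + 1))
    width = len(str(max(values))) + 1
    rows = []
    # Build the rows from the tallest stack down to the first level.
    for level in range(max(counts.values()), 0, -1):
        cells = ("*" if counts[v] >= level else " " for v in axis)
        rows.append("".join(c.rjust(width) for c in cells).rstrip())
    # Last row: the horizontal measurement scale.
    rows.append("".join(str(v).rjust(width) for v in axis))
    return rows

# Hypothetical small data set.
data = [1, 2, 2, 3, 3, 3, 6]
rows = dotplot(data)
for row in rows:
    print(row)
```

The empty columns above 4 and 5 show a gap, and the isolated dot above 6 shows an extreme value.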
Dotplot: an example
Here is an example to show what a dotplot looks like and how to interpret it. Suppose 30 first
graders are asked to pick their favorite color. Their choices can be summarized in a dotplot, as
shown below.
[Figure: dotplot of favorite colors, with one column of dots above each of Red, Orange, Yellow, Green, Blue, Indigo, and Violet.]
Each dot represents one student, and the number of dots in a column represents the number of first
graders who selected the color associated with that column. For example, Red was the most popular
color (selected by 9 students), followed by Blue (selected by 7 students). Selected by only 1
student, Indigo was the least popular color.
Histograms
To construct a histogram for
discrete data:
Determine the (relative) frequency of each x value in the sample
Mark the possible x values on a horizontal scale
Above each value, draw a rectangle whose height is the (relative) frequency of that value
continuous data:
Determine the (relative) frequency of each class
Mark the class boundaries on a horizontal measurement axis
Above each class interval, draw a rectangle whose height is the corresponding (relative) frequency
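The first step in each case (finding the frequencies that set the rectangle heights) can be sketched as below. The function names and data are hypothetical, and treating each class as closed on the left and open on the right is an assumption of this example.

```python
from collections import Counter

def relative_frequencies(values):
    """Discrete data: relative frequency of each distinct x value."""
    n = len(values)
    return {x: c / n for x, c in sorted(Counter(values).items())}

def class_frequencies(values, boundaries):
    """Continuous data: number of observations in each class [b_i, b_(i+1))."""
    counts = [0] * (len(boundaries) - 1)
    for v in values:
        for i in range(len(counts)):
            if boundaries[i] <= v < boundaries[i + 1]:
                counts[i] += 1
                break
    return counts

# Hypothetical discrete sample (e.g., number of defects per item).
discrete = [0, 1, 1, 2, 2, 2, 3, 1]
rel = relative_frequencies(discrete)

# Hypothetical continuous sample with class boundaries 0, 1, 2, 3, 4.
continuous = [0.2, 0.7, 1.1, 1.4, 1.9, 2.5, 2.6, 3.8]
freq = class_frequencies(continuous, [0, 1, 2, 3, 4])

print(rel)
print(freq)
```

Dividing each class count by the sample size gives the relative frequencies; with classes of equal width, either choice produces a histogram of the same shape.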
Histogram: an example
[Figure: Gaussian histogram. Horizontal axis: variable value (−4 to 4); vertical axis: number of values in each interval (0 to 1500).]
Density function
A density function f(x) is used to describe (approximately) the population distribution of a continuous variable x.
The graph of f(x) is called the density curve.
The following properties of f(x) must be satisfied:
$f(x) \ge 0$
$\int_{-\infty}^{\infty} f(x)\,dx = 1$ (i.e., the total area under the density curve is 1)
For any two numbers a and b with a < b, the proportion of x values between a and b is $\int_a^b f(x)\,dx$.
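These properties can be checked numerically for a simple density. The sketch below assumes a hypothetical density f(x) = 2x on [0, 1] (and 0 elsewhere) and approximates the integrals with the midpoint rule; neither choice comes from the course material.

```python
def f(x):
    """Hypothetical density: f(x) = 2x on [0, 1], and 0 elsewhere."""
    return 2.0 * x if 0.0 <= x <= 1.0 else 0.0

def integrate(g, a, b, n=10_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Total area under the density curve: should be close to 1.
total_area = integrate(f, 0.0, 1.0)

# Proportion of x values between a = 0.25 and b = 0.5.
proportion = integrate(f, 0.25, 0.5)

print(total_area, proportion)
```

For this density the proportion can also be found by hand: $\int_{0.25}^{0.5} 2x\,dx = 0.5^2 - 0.25^2 = 0.1875$.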
Mass function
A mass function p(x) is used to describe (approximately) the population distribution of a discrete variable x.
The following properties of p(x) must be satisfied:
$p(x) \ge 0$
$\sum_x p(x) = 1$
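A quick way to see the two properties at work is to test a candidate mass function against them. In the sketch below, the helper `is_valid_mass_function`, its tolerance, and the example distribution are all assumptions made for illustration.

```python
import math

def is_valid_mass_function(p, tol=1e-9):
    """Check that p(x) >= 0 for every x and that the values sum to 1."""
    nonnegative = all(v >= 0 for v in p.values())
    sums_to_one = math.isclose(sum(p.values()), 1.0, abs_tol=tol)
    return nonnegative and sums_to_one

# Hypothetical distribution for the number of defects on a part.
p = {0: 0.5, 1: 0.3, 2: 0.15, 3: 0.05}

valid = is_valid_mass_function(p)

# Proportion of parts with at most one defect.
at_most_one = p[0] + p[1]

print(valid, at_most_one)
```

A table whose values are all non-negative but sum to more than 1, or one that sums to 1 only because of a negative entry, fails the check.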
Definition
A continuous variable x is said to have a normal distribution with parameters µ and σ, where −∞ < µ < ∞ and σ > 0, if the density function of x is
$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-(x-\mu)^2/(2\sigma^2)}, \qquad -\infty < x < \infty \qquad (1)$$
[Figure: graph of a normal density curve. Vertical axis: f(x) (0 to 0.35); horizontal axis: x (−6 to 6).]
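The normal density can be evaluated directly from its formula. The sketch below does so and checks numerically that the total area is 1. The parameter values µ = 0 and σ = 1.2, the integration range of ±10σ, and the midpoint-rule helper are assumptions chosen for the example.

```python
import math

def normal_pdf(x, mu, sigma):
    """Normal density with parameters mu and sigma (sigma > 0)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (math.sqrt(2.0 * math.pi) * sigma)

def midpoint_integral(g, a, b, n=20_000):
    """Approximate the integral of g over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

mu, sigma = 0.0, 1.2  # hypothetical parameter values

def density(x):
    return normal_pdf(x, mu, sigma)

# Area over a wide range around mu: should be very close to 1.
area = midpoint_integral(density, mu - 10 * sigma, mu + 10 * sigma)

# Proportion of x values within one sigma of mu (about 0.68 for any normal).
within_one_sigma = midpoint_integral(density, mu - sigma, mu + sigma)

print(round(area, 6), round(within_one_sigma, 4))
```

The curve is symmetric about µ and its peak height is 1/(√(2π) σ), so a larger σ gives a lower, wider curve.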
[Figure: lognormal distribution with µ = 4 and σ = 1. Horizontal axis: x (0 to 500); vertical axis: density (0 to 0.014).]
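The defining slide for this distribution is not reproduced here, so the sketch below relies on the standard definition as an assumption: x has a lognormal distribution with parameters µ and σ when ln(x) has a normal distribution with those parameters. The function name `lognormal_pdf` is hypothetical; the parameter values µ = 4 and σ = 1 are those of the figure.

```python
import math

def lognormal_pdf(x, mu, sigma):
    """Density of x when ln(x) is normal with parameters mu and sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / sigma
    return math.exp(-0.5 * z * z) / (x * sigma * math.sqrt(2.0 * math.pi))

mu, sigma = 4.0, 1.0

# Under this parameterization the density peaks at x = exp(mu - sigma^2).
mode = math.exp(mu - sigma ** 2)
peak = lognormal_pdf(mode, mu, sigma)

print(round(mode, 2), round(peak, 4))
```

The peak of roughly 0.012 near x = 20, followed by a long right tail, agrees with the scale of the plotted curve.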
[Figure: Weibull density functions for β = 1 with α = 1, α = 1.5, and α = 5. Horizontal axis: x (0 to 3); vertical axis: density function (0 to 2).]
[Figure: binomial histogram. Horizontal axis: x (0 to 8); vertical axis: proportion (0 to 0.35).]
[Figure: Poisson histogram with λ = 2. Horizontal axis: x (0 to 8); vertical axis: proportion (0 to 0.35).]
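The heights of such a histogram can be reproduced from the Poisson mass function. The formula p(x) = e^(−λ) λ^x / x! used below is the standard one and is stated here as an assumption, since the defining slide is not reproduced; the function name `poisson_pmf` is hypothetical, and λ = 2 is the value in the figure.

```python
import math

def poisson_pmf(x, lam):
    """Poisson mass function p(x) = exp(-lam) * lam**x / x! for x = 0, 1, 2, ..."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

lam = 2.0

# Rectangle heights for x = 0, 1, ..., 8.
heights = [poisson_pmf(x, lam) for x in range(9)]

# The probabilities over a long range of x values should sum to about 1.
total = sum(poisson_pmf(x, lam) for x in range(60))

for x, h in enumerate(heights):
    print(x, round(h, 4))
```

With λ = 2 the two tallest rectangles are at x = 1 and x = 2, both of height about 0.27, which fits the vertical scale of the figure.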