Intro to Stats, Test 1
R. Sinn, Spring 2007
Student Solutions & Points Scoring Guide
Test is out of 90 Points Total
Directions
Answer each question in full.
In some cases in statistics, more than one answer may be counted correct. Please justify your statements if in doubt.
You may use a graphing calculator on any problem.
Please mark “CALC” in the margin next to any steps performed on the calculator.

Questions 1 and 2 relate to Histogram 1 shown at right.
1. [5 Points] Check all the statements that apply:
☐ The distribution is approximately normal.
☐ The distribution is skewed left.
☐ The distribution is skewed right.
☐ The mean and median are approximately equal.
2. [5 Points] Check all the statements that apply:
☐ The mean is likely to be less than the median.
☐ The mean is likely to be greater than the median.
☐ The data set is not likely to contain outliers.
☐ Outliers likely exist, and the majority are on the left.
☐ Outliers likely exist, and the majority are on the right.
Questions 3 and 4 relate to Histogram 2 shown at right.
3. [5 Points] Check all the statements that apply:
☐ The distribution is approximately normal.
☐ The distribution is skewed left.
☐ The distribution is skewed right.
☐ The mean and median are approximately equal.
4. [5 Points] Check all the statements that apply:
☐ The mean is likely to be less than the median.
☐ The mean is likely to be greater than the median.
☐ The data set is not likely to contain outliers.
☐ Outliers likely exist, and the majority are on the left.
☐ Outliers likely exist, and the majority are on the right.
[Grading for Questions 1 - 4: start from 5 Pts; deduct 1 Pt for each correct option not marked and 1 Pt for each incorrect option marked.]
| Regression Statistics | | Line of Best Fit y = a x + b | |
| r | -0.4827 | a | -0.372 |
| R² | 0.2230 | b | 7.483 |
5. [10 Points] Essay: During the Fall semester 2006, a team of NGCSU students found a correlation between age (x-variable, measured in years) and interest in rock climbing (y-variable, measured on a scale from 1 – 10, with 10 indicating high interest and 1 indicating no interest). Write 2 – 5 complete sentences explaining everything you know and can infer based upon the output shown to the right. Be sure to focus your analysis upon real-world implications.
The researchers found a negative correlation between age and interest in rock climbing that is strong by real-world standards (moderate by textbook standards) [2 Pts]. They found that 22% of the variance in interest in rock climbing was accounted for by age [2 Pts], with older people less interested than younger people [1 Pt]. The slope of the prediction equation indicates that for each additional year of age, a person’s predicted interest in rock climbing decreases by about 0.37 units [2 Pts].
Following directions [3 Pts]: awarded for writing in complete sentences and for connecting statistical output to the real-world research scenario.
6. [5 Points] A statistician computes a regression comparing a father’s height (in inches) to his daughter’s height (in inches). She finds that R² = 0.39. Interpret this finding.
This means that 39% of the variance in a daughter’s height is accounted for by father’s height.

7. [5 Points] Given Scatter Plot 1 shown at right, check all the statements that apply:
☐ Linear Regression is appropriate.
☐ Linear Regression is not appropriate.
☐ The (linear) correlation is positive.
☐ The (linear) correlation is negative.
☐ The (linear) correlation is strong.
☐ The (linear) correlation is weak.
☐ No linear correlation exists.
[Grading for Questions 7 & 8: start from 5 Pts; deduct 1 Pt for each correct option not marked and 1 Pt for each incorrect option marked.]

8. [5 Points] Given Scatter Plot 2 shown at right, check all the statements that apply:
☐ Linear Regression is appropriate.
☐ Linear Regression is not appropriate.
☐ The (linear) correlation is positive.
☐ The (linear) correlation is negative.
☐ The (linear) correlation is strong.
☐ The (linear) correlation is weak.
☐ No linear correlation exists.
9. [10 Points] A tallish female college student is studying dating patterns. She wonders if tall women tend to date taller men than short women do. She measures the height of several women in her dorm; then she measures the next man each woman dates. Here are the data (heights in inches), and linear regression is appropriate:
| Women (x) | 66 | 64 | 66 | 65 | 70 | 65 |
| Men (y) | 72 | 68 | 70 | 68 | 71 | 65 |
See calculator screen shots at end of test for correct calculator steps.
a. Find the correlation coefficient and analyze it.
r = 0.565, which indicates a strong, positive relationship (real-world) [3 Pts]
b. If a woman is 5’5” tall, estimate the height of her boyfriend.
Plug in 65” for x in the line of best fit: y = 0.68 x + 24 → Boyfriend will be approximately 68.2” tall [2 Pts]
c. Analyze R² in the context of this problem.
R² = 0.32, meaning that 32% of the variance in boyfriend height is accounted for by the woman’s height [2 Pts]
d. Analyze the slope of the prediction equation. How meaningful is this relationship (the one between the variables, not the dating relationships)? Hint: refer to your answer in part (c).
For every 1 inch taller the woman is, her boyfriend is predicted to be about 0.68 inches taller. This is moderately meaningful, since about a third of the variance is accounted for [3 Pts]
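The calculator results for question 9 can also be checked by hand. The following is a minimal Python sketch, not the graded method: it assumes the six height pairs from the table, and the helper name `linear_fit` is made up for illustration. It computes r, R², and the least-squares line from the usual sums of squared deviations.

```python
from math import sqrt

# Heights in inches, copied from the table in question 9.
women = [66, 64, 66, 65, 70, 65]  # x-variable
men = [72, 68, 70, 68, 71, 65]    # y-variable


def linear_fit(xs, ys):
    """Return (r, slope, intercept) for the least-squares line y = slope * x + intercept."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    r = sxy / sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return r, slope, intercept


r, slope, intercept = linear_fit(women, men)
predicted = slope * 65 + intercept  # part (b): woman who is 65 inches tall

print(f"r = {r:.3f}, R^2 = {r * r:.2f}")
print(f"y = {slope:.2f} x + {intercept:.0f}")
print(f"predicted height at x = 65: {predicted:.1f}")
```

With the unrounded slope the prediction comes out near 68.3 inches; the 68.2 in the key results from plugging 65 into the rounded equation y = 0.68 x + 24.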
| Record year (x-variable) | Time in seconds (y-variable) |
| 1967 | 2286.4 |
| 1970 | 2130.5 |
| 1975 | 2100.4 |
| 1975 | 2041.4 |
| 1977 | 1995.1 |
| 1979 | 1972.5 |
| 1981 | 1950.8 |
| 1981 | 1937.2 |
| 1982 | 1895.3 |
| 1983 | 1895.0 |
| 1983 | 1887.6 |
| 1984 | 1873.8 |
| 1985 | 1859.4 |
| 1986 | 1813.7 |
| 1993 | 1771.8 |
10. [10 Points] The table to the right shows the progress of women’s world record times (in seconds) for the 10,000-meter run. Answer the following:
See calculator screen shots at end of test for correct calculator steps.
a. Find and analyze the correlation between record time and year.
r = -0.97, indicating a strong, negative relationship (real-world) [3 Pts].
b. Analyze R² in the context of this problem.
R² = 0.94, meaning that 94% of the variance in world record times is accounted for by year [3 Pts].
c. In what year (according to your prediction equation) will the women’s 10,000-meter world record time be 5 minutes?
Plug 5 × 60 = 300 in for y in the line of best fit: y = -19.9 x + 41373 → The world record will be 5 minutes in the year 2064 [3 Pts].
d. Explain why the correct answer to part (c) makes very little sense in a real-world context.
The problem is the scope of the model [1 Pt]. The linear relationship is valid for the very narrow range of years covered in the data set (see scatter plot), but obviously there will come a time when the records reach some limit beyond which significant gains are not humanly possible.
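The extrapolation in part (c) can be reproduced with a short Python sketch. It assumes the fifteen (year, time) pairs from the question 10 table, and the variable names are illustrative only. It fits the least-squares line and then solves y = 300 for x.

```python
from math import sqrt

# (record year, time in seconds) pairs copied from the table for question 10.
records = [
    (1967, 2286.4), (1970, 2130.5), (1975, 2100.4), (1975, 2041.4),
    (1977, 1995.1), (1979, 1972.5), (1981, 1950.8), (1981, 1937.2),
    (1982, 1895.3), (1983, 1895.0), (1983, 1887.6), (1984, 1873.8),
    (1985, 1859.4), (1986, 1813.7), (1993, 1771.8),
]

n = len(records)
x_bar = sum(year for year, _ in records) / n
y_bar = sum(time for _, time in records) / n
sxx = sum((year - x_bar) ** 2 for year, _ in records)
syy = sum((time - y_bar) ** 2 for _, time in records)
sxy = sum((year - x_bar) * (time - y_bar) for year, time in records)

r = sxy / sqrt(sxx * syy)
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Part (c): solve slope * x + intercept = 300 seconds (5 minutes) for x.
target_seconds = 5 * 60
year_at_target = (target_seconds - intercept) / slope

print(f"r = {r:.2f}, R^2 = {r * r:.2f}")
print(f"y = {slope:.1f} x + {intercept:.0f}")
print(f"predicted year for a 5-minute record: {year_at_target:.1f}")
```

The predicted year should land within about a year of the 2064 given in the key; small differences come from rounding the slope and intercept before solving. The calculation says nothing about whether such a time is physically possible, which is the point of part (d).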
11. [5 Points] Five hundred drivers were asked about the age of the car they drive (in years). The boxplot below shows the data collected on the ages of the 500 cars. For parts (a) through (c), select the one best answer choice.

a. The median age of cars in the study is: [2 Pts]
☐ 4
☐ 8
☐ 12
☐ not available in the information provided
b. The mean age of cars in the study is: [2 Pts]
☐ 4
☐ 8
☐ 12
☐ not available in the information provided
c. The percentage of cars reported to be more than 12 years old was: [1 Pt]
☐ 0
☐ 25
☐ 50
☐ 75
| | Cause of Death | | |
| | Cancer | Heart Disease | Other |
| Smoker | 135 | 310 | 205 |
| Non-Smoker | 55 | 155 | 140 |
12. [5 Points] Is smoking related to cause of death? Use the Chi-Square test on the following 2-way table. The mortality rates are for 1000 males in the 45 – 64 year old age category. Test at the 0.10 level (just as we did in all 3 class examples).
See calculator screen shots at end of test for correct calculator steps.
Since p = 0.0154, which is less than 0.10, we have evidence for a relationship between smoking and cause of death.
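The p-value for question 12 can be checked without a calculator. This is a minimal Python sketch assuming the six counts from the 2-way table. It builds the expected counts from the row and column totals, sums the chi-square contributions, and uses the fact that for 2 degrees of freedom the upper-tail probability is exactly exp(-χ²/2). That shortcut holds only for df = 2; other table sizes would need a chi-square distribution function from a statistics library.

```python
from math import exp

# Observed counts from the 2-way table: columns are Cancer, Heart Disease, Other.
observed = [
    [135, 310, 205],  # Smoker
    [55, 155, 140],   # Non-Smoker
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        # Expected count under independence: (row total * column total) / grand total.
        expected = row_totals[i] * col_totals[j] / grand_total
        chi_square += (count - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
assert df == 2  # the closed form below is valid only for 2 degrees of freedom

p_value = exp(-chi_square / 2)

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.4f}")
print("evidence of a relationship" if p_value < 0.10 else "no evidence of a relationship")
```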
13. [10 Points] The data given below are the ages of students in a class:
31 17 21 40 19 21 26 37 24 18 29 25 37 48 21 28 34 23 18 22 32
See calculator screen shots at end of test for correct calculator steps.
a. Provide a standard data table for this data set, i.e. mean (x̄), standard deviation (s) and sample size (n). [3 Pts]
| Min | 17 |
| Q1 | 21 |
| Med | 25 |
| Q3 | 33 |
| Max | 48 |
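The summary statistics for the class ages in question 13 can be checked with a short Python sketch. It assumes the 21 ages listed in the question and one particular quartile convention: Q1 and Q3 are taken as the medians of the lower and upper halves of the sorted data, with the middle value left out when n is odd. Other software may use a different quartile rule and give slightly different values. The helper name `five_number_summary` is illustrative.

```python
from statistics import mean, median, stdev

# Ages copied from question 13.
ages = [31, 17, 21, 40, 19, 21, 26, 37, 24, 18, 29,
        25, 37, 48, 21, 28, 34, 23, 18, 22, 32]


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using medians of the lower and upper halves."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]   # for odd n, the middle value is excluded from both halves
    upper = data[-half:]
    return data[0], median(lower), median(data), median(upper), data[-1]


n = len(ages)
x_bar = mean(ages)
s = stdev(ages)  # sample standard deviation (divides by n - 1)

print(f"n = {n}, mean = {x_bar:.1f}, s = {s:.2f}")
print("five-number summary:", five_number_summary(ages))
```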
b. Provide the 5-number summary for this data set. [5 Pts]
| x̄ | 27.2 |