Intro to Stats, Test 1

R. Sinn, Spring 2007                                                                             Name ______________________________

 

Test is out of 90 Points Total

Directions

*       Answer each question in full. In some cases in statistics, more than one answer may be counted correct.  Please justify your statements if in doubt.

*       You may use a graphing calculator on any problem. 

*       Please mark “CALC” in the margin next to any steps performed on the calculator. 

Questions 1 and 2 relate to Histogram 1 shown at right.

 

1.        [5 Points] Check all the statements that apply:

[ ]     The distribution is approximately normal.

[ ]     The distribution is skewed left.

[ ]     The distribution is skewed right.

[ ]     The mean and median are approximately equal.

 

2.        [5 Points] Check all the statements that apply:

[ ]     The mean is likely to be less than the median.

[ ]     The mean is likely to be greater than the median.

[ ]     The data set is not likely to contain outliers.

[ ]     Outliers likely exist, and the majority are on the left.

[ ]     Outliers likely exist, and the majority are on the right.

Questions 3 and 4 relate to Histogram 2 shown at right.

 

3.        [5 Points] Check all the statements that apply:

[ ]     The distribution is approximately normal.

[ ]     The distribution is skewed left.

[ ]     The distribution is skewed right.

[ ]     The mean and median are approximately equal.

 

4.        [5 Points] Check all the statements that apply:

[ ]     The mean is likely to be less than the median.

[ ]     The mean is likely to be greater than the median.

[ ]     The data set is not likely to contain outliers.

[ ]     Outliers likely exist, and the majority are on the left.

[ ]     Outliers likely exist, and the majority are on the right.

 

 

Regression Statistics        Line of Best Fit: y = a x + b

r      -0.4827               a     -0.372

R²      0.2230               b      7.483

5.         [10 Points] Essay: During the Fall semester 2006, a team of NGCSU students found a correlation between age (x-variable, measured in years) and interest in rock climbing (y-variable, measured on a scale from 1 – 10, with 10 indicating high interest and 1 indicating no interest).  Write 2 – 5 complete sentences explaining everything you know and can infer based upon the output shown to the right.  Be sure to focus your analysis upon real-world implications.


6.        [5 Points] A statistician computes a regression comparing a father’s height (in inches) to his daughter’s height (in inches).  She finds that R² = 0.39.  Interpret this finding.

 

 

 

7.        [5 Points] Given Scatter Plot 1 shown at right, check all the statements that apply:

[ ]     Linear Regression is appropriate.

[ ]     Linear Regression is not appropriate.

[ ]     The (linear) correlation is positive.

[ ]     The (linear) correlation is negative.

[ ]     The (linear) correlation is strong.

[ ]     The (linear) correlation is weak.

[ ]     No linear correlation exists.

 

 

 

8.        [5 Points] Given Scatter Plot 2 shown at right, check all the statements that apply:  

[ ]     Linear Regression is appropriate.

[ ]     Linear Regression is not appropriate.

[ ]     The (linear) correlation is positive.

[ ]     The (linear) correlation is negative.

[ ]     The (linear) correlation is strong.

[ ]     The (linear) correlation is weak.

[ ]     No linear correlation exists.

 

 

 

 

9.        [10 Points] A tallish female college student is studying dating patterns.  She wonders whether tall women tend to date taller men than short women do.  She measures the height of several women in her dorm; then she measures the height of the next man each woman dates.  Here are the data (heights in inches); assume that linear regression is appropriate:

 

Women (x)     66    64    66    65    70    65

Men (y)       72    68    70    68    71    65

 

a.        Find the correlation coefficient and analyze it.

 

 

 

b.       If a woman is 5’5” tall, estimate the height of her boyfriend.

 

 

 

 

 

c.        Analyze R² in the context of this problem.

 

 

 

 

d.       Analyze the slope of the prediction equation.  How meaningful is this relationship (the one between the variables, not the dating relationships)?  Hint: refer to your answer in part (c).

Record year (x-variable)     Time (seconds) (y-variable)

1967                         2286.4
1970                         2130.5
1975                         2100.4
1975                         2041.4
1977                         1995.1
1979                         1972.5
1981                         1950.8
1981                         1937.2
1982                         1895.3
1983                         1895.0
1983                         1887.6
1984                         1873.8
1985                         1859.4
1986                         1813.7
1993                         1771.8

10.     [10 Points] The table to the right shows the progress of women’s world record times (in seconds) for the 10,000-meter run.  Answer the following:

 

a.        Find and analyze the correlation between record time and year.

 

 

 

b.       Analyze R² in the context of this problem.

 

 

 

 

c.        In what year (according to your prediction equation) will the women’s 10,000-meter world record time be 5 minutes?

 

 

 

 

 

d.       Explain why the correct answer to part (c) makes very little sense in a real-world context.

 

 

 

 

11.     [5 Points] Five hundred drivers were asked about the age (in years) of the car they drive.  The boxplot below shows the data collected on the ages of the 500 cars.  For parts (a) through (c), select the one best answer choice.


a.        The median age of cars in the study is:

[ ]     4

[ ]     8

[ ]     12

[ ]     not available in the information provided

 


b.       The mean age of cars in the study is:

[ ]     4

[ ]     8

[ ]     12

[ ]     not available in the information provided


c.        The percentage of cars reported to be more than 12 years old was:

[ ]     0

[ ]     25

[ ]     50

[ ]     75

 

 


12.     [5 Points] Is smoking related to cause of death?  Use the Chi-Square test on the following 2-way table.  The mortality counts are for 1000 males in the 45 to 64 age category.  Test at the 0.10 level (just as we did in all 3 class examples).

 

                        Cause of Death

                 Cancer    Heart Disease    Other

Smoker            135          310           205

Non-Smoker         55          155           140


 

13.      [10 Points] The data given below are the ages of students in a class:

 

        31    17    21    40    19    21    26    37    24    18    29    25    37    48    21    28    34    23    18    22    32

 


a.        Provide a standard data table for this data set, i.e. mean (x̄), standard deviation (s) and sample size (n).

 

 

b.       Provide the 5 number summary for this data set.

 


 

 

 

c.        Basing your conclusion only on the three numbers mean, median and standard deviation, do you think the distribution is skewed?  If so, explain why and state the direction of the skew (left or right).

 

 

 

 

 

 

14.     [5 Points] You are given the data set below.

 

        6    8    10    8    10    18    12    5    7    9    12    7

 

 

a.          Identify any outliers, and state how you did so.

 

 

 

 

 

 

b.       Compute the z-score for x = 18.