45-733 PROBABILITY AND STATISTICS I Practice Final



Probability and Statistics                      Name__________________________
Spring 1999 Flex-Mode and Flex-Time 45-733
Practice Final
Keith Poole

(10 Points)

1. In a random sample of 400 consumers in a given city, 210 preferred Fords to Chevrolets. Use this sample to perform the hypothesis test

H0: p = .50
H1: p ≠ .50

with α = .20.

The decision rule for this problem is:

If (p̂ - p₀)/[p₀(1 - p₀)/n]^(1/2) > z_{α/2} or
   (p̂ - p₀)/[p₀(1 - p₀)/n]^(1/2) < -z_{α/2}, then Reject H0.

If -z_{α/2} < (p̂ - p₀)/[p₀(1 - p₀)/n]^(1/2) < z_{α/2}, then Do Not Reject H0.

z_{.10} = 1.28,  p̂ = 210/400 = .525

Test Statistic = (.525 - .5)/[(.5*.5)/400]^(1/2) = 1.0
Since -1.28 < 1.0 < 1.28, Do Not Reject H0.

The P-Value of the Test = 2*P(Z > 1.0) = 2*.1587 = .3174, using the Normal Distribution Function Table.
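
The calculation above can be checked with a short script. This is a minimal sketch using only the Python standard library; the function name proportion_z_test is hypothetical and not part of the course materials.

```python
from math import sqrt
from statistics import NormalDist


def proportion_z_test(successes, n, p0):
    """Return (z statistic, two-sided P-value) for H0: p = p0."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    # Two-sided P-value: twice the standard normal area beyond |z|.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


z, p_value = proportion_z_test(210, 400, 0.50)
z_crit = NormalDist().inv_cdf(1 - 0.20 / 2)  # z_{.10}, about 1.28
reject = abs(z) > z_crit
print(z, p_value, reject)
```

The script reaches the same decision as the table lookup: the statistic falls inside the acceptance region, so H0 is not rejected.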




(10 Points)

2. A manufacturer makes parts under a government contract which specifies a mean weight of 100 grams and a tolerance of 1 gram. The government will reject parts unless the weight is between 99 and 101 grams. To be safe, the manufacturer wants the variance of the weight to be no more than .5 grams squared. A sample of 5 parts yields weights of 99.6, 101.2, 100.4, 98.8, and 100.0 grams, respectively. Assume that the weights are normally distributed and perform the following hypothesis test with α = .05.

H0: σ² = .50
H1: σ² ≠ .50



The decision rule for this problem is:

If (n-1)s²/σ₀² > C2 or (n-1)s²/σ₀² < C1, then Reject H0.

Where P(χ²_{n-1} > C2) = α/2 and P(χ²_{n-1} < C1) = α/2

s² = .8, 4 df, C1 = .484419, C2 = 11.1433

Test Statistic: (n-1)s²/σ₀² = (4*.8)/.5 = 6.4

Since .484419 < 6.4 < 11.1433, Do Not Reject H0.

The P-Value of the Test = 2*.1712 = .3424
In EViews, the command:
SCALAR PVALUE=@CHISQ(6.4,4)
gives us the tail area above 6.4 which equals .1712.
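
As a cross-check outside EViews, the same tail area can be computed in Python. This is a sketch under one stated assumption: it uses the closed-form upper tail of the chi-square distribution that holds only for even degrees of freedom, which covers the 4 df here. The helper name chi2_upper_tail_even_df is hypothetical.

```python
from math import exp, factorial
from statistics import variance


def chi2_upper_tail_even_df(x, df):
    """P(chi-square with df degrees of freedom > x); valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2
    return exp(-half) * sum(half ** j / factorial(j) for j in range(df // 2))


weights = [99.6, 101.2, 100.4, 98.8, 100.0]
df = len(weights) - 1
s2 = variance(weights)            # sample variance, divisor n - 1
stat = df * s2 / 0.5              # (n-1)s^2 / sigma_0^2
upper_tail = chi2_upper_tail_even_df(stat, df)
p_value = 2 * min(upper_tail, 1 - upper_tail)
print(s2, stat, upper_tail, p_value)
```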




(10 Points)

3. Suppose we have the discrete bivariate probability distribution:

                           { c(x³ + y + 3)   x = 0,1,2;  y = 0,1,2
                  f(x,y) = {
                           { 0               otherwise
  1. Find c.

    
                                   y
                               0   1   2
                             ------------
                           0 | 3   4   5 | 12
                             |           |
                        x  1 | 4   5   6 | 15
                             |           |
                           2 |11  12  13 | 36
                             |           |
                             ---------------
                              18  21  24 | 63
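       The table entries are the values of x³ + y + 3, with row and column totals.
       The probabilities must sum to 1, so 63c = 1. Hence c = 1/63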
    
  2. Find g1(x | y=2).


                                  { 5/24    x = 0
                                  { 6/24    x = 1
     g1(x | y=2) = f(x,2)/f2(2) = { 13/24   x = 2
                                  { 0       otherwise
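
The table and the conditional distribution can be reproduced with exact fractions. This is only an illustrative sketch; the variable names are chosen here and do not come from the course.

```python
from fractions import Fraction

support = range(3)  # x and y each take the values 0, 1, 2

# Unnormalized values x^3 + y + 3 for every (x, y) pair.
raw = {(x, y): x ** 3 + y + 3 for x in support for y in support}
c = Fraction(1, sum(raw.values()))          # probabilities must sum to 1
f = {xy: c * value for xy, value in raw.items()}

# Marginal of Y at y = 2, then the conditional distribution of X given y = 2.
f2_at_2 = sum(f[(x, 2)] for x in support)
g1 = {x: f[(x, 2)] / f2_at_2 for x in support}
print(c, f2_at_2, g1)
```

Note that Fraction reduces 6/24 to 1/4 automatically.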




(10 Points)

4. We wish to estimate the average length of bolts in a shipment to our factory. Suppose the population standard deviation is known to be 3.5cm. We take a random sample of 49 bolts. What is the probability that the absolute difference between the sample mean and the true mean will not exceed .2cm?


  P[|X̄n - μ| ≤ .2] = P[-.2 ≤ X̄n - μ ≤ .2] =

  P[-.2/(3.5/7) ≤ (X̄n - μ)/(3.5/7) ≤ .2/(3.5/7)] =

  P[-.4 ≤ Z ≤ .4] = Φ(.4) - Φ(-.4) = .6554 - .3446 = .3108
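
The standardization step can be sketched in Python with the standard library's NormalDist. The variable names are illustrative only.

```python
from math import sqrt
from statistics import NormalDist

sigma, n, tolerance = 3.5, 49, 0.2

standard_error = sigma / sqrt(n)      # 3.5 / 7 = 0.5
z = tolerance / standard_error        # 0.4
std_normal = NormalDist()
prob = std_normal.cdf(z) - std_normal.cdf(-z)
print(standard_error, z, prob)
```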



(10 Points)

5. Let X1, X2, ..., Xn be a random sample from a distribution with mean μ and variance σ². What are the mean and variance of [X1/2 + X2/3 + X3/6]?

E[X1/2 + X2/3 + X3/6] = μ/2 + μ/3 + μ/6 = μ

Because the Xi are independent, the variance of the sum is the sum of the variances:

VAR[X1/2 + X2/3 + X3/6] = σ²/4 + σ²/9 + σ²/36 = (14σ²)/36 = (7σ²)/18
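
A quick exact check of the two coefficient sums, as a sketch: the mean is multiplied by the sum of the coefficients, and, for independent terms, the variance is multiplied by the sum of their squares.

```python
from fractions import Fraction

coefficients = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

mean_multiplier = sum(coefficients)                      # multiplies mu
variance_multiplier = sum(a ** 2 for a in coefficients)  # multiplies sigma^2
print(mean_multiplier, variance_multiplier)
```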




(10 Points)

6. Diseases I and II are present among people in a certain large population. Suppose that 25% of the population will contract Disease I sometime during their lifetime, 23% will contract Disease II sometime during their lifetime, and 9% will contract both diseases.
  1. (5 Points) Find the probability that a randomly chosen person from this population will contract at least one disease.

    P(I ∪ II) = P(I) + P(II) - P(I ∩ II) = .25 + .23 - .09 = .39

  2. (5 Points) Find the probability that a randomly chosen person from this population will contract both diseases, given that he/she has contracted at least one disease.

    P(I ∩ II | I ∪ II) = P(I ∩ II)/P(I ∪ II) = .09/.39 = .2308


(10 Points)

7. Suppose we read in The Tribune Review that a public opinion poll shows that 64 percent of 400 randomly sampled Pittsburgh residents think that crocuses are the most beautiful flowers. Compute 99 percent confidence limits for the true proportion of Pittsburgh residents who think that crocuses are the most beautiful flowers.

Confidence Limits are:
   p̂ ± z_{α/2}[p̂(1 - p̂)/n]^(1/2)

   p̂ = .64,  z_{.005} = 2.58

  .64 ± 2.58*[(.64*.36)/400]^(1/2) = .64 ± .06192

  so the 99% limits are:  (.578, .702)
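
The same limits can be computed in Python. This sketch uses the exact normal quantile (about 2.576) rather than the rounded table value 2.58, so the half-width differs slightly in the fourth decimal place. The function name proportion_ci is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, confidence):
    """Large-sample confidence limits for a population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    half_width = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


low, high = proportion_ci(0.64, 400, 0.99)
print(low, high)
```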




(10 Points)

8. The Clean Soap Company is concerned about advertising claims made by its bitter rival the Better Soap Company that its soap resulted in cleaner clothes. They ask us to test Better's claim. We take 14 dirty sweatshirts and we randomly divide them into two groups of 7 sweatshirts each and wash them in separate but identical washing machines. In one washing machine we use Clean Soap and in the other washing machine we use Better Soap. On a scale of 1 (gross!) to 10 (squeaky clean!) we obtain the following results:


                    Clean    Better
                  ------------------
                       8       7       
                       8       5
                       7      10
                       9       8
                       6       6
                       9       6
                       7       8
                  -----------------
Assume that the cleaning scale values for both brands of soap are normally distributed with equal variances. Test the hypothesis that the two soaps are equally good at cleaning sweatshirts against the alternative that Clean Soap cleans better than Better Soap. Use EViews to calculate the P-Value of the test.

H0: μ₁ - μ₂ = 0
H1: μ₁ - μ₂ > 0
The decision rule for this problem is:

If (X̄n - Ȳm)/[s(1/n + 1/m)^(1/2)] > t_α then Reject H0.

If (X̄n - Ȳm)/[s(1/n + 1/m)^(1/2)] < t_α then Do Not Reject H0.

where

s² = [(n - 1)s₁² + (m - 1)s₂²]/(n + m - 2)

   X̄n = 7.714,   Ȳm = 7.143,  s₁² = 1.238,   s₂² = 2.810

   s² = (6*1.238 + 6*2.810)/12 = 2.024

Test Statistic = (7.714 - 7.143)/[2.024*(1/7 + 1/7)]^(1/2) = .751

Using the EViews Command:

SCALAR PVAL=@TDIST(.751,12)

Produces a two-tailed P-Value of .4671 which we divide by 2 to get:  

P-Value = .234
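
For readers without EViews, the pooled statistic and the one-tailed P-Value can be approximated in Python. This is a sketch, not the course's method: the standard library has no Student t distribution, so the hypothetical helper t_upper_tail integrates the t density numerically with Simpson's rule.

```python
from math import gamma, pi, sqrt
from statistics import mean, variance


def t_upper_tail(t, df, steps=2000):
    """Approximate P(T > t) for t >= 0 by Simpson's rule (steps must be even)."""
    const = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))

    def pdf(u):
        return const * (1 + u * u / df) ** (-(df + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    # The density is symmetric, so P(T > t) = 0.5 - area between 0 and t.
    return 0.5 - total * h / 3


clean = [8, 8, 7, 9, 6, 9, 7]
better = [7, 5, 10, 8, 6, 6, 8]
n, m = len(clean), len(better)

pooled_var = ((n - 1) * variance(clean) + (m - 1) * variance(better)) / (n + m - 2)
t_stat = (mean(clean) - mean(better)) / sqrt(pooled_var * (1 / n + 1 / m))
p_one_tailed = t_upper_tail(t_stat, n + m - 2)
print(pooled_var, t_stat, p_one_tailed)
```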




(10 Points)

9. We take a random sample of potholes in Pittsburgh city streets and Cleveland city streets and measure the depth of each pothole. The measurements for Pittsburgh in centimeters are:
          105   91  109   141  122   99  159  138   88
and the measurements for Cleveland in centimeters are:
          135  121   99   167   90   89  199 
Assume that the depth of potholes in both cities is normally distributed. Find 98 percent confidence limits for the difference between the true mean depths of the populations of Pittsburgh and Cleveland potholes.

The Confidence Interval is:

P{X̄n - Ȳm - t_{α/2}s(1/n + 1/m)^(1/2) < μx - μy < X̄n - Ȳm + t_{α/2}s(1/n + 1/m)^(1/2)} = 1 - α

where s² is the pooled sample variance (its use assumes that the two population variances are equal):

s² = [(n - 1)sx² + (m - 1)sy²]/(n + m - 2)

  X̄n = 116.889,  Ȳm = 128.571,  sx² = 606.861,  sy² = 1743.952

  t_{.01,14} = 2.624, and

  s² = (8*606.861 + 6*1743.952)/14 = 1094.186

Confidence Limits are:

  116.889 - 128.571 ± 2.624[1094.186(1/9 + 1/7)]^(1/2) =

  -11.682 ± 43.742,  or  (-55.424, 32.060)
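
The arithmetic above can be reproduced from the raw data. In this sketch the critical value 2.624 is taken from the t table as given in the solution and is not computed; the constant name T_CRIT is an assumption of this example.

```python
from math import sqrt
from statistics import mean, variance

pittsburgh = [105, 91, 109, 141, 122, 99, 159, 138, 88]
cleveland = [135, 121, 99, 167, 90, 89, 199]
T_CRIT = 2.624  # t_{.01,14}, read from a t table

n, m = len(pittsburgh), len(cleveland)
pooled_var = ((n - 1) * variance(pittsburgh)
              + (m - 1) * variance(cleveland)) / (n + m - 2)
diff = mean(pittsburgh) - mean(cleveland)
half_width = T_CRIT * sqrt(pooled_var * (1 / n + 1 / m))
low, high = diff - half_width, diff + half_width
print(pooled_var, diff, half_width, (low, high))
```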


(10 Points)

10. Suppose the times that it takes for GSIA students to get through the checkout line in the cafeteria on the first floor are independent random variables with a mean of 1.1 minutes and a standard deviation of 1.8 minutes. Approximate the probability that 50 GSIA students can get through the checkout line in less than one hour.

P{[(Σi=1,n Xi) - nμ]/(σn^(1/2)) ≤ k} ≈ Φ(k)

μ = 1.1 and σ² = 1.8²

Hence: P[(Σi=1,50 Xi) < 60] =
P{[(Σi=1,50 Xi) - 50*1.1]/[1.8*50^(1/2)] < (60 - 50*1.1)/[1.8*50^(1/2)]} =
P(Z < .393) = Φ(.393) = .6528
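
The Central Limit Theorem approximation can be sketched as follows, treating the total time as approximately normal with mean nμ and standard deviation σn^(1/2). The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n, limit = 1.1, 1.8, 50, 60  # minutes; one hour = 60 minutes

# Approximate distribution of the sum of the 50 checkout times.
total_time = NormalDist(mu=n * mu, sigma=sigma * sqrt(n))
z = (limit - n * mu) / (sigma * sqrt(n))
prob = total_time.cdf(limit)
print(z, prob)
```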




(5 Points)

11. The proportion of people having a certain type of respiratory disease in a large population is known to be .004. We take a random sample of 1000 people from this population. What is the probability that at least 1 and at most 3 people have the respiratory disease?

Let X be the number of people in the sample with the disease. Using the Poisson approximation to the Binomial:

λ = np = 1000*.004 = 4

P(1 ≤ X ≤ 3) = F(3) - F(0) = .433 - .018 = .415
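
A short sketch comparing the Poisson approximation with the exact Binomial probability. The function names poisson_pmf and binomial_pmf are defined here for illustration.

```python
from math import comb, exp, factorial


def poisson_pmf(k, lam):
    return exp(-lam) * lam ** k / factorial(k)


def binomial_pmf(k, n, p):
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 1000, 0.004
lam = n * p

poisson_prob = sum(poisson_pmf(k, lam) for k in range(1, 4))
exact_prob = sum(binomial_pmf(k, n, p) for k in range(1, 4))
print(poisson_prob, exact_prob)
```

With n this large and p this small, the two probabilities agree closely, which is why the Poisson table is adequate here.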





(10 Points)

12. We draw a random sample of size 61 from a normal distribution and we find that s² = 1112.3. Compute 90 and 98 percent confidence limits for σ².

The Confidence Interval is:

P[(n-1)s²/C2 < σ² < (n-1)s²/C1] = 1 - α

where P(χ²_{n-1} < C1) = α/2 and P(χ²_{n-1} > C2) = α/2, with n - 1 = 60 df.

90% Confidence Limits:

C1 = 43.1879 and C2 = 79.0819
(n-1)s²/C2 = (60*1112.3)/79.0819 = 843.9099, and
(n-1)s²/C1 = (60*1112.3)/43.1879 = 1545.2939

98% Confidence Limits:

C1 = 37.4848 and C2 = 88.3794
(n-1)s²/C2 = (60*1112.3)/88.3794 = 755.131, and
(n-1)s²/C1 = (60*1112.3)/37.4848 = 1780.402
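
The limits follow directly from the critical values. In this sketch the chi-square critical values are copied from the solution above and are not computed; the function name variance_ci is hypothetical.

```python
def variance_ci(s2, n, c1, c2):
    """Confidence limits for a normal variance, given chi-square critical values.

    c1 is the lower critical value and c2 the upper one, both with n - 1 df.
    """
    return (n - 1) * s2 / c2, (n - 1) * s2 / c1


s2, n = 1112.3, 61
low_90, high_90 = variance_ci(s2, n, c1=43.1879, c2=79.0819)
low_98, high_98 = variance_ci(s2, n, c1=37.4848, c2=88.3794)
print((low_90, high_90), (low_98, high_98))
```

Dividing by the larger critical value gives the lower limit, which is why C2 appears in the left endpoint.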





(10 Points)

13. We take a second [see (9) above] random sample of potholes in Pittsburgh city streets and Cleveland city streets and measure the depth of each pothole. The measurements for Pittsburgh in centimeters are:
           95  191  119    81  102  111  100  108   98
and the measurements for Cleveland in centimeters are:
          115  171   99   117  130   99  179  107
Assume that the depth of potholes in both cities is normally distributed. Test the null hypothesis that the variances of the two populations are the same against the alternative hypothesis that the variances are not the same (α = .01).

H0: σ₁² = σ₂²
H1: σ₁² ≠ σ₂²

The decision rule for this problem is:

If s₁²/s₂² > K2 or s₁²/s₂² < K1, then Reject H0.

Where P(F_{n-1,m-1} > K2) = α/2 and P(F_{n-1,m-1} < K1) = α/2

K2 = F_{n-1,m-1,α/2} = F_{8,7,.005} = 8.68, and
K1 = 1/F_{m-1,n-1,α/2} = 1/F_{7,8,.005} = 1/7.69 = .130

s₁² = 999.5 and s₂² = 980.125

Test Statistic = 999.5/980.125 = 1.02, so Do Not Reject H0.

To get the P-Value, use the EViews command:

SCALAR PVAL=@FDIST(1.02,8,7)
which produces a value of .496158. Multiply this by 2 to get the result:

P-Value = .992316

For the One Tail Test set K2 = F_{n-1,m-1,α} = F_{8,7,.01} = 6.84, and since
Test Statistic = 1.02 we still Do Not Reject H0.

Note that the P-Value = .496158. This is why I said in class that I prefer this version of the F-Test. But I will accept either version on the final.
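
The sample variances and the two-tailed decision can be reproduced from the data. In this sketch the F critical values 8.68 and 7.69 are copied from the tables used above, and the P-Value is left to EViews because the Python standard library has no F distribution.

```python
from statistics import variance

pittsburgh = [95, 191, 119, 81, 102, 111, 100, 108, 98]
cleveland = [115, 171, 99, 117, 130, 99, 179, 107]

# Tabled critical values for alpha = .01, two-tailed.
K2 = 8.68          # F_{8,7,.005}
K1 = 1 / 7.69      # 1 / F_{7,8,.005}

s1_sq = variance(pittsburgh)
s2_sq = variance(cleveland)
f_stat = s1_sq / s2_sq
reject = f_stat > K2 or f_stat < K1
print(s1_sq, s2_sq, f_stat, reject)
```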




(10 Points)

14. Suppose X, Y have the continuous bivariate distribution:

                           { [4(2 - xy)]/7   0 < x < 1,  0 < y < 1
                  f(x,y) = {
                           { 0               otherwise
Find COV(X,Y).

COV(X,Y) = E(XY) - E(X)E(Y)

E(XY) = ∫₀¹∫₀¹ xy[4(2 - xy)/7] dx dy = (4/7)∫₀¹∫₀¹ (2xy - x²y²) dx dy =
(4/7)∫₀¹ [x²y - (x³y²)/3]|₀¹ dy = (4/7)∫₀¹ [y - y²/3] dy = (4/7)[y²/2 - y³/9]|₀¹ = 2/9

E(X) = ∫₀¹∫₀¹ x[4(2 - xy)/7] dx dy = (4/7)∫₀¹∫₀¹ (2x - x²y) dx dy =
(4/7)∫₀¹ [x² - (x³y)/3]|₀¹ dy = (4/7)∫₀¹ [1 - y/3] dy = (4/7)[y - y²/6]|₀¹ = 10/21

E(Y) = ∫₀¹∫₀¹ y[4(2 - xy)/7] dx dy = (4/7)∫₀¹∫₀¹ (2y - xy²) dx dy =
(4/7)∫₀¹ [2xy - (x²y²)/2]|₀¹ dy = (4/7)∫₀¹ [2y - y²/2] dy = (4/7)[y² - y³/6]|₀¹ = 10/21

COV(X,Y) = 2/9 - (10/21)² = -2/441 = -.0045
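
The three expectations can be verified exactly. This sketch relies on the fact that, over the unit square, the integral of x^a y^b is 1/((a+1)(b+1)), so every moment of this density has a closed form. The helper name moment is hypothetical.

```python
from fractions import Fraction


def moment(a, b):
    """E[X^a Y^b] for f(x,y) = 4(2 - xy)/7 on the unit square.

    Integrating x^a y^b (2 - xy) term by term gives
    2/((a+1)(b+1)) - 1/((a+2)(b+2)).
    """
    return Fraction(4, 7) * (Fraction(2, (a + 1) * (b + 1))
                             - Fraction(1, (a + 2) * (b + 2)))


total_probability = moment(0, 0)   # should be 1 for a valid density
e_xy = moment(1, 1)
e_x = moment(1, 0)
e_y = moment(0, 1)
cov = e_xy - e_x * e_y
print(total_probability, e_xy, e_x, e_y, cov, float(cov))
```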