45-734 Probability and Statistics II (4th Mini AY 1997-98 Flex-Mode and Flex-Time)

Assignment #3: Due 7 April 1998


  1. Cigarette smoking is bad for you! It causes a variety of dreadful diseases and cuts years off your lifespan. One of the wretched byproducts of cigarettes is carbon monoxide. In this first problem we are going to look at the amount of carbon monoxide produced by a cigarette as a function of tar, nicotine, and weight of the cigarette. The data are in:

    Ciggy.wf1

    The following variables are in the file: TAR (measured in milligrams); NICO (nicotine measured in milligrams); WEIGHT (measured in grams); and CARMON (carbon monoxide measured in milligrams). The data are for 25 brands of cigarettes (source: Federal Trade Commission, courtesy of Dennis Epple).

    Regress carbon monoxide on the other variables (you do not have to do any transformations of the variables). Perform F tests of b1 = b2 = 0, b1 = b3 = 0, and b2 = b3 = 0 (that is, all combinations of 2 of the 3 independent variables; use α = .05). Test the null hypothesis that the coefficient of tar is equal to one (use α = .05). In your judgement, how well does this model account for the amount of carbon monoxide? Show the EViews output and your calculations.
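As a rough illustration of the arithmetic behind these tests (not a substitute for the EViews output), the sketch below computes F = [(SSR_r - SSR_u)/q] / [SSR_u/(n - k)] by fitting a restricted and an unrestricted regression. It runs on synthetic data, not on Ciggy.wf1; the helpers `solve`, `ols`, and `f_stat` and the variables `x1`, `x2`, `x3` are hypothetical names introduced here. The restriction "coefficient equals one" is imposed by moving that regressor to the left-hand side.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Return (coefficients, sum of squared residuals) via the normal equations."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    coef = solve(XtX, Xty)
    ssr = sum((yi - sum(cj * xj for cj, xj in zip(coef, row))) ** 2
              for row, yi in zip(X, y))
    return coef, ssr

def f_stat(ssr_r, ssr_u, q, n, k):
    """F statistic for q restrictions; k counts coefficients in the unrestricted model."""
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - k))

# Synthetic data only (assumption): y depends on x1 with slope 2, not on x2 or x3.
rng = random.Random(0)
n = 25
x1 = [rng.uniform(0, 10) for _ in range(n)]
x2 = [rng.uniform(0, 10) for _ in range(n)]
x3 = [rng.uniform(0, 10) for _ in range(n)]
y = [1.0 + 2.0 * v + rng.gauss(0, 1) for v in x1]

# Unrestricted model: intercept plus all three regressors.
X_u = [[1.0, u, v, w] for u, v, w in zip(x1, x2, x3)]
b_u, ssr_u = ols(X_u, y)

# Joint test of b2 = b3 = 0: the restricted model drops x2 and x3.
_, ssr_r = ols([[1.0, u] for u in x1], y)
F_joint = f_stat(ssr_r, ssr_u, q=2, n=n, k=4)

# Test of b1 = 1: impose it by regressing (y - x1) on the remaining regressors.
y_adj = [yi - u for yi, u in zip(y, x1)]
_, ssr_r1 = ols([[1.0, v, w] for v, w in zip(x2, x3)], y_adj)
F_b1 = f_stat(ssr_r1, ssr_u, q=1, n=n, k=4)  # equals the squared t statistic

print("F for b2 = b3 = 0:", F_joint)
print("F for b1 = 1:", F_b1)
```

Each F value would then be compared with the tabulated critical value for (q, n - k) degrees of freedom at the chosen significance level.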

  2. In this problem we are going to look at the number of wildcat oil wells drilled every year in the United States as a function of a variety of variables. The data are in:

    Drill3.wf1

    The variables in the file are: WILDCT2 (number of wells drilled in the U.S. by year); OILCON (U.S. oil consumption measured in number of barrels per day per capita); PCI (U.S. per capita income in thousands of 1982 dollars); PRICEOK (price of a barrel of sweet Oklahoma crude oil in 1982 dollars); VEHICLE (U.S. per capita registration of motor vehicles); and YEAR which is included for your convenience. The data are for 1936 to 1987 and are courtesy of John Londregan.

    1. Regress the number of wildcat wells on the other variables (show your output; you do not have to transform any of the variables). Note that it does not make sense to use the variable YEAR here. Do the t-statistics for the coefficients follow the pattern you expected? That is, do you think the results make sense? Explain your answer. Test whether or not the coefficients for OILCON, PCI, and VEHICLE are all equal to zero (i.e., an F test on the set; use α = .05).

    2. In part 1 the model was linear in the parameters and in the variables. Now estimate the multiplicative model

      WILDCT2 = exp(b0) * (OILCON)^b1 * (PCI)^b2 * (PRICEOK)^b3 * (VEHICLE)^b4 * exp(e)

      where e is the error term. Compare the pattern of the t-statistics on the coefficients for this model with those of part 1 (show your output). Do you think the results make sense? Explain your answer.
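A multiplicative model of this form becomes linear in the parameters once logs are taken of both sides: ln(y) = b0 + b1 ln(x1) + b2 ln(x2) + e. The sketch below illustrates that step on synthetic data with two regressors (not on Drill3.wf1); the helper names `solve` and `ols`, the variables `x1` and `x2`, and the "true" parameter values are assumptions made up for the illustration.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Return OLS coefficients via the normal equations."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Synthetic data only (assumption): y = exp(0.5) * x1^1.2 * x2^(-0.7) * exp(e).
rng = random.Random(1)
n = 50
x1 = [rng.uniform(1, 10) for _ in range(n)]
x2 = [rng.uniform(1, 10) for _ in range(n)]
y = [math.exp(0.5) * u ** 1.2 * v ** -0.7 * math.exp(rng.gauss(0, 0.1))
     for u, v in zip(x1, x2)]

# Take logs of the dependent variable and of every regressor, then run OLS.
X_log = [[1.0, math.log(u), math.log(v)] for u, v in zip(x1, x2)]
y_log = [math.log(v) for v in y]
b0, b1, b2 = ols(X_log, y_log)

print("estimated exponents:", b1, b2)
```

In the log form each slope is an elasticity: the approximate percentage change in y for a one percent change in that regressor, holding the others fixed.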

  3. In the United States it is political folklore that the vote for the presidential candidate of the incumbent president's political party is a function of the economy. It is also political folklore that the vote for the candidates of the incumbent president's political party for seats in the House of Representatives is not a direct function of the economy; rather "local" issues matter more than "national" issues in House elections. We are going to examine this folklore using presidential election data from 1916 to 1988 (courtesy of Howard Rosenthal and John Londregan). The data are in:

    Presvote.wf1

    The variables are HOUSVOTE (percentage of the national vote for the House candidates of the incumbent president's political party); PRSVOTE (percentage of the national vote for the presidential candidate of the incumbent president's political party); GNP (real GNP growth); REPUB (a dummy variable which is 1 if the incumbent president is a Republican and 0 if the incumbent president is a Democrat); and MILMOB (same variable as used in homework #1).

    1. Investigate the proposition that the presidential vote for the incumbent president's political party depends upon the performance of the economy. At some point try lagging the variable GNP and using this and MILMOB in your model, but do not use REPUB. To lag the variable GNP means to use the data from last year's GNP to predict this year's PRSVOTE. This is done by putting GNP(-1) as a right-hand-side variable (to lag two periods use GNP(-2), three periods, GNP(-3), and so on). Show the output of the model that you feel is correct (without using the REPUB variable). Explain your model and discuss the effects of the various variables that you include and exclude. Specifically, compare the effect of GNP to GNP(-1). What does this tell you about voters? Now add the dummy variable REPUB to your model and show the results. What effect does this dummy variable have, and why would we want to include it in the model? (Hint: note that there are only two political parties in the U.S.)

    2. Investigate the proposition that the vote for the candidates of the incumbent president's political party for seats in the House of Representatives is not a direct function of the economy. Specifically, estimate

      HOUSVOTE = b0 + b1(GNP) + b2(PRSVOTE) + e

      Now estimate

      HOUSVOTE = b0 + b1(GNP) + b2(PRSVOTE) + b3(REPUB) + e

      Compare the results of the two regressions (show the results). Do they make sense? (In your answer, assume you know nothing about U.S. history!)
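Two devices appear in problem 3: a lagged regressor, written GNP(-1), and a 0/1 dummy variable. The sketch below shows, on synthetic data rather than Presvote.wf1, how a one-period lag pairs each observation with the previous observation's value (the first observation is lost) and how a dummy enters the regression as an intercept shift. All names (`growth`, `dummy`, `solve`, `ols`) and the parameter values are hypothetical.

```python
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def ols(X, y):
    """Return OLS coefficients via the normal equations."""
    k = len(X[0])
    XtX = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    Xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    return solve(XtX, Xty)

# Synthetic data only (assumption): the outcome depends on LAST period's growth
# and is 3 points higher whenever the dummy equals 1.
rng = random.Random(2)
n = 40
growth = [rng.uniform(-5, 10) for _ in range(n)]
dummy = [t % 2 for t in range(n)]  # alternating 0/1 indicator
vote = [None] + [50.0 + 1.5 * growth[t - 1] + 3.0 * dummy[t] + rng.gauss(0, 0.5)
                 for t in range(1, n)]

# Lagging by one period: observation t uses growth[t - 1], so t = 0 is dropped.
X = [[1.0, growth[t - 1], float(dummy[t])] for t in range(1, n)]
y = vote[1:]
b0, b_lag, b_dummy = ols(X, y)

print("usable observations:", len(X))
print("lag coefficient:", b_lag, "dummy (intercept shift):", b_dummy)
```

The dummy's coefficient is the difference in the fitted intercept between the group coded 1 and the group coded 0, with the slope on the lagged regressor held common to both.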