Problem Set 6

(due April 13)
  1. A random sample of faculty, stratified by gender, in a large American university in 1969 gave the following annual salaries (in thousands of dollars, rounded):
    
        Men    12, 11, 19, 16, 22
        Women   9, 12,  8, 10, 16
    Denote income as Y and gender by the dummy variable X (coded 0 for men and 1 for women).
    1. Graph Y against X.
    2. Estimate by eye the regression line of Y on X.
    3. Estimate the regression of Y on X by least squares. How close was your "eyeball" estimate?
    4. Construct a 95% confidence interval for the coefficient of X, and explain what it means in lay language.
    5. Do you think the answer to part 4 is a measure of how much women's salaries are affected by gender discrimination by the university?
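
One way to check a hand calculation of the least-squares line and the 95% confidence interval in problem 1 is sketched below. It is a minimal sketch, not a required method: the variable names are illustrative, and the critical value 2.306 is the two-sided .05 value of t with 8 degrees of freedom, taken from a standard t-table rather than computed.

```python
# Salaries in thousands of dollars, copied from the problem statement.
men = [12, 11, 19, 16, 22]
women = [9, 12, 8, 10, 16]

y = men + women
x = [0] * len(men) + [1] * len(women)  # dummy variable: 0 = men, 1 = women

n = len(y)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Sums of squares and cross-products about the means.
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# Residual variance with n - 2 degrees of freedom, then the slope's standard error.
sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s2 = sse / (n - 2)
se_slope = (s2 / s_xx) ** 0.5

T_CRIT = 2.306  # assumption: t(.025) with 8 df, read from a standard t-table
ci = (slope - T_CRIT * se_slope, slope + T_CRIT * se_slope)

print("intercept:", intercept)
print("slope:", slope)
print("95% CI for slope:", ci)
```

With a single 0/1 predictor, the intercept is the mean for the group coded 0 and the slope is the difference between the two group means, which gives a quick sanity check on the output.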


  2. The following is the result of a test of gas consumption on a sample of 6 cars manufactured in the early 1980s:
                 miles per   engine 
                  gallon    horsepower
                     Y         X1
        Make A       21       210
                     18       240
                     15       310
        Make B       20       220
                     18       260
                     15       320
    1. Code X2 as 0 or 1, depending on whether the car is Make A or Make B.
      1. Estimate the multiple regression including both horsepower and manufacturer.
      2. Graph the data and the separate lines for each manufacturer.
    2. Add a "slope" dummy to the regression equation.
      1. Estimate the new multiple regression equation.
      2. Add the two new lines to the plot.
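
The coding in problem 2 can be sketched as follows. This is a minimal sketch under stated assumptions: X2 is taken to be 1 for Make B (the problem leaves the direction of the coding open), and `fit_line` is a hypothetical helper, not part of any library. It relies on the fact that a model with an intercept dummy and a slope dummy reproduces the two lines obtained by fitting each make separately, so only simple regressions are needed here. The additive model in part 1 (common slope) still requires a multiple-regression routine.

```python
# (make, miles per gallon Y, horsepower X1), copied from the problem's table.
cars = [
    ("A", 21, 210), ("A", 18, 240), ("A", 15, 310),
    ("B", 20, 220), ("B", 18, 260), ("B", 15, 320),
]

def fit_line(points):
    """Simple least-squares fit; returns (intercept, slope) for (x, y) pairs."""
    n = len(points)
    x_bar = sum(x for x, _ in points) / n
    y_bar = sum(y for _, y in points) / n
    s_xx = sum((x - x_bar) ** 2 for x, _ in points)
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in points)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope

# Design rows for the slope-dummy model: [1, X1, X2, X1*X2].
# Assumption: X2 = 0 for Make A and 1 for Make B.
rows = [[1, hp, int(make == "B"), hp * int(make == "B")] for make, _, hp in cars]

# Separate lines for each make.
a0, a1 = fit_line([(hp, mpg) for make, mpg, hp in cars if make == "A"])
b0, b1 = fit_line([(hp, mpg) for make, mpg, hp in cars if make == "B"])

# Coefficients of Y-hat = c0 + c1*X1 + c2*X2 + c3*X1*X2 implied by the two lines.
coef = {"const": a0, "X1": a1, "X2": b0 - a0, "X1*X2": b1 - a1}

for name, value in coef.items():
    print(name, round(value, 4))
```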


  3. In a drug experiment, data were collected on the efficacy of three different drugs, represented by two dummy variables (DA coded 1 for drug A and 0 otherwise; and DB coded 1 for drug B and 0 otherwise), controlling for gender (M, coded 1 for male and 0 for female).

    1. Presume the following regression result:

      Y-hat = 5 + 15M + 30DA + 20DB

      Fill in the following Table:

      
                           Gender
                 Drug   Male    Female
      
                  A    _____     _____
      	
                  B    _____     _____
      
                  C    _____     _____


    2. Now presume the following alternate regression result:

      Y-hat = 5 + 15M + 30DA + 20DB + 5MDA - 10MDB

      Fill in the table above again based on this equation.

    3. Make the correct choice in each bracket:
      In the [additive, interactive] model, the improvement of males over females is the same for all drugs. Then it is equally true that the improvement of drug A (or drug B) over drug C is the same for [both genders, all treatments].
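
Filling in the tables in problem 3 amounts to plugging each combination of dummy values into the fitted equation. A minimal sketch follows; the function and variable names are illustrative, and drug C is treated as the category with both drug dummies equal to 0, as the coding in the problem implies.

```python
def predict(m, da, db, interactive=False):
    """Fitted Y for gender dummy m and drug dummies da, db."""
    y_hat = 5 + 15 * m + 30 * da + 20 * db
    if interactive:
        # Extra terms from the second equation: + 5*M*DA - 10*M*DB.
        y_hat += 5 * m * da - 10 * m * db
    return y_hat

# Drug C is the reference category: both drug dummies are 0.
DRUGS = {"A": (1, 0), "B": (0, 1), "C": (0, 0)}

def table(interactive=False):
    """Map each drug to its (male, female) fitted values."""
    return {
        drug: (predict(1, da, db, interactive), predict(0, da, db, interactive))
        for drug, (da, db) in DRUGS.items()
    }

for label, flag in (("additive", False), ("interactive", True)):
    print(label)
    for drug, (male, female) in table(flag).items():
        print(f"  {drug}: male={male}, female={female}")
```

Comparing the male-minus-female difference across the rows of each table is a direct way to see which model holds that difference constant.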

  4. In a study based on more than 10,000 students in 1966, the probability P that a student will live on campus was found by MLE to be:
                  P
            log ----- = 1.47R - .05F + other variables
                 1-P
                        (.16)   (.11)
    
            where
                R = 1 if the student would prefer to live on campus
                    even if money were not a problem, and 0 otherwise
                F = 1 if the student is female, and 0 if male
            and the figures in parentheses are standard errors.

    In a certain situation (e.g., a male student who prefers to live on campus), the probability of the student living on campus is .60.

    1. Other things remaining equal, estimate the probability of the student living on campus if the student were female instead of male.
    2. Is the difference statistically discernible (significant) at the .05 level?
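
The arithmetic behind problem 4 can be sketched as follows: convert the given probability to log-odds, shift by the coefficient on F, and convert back. This is a minimal sketch with illustrative variable names. It assumes that a z-ratio compared against 1.96 is an adequate test at the .05 level, which seems reasonable for an MLE fit on more than 10,000 students but is an assumption, not something the problem states.

```python
import math

p_male = 0.60     # given probability for the male student
b_female = -0.05  # coefficient on F from the fitted equation
se_female = 0.11  # its standard error

# Log-odds for the male student, then shifted by the coefficient on F.
logit_male = math.log(p_male / (1 - p_male))
logit_female = logit_male + b_female

# Back from log-odds to a probability.
p_female = 1 / (1 + math.exp(-logit_female))

# Ratio of the coefficient to its standard error.
z = b_female / se_female
Z_CRIT = 1.96  # assumption: two-sided .05 critical value of the standard normal
discernible = abs(z) > Z_CRIT

print("P(female):", round(p_female, 3))
print("z:", round(z, 2), "discernible at .05:", discernible)
```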


  5. Using the data that will be the basis of your term paper:

    1. Carry out a regression analysis involving one interval predictor and one nominal predictor set up as an appropriate dummy variable.
      1. Write out the equation representing the regression model.
      2. Graph the regression lines.
      3. .
      4. Interpret the results.

    2. Add interaction terms between the interval variable and the nominal variable.
      1. Write out the equation representing the regression model.
      2. Graph the regression lines.
      3. .
      4. Interpret the results.

Bert Kritzer, 608-263-2277, Kritzer@PoliSci.Wisc.Edu
Last modified, April 9, 2004