

Session 3: Quantitative Methods for Valuation
Reading 12: Multiple Regression and Issues in Regression Analysis

LOS a: Formulate a multiple regression equation to describe the relation between a dependent variable and several independent variables, determine the statistical significance of each independent variable, and interpret the estimated coefficients and their p-values.

 

 

Which of the following statements regarding the results of a regression analysis is least accurate? The:

A)
slope coefficient in a multiple regression is the change in the dependent variable for a one-unit change in the independent variable, holding all other variables constant.
B)
slope coefficient in a multiple regression is the value of the dependent variable for a given value of the independent variable.
C)
slope coefficients in the multiple regression are referred to as partial betas.


 

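The slope coefficient is the change in the dependent variable for a one-unit change in the independent variable, holding all other variables constant. It is not the value of the dependent variable for a given value of the independent variable, so statement B is least accurate.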

William Brent, CFA, is the chief financial officer for Mega Flowers, one of the largest producers of flowers and bedding plants in the Western United States. Mega Flowers grows its plants in three large nursery facilities located in California. Its products are sold in its company-owned retail nurseries as well as in large, home and garden “super centers”. For its retail stores, Mega Flowers has designed and implemented marketing plans each season that are aimed at its consumers in order to generate additional sales for certain high-margin products. To fully implement the marketing plan, additional contract salespeople are seasonally employed. 

For the past several years, these marketing plans seemed to be successful, providing a significant boost in sales to those specific products highlighted by the marketing efforts. However, for the past year, revenues have been flat, even though marketing expenditures increased slightly. Brent is concerned that the expensive seasonal marketing campaigns are simply no longer generating the desired returns, and should either be significantly modified or eliminated altogether. He proposes that the company hire additional, permanent salespeople to focus on selling Mega Flowers’ high-margin products all year long. The chief operating officer, David Johnson, disagrees with Brent. He believes that although last year’s results were disappointing, the marketing campaign has demonstrated impressive results for the past five years, and should be continued. His belief is that the prior years’ performance can be used as a gauge for future results, and that a simple increase in the sales force will not bring about the desired results. 

Brent gathers information regarding quarterly sales revenue and marketing expenditures for the past five years. Based upon historical data, Brent derives the following regression equation for Mega Flowers (stated in millions of dollars):

Expected Sales = 12.6 + 1.6 (Marketing Expenditures) + 1.2 (# of Salespeople)

Brent shows the equation to Johnson and tells him, “This equation shows that a $1 million increase in marketing expenditures will increase the independent variable by $1.6 million, all other factors being equal.” Johnson replies, “It also appears that sales will equal $12.6 million if all independent variables are equal to zero.”

In regard to their conversation about the regression equation:

A)
Brent’s statement is correct; Johnson’s statement is correct.
B)
Brent’s statement is incorrect; Johnson’s statement is correct.
C)
Brent’s statement is correct; Johnson’s statement is incorrect.


Expected sales is the dependent variable in the equation, while expenditures for marketing and salespeople are the independent variables. Therefore, a $1 million increase in marketing expenditures will increase the dependent variable (expected sales) by $1.6 million. Brent’s statement is incorrect.

Johnson’s statement is correct. 12.6 is the intercept in the equation, which means that if all independent variables are equal to zero, expected sales will be $12.6 million. (Study Session 3, LOS 12.a)


Using data from the past 20 quarters, Brent calculates the t-statistic for marketing expenditures to be 3.68 and the t-statistic for salespeople at 2.19. At a 5% significance level, the two-tailed critical values are tc = +/- 2.127. This most likely indicates that:

A)
the t-statistic has 18 degrees of freedom.
B)
both independent variables are statistically significant.
C)
the null hypothesis should not be rejected.


Using a 5% significance level with degrees of freedom (df) of 17 (20-2-1), both independent variables are significant and contribute to the level of expected sales. (Study Session 3, LOS 12.a)
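The degrees-of-freedom count and the two-tailed comparison above can be sketched in a few lines. This is a minimal illustration using the figures quoted in the question; the function names are hypothetical and chosen only for this example.

```python
# Figures come from the question; helper names are illustrative assumptions.

def residual_df(n_obs: int, n_predictors: int) -> int:
    """Degrees of freedom for coefficient t-tests: n - k - 1."""
    return n_obs - n_predictors - 1


def is_significant(t_stat: float, t_critical: float) -> bool:
    """Two-tailed test: reject H0 (coefficient = 0) when |t| exceeds the critical value."""
    return abs(t_stat) > t_critical


df = residual_df(n_obs=20, n_predictors=2)
t_stats = {"marketing": 3.68, "salespeople": 2.19}
# 2.127 is the critical value stated in the question.
results = {name: is_significant(t, 2.127) for name, t in t_stats.items()}
print(df, results)
```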


Brent calculated that the sum of squared errors (SSE) for the variables is 267. The mean squared error (MSE) would be:

A)
15.706.
B)
14.831.
C)
14.055.


The MSE is calculated as SSE / (n – k – 1). Recall that there are twenty observations and two independent variables. Therefore, the MSE in this instance [267 / (20 – 2 - 1)] = 15.706. (Study Session 3, LOS 11.i)
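As a quick check of the arithmetic, the sketch below computes the MSE from the SSE and also takes its square root, which is the standard error of estimate (SEE) discussed in the next question. The variable names are assumptions made for this example.

```python
import math

# Inputs from the question: 20 quarterly observations, 2 independent variables.
sse = 267.0
n_obs = 20
n_predictors = 2

# MSE = SSE / (n - k - 1)
mse = sse / (n_obs - n_predictors - 1)

# SEE is the square root of the MSE.
see = math.sqrt(mse)

print(round(mse, 3), round(see, 3))
```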


Brent is trying to explain the concept of the standard error of estimate (SEE) to Johnson. In his explanation, Brent makes three points about the SEE:
  • Point 1: The SEE is the standard deviation of the differences between the estimated values for the independent variables and the actual observations for the independent variable.
  • Point 2: Any violation of the basic assumptions of a multiple regression model is going to affect the SEE.
  • Point 3: If there is a strong relationship between the variables and the SSE is small, the individual estimation errors will also be small.

How many of Brent’s points are most accurate?

A)
2 of Brent’s points are correct.
B)
1 of Brent’s points is correct.
C)
All 3 of Brent’s points are correct.


The statements that if there is a strong relationship between the variables and the SSE is small, the individual estimation errors will also be small, and also that any violation of the basic assumptions of a multiple regression model is going to affect the SEE are both correct.

The SEE is the standard deviation of the differences between the estimated values for the dependent variables (not independent) and the actual observations for the dependent variable. Brent’s Point 1 is incorrect.

Therefore, 2 of Brent’s points are correct. (Study Session 3, LOS 11.f)


Assuming that next year’s marketing expenditures are $3,500,000 and there are five salespeople, predicted sales for Mega Flowers will be:

A)
$11,600,000.
B)
$2,400,000.
C)
$24,200,000.


Using the regression equation from above, expected sales equals 12.6 + (1.6 x 3.5) + (1.2 x 5) = $24.2 million. Remember to check the details – i.e. this equation is denominated in millions of dollars. (Study Session 3, LOS 12.c)
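The unit trap in this question is easy to reproduce in code. The following sketch assumes, as the question states, that the dollar amounts in the equation are in millions while the number of salespeople is a plain count; the function name is hypothetical.

```python
def predict_sales_musd(marketing_musd: float, salespeople: int) -> float:
    """Expected sales in $ millions from Brent's regression equation."""
    return 12.6 + 1.6 * marketing_musd + 1.2 * salespeople


marketing_usd = 3_500_000
# Convert dollars to millions before plugging into the equation.
predicted_musd = predict_sales_musd(marketing_usd / 1_000_000, salespeople=5)
predicted_usd = predicted_musd * 1_000_000
print(f"${predicted_usd:,.0f}")
```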


Brent would like to further investigate whether at least one of the independent variables can explain a significant portion of the variation of the dependent variable. Which of the following methods would be best for Brent to use?

A)
The multiple coefficient of determination.
B)
An ANOVA table.
C)
The F-statistic.


To determine whether at least one of the coefficients is statistically significant, the calculated F-statistic is compared with the critical F-value at the appropriate level of significance. (Study Session 3, LOS 12.e)


Consider the following estimated regression equation, with the standard errors of the slope coefficients as noted:

Sales_i = 10.0 + 1.25 R&D_i + 1.0 ADV_i – 2.0 COMP_i + 8.0 CAP_i

where the standard error for the estimated coefficient on R&D is 0.45, the standard error for the estimated coefficient on ADV is 2.2, the standard error for the estimated coefficient on COMP is 0.63, and the standard error for the estimated coefficient on CAP is 2.5.

The equation was estimated over 40 companies. Using a 5% level of significance, which of the estimated coefficients are significantly different from zero?

A)
R&D, ADV, COMP, and CAP.
B)
R&D, COMP, and CAP only.
C)
ADV and CAP only.


The critical t-values for 40-4-1 = 35 degrees of freedom and a 5% level of significance are ± 2.03.

The calculated t-values are:
t for R&D = 1.25 / 0.45 = 2.778
t for ADV = 1.0/ 2.2 = 0.455
t for COMP = -2.0 / 0.63 = -3.175
t for CAP = 8.0 / 2.5 = 3.2
Therefore, R&D, COMP, and CAP are statistically significant.
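The same t-value calculation can be written as a short script. This is only a sketch using the coefficients and standard errors from the question; the critical value of 2.03 is the one quoted in the explanation rather than computed here.

```python
# (coefficient, standard error) pairs taken from the question.
estimates = {
    "R&D": (1.25, 0.45),
    "ADV": (1.0, 2.2),
    "COMP": (-2.0, 0.63),
    "CAP": (8.0, 2.5),
}
T_CRITICAL = 2.03  # two-tailed, 5% level, 35 df (as stated in the explanation)

# t = estimated coefficient / standard error (testing H0: coefficient = 0).
t_values = {name: coef / se for name, (coef, se) in estimates.items()}
significant = [name for name, t in t_values.items() if abs(t) > T_CRITICAL]

for name, t in t_values.items():
    print(f"{name}: t = {t:.3f}")
print("Significant:", significant)
```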



Consider the following regression equation:

Sales_i = 10.0 + 1.25 R&D_i + 1.0 ADV_i – 2.0 COMP_i + 8.0 CAP_i
where Sales is dollar sales in millions, R&D is research and development expenditures in millions, ADV is dollar amount spent on advertising in millions, COMP is the number of competitors in the industry, and CAP is the capital expenditures for the period in millions of dollars. 

Which of the following is NOT a correct interpretation of this regression information?

A)
If R&D and advertising expenditures are $1 million each, there are 5 competitors, and capital expenditures are $2 million, expected Sales are $8.25 million.
B)
If a company spends $1 million more on capital expenditures (holding everything else constant), Sales are expected to increase by $8.0 million.
C)
One more competitor will mean $2 million less in Sales (holding everything else constant).


Predicted sales = 10 + 1.25(1) + 1.0(1) – 2.0(5) + 8.0(2) = $18.25 million, not $8.25 million.
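To see both the prediction and the "holding everything else constant" interpretation of the slope coefficients, one can evaluate the equation at a base case and then change one input at a time. The dictionary layout and function name below are assumptions made for this sketch.

```python
INTERCEPT = 10.0
COEFS = {"R&D": 1.25, "ADV": 1.0, "COMP": -2.0, "CAP": 8.0}


def predict(inputs: dict) -> float:
    """Predicted sales in $ millions for the given input values."""
    return INTERCEPT + sum(COEFS[name] * value for name, value in inputs.items())


# Base case from answer choice A: dollar amounts in millions, COMP is a count.
base = {"R&D": 1.0, "ADV": 1.0, "COMP": 5, "CAP": 2.0}
base_sales = predict(base)

# Change one variable by one unit while holding the others constant.
capex_effect = predict({**base, "CAP": base["CAP"] + 1.0}) - base_sales
competitor_effect = predict({**base, "COMP": base["COMP"] + 1}) - base_sales

print(base_sales, capex_effect, competitor_effect)
```

Each one-unit difference recovers the corresponding slope coefficient, which is why the coefficients are read as partial effects.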


Consider the following regression equation:

Sales_i = 20.5 + 1.5 R&D_i + 2.5 ADV_i – 3.0 COMP_i

where Sales is dollar sales in millions, R&D is research and development expenditures in millions, ADV is dollar amount spent on advertising in millions, and COMP is the number of competitors in the industry.

Which of the following is NOT a correct interpretation of this regression information?

A)
If a company spends $1 more on R&D (holding everything else constant), sales are expected to increase by $1.5 million.
B)
One more competitor will mean $3 million less in sales (holding everything else constant).
C)
If R&D and advertising expenditures are $1 million each and there are 5 competitors, expected sales are $9.5 million.


If a company spends $1 million more on R&D (holding everything else constant), sales are expected to increase by $1.5 million. Always be aware of the units of measure for the different variables.


Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units. Hilton gathers data for the last 20 years. Which of the following regression equations correctly represents Hilton’s hypothesis?

A)
SALES = α x β1 POP x β2 INCOME x β3 ADV x ε.
B)
INCOME = α + β1 POP + β2 SALES + β3 ADV + ε.
C)
SALES = α + β1 POP + β2 INCOME + β3 ADV + ε.


SALES is the dependent variable. POP, INCOME, and ADV should be the independent variables (on the right hand side) of the equation (in any order). Regression equations are additive.


 

Henry Hilton, CFA, is undertaking an analysis of the bicycle industry.  He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV).  All data are measured in millions of units.  Hilton gathers data for the last 20 years and estimates the following equation (standard errors in parentheses): 

SALES = α + 0.004 POP + 1.031 INCOME + 2.002 ADV
            (0.005)     (0.337)        (2.312)

 

The critical t-statistic for a 95% confidence level is 2.120.  Which of the independent variables is statistically different from zero at the 95% confidence level?

A)
INCOME only.
B)
ADV only.
C)
INCOME and ADV.


The calculated test statistic is coefficient/standard error. Hence, the t-stats are 0.8 for POP, 3.059 for INCOME, and 0.866 for ADV. Since the t-stat for INCOME is the only one greater than the critical t-value of 2.120, only INCOME is significantly different from zero.
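An equivalent way to reach the same conclusion is to build a 95% confidence interval around each coefficient and check whether it contains zero. This is not the method used in the explanation above, just an alternative sketch; the helper name is hypothetical and the critical value of 2.120 is the one given in the question.

```python
T_CRITICAL = 2.120  # critical t-value given in the question

# (coefficient, standard error) pairs from the estimated equation.
estimates = {
    "POP": (0.004, 0.005),
    "INCOME": (1.031, 0.337),
    "ADV": (2.002, 2.312),
}


def confidence_interval(coef: float, se: float, t_crit: float) -> tuple:
    """Interval: coefficient +/- (critical t x standard error)."""
    half_width = t_crit * se
    return coef - half_width, coef + half_width


intervals = {
    name: confidence_interval(coef, se, T_CRITICAL)
    for name, (coef, se) in estimates.items()
}
# A coefficient is significant when its interval excludes zero.
significant = [name for name, (lo, hi) in intervals.items() if lo > 0 or hi < 0]
print(intervals, significant)
```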


 

Henry Hilton, CFA, is undertaking an analysis of the bicycle industry.  He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV).  All data are measured in millions of units.  Hilton gathers data for the last 20 years and estimates the following equation (standard errors in parentheses):

SALES = 0.000  +  0.004 POP + 1.031 INCOME + 2.002 ADV
       (0.113)   (0.005)     (0.337)        (2.312)

 

For next year, Hilton estimates the following parameters: (1) the population under 20 will be 120 million, (2) disposable income will be $300,000,000, and (3) advertising expenditures will be $100,000,000.  Based on these estimates and the regression equation, what are predicted sales for the industry for next year?

A)
$557,143,000.
B)
$509,980,000.
C)
$656,991,000.


Predicted sales for next year are:

SALES = 0.000 + 0.004 (120) + 1.031 (300) + 2.002 (100) = 509.98. Because all data are measured in millions, predicted sales are $509,980,000.


In a recent analysis of salaries (in $1,000) of financial analysts, a regression of salaries on education, experience, and gender is run. Gender equals one for men and zero for women. The regression results from a sample of 230 financial analysts are presented below, with t-statistics in parentheses.

Salaries = 34.98 + 1.2 Education + 0.5 Experience + 6.3 Gender

                (29.11)          (8.93)                (2.98)                (1.58)

What is the expected salary (in $1,000) of a woman with 16 years of education and 10 years of experience?

A)
54.98.
B)
59.18.
C)
65.48.


34.98 + 1.2(16) + 0.5(10) = 59.18
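The Gender term drops out of the calculation because the dummy variable is zero for women. A small sketch makes the role of the dummy explicit; the function and parameter names are assumptions for illustration.

```python
def expected_salary(education_years: float, experience_years: float, is_male: bool) -> float:
    """Expected salary in $1,000 from the estimated regression."""
    gender = 1 if is_male else 0  # dummy variable: 1 for men, 0 for women
    return 34.98 + 1.2 * education_years + 0.5 * experience_years + 6.3 * gender


woman = expected_salary(16, 10, is_male=False)
man = expected_salary(16, 10, is_male=True)
# The dummy coefficient is the estimated gap, holding education and experience constant.
gap = man - woman
print(round(woman, 2), round(man, 2), round(gap, 2))
```

Whether that estimated gap is statistically significant is a separate question, addressed by the t-statistic on the Gender coefficient.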


Holding everything else constant, do men get paid more than women? Use a 5% level of significance. No, since the t-value:

A)
does not exceed the critical value of 1.96.
B)
exceeds the critical value of 1.96.
C)
does not exceed the critical value of 1.65.


H0: b_gender ≤ 0
Ha: b_gender > 0

t-value of 1.58 < 1.65 (critical value)
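The choice between 1.65 and 1.96 comes down to whether the test is one-tailed or two-tailed. The sketch below uses the standard normal distribution as a large-sample approximation to the t-distribution, which is an assumption made here because the sample has 230 observations; the exact t critical values would be slightly larger.

```python
from statistics import NormalDist

ALPHA = 0.05
T_GENDER = 1.58  # t-statistic on the Gender coefficient, from the regression output

# One-tailed test (Ha: b_gender > 0) puts all 5% in the upper tail.
z_one_tailed = NormalDist().inv_cdf(1 - ALPHA)
# A two-tailed test would split the 5% across both tails.
z_two_tailed = NormalDist().inv_cdf(1 - ALPHA / 2)

reject_h0 = T_GENDER > z_one_tailed
print(round(z_one_tailed, 3), round(z_two_tailed, 3), reject_h0)
```

Since the hypothesis is directional (men paid more than women), the one-tailed critical value is the relevant one, and 1.58 falls short of it.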


Werner Baltz, CFA, has regressed 30 years of data to forecast future sales for National Motor Company based on the percent change in gross domestic product (GDP) and the change in price of a U.S. gallon of fuel at retail. The results are presented below. Note: results must be multiplied by $1,000,000:

Coefficient Estimates

Predictor     Coefficient     Standard Error of the Coefficient
Intercept     78              13.710
Δ1 GDP        30.22           12.120
Δ2 $ Fuel     −412.39         183.981

Analysis of Variance Table (ANOVA)

Source        Degrees of Freedom     Sum of Squares     Mean Square
Regression                           291.30             145.65
Error         27                     132.12
Total         29                     423.42

 

In 2002, if GDP rises 2.2% and the price of fuel falls $0.15, Baltz’s model will predict Company sales in 2002 to be (in $ millions) closest to:

A)
$128.
B)
$82.
C)
$206.


Sales will be closest to $78 + ($30.22 × 2.2) + [(−412.39) × (−$0.15)] = $206.34 million.


Baltz proceeds to test the hypothesis that none of the independent variables has significant explanatory power. He concludes that, at a 5% level of significance:

A)
at least one of the independent variables has explanatory power, because the calculated F-statistic exceeds its critical value.
B)
none of the independent variables has explanatory power, because the calculated F-statistic does not exceed its critical value.
C)
all of the independent variables have explanatory power, because the calculated F-statistic exceeds its critical value.


From the ANOVA table, the calculated F-statistic is (mean square regression / mean square error) = 145.65 / 4.89 = 29.7853. From the F distribution table (2 df numerator, 27 df denominator) the F-critical value may be interpolated to be 3.36. Because 29.7853 is greater than 3.36, Baltz rejects the null hypothesis and concludes that at least one of the independent variables has explanatory power.
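The F-statistic can be rebuilt directly from the sums of squares in the ANOVA table. This is a sketch with assumed variable names; it uses the unrounded mean square error, so the result differs slightly from the 29.7853 above, which was computed with MSE rounded to 4.89. The critical value of 3.36 is the interpolated figure quoted in the explanation.

```python
# Values from the ANOVA table and the question (30 years of data, 2 predictors).
ss_regression = 291.30
ss_error = 132.12
n_obs = 30
n_predictors = 2

df_regression = n_predictors              # numerator degrees of freedom
df_error = n_obs - n_predictors - 1       # denominator degrees of freedom

ms_regression = ss_regression / df_regression
ms_error = ss_error / df_error
f_stat = ms_regression / ms_error

F_CRITICAL = 3.36  # 5% level, (2, 27) df, as quoted in the explanation
reject_h0 = f_stat > F_CRITICAL
print(df_error, round(ms_error, 4), round(f_stat, 3), reject_h0)
```

Rejecting the null only says that at least one slope coefficient is non-zero; it does not say which one, which is why the individual t-tests follow.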


Baltz then tests the individual variables, at a 5% level of significance, to determine whether sales are explained by individual changes in GDP and fuel prices. Baltz concludes that:

A)
both GDP and fuel price changes explain changes in sales.
B)
neither GDP nor fuel price changes explain changes in sales.
C)
only GDP changes explain changes in sales.


From the coefficient estimates table, the calculated t-statistics are (30.22 / 12.12) = 2.49 for GDP and (−412.39 / 183.981) = −2.24 for fuel prices. Both values lie outside the critical t-values of ±2.052 at 27 degrees of freedom. Therefore, Baltz is able to reject the null hypothesis that these coefficients are equal to zero, and concludes that each variable is important in explaining sales.

