Which of the following statements regarding the results of a regression analysis is FALSE? The:
The slope coefficient is the change in the dependent variable for a one-unit change in the independent variable.
William Brent, CFA, is the chief financial officer for Mega Flowers, one of the largest producers of flowers and bedding plants in the Western United States. Mega Flowers grows its plants in three large nursery facilities located in California. Its products are sold in its company-owned retail nurseries as well as in large, home and garden “super centers”. For its retail stores, Mega Flowers has designed and implemented marketing plans each season that are aimed at its consumers in order to generate additional sales for certain high-margin products. To fully implement the marketing plan, additional contract salespeople are seasonally employed.
For the past several years, these marketing plans seemed to be successful, providing a significant boost in sales to those specific products highlighted by the marketing efforts. However, for the past year, revenues have been flat, even though marketing expenditures increased slightly. Brent is concerned that the expensive seasonal marketing campaigns are simply no longer generating the desired returns, and should either be significantly modified or eliminated altogether. He proposes that the company hire additional, permanent salespeople to focus on selling Mega Flowers’ high-margin products all year long. The chief operating officer, David Johnson, disagrees with Brent. He believes that although last year’s results were disappointing, the marketing campaign has demonstrated impressive results for the past five years, and should be continued. His belief is that the prior years’ performance can be used as a gauge for future results, and that a simple increase in the sales force will not bring about the desired results.
Brent gathers information regarding quarterly sales revenue and marketing expenditures for the past five years. Based upon historical data, Brent derives the following regression equation for Mega Flowers (stated in millions of dollars):
Expected Sales = 12.6 + 1.6 (Marketing Expenditures) + 1.2 (# of Salespeople)
Brent shows the equation to Johnson and tells him, “This equation shows that a $1 million increase in marketing expenditures will increase the independent variable by $1.6 million, all other factors being equal.” Johnson replies, “It also appears that sales will equal $12.6 million if all independent variables are equal to zero.”
In regard to their conversation about the regression equation:
Expected sales is the dependent variable in the equation, while expenditures for marketing and salespeople are the independent variables. Therefore, a $1 million increase in marketing expenditures will increase the dependent variable (expected sales) by $1.6 million. Brent’s statement is incorrect.
Using a 5% significance level with degrees of freedom (df) of 17 (20-2-1), both independent variables are significant and contribute to the level of expected sales. (Study Session 3, LOS 12.a)
The MSE is calculated as SSE / (n – k – 1). Recall that there are twenty observations and two independent variables. Therefore, the MSE in this instance [267 / (20 – 2 - 1)] = 15.706. (Study Session 3, LOS 11.i)
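The MSE arithmetic above can be checked with a minimal sketch. The helper name `mean_squared_error` is hypothetical (it is not from the source); the inputs SSE = 267, n = 20 and k = 2 are the ones quoted in the explanation. The square root of the MSE is the standard error of the estimate (SEE).

```python
def mean_squared_error(sse: float, n: int, k: int) -> float:
    """Return MSE = SSE / (n - k - 1) for a regression with k independent variables."""
    degrees_of_freedom = n - k - 1
    if degrees_of_freedom <= 0:
        raise ValueError("need more observations than estimated parameters")
    return sse / degrees_of_freedom


# Values quoted in the explanation: SSE = 267, 20 observations, 2 independent variables.
mse = mean_squared_error(sse=267.0, n=20, k=2)
see = mse ** 0.5  # standard error of the estimate = sqrt(MSE)

print(round(mse, 3))  # 15.706
print(round(see, 3))  # 3.963
```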
How many of Brent’s points are most accurate?
Two of Brent’s statements are correct: first, that if there is a strong relationship between the variables and the SSE is small, the individual estimation errors will also be small; second, that any violation of the basic assumptions of a multiple regression model is going to affect the SEE. The SEE is the standard deviation of the differences between the estimated values for the dependent variable (not the independent variables) and the actual observations for the dependent variable, so Brent’s Point 1 is incorrect. Therefore, 2 of Brent’s points are correct. (Study Session 3, LOS 11.f)
Using the regression equation from above, expected sales equals 12.6 + (1.6 x 3.5) + (1.2 x 5) = $24.2 million. Remember to check the details – i.e. this equation is denominated in millions of dollars. (Study Session 3, LOS 12.c)
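As a rough check of the forecast above, the Mega Flowers equation can be written as a small function. The function name `predict_sales` is an illustrative assumption; the coefficients come from the regression equation, and the inputs (3.5 for marketing expenditures in $ millions, 5 salespeople) are the values used in the explanation.

```python
def predict_sales(marketing: float, salespeople: float) -> float:
    """Expected sales in $ millions from the Mega Flowers regression equation."""
    intercept = 12.6
    b_marketing = 1.6    # $ millions of sales per $1 million of marketing
    b_salespeople = 1.2  # $ millions of sales per additional salesperson
    return intercept + b_marketing * marketing + b_salespeople * salespeople


forecast = predict_sales(marketing=3.5, salespeople=5)
print(f"{forecast:.1f}")  # 24.2

# With every independent variable at zero, the forecast is the intercept.
print(predict_sales(marketing=0, salespeople=0))  # 12.6
```

Raising `marketing` by 1 while holding `salespeople` fixed changes the result by the slope coefficient, 1.6, which is the interpretation discussed in the Brent and Johnson conversation.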
To determine whether at least one of the coefficients is statistically significant, the calculated F-statistic is compared with the critical F-value at the appropriate level of significance. (Study Session 3, LOS 12.e)
Consider the following estimated regression equation, with the standard errors of the slope coefficients as noted:
Sales_i = 10.0 + 1.25 R&D_i + 1.0 ADV_i – 2.0 COMP_i + 8.0 CAP_i
where the standard error for the estimated coefficient on R&D is 0.45, the standard error for the estimated coefficient on ADV is 2.2 , the standard error for the estimated coefficient on COMP is 0.63, and the standard error for the estimated coefficient on CAP is 2.5.
The equation was estimated over 40 companies. Using a 5% level of significance, which of the estimated coefficients are significantly different from zero?
The critical t-values for 40-4-1 = 35 degrees of freedom and a 5% level of significance are ± 2.03.
The calculated t-values are:
t for R&D = 1.25 / 0.45 = 2.778
t for ADV = 1.0/ 2.2 = 0.455
t for COMP = -2.0 / 0.63 = -3.175
t for CAP = 8.0 / 2.5 = 3.2
Therefore, R&D, COMP, and CAP are statistically significant.
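The significance screen above can be sketched in a few lines. The coefficients, standard errors, and the critical value of 2.03 are taken from the text; the variable names are illustrative only.

```python
CRITICAL_T = 2.03  # two-tailed, 5% level, 40 - 4 - 1 = 35 degrees of freedom (from the text)

# (estimated coefficient, standard error) for each slope
estimates = {
    "R&D": (1.25, 0.45),
    "ADV": (1.0, 2.2),
    "COMP": (-2.0, 0.63),
    "CAP": (8.0, 2.5),
}

# t = (coefficient - 0) / standard error, testing H0: coefficient = 0
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

# Two-tailed test: compare the absolute value with the critical value.
significant = [name for name, t in t_stats.items() if abs(t) > CRITICAL_T]

for name, t in t_stats.items():
    print(f"{name}: t = {t:.3f}")
print(significant)  # ['R&D', 'COMP', 'CAP']
```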
Consider the following regression equation:
Sales_i = 10.0 + 1.25 R&D_i + 1.0 ADV_i – 2.0 COMP_i + 8.0 CAP_i
where Sales is dollar sales in millions, R&D is research and development expenditures in millions, ADV is dollar amount spent on advertising in millions, COMP is the number of competitors in the industry, and CAP is the capital expenditures for the period in millions of dollars.
Which of the following is NOT a correct interpretation of this regression information?
Predicted sales = $10 + 1.25 + 1 – 10 + 16 = $18.25 million.
Consider the following regression equation:
Sales_i = 20.5 + 1.5 R&D_i + 2.5 ADV_i – 3.0 COMP_i
where Sales is dollar sales in millions, R&D is research and development expenditures in millions, ADV is dollar amount spent on advertising in millions, and COMP is the number of competitors in the industry.
Which of the following is NOT a correct interpretation of this regression information?
If a company spends $1 million more on R&D (holding everything else constant), sales are expected to increase by $1.5 million. Always be aware of the units of measure for the different variables.
Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units. Hilton gathers data for the last 20 years. Which of the following regression equations correctly represents Hilton’s hypothesis?
SALES is the dependent variable. POP, INCOME, and ADV should be the independent variables (on the right hand side) of the equation (in any order). Regression equations are additive.
Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units. Hilton gathers data for the last 20 years and estimates the following equation (standard errors in parentheses):
SALES = α + 0.004 POP + 1.031 INCOME + 2.002 ADV
Standard errors: POP (0.005), INCOME (0.337), ADV (2.312)
The critical t-statistic for a 95% confidence level is 2.120. Which of the independent variables is statistically different from zero at the 95% confidence level?
The calculated test statistic is coefficient/standard error. Hence, the t-stats are 0.8 for POP, 3.059 for INCOME, and 0.866 for ADV. Since the t-stat for INCOME is the only one greater than the critical t-value of 2.120, only INCOME is significantly different from zero.
Henry Hilton, CFA, is undertaking an analysis of the bicycle industry. He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV). All data are measured in millions of units. Hilton gathers data for the last 20 years and estimates the following equation (standard errors in parentheses):
SALES = 0.000 + 0.004 POP + 1.031 INCOME + 2.002 ADV
Standard errors: intercept (0.113), POP (0.005), INCOME (0.337), ADV (2.312)
For next year, Hilton estimates the following parameters: (1) the population under 20 will be 120 million, (2) disposable income will be $300,000,000, and (3) advertising expenditures will be $100,000,000. Based on these estimates and the regression equation, what are predicted sales for the industry for next year?
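Predicted sales for next year are: SALES = 0.000 + 0.004 (120) + 1.031 (300) + 2.002 (100) = 509.98 million, i.e. 509,980,000. Because all data are measured in millions, the inputs are entered as 120, 300, and 100.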
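The main trap in this question is the units, so a short sketch of the conversion may help. The helper `to_millions` and the dictionary names are assumptions made for illustration; the coefficients and raw inputs are the ones given in the question.

```python
MILLION = 1_000_000


def to_millions(raw_amount: float) -> float:
    """Convert a raw figure into the regression's unit of measure (millions)."""
    return raw_amount / MILLION


intercept = 0.000
coefficients = {"POP": 0.004, "INCOME": 1.031, "ADV": 2.002}

# Raw estimates from the question, before conversion to millions.
raw_inputs = {"POP": 120_000_000, "INCOME": 300_000_000, "ADV": 100_000_000}

predicted_millions = intercept + sum(
    coefficients[name] * to_millions(raw_inputs[name]) for name in coefficients
)
print(f"{predicted_millions:.2f} million")  # 509.98 million
```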
In a recent analysis of salaries (in $1,000) of financial analysts, a regression of salaries on education, experience, and gender is run. Gender equals one for men and zero for women. The regression results from a sample of 230 financial analysts are presented below, with t-statistics in parentheses.
Salaries = 34.98 + 1.2 Education + 0.5 Experience + 6.3 Gender
(29.11) (8.93) (2.98) (1.58)
What is the expected salary (in $1,000) of a woman with 16 years of education and 10 years of experience?
34.98 + 1.2(16) + 0.5(10) = 59.18
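A minimal sketch of how the gender dummy variable enters the forecast, assuming a hypothetical helper named `expected_salary`. Because Gender is coded zero for women, the 6.3 coefficient drops out of the calculation above.

```python
def expected_salary(education_years: float, experience_years: float, is_male: bool) -> float:
    """Expected salary in $1,000 from the estimated salary regression."""
    gender = 1 if is_male else 0  # dummy variable: 1 for men, 0 for women
    return 34.98 + 1.2 * education_years + 0.5 * experience_years + 6.3 * gender


woman = expected_salary(16, 10, is_male=False)
man = expected_salary(16, 10, is_male=True)

print(f"{woman:.2f}")        # 59.18
print(f"{man - woman:.2f}")  # 6.30, the coefficient on the dummy variable
```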
H0: b_gender ≤ 0
Ha: b_gender > 0
The t-value of 1.58 is less than the critical value of 1.65, so the null hypothesis cannot be rejected.
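The one-tailed decision rule can be expressed as a single comparison. This is only a sketch: the function name `reject_upper_tail` is hypothetical, and the critical value of 1.65 is taken from the explanation rather than computed.

```python
def reject_upper_tail(t_stat: float, critical_value: float) -> bool:
    """Reject H0: b <= 0 in favor of Ha: b > 0 only when t exceeds the critical value."""
    return t_stat > critical_value


# Gender coefficient: t = 1.58 against the one-tailed critical value of 1.65.
reject = reject_upper_tail(t_stat=1.58, critical_value=1.65)
print(reject)  # False
```

Unlike the two-tailed tests elsewhere in this set, no absolute value is taken, because only large positive t-values count as evidence for the alternative.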
Werner Baltz, CFA, has regressed 30 years of data to forecast future sales for National Motor Company based on the percent change in gross domestic product (GDP) and the change in price of a U.S. gallon of fuel at retail. The results are presented below. Note: results must be multiplied by $1,000,000:
Coefficient Estimates
Predictor | Coefficient | Standard Error of the Coefficient
Intercept | 78 | 13.710
GDP (% change) | 30.22 | 12.120
$ Fuel (price change) | –412.39 | 183.981
Analysis of Variance Table (ANOVA)
Source | Degrees of Freedom | Sum of Squares | Mean Square
Regression | | 291.30 | 145.65
Error | 27 | 132.12 |
Total | 29 | 423.42 |
In 2002, if GDP rises 2.2% and the price of fuels falls $0.15, Baltz’s model will predict Company sales in 2002 to be (in $ millions) closest to:
Sales will be closest to $78 + ($30.22 × 2.2) + [(–412.39) × (–$0.15)] = $206.34 million.
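The sign handling in this forecast is easy to get wrong, since a falling fuel price multiplied by a negative coefficient adds to sales. A small sketch, with the function name `predict_company_sales` chosen only for illustration:

```python
def predict_company_sales(gdp_change_pct: float, fuel_price_change: float) -> float:
    """Forecast sales in $ millions from Baltz's estimated coefficients."""
    intercept = 78.0
    b_gdp = 30.22     # per one percentage point change in GDP
    b_fuel = -412.39  # per $1 change in the retail price of a gallon of fuel
    return intercept + b_gdp * gdp_change_pct + b_fuel * fuel_price_change


# GDP rises 2.2%; the fuel price falls $0.15, so the change is negative.
sales = predict_company_sales(gdp_change_pct=2.2, fuel_price_change=-0.15)
print(f"{sales:.2f}")  # 206.34
```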
From the ANOVA table, the calculated F-statistic is (mean square regression / mean square error) = 145.65 / 4.89 = 29.7853. From the F distribution table (2 df numerator, 27 df denominator) the F-critical value may be interpolated to be 3.36. Because 29.7853 is greater than 3.36, Baltz rejects the null hypothesis and concludes that at least one of the independent variables has explanatory power.
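The F-statistic can be rebuilt from the sums of squares in the ANOVA table. The sketch below assumes n = 30 (30 years of data, consistent with 29 total degrees of freedom) and k = 2; the function name `anova_f` is hypothetical. Note that it uses the unrounded MSE, so it gives about 29.76 rather than the 29.7853 obtained above with MSE rounded to 4.89. The conclusion is unchanged.

```python
def anova_f(rss: float, sse: float, n: int, k: int):
    """Return (MSR, MSE, F) where F = MSR / MSE with k and n - k - 1 degrees of freedom."""
    msr = rss / k            # mean square regression
    mse = sse / (n - k - 1)  # mean squared error
    return msr, mse, msr / mse


msr, mse, f_stat = anova_f(rss=291.30, sse=132.12, n=30, k=2)
F_CRITICAL = 3.36  # value quoted in the explanation for (2, 27) degrees of freedom

print(f"MSR = {msr:.2f}, MSE = {mse:.2f}, F = {f_stat:.2f}")
print(f_stat > F_CRITICAL)  # True: reject H0 that all slope coefficients are zero
```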
From the coefficient estimates table, the calculated t-statistics are (30.22 / 12.12) = 2.49 for GDP and (–412.39 / 183.981) = –2.24 for fuel prices. Both values fall outside the critical t-values of ±2.052 at 27 degrees of freedom. Therefore, Baltz is able to reject the null hypothesis that these coefficients are equal to zero, and concludes that each variable is important in explaining sales.
Milky Way, Inc. is a large manufacturer of children’s toys and games based in the United States. Their products have high name brand recognition, and have been sold in retail outlets throughout the United States for nearly fifty years. The founding management team was bought out by a group of investors five years ago. The new management team, led by Russell Stepp, decided that Milky Way should try to expand its sales into the Western European market, which had never been tapped by the former owners. Under Stepp’s leadership, additional personnel are hired in the Research and Development department, and a new marketing plan specific to the European market is implemented. Being a new player in the European market, Stepp knows that it will take several years for Milky Way to establish its brand name in the marketplace, and is willing to make the expenditures now in exchange for increased future profitability.
Now, five years after entering the European market, Stepp is reviewing the results of his plan. Sales in Europe have slowly but steadily increased since Milky Way’s entrance into the market, but profitability seems to have leveled out. Stepp decides to hire a consultant, Ann Hays, CFA, to review and evaluate their European strategy. One of Hays’ first tasks on the job is to perform a regression analysis on Milky Way’s European sales. She is seeking to determine whether the additional expenditures on research and development and marketing for the European market should be continued in the future.
Hays begins by establishing a relationship between the European sales of Milky Way (in millions of dollars) and the two independent variables, the number of dollars (in millions) spent on research and development (R&D) and marketing (MKTG). Based upon five years of monthly data, Hays constructs the following estimated regression equation:
Estimated Sales = 54.82 + 5.97 (MKTG) + 1.45 (R&D)
Additionally, Hays calculates the following regression estimates:
Variable | Coefficient | Standard Error
Intercept | 54.82 | 3.165
MKTG | 5.97 | 1.825
R&D | 1.45 | 0.987
Hays begins the analysis by determining if both of the independent variables are statistically significant. To test whether a coefficient is statistically significant means to test whether it is statistically significantly different from:
The magnitude of the coefficient reveals nothing about the importance of the independent variable in explaining the dependent variable. Therefore, it must be determined if each independent variable is statistically significant. The null hypothesis is that the slope coefficient for each independent variable equals zero. (Study Session 3, LOS 11.a)
The t-statistic for the marketing coefficient is calculated as follows: (5.97– 0.0) / 1.825 = 3.271. (Study Session 3, LOS 11.g)
Hays formulates a test structure where the decision rule is to reject the null hypothesis if the calculated test statistic is either larger than the upper tail critical value or lower than the lower tail critical value. At a 5% significance level with 57 degrees of freedom, assume that the two-tailed critical t-values are tc = ±2.004. Based on this information, Hays makes the following conclusions:
Which of Hays’ conclusions are CORRECT?
Hays’ Point 1 is correct. The t-statistic for the intercept term is (54.82 – 0) / 3.165 = 17.32, which is greater than the critical value of 2.004, so we can conclude that the intercept term is statistically significant. Hays’ Point 2 is incorrect. The t-statistic for the R&D term is (1.45 – 0) / 0.987 = 1.469, which is not greater than the critical value of 2.004. This means that only MKTG can be said to contribute to explaining sales for Milky Way, Inc. Hays’ Point 3 is correct. An F-test tests whether at least one of the slope coefficients is significantly different from zero, where the null hypothesis is that none of the independent variables is significant (all slope coefficients equal zero). Since we know that MKTG is a significant variable (t-statistic of 3.271), we can reject the hypothesis that none of the variables are significant. (Study Session 3, LOS 11.i)
RSS (Regression sum of squares) is the portion of the total variation in Y that is explained by the regression equation. The SSE (Sum of squared errors), is the portion of the total variation in Y that is not explained by the regression. The SST is the total variation of Y around its average value. Therefore, SST = RSS + SSE. These sums of squares will always be calculated for you on the exam, so focus on understanding the interpretation of each. (Study Session 3, LOS 11.i)
R² is calculated as (SST – SSE) / SST. In this example, R² equals (389.14 – 146.85) / 389.14 = 0.623, or 62.3%. This indicates that the two independent variables together explain 62.3% of the variation in monthly sales. The value for mean squared error is not used in this calculation. (Study Session 3, LOS 11.i)
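A minimal sketch of the R² arithmetic, using the SST and SSE values quoted in the explanation. The function name `r_squared` is an illustrative choice. Since SST = RSS + SSE, the same number comes out of RSS / SST.

```python
def r_squared(sst: float, sse: float) -> float:
    """R-squared = (SST - SSE) / SST, the share of total variation explained."""
    return (sst - sse) / sst


sst = 389.14  # total variation in monthly sales
sse = 146.85  # unexplained variation
rss = sst - sse  # explained variation, from SST = RSS + SSE

r2 = r_squared(sst, sse)
print(f"{r2:.3f}")         # 0.623
print(f"{rss / sst:.3f}")  # 0.623, the equivalent RSS / SST form
```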
Unconditional heteroskedasticity occurs when the heteroskedasticity is not related to the level of the independent variables. This means that it does not systematically increase or decrease with changes in the independent variable(s). Note that heteroskedasticity occurs when the variance of the residuals is not the same across all observations in the sample, and it can be detected either by examining scatter plots or by using a Breusch-Pagan test. (Study Session 3, LOS 12.g)
John Rains, CFA, is a professor of finance at a large university located in the Eastern United States. He is actively involved with his local chapter of the Society of Financial Analysts. Recently, he was asked to teach one session of a Society-sponsored CFA review course, specifically teaching the class addressing the topic of quantitative analysis. Based upon his familiarity with the CFA exam, he decides that the first part of the session should be a review of the basic elements of quantitative analysis, such as hypothesis testing, regression and multiple regression analysis. He would like to devote the second half of the review session to the practical application of the topics he covered in the first half.
Rains decides to construct a sample regression analysis case study for his students in order to demonstrate a “real-life” application of the concepts. He begins by compiling financial information on a fictitious company called Big Rig, Inc. According to the case study, Big Rig is the primary producer of the equipment used in the exploration for and drilling of new oil and gas wells in the United States. Rains has based the information in the problem on an actual equity holding in his personal portfolio, but has simplified the data for the purposes of the review course.
Rains constructs a basic regression model for Big Rig in order to estimate its profitability (in millions), using two independent variables: the number of new wells drilled in the U.S. (WLS) and the number of new competitors (COMP) entering the market:
Profits = b_0 + b_1 WLS – b_2 COMP + ε
Based on the model, the estimated regression equation is:
Profits = 22.5 + 0.98(WLS) – 0.35(COMP)
Using the past 5 years of quarterly data, he calculated the following regression estimates for Big Rig, Inc:
Variable | Coefficient | Standard Error
Intercept | 22.5 | 2.465
WLS | 0.98 | 0.683
COMP | 0.35 | 0.186
Using the information presented, the t-statistic for the number of new competitors (COMP) coefficient is:
To test whether a coefficient is statistically significant, the null hypothesis is that the slope coefficient is zero. The t-statistic for the COMP coefficient is calculated as follows: (0.35 – 0.0) / 0.186 = 1.882 (Study Session 3, LOS 11.g)
The coefficient given in the above table for the number of new wells drilled (WLS) is 0.98. The hypothesis should test to see whether the coefficient is indeed equal to 0.98 or is equal to some other value. Note that hypotheses with the “greater than” or “less than” symbol are used with one-tailed tests. (Study Session 3, LOS 11.g)
The MSE is calculated as SSE / (n – k – 1). Recall that there are twenty observations and two independent variables. Therefore, the MSE in this instance = 359 / (20 – 2 – 1) = 21.118. (Study Session 3, LOS 11.i)
An F-test assesses how well a set of independent variables, as a group, explains the variation in the dependent variable. It tests all independent variables as a group, and is always a one-tailed test. The decision rule is to reject the null hypothesis if the calculated F-value is greater than the critical F-value. (Study Session 3, LOS 11.i)
Heteroskedasticity is present when the variance of the residuals is not the same across all observations in the sample, and there are sub-samples that are more spread out than the rest of the sample. (Study Session 3, LOS 12.g)
The Durbin-Watson test (DW ≈ 2(1 – r)) can detect serial correlation. Another commonly used method is to visually inspect a scatter plot of residuals over time. The Hansen method does not detect serial correlation, but can be used to remedy the situation. Note that the Breusch-Pagan test is used to detect heteroskedasticity. (Study Session 3, LOS 12.g)
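To make the DW ≈ 2(1 – r) relationship concrete, here is a minimal sketch of the Durbin-Watson statistic computed directly from residuals. The residual series is hypothetical, invented only to show positively correlated errors, and both function names are illustrative.

```python
def durbin_watson(residuals):
    """DW = sum of squared changes in the residuals / sum of squared residuals."""
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


def dw_from_correlation(r: float) -> float:
    """Large-sample approximation DW ~ 2(1 - r), r = correlation of successive residuals."""
    return 2 * (1 - r)


# Hypothetical residuals that stay on one side of zero for several periods,
# the pattern expected under positive serial correlation.
residuals = [0.5, 0.6, 0.4, 0.5, -0.3, -0.4, -0.5, -0.2, 0.3, 0.4]

print(durbin_watson(residuals))  # well below 2
print(dw_from_correlation(0.0))  # 2.0: no serial correlation
print(dw_from_correlation(1.0))  # 0.0: perfect positive serial correlation
```

A value near 2 is consistent with no serial correlation, values well below 2 point toward positive serial correlation, and values well above 2 point toward negative serial correlation.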
The appropriate test statistic for tests of significance on individual slope coefficient estimates is the t-statistic, which is provided in Exhibit 2 for each regression coefficient estimate. The reported t-statistic equals -2.10 for the STIM slope estimate and equals 2.35 for the CRISIS slope estimate. The critical t-statistic for the 5% significance level equals 2.12 (16 degrees of freedom, 5% level of significance).
Therefore, the slope estimate for STIM is not statistically significant (the absolute value of the reported t-statistic, 2.10, does not exceed the critical value of 2.12). In contrast, the slope estimate for CRISIS is statistically significant (the reported t-statistic, 2.35, exceeds the critical value at the 5% significance level). (Study Session 3, LOS 12.a)
The formula for the Standard Error of the Estimate (SEE) is: SEE = √[SSE / (n – k – 1)], the square root of the MSE. The SEE equals the standard deviation of the regression residuals. A low SEE implies a high R². (Study Session 3, LOS 12.e)
Smith’s Concern 1 is incorrect. Heteroskedasticity is a violation of a regression assumption, and refers to regression error variance that is not constant over all observations in the regression. Conditional heteroskedasticity is a case in which the error variance is related to the magnitudes of the independent variables (the error variance is “conditional” on the independent variables). The consequence of conditional heteroskedasticity is that the standard errors will be too low, which, in turn, causes the t-statistics to be too high. Smith’s Concern 2 also is not correct. Multicollinearity refers to independent variables that are correlated with each other. Multicollinearity causes standard errors for the regression coefficients to be too high, which, in turn, causes the t-statistics to be too low. However, contrary to Smith’s concern, multicollinearity has no effect on the F-statistic. (Study Session 3, LOS 12.g)
Forecasts are derived by substituting the appropriate value for the period t-1 lagged value.
So, the one-step ahead forecast equals 0.30%. The two-step ahead (%) forecast is derived by substituting 0.30 into the equation: ΔForeclosure Share_{t+1} = 0.05 + 0.25(0.30) = 0.125. Therefore, the two-step ahead forecast equals 0.125%. (Study Session 3, LOS 13.d)
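The chain-rule substitution described above can be sketched as a loop that feeds each forecast back in as the next lagged value. The function name `ar1_forecasts` is hypothetical; the intercept 0.05, slope 0.25, and one-step forecast of 0.30 are taken from the explanation.

```python
def ar1_forecasts(b0: float, b1: float, start: float, steps: int):
    """Iterate x_next = b0 + b1 * x, feeding each forecast back in as the lagged value."""
    forecasts = []
    value = start
    for _ in range(steps):
        value = b0 + b1 * value
        forecasts.append(value)
    return forecasts


one_step = 0.30  # one-step ahead forecast (%), given in the explanation

# One more iteration from the one-step forecast gives the two-step ahead forecast.
two_step = ar1_forecasts(b0=0.05, b1=0.25, start=one_step, steps=1)[0]
print(f"{two_step:.3f}")  # 0.125
```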
The error terms in the regressions for choices A, B, and C will be nonstationary. Therefore, some of the regression assumptions will be violated and the regression results are unreliable. If, however, both series are nonstationary (which will happen if each has a unit root), but cointegrated, then the error term will be covariance stationary and the regression results are reliable. (Study Session 3, LOS 13.j)