
Henry Hilton, CFA, is undertaking an analysis of the bicycle industry.
He hypothesizes that bicycle sales (SALES) are a function of three factors: the population under 20 (POP), the level of disposable income (INCOME), and the number of dollars spent on advertising (ADV).
All data are measured in millions of units.
Hilton gathers data for the last 20 years and estimates the following equation (standard errors in parentheses):

 SALES = 0.000 + 0.004 POP + 1.031 INCOME + 2.002 ADV
        (0.113)  (0.005)     (0.337)        (2.312)

For next year, Hilton estimates the following parameters: (1) the population under 20 will be 120 million, (2) disposable income will be \$300,000,000, and (3) advertising expenditures will be \$100,000,000.
Based on these estimates and the regression equation, what are predicted sales for the industry for next year?

 A) \$509,980,000.
 B) \$557,143,000.
 C) \$656,991,000.

Predicted sales for next year are:
SALES = 0.000 + 0.004 (120) + 1.031 (300) + 2.002 (100) = 509.98 million, or \$509,980,000.
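The substitution above can be checked with a minimal sketch. The variable names are illustrative only, and the inputs are entered in millions, as the problem states.

```python
# Coefficients and intercept from the estimated equation.
intercept = 0.000
coefficients = {"POP": 0.004, "INCOME": 1.031, "ADV": 2.002}

# Next year's estimates, in millions (names are illustrative).
forecast_inputs = {"POP": 120, "INCOME": 300, "ADV": 100}

predicted_sales = intercept + sum(
    coefficients[name] * forecast_inputs[name] for name in coefficients
)
print(f"Predicted sales: {predicted_sales:.2f} million")
```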
A real estate agent wants to develop a model to predict the selling price of a home. The agent believes that the most important variables in determining the price of a house are its size (in square feet) and the number of bedrooms. Accordingly, he takes a random sample of 32 homes that have recently been sold. The results of the regression are:
 Variable             Coefficient   Standard Error   t-statistic
 Intercept            66,500        59,292           1.12
 House Size           74.30         21.11            3.52
 Number of Bedrooms   10,306        3,230            3.19

R2 = 0.56; F = 40.73
What is the predicted price of a house that has 2,000 square feet of space and has 4 bedrooms?
 A) \$256,324.
 B) \$292,496.
 C) \$114,432.

66,500 + 74.30(2,000) + 10,306(4) = \$256,324

What percent of the variability in the dependent variable is explained by the independent variables?
 A) 56.00%.
 B) 40.73%.
 C) 12.68%.

R2 = 0.56

The model indicates that at the 5% level of significance:
 A) the slopes are significant but the constant is not.
 B) the slopes are not significant but the constant is.
 C) the slopes and the constant are statistically significant.

DF = N − k − 1 = 32 − 2 − 1 = 29. The t-critical value at 5% significance for a 2-tailed test with 29 df = 2.045. T-values for the slope coefficients are 3.52 and 3.19, which are both greater than 2.045 (critical value). For the constant, the t-value of 1.12 < 2.045.
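As a rough check of the significance tests, the sketch below recomputes each t-statistic as coefficient divided by standard error and compares it with the 2.045 critical value quoted above. The dictionary layout and names are assumptions made for illustration.

```python
# (coefficient, standard error) pairs from the regression table.
estimates = {
    "Intercept": (66_500, 59_292),
    "House Size": (74.30, 21.11),
    "Number of Bedrooms": (10_306, 3_230),
}
T_CRITICAL = 2.045  # two-tailed, 5% level, 29 df (taken from the explanation)

results = {}
for name, (coef, se) in estimates.items():
    t_stat = coef / se
    significant = abs(t_stat) > T_CRITICAL
    results[name] = (round(t_stat, 2), significant)
    print(f"{name}: t = {t_stat:.2f}, significant = {significant}")
```

Only the two slope coefficients clear the critical value, which matches answer A.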

When a number of independent variables in a multiple regression are highly correlated with each other, the problem is called:
 A) autocorrelation.
 B) multicollinearity.
 C) heteroskedasticity.

Multicollinearity is present when the independent variables are highly correlated.
In a recent analysis of salaries (in \$1,000) of financial analysts, a regression of salaries on education, experience, and gender is run. Gender equals one for men and zero for women. The regression results from a sample of 230 financial analysts are presented below, with t-statistics in parentheses.
Salaries = 34.98 + 1.2 Education + 0.5 Experience + 6.3 Gender
          (29.11)  (8.93)          (2.98)           (1.58)

What is the expected salary (in \$1,000) of a woman with 16 years of education and 10 years of experience?
 A) 54.98.
 B) 65.48.
 C) 59.18.

34.98 + 1.2(16) + 0.5(10) = 59.18

Holding everything else constant, do men get paid more than women? Use a 5% level of significance. No, since the t-value:
 A) does not exceed the critical value of 1.65.
 B) does not exceed the critical value of 1.96.
 C) exceeds the critical value of 1.96.

H0: bgender ≤ 0
Ha: bgender > 0

t-value of 1.58 < 1.65 (critical value)
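To see how the dummy variable works, here is a small sketch of the fitted equation. The function name is hypothetical; the coefficients, the gender t-statistic, and the 1.65 one-tailed critical value come from the question and explanation above.

```python
def expected_salary(education, experience, is_male):
    """Expected salary in $1,000; is_male is 1 for men and 0 for women."""
    return 34.98 + 1.2 * education + 0.5 * experience + 6.3 * is_male

woman = expected_salary(16, 10, 0)
man = expected_salary(16, 10, 1)
print(f"Woman: {woman:.2f}, man: {man:.2f}, difference: {man - woman:.2f}")

# One-tailed test of H0: b_gender <= 0 at the 5% level.
gender_t, one_tail_critical = 1.58, 1.65
reject_null = gender_t > one_tail_critical
print("Reject H0:", reject_null)
```

The estimated gap equals the gender coefficient (6.3), but because 1.58 is below 1.65 the gap is not statistically significant at the 5% level.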
Werner Baltz, CFA, has regressed 30 years of data to forecast future sales for National Motor Company based on the percent change in gross domestic product (GDP) and the change in price of a U.S. gallon of fuel at retail. The results are presented below. Note: results must be multiplied by \$1,000,000:
 Coefficient Estimates
 Predictor    Coefficient   Standard Error of the Coefficient
 Intercept    78            13.710
 ∆1 GDP       30.22         12.120
 ∆2\$ Fuel    −412.39       183.981

 Analysis of Variance Table (ANOVA)
 Source       Degrees of Freedom   Sum of Squares   Mean Square
 Regression                        291.30           145.65
 Error        27                   132.12
 Total        29                   423.42
In 2002, if GDP rises 2.2% and the price of fuel falls \$0.15, Baltz’s model will predict Company sales in 2002 to be (in \$ millions) closest to:
 A) \$128.
 B) \$82.
 C) \$206.

Sales will be closest to \$78 + (\$30.22 × 2.2) + [(−412.39) × (−\$0.15)] = \$206.34 million.

Baltz proceeds to test the hypothesis that none of the independent variables has significant explanatory power. He concludes that, at a 5% level of significance:
 A) none of the independent variables has explanatory power, because the calculated F-statistic does not exceed its critical value.
 B) all of the independent variables have explanatory power, because the calculated F-statistic exceeds its critical value.
 C) at least one of the independent variables has explanatory power, because the calculated F-statistic exceeds its critical value.

From the ANOVA table, the calculated F-statistic is (mean square regression / mean square error) = 145.65 / 4.89 = 29.7853. From the F distribution table (2 df numerator, 27 df denominator) the F-critical value may be interpolated to be 3.36. Because 29.7853 is greater than 3.36, Baltz rejects the null hypothesis and concludes that at least one of the independent variables has explanatory power.
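The F-statistic can be rebuilt from the ANOVA sums of squares with a short sketch. It assumes the regression degrees of freedom equal total df minus error df (29 − 27 = 2), which agrees with the "2 df numerator" in the explanation. Without rounding the mean square error to 4.89, the result is about 29.77 rather than 29.7853; the conclusion is unchanged.

```python
ss_regression, ss_error = 291.30, 132.12
df_total, df_error = 29, 27
df_regression = df_total - df_error  # assumed: 2 slope coefficients

msr = ss_regression / df_regression  # mean square regression
mse = ss_error / df_error            # mean square error
f_stat = msr / mse

F_CRITICAL = 3.36  # 5% level, (2, 27) df, value quoted in the explanation
print(f"MSR = {msr:.2f}, MSE = {mse:.4f}, F = {f_stat:.2f}")
print("Reject H0 (all slopes zero):", f_stat > F_CRITICAL)
```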

Baltz then tests the individual variables, at a 5% level of significance, to determine whether sales are explained by individual changes in GDP and fuel prices. Baltz concludes that:
 A) neither GDP nor fuel price changes explain changes in sales.
 B) only GDP changes explain changes in sales.
 C) both GDP and fuel price changes explain changes in sales.

From the coefficient estimates table, the calculated t-statistics are (30.22 / 12.12) = 2.49 for GDP and (−412.39 / 183.981) = −2.24 for fuel prices. These values are both outside the t-critical value at 27 degrees of freedom of ±2.052. Therefore, Baltz is able to reject the null hypothesis that these coefficients are equal to zero, and concludes that each variable is important in explaining sales.
Autumn Voiku is attempting to forecast sales for Brookfield Farms based on a multiple regression model. Voiku has constructed the following model:

sales = b0 + (b1 × CPI) + (b2 × IP) + (b3 × GDP) + εt
Where:
sales = \$ change in sales (in 000’s)
CPI = change in the consumer price index
IP = change in industrial production (millions)
GDP = change in GDP (millions)
All changes in variables are in percentage terms.

Voiku uses monthly data from the previous 180 months of sales data and for the independent variables. The model estimates (with coefficient standard errors in parentheses) are:
 sales = 10.2 + (4.6 × CPI) + (5.2 × IP) + (11.7 × GDP)
        (5.4)   (3.5)         (5.9)        (6.8)

The sum of squared errors is 140.3 and the total sum of squares is 368.7.
Voiku calculates the unadjusted R2, the adjusted R2, and the standard error of estimate to be 0.592, 0.597, and 0.910, respectively.
Voiku is concerned that one or more of the assumptions underlying multiple regression has been violated in her analysis. In a conversation with Dave Grimbles, CFA, a colleague who is considered by many in the firm to be a quant specialist, Voiku says, “It is my understanding that there are five assumptions of a multiple regression model:”
 Assumption 1: There is a linear relationship between the dependent and independent variables.
 Assumption 2: The independent variables are not random, and there is no correlation between any two of the independent variables.
 Assumption 3: The residual term is normally distributed with an expected value of zero.
 Assumption 4: The residuals are serially correlated.
 Assumption 5: The variance of the residuals is constant.

Grimbles agrees with Voiku’s assessment of the assumptions of multiple regression.
Voiku tests and fails to reject each of the following four null hypotheses at the 99% confidence level:
 Hypothesis 1: The coefficient on GDP is negative.
 Hypothesis 2: The intercept term is equal to –4.
 Hypothesis 3: A 2.6% increase in the CPI will result in an increase in sales of more than 12.0%.
 Hypothesis 4: A 1% increase in industrial production will result in a 1% decrease in sales.
Figure 1: Partial table of the Student’s t-distribution (One-tailed probabilities)
 df     p = 0.10   p = 0.05   p = 0.025   p = 0.01   p = 0.005
 170    1.287      1.654      1.974       2.348      2.605
 176    1.286      1.654      1.974       2.348      2.604
 180    1.286      1.653      1.973       2.347      2.603

Figure 2: Partial F-Table critical values for right-hand tail area equal to 0.05
             df1 = 1   df1 = 3   df1 = 5
 df2 = 170   3.90      2.66      2.27
 df2 = 176   3.89      2.66      2.27
 df2 = 180   3.89      2.65      2.26

Figure 3: Partial F-Table critical values for right-hand tail area equal to 0.025
             df1 = 1   df1 = 3   df1 = 5
 df2 = 170   5.11      3.19      2.64
 df2 = 176   5.11      3.19      2.64
 df2 = 180   5.11      3.19      2.64
Concerning the assumptions of multiple regression, Grimbles is:
 A) incorrect to agree with Voiku’s list of assumptions because three of the assumptions are stated incorrectly.
 B) incorrect to agree with Voiku’s list of assumptions because one of the assumptions is stated incorrectly.
 C) incorrect to agree with Voiku’s list of assumptions because two of the assumptions are stated incorrectly.

Assumption 2 is stated incorrectly. Some correlation between independent variables is unavoidable; high correlation results in multicollinearity. An exact linear relationship between linear combinations of two or more independent variables should not exist.
Assumption 4 is also stated incorrectly. The assumption is that the residuals are serially uncorrelated (i.e., they are not serially correlated).

For which of the four hypotheses did Voiku incorrectly fail to reject the null, based on the data given in the problem?
 A) Hypothesis 2.
 B) Hypothesis 3.
 C) Hypothesis 4.

The critical values at the 1% level of significance (99% confidence) are 2.348 for a one-tail test and 2.604 for a two-tail test (df = 176).
The t-values for the hypotheses are:
Hypothesis 1: 11.7 / 6.8 = 1.72
Hypothesis 2: 14.2 / 5.4 = 2.63
Hypothesis 3: 12.0 / 2.6 = 4.6, so the hypothesis is that the coefficient is greater than 4.6, and the t-stat of that hypothesis is (4.6 − 4.6) / 3.5 = 0.
Hypothesis 4: (5.2 + 1) / 5.9 = 1.05
Hypotheses 1 and 3 are one-tail tests; 2 and 4 are two-tail tests. Only Hypothesis 2 exceeds the critical value, so only Hypothesis 2 should be rejected.
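The four t-values can be reproduced with a short sketch that applies t = (estimate − hypothesized value) / standard error. The tuple layout is an assumption for illustration. Comparing magnitudes against the critical value is a simplification for the one-tailed cases, although it does not change any conclusion here, and Hypothesis 3 uses the unrounded 12.0 / 2.6, so its t-value comes out very slightly below zero instead of exactly 0.

```python
# (label, estimate, hypothesized value, standard error, number of tails)
tests = [
    ("Hypothesis 1", 11.7, 0.0, 6.8, 1),
    ("Hypothesis 2", 10.2, -4.0, 5.4, 2),
    ("Hypothesis 3", 4.6, 12.0 / 2.6, 3.5, 1),
    ("Hypothesis 4", 5.2, -1.0, 5.9, 2),
]
# 1% critical values for df = 176, read from Figure 1.
CRITICAL = {1: 2.348, 2: 2.604}

rejected = []
for label, estimate, hypothesized, se, tails in tests:
    t_stat = (estimate - hypothesized) / se
    if abs(t_stat) > CRITICAL[tails]:  # magnitude check (simplified)
        rejected.append(label)
    print(f"{label}: t = {t_stat:.2f}, critical = {CRITICAL[tails]}")

print("Should be rejected:", rejected)
```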

The most appropriate decision with regard to the F-statistic for testing the null hypothesis that all of the independent variables are simultaneously equal to zero at the 5 percent significance level is to:
 A) fail to reject the null hypothesis because the F-statistic is smaller than the critical F-value of 2.66.
 B) reject the null hypothesis because the F-statistic is larger than the critical F-value of 3.19.
 C) reject the null hypothesis because the F-statistic is larger than the critical F-value of 2.66.

RSS = 368.7 – 140.3 = 228.4, F-statistic = (228.4 / 3) / (140.3 / 176) = 95.51. The critical value for a one-tailed 5% F-test with 3 and 176 degrees of freedom is 2.66. Because the F-statistic is greater than the critical F-value, the null hypothesis that all of the independent variables are simultaneously equal to zero should be rejected.
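A minimal sketch of the same F-test, using only the sums of squares given in the problem (variable names are illustrative):

```python
sst, sse = 368.7, 140.3  # total and error sums of squares
n, k = 180, 3            # observations and slope coefficients

rss = sst - sse                         # regression sum of squares
f_stat = (rss / k) / (sse / (n - k - 1))

F_CRITICAL = 2.66  # 5% level, (3, 176) df, from Figure 2
print(f"RSS = {rss:.1f}, F = {f_stat:.2f}")
print("Reject H0 (all slopes zero):", f_stat > F_CRITICAL)
```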

Regarding Voiku’s calculations of R2 and the standard error of estimate, she is:
 A) incorrect in her calculation of both the unadjusted R2 and the standard error of estimate.
 B) correct in her calculation of the unadjusted R2 but incorrect in her calculation of the standard error of estimate.
 C) incorrect in her calculation of the unadjusted R2 but correct in her calculation of the standard error of estimate.

SEE = √[140.3 / (180 − 3 − 1)] = 0.893
unadjusted R2 = (368.7 − 140.3) / 368.7 = 0.619
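The sketch below recomputes both figures and, as an extra check not shown in the explanation, the adjusted R2 using the standard formula 1 − [(n − 1) / (n − k − 1)] × (1 − R2). With at least one independent variable the adjusted value cannot exceed the unadjusted one, so Voiku's pair of 0.592 and 0.597 is inconsistent on its face.

```python
import math

sst, sse = 368.7, 140.3
n, k = 180, 3

see = math.sqrt(sse / (n - k - 1))       # standard error of estimate
r_squared = (sst - sse) / sst            # unadjusted R2
adj_r_squared = 1 - (n - 1) / (n - k - 1) * (1 - r_squared)

print(f"SEE = {see:.3f}")
print(f"R2 = {r_squared:.3f}, adjusted R2 = {adj_r_squared:.3f}")
```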

The multiple regression, as specified, most likely suffers from:
 A) heteroskedasticity.
 B) serial correlation of the error terms.
 C) multicollinearity.

The regression is highly significant (based on the F-stat in Part 3), but the individual coefficients are not. This is a result of a regression with significant multicollinearity problems. The t-stats for the significance of the regression coefficients are, respectively, 1.89, 1.31, 0.88, 1.72. None of these are high enough to reject the hypothesis that the coefficient is zero at the 5% level of significance (two-tailed critical value of 1.974 from t-table).

A 90 percent confidence interval for the coefficient on GDP is:
 A) –1.5 to 20.0.
 B) –1.9 to 19.6.
 C) 0.5 to 22.9.

A 90% confidence interval with 176 degrees of freedom is coefficient ± tc(se) = 11.7 ± 1.654 (6.8) or 0.5 to 22.9.
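The interval arithmetic can be sketched as follows. The helper name is hypothetical; the critical value 1.654 is the one-tailed p = 0.05 entry for 176 df in Figure 1, which is the two-tailed 10% value.

```python
def confidence_interval(coefficient, std_error, t_critical):
    """Return (lower, upper) for coefficient ± t_critical × std_error."""
    margin = t_critical * std_error
    return coefficient - margin, coefficient + margin

lower, upper = confidence_interval(11.7, 6.8, 1.654)
print(f"90% CI for the GDP coefficient: {lower:.1f} to {upper:.1f}")
```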
Housing industry analyst Elaine Smith has been assigned the task of forecasting housing foreclosures. Specifically, Smith is asked to forecast the percentage of outstanding mortgages that will be foreclosed upon in the coming quarter. Smith decides to employ multiple linear regression and time series analysis.
Besides constructing a forecast for the foreclosure percentage, Smith wants to address the following two questions:
 Research Question 1: Is the foreclosure percentage significantly affected by short-term interest rates?
 Research Question 2: Is the foreclosure percentage significantly affected by government intervention policies?

Smith contends that adjustable rate mortgages often are used by higher risk borrowers and that their homes are at higher risk of foreclosure. Therefore, Smith decides to use short-term interest rates as one of the independent variables to test Research Question 1.
To measure the effects of government intervention in Research Question 2, Smith uses a dummy variable that equals 1 whenever the Federal government intervened with a fiscal policy stimulus package that exceeded 2% of the annual Gross Domestic Product. Smith sets the dummy variable equal to 1 for four quarters starting with the quarter in which the policy is enacted and extending through the following 3 quarters. Otherwise, the dummy variable equals zero.
Smith uses quarterly data over the past 5 years to derive her regression. Smith’s regression equation is provided in Exhibit 1:
Exhibit 1: Foreclosure Share Regression Equation
foreclosure share = b0 + b1(ΔINT) + b2(STIM) + b3(CRISIS) + ε

 where:
 Foreclosure share = the percentage of all outstanding mortgages foreclosed upon during the quarter
 ΔINT = the quarterly change in the 1-year Treasury bill rate (e.g., ΔINT = 2 for a two percentage point increase in interest rates)
 STIM = 1 for quarters in which a Federal fiscal stimulus package was in place
 CRISIS = 1 for quarters in which the median house price is one standard deviation below its 5-year moving average

The results of Smith’s regression are provided in Exhibit 2:

Exhibit 2: Foreclosure Share Regression Results

 Variable    Coefficient   t-statistic
 Intercept   3.00          2.40
 ΔINT        1.00          2.22
 STIM        -2.50         -2.10
 CRISIS      4.00          2.35

The ANOVA results from Smith’s regression are provided in Exhibit 3:

Exhibit 3: Foreclosure Share Regression Equation ANOVA Table

 Source       Degrees of Freedom   Sum of Squares   Mean Sum of Squares
 Regression   3                    15               5.0000
 Error        16                   5                0.3125
 Total        19                   20

Smith expresses the following concerns about the test statistics derived in her regression:
 Concern 1: If my regression errors exhibit conditional heteroskedasticity, my t-statistics will be underestimated.
 Concern 2: If my independent variables are correlated with each other, my F-statistic will be overestimated.

Before completing her analysis, Smith runs a regression of the changes in foreclosure share on its lagged value. The following regression results and autocorrelations were derived using quarterly data over the past 5 years (Exhibits 4 and 5, respectively):
Exhibit 4. Lagged Regression Results
Δ foreclosure sharet = 0.05 + 0.25(Δ foreclosure sharet–1)

Exhibit 5. Autocorrelation Analysis

 Lag   Autocorrelation   t-statistic
 1     0.05              0.22
 2     -0.35             -1.53
 3     0.25              1.09
 4     0.10              0.44

Exhibit 6 provides critical values for the Student’s t-Distribution

Exhibit 6: Critical Values for Student’s t-Distribution

                      Area in Both Tails Combined
 Degrees of Freedom   20%     10%     5%      1%
 16                   1.337   1.746   2.120   2.921
 17                   1.333   1.740   2.110   2.898
 18                   1.330   1.734   2.101   2.878
 19                   1.328   1.729   2.093   2.861
 20                   1.325   1.725   2.086   2.845

Using a 1% significance level, which of the following is closest to the lower bound of the confidence interval of the ΔINT slope coefficient?
 A) –0.316
 B) –0.296
 C) –0.045

The appropriate confidence interval associated with a 1% significance level is the 99% confidence level, which equals:
slope coefficient ± critical t-statistic (1% significance level) × coefficient standard error
The standard error is not explicitly provided in this question, but it can be derived by knowing the formula for the t-statistic:
t-statistic = slope coefficient / coefficient standard error

From Exhibit 2, the ΔINT slope coefficient estimate equals 1.0, and its t-statistic equals 2.22. Therefore, solving for the standard error, we derive:
standard error = 1.0 / 2.22 = 0.450

The critical value for the 1% significance level is found down the 1% column in the t-tables provided in Exhibit 6. The appropriate degrees of freedom for the confidence interval equals n – k – 1 = 20 – 3 – 1 = 16 (k is the number of slope estimates = 3). Therefore, the critical value for the 99% confidence interval (or 1% significance level) equals 2.921.
So, the 99% confidence interval for the ΔINT slope coefficient is:
1.00 ± 2.921(0.450): lower bound equals 1 – 1.316 and upper bound 1 + 1.316
or (-0.316, 2.316).
(Study Session 3, LOS 12.c)
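A short sketch of the same steps, backing the standard error out of the reported t-statistic and then building the 99% interval. Variable names are illustrative, and the unrounded standard error is carried through, so the bounds differ from the rounded ones only in the fourth decimal place.

```python
coefficient, t_statistic = 1.00, 2.22  # ΔINT row of Exhibit 2
n, k = 20, 3                           # 5 years of quarterly data, 3 slopes
df = n - k - 1
T_CRITICAL_1PCT = 2.921                # Exhibit 6, 16 df, 1% in both tails

std_error = coefficient / t_statistic  # from t = coefficient / std_error
margin = T_CRITICAL_1PCT * std_error
lower, upper = coefficient - margin, coefficient + margin

print(f"df = {df}, standard error = {std_error:.3f}")
print(f"99% CI: ({lower:.3f}, {upper:.3f})")
```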

Based on her regression results in Exhibit 2, using a 5% level of significance, Smith should conclude that:
 A) stimulus packages have significant effects on foreclosure percentages, but housing crises do not have significant effects on foreclosure percentages.
 B) stimulus packages do not have significant effects on foreclosure percentages, but housing crises do have significant effects on foreclosure percentages.
 C) both stimulus packages and housing crises have significant effects on foreclosure percentages.

The appropriate test statistic for tests of significance on individual slope coefficient estimates is the t-statistic, which is provided in Exhibit 2 for each regression coefficient estimate. The reported t-statistic equals -2.10 for the STIM slope estimate and equals 2.35 for the CRISIS slope estimate. The critical t-statistic for the 5% significance level equals 2.12 (16 degrees of freedom, 5% level of significance).
Therefore, the slope estimate for STIM is not statistically significant (the reported t-statistic, -2.10, is not large enough). In contrast, the slope estimate for CRISIS is statistically significant (the reported t-statistic, 2.35, exceeds the 5% significance level critical value). (Study Session 3, LOS 12.a)

The standard error of estimate for Smith’s regression is closest to:
 A) 0.53
 B) 0.16
 C) 0.56

The formula for the Standard Error of the Estimate (SEE) is:
SEE = √[SSE / (n − k − 1)] = √(5 / 16) = √0.3125 ≈ 0.56

The SEE equals the standard deviation of the regression residuals. A low SEE implies a high R2. (Study Session 3, LOS 12.f)
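For a quick numerical check, the sketch below computes the SEE from the error row of Exhibit 3 and, to illustrate the link between a low SEE and a high R2, the R2 from the same table (names are illustrative):

```python
import math

ss_regression, ss_error, ss_total = 15, 5, 20
df_error = 16

see = math.sqrt(ss_error / df_error)   # square root of the mean squared error
r_squared = ss_regression / ss_total

print(f"SEE = {see:.2f}, R2 = {r_squared:.2f}")
```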

Is Smith correct or incorrect regarding Concerns 1 and 2?
 A) Incorrect on both Concerns.
 B) Only correct on one concern and incorrect on the other.
 C) Correct on both Concerns.

Smith’s Concern 1 is incorrect. Heteroskedasticity is a violation of a regression assumption, and refers to regression error variance that is not constant over all observations in the regression. Conditional heteroskedasticity is a case in which the error variance is related to the magnitudes of the independent variables (the error variance is “conditional” on the independent variables). The consequence of conditional heteroskedasticity is that the standard errors will be too low, which, in turn, causes the t-statistics to be too high. Smith’s Concern 2 also is not correct. Multicollinearity refers to independent variables that are correlated with each other. Multicollinearity causes standard errors for the regression coefficients to be too high, which, in turn, causes the t-statistics to be too low. However, contrary to Smith’s concern, multicollinearity has no effect on the F-statistic. (Study Session 3, LOS 12.i)

The most recent change in foreclosure share was +1 percent. Smith decides to base her analysis on the data and methods provided in Exhibits 4 and 5, and determines that the two-step ahead forecast for the change in foreclosure share (in percent) is 0.125, and that the mean reverting value for the change in foreclosure share (in percent) is 0.071. Is Smith correct?
 A) Smith is correct on the two-step ahead forecast for change in foreclosure share only.
 B) Smith is correct on the mean-reverting level for forecast of change in foreclosure share only.
 C) Smith is correct on both the forecast and the mean reverting level.

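Forecasts are derived by substituting the appropriate value for the period t-1 lagged value.
ΔForeclosure Sharet = 0.05 + 0.25(1.00) = 0.30
So, the one-step ahead forecast equals 0.30%. The two-step ahead (%) forecast is derived by substituting 0.30 into the equation.
ΔForeclosure Sharet+1 = 0.05 + 0.25(0.30) = 0.125
Therefore, the two-step ahead forecast equals 0.125%.
The mean-reverting level equals b0 / (1 − b1) = 0.05 / (1 − 0.25) = 0.067, not 0.071, so Smith is correct on the two-step ahead forecast only.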
(Study Session 3, LOS 13.d)
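The chained forecasts and the mean-reverting level of the AR(1) model in Exhibit 4 can be sketched as below. The function name is hypothetical, and the mean-reverting level uses the standard AR(1) formula b0 / (1 − b1).

```python
def ar1_forecasts(b0, b1, last_value, steps):
    """Chain x_t = b0 + b1 * x_(t-1) forward for the given number of steps."""
    forecasts = []
    value = last_value
    for _ in range(steps):
        value = b0 + b1 * value
        forecasts.append(value)
    return forecasts

b0, b1 = 0.05, 0.25
forecasts = ar1_forecasts(b0, b1, last_value=1.0, steps=2)
mean_reverting_level = b0 / (1 - b1)

print(f"One-step: {forecasts[0]:.3f}, two-step: {forecasts[1]:.3f}")
print(f"Mean-reverting level: {mean_reverting_level:.3f}")
```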

Assume for this question that Smith finds that the foreclosure share series has a unit root. Under these conditions, she can most reliably regress foreclosure share against the change in interest rates (ΔINT) if:
 A) ΔINT does not have unit root.
 B) ΔINT has unit root and is not cointegrated with foreclosure share.
 C) ΔINT has unit root and is cointegrated with foreclosure share.

The error terms in the regressions for choices A and B will be nonstationary. Therefore, some of the regression assumptions will be violated and the regression results are unreliable. If, however, both series are nonstationary (which will happen if each has a unit root), but cointegrated, then the error term will be covariance stationary and the regression results are reliable. (Study Session 3, LOS 13.k)
Which of the following statements most accurately interprets the following regression results at the given significance level?
 Variable    p-value
 Intercept   0.0201
 X1          0.0284
 X2          0.0310
 X3          0.0143
 A) The variables X1 and X2 are statistically significantly different from zero at the 2% significance level.
 B) The variable X3 is statistically significantly different from zero at the 2% significance level.
 C) The variable X2 is statistically significantly different from zero at the 3% significance level.

The p-value is the smallest level of significance for which the null hypothesis can be rejected. An independent variable is significant if the p-value is less than the stated significance level. In this example, X3 is the variable that has a p-value less than the stated significance level.
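The decision rule can be sketched as a simple comparison of each p-value against the stated significance level (the helper name is hypothetical):

```python
p_values = {"Intercept": 0.0201, "X1": 0.0284, "X2": 0.0310, "X3": 0.0143}

def significant_at(alpha):
    """Return the variables whose p-value is below the significance level."""
    return [name for name, p in p_values.items() if p < alpha]

print("Significant at 2%:", significant_at(0.02))
print("Significant at 3%:", significant_at(0.03))
```

At the 2% level only X3 qualifies, and X2 fails even at the 3% level, which rules out choices A and C.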
Dave Turner is a security analyst who is using regression analysis to determine how well two factors explain returns for common stocks. The independent variables are the natural logarithm of the number of analysts following the companies, Ln(no. of analysts), and the natural logarithm of the market value of the companies, Ln(market value). The regression output generated from a statistical program is given in the following tables. Each p-value corresponds to a two-tail test.
Turner plans to use the result in the analysis of two investments. WLK Corp. has twelve analysts following it and a market capitalization of \$2.33 billion. NGR Corp. has two analysts following it and a market capitalization of \$47 million.
Table 1: Regression Output
 Variable              Coefficient   Standard Error of the Coefficient   t-statistic   p-value
 Intercept             0.043         0.01159                             3.71          < 0.001
 Ln(No. of Analysts)   −0.027        0.00466                             −5.80         < 0.001
 Ln(Market Value)      0.006         0.00271                             2.21          0.028

Table 2: ANOVA
              Degrees of Freedom   Sum of Squares   Mean Square
 Regression   2                    0.103            0.051
 Residual     194                  0.559            0.003
 Total        196                  0.662
In a one-sided test at a 1% level of significance, which of the following coefficients is significantly different from zero?
 A) The coefficient on ln(no. of Analysts) only.
 B) The intercept and the coefficient on ln(no. of analysts) only.
 C) The intercept and the coefficient on ln(market value) only.

The p-values correspond to a two-tail test. For a one-tailed test, divide the provided p-value by two to find the minimum level of significance for which a null hypothesis of a coefficient equaling zero can be rejected. Dividing the provided p-value for the intercept and ln(no. of analysts) will give a value less than 0.0005, which is less than 1% and would lead to a rejection of the hypothesis. Dividing the provided p-value for ln(market value) will give a value of 0.014 which is greater than 1%; thus, that coefficient is not significantly different from zero at the 1% level of significance. (Study Session 3, LOS 12.a)

The 95% confidence interval (use a t-stat of 1.96 for this question only) of the estimated coefficient for the independent variable Ln(Market Value) is closest to:
 A) 0.011 to 0.001
 B) 0.014 to -0.009
 C) -0.018 to -0.036

The confidence interval is 0.006 ± (1.96)(0.00271) = 0.011 to 0.001
(Study Session 3, LOS 12.c)

If the number of analysts on NGR Corp. were to double to 4, the change in the forecast of NGR would be closest to?
 A) −0.035.
 B) −0.019.
 C) −0.055.

Initially, the estimate is 0.1303 = 0.043 + ln(2)(−0.027) + ln(47000000)(0.006)
Then, the estimate is 0.1116 = 0.043 + ln(4)(−0.027) + ln(47000000)(0.006)
0.1116 − 0.1303 = −0.0187, or −0.019
(Study Session 3, LOS 12.a)
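A minimal sketch of the two forecasts, using natural logs as in Table 1 (the function name is hypothetical). Because market value is held fixed, the change reduces to −0.027 × ln(4 / 2).

```python
import math

def forecast_return(n_analysts, market_value):
    """Fitted value from Table 1; both inputs enter as natural logs."""
    return (0.043
            - 0.027 * math.log(n_analysts)
            + 0.006 * math.log(market_value))

before = forecast_return(2, 47_000_000)
after = forecast_return(4, 47_000_000)
change = after - before

print(f"Before: {before:.4f}, after: {after:.4f}, change: {change:.4f}")
```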

Based on a R2 calculated from the information in Table 2, the analyst should conclude that the number of analysts and ln(market value) of the firm explain:
 A) 18.4% of the variation in returns.
 B) 15.6% of the variation in returns.
 C) 84.4% of the variation in returns.

R2 is the percentage of the variation in the dependent variable (in this case, variation of returns) explained by the set of independent variables. R2 is calculated as follows: R2 = (SSR / SST) = (0.103 / 0.662) = 15.6%. (Study Session 3, LOS 12.f)

What is the F-statistic from the regression? And, what can be concluded from its value at a 1% level of significance?
 A) F = 17.00, reject a hypothesis that both of the slope coefficients are equal to zero.
 B) F = 5.80, reject a hypothesis that both of the slope coefficients are equal to zero.
 C) F = 1.97, fail to reject a hypothesis that both of the slope coefficients are equal to zero.

The F-statistic is calculated as follows: F = MSR / MSE = 0.051 / 0.003 = 17.00; and 17.00 > 4.61, which is the critical F-value for the given degrees of freedom and a 1% level of significance. However, when F-values are in excess of 10 for a large sample like this, a table is not needed to know that the value is significant. (Study Session 3, LOS 12.e)
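Both the R2 and the F-statistic follow directly from Table 2, as the sketch below shows. It deliberately uses the rounded mean squares printed in the table so that it matches the 17.00 above; recomputing the mean squares from the sums of squares would give a somewhat larger F, with the same conclusion.

```python
ss_regression, ss_total = 0.103, 0.662
ms_regression, ms_residual = 0.051, 0.003  # rounded values as printed in Table 2

r_squared = ss_regression / ss_total
f_stat = ms_regression / ms_residual

F_CRITICAL_1PCT = 4.61  # critical value quoted in the explanation
print(f"R2 = {r_squared:.1%}, F = {f_stat:.2f}")
print("Reject H0 (both slopes zero):", f_stat > F_CRITICAL_1PCT)
```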

Upon further analysis, Turner concludes that multicollinearity is a problem. What might have prompted this further analysis and what is the intuition behind the conclusion?
 A) At least one of the t-statistics was not significant, the F-statistic was significant, and a positive relationship between the number of analysts and the size of the firm would be expected.
 B) At least one of the t-statistics was not significant, the F-statistic was not significant, and a positive relationship between the number of analysts and the size of the firm would be expected.
 C) At least one of the t-statistics was not significant, the F-statistic was significant, and an intercept not significantly different from zero would be expected.

Multicollinearity occurs when there is a high correlation among independent variables and may exist if there is a significant F-statistic for the fit of the regression model, but at least one insignificant independent variable when we expect all of them to be significant. In this case the coefficient on ln(market value) was not significant at the 1% level, but the F-statistic was significant. It would make sense that the size of the firm, i.e., the market value, and the number of analysts would be positively correlated. (Study Session 3, LOS 12.j)
When interpreting the results of a multiple regression analysis, which of the following terms represents the value of the dependent variable when the independent variables are all equal to zero?
 A) Intercept term.
 B) Slope coefficient.
 C) p-value.

The intercept term is the value of the dependent variable when the independent variables are set to zero.