AIM 1: List the assumptions of the classical linear regression model.
1、Which of the following is least likely an assumption of linear regression?
A) The residuals are normally distributed
B) There is a linear relation between the dependent and independent variables.
C) The independent variable is correlated with the residuals.
D) The variance of the residuals is constant.
The correct answer is C
The assumption is that the independent variable is uncorrelated with the residuals.
2、Which of the following is least likely an assumption of linear regression? The:
A) expected value of the residuals is zero.
B) variance of the residuals is constant.
C) residuals are mean reverting; that is, they tend towards zero over time.
D) residuals are independently distributed.
The correct answer is C
The assumptions regarding the residuals are that the residuals have a constant variance, have a mean of zero, and are independently distributed.
3、Which of the following is least likely an assumption of linear regression analysis?
A) The error term is normally distributed.
B) The Y values are all less than 3 standard deviations from the regression line.
C) The X values are uncorrelated with the error terms.
D) The expected value of the residuals is zero.
The correct answer is B
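In a normal distribution, approximately 99.7%, but not necessarily all, of the observations lie within 3 standard deviations of the mean. Linear regression does not require every Y value to fall within 3 standard deviations of the regression line.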
4、Which of the following statements about linear regression analysis is most accurate?
A) The coefficient of determination is defined as the strength of the linear relationship between two variables.
B) A perfectly negative correlation can be depicted by a correlation coefficient of +1.
C) An assumption of linear regression is that the residuals are independently distributed.
D) When there is a strong relationship between two variables we can conclude that a change in one will cause a change in the other.
The correct answer is C
A perfectly negative correlation can be depicted by a correlation coefficient of -1. Even when there is a strong relationship between two variables, we cannot conclude that a causal relationship exists. The coefficient of determination is defined as the percentage of total variation in the dependent variable explained by the independent variable.
5、The assumptions underlying linear regression include all of the following EXCEPT the:
A) disturbance term is normally distributed with an expected value of 0.
B) independent variable is linearly related to the residuals (or disturbance term).
C) disturbance term is homoskedastic and is independently distributed.
D) dependent variable and independent variable are linearly related.
The correct answer is B
The independent variable is uncorrelated with the residuals (or disturbance term).
The other statements are true. The disturbance term is homoskedastic because it has a constant variance. It is independently distributed because the residual for one observation is not correlated with that of another observation. Note: The opposite of homoskedastic is heteroskedastic. For the examination, memorize the assumptions underlying linear regression!
6、Linear regression is based on a number of assumptions. Which of the following is least likely an assumption of linear regression?
A) Values of the independent variable are not correlated with the error term.
B) A linear relationship exists between the dependent and independent variables.
C) There is at least some correlation between the error terms from one observation to the next.
D) The variance of the error terms each period remains the same.
The correct answer is C
When the error terms are correlated from one observation to the next, autocorrelation is present. As a result, the residual terms are not independently distributed, which is inconsistent with the assumptions of linear regression.
7、Which of the following is least likely an assumption of a simple regression?
A) The variance of the error term is one.
B) The error term is normally distributed.
C) The expected value of the error term is zero.
D) There is a linear relationship between dependent and independent variables.
The correct answer is A
There is no requirement that the variance of the error term should be equal to one.
AIM 2: Define and distinguish homoskedasticity and heteroskedasticity.
Which expression best represents the condition of homoskedasticity? (In the expressions, assume σ² > 0.)
A) V(εi|Xi) = σ².
B) E(εi|Xi) = σ².
C) corr(Xi, εi) = 0.
D) corr(εi, εi+j) = 0.
The correct answer is A
Homoskedasticity means the variance of εi is constant and unrelated to the value of the independent variable.
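As a compact restatement of the distinction, the two conditions can be written side by side. The last line is only a hypothetical example of how the variance might depend on X; it does not come from the question.

```latex
% Homoskedasticity: the conditional variance is the same for every observation
\operatorname{Var}(\varepsilon_i \mid X_i) = \sigma^2 \quad \text{for all } i

% Heteroskedasticity: the conditional variance differs across observations
\operatorname{Var}(\varepsilon_i \mid X_i) = \sigma_i^2

% Hypothetical example: variance that grows with the size of X
\sigma_i^2 = \sigma^2 X_i^2
```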
AIM 3: Define, calculate and interpret the standard errors of the coefficients in an OLS model.
In a two-variable regression of the dependent variable Y on the independent variable X, the standard error of the slope coefficient is:
A)
[attach]13872[/attach]
B)
[attach]13873[/attach]
C)
[attach]13874[/attach]
D)
[attach]13875[/attach]
The correct answer is C
The correct formulas for the standard errors use X, not Y. The other formula among the choices that uses X is the standard error of the intercept.
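The attached formula images are not reproduced in this text, so for reference here are the standard textbook expressions for a two-variable regression. The notation (s for the standard error of the estimate, e_i for the residuals) is an assumption and may differ from the notation used in the original choices.

```latex
% Standard error of the slope coefficient
s_{b_1} = \sqrt{\frac{s^2}{\sum_{i=1}^{n} (X_i - \bar{X})^2}}

% Standard error of the intercept
s_{b_0} = \sqrt{\frac{s^2 \sum_{i=1}^{n} X_i^2}{n \sum_{i=1}^{n} (X_i - \bar{X})^2}}

% Estimated variance of the errors, with n - 2 degrees of freedom
s^2 = \frac{\sum_{i=1}^{n} e_i^2}{n - 2}
```

Both expressions depend on the dispersion of X, which is why a formula built from Y cannot be correct.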
AIM 5: Explain the application of the Gauss-Markov and Central Limit Theorem in OLS estimates.
1、The Gauss-Markov theorem says that if the classical linear regression model assumptions are true, then the OLS estimators have all of the following properties except the:
A) OLS estimated coefficients are based upon linear functions.
B) OLS estimated coefficients are unbiased, which means E(b0) = B0 and E(b1) = B1.
C) OLS estimate of the variance of the errors is unbiased, i.e., E(s²) = σ².
D) OLS estimated coefficients have the minimum absolute error when compared to other methods of estimating the coefficients, i.e., they are the most precise.
The correct answer is D
They have the minimum variance, which is not the same as the minimum absolute error.
2、In a regression analysis, the Central Limit Theorem (CLT)
A) is useful because it implies that the independent variables are normally distributed.
B) is useful because it implies that the estimators are BLUE (best linear unbiased estimators).
C) is not useful.
D) is useful because it implies that the residuals are normally distributed, which implies the coefficient estimates are normally distributed.
The correct answer is D
The theory is that the residuals represent a large number of effects not captured by the included independent variables. Therefore, by the CLT, the residuals can be assumed normally distributed. Since the estimators are linear functions of the residuals, the estimates can be assumed normally distributed.
AIM 6: Define, calculate and interpret hypothesis testing in an OLS regression model.
1、Assume you ran a multiple regression to gain a better understanding of the relationship between lumber sales, housing starts, and commercial construction. The regression uses lumber sales as the dependent variable with housing starts and commercial construction as the independent variables. The results of the regression are:
|                         | Coefficient | Standard Error | t-statistic |
| Intercept               | 5.37        | 1.71           | 3.14        |
| Housing starts          | 0.76        | 0.09           | 8.44        |
| Commercial construction | 1.25        | 0.33           | 3.78        |
The critical value for a 95% confidence level is 1.96.
Construct a 95% confidence interval for the slope coefficient for Housing Starts.
A) 0.76 ± 1.96(0.09).
B) 0.76 ± 1.96(8.44).
C) 1.25 ± 1.96(0.33).
D) 1.25 ± 1.96(3.78).
The correct answer is A
The confidence interval for the slope coefficient is b1 ± (tc × sb1).
Construct a 95% confidence interval for the slope coefficient for Commercial Construction.
A) 1.25 ± 1.96(3.78).
B) 1.25 ± 1.96(0.33).
C) 0.76 ± 1.96(0.09).
D) 0.76 ± 1.96(8.44).
The correct answer is B
The confidence interval for the slope coefficient is b1 ± (tc × sb1).
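The two intervals above can be checked with a few lines of Python. This is a minimal sketch; the helper name `confidence_interval` is made up for illustration, and the inputs are the coefficients, standard errors and the 1.96 critical value given in the question.

```python
def confidence_interval(coef, std_err, critical_value):
    """Return (lower, upper) for coef ± critical_value × std_err."""
    margin = critical_value * std_err
    return coef - margin, coef + margin

# Inputs taken from the lumber sales regression table in the question.
housing = confidence_interval(0.76, 0.09, 1.96)
commercial = confidence_interval(1.25, 0.33, 1.96)

print(f"Housing starts: {housing[0]:.4f} to {housing[1]:.4f}")
print(f"Commercial construction: {commercial[0]:.4f} to {commercial[1]:.4f}")
```

Neither interval contains zero, which agrees with the large t-statistics reported for both slopes.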
2、Consider the following estimated regression equation:
AUTOt = 0.89 + 1.32 PIt
The standard error of the coefficient is 0.42 and the number of observations is 22. The 95 percent confidence interval for the slope coefficient, b1, is:
A) {-0.766 < b1 < 3.406}.
B) {0.900 < b1 < 1.740}.
C) {0.480 < b1 < 2.160}.
D) {0.444 < b1 < 2.196}.
The correct answer is D
The degrees of freedom are n - k - 1, where k is the number of independent variables (1 in this case), so DF = 22 - 1 - 1 = 20. Looking up 20 degrees of freedom in the Student's t distribution for a 95% confidence level and a two-tailed test gives a critical value of 2.086. The confidence interval is 1.32 ± 2.086 (0.42), or {0.444 < b1 < 2.196}.
3、Consider the following estimated regression equation:
ROEt = 0.23 - 1.50 CEt
The standard error of the coefficient is 0.40 and the number of observations is 32. The 95 percent confidence interval for the slope coefficient, b1, is:
A) {0.683 < b1 < 2.317}.
B) {-2.300 < b1 < -0.700}.
C) {-2.317 < b1 < -0.683}.
D) {-3.542 < b1 < 0.542}.
The correct answer is C
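The degrees of freedom are 32 - 1 - 1 = 30, which gives a two-tailed critical t-value of 2.042 at the 95% confidence level. The confidence interval is -1.50 ± 2.042 (0.40), or {-2.317 < b1 < -0.683}.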
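Questions 2 and 3 follow the same recipe: compute the degrees of freedom, look up the critical t-value, then build the interval. The sketch below assumes a small lookup table holding only the two critical values quoted in the explanations (2.086 and 2.042); the function and variable names are illustrative.

```python
# Two-tailed 5% critical t-values quoted in the explanations (df -> t).
T_CRITICAL_95 = {20: 2.086, 30: 2.042}

def slope_interval(slope, std_err, n_obs, n_regressors=1):
    """Return (df, lower, upper) for the 95% interval of a slope coefficient."""
    df = n_obs - n_regressors - 1          # n - k - 1
    margin = T_CRITICAL_95[df] * std_err   # tc × sb1
    return df, slope - margin, slope + margin

auto = slope_interval(1.32, 0.42, 22)    # AUTO regressed on PI
roe = slope_interval(-1.50, 0.40, 32)    # ROE regressed on CE

for label, (df, lower, upper) in [("AUTO on PI", auto), ("ROE on CE", roe)]:
    print(f"{label}: df={df}, {lower:.3f} < b1 < {upper:.3f}")
```

A real table lookup (or a statistics library) would replace the hard-coded dictionary; it is kept tiny here so the arithmetic stays visible.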
4、Consider the regression results from the regression of Y against X for 50 observations:
Y = 0.78 + 1.2 X
The standard error of the estimate is 0.40 and the standard error of the coefficient is 0.45.
Which of the following reports the correct value of the t-statistic for the slope and correctly evaluates its statistical significance with 95 percent confidence?
A) t = 2.667; slope is significantly different from zero.
B) t = 3.000; slope is significantly different from zero.
C) t = 1.789; slope is not significantly different from zero.
D) t = 1.200; slope not significantly different from zero.
The correct answer is A
The test statistic is t = (1.2 – 0) / 0.45 = 2.667. The critical t-values for 48 degrees of freedom are ± 2.011. Since 2.667 exceeds 2.011, the slope is significantly different from zero.
5、Consider the regression results from the regression of Y against X for 50 observations:
Y = 0.78 - 1.5 X
The standard error of the estimate is 0.40 and the standard error of the coefficient is 0.45.
Which of the following reports the correct value of the t-statistic for the slope and correctly evaluates H0: b1 ≥ 0 versus Ha: b1 < 0 with 95 percent confidence?
A) t = -3.333; slope is significantly negative.
B) t = 3.750; slope is significantly different from zero.
C) t = -3.750; slope is significantly different from zero.
D) t = 3.333; slope not significantly different from zero.
The correct answer is A
The test statistic is t = (-1.5 – 0) / 0.45 = -3.333. This is a one-tailed test, so the critical t-value for 48 degrees of freedom at the 5 percent level is approximately -1.677. If your table does not list 48 df (as in the Schweser Notes), use the closest listed degrees of freedom, 40 df, which gives -1.684. Since -3.333 lies below either critical value, we reject the null in favor of the alternative: the slope is significantly negative.
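Questions 4 and 5 differ only in the sign of the slope and in whether the test is two-tailed or one-tailed. A minimal sketch of both decisions, using the critical values quoted in the explanations (2.011 for the two-tailed test, 1.684 from the 40 df table row for the one-tailed test); the names are illustrative.

```python
def t_statistic(estimate, std_err, hypothesized=0.0):
    """t = (estimate - hypothesized value) / standard error of the coefficient."""
    return (estimate - hypothesized) / std_err

# Question 4: H0: b1 = 0 against a two-sided alternative.
t4 = t_statistic(1.2, 0.45)
reject_two_tailed = abs(t4) > 2.011

# Question 5: H0: b1 >= 0 against Ha: b1 < 0, so only the lower tail matters.
t5 = t_statistic(-1.5, 0.45)
reject_lower_tail = t5 < -1.684

print(f"Q4: t = {t4:.3f}, reject H0: {reject_two_tailed}")
print(f"Q5: t = {t5:.3f}, reject H0: {reject_lower_tail}")
```

Note that the standard error of the estimate (0.40) is a distractor in both questions: only the standard error of the coefficient enters the t-statistic.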
6、An analyst is regressing fund returns against the return on the Wilshire 5000 to determine whether beta is equal to 1.0. The analyst is trying to determine whether the number of observations should be increased. Which of the following is a reason why the test will have higher power if the number of observations is increased? The:
A) estimate of beta will be farther away from 1.0.
B) standard error of the regression will be lower.
C) mean squared error of the regression will be lower.
D) constant of the regression will be closer to zero.
The correct answer is B
A larger number of observations will decrease the standard error of the regression which will increase the size of the test statistic if beta is different than 1.0.
7、A sample of 200 monthly observations is used to run a simple linear regression: Returns = b0 + b1Leverage + u. The t-value for the regression coefficient of leverage is calculated as t = – 1.09. A 5 percent level of significance is used to test whether leverage has a significant influence on returns. The correct decision is to:
A) do not reject the null hypothesis and conclude that leverage does not significantly explain returns.
B) reject the null hypothesis and conclude that leverage does not significantly explain returns.
C) do not reject the null hypothesis and conclude that leverage significantly explains returns.
D) reject the null hypothesis and conclude that leverage significantly explains returns.
The correct answer is A
Do not reject the null since |–1.09| < 1.96 (the critical t-value).
8、An analyst has been assigned the task of evaluating revenue growth for an online education provider company that specializes in training adult students. She has gathered information about student ages, number of courses offered to all students each year, years of experience, annual income and type of college degrees, if any. A regression of annual dollar revenue on the number of courses offered each year yields the results shown below.
Coefficient Estimates
| Predictor                 | Coefficient | Standard Error of the Coefficient |
| Intercept                 | 0.10        | 0.50                              |
| Slope (Number of Courses) | 2.20        | 0.60                              |
Which statement about the slope coefficient is most correct, assuming a 5 percent level of significance and 50 observations?
A) t-Statistic: 3.67. Slope: Not significantly different from zero.
B) t-Statistic: 3.67. Slope: Significantly different from zero.
C) t-Statistic: 0.20. Slope: Not significantly different from zero.
D) t-Statistic: 0.20. Slope: Significantly different from zero.
The correct answer is B
t = 2.20/0.60 = 3.67. Since the t-statistic is larger than an assumed critical value of about 2.0, the slope coefficient is statistically significant.
AIM 7: Define, calculate and interpret the coefficient of determination and coefficient of correlation.
1、Which of the following statements regarding the coefficient of determination is least accurate? The coefficient of determination:
A) may range from −1 to +1.
B) is the percentage of the total variation in the dependent variable that is explained by the independent variable.
C) cannot decrease as independent variables are added to the model.
D) is the ratio of explained variation to total variation.
The correct answer is A
In a simple regression, the coefficient of determination is calculated as the correlation coefficient squared and ranges from 0 to +1.
2、A simple linear regression equation had a coefficient of determination (R2) of 0.8. What is the correlation coefficient between the dependent and independent variables and what is the covariance between the two variables if the variance of the independent variable is 4 and the variance of the dependent variable is 9?
Correlation coefficient Covariance
A) 0.89 5.34
B) 0.91 4.80
C) 0.89 4.80
D) 0.91 5.34
The correct answer is A
The correlation coefficient is the square root of the R2, r = 0.89.
To calculate the covariance multiply the correlation coefficient by the product of the standard deviations of the two variables:
COV = 0.89 × √4 × √9 = 5.34
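The arithmetic can be reproduced as follows. The sketch takes the positive square root (the question does not state the sign of the slope, so this is an assumption consistent with the answer choices) and shows that the 5.34 in the answer comes from rounding r to 0.89 before multiplying.

```python
import math

r_squared = 0.8
var_x, var_y = 4.0, 9.0

# Positive root assumed; in a simple regression r carries the sign of the slope.
r = math.sqrt(r_squared)

# Covariance = r × (std dev of X) × (std dev of Y).
cov_rounded = round(r, 2) * math.sqrt(var_x) * math.sqrt(var_y)  # uses r = 0.89
cov_unrounded = r * math.sqrt(var_x) * math.sqrt(var_y)

print(f"r = {r:.4f}")
print(f"covariance with r rounded to 0.89: {cov_rounded:.2f}")
print(f"covariance without rounding:       {cov_unrounded:.2f}")
```

Without the intermediate rounding the covariance is about 5.37, which is still closest to the 5.34 in choice A.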
3、Which term is least likely to apply to a regression model?
A) Goodness of fit.
B) R2.
C) Coefficient of determination.
D) Coefficient of variation.
The correct answer is D
Goodness of fit, coefficient of determination and R2 are different names for the same concept. The coefficient of variation is not directly part of a regression model.
4、Unlike the coefficient of determination, the coefficient of correlation:
A) measures the strength of association between the two variables more exactly.
B) can have an absolute value greater than 1.
C) indicates the percentage of variation explained by a regression model.
D) indicates whether the slope of the regression line is positive or negative.
The correct answer is D
In a simple linear regression the coefficient of determination (R2) is the squared correlation coefficient, so it is positive even when the correlation is negative.
5、An analyst performs two simple regressions. The first regression analysis has an R-squared of 0.40 and a beta coefficient of 1.2. The second regression analysis has an R-squared of 0.77 and a beta coefficient of 1.75. Which one of the following statements is most accurate?
A) The first regression equation has more explaining power than the second regression equation.
B) The second regression equation has more explaining power than the first regression equation.
C) The beta coefficient of the 2nd regression indicates that this regression has more explaining power than the first.
D) The R-squared of the first regression indicates that there is a 0.40 correlation between the independent and the dependent variables.
The correct answer is B
The coefficient of determination (R-squared) is the percentage of variation in the dependent variable explained by the variation in the independent variable. The larger R-squared (0.77) of the second regression means that 77% of the variability in the dependent variable is explained by variability in the independent variable, while only 40% of that is explained in the first regression. This means that the second regression has more explaining power than the first regression. Note that the Beta is the slope of the regression line and doesn’t measure explaining power.
6、What does the R2 of a simple regression of two variables measure, and what calculation is used to equate the correlation coefficient to the coefficient of determination?
R2 measures: | Correlation coefficient:
A) percent of variability of the independent variable that is explained by the variability of the dependent variable | R2 = r × 2
B) percent of variability of the dependent variable that is explained by the variability of the independent variable | R2 = r × 2
C) percent of variability of the independent variable that is explained by the variability of the dependent variable | R2 = r²
D) percent of variability of the dependent variable that is explained by the variability of the independent variable | R2 = r²
The correct answer is D
R2, or the Coefficient of Determination, is the square of the coefficient of correlation (r). The coefficient of correlation describes the strength of the relationship between the X and Y variables. The standard error of the residuals is the standard deviation of the dispersion about the regression line. The t-statistic measures the statistical significance of the coefficients of the regression equation. In the response: "percent of variability of the independent variable that is explained by the variability of the dependent variable," the definitions of the variables are reversed.
7、A simple linear regression is run to quantify the relationship between the return on the common stocks of medium sized companies (Mid Caps) and the return on the S&P 500 Index, using the monthly return on Mid Cap stocks as the dependent variable and the monthly return on the S&P 500 as the independent variable. The results of the regression are shown below:
|           | Coefficient | Standard Error of Coefficient | t-Value |
| Intercept | 1.71        | 2.950                         | 0.58    |
| S&P 500   | 1.52        | 0.130                         | 11.69   |
R2 = 0.599
The strength of the relationship, as measured by the correlation coefficient, between the return on Mid Cap stocks and the return on the S&P 500 for the period under study was:
A) 0.774.
B) 0.599.
C) 2.950.
D) 0.130.
The correct answer is A
You are given R2 or the coefficient of determination of 0.599 and are asked to find R or the coefficient of correlation. The square root of 0.599 = 0.774.
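Taking the square root recovers only the magnitude of the correlation. In a simple regression the sign of r matches the sign of the slope, which the sketch below makes explicit. The function name is illustrative, and the negative-slope call is a hypothetical case added only to show the sign handling.

```python
import math

def correlation_from_r_squared(r_squared, slope):
    """Simple regression only: |r| = sqrt(R2), and r takes the sign of the slope."""
    return math.copysign(math.sqrt(r_squared), slope)

# Values from the Mid Cap regression: R2 = 0.599, slope = 1.52.
r_mid_cap = correlation_from_r_squared(0.599, 1.52)

# Hypothetical case: same R2 but a negative slope.
r_negative = correlation_from_r_squared(0.599, -1.52)

print(f"Mid Cap vs S&P 500: r = {r_mid_cap:.3f}")
print(f"Hypothetical negative slope: r = {r_negative:.3f}")
```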
8、Assume an analyst performs two simple regressions. The first regression analysis has an R-squared of 0.90 and a slope coefficient of 0.10. The second regression analysis has an R-squared of 0.70 and a slope coefficient of 0.25. Which one of the following statements is most accurate?
A) The first regression has more explanatory power than the second regression.
B) Results of the second analysis are more reliable than the first analysis.
C) The influence on the dependent variable of a one unit increase in the independent variable is 0.9 in the first analysis and 0.7 in the second analysis.
D) The influence on the dependent variable of a one unit increase in the independent variable is 0.7 in the first analysis and 0.9 in the second analysis.
The correct answer is A
The coefficient of determination (R-squared) is the percentage of variation in the dependent variable explained by the variation in the independent variable. The larger R-squared (0.90) of the first regression means that 90 percent of the variability in the dependent variable is explained by variability in the independent variable, while 70 percent of that is explained in the second regression. This means that the first regression has more explanatory power than the second regression. Note that the Beta is the slope of the regression line and doesn’t measure explanatory power.
9、Assume you perform two simple regressions. The first regression analysis has an R-squared of 0.80 and a beta coefficient of 0.10. The second regression analysis has an R-squared of 0.80 and a beta coefficient of 0.25. Which one of the following statements is most accurate?
A) Results from the first analysis are more reliable than the second analysis.
B) Results of the second analysis are more reliable than the first analysis.
C) Results from both analyses are equally reliable.
D) The influence on the dependent variable of a one-unit increase in the independent variable is the same in both analyses.
The correct answer is C
The coefficient of determination (R-squared) is the percentage of variation in the dependent variable explained by the variation in the independent variable. The R-squared (0.80) being identical between the first and second regressions means that 80 percent of the variability in the dependent variable is explained by variability in the independent variable for both regressions. This means that the first regression has the same explaining power as the second regression.
AIM 8: Explain the process of normality testing using histograms and normal probability plots.
An analyst is using a normal probability plot to determine if a data set is normally distributed. If the data is normally distributed, then the plot should resemble a:
A) bell shaped curve.
B) convex plot that asymptotically approaches one.
C) concave plot that asymptotically approaches one.
D) straight line.
The correct answer is D
The normal probability plot approach uses the statistics of the variables to compute values of the variable assuming its distribution is normal and then plots these values over the observed values. If the variable is normally distributed, then the plot will be a straight line. The degree to which the plots are off the line indicates whether to reject an assumption of normality.
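The procedure described above can be sketched with the Python standard library. The assumptions here are the plotting position (i + 0.5) / n, which is one common convention among several, and the use of the correlation between expected and observed values as a rough measure of how straight the plot is. The two samples are simulated for illustration only.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def probability_plot_points(data):
    """Pair each sorted observation with the normal quantile expected at its rank."""
    observed = sorted(data)
    n = len(observed)
    fitted = NormalDist(mean(observed), stdev(observed))
    expected = [fitted.inv_cdf((i + 0.5) / n) for i in range(n)]
    return expected, observed

def straightness(expected, observed):
    """Pearson correlation of the plotted points; values near 1 suggest a straight line."""
    mx, my = mean(expected), mean(observed)
    sxy = sum((x - mx) * (y - my) for x, y in zip(expected, observed))
    sxx = sum((x - mx) ** 2 for x in expected)
    syy = sum((y - my) ** 2 for y in observed)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(0)  # fixed seed so the illustration is repeatable
normal_sample = [rng.gauss(10.0, 2.0) for _ in range(200)]
skewed_sample = [rng.expovariate(1.0) for _ in range(200)]

normal_fit = straightness(*probability_plot_points(normal_sample))
skewed_fit = straightness(*probability_plot_points(skewed_sample))

print(f"normal sample: {normal_fit:.3f}")
print(f"skewed sample: {skewed_fit:.3f}")
```

Plotting `expected` against `observed` for each sample gives the normal probability plot itself; the skewed sample bends away from the line in the tails, which is what lowers its correlation.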
Thanks for your hard work!