Reading 10: Sampling and Estimation, LOS k: Selected Practice Questions

LOS k: Discuss the issues regarding selection of the appropriate sample size, data-mining bias, sample selection bias, survivorship bias, look-ahead bias, and time-period bias.

When sampling from a population, the most appropriate sample size:

A)
minimizes the sampling error and the standard deviation of the sample statistic around its population value.
B)
is at least 30.
C)
involves a trade-off between the cost of increasing the sample size and the value of increasing the precision of the estimates.



A larger sample reduces the sampling error and the standard deviation of the sample statistic around its population value. However, this does not imply that the sample should be as large as possible, or that the sampling error must be as small as can be achieved. Larger samples might contain observations that come from a different population, in which case they would not necessarily improve the estimates of the population parameters. Cost also increases with the sample size. When the cost of increasing the sample size is greater than the value of the extra precision gained, increasing the sample size is not appropriate.
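The diminishing value of extra observations can be seen from the standard error of the sample mean, sigma / sqrt(n). The sketch below is only an illustration: the population standard deviation of 20 and the sample sizes are assumed values, not figures from the question.

```python
import math


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    return sigma / math.sqrt(n)


# Assumed population standard deviation (hypothetical, for illustration only).
SIGMA = 20.0

# Each time the sample size is quadrupled, the standard error is only halved.
for n in (25, 100, 400, 1600):
    print(n, standard_error(SIGMA, n))
```

Because precision improves with the square root of n while cost typically rises in proportion to n, each additional gain in precision costs more than the last.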

When sampling from a nonnormal distribution with a known variance, which statistic should be used if the sample size is large, and which if the sample size is small?

A)
z-statistic; z-statistic.
B)
z-statistic; not available.
C)
t-statistic; t-statistic.



When you are sampling from a: | and the sample size is small, use a: | and the sample size is large, use a:
Normal distribution with a known variance | z-statistic | z-statistic
Normal distribution with an unknown variance | t-statistic | t-statistic*
Nonnormal distribution with a known variance | not available | z-statistic
Nonnormal distribution with an unknown variance | not available | t-statistic*

*The z-statistic is theoretically acceptable here, but use of the t-statistic is more conservative.
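The table above can be written as a small lookup function. This is a sketch, not an official rule: the function name `choose_statistic` is made up for illustration, and the cutoff of 30 observations for a "large" sample is an assumed convention that the table itself does not state.

```python
def choose_statistic(normal: bool, variance_known: bool, n: int,
                     large_n: int = 30) -> str:
    """Return the test statistic suggested by the table above.

    large_n is an assumed cutoff for a "large" sample (30 is a common
    convention, not something the table specifies).
    """
    is_large = n >= large_n
    # Small samples from a nonnormal population: no statistic is available.
    if not normal and not is_large:
        return "not available"
    # Known variance -> z-statistic; unknown variance -> t-statistic
    # (the more conservative choice for large samples).
    return "z-statistic" if variance_known else "t-statistic"


# Nonnormal population, known variance: large sample, then small sample.
print(choose_statistic(normal=False, variance_known=True, n=100))
print(choose_statistic(normal=False, variance_known=True, n=10))
```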

Which of the following statements about sample statistics is least accurate?

A)
There is no sample statistic for non-normal distributions with unknown variance for either small or large samples.
B)
The z-statistic is used to test normally distributed data with a known variance, whether testing a large or a small sample.
C)
The z-statistic is used for nonnormal distributions with known variance, but only for large samples.



For nonnormal distributions with unknown variance, no test statistic is available for small samples, but the t-statistic is used when the sample size is large.

A)
sample-selection bias.
B)
look-ahead bias.
C)
survivorship bias.



Studies of the performance of mutual fund managers often suffer from survivorship bias as poorly performing funds are closed down and are not included in the sample.

A scientist working for a pharmaceutical company tries many models using the same data before reporting the one that shows that the given drug has no serious side effects. The scientist is guilty of:

A)
data mining.
B)
look-ahead bias.
C)
sample selection bias.



Data mining is the practice of applying different methods to the same data until the desired results are obtained.

The practice of repeatedly using the same database to search for patterns until one is found is called:

A)
data snooping.
B)
sample selection bias.
C)
data mining.



Data mining involves repeatedly analyzing the same data until a pattern is detected, a pattern that may not replicate in other data sets. It is also described as torturing the data until it confesses.

A research paper that reports finding a profitable trading strategy without providing any discussion of an economic theory that makes predictions consistent with the empirical results is most likely evidence of:

A)
a sample that is not large enough.
B)
data mining.
C)
a non-normal population distribution.




Data mining occurs when the analyst continually uses the same database to search for patterns or trading rules until he finds one that works. If you are reading research that suggests a profitable trading strategy, make sure you heed the following warning signs of data mining:

Evidence that the author used many variables (most unreported) until he found ones that were significant.

The lack of any economic theory that is consistent with the empirical results.
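The first warning sign matters because testing many variables makes a spurious "significant" result very likely. As a rough sketch, assuming the tests are independent and each uses a 5% significance level (both are assumptions for illustration, and the function name is hypothetical), the chance of at least one false positive is 1 - (1 - alpha)^k:

```python
def prob_at_least_one_false_positive(num_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_tests independent tests is
    'significant' purely by chance, when no real relationship exists."""
    if num_tests < 0:
        raise ValueError("num_tests must be non-negative")
    return 1.0 - (1.0 - alpha) ** num_tests


# The more variables an author tries, the more likely a spurious finding.
for k in (1, 5, 20, 100):
    print(k, round(prob_at_least_one_false_positive(k), 4))
```

Under these assumptions, trying 20 unrelated variables gives roughly a 64% chance that at least one looks significant.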

The average mutual fund return calculated from a sample of funds with significant survivorship bias would most likely be:

A)
larger than the mean return of the population of all mutual funds.
B)
an unbiased estimate of the mean return of the population of all mutual funds if the sample size was large enough.
C)
smaller than the mean return of the population of all mutual funds.



If we try to draw any conclusions from an analysis of a mutual fund database with survivorship bias, we overestimate the average mutual fund return, because we don’t include the poorer-performing funds that dropped out. A larger sample size from a database with survivorship bias will still result in a biased estimate.
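A small simulation illustrates the point. Everything in it is assumed for illustration: returns are drawn from a normal distribution with a 6% mean and 15% standard deviation, and any fund returning less than -10% is treated as closed and dropped from the database. Real fund closures are more complicated than this one-period rule.

```python
import random
import statistics


def simulate_survivorship(num_funds: int = 1000, mean: float = 0.06,
                          stdev: float = 0.15, cutoff: float = -0.10,
                          seed: int = 42) -> tuple[float, float]:
    """Return (mean of all funds, mean of surviving funds).

    All parameters are hypothetical. Funds with a return at or below
    `cutoff` are assumed to close and vanish from the sample.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    all_returns = [rng.gauss(mean, stdev) for _ in range(num_funds)]
    survivors = [r for r in all_returns if r > cutoff]
    return statistics.mean(all_returns), statistics.mean(survivors)


for n in (1_000, 100_000):
    population_mean, survivor_mean = simulate_survivorship(num_funds=n)
    print(n, round(population_mean, 4), round(survivor_mean, 4))
```

The survivor average stays above the average of all funds at both sample sizes, which is why a bigger sample from the same biased database does not fix the problem.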

A study reports that from 2002 to 2004 the average return on growth stocks was twice as large as that of value stocks. These results most likely reflect:

A)
look-ahead bias.
B)
survivorship bias.
C)
time-period bias.



Time-period bias can result if the time period over which the data is gathered is too short, because the results may reflect phenomena specific to that period, or too long, if a change occurred during the time frame that would result in two different return distributions. In this case the time period sampled is probably not long enough to draw any conclusions about the long-term relative performance of value and growth stocks, even if the sample size within that time period is large.

Look-ahead bias occurs when the analyst uses historical data that was not publicly available at the time being studied. Survivorship bias is a form of sample selection bias in which the sample is biased because the elements that survived until the sample was taken differ from the elements that dropped out of the population.

An analyst has compiled stock returns for the first 10 days of the year for a sample of firms and estimated the correlation between these returns and the changes in book value for these firms over the just-ended year. What objection could be raised to using such a correlation as a trading strategy?

A)
The study suffers from look-ahead bias.
B)
Use of year-end values causes a time-period bias.
C)
Use of year-end values causes a sample selection bias.



The study suffers from look-ahead bias because traders at the beginning of the year would not yet know the changes in book value. Financial statements usually take 60 to 90 days to be completed and released.
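One way to guard against this in a backtest is to check whether a figure had been released by the trade date. The sketch below is a hypothetical helper, not a standard library function; the dates are made up, and the 90-day default is simply the upper end of the 60 to 90 day range mentioned above.

```python
from datetime import date, timedelta


def is_available(period_end: date, as_of: date,
                 reporting_lag_days: int = 90) -> bool:
    """True if data for the period ending on period_end would have been
    released by as_of, given an assumed reporting lag in days."""
    release_date = period_end + timedelta(days=reporting_lag_days)
    return as_of >= release_date


fiscal_year_end = date(2023, 12, 31)  # hypothetical fiscal year end
trade_date = date(2024, 1, 10)        # tenth day of the new year

# Book value changes for the just-ended year are not yet public.
print(is_available(fiscal_year_end, trade_date))        # False
print(is_available(fiscal_year_end, date(2024, 4, 15)))  # True
```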
