Reading 10: Sampling and Estimation - LOS k (Part 2) ~ Q

1. The practice of repeatedly using the same database to search for patterns until one is found is called:

A)   data snooping.

B)   look-ahead bias.

C)   data mining.

D)   sample selection bias.

2.A scientist working for a pharmaceutical company tries many models using the same data before reporting the one that shows that the given drug has no serious side effects. The scientist is guilty of:

A)   look-ahead bias.

B)   time-period bias.

C)   data mining.

D)   sample selection bias.

3.An analyst has reviewed market data for returns from 1980–1990 extensively, searching for patterns in the returns. She has found that when the end of the month falls on a Saturday, there are usually positive returns on the following Thursday. She has engaged in:

A)   data snooping.

B)   data mining.

C)   biased selection.

D)   survivor engineering.

4.Which of the following is the best method to avoid data mining bias when testing a profitable trading strategy?

A)   Increase the sample size to at least 30 observations per year.

B)   Test the strategy on a different data set than the one used to develop the rules.

C)   Use a sample free of survivorship bias.

D)   Use wider confidence intervals to test for statistical significance.

5.A research paper that reports finding a profitable trading strategy without providing any discussion of an economic theory that makes predictions consistent with the empirical results is most likely evidence of:

A)   a non-normal population distribution.

B)   stratified random sampling.

C)   a sample that is not large enough.

D)   data mining.

The answers and explanations are as follows:

1. The practice of repeatedly using the same database to search for patterns until one is found is called:

A)   data snooping.

B)   look-ahead bias.

C)   data mining.

D)   sample selection bias.

The correct answer was C)

Data mining means analyzing the same data repeatedly until a pattern is detected, even though that pattern may not replicate in other data sets. It is also known as torturing the data until it confesses.

2.A scientist working for a pharmaceutical company tries many models using the same data before reporting the one that shows that the given drug has no serious side effects. The scientist is guilty of:

A)   look-ahead bias.

B)   time-period bias.

C)   data mining.

D)   sample selection bias.

The correct answer was C)

Data mining is the process where the same data is used with different methods until the desired results are obtained.

3.An analyst has reviewed market data for returns from 1980–1990 extensively, searching for patterns in the returns. She has found that when the end of the month falls on a Saturday, there are usually positive returns on the following Thursday. She has engaged in:

A)   data snooping.

B)   data mining.

C)   biased selection.

D)   survivor engineering.

The correct answer was B)

Data mining refers to the extensive review of the same database searching for patterns.

4.Which of the following is the best method to avoid data mining bias when testing a profitable trading strategy?

A)   Increase the sample size to at least 30 observations per year.

B)   Test the strategy on a different data set than the one used to develop the rules.

C)   Use a sample free of survivorship bias.

D)   Use wider confidence intervals to test for statistical significance.

The correct answer was B)

The best way to avoid data mining is to test a potentially profitable trading rule on a data set different from the one you used to develop the rule (out-of-sample data). A larger sample size won’t prevent data mining, and you can still data mine a database free of survivorship bias. Wider confidence intervals result from higher confidence levels, all else equal, and won’t prevent data mining.
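To make the out-of-sample idea concrete, here is a minimal Python sketch. Everything in it is an assumption for illustration: the returns are simulated noise (so no real pattern exists), and the "rules" are hypothetical calendar rules of the form "hold only on days where the day index modulo `cycle` equals `offset`". It mines the first half of the data for the best rule, then re-tests that rule on the second half.

```python
import random
import statistics

def rule_mean(returns, cycle, offset, start=0):
    """Mean return from holding only on days where day_index % cycle == offset.

    `start` is the calendar index of the first element of `returns`.
    """
    picked = [r for i, r in enumerate(returns, start) if i % cycle == offset]
    return statistics.mean(picked)

rng = random.Random(0)
# Hypothetical data: 1,000 daily returns that are pure noise.
returns = [rng.gauss(0.0, 0.01) for _ in range(1000)]
in_sample, out_of_sample = returns[:500], returns[500:]

# "Mine" the in-sample half: try every rule and keep the best performer.
rules = [(c, o) for c in range(2, 22) for o in range(c)]
best_rule = max(rules, key=lambda r: rule_mean(in_sample, *r))
best_in = rule_mean(in_sample, *best_rule)

# Re-test the chosen rule on data it was not developed on.
best_out = rule_mean(out_of_sample, *best_rule, start=500)

print(f"rules tried:               {len(rules)}")
print(f"best rule (cycle, offset): {best_rule}")
print(f"in-sample mean return:     {best_in:.5f}")
print(f"out-of-sample mean return: {best_out:.5f}")
```

Because the best rule is selected after looking at the in-sample data, its in-sample mean is biased upward by construction. The out-of-sample mean has no such selection effect, which is why answer B is the right safeguard.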

5.A research paper that reports finding a profitable trading strategy without providing any discussion of an economic theory that makes predictions consistent with the empirical results is most likely evidence of:

A)   a non-normal population distribution.

B)   stratified random sampling.

C)   a sample that is not large enough.

D)   data mining.

The correct answer was D)

Data mining occurs when the analyst continually uses the same database to search for patterns or trading rules until he finds one that works. If you are reading research that suggests a profitable trading strategy, make sure you heed the following warning signs of data mining:
Evidence that the author used many variables (most unreported) until he found ones that were significant.
The lack of any economic theory that is consistent with the empirical results.
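
The first warning sign can be illustrated with a small simulation. This is only a sketch under stated assumptions: the target and all 100 candidate variables are independent random noise, and the helper names (`pearson_r`, `t_stat`) and the critical value of about 1.984 (two-tailed 5% level, 98 degrees of freedom) are choices made for this example.

```python
import math
import random

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def t_stat(r, n):
    """t-statistic for testing that the true correlation is zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

rng = random.Random(42)
n_obs, n_vars = 100, 100
# Hypothetical target series: pure noise.
target = [rng.gauss(0, 1) for _ in range(n_obs)]

# Test many unrelated candidate variables against the same target.
t_stats = []
for _ in range(n_vars):
    candidate = [rng.gauss(0, 1) for _ in range(n_obs)]
    t_stats.append(t_stat(pearson_r(candidate, target), n_obs))

CRITICAL_T = 1.984  # approximate two-tailed 5% critical value, 98 df
significant = [t for t in t_stats if abs(t) > CRITICAL_T]
print(f"variables tested: {n_vars}")
print(f"'significant' at the 5% level: {len(significant)}")
```

At a 5% significance level, roughly 5 of every 100 unrelated variables would be expected to look significant by chance alone. A paper that reports only the winners, without saying how many variables were tried, hides exactly this effect.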