
PCA/Model Questions

Anyone here who is familiar with principal component analysis (PCA) or interest rate/credit models might be able to help me out with some questions.

1) Say you have log excess returns for a bunch of stocks, plus the log risk-free rate of return, in a matrix that you perform PCA on. Since PCA ranks factors by how much variance they explain, does that mean the log risk-free return would normally not be one of the more important factors? If that's the case, what's the normal procedure in practice? Model the risk-free rate separately?

2) Let's say you have the above, plus a bunch of bond yields (say, changes in the government curve plus changes in YTMs for a bunch of corporate bonds). I would guess this PCA would pull out factors correlated with the market, two important government yields, and a corporate bond risk factor. There might be some non-linearity when trying to explain corporate bond returns, but PCA assumes linearity. If I shouldn't include these in the PCA, any idea what I should do instead?

3) In Matlab, the coefficients from the pca function do not sum to 1. This means that the factors you produce might be highly correlated with an individual security, but they probably have much higher variance. So for instance, the interest rate factor might be highly correlated with one of the yields, but have quite a different variance. Is there any advantage to making the coefficients sum to 1 so that the factors more closely reflect the underlying securities they are highly correlated with?

1 - Depends what you're trying to do with the result of the PCA. If you're trying to cluster your securities based on their exposures to the different principal components, then you don't need to worry about the risk-free rate at all. If it's important, it will come out as one of the dominant eigenvectors.
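
To make the variance point concrete, here is a minimal sketch in plain Python (standard library only) of covariance-based PCA on made-up data. Everything in it is an assumption for illustration: the three synthetic stocks, their betas, the noise levels, and the nearly constant risk-free series. It uses power iteration to find only the leading eigenvector; in practice you would use a proper eigen-decomposition routine.

```python
import math
import random

def covariance_matrix(columns):
    """Sample covariance matrix of a list of equal-length data columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    centered = [[x - m for x in c] for c, m in zip(columns, means)]
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in centered]
            for ci in centered]

def leading_eigenvector(matrix, iterations=500):
    """Power iteration on a symmetric PSD matrix: (eigenvalue, unit eigenvector)."""
    k = len(matrix)
    v = [1.0 / math.sqrt(k)] * k
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(k)) for row in matrix]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient v' M v gives the eigenvalue for the unit vector v
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(k))
                     for i in range(k))
    return eigenvalue, v

random.seed(0)
n_obs = 500
# Hypothetical common market factor
market = [random.gauss(0.0, 0.01) for _ in range(n_obs)]
# Three hypothetical stocks: market exposure plus idiosyncratic noise
stocks = [[beta * m + random.gauss(0.0, 0.005) for m in market]
          for beta in (0.8, 1.0, 1.2)]
# Hypothetical risk-free series: almost constant, so its variance is tiny
risk_free = [0.0001 + random.gauss(0.0, 0.00002) for _ in range(n_obs)]

cov = covariance_matrix(stocks + [risk_free])
eigenvalue, loadings = leading_eigenvector(cov)

total_variance = sum(cov[i][i] for i in range(len(cov)))
explained_share = eigenvalue / total_variance  # share explained by first component
rf_loading = abs(loadings[-1])                 # risk-free weight in first component

print("share of variance explained by PC1:", round(explained_share, 3))
print("loadings on PC1:", [round(x, 4) for x in loadings])
```

In this setup the first component is essentially the market factor and the risk-free series gets a negligible loading, simply because it contributes almost nothing to total variance. Whether that matters depends on what the components are used for, as noted above.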

2 - If the linearity in the PCA is troubling you, then use ICA (independent component analysis). That's not a function in Matlab, so you'll have to program it, but it's pretty easy to program (a day at most). ICA does not rank components by the variance they explain, but rather looks at maximizing independence, using kurtosis as a measure of independence.
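
As a rough illustration of the kurtosis idea, here is a toy two-signal sketch in plain Python. It is not a full ICA implementation. It assumes the data are already whitened (here the two made-up sources are independent with unit variance and are mixed by a pure rotation), so the only thing left to estimate is a rotation angle, which is found by a simple grid search that maximizes the absolute excess kurtosis of the outputs. The sources, the 30 degree mixing angle, and the function names are all hypothetical. Note that, like PCA, this still builds linear combinations of the inputs; what changes is the criterion used to choose them.

```python
import math
import random

def excess_kurtosis(xs):
    """Excess kurtosis: fourth standardized moment minus 3 (zero for a Gaussian)."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

def rotate(xs, ys, angle):
    """Rotate the 2-D points (x, y) counter-clockwise by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return ([c * x - s * y for x, y in zip(xs, ys)],
            [s * x + c * y for x, y in zip(xs, ys)])

random.seed(2)
n_obs = 5000
half_width = math.sqrt(3.0)  # uniform on [-sqrt(3), sqrt(3)] has unit variance
# Two hypothetical independent, non-Gaussian sources
s1 = [random.uniform(-half_width, half_width) for _ in range(n_obs)]
s2 = [random.uniform(-half_width, half_width) for _ in range(n_obs)]

# Observed signals: the sources mixed by an (assumed) 30 degree rotation
true_angle = math.radians(30)
x1, x2 = rotate(s1, s2, true_angle)

def contrast(angle):
    """Total absolute excess kurtosis after undoing a candidate rotation."""
    y1, y2 = rotate(x1, x2, -angle)
    return abs(excess_kurtosis(y1)) + abs(excess_kurtosis(y2))

# Grid search over 0..89 degrees; the solution is only defined up to
# sign and ordering of the components, so 90 degrees of range is enough
candidates = [math.radians(d) for d in range(0, 90)]
best_angle = max(candidates, key=contrast)
estimated_degrees = math.degrees(best_angle)

print("estimated mixing angle (degrees):", round(estimated_degrees, 1))
```

The mixed signals are closer to Gaussian than the sources (their excess kurtosis is nearer zero), so pushing kurtosis away from zero pulls the rotation back toward the original sources. Real data would also need a centering and whitening step before the rotation search.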

3 - There is no value in having the coefficients sum to 1, assuming (as I said earlier) that the PCA is not your end point and that you're using it as a way to cluster your universe.
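
A quick way to see why the rescaling does not matter for correlations: multiplying all the coefficients by the same positive constant changes the factor's variance but not its correlation with any security. The sketch below, in plain Python, uses three made-up yield-change series and hypothetical equal loadings of unit length (sum of squares equal to 1, the usual PCA convention), then rescales them to sum to 1. All data and weights are assumptions for illustration.

```python
import math
import random

def variance(xs):
    """Sample variance."""
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

def correlation(xs, ys):
    """Pearson correlation coefficient."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mx) ** 2 for x in xs)
                           * sum((y - my) ** 2 for y in ys))

random.seed(1)
n_obs = 300
# Hypothetical common "level" move plus a little noise for each yield
level = [random.gauss(0.0, 0.05) for _ in range(n_obs)]
yields = [[lvl + random.gauss(0.0, 0.01) for lvl in level] for _ in range(3)]

# Hypothetical loadings with unit length: sum of squares is 1, sum is sqrt(3)
unit_norm = [1.0 / math.sqrt(3.0)] * 3
# The same loadings rescaled so that they sum to 1
sum_to_one = [w / sum(unit_norm) for w in unit_norm]

def factor(weights):
    """Factor time series: weighted sum of the yield-change series."""
    return [sum(w * series[t] for w, series in zip(weights, yields))
            for t in range(n_obs)]

f_unit = factor(unit_norm)
f_sum1 = factor(sum_to_one)

corr_unit = correlation(f_unit, yields[0])
corr_sum1 = correlation(f_sum1, yields[0])
var_ratio_unit = variance(f_unit) / variance(yields[0])
var_ratio_sum1 = variance(f_sum1) / variance(yields[0])

print("correlation with first yield:", round(corr_unit, 4), round(corr_sum1, 4))
print("variance relative to first yield:",
      round(var_ratio_unit, 2), round(var_ratio_sum1, 2))
```

The two correlations are identical, while only the sum-to-one version has a variance comparable to a single yield. So the rescaling only buys interpretability of the factor's scale. It also breaks down when the coefficients of a component sum to something close to zero, as can happen for slope-type components with mixed signs.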

I'm not sure what you're doing, but this whole PCA thing has been beaten to death since the mid-80s, so unless you have access to some new data, don't waste too much time on it.
