返回列表 发帖

Excel question, deciles of various number of rows

Here is what I am trying to do. I have columns of data, each of varying numbers of rows, that I want to calculate decile averages for. One column could have three hundred rows, which would mean 10 deciles of 30 rows each. I could then manually create an average formula for each of the ten deciles.

The problem is I have thousands of such columns of data, with different numbers of rows. It will take way too long to manually average the deciles. If they were all 30 rows to a decile it would be no problem.

So, does anyone know of a simple solution where one formula will be able to grab the average of the first decile (and 2 - 10 in subsequent formulas), while automatically accounting for whether there are 30 or 50 or whatever rows in that decile. Then I could just copy that formula for the other columns.

Thanks for the help, I think I have a decent idea of how I am going to do this

TOP

I just saw this query,
Actually you can use function 'large'

so you insert one column at the left and 51 (1/10 of the maximum rows +1) rows at the top of your data

A2:A51 = 1 to 50
b1 = =COUNT(B52:B7001),
b2 = =IF($A2<B$1/10,LARGE(B$52:B$597,$A2),"")
drag this to row 51
drag b1:b51 all the way to the right

TOP

Oh, that's pretty close to jmh's solution. We both agree that excel is not really suited to this.

TOP

If you have thousands of columns, then excel is clearly not your tool. Even if you had a simple macro that could calculate it for a row, you'd still have to run it on thousands of rows. You could, I suppose, loop through the rows, but you'd still have to analyze a thousand different columns of rankings, which would be 1000 x 10 deciles or 10,000.

Excel just doesn't seem like the appropriate tool.


If you have 30 columns of data, then maybe you can do it. 100 at the outside. The RANK function might come in handy here. TRUNC(10* (RANK(*,*) / COUNT(*))), should get you the decile for each entry. Depending on how the function is implemented, you might have to add 1.

A wrinkle might come in with how you've coded missing data and how rank counts blank entries.

I would make a separate worksheet that gives the decile for each data point in the corresponding master data sheet. You can then get the averages either by doing a SUMIF / COUNTIF equation, or perhaps by running pivot tables.


This is a project more suited to R or Matlab, though.

TOP

Use something besides excel.

If that's not acceptable, you could do something like this:
1) Create a new sheet just for the decile information
2) First column put 1 through 10, for each decile
3) You need to count the number of real entries in each column and then calculate the edges of each decile, so for the first one and decile 1, it would be something like, in cell B1, =small(sheet1!A:A,COUNT(A:A)*$A1/10)
4) after you copy this over, create another sheet (not really necessary), you can put in the 1-10, but not needed
5) in cell B1 (if you have the 1-10 in), type =sumif(sheet1!A:A,"<="&sheet2:B1)
for B2, you could need to do the same thing but subtract what is above
6) You need another sheet to do a similar calculation but for the counts, (replace sumif with countif)
7) Divide the sums from the counts. You might be able to simplify this so you just need one averageif, but I can't say for sure since I don't recall if the logical arguments work with more than one statement.

Should come out to what you want.

TOP

Uh, so if I understand correctly, if column A has 1000 rows, you need ten numbers? That is, the average of the first decile, average of the second decile, ... etc.? Is the data sorted or non-sorted? If it's unsorted, do you want to organize deciles by value or by the position in the column?

In any case, it seems like you should just write a macro to do this. Do you know any VBA?

TOP

Right here bro:

=IF(ISERROR(SUM(OFFSET(A$1,(ROW(A1)-1)*10,,10,)))/10,"",SUM(OFFSET(A$1,(ROW(A1)-1)*10,,10,))/10)

If the columns are on a sheet starting at A1, just paste this formula beside the columns and drag it to make it apply to all columns

And boom goes the dynamite

TOP

Your repeated use of the word "deciles" is impressive. No idea what the question is, but I'm impressed none the less.

TOP

返回列表