Excel Cheat Sheet
January 28, 2017 | Author: jmclaug502 | Category: N/A
Is A Particular Word Contained In A Text String? Category: Formulas / General VBA | [Item URL]
Here's a VBA function that might be useful in some situations. The ExactWordInString function returns True if a specified word is contained in a text string. You might think that this function is just a variation on Excel's FIND function or VBA's Instr function. There's a subtle difference: the ExactWordInString function looks for a complete word, not text that might be part of a different word. The examples in the accompanying figure should clarify how this function works. Cell C2 contains this formula, which was copied to the cells below:
=ExactWordInString(A2,B2)
The function identifies the complete word trapped, but not the word trap, which is part of trapped. Also, note that a space is not required after a word in order to identify it as a word. For example, the word can be followed by a punctuation mark. The function, listed below, modifies the first argument (Text) and replaces all non-alpha characters with a space character. It then adds a leading and trailing space to both arguments. Finally, it uses the Instr function to determine if the modified Word argument is present in the modified Text argument. To use this function in a formula, just copy and paste it to a VBA module in your workbook.
Function ExactWordInString(Text As String, Word As String) As Boolean
'   Returns TRUE if Word is contained in Text as an exact word match
    Dim i As Long
    Const Space As String = " "
    Text = UCase(Text)
'   Replace non-text characters with a space
    For i = 0 To 64
        Text = Replace(Text, Chr(i), Space)
    Next i
    For i = 91 To 255
        Text = Replace(Text, Chr(i), Space)
    Next i
'   Add initial and final space to Text & Word
    Text = Space & Text & Space
    Word = UCase(Space & Word & Space)
    ExactWordInString = InStr(Text, Word) <> 0
End Function
* Update * Excel MVP Rick Rothstein sent me a much simpler function that produces the same result. In fact, it uses just one statement:
Function ExactWordInString(Text As String, Word As String) As Boolean
    ExactWordInString = " " & UCase(Text) & " " Like "*[!A-Z]" & UCase(Word) & "[!A-Z]*"
End Function
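For readers who want to check the whole-word logic outside Excel, here is a minimal Python sketch of the same idea as the one-statement version: pad the text with spaces, then require a character other than A-Z on each side of the word. The name exact_word_in_string is hypothetical. One deliberate difference: the sketch escapes the word, so characters such as * or ? in it are matched literally rather than treated as wildcards.

```python
import re


def exact_word_in_string(text: str, word: str) -> bool:
    """Return True if word occurs in text as a complete word (case-insensitive)."""
    # Pad with spaces so a word at the very start or end still has neighbours.
    padded = " " + text.upper() + " "
    # Require a character other than A-Z immediately before and after the word.
    pattern = "[^A-Z]" + re.escape(word.upper()) + "[^A-Z]"
    return re.search(pattern, padded) is not None


print(exact_word_in_string("I was trapped.", "trapped"))
print(exact_word_in_string("I was trapped.", "trap"))
```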
Formulas To Perform Day Of Month Calculations Category: Formulas | [Item URL]
Many events are scheduled for a particular occurrence of the day within a month. For example, payday might be the last Friday of every month. Or, a meeting might be scheduled for every second Monday of the month. Excel doesn't have a function that can calculate these types of dates, but it's possible to create a formula. In the figure below, the formula in cell D4 calculates the date based on the parameters in column C. As the formula shows, C3 holds the year, C4 the month, C5 the weekday number, and C6 the occurrence. The formula in D4 is:
=DATE(C3,C4,1+((C6-(C5>=WEEKDAY(DATE(C3,C4,1))))*7)+(C5-WEEKDAY(DATE(C3,C4,1))))
This formula is not always accurate, however. If you specify a day number that doesn't exist (for example, the 6th Friday), it returns a date in the following month. Cell D6 contains a modified formula that displays "(none)" if the date isn't in the month specified. This formula is much longer:
=IF(MONTH(DATE(C3,C4,1+((C6-(C5>=WEEKDAY(DATE(C3,C4,1))))*7)+(C5-WEEKDAY(DATE(C3,C4,1)))))<>C4,"(none)",DATE(C3,C4,1+((C6-(C5>=WEEKDAY(DATE(C3,C4,1))))*7)+(C5-WEEKDAY(DATE(C3,C4,1)))))
In some cases, you might need to determine the last occurrence of a day in a particular month. This calculation requires a different formula (refer to the figure below):
=DATE(C9,C10+1,1)-1+IF(C11>WEEKDAY(DATE(C9,C10+1,1)-1), C11-WEEKDAY(DATE(C9,C10+1,1)-1)-7,C11-WEEKDAY(DATE(C9,C10+1,1)-1))
In this figure, the formula in cell D10 displays the date of the last Friday in March, 2008. The download file for this tip contains another example that has an easy-to-use interface. The user can select the parameters from drop-down lists. The megaformula in the Calculated Date column is very complex because it needs to convert words into values.
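The arithmetic in these formulas can be checked outside Excel. The Python sketch below mirrors both calculations, assuming Excel's default WEEKDAY numbering (1 = Sunday through 7 = Saturday). The function names are hypothetical, and returning None stands in for the "(none)" result.

```python
from datetime import date, timedelta
from typing import Optional


def excel_weekday(d: date) -> int:
    """Excel's default WEEKDAY numbering: 1 = Sunday ... 7 = Saturday."""
    return d.isoweekday() % 7 + 1


def nth_weekday(year: int, month: int, dow: int, n: int) -> Optional[date]:
    """Date of the n-th occurrence of weekday dow, or None if it is not in the month."""
    first = date(year, month, 1)
    wd1 = excel_weekday(first)
    # Same arithmetic as the DATE(...) formula: day number within the month.
    day = 1 + (n - (1 if dow >= wd1 else 0)) * 7 + (dow - wd1)
    result = first + timedelta(days=day - 1)
    if result.year == year and result.month == month:
        return result
    return None


def last_weekday(year: int, month: int, dow: int) -> date:
    """Date of the last occurrence of weekday dow in the month."""
    first_next = date(year + 1, 1, 1) if month == 12 else date(year, month + 1, 1)
    last = first_next - timedelta(days=1)
    wd = excel_weekday(last)
    # Step back from the last day of the month to the requested weekday.
    shift = dow - wd - 7 if dow > wd else dow - wd
    return last + timedelta(days=shift)


print(nth_weekday(2008, 1, 2, 2))   # second Monday of January 2008
print(last_weekday(2008, 3, 6))     # last Friday of March 2008
```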
Making An Exact Copy Of A Range Of Formulas, Take 2 Category: General / Formulas | [Item URL]
When you copy a range of formulas and paste them to a new location, Excel adjusts the cell references automatically. Most of the time, this is exactly what you want. Consider this simple formula:
=SUM(A2:A13)
If you copy this formula and paste it to the next column, the references are adjusted and the pasted formula is: =SUM(B2:B13)
Making an exact copy of a single formula is easy: Press F2, highlight the formula, and press Ctrl+C to copy it as text. Then paste it to another cell. In some situations, however, you might need to make an exact copy of a range of formulas. In an older tip, I described a rather complicated way to do this. See Making An Exact Copy Of A Range Of Formulas. Matthew D. Healy saw that tip and shared another method, which uses Notepad. Here's how it works:
1. Put Excel in formula view mode. The easiest way to do this is to press Ctrl+` (that character is a "backwards apostrophe," and is usually on the same key that has the ~ (tilde)).
2. Select the range to copy.
3. Press Ctrl+C.
4. Start Windows Notepad.
5. Press Ctrl+V to paste the copied data into Notepad.
6. In Notepad, press Ctrl+A followed by Ctrl+C to copy the text.
7. Activate Excel and activate the upper left cell where you want to paste the formulas. Also make sure that the sheet you are copying to is in formula view mode.
8. Press Ctrl+V to paste.
9. Press Ctrl+` to toggle out of formula view mode.
Note: If the paste operation back to Excel doesn't work correctly, chances are that you've used Excel's Text-to-Columns feature recently, and Excel is trying to be helpful by remembering how you last parsed your data. You need to fire up the Convert Text to Columns Wizard. Choose the Delimited option and click Next. Clear all of the Delimiter option checkmarks except Tab.
Calculating Easter Category: Formulas | [Item URL]
Easter is one of the most difficult holidays to calculate. Several years ago, a Web site had a contest to see who could come up with the best formula to calculate the date of Easter for any year. Here's one of the formulas submitted (it assumes that cell A1 contains a year): =DOLLAR(("4/"&A1)/7+MOD(19*MOD(A1,19)-7,30)*14%,)*7-6
Just for fun, I calculated the date of Easter for 300 years from 1900 through 2199. Then I created a pivot table, and grouped the dates by day. And then, a pivot chart:
During this 300-year period, the most common date for Easter is March 31 (it occurs 13 times on that date). The least common is March 24 (only one occurrence). I also learned that the next time Easter falls on April Fool's Day will be in 2018.
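As an independent cross-check on Easter dates, here is a Python sketch of the widely published "anonymous Gregorian" computus (often credited to Meeus, Jones and Butcher). It is a different technique from the DOLLAR-based worksheet formula above, not a translation of it, and the function name easter is hypothetical.

```python
from datetime import date


def easter(year: int) -> date:
    """Gregorian Easter Sunday via the anonymous Gregorian algorithm."""
    a = year % 19
    b, c = divmod(year, 100)
    d, e = divmod(b, 4)
    f = (b + 8) // 25
    g = (b - f + 1) // 3
    h = (19 * a + b - d - g + 15) % 30
    i, k = divmod(c, 4)
    l = (32 + 2 * e + 2 * i - h - k) % 7
    m = (a + 11 * h + 22 * l) // 451
    month, day = divmod(h + l - 7 * m + 114, 31)
    return date(year, month, day + 1)


print(easter(2018))
print(easter(2008))
```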
Converting Unix Timestamps Category: Formulas | [Item URL]
If you import data you might encounter time values stored as Unix timestamps. Unix time is defined as the number of seconds since midnight (GMT) on January 1, 1970, also known as the Unix epoch. For example, here's the Unix timestamp for August 4, 2008 at 10:19:08 pm (GMT):
1217888348
To create an Excel formula to convert a Unix timestamp to a readable date and time, start by converting the seconds to days. This formula assumes that the Unix timestamp is in cell A1:
=(((A1/60)/60)/24)
Then, you need to add the result to the date value for January 1, 1970. The modified formula is: =(((A1/60)/60)/24)+DATE(1970,1,1)
Finally, you need to adjust the formula for the GMT offset. For example, if you're in New York the GMT offset is -5 (during standard time). Therefore, the final formula is: =(((A1/60)/60)/24)+DATE(1970,1,1)+(-5/24)
A simpler (but much less clear) formula that returns the same result is: =(A1/86400)+25569+(-5/24)
Both of these formulas return a date/time serial number, so you need to apply a number format to make it readable as a date and time.
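The conversion can be verified outside Excel. This Python sketch applies the same arithmetic as the shorter formula, assuming the usual convention that serial 0 corresponds to December 30, 1899 (so 25569 is January 1, 1970). The names unix_to_excel_serial and serial_to_datetime are hypothetical.

```python
from datetime import datetime, timedelta

EXCEL_EPOCH = datetime(1899, 12, 30)   # date that corresponds to serial 0
UNIX_EPOCH_SERIAL = 25569              # serial number of January 1, 1970


def unix_to_excel_serial(timestamp: float, gmt_offset_hours: float = 0.0) -> float:
    """Same arithmetic as =(A1/86400)+25569+(offset/24)."""
    return timestamp / 86400 + UNIX_EPOCH_SERIAL + gmt_offset_hours / 24


def serial_to_datetime(serial: float) -> datetime:
    """Turn a date/time serial number back into a readable datetime."""
    return EXCEL_EPOCH + timedelta(days=serial)


serial = unix_to_excel_serial(1217888348)
print(serial, serial_to_datetime(serial))
```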
Naming Techniques
Most Excel users know how to name cells and ranges. Using named cells and ranges can make your formulas more readable, and less prone to errors. Most users, however, don't realize that Excel lets you provide names for other types of items. This document describes some useful naming techniques that you may not be aware of.
Naming a constant
If formulas in your worksheet use a constant value (such as an interest rate), the common procedure is to insert the value for the constant into a cell. Then, if you give a name to the cell (such as InterestRate), you can use the name in your formulas. Here's how to create a named constant that doesn't appear in a cell:
1. Select the Insert Name Define command to display the Define Name dialog box.
2. Enter the name (such as InterestRate) in the field labeled Names in workbook.
3. Enter the value for the name in the Refers to field (this field normally holds a formula). For example, you can enter =.075.
4. Click OK.
Try it out by entering the name into a cell (preceded by an equal sign). For example, if you defined a name called InterestRate, enter the following into a cell:
=InterestRate
This formula returns the constant value that you defined for the InterestRate name, and this value does not appear in any cell.
Names are actually named formulas
Here's another way of looking at names. Whenever you create a name, Excel actually creates a name for a formula. For example, if you give a name (such as Amount) to cell D4, Excel creates a name for this formula:
=$D$4
You can use the Define Name dialog box and edit the formula for a name. And you can use all of the standard operators and worksheet functions. Try this:
1. Create a name for cell D4. Call it Amount.
2. Enter =Amount into any cell. The cell will display the value in cell D4.
3. Use the Insert Name Define command and edit the Refers to field so it appears as =$D$4*2
You'll find that entering =Amount now displays the value in cell D4 multiplied by 2.
Using relative references
When you create a name for a cell or range, Excel always uses absolute cell references for the range. For example, if you give the name Months to range A1:A12, Excel associates $A$1:$A$12 (an absolute reference) with the name Months. You can override the absolute references for a name and enter relative references. To see how this works, follow the steps below to create a relative name called CellBelow.
1. Select cell A1.
2. Select the Insert Name Define command to display the Define Name dialog box.
3. Enter the name CellBelow in the field labeled Names in workbook.
4. Replace the value in the Refers to field with =A2 (this is a relative reference).
5. Click OK.
Try it out by entering the following formula into any cell:
=CellBelow
You'll find that this formula always returns the contents of the cell directly below. NOTE: It's important to understand that the formula you enter in Step 4 above depends on the active cell. Since cell A1 was the active cell, =A2 is the formula that returns the cell below. If, for example, cell C6 was the active cell when you created the name, you would enter =C7 in Step 4.
Using mixed references
You can also use "mixed" references for your names. Here's a practical example of how to create a name that uses mixed references. This name, SumAbove, is a formula that returns the sum of all values above the cell.
1. Activate cell A3.
2. Select the Insert Name Define command to display the Define Name dialog box.
3. In the Names in workbook field, enter SumAbove.
4. In the Refers to field, enter =SUM(A$1:A2)
Notice that the formula in Step 4 is a mixed reference (the row part is absolute, but the column part is relative). Try it out by entering =SumAbove into any cell. You'll find that this formula returns the sum of all cells in the column from Row 1 to the row directly above the cell.
Creating A List Of Formulas
Most users have discovered that Excel has an option that lets you display formulas directly in their cells: Choose Tools Options, click the View tab, and select the Formulas checkbox. However, Excel doesn't provide a way to generate a concise list of all formulas in a worksheet. The VBA macro below inserts a new worksheet, then creates a list of all formulas and their current values. NOTE: My Power Utility Pak add-in includes a more sophisticated version of this subroutine, plus several other auditing tools.
To use this subroutine:
1. Copy the code below to a VBA module. You can also store it in your Personal Macro Workbook, or create an add-in.
2. Activate the worksheet that contains the formulas you want to list.
3. Execute the ListFormulas subroutine. The subroutine will insert a new worksheet that contains a list of the formulas and their values.
The ListFormulas Subroutine
Sub ListFormulas()
    Dim FormulaCells As Range, Cell As Range
    Dim FormulaSheet As Worksheet
    Dim Row As Long   ' Long rather than Integer, so long lists cannot overflow

'   Create a Range object for all formula cells
    On Error Resume Next
    Set FormulaCells = Range("A1").SpecialCells(xlFormulas, 23)

'   Exit if no formulas are found
    If FormulaCells Is Nothing Then
        MsgBox "No Formulas."
        Exit Sub
    End If

'   Add a new worksheet
    Application.ScreenUpdating = False
    Set FormulaSheet = ActiveWorkbook.Worksheets.Add
    FormulaSheet.Name = "Formulas in " & FormulaCells.Parent.Name

'   Set up the column headings
    With FormulaSheet
        .Range("A1") = "Address"
        .Range("B1") = "Formula"
        .Range("C1") = "Value"
        .Range("A1:C1").Font.Bold = True
    End With

'   Process each formula
    Row = 2
    For Each Cell In FormulaCells
        Application.StatusBar = Format((Row - 1) / FormulaCells.Count, "0%")
        With FormulaSheet
            .Cells(Row, 1) = Cell.Address _
                (RowAbsolute:=False, ColumnAbsolute:=False)
            .Cells(Row, 2) = " " & Cell.Formula
            .Cells(Row, 3) = Cell.Value
            Row = Row + 1
        End With
    Next Cell

'   Adjust column widths
    FormulaSheet.Columns("A:C").AutoFit
    Application.StatusBar = False
End Sub
Cell Counting Techniques
Excel provides many ways to count cells in a range that meet various criteria:
• The DCOUNT function. The data must be set up in a table, and a separate criterion range is required.
• The COUNT function. Simply counts the number of cells in a range that contain a number.
• The COUNTA function. Counts the number of non-empty cells in a range.
• The COUNTBLANK function. Counts the number of empty cells in a range.
• The COUNTIF function. Very flexible, but often not quite flexible enough.
• An array formula. Useful when the other techniques won't work.
Formula Examples
Listed below are some formula examples that demonstrate various counting techniques. These formulas all use a range named data.
To count the number of cells that contain a negative number:
=COUNTIF(data,"<0")
To count the number of cells that contain a value from 1 through 10:
=COUNTIF(data,">=1")-COUNTIF(data,">10")
To count the number of unique numeric values (ignores text entries): =SUM(IF(FREQUENCY(data,data)>0,1,0))
To count the number of cells that contain an error value (this is an array formula, entered with Ctrl+Shift+Enter): =SUM(IF(ISERR(data),1,0))
Using the formulas in VBA
You can also use these techniques in your VBA code. For example, the VBA statement below calculates the number of three-letter words in a range named data, and assigns the value to the NumWords variable:
NumWords = Application.COUNTIF(Sheets("Sheet1").Range("data"), "???")
The other formula examples listed above can also be converted to VBA.
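To make the counting logic concrete, here is a small Python sketch of a few of the techniques in this section: counting negatives, counting values from 1 through 10 as the difference of two counts, counting unique numeric values, and counting three-letter text entries. The sample list and the function names are hypothetical, chosen only for illustration.

```python
def numeric_values(data):
    """Keep only real numbers, the way the numeric counting formulas ignore text."""
    return [v for v in data if isinstance(v, (int, float)) and not isinstance(v, bool)]


def count_negative(data):
    return sum(1 for v in numeric_values(data) if v < 0)


def count_from_1_through_10(data):
    # Count of values >= 1 minus count of values > 10.
    nums = numeric_values(data)
    return sum(1 for v in nums if v >= 1) - sum(1 for v in nums if v > 10)


def count_unique_numeric(data):
    return len(set(numeric_values(data)))


def count_three_letter_words(data):
    # Plays the role of the "???" wildcard criterion.
    return sum(1 for v in data if isinstance(v, str) and len(v) == 3)


sample = [-3, 0, 1, 5, 5, 10, 12, "yes", "cat", "hello"]
print(count_negative(sample), count_from_1_through_10(sample),
      count_unique_numeric(sample), count_three_letter_words(sample))
```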
Summing And Counting Using Multiple Criteria
If you peruse the Excel newsgroups, you've probably realized that one of the most common questions involves summing or counting using multiple criteria. If your data is set up as a database table you can use database functions such as DCOUNT or DSUM. These functions, however, require the use of a separate criteria range on your worksheet. This tip provides a number of examples that should solve most of your counting and summing problems. Unlike DCOUNT and DSUM, these formulas don't require a criteria range. The example formulas presented in this tip use the simple database table shown below. You will need to adjust the formulas to account for your own data.
Sum of Sales, where Month="Jan"
This is a straightforward use of the SUMIF function (it uses a single criterion):
=SUMIF(A2:A10,"Jan",C2:C10)
Count of Sales, where Month="Jan"
This is a straightforward use of the COUNTIF function (single criterion):
=COUNTIF(A2:A10,"Jan")
Sum of Sales, where Month<>"Jan"
Another simple use of SUMIF (single criterion):
=SUMIF(A2:A10,"<>Jan",C2:C10)
Sum of Sales where Month="Jan" or "Feb"
For multiple OR criteria in the same field, use multiple SUMIF functions:
=SUMIF(A2:A10,"Jan",C2:C10)+SUMIF(A2:A10,"Feb",C2:C10)
Sum of Sales where Month="Jan" AND Region="North"
For multiple criteria in different fields, the SUMIF function doesn't work. However, you can use an array formula. When you enter this formula, use Ctrl+Shift+Enter:
=SUM((A2:A10="Jan")*(B2:B10="North")*C2:C10)
Sum of Sales where Month="Jan" AND Region<>"North"
Requires an array formula similar to the previous formula. When you enter this formula, use Ctrl+Shift+Enter:
=SUM((A2:A10="Jan")*(B2:B10<>"North")*C2:C10)
Count of Sales where Month="Jan" AND Region="North"
For multiple criteria in different fields, the COUNTIF function doesn't work. However, you can use an array formula. When you enter this formula, use Ctrl+Shift+Enter:
=SUM((A2:A10="Jan")*(B2:B10="North"))
Sum of Sales where Month="Jan" AND Sales>=200
Requires an array formula similar to the previous example. When you enter this formula, use Ctrl+Shift+Enter:
=SUM((A2:A10="Jan")*(C2:C10>=200)*(C2:C10))
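The array formulas above all rely on multiplying TRUE/FALSE tests so that only rows meeting every condition contribute. The Python sketch below shows that mechanism on a few made-up rows (the data and function names are hypothetical, not the table from the figure). Python, like Excel, treats True as 1 and False as 0 in multiplication.

```python
# Hypothetical (month, region, sales) rows standing in for columns A, B and C.
rows = [
    ("Jan", "North", 100),
    ("Jan", "South", 200),
    ("Feb", "North", 300),
    ("Jan", "North", 250),
]


def sum_jan_north(rows):
    # Mirrors =SUM((A="Jan")*(B="North")*C): each test contributes 1 or 0.
    return sum((m == "Jan") * (r == "North") * s for m, r, s in rows)


def count_jan_north(rows):
    # Mirrors =SUM((A="Jan")*(B="North")): the product is 1 only when both tests pass.
    return sum((m == "Jan") * (r == "North") for m, r, s in rows)


def sum_jan_at_least(rows, threshold):
    # Mirrors =SUM((A="Jan")*(C>=threshold)*(C)).
    return sum((m == "Jan") * (s >= threshold) * s for m, r, s in rows)


print(sum_jan_north(rows), count_jan_north(rows), sum_jan_at_least(rows, 200))
```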
Sum of Sales between 300 and 400
This also requires an array formula. When you enter this formula, use Ctrl+Shift+Enter:
=SUM((C2:C10>=300)*(C2:C10<=400)*(C2:C10))

...1),--('[Nowfal Rates.xls]RATES'!$K$11:$K$13))
returns the same result, but it will still work when the other workbook is closed and the sheet is recalculated, and can be initially entered referencing the closed workbook, without a #VALUE error.
The second major advantage is being able to handle text in numeric columns differently. Consider the following dataset, as shown in Table 2.
      A       B
1     Item    Number
2     x       1
3     y       2
4     x       3
Table 2.
If we are looking at rows 1:4, we can see that we have a text value in B1. In this case it is simply a heading row, but the principle applies to a text value in any row. Using SUMPRODUCT, we can either return an error, or ignore the text. This can be useful if we want to ignore errors, or if we want to trap the error (and presumably correct it later). Errors will be returned if we use this version
=SUMPRODUCT((A1:A4="x")*(B1:B4))
To ignore errors, use this amended version which uses the double unary operator (see SUMPRODUCT Explained below for details) =SUMPRODUCT(--(A1:A4="x"),(B1:B4))
And a third, most significant advantage, is that the conditional test range or the condition can be constructed in a huge number of ways to facilitate the requirement, such as LEFT(A1:A10), ISNUMBER(MATCH(A1:A10,{"apples","pears"},0)), or ISNUMBER(MATCH(K2:K30,ROW(INDIRECT(TODAY()&":"&TODAY()+10)),0))
But how does it work?
SUMPRODUCT Explained
Understanding how SUMPRODUCT works helps to determine where to use it, how to construct the formula, and thus how it can be extended. Table 3. below shows an example data set that we will use.
      A          B    C
9     Ford       B    3
10    Vauxhall   C    4
11    Ford       A    2
12    Ford       A    1
13    Ford       D    4
14    Ford       A    3
15    Ford       A    2
16    Renault    A    8
17    Ford       A    6
18    Ford       A    8
19    Ford       A    7
20    Ford       A    6
Table 3.
In this example, the problem is to find how many Fords with a category of "A" were sold. A9:A20 holds the make, B9:B20 has the category, and C9:C20 has the number sold. The formula to get this result is =SUMPRODUCT((A9:A20="Ford")*(B9:B20="A")*(C9:C20)).
The first part of the formula (A9:A20="Ford") checks the array of makes for a value of Ford. This returns an array of TRUE/FALSE, in this case it is
{TRUE,FALSE,TRUE,TRUE,TRUE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,TRUE}
Similarly, the categories are checked for the value A with (B9:B20="A"). Again, this returns an array of TRUE/FALSE, or
{FALSE,FALSE,TRUE,TRUE,FALSE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE,TRUE}
And finally, the numbers are not checked but taken as is, that is (C9:C20), which returns an array of numbers
{3,4,2,1,4,3,2,8,6,8,7,6}
So now we have three arrays, two of TRUE/FALSE values, one of numbers. This is shown in Table 4.
      A          B          C
9     TRUE   *   FALSE  *   3
10    FALSE  *   FALSE  *   4
11    TRUE   *   TRUE   *   2
12    TRUE   *   TRUE   *   1
13    TRUE   *   FALSE  *   4
14    TRUE   *   TRUE   *   3
15    TRUE   *   TRUE   *   2
16    FALSE  *   TRUE   *   8
17    TRUE   *   TRUE   *   6
18    TRUE   *   TRUE   *   8
19    TRUE   *   TRUE   *   7
20    TRUE   *   TRUE   *   6
Table 4.
And this is where it gets interesting. SUMPRODUCT usually works on arrays of numbers, but we have arrays of TRUE/FALSE values as well as an array of numbers. By using the '*' (multiply) operator, we can get numeric values that can be summed. '*' has the effect of coercing these two arrays into a single array of 1/0 values. Multiplying TRUE by TRUE returns 1 (try it, enter =TRUE*TRUE in a cell and see the result), any other combination returns 0. Therefore, when both conditions are satisfied, we get a 1, whereas if either or both conditions are not satisfied, we get a 0. Multiplying the first array of TRUE/FALSE values by the second array of TRUE/FALSE values returns a composite array of 1/0 values, or
{0,0,1,1,0,1,1,0,1,1,1,1}
This subsequent array of 1/0 values is then multiplied by the array of numbers sold to give a further array, an array of numbers sold that satisfy the two test conditions. SUMPRODUCT then sums the members of this array to give the total.
Table 4. shows the values that the conditional tests break down to before being acted upon by the '*' operator. Table 5. shows a virtual representation of those TRUE/FALSE values as their numerical equivalents of 1/0 and the individual multiplication results. From this, you should be able to see how SUMPRODUCT arrives at its result, namely 35.
      A       B       C
9     1   *   0   *   3   =   0
10    0   *   0   *   4   =   0
11    1   *   1   *   2   =   2
12    1   *   1   *   1   =   1
13    1   *   0   *   4   =   0
14    1   *   1   *   3   =   3
15    1   *   1   *   2   =   2
16    0   *   1   *   8   =   0
17    1   *   1   *   6   =   6
18    1   *   1   *   8   =   8
19    1   *   1   *   7   =   7
20    1   *   1   *   6   =   6
                      Total  35
Table 5.
Table 6. shows you the same virtual representation of 1/0 numerical values without the numbers sold column, that is using SUMPRODUCT to count the number of rows satisfying the two conditions, or
=SUMPRODUCT((A9:A20="Ford")*(B9:B20="A"))
      A       B
9     1   *   0   =   0
10    0   *   0   =   0
11    1   *   1   =   1
12    1   *   1   =   1
13    1   *   0   =   0
14    1   *   1   =   1
15    1   *   1   =   1
16    0   *   1   =   0
17    1   *   1   =   1
18    1   *   1   =   1
19    1   *   1   =   1
20    1   *   1   =   1
              Total   8
Table 6.
If you have been able to follow this explanation all of the way through, it may have occurred to you that although we are using the SUMPRODUCT function, the '*' operators have resolved the multiple arrays into a single composite array, leaving SUMPRODUCT to simply sum the members of that composite array, that is, there is no product. This is perfectly correct, and perfectly valid: SUMPRODUCT can work on a single array (put 1,2,3 in cells A1,A2,A3, and insert =SUMPRODUCT(A1:A3) in a cell, it returns 6 correctly).
In reality, we only need the '*' to coerce the arrays that are being tested for a particular condition, we do not need it for the array that is not subject to a conditional test. So we could also use =SUMPRODUCT((A9:A20="Ford")*(B9:B20="A"),(C9:C20)), which does use the product aspect (see more on this in the next section).
When using the SUMPRODUCT function, all arrays must be the same size, as corresponding members of each array are multiplied by each other.
When using the SUMPRODUCT function, no array can be a whole column (A:A), the array must be for a range within a column (although the best part of a column could be defined with A1:A65535 if so desired). Whole rows (1:1) are acceptable[3].
In a SUMPRODUCT function, the arrays being evaluated cannot be a mix of column and row ranges, they must all be columns, or all rows. However, the row data can be transposed to present it to SUMPRODUCT as columnar - see the Using TRANSPOSE to test against values in a column not row example.
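The walkthrough of Tables 3 to 6 can be reproduced step by step outside Excel. This Python sketch uses the Table 3 data and builds the same intermediate arrays. The variable names are hypothetical, and Python's True/False multiply to 1/0 just as the '*' operator coerces them in the worksheet.

```python
# Table 3 data: make (A9:A20), category (B9:B20), number sold (C9:C20).
makes = ["Ford", "Vauxhall", "Ford", "Ford", "Ford", "Ford",
         "Ford", "Renault", "Ford", "Ford", "Ford", "Ford"]
categories = ["B", "C", "A", "A", "D", "A", "A", "A", "A", "A", "A", "A"]
sold = [3, 4, 2, 1, 4, 3, 2, 8, 6, 8, 7, 6]

# The two conditional tests, each an array of True/False (Table 4).
is_ford = [make == "Ford" for make in makes]
is_cat_a = [cat == "A" for cat in categories]

# Multiplying the tests coerces them to a composite array of 1/0 values.
composite = [f * a for f, a in zip(is_ford, is_cat_a)]

# Table 5: multiply by the numbers sold and sum.
total_sold = sum(c * n for c, n in zip(composite, sold))

# Table 6: summing the composite array alone counts the matching rows.
matching_rows = sum(composite)

print(composite)
print(total_sold, matching_rows)
```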
Format of SUMPRODUCT
In the examples presented so far, the format has been
=SUMPRODUCT((array1=condition1)*(array2=condition2)*(array3))
As mentioned above, we could also use
=SUMPRODUCT((array1=condition1)*(array2=condition2),(array3))
which works as the '*' operator is only required to coerce the conditional arrays that resolve to TRUE/FALSE into numeric values. As it is the use of an arithmetic operator that coerces the TRUE/FALSE values to 1/0, we could use many different operators and achieve the same result. Thus, it is also possible to coerce each of the conditional arrays individually by multiplying them by 1,
=SUMPRODUCT((array1=condition1)*1,(array2=condition2)*1,(array3))
or
=SUMPRODUCT(1*(array1=condition1),1*(array2=condition2),(array3))
or by raising to the power of 1,
=SUMPRODUCT((array1=condition1)^1,(array2=condition2)^1,(array3))
or by adding 0,
=SUMPRODUCT((array1=condition1)+0,(array2=condition2)+0,(array3))
or
=SUMPRODUCT(0+(array1=condition1),0+(array2=condition2),(array3))
or even by using the N function,
=SUMPRODUCT(N(array1=condition1),N(array2=condition2),(array3))
These methods differ from the '*' operator in that they are applied to individual arrays, whereas '*' operates on two arrays. All of these methods work when there is more than one conditional array, so it is really a matter of preference as to which to use. If there is a single conditional array, then the '*' operator cannot be used (there are not two to multiply), so one of the other above methods has to be used. Yet another method is to use the double unary operator, --, in this way
=SUMPRODUCT(--(array1=condition1),--(array2=condition2),(array3))
The double unary operator also coerces the individual array(s), which then acts more akin to classic SUMPRODUCT. There has been much discussion that one way is faster than another, or is more of a 'standard' than another, but in reality there will be few instances where one method will gain a noticeable performance advantage over another, and as for standards, this is all new territory, and will mainly be used by people who have never been involved in using these standards, and who care even less. For me, I believe it is a matter of preference. Personally, I am being swayed to the double unary -- notation, because it avoids a function call, it works in all situations (the '*' operator won't work on a single array), and I don't like the '1*', '*1', '^1', or '+0' variations. So my preference is for
=SUMPRODUCT(--(array1=condition1),--(array2=condition2),(array3))
which also has more similarity to classic SUMPRODUCT.
There is one other variation which has been promoted recently, which is the single unary operator, '-', such as
=SUMPRODUCT(-(array1=condition1),-(array2=condition2),(array3))
but I would not encourage this as it has no real merit that I can see, and has to be paired off, otherwise it will return a negative result.
So, to sum up ... Tests like A=10 normally resolve to TRUE or FALSE, and any operator is only needed if you want to coerce an array of TRUE/FALSE values to 1/0 integers, such as
=SUMPRODUCT(--(B5:B1953=101))
SUMPRODUCT arrays are normally separated by the comma. So, to preserve this format, if you have multiple conditions, you can use the -- on both conditions like so =SUMPRODUCT(--(B5:B1953=101),--(C5:C1953=7))
But, if you simply multiply two arrays of TRUE/FALSE, that implicitly resolves to 1/0 values that are then summed, so you don't need the comma, and you could then use =SUMPRODUCT((B5:B1953=101)*(C5:C1953=7))
Any further, final, array of values can use the same operator, or could revert to comma. So your formula can be written as =SUMPRODUCT(--(B5:B1953=101),--(C5:C1953=7),(D5:D1953))
or
=SUMPRODUCT((B5:B1953=101)*(C5:C1953=7),(D5:D1953))
or
=SUMPRODUCT(--(B5:B1953=101),--(C5:C1953=7),--(D5:D1953))
or
=SUMPRODUCT((B5:B1953=101)*(C5:C1953=7)*(D5:D1953))
or
=SUMPRODUCT(--(B5:B1953=101),--(C5:C1953=7)*(D5:D1953))
If the result is the product of two conditions being multiplied, it is fine to multiply them together as this will coerce the True/False values to 1/0 values to allow the summing =SUMPRODUCT((condition1)*(condition2))
However, if there is only one condition, you can coerce to 1/0 with the double unary --
=SUMPRODUCT(--(condition1))
You could achieve this equally as well with =SUMPRODUCT((1*(condition1)))
and equally the first could be represented as =SUMPRODUCT(--(condition1),--(condition2))
There is no situation that I know of whereby a solution using -- could not be achieved somehow with a '*'. Conversely, if using the TRANSPOSE function within SUMPRODUCT, then the '*' has to be used.
So, as you can see there are a number of possibilities, and you make your own choice. I leave the final word to Harlan Grove, who once wrote this paragraph on why he prefers the double unary operator ...
....as I've written before, it's not the speed of double unary minuses I like, it's the fact that due to Excel's operator precedence it's harder to screw up double unary minuses with typos than it is to screw up the alternatives ^1, *1, +0. Also, since I read left to right, I prefer my number type coercions on the left rather than the right of my Boolean expressions, and -- looks nicer than 1* or 0+. Wrapping Boolean expressions inside N() is another alternative, possibly clearer, but it eats a nested function call level, so I don't use it.
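The coercion variants discussed in this section can be compared side by side with a small model. The Python sketch below is only an approximation: its hypothetical sumproduct function imitates the worksheet behaviour of treating uncoerced TRUE/FALSE entries as zero, which is why some arithmetic operation is needed in the first place. The sample values are made up.

```python
def sumproduct(*arrays):
    """Rough model of SUMPRODUCT: logical and text entries count as zero."""
    total = 0
    for row in zip(*arrays):
        product = 1
        for value in row:
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            product *= value if is_number else 0
        total += product
    return total


codes = [101, 5, 101, 101]
cond = [c == 101 for c in codes]          # array of True/False

uncoerced = sumproduct(cond)              # booleans ignored
double_unary = sumproduct([-(-c) for c in cond])
times_one = sumproduct([c * 1 for c in cond])
plus_zero = sumproduct([c + 0 for c in cond])
power_one = sumproduct([c ** 1 for c in cond])

# A single unary minus flips the sign unless it is paired with another one.
single_unary = sumproduct([-c for c in cond])
paired_unary = sumproduct([-c for c in cond], [-c for c in cond])

print(uncoerced, double_unary, times_one, plus_zero, power_one)
print(single_unary, paired_unary)
```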
Conditional Counting and Summing in VBA
All of the discussion so far has been about conditional formulae, that is directly within Excel worksheets. It is often necessary to count or sum conditionally some worksheet ranges within a VBA routine. In these instances, we could code a simple loop to go through all of the data and check if it matches the condition, summing the matching items as we go. Excel VBA has a method that allows a call out from VBA routines to a built-in worksheet function, saving ourselves having to build that functionality, and greatly improving the power of our VBA code. Whilst there is an overhead to calling an Excel function from within VBA, any performance impact should be minimal if not over-used, and the usefulness of this facility is clear.
We can utilise this facility to achieve conditional counting and summing in VBA with little effort, but there are a few things to be aware of. As an example, consider the data in Table 1. above. If we needed to know how many Fords were in the range A1:A10 from within a VBA procedure, we could simply use the following code
Dim mModel As String
Dim mCount As Long
mModel = "Ford"
mCount = Application.WorksheetFunction.CountIf( _
    Range("A1:A10"), mModel)
This will load the mCount variable with the number of Fords, 4 in this instance. Similarly, we can use SUMIF to calculate the value
Dim mModel As String
Dim mValue As Double
mModel = "Ford"
mValue = Application.WorksheetFunction.SumIf( _
    Range("A1:A10"), mModel, Range("C1:C10"))
This will load the mValue variable with the value of the Fords, 33873 in this instance.
The natural next step is to assume that we can extend this technique to our multiple condition test formulae discussed above. If we are using COUNTIFS and SUMIFS in Excel 2007 (see SUMPRODUCT and Excel 2007) then this is correct. For example, we can count how many Fords were sold in June using
Dim mModel As String
Dim mMonth As String
Dim mCount As Long
mModel = "Ford"
mMonth = "June"
mCount = Application.WorksheetFunction.CountIfs( _
    Range("A1:A10"), mModel, _
    Range("B1:B10"), mMonth)
We get a result of 3 here in our mCount variable.
Unfortunately, this technique cannot be extended to array formulae, or conditional testing SUMPRODUCT formulae. For example, a simple formula to count how many Fords were sold in Feb might be
=SUMPRODUCT((A2:A10="Ford")*(B2:B10="Feb"))
(none, as it happens), and you might think that we could use the following VBA to get the same result
Dim mModel As String
Dim mMonth As String
Dim mCount As Long
mModel = "Ford"
mMonth = "Feb"
mCount = Application.WorksheetFunction.Sumproduct( _
    Range("A1:A10") = mModel , Range("C1:C10") = mMonth))
This fails to compile, never mind getting the correct result. In this case, VBA is trying to make a simple call to the worksheet function, but when array formulae and this type of SUMPRODUCT formulae are resolved in Excel, each item within the array is resolved and then passed to the main function for SUMming, AVERAGEing, or whatever is being actioned. As VBA doesn't evaluate the ranges, it is not passing correct information to the worksheet function, so we get the error[4].
There is a solution to this problem, and that is to evaluate the function call within VBA, using the VBA Evaluate method, which converts a Microsoft Excel name to an object or a value. The code here is
    Dim mModel As String
    Dim mMonth As String
    Dim mFormula As String
    Dim mCount As Long
    mModel = "Ford"
    mMonth = "Feb"
    mFormula = "SUMPRODUCT((A1:A10=""" & mModel & _
        """)*(B1:B10=""" & mMonth & """))"
    mCount = Application.Evaluate(mFormula)
Although there is more effort required to ensure that the syntax of the function call is properly constructed, and that strings tested against are properly formed with quotes around them[5], it is still a useful technique to have, and provides the capability to use SUMPRODUCT (and by association, array formulae) within VBA.
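The quoting is the part that is easiest to get wrong. As a rough illustration only, the Python sketch below (not VBA; the helper names quote_for_excel and build_sumproduct_formula are invented for this example) builds the same formula text that would be handed to Evaluate, and doubles any quote characters inside a value, which is how a quote is written inside an Excel string literal.

```python
def quote_for_excel(value: str) -> str:
    """Wrap a value in double quotes, doubling any quotes inside it."""
    return '"' + value.replace('"', '""') + '"'


def build_sumproduct_formula(model: str, month: str) -> str:
    """Build the formula text for a two-condition SUMPRODUCT count."""
    return (
        f"SUMPRODUCT((A1:A10={quote_for_excel(model)})"
        f"*(B1:B10={quote_for_excel(month)}))"
    )


formula = build_sumproduct_formula("Ford", "Feb")
print(formula)  # SUMPRODUCT((A1:A10="Ford")*(B1:B10="Feb"))
```

Keeping the quoting in one small helper means a value that itself contains a quote character does not silently produce a malformed formula string.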
SUMPRODUCT and Excel 2007
When Microsoft introduced Excel 2007, the main focus was on ease of use, and improved business analysis functionality. Unfortunately, the worksheet functions did not get much attention, but there were a few new functions. Two of the new functions, COUNTIFS and SUMIFS, support multiple conditional tests. For instance, in our previous examples,

    =SUMPRODUCT((A1:A10="Ford")*(B1:B10="June"))
    =SUMPRODUCT((A1:A10="Ford")*(B1:B10="June")*(C1:C10))

the first counts those items where A1:A10 = Ford AND B1:B10 = June, and the second sums C1:C10 for the items where A1:A10 = Ford AND B1:B10 = June. In Excel 2007, COUNTIFS and SUMIFS can be used in place of SUMPRODUCT. The Excel 2007 formulae would be

    =COUNTIFS(A1:A10,"Ford",B1:B10,"June")
    =SUMIFS(C1:C10,A1:A10,"Ford",B1:B10,"June")
A further improvement is that in Excel 2007, SUMPRODUCT can address a whole column, which is a helpful change. So, with Excel 2007 supporting multiple conditional tests, does this mean that the special use of SUMPRODUCT is now redundant, and that it is relegated to its original, simple array multiplication role? Whilst this may seem to be the case at first sight, a little thought shows that SUMPRODUCT retains its unique position in the Excel developer's toolkit. Why? Because COUNTIFS and SUMIFS are still unable to calculate values in closed workbooks, just as their predecessors could not; and the Excel 2007 functions are still not able to accommodate the complex extra functions that can be added to the conditional ranges in SUMPRODUCT.
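To see why the two styles give the same answer for simple equality tests, here is a small Python model of the calculation (Python, not Excel; the rows are made-up sample data, not the data behind the figures quoted in this article). The SUMPRODUCT style multiplies the 0/1 results of each test and sums the products, while the COUNTIFS/SUMIFS style simply filters on every condition.

```python
# Hypothetical rows of (model, month, value); not the article's data.
sales = [
    ("Ford", "June", 100),
    ("Ford", "Feb", 250),
    ("Vauxhall", "June", 300),
    ("Ford", "June", 50),
]

# =SUMPRODUCT((A="Ford")*(B="June")): multiply the test results, then sum.
count_sumproduct = sum((m == "Ford") * (mo == "June") for m, mo, _ in sales)

# =COUNTIFS(A,"Ford",B,"June"): count the rows that pass every test.
count_countifs = sum(1 for m, mo, _ in sales if m == "Ford" and mo == "June")

# =SUMPRODUCT((A="Ford")*(B="June")*(C)) versus =SUMIFS(C,A,"Ford",B,"June")
sum_sumproduct = sum((m == "Ford") * (mo == "June") * v for m, mo, v in sales)
sum_sumifs = sum(v for m, mo, v in sales if m == "Ford" and mo == "June")

print(count_sumproduct, count_countifs)  # 2 2
print(sum_sumproduct, sum_sumifs)        # 150 150
```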
Performance Considerations Double Unary v * Operator
In most circumstances, either the '*' or -- versions of SUMPRODUCT can be used, and both will function correctly. There are some exceptions to this. Consider a table of names and amounts in A1:B10, where row 1 is a text heading of 'Name' and 'Amount'. The formula =SUMPRODUCT(--(A1:A10="Bob"),--(B1:B10>0),B1:B10)
will correctly sum the positive values in column B where the value in column A is 'Bob'. However, this formula =SUMPRODUCT((A1:A10="Bob")*(B1:B10>0)*(B1:B10))
returns a #VALUE! error. The error is caused by the text in B1: multiplying a text value creates an error. To overcome it with the latter form, the ranges need to start beyond the heading, in A2 and B2[6]. Similarly, if one or more of the ranges within the formula is multi-column, then the '*' operator has to be used. Whilst this formula fails =SUMPRODUCT(--(A1:A10="Bob"),--(B1:C10>0),--(B1:C10))
this formula works perfectly well =SUMPRODUCT((A1:A10="Bob")*(B1:C10>0)*(B1:C10))
as indeed does this =SUMPRODUCT((A1:A10="Bob")*(B1:C10>0),B1:C10)[7]
Using Transpose If using the TRANSPOSE function within SUMPRODUCT, then the '*' operator has to be used.
Formula Efficiency
Most people will be familiar with the fact that array formulas can be very expensive, and if overused can significantly impair the recalculation of a worksheet/workbook. Whilst SUMPRODUCT is not an array formula per se, it suffers from the same problem. Although SUMPRODUCT is often faster than an equivalent array formula, the difference is marginal. And like array formulas, SUMPRODUCT is much slower than COUNTIF/SUMIF, so it is better to use these where appropriate. So, never use SUMPRODUCT in this situation =SUMPRODUCT((A1:A10="Ford")*(C1:C10))
Use the equivalent SUMIF =SUMIF(A1:A10,"Ford",C1:C10)
Even two COUNTIF/SUMIF functions are quicker than one SUMPRODUCT, so this formula =COUNTIF(A1:A10,">=10")-COUNTIF(A1:A10,">20")
will be more efficient than this one, =SUMPRODUCT((A1:A10>=10)*(A1:A10<=20))

[...]

    =SUM(IF((A2:A10="Fax")+(B2:B10="Jones")>0,1,0))

will count the number of sales (not the number of units sold) in which the product was a Fax OR the salesman was Jones (or both). Addition acts as an OR because the result is TRUE (or <> 0) if either one or both of the elements are TRUE (<> 0). It is FALSE (= 0) only when both elements are FALSE (or 0).

This formula adds two arrays: the results of the comparisons A2:A10 to "Fax", and the results of the comparisons B2:B10 to "Jones". Each of these arrays is an array of TRUE and FALSE values, each element being the result of comparing one cell to "Fax" or "Jones". When you add two arrays, the result is itself an array, each element of which is the sum of the corresponding elements of the original arrays. For example, {1, 2, 3} + {4, 5, 6} = {1+4, 2+5, 3+6} = {5, 7, 9}. For each element in the sum array (A2:A10="Fax")+(B2:B10="Jones"), if that element is greater than 0, IF returns 1, otherwise it returns 0. Finally, SUM just adds up the array.

An "exclusive or" or XOR operation is a comparison that returns TRUE when exactly one of the two elements is TRUE. XOR is FALSE if both elements are TRUE or if both elements are FALSE. Arithmetically, we can use the MOD function to simulate an XOR operation. For example, to count the number of sales in which the product was a Fax XOR the salesman was Jones (excluding Faxes sold by Jones), we can use the following formula: =SUM(IF(MOD((A2:A10="Fax")+(B2:B10="Jones"),2),1,0))
A "negative and" or NAND operation is a comparison that returns TRUE when neither or exactly one of the elements is TRUE, but returns FALSE if both elements are TRUE. For example, we can count the number of sales except those in which Jones sold a Fax with the formula: =SUM(IF((A2:A10="Fax")+(B2:B10="Jones")<>2,1,0))
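The arithmetic behind these logical tests can be checked with a short Python model (Python rather than Excel, with invented sample rows). Each comparison becomes a list of 0/1 values, the two lists are added element by element, and each test then looks at the sum: greater than 0 for OR, odd for XOR, and anything other than 2 for NAND.

```python
# Hypothetical sales rows of (product, salesman); not the article's data.
rows = [
    ("Fax", "Jones"),
    ("Fax", "Smith"),
    ("Phone", "Jones"),
    ("Phone", "Smith"),
    ("Fax", "Jones"),
]

is_fax = [int(product == "Fax") for product, _ in rows]
is_jones = [int(salesman == "Jones") for _, salesman in rows]

# Element-by-element sum of the two comparison arrays: 0, 1 or 2 per row.
added = [a + b for a, b in zip(is_fax, is_jones)]

or_count = sum(1 if x > 0 else 0 for x in added)     # either or both
xor_count = sum(1 if x % 2 else 0 for x in added)    # exactly one
nand_count = sum(1 if x != 2 else 0 for x in added)  # not both

print(added)                            # [2, 1, 1, 0, 2]
print(or_count, xor_count, nand_count)  # 4 2 3
```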
Creating Sequences And Loops For Array Formulas
When you are constructing some types of array formulas, you need to create a sequence of numbers for a function to process as an array. As an example, consider an array formula that will compute the average of the N largest elements in a range. To do this, we will use the LARGE function to get the largest numbers, and then pass those numbers as an array to AVERAGE to compute the average. Normally, the LARGE function takes as parameters a range to process and a number indicating which largest value to return (1 = largest, 2 = second largest, etc.). But LARGE does work with arrays for its second parameter. You might be tempted to type in the array in the formula yourself: =LARGE(A1:A10,{1,2,3}). While this will indeed work, it is tedious. Instead, you can use the ROW function to return a sequence of numbers. When used in an array formula, the function ROW(m:n) will return an array of integers from m to n. Therefore, we can use ROW to create the array to pass to LARGE. This changes our array formula to =LARGE(A1:A10,ROW(1:3)).

This brings us closer to a good formula, but two things remain. First, if you insert a row between rows 1 through 3, Excel will change the row reference 1:3, and therefore the formula will average the wrong numbers. Second, the formula is locked into the three largest values. We can make it more flexible by making the number of elements to average a cell reference that can be easily changed. For example, we can specify that cell C1 contains the size of the array to pass to LARGE. This is accomplished with the INDIRECT function. The INDIRECT function converts a string representing a cell reference into an actual cell reference. The sub-formula ROW(INDIRECT("1:"&C1)) will return an array of numbers between 1 and the value in cell C1.
Putting it all together, the formula to average the N largest values in A1:A10 becomes: =AVERAGE(LARGE(A1:A10,ROW(INDIRECT("1:"&C1))))
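As a cross-check on what this formula computes, the following Python sketch (Python, not Excel; the sample values are invented) sorts the values in descending order, keeps the first N, and averages them. The argument n plays the role of cell C1.

```python
def average_of_n_largest(values, n):
    """Average the n largest values, like AVERAGE(LARGE(range, {1..n}))."""
    if n < 1 or n > len(values):
        raise ValueError("n must be between 1 and the number of values")
    largest = sorted(values, reverse=True)[:n]
    return sum(largest) / n


# Hypothetical contents of A1:A10.
values = [4, 9, 1, 7, 3, 8, 2, 6, 5, 10]
print(average_of_n_largest(values, 3))  # 9.0, the mean of 10, 9 and 8
```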
Formulas That Return Arrays
The other type of array formula is one that returns an array of numbers as its result. This sort of array formula is entered into multiple cells that are then treated as a group. For example, consider the formula =ROW(A1:A10). If this is entered into one cell, either as a normal formula or as an array formula, the result will be 1 in that single cell. If, however, you array enter it into a range of cells, each cell will contain one element of the array. To do this, you first must select the range of cells into which the array should be written, say C1:C10, type the formula =ROW(A1:A10), and then press CTRL SHIFT ENTER. The elements of the array {1, 2, ..., 10} will be written to the range of cells, with one element of the array in each cell. When you array enter a formula into an array of cells, Excel prevents you from modifying a single cell within that array range. You may select the entire range, edit
the formula, and array-enter it again with CTRL SHIFT ENTER, but you cannot change a single element of the array. Some of the built-in Excel functions return an array of values. These formulas must be entered into an array of cells. For example, the MINVERSE function returns the inverse of a matrix with an equal number of rows and columns. Since the inverse of a matrix is itself a matrix, the MINVERSE function must be entered into a range of cells with the same number of rows and columns as the matrix to be inverted. Therefore, if your matrix is in cells A1:B2 (two rows and two columns), you must select a range the same size, type the formula =MINVERSE(A1:B2) and press CTRL SHIFT ENTER rather than just ENTER. This enters the formula as an array formula into all the selected cells. If you were to use the MINVERSE function in a single cell, only the upper left corner value of the inverted matrix would be returned. For information about writing your own VBA functions that return arrays, see Writing Your Own Functions In VBA.
Other Useful Array Functions
Array formulas can do a wide variety of tasks. A few miscellaneous array formulas are shown below:

Sum Ignoring Errors
Normally, if there is an error in a cell, the SUM function will return that error. The following formula will ignore the error values.
    =SUM(IF(ISERROR(A1:A10),0,A1:A10))

Average Ignoring Errors
This formula will ignore errors when averaging a range.
    =AVERAGE(IF(ISERROR(A1:A10),FALSE,IF(A1:A10="",FALSE,A1:A10)))

Average Ignoring Zeros
This formula will ignore zero values in an AVERAGE function.
    =AVERAGE(IF(A1:A10<>0,A1:A10,FALSE))

Sum Of Absolute Values
You can sum a range of numbers, treating them all as positive, using the ABS function.
    =SUM(ABS(A1:A10))

Sum Of Integer Portion Only
This formula will sum only the integer portion of the numbers in A1:A10. The fractional portion is discarded.
    =SUM(TRUNC(A1:A10))

Longest Text In Cells
This formula will return the contents of the cell with the longest text in it.
    =OFFSET(A1,MATCH(MAX(LEN(A1:A10)),LEN(A1:A10),0)-1,0,1,1)
Array Formulas Versus The Data Functions
There is considerable overlap between what you can accomplish with array formulas and what you can do with the so called D-Functions (DSUM, DCOUNT, and so on). Broadly speaking, the D-Functions are faster than their array formula counterparts. If you have a large and complex workbook with many array formulas, you may see a significant improvement in calculation time if you convert your array formulas to D-Functions. The primary differences between the D-Functions and array formulas are as follows:
• D-Functions are typically faster than array formulas, all else being equal.
• The selection criteria in a D-Function must reside in cells. Array formulas can include the selection criteria directly in the formula.
• D-Functions can return only a single value to a single cell, while array formulas can return arrays to many cells.
Tables And Lookups This page describes a number of formulas to return data from tables and formulas to look up data in tables.
Introduction
Almost every worksheet contains at least one table of data, typically a set of rows and columns. Very frequently, you will need to return a row or column of values from the table based on the row or column position in the table, or you may need to return a value from the table based upon a match of values in the row headers and column headers. For example, you may need to return the 5th row of a table, or you may need to return the row where the ID number is 1234.

The simplest types of lookups are performed with the VLOOKUP or HLOOKUP functions. The functions are well documented in the Help file and are not discussed in detail on this page. It is assumed that you are familiar with VLOOKUP and HLOOKUP. For more complicated lookups in tables, we will use formulas based on the OFFSET, MATCH, and INDEX functions. While the Help file describes these functions individually, it does not describe how these functions can be combined to create more powerful and flexible lookup formulas. That is the goal of this page. At the core of most of the formulas on this page is the OFFSET function. You should be familiar with this function before proceeding with this page.

Most of the formulas on this page are array formulas. Array formulas are described in detail on the Array Formulas page on this web site. You should be at ease with array formulas in order to modify the lookup formulas presented on this page. With few exceptions, the formulas on this page use only a single range reference, a Defined Name that refers to the data table against which the lookup is
performed. Using a single reference may make the formulas longer, but it also makes them considerably more flexible. To use the formulas on your own worksheets, you need only modify a single name. This convenience makes up for, in my opinion, the longer formula length. Of course, if you are not using a defined name, simply replace the name in the formula with the appropriate range reference.

If the formulas on this page do not return the expected result when you use them on your own worksheets, the first thing to check is that the formula is entered as an array formula. If you are unsure whether a formula needs to be array entered, go ahead and enter it as an array formula; that is completely safe.

ENTERING AN ARRAY FORMULA: When you enter a formula as an array formula, you must press CTRL SHIFT ENTER rather than just ENTER when you first enter the formula and whenever you edit it later. If you do this properly, Excel will display the formula in the formula bar enclosed in curly braces, { }. You do not type in the curly braces, { }; Excel will display them automatically.

In the interest of brevity and clarity, the formulas on this page do not have any error checking and handling. For example, there is nothing to prevent you from attempting to return the 6th row of a table that has only 4 rows. If a parameter in a function call is invalid, you will most likely get a #N/A error. You may want to add some error checks when you use these formulas in your own worksheets.

As is the case with many types of formulas in Excel, there are several different ways to accomplish the same thing. Many of the formulas on this page could be written with a combination of the INDEX and MATCH functions instead of the OFFSET function. OFFSET is neither better nor worse than INDEX/MATCH. For consistency, I have chosen to use OFFSET for nearly all the tasks at hand. Other sources may use other methods. I encourage you to learn a variety of ways to accomplish a task.
Example Data The example formulas in the first section of this page, those formulas for returning rows and columns of a table, use the following data table.
This table contains two named ranges that are used in the formulas. The name Table refers to the entire table, cells B2:G7, which includes the row labels and column labels. The name InnerTable refers only to the actual data, cells C3:G7, which does not include the row labels and the column labels. The values of the row labels (abby, beth, etc.) and the column labels (apples, oranges, etc.) are in alphabetical order for illustration only. The formulas do not require that the values be in any particular order.
Returning A Row Or Column From A Table You can use an array formula to return a single row or column from a table. The formulas in this section need to be array entered (press CTRL SHIFT ENTER rather than just ENTER) into a number of cells equal to the size of the row or column of the table. The example table contains 6 columns (including the row header); thus, you would select a range that is 6 columns wide and 1 row tall, enter the formula and press CTRL SHIFT ENTER. The first formulas return a single row, based on position, from Table or InnerTable. =OFFSET(Table,E13-1,0,1,COLUMNS(Table)) In this formula, cell E13 contains the row to return. The row is 1-based (the title row is 1, the first row of data is 2, etc). The OFFSET function uses 0-based rows and columns, so we subtract 1 from the row number before passing it into the OFFSET function. If cell E13 contains the number 5, the formula returns the following values:
The following formula returns a row from the InnerTable range. It returns only the data values, not the row header. =OFFSET(InnerTable,E18-1,0,1,COLUMNS(InnerTable)) In this formula, cell E18 contains the 1-based row of InnerTable to return. Thus, if cell E18 contains 5, the formula returns the following values.
By changing the values that are passed to the OFFSET function, we can return a column from either the Table or InnerTable range, either by using a column offset or the value of a column label. The following formula will return a column from the Table range. =OFFSET(Table,0,E22-1,ROWS(Table),1) If cell E22 contains the value 3, the third column of Table is returned, as shown below:
Since this formula returns a column of data from Table, it should be array entered into a range that is one column wide and has the same number of rows as the Table range. You can also return a column from Table that corresponds to a matching column label. The following formula will return the column from Table whose column label is equal to the value in cell E39. =OFFSET(Table,0,MATCH(E39,OFFSET(Table,0,0,1,COLUMNS(Table)),0)-1,ROWS(Table),1) If cell E39 contains the value plums, the following values are returned.
Calculations On Rows Or Columns Of A Table
Because the formulas described above return arrays of values, either a row or column of the InnerTable, you can use those formulas with functions that accept arrays. Indeed, you can use the row and column formulas in any function or formula where you would normally provide a range of cells, such as in the SUM, MIN, MAX, or AVERAGE functions, among others. For example, the following formula will return the SUM of the row whose row label is equal to the value in cell E48.
    =SUM(OFFSET(InnerTable,MATCH(E48,OFFSET(Table,0,0,ROWS(Table),1),0)-2,0,1,COLUMNS(InnerTable)))
If cell E48 contains the value callie, this formula will return the value 560. You can get the maximum or minimum of the row by changing SUM to MAX or MIN. These formulas do not need to be entered as array formulas, although it is harmless to do so.

A very similar formula can be used to return the sum, minimum, or maximum of a column in the table. The following formula will return the sum of the values in the column of Table where the column label is equal to the value in cell E52.
    =SUM(OFFSET(InnerTable,0,MATCH(E52,OFFSET(Table,0,0,1,COLUMNS(Table)),0)-2,ROWS(InnerTable),1))
If cell E52 contains oranges, the formula will return 535. As before, you can change SUM to MIN or MAX to return the minimum or maximum of the column. Again, these formulas need not be array entered.
Last Value In A Row Or Column
You can use a formula to return the last cell in a row or column, where the row or column is selected either by its position in the table or by a match of a value with the row or column label. The following formula will return the last (right-most) value in a row of Table, where cell E56 contains the 1-based row position:
    =OFFSET(Table,E56-1,COLUMNS(Table)-1,1,1)
If E56 contains 4, the result is 122, the last value in the 4th row of Table (including the column labels). You can also select the row to use by matching a row label. If cell E59 contains the value callie, the following formula will return 122, the right-most value in the row whose row label is callie.
    =OFFSET(Table,MATCH(E59,OFFSET(Table,0,0,ROWS(Table),1),0)-1,COLUMNS(Table)-1,1,1)
The following formulas will return the last (bottom-most) value of a column, selected by either its position in Table (cell E62) or by a match of a column label (in cell E65).
    =OFFSET(Table,ROWS(Table)-1,E62-1,1,1)
    =OFFSET(Table,ROWS(Table)-1,MATCH(E65,OFFSET(Table,0,0,1,COLUMNS(Table)),0)-1)
Double Lookups
A double lookup is a formula that returns a value from a table based on a match of values in both the rows and columns. Referring to the example data shown above, you may want to return the value corresponding to the dora row and the plums column. If cell E74 contains the value to match on the rows (e.g., dora) and cell E75 contains the value to match on the columns (e.g., plums), the following formula will return the appropriate value from the Table range:
    =OFFSET(Table,MATCH(E74,OFFSET(Table,0,0,ROWS(Table),1),0)-1,MATCH(E75,OFFSET(Table,0,0,1,COLUMNS(Table)),0)-1)
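The mechanics of the double lookup can be sketched in Python (Python, not Excel). The table below is a cut-down stand-in with invented numbers; only the label names are borrowed from the article. The helper match is a hypothetical stand-in for MATCH(value, range, 0): it returns a 1-based position, and 1 is subtracted to get the 0-based offsets that OFFSET expects.

```python
# Hypothetical stand-in for the named range Table, including the label
# row and label column. The numbers are invented for this sketch.
table = [
    ["",       "apples", "oranges", "plums"],
    ["abby",   10, 20, 30],
    ["beth",   11, 21, 31],
    ["callie", 12, 22, 32],
    ["dora",   13, 23, 33],
]


def match(value, sequence):
    """Exact-match lookup returning a 1-based position, like MATCH(..., 0)."""
    return sequence.index(value) + 1


def double_lookup(table, row_label, column_label):
    first_column = [row[0] for row in table]   # OFFSET(Table,0,0,ROWS(Table),1)
    first_row = table[0]                       # OFFSET(Table,0,0,1,COLUMNS(Table))
    row_offset = match(row_label, first_column) - 1
    column_offset = match(column_label, first_row) - 1
    return table[row_offset][column_offset]    # OFFSET(Table,row_offset,column_offset)


print(double_lookup(table, "dora", "plums"))  # 33 in this made-up table
```

A label that is not present raises ValueError here, which is roughly analogous to the #N/A error the worksheet formula would return.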
Left Lookups While the VLOOKUP function is very useful, it has a significant limitation. That is that you can only return a value to the right of the lookup column. For example, you can look in column B for a value and then return the corresponding value from column D. However, the reverse is not true. You cannot look up a value in column D and return the corresponding value from column B. This is where a Left Lookup formula is useful. For example, suppose you have the following table, and a defined name of LLTable that refers to the actual data (colored in red).
The following formula will look for a value in the Value column and return the corresponding value in the Type column. =OFFSET(LLTable,MATCH(F67,OFFSET(LLTable,0,1,ROWS(LLTable),1),0)-1,0,1,1) In this formula, cell F67 contains the value to be searched for in the Value column. Thus, if F67 contains 44, the formula will return dd.
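The same left lookup can be modelled in a few lines of Python (Python, not Excel). The rows are invented apart from the pair quoted in the text, where the value 44 corresponds to dd.

```python
# Hypothetical LLTable rows of (Type, Value). Only the 44 -> dd pair is
# taken from the text; the other rows are invented.
ll_table = [("aa", 11), ("bb", 22), ("cc", 33), ("dd", 44), ("ee", 55)]


def left_lookup(table, value):
    """Find value in the second column and return the first column's entry."""
    value_column = [row[1] for row in table]   # OFFSET(LLTable,0,1,ROWS(LLTable),1)
    position = value_column.index(value) + 1   # MATCH(value, ..., 0), 1-based
    return table[position - 1][0]              # OFFSET(LLTable,position-1,0,1,1)


print(left_lookup(ll_table, 44))  # dd
```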
Upper Lookups The HLOOKUP function is the "transpose" of the VLOOKUP function. As VLOOKUP scans down a column for a match and then moves to the right to return a value, HLOOKUP scans across a row for a match and then moves down to return a value. HLOOKUP cannot move upwards to return a value. For example, you can search row 5 to find a match and then return the corresponding value from row 8, but the reverse is not possible. You cannot scan row 8 and return a value from row 5. Just as the Left Lookup formula overcame the limitation of VLOOKUP, an Upper Lookup formula can overcome the limitation of HLOOKUP. Consider the following table:
In this table, the range displayed in red has the name ULTable. The following formula will allow you to look in the Value row for a value equal to cell J82 and return the corresponding value from the Type row. =OFFSET(ULTable,0,MATCH(J82,OFFSET(ULTable,ROWS(ULTable)-1,0,1,COLUMNS(ULTable)),0)-1,1,1) For example, if J82 contains 33, the formula will return cc.
Arbitrary Lookups Another limitation of the VLOOKUP function is that if there are duplicate matches in the lookup column, the first occurrence of the matching value is used. For example, consider the following table of data:
With a simple VLOOKUP function for the value Beth, the value 22 will be returned, since 22 corresponds to the first occurrence of the value Beth. It may be necessary, however, to return the value corresponding to the second or third occurrence of Beth. If the table of values (colored in red, excluding the Name and Score column labels) is named ALTable, the following formula will return the value from the Score column corresponding to the Nth occurrence of the value in cell F90, where the number N is in cell F91. For example, if F90 contains the value Beth and cell F91 contains the value 3 (indicating to find the 3rd occurrence of Beth), the formula will return the value 88.
    =INDEX(ALTable,SMALL(IF(OFFSET(ALTable,0,0,ROWS(ALTable),1)=F90,ROW(OFFSET(ALTable,0,0,ROWS(ALTable),1))-ROW(OFFSET(ALTable,0,0,1,1))+1,ROW(OFFSET(ALTable,ROWS(ALTable)-1,0,1,1))+1),F91),2)
A special case of the arbitrary lookup formula above is to return the value corresponding to the last occurrence in the list. For example, if cell F94 contains the value Beth, the following formula will return the value 88, which corresponds to the last occurrence of the value Beth.
    =INDEX(ALTable,SMALL(IF(OFFSET(ALTable,0,0,ROWS(ALTable),1)=F94,ROW(OFFSET(ALTable,0,0,ROWS(ALTable),1))-ROW(OFFSET(ALTable,0,0,1,1))+1,ROW(OFFSET(ALTable,ROWS(ALTable)-1,0,1,1))+1),COUNTIF(OFFSET(ALTable,0,0,ROWS(ALTable),1),F94)),2)
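The idea behind the Nth-occurrence formula is to collect the positions of every matching row and then pick the Nth smallest position. A minimal Python model of that idea follows (Python, not Excel). The table is invented, but it is arranged so that Beth's first score is 22 and her third and last score is 88, as in the text.

```python
# Hypothetical ALTable rows of (Name, Score). Only Beth's first (22) and
# third/last (88) scores follow the text; everything else is invented.
al_table = [
    ("Abby", 11),
    ("Beth", 22),
    ("Carl", 33),
    ("Beth", 55),
    ("Dave", 44),
    ("Beth", 88),
]


def nth_occurrence(table, name, n):
    """Return the score on the nth row (1-based) whose name matches."""
    positions = [i for i, (row_name, _) in enumerate(table, start=1)
                 if row_name == name]
    if n < 1 or n > len(positions):
        raise LookupError(f"{name!r} does not occur {n} time(s)")
    return table[positions[n - 1] - 1][1]   # nth smallest position, then INDEX


def last_occurrence(table, name):
    """Special case: use the number of matches as n, like the COUNTIF variant."""
    count = sum(1 for row_name, _ in table if row_name == name)
    return nth_occurrence(table, name, count)


print(nth_occurrence(al_table, "Beth", 3))  # 88
print(last_occurrence(al_table, "Beth"))    # 88
```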
Closest Match Lookups
The MATCH function is an important tool when working with lists of data. If you are searching for an exact match in a range of cells, the values may be in any order. However, if you are attempting to find a closest match, the values must be in sorted order. Using the INDEX and MATCH functions, you can write a formula that will return the number in a list that is closest to a specified value. We will look at three related Closest Match formulas. These three formulas are based on the example data shown below. All three formulas are array formulas and must be properly entered. This list of values has the defined name of CMTable.
The following array formula will return the smallest number in the list CMTable that is greater than or equal to the value in cell E105.
    =INDEX(CMTable,MATCH(MIN(IF(CMTable-E105>=0,CMTable,FALSE)),IF(CMTable-E105>=0,CMTable,FALSE),0))
Thus if E105 has the value 5, the formula will return 5.1, which is the smallest number in the list that is greater than or equal to 5. The second Closest Match formula will return the largest number in a list that is less than or equal to a specified number. In the following formula, cell E108 contains the test value.
    =INDEX(CMTable,MATCH(MAX(IF(CMTable-E108<=0,CMTable,FALSE)),IF(CMTable-E108<=0,CMTable,FALSE),0))

[...] >1,"Duplicates","No Duplicates")
Highlighting Duplicate Entries You can use Excel's Conditional Formatting tool to highlight duplicate entries in a list. All of the examples in this section assume that the data to be tested and highlighted is in the range B2:B11. You should change the cell references to the appropriate values on your worksheet.
This first example will highlight duplicate rows in the range B2:B11. Select the cells that you wish to test and format, B2:B11 in this example. Then, open the Conditional Formatting dialog from the Format menu, change Cell Value Is to Formula Is, enter the formula below, and choose a font or background format to apply to cells that are duplicates. =COUNTIF($B$2:$B$11,B2)>1 The formula above, when used in Conditional Formatting, will highlight all duplicates. That is, if the value 'abc' occurs twice in the list, both instances of 'abc' will be highlighted. This is shown in the image to the left, in which all occurrences of 'a' and 'g' are highlighted.
You can use the following formula in Conditional Formatting to highlight only the first occurrence of an entry in the list. For example, the first occurrence of 'abc' will be highlighted, but the second and subsequent occurrences of 'abc' will not be highlighted. =IF(COUNTIF($B$2:$B$11,B2)=1,FALSE,COUNTIF($B$2:B2,B2)=1) This is shown at the left where only the first occurrences of the duplicate items 'a', 'e', and 'g' are highlighted. The second and subsequent occurrences of these values are not highlighted.
You can also do the reverse of this with Conditional Formatting. Using the formula below in Conditional Formatting will highlight only the second and subsequent occurrences of a value. The first occurrence of the value will not be highlighted. =IF(COUNTIF($B$2:$B$11,B2)=1,FALSE,NOT(COUNTIF($B$2:B2,B2)=1)) This is shown at the left where only the second occurrences of 'a', 'b', 'c' and 'f' are highlighted. The first occurrences of these items are not highlighted.
Another formula for Conditional Formatting will highlight only the last occurrence of a duplicate element in a list (or the element itself if it occurs only once). =IF(COUNTIF($B$2:$B$11,B2)=1,TRUE,COUNTIF($B$2:B2,B2)=COUNTIF($B$2:$B$11,B2)) As you can see, only the last occurrences of elements 'a', 'b', 'c', and 'f' are highlighted. Element 'd' is highlighted because it occurs only once. The occurrences of 'a', 'b', 'c' and 'f' that occur before the last occurrence are not highlighted.
We can round out our discussion of highlighting duplicate rows with two additional formulas related to distinct items in a list.
The following can be used in Conditional Formatting to highlight elements that occur only once in the range B2:B11. =COUNTIF($B$2:$B$11,B2)=1 This image illustrates the formula. Elements 'b', 'c', and 'e' are highlighted because they occur only once in the list. Items 'a', 'd' and 'f' are not highlighted because they occur more than one time in the list.
Finally, the following formula can be used in Conditional Formatting to highlight the distinct values in B2:B11. If an element occurs once, it is highlighted. If it occurs more than once, then only the first occurrence is highlighted. =COUNTIF($B$2:B2,B2)=1 As you can see, only the first or only occurrences of the elements are highlighted. If an element is duplicated, as is 'b', the duplicate elements are not highlighted.
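All six Conditional Formatting rules in this section are built from the same two counts: the number of times a value occurs in the whole list, COUNTIF($B$2:$B$11,B2), and the number of times it has occurred so far, COUNTIF($B$2:B2,B2). The Python sketch below (Python, not Excel; the list is invented and the key names are labels made up for this example) computes both counts for each item and derives the six TRUE/FALSE results from them. One difference to keep in mind is that COUNTIF ignores case when comparing text, while this sketch compares exactly.

```python
def highlight_flags(values):
    """For each item, return the result of each highlighting rule."""
    flags = []
    for index, value in enumerate(values):
        total = values.count(value)                 # COUNTIF($B$2:$B$11,B2)
        running = values[:index + 1].count(value)   # COUNTIF($B$2:B2,B2)
        flags.append({
            "all_duplicates": total > 1,
            "first_of_duplicates": total > 1 and running == 1,
            "later_duplicates": total > 1 and running > 1,
            "last_or_only": running == total,
            "occurs_once": total == 1,
            "distinct": running == 1,
        })
    return flags


# Hypothetical list standing in for B2:B11.
items = ["a", "b", "a", "c", "b", "a"]
for item, result in zip(items, highlight_flags(items)):
    print(item, [name for name, on in result.items() if on])
```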
Functions For Duplicates All of the formulas described above for Conditional Formatting can also be used in worksheet cells. They are all array formulas, so you must select the range for the results, type in the formula, and press CTRL SHIFT ENTER. The results of each formula will be a series of True or False values. The True results correspond to those cells that are highlighted in Conditional Formatting and the False results correspond to those cells that are not highlighted by Conditional Formatting.
Counting Distinct Entries In A Range The following formulas will return the number of distinct items in the range B2:B11. Remember, all of these are array formulas. The following formula is the longest but most flexible. It will properly count a list that contains a mix of numbers, text strings, and blank cells. =SUM(IF(FREQUENCY(IF(LEN(B2:B11)>0,MATCH(B2:B11,B2:B11,0),""), IF(LEN(B2:B11)>0,MATCH(B2:B11,B2:B11,0),""))>0,1)) If your data does not have any blank entries, you can use the simpler formula below. =SUM(1/COUNTIF(B2:B11,B2:B11)) If your data has only numeric values or blank cells (no string text entries), you can use the following formula: =SUM(N(FREQUENCY(B2:B11,B2:B11)>0))
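The simpler formula, =SUM(1/COUNTIF(B2:B11,B2:B11)), works because a value that occurs k times contributes 1/k from each of its k cells, so every distinct value adds up to exactly 1. A small Python model (Python, not Excel; sample data invented) makes this concrete. It uses exact fractions so the total is a whole number, whereas Excel does the same sum in floating point.

```python
from fractions import Fraction


def count_distinct(values):
    """Sum 1/count over every item; each distinct value contributes 1."""
    total = sum(Fraction(1, values.count(value)) for value in values)
    return int(total)


# Hypothetical list standing in for B2:B11 (no blank entries).
items = ["a", "b", "a", "c", "b", "a"]
print(count_distinct(items))  # 3: a contributes 3 x 1/3, b 2 x 1/2, c 1 x 1
```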
Array Formulas Many of the formulas described here are Array Formulas, which are a special type of formula in Excel.
Array To Column Sometimes it is useful to convert an MxN array into a single column of data, for example for charting (a data series must be a single row or column).
Averaging Values In A Range You can use Excel's built in =AVERAGE function to average a range of values. By using it with other functions, you can extend its functionality. For the formulas given below, assume that our data is in the range A1:A60.
Averaging Values Between Two Numbers
Use the array formula
    =AVERAGE(IF((A1:A60>=Low)*(A1:A60<=High),A1:A60))

[...] Options dialog, click on the Calculation tab, and check the Iterations check box. Then, enter the following formula in cell B1: =MAX(A1:A10,B1) Cell B1 will contain the highest value that has ever been present in A1:A10, even if that value is deleted from the range. Use the =MIN function to get the lowest ever value. Another method to do this, without using circular references, is provided by Laurent Longre, and uses the CALL function to access the Excel4 macro function library.
Left Lookups The easiest way to do table lookups is with the =VLOOKUP function. However, =VLOOKUP requires that the value returned be to the right of the value you're looking up. For example, if you're looking up a value in column B, you cannot retrieve values in column A. If you need to retrieve a value in a column to the left of the column containing the lookup value, use either of the following formulas:
=INDIRECT(ADDRESS(ROW(Rng)+MATCH(C1,Rng,0)-1,COLUMN(Rng)-ColsToLeft))
Or
=INDIRECT(ADDRESS(ROW(Rng)+MATCH(C1,Rng,0)-1,COLUMN(A:A)))
Where Rng is the range containing the lookup values, and ColsToLeft is the number of columns to the left of Rng where the retrieval values are located. In the second syntax, replace "A:A" with the column containing the retrieval data. In both examples, C1 is the value you want to look up.
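Both left-lookup formulas follow the same two steps: MATCH finds the row of the lookup value, and ADDRESS/INDIRECT then reads a cell in that row from a column further left. A rough Python sketch of those steps follows; the function name left_lookup and the sample table are hypothetical, and it returns None where Excel would show #N/A.

```python
def left_lookup(table, key, lookup_col, cols_to_left):
    """Find key in column lookup_col and return the value
    cols_to_left columns to its left (columns are 0-based here)."""
    target_col = lookup_col - cols_to_left
    if target_col < 0:
        raise ValueError("retrieval column lies left of the table")
    for row in table:
        if row[lookup_col] == key:   # first exact match, like MATCH(..., 0)
            return row[target_col]
    return None                      # Excel would return #N/A


table = [["X1", "apple"], ["X2", "pear"], ["X3", "plum"]]
print(left_lookup(table, "pear", lookup_col=1, cols_to_left=1))  # X2
```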
Minimum And Maximum Values In A Range Of course you can use the =MIN and =MAX functions to return the minimum and maximum values of a range. Suppose we've got a range of numeric values called NumRange. NumRange may contain duplicate values. The formulas below use the following example:
Address Of First Minimum In A Range To return the address of the cell containing the first (or only) instance of the minimum of a list, use the following array formula: =ADDRESS(MIN(IF(NumRange=MIN(NumRange),ROW(NumRange))),COLUMN(NumRange),4) This function returns B2, the address of the first '1' in the range.
Address Of The Last Minimum In A Range To return the address of the cell containing the last (or only) instance of the minimum of a list, use the following array formula: =ADDRESS(MAX(IF(NumRange=MIN(NumRange),ROW(NumRange)*(NumRange<>""))),COLUMN(NumRange),4) This function returns B4, the address of the last '1' in the range.
Address Of First Maximum In A Range To return the address of the cell containing the first instance of the maximum of a list, use the following array formula: =ADDRESS(MIN(IF(NumRange=MAX(NumRange),ROW(NumRange))),COLUMN(NumRange),4) This function returns B1, the address of the first '5' in the range.
Address Of The Last Maximum In A Range To return the address of the cell containing the last instance of the maximum of a list, use the following array formula: =ADDRESS(MAX(IF(NumRange=MAX(NumRange),ROW(NumRange)*(NumRange<>""))),COLUMN(NumRange),4) This function returns B5, the address of the last '5' in the range.
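The four address formulas share one pattern: collect the row numbers of every cell equal to the minimum (or maximum), then take the smallest row for the first instance or the largest row for the last. Here is a small Python sketch of that pattern. The function name extreme_address is illustrative, and the sample data is an assumption chosen to agree with the results quoted above (B1 and B5 hold 5, B2 and B4 hold 1); the 3 in B3 is a placeholder, since the original example values are not shown.

```python
def extreme_address(values, kind="min", which="first", column="B", first_row=1):
    """Return an A1-style address of the first or last min/max in a column."""
    # Blank cells are modelled as None and skipped.
    present = [v for v in values if v is not None]
    target = min(present) if kind == "min" else max(present)
    # Row numbers of every cell equal to the target value.
    rows = [first_row + i for i, v in enumerate(values) if v == target]
    row = rows[0] if which == "first" else rows[-1]
    return f"{column}{row}"


num_range = [5, 1, 3, 1, 5]  # assumed contents of B1:B5 (B3 is a placeholder)
print(extreme_address(num_range, "min", "first"))  # B2
print(extreme_address(num_range, "min", "last"))   # B4
print(extreme_address(num_range, "max", "first"))  # B1
print(extreme_address(num_range, "max", "last"))   # B5
```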
Most Common String In A Range The following array formula will return the most frequently used entry in a range: =INDEX(Rng,MATCH(MAX(COUNTIF(Rng,Rng)),COUNTIF(Rng,Rng),0)) Where Rng is the range containing the data.
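In that formula, COUNTIF(Rng,Rng) produces one count per cell, MAX picks the largest count, and MATCH(...,0) finds the first cell with that count, so ties go to the entry that appears earliest in the range. A brief Python sketch of the same logic (the function name most_common_entry and the sample list are illustrative, and the comparison is case-sensitive, unlike COUNTIF):

```python
from collections import Counter


def most_common_entry(values):
    """Return the most frequent entry; ties go to the earliest one."""
    counts = Counter(values)
    best = max(counts.values())
    # Walk the range in order, like MATCH(..., 0) returning the first hit.
    for v in values:
        if counts[v] == best:
            return v


print(most_common_entry(["red", "blue", "blue", "red", "green"]))  # red
```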
Ranking Numbers Often, it is useful to be able to return the N highest or lowest values from a range of data. Suppose we have a range of numeric data called RankRng. Create a range next to RankRng (starting in the same row, with the same number of rows) called TopRng. Also, create a named cell called TopN, and enter into it the number of values you want to return (e.g., 5 for the top 5 values in RankRng). Enter the following formula in the first cell in TopRng, and use Fill Down to fill out the range: =IF(ROW()-ROW(TopRng)+1>TopN,"",LARGE(RankRng,ROW()-ROW(TopRng)+1)) To return the TopN smallest values of RankRng, use =IF(ROW()-ROW(TopRng)+1>TopN,"",SMALL(RankRng,ROW()-ROW(TopRng)+1)) The list of numbers returned by these functions will automatically change as you change the contents of RankRng or TopN.
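Each cell in TopRng computes its own position k = ROW()-ROW(TopRng)+1, then shows the k-th largest (or smallest) value while k is at most TopN and an empty string afterwards. A minimal Python sketch of the filled-down column follows; the function name top_n_column and the sample numbers are hypothetical.

```python
def top_n_column(values, n, largest=True):
    """Mimic filling LARGE/SMALL down a column the same height as values."""
    ordered = sorted(values, reverse=largest)
    # Position k (0-based here) shows the k-th value while k < n, else "".
    return [ordered[k] if k < n else "" for k in range(len(values))]


rank_rng = [4, 9, 1, 7, 9]
print(top_n_column(rank_rng, 3))                  # [9, 9, 7, '', '']
print(top_n_column(rank_rng, 3, largest=False))   # [1, 4, 7, '', '']
```

Duplicates are kept, just as LARGE and SMALL return a repeated value once per occurrence.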
Removing Blank Cells In A Range The procedures for creating a new list consisting of only those entries in another list, excluding blank cells, are described in NoBlanks.
Summing Every Nth Value You can easily sum (or average) every Nth cell in a column range. For example, suppose you want to sum every 3rd cell. Suppose your data is in A1:A20, and N = 3 is in D1. The following array formula will sum the values in A3, A6, A9, etc.
=SUM(IF(MOD(ROW($A$1:$A$20),$D$1)=0,$A$1:$A$20,0))
If you want to sum the values in A1, A4, A7, etc., use the following array formula:
=SUM(IF(MOD(ROW($A$1:$A$20)-1,$D$1)=0,$A$1:$A$20,0))
If your data range does not begin in row 1, the formulas are slightly more complicated. Suppose our data is in B3:B22, and N = 3 is in D1. To sum the values in rows 5, 8, 11, etc., use the following array formula:
=SUM(IF(MOD(ROW($B$3:$B$22)-ROW($B$3)+1,$D$1)=0,$B$3:$B$22,0))
If you want to sum the values in rows 3, 6, 9, etc., use the following array formula:
=SUM(IF(MOD(ROW($B$3:$B$22)-ROW($B$3),$D$1)=0,$B$3:$B$22,0))
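All four formulas test the cell's position within the range modulo N; subtracting ROW of the first cell simply converts a sheet row into a position inside the range. The Python sketch below shows that test with a 1-based position. The function name sum_every_nth and the shift parameter are illustrative, and the sample data assumes the range holds the numbers 1 to 20.

```python
def sum_every_nth(values, n, shift=0):
    """Sum cells whose 1-based position p satisfies (p - shift) % n == 0.

    shift=0 picks positions n, 2n, 3n, ...
    shift=1 picks positions 1, n+1, 2n+1, ...
    """
    return sum(v for p, v in enumerate(values, start=1) if (p - shift) % n == 0)


data = list(range(1, 21))           # assumed sample values for a 20-cell range
print(sum_every_nth(data, 3))       # 3+6+9+12+15+18 = 63
print(sum_every_nth(data, 3, 1))    # 1+4+7+10+13+16+19 = 70
```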
Miscellaneous
Sheet Name Suppose our active sheet is named "MySheet" in the file C:\Files\MyBook.xls. To return the full sheet name (including the file path) to a cell, use
=CELL("filename",A1)
Note that the argument to the =CELL function is the word "filename" in quotes, not your actual filename. This will return "C:\Files\[MyBook.xls]MySheet". To return the sheet name, without the path, use
=MID(CELL("filename",A1),FIND("]",CELL("filename",A1))+1,LEN(CELL("filename",A1))-FIND("]",CELL("filename",A1)))
This will return "MySheet".
File Name Suppose our active sheet is named "MySheet" in the file C:\Files\MyBook.xls. To return the file name without the path, use
=MID(CELL("filename",A1),FIND("[",CELL("filename",A1))+1,FIND("]",CELL("filename",A1))-FIND("[",CELL("filename",A1))-1)
This will return "MyBook.xls". To return the file name with the path, use either
=LEFT(CELL("filename",A1),FIND("]",CELL("filename",A1)))
Or
=SUBSTITUTE(SUBSTITUTE(LEFT(CELL("filename",A1),FIND("]",CELL("filename",A1))),"[",""),"]","")
The first syntax will return "C:\Files\[MyBook.xls]". The second syntax will return "C:\Files\MyBook.xls". In all of the examples above, the A1 argument to the =CELL function forces Excel to get the sheet name from the sheet containing the formula. Without it, if Excel calculates the =CELL function while another sheet is active, the cell would contain the name of the active sheet, not the sheet actually containing the formula.
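The sheet-name and file-name formulas all slice the same string at the positions of "[" and "]". As a rough illustration, the Python sketch below performs the same slicing on the example value returned by =CELL("filename",A1); the function name split_cell_filename is hypothetical, and the sketch assumes the path itself contains no square brackets.

```python
def split_cell_filename(full):
    """Split a string shaped like path[workbook]sheet into its parts."""
    open_pos = full.index("[")    # like FIND("[", ...)
    close_pos = full.index("]")   # like FIND("]", ...)
    return {
        "sheet": full[close_pos + 1:],
        "file": full[open_pos + 1:close_pos],
        "path_and_file": full[:open_pos] + full[open_pos + 1:close_pos],
    }


parts = split_cell_filename(r"C:\Files\[MyBook.xls]MySheet")
print(parts["sheet"])          # MySheet
print(parts["file"])           # MyBook.xls
print(parts["path_and_file"])  # C:\Files\MyBook.xls
```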