COUNTIF
The COUNTIF function returns the number of cells in a collection that satisfy a given condition.
COUNTIF(test-array, condition)
test-array: The collection containing the values to be tested. test-array can contain any values.
condition: An expression that compares or tests values and results in the boolean value TRUE or FALSE. condition can include comparison operators, constants, the ampersand concatenation operator, references, and wildcards. You can use wildcards to match any single character or multiple characters in the expression. You can use a ? (question mark) to represent one character, an * (asterisk) to represent multiple characters, and a ~ (tilde) to specify that the following character should be matched rather than used as a wildcard. condition can also contain a REGEX function instead of wildcards.
Notes
Each test-array value is compared to condition. If the value meets the conditional test, it is included in the count.
Examples |
---|
The information in the following table is not meaningful, but is useful to illustrate the type of arguments COUNTIF includes in its result. Given the following table: |
A | B | C | D | |
---|---|---|---|---|
1 | 100 | 200 | 300 | 400 |
2 | lorem | ipsum | dolor | sit |
3 | 100 | 200 | 300 | sit |
4 | TRUE | TRUE | FALSE | FALSE |
5 | 200 | 400 |
=COUNTIF(A1:D1, ">0") returns 4, because all cells in the collection have a value greater than 0. =COUNTIF(A1:D1, "<>100") returns 3, because three cells in the collection have a value that is not 100. You can also use the ≠ operator =COUNTIF(A3:D3, ">=100") returns 3, because all three numbers are greater than or equal to 100 and the text value is ignored in the comparison. You can also use the ≥ operator. =COUNTIF(A1:D5, "=ipsum") returns 1, because the text string "ipsum" appears once in the collection referenced by the range. =COUNTIF(A1:D5, "=*t") returns 2, because a string ending in the letter "t" appears twice in the collection referenced by the range. |
Example using REGEX |
---|
Given the following table: |
A | B | |
---|---|---|
1 | 45 | |
2 | 41 | Aaron |
3 | 29 | |
4 | 64 | |
5 | 12 | Sarah |
=COUNTIF(B1:B5, REGEX("([A-Z0-9a-z._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,4})")) returns the number of cells in B1:B5 that contain an email address. |
Example—Survey results |
---|
This example brings together the illustrations used throughout the statistical functions. It is based on a hypothetical survey. The survey was short (only five questions) and had a very limited number of respondents (10). Each question could be answered on a scale of 1 to 5 (perhaps the range from "never" to "always"), or not answered. Each survey was assigned a number (ID#) before mailing. The following table shows the results. Questions that were answered outside the range (incorrect) or not answered are indicated with a blank cell in the table. |
A | B | C | D | E | F | |
---|---|---|---|---|---|---|
1 | ID# | Q1 | Q2 | Q3 | Q4 | Q5 |
2 | 101 | 5 | 4 | 4 | 3 | 4 |
3 | 105 | 3 | 2 | 2 | 3 | 3 |
4 | 102 | 4 | 4 | 4 | 4 | |
5 | 104 | 3 | 4 | 2 | 4 | 3 |
6 | 107 | 4 | 3 | 3 | ||
7 | 106 | 4 | 3 | 3 | 4 | |
8 | 109 | 3 | 4 | 1 | 3 | 4 |
9 | 111 | 5 | 2 | 2 | 5 | 3 |
10 | 121 | 4 | 2 | 3 | 3 | 4 |
11 | 115 | 3 | 3 | 3 | 3 |
To illustrate some of the functions, assume that the survey number included an alphabetic prefix and that the scale was A–E, instead of 1-5. The table would then look like this: |
A | B | C | D | E | F | |
---|---|---|---|---|---|---|
1 | ID# | Q1 | Q2 | Q3 | Q4 | Q5 |
2 | 101 | E | D | D | C | D |
3 | 105 | C | B | B | C | C |
4 | 102 | D | D | D | D | |
5 | 104 | C | D | B | D | C |
6 | 107 | D | C | C | ||
7 | 106 | D | C | C | D | |
8 | 109 | C | D | A | C | D |
9 | 111 | E | B | B | E | C |
10 | 121 | D | B | C | D | |
11 | 115 | C | C | C | C | C |
Using this table of data and some of the available statistical functions, you can gather information about the survey results. Keep in mind that the example is purposely small so results may seem obvious. However, if you had 50, 100, or more respondents and perhaps many more questions, the results would not be obvious. |
Function and arguments | Description of result |
---|---|
=CORREL(B2:B11, C2:C11) | Determines the correlation of question 1 and question 2 using linear regression analysis. Correlation is a measure of how much two variables (in this case, answers to survey questions) change together. Specifically, this would look at the question: If a respondent answered question 1 with a higher (or lower) value than the average for question 1, did the respondent also answer question 2 with a higher (or lower) value than the average for question 2? In this case, the responses are not particularly well correlated (-0.1732). |
=COUNT(A2:A11) or =COUNTA(A2:A11) | Determines the total number of surveys returned (10). Note that if the survey number was not numeric, you would need to use COUNTA instead of COUNT. |
=COUNT(B2:B11) or =COUNTA(B2:B11) | Determines the total number of answers to the first question (9). By extending this formula across the row, you could determine the total number of answers to each question. Because all the data is numeric, COUNTA returns the same results. If, however, the survey had used A through E, instead of 1 through 5, you would need to use COUNTA to tally the results. |
=COUNTBLANK(B2:B11) | Determines the number of empty cells, representing invalid or no answers. If you extended this formula across the row you would find that question 3 (column D) had 3 invalid or not-answered responses. This might cause you to look at this question on the survey to see if it was controversial or poorly worded, as no other question had more than 1 incorrect or not-answered response. |
=COUNTIF(B2:B11, "=5") | Determines the number of respondents that gave a 5 to a particular question (in this case, question 1). If you extended this formula across the row, you would learn that only questions 1 and 4 had any respondents give the question a 5. Had the survey used A through E for the range, you would have used =COUNTIF(B2:B11, "=E"). |
=COUNTIF(B2:B11, "<>5") | Determines the number of respondents that did not give a 5 to a particular question (in this case, question 1). |
=COVAR(B2:B11, C2:C11) | Determines the covariance of question 1 and question 2. Covariance is a measure of how much two variables (in this case, answers to survey questions) change together. Specifically, this would look at the question: If a respondent answered question 1 with a higher (or lower) value than the average for question 1, did the respondent also answer question 2 with a higher (or lower) value than the average for question 2? Note: COVAR would not work with the table using a scale of A–E, as it requires numeric arguments. |
=STDEV(B2:B11) or =STDEVP(B2:B11) | Determines the standard deviation—one measure of dispersion—of the answers to question 1. If you extend this formula across the row, you would see that the answers to question 3 had the highest standard deviation. If the results represented responses from the entire population being studied, rather than a sample, STDEVP would be used instead of STDEV. Note that STDEV is the square root of VAR. |
=VAR(B2:B11) or =VARP(B2:B11) | Determines the variance—one measure of dispersion—of the answers to question 1. If you extended this formula across the row, you would see that the answers to question 5 had the lowest variance. If the results represented responses from the entire population being studied, rather than a sample, VARP would be used instead of VAR. Note that VAR is the square of STDEV. |