Using Grep to Count the Number of Matches

In this guide, we learn how to use the grep command to count the number of matches in one or more files or in a directory.

Counting with Grep

One of the useful features of grep is to count the number of lines that match a pattern. This is done using the -c or --count option.

Example:

grep -c Linux samplefile.txt

This command prints a number representing how many lines in "samplefile.txt" contain the word "Linux". For example, an output of 5 means that 5 lines contain a match.
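
To make the meaning of the count concrete, here is a minimal sketch that builds a throwaway file and runs grep -c on it. The file contents and the temporary path are invented for illustration and are not the article's samplefile.txt.

```shell
# Create a throwaway sample file (contents are invented for this illustration).
sample=$(mktemp)
printf '%s\n' \
  'Linux is a kernel' \
  'Linux distributions bundle Linux with tools' \
  'BSD is a different family' > "$sample"

# -c prints the number of matching lines, not the number of matches.
line_count=$(grep -c Linux "$sample")
echo "$line_count"   # prints 2: two lines contain "Linux"

rm -f "$sample"
```

The second line contains "Linux" twice but still adds only one to the count.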

Counting every individual occurrence

As we saw in the previous section, grep -c counts only the number of matching lines. What about counting every individual occurrence of a match, including several on the same line? The grep command has no direct option to count all individual occurrences of a pattern.

The workaround is to use grep with the -o option and wc -l command.

Example:

grep -o Linux samplefile.txt | wc -l

Here -o tells grep to output only the matching parts, each on its own line, and wc -l counts the lines of its input. The command above prints how many times the word "Linux" appears in "samplefile.txt".
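
The difference between the two counts can be sketched side by side. The sample text below is made up for this example.

```shell
# Throwaway file; the text is invented for this illustration.
sample=$(mktemp)
printf '%s\n' \
  'Linux and more Linux' \
  'no match here' \
  'Linux again' > "$sample"

lines=$(grep -c Linux "$sample")                 # lines that contain a match
occurrences=$(grep -o Linux "$sample" | wc -l)   # individual matches

# Reports 2 matching lines and 3 occurrences for this file.
echo "lines=$lines occurrences=$occurrences"

rm -f "$sample"
```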

Remember that the -i option tells grep to ignore case and -w restricts matches to whole words. The two can be combined as needed.
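
As a rough illustration of how these options change the count, the sketch below runs the same search with and without -w. The sample lines are hypothetical.

```shell
# Throwaway file; the text is invented for this illustration.
sample=$(mktemp)
printf '%s\n' \
  'linux and LINUX' \
  'Linuxes is not a whole-word match' \
  'Linux' > "$sample"

# -i ignores case: linux, LINUX, the "Linux" inside "Linuxes", and Linux.
any_case=$(grep -o -i Linux "$sample" | wc -l)

# Adding -w drops the match inside "Linuxes".
whole_word=$(grep -o -i -w Linux "$sample" | wc -l)

echo "any_case=$any_case whole_word=$whole_word"

rm -f "$sample"
```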

Count occurrences in a directory with grep:

What about counting occurrences in a directory and its subdirectories? For that, add the -r or -R option to make the search recursive. Example:

grep -o -r Linux dir1/ | wc -l

To count matches only in specific files inside a directory, run:

grep -o "Linux" ./*.txt | wc -l

This command counts all individual occurrences of the word "Linux" in all .txt files within the current directory.
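
The sketch below contrasts the recursive form with the glob form on a small temporary tree. The directory layout and file names are assumptions made for the example. The --include option is not used in the article; it is shown here as one possible way to restrict a recursive search to .txt files.

```shell
# Build a small throwaway tree (layout and names are invented).
dir=$(mktemp -d)
mkdir -p "$dir/sub"
printf 'Linux Linux\n'  > "$dir/a.txt"
printf 'Linux\n'        > "$dir/b.log"
printf 'Linux\nLinux\n' > "$dir/sub/c.txt"

# Every occurrence in the whole tree, all file types.
all=$(grep -o -r Linux "$dir" | wc -l)

# Only .txt files directly inside the directory (the glob does not descend).
top_txt=$(grep -o Linux "$dir"/*.txt | wc -l)

# .txt files at any depth, using --include to filter by file name.
deep_txt=$(grep -o -r --include='*.txt' Linux "$dir" | wc -l)

echo "all=$all top_txt=$top_txt deep_txt=$deep_txt"

rm -rf "$dir"
```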

2. Syntax

The basic syntax for grep counting:

grep -c <pattern> filename

Here -c stands for count.

The examples below use a sample file named test.txt.

3. Counting Matching Lines

Let's first check how we can use grep to search for a specific pattern in a given file.

Example:

$ grep "unix" test.txt

This matches the pattern "unix" and prints each matching line, with the match highlighted when color output is enabled. Note that grep is case-sensitive by default.

To get the number of lines that match this pattern, use the -c option, which counts matching lines.

$ grep -c "unix" test.txt

An output of 2 indicates that two lines contain the pattern "unix".
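
When the count is used inside a script, it helps to know what happens when nothing matches: grep -c still prints 0, but grep exits with status 1. A minimal sketch, using an invented file and search word:

```shell
# Throwaway file; the text is invented for this illustration.
sample=$(mktemp)
printf 'unix\nUnix\n' > "$sample"

# No line contains "solaris": grep prints 0 and exits with status 1.
# The && / || chain captures the status without aborting under "set -e".
count=$(grep -c solaris "$sample") && status=0 || status=$?

echo "count=$count status=$status"

rm -f "$sample"
```

Testing the printed count rather than the exit status keeps the "no matches" case separate from real errors such as an unreadable file, which grep reports with a status greater than 1.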

4. Counting Multiple Matches in a Line

To find multiple matches per line, use the -o option with grep. The -o option prints each occurrence of the pattern on a separate line.

$ grep -o "unix" test.txt

This will search for the word "unix" in the file test.txt and display each occurrence of "unix" on a separate line.

Now we can pipe this output to wc -l to count the matches. Alternatively, pipe it to a second grep with the -c option.

$ grep -o "unix" test.txt | wc -l
or
$ grep -o "unix" test.txt | grep -c "unix"

To count how many times each distinct matched string occurs, type:

$ grep -o "unix" test.txt | sort | uniq -c

Let's look at another example, this time using a regular expression.

$ grep -o " u[a-z]*" test.txt

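The regular expression u[a-z]* matches "u" followed by zero or more lowercase letters, so it matches words such as "unix" and "ubuntu". Because matching is case-sensitive, a word like "uTorrent" contributes only its leading "u". The pattern in the command also starts with a space, so a word at the very beginning of a line is not matched.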
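
The behavior of this pattern can be checked with a small sketch. The sample lines are made up, and LC_ALL=C is set here only so that [a-z] is limited to ASCII lowercase letters regardless of the system locale. The -w variant at the end is an alternative that is not part of the article's command.

```shell
# Throwaway file; the text is invented for this illustration.
sample=$(mktemp)
printf '%s\n' 'unix and ubuntu' 'try uTorrent' > "$sample"

# The leading space in the pattern skips "unix" at the start of line 1.
LC_ALL=C grep -o ' u[a-z]*' "$sample"

matches=$(LC_ALL=C grep -o ' u[a-z]*' "$sample" | wc -l)
# The match taken from "uTorrent" stops before the uppercase T.
last=$(LC_ALL=C grep -o ' u[a-z]*' "$sample" | tail -n 1)

# Alternative: -w matches whole words, including one at the start of a line.
words=$(LC_ALL=C grep -o -w 'u[a-z]*' "$sample" | wc -l)

echo "matches=$matches words=$words"

rm -f "$sample"
```

With -w the two whole words "unix" and "ubuntu" are counted, while the partial match inside "uTorrent" is rejected.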

The following command outputs the total number of occurrences of the pattern " u[a-z]*" in the file test.txt.

$ grep -o " u[a-z]*" test.txt | wc -l

An output of 2 means that there are two occurrences of the pattern " u[a-z]*" in the file test.txt.

4.1 Counting Matches in Multiple Files

grep is not limited to a single file; we can also use it to match a pattern in multiple files.

To understand this let's create another file named test2.txt with some sample data.

Let's first search for the word "unix" in all files with a .txt extension in the current directory.

$ grep "unix" *.txt

Now let's display the count of matching lines for each file.

$ grep -c "unix" *.txt
or
$ grep -c "unix" test.txt test2.txt

In this example, test.txt contains 2 lines with the word "unix" and test2.txt contains 1 line. When more than one file is given, grep prefixes each count with the file name.
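
A possible way to turn the per-file counts into a single total is to sum them with awk. The sketch below assumes two small files whose contents are invented to give the same counts as above; it also assumes the file names contain no colon.

```shell
# Throwaway directory with two sample files (contents are invented).
dir=$(mktemp -d)
printf 'unix\nlinux\nunix again\n' > "$dir/test.txt"
printf 'unix only once\n'          > "$dir/test2.txt"

# With several files, -c prints one "name:count" line per file.
per_file=$(cd "$dir" && grep -c unix *.txt)
printf '%s\n' "$per_file"

# Sum the counts; the count is the last colon-separated field.
total=$(printf '%s\n' "$per_file" | awk -F: '{ sum += $NF } END { print sum }')
echo "total matching lines: $total"

rm -rf "$dir"
```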

5. Case-Insensitive Matching

By using the -i option, grep treats uppercase and lowercase characters as equivalent during the search, effectively making it case-insensitive.

For example:

$ grep -i "unix" test.txt

It matches unix, Unix, uNix, and so on, irrespective of case. To get the count, simply add the -c option.

$ grep -ic "unix" test.txt

To ignore case and count each distinct match:

$ grep -oi "unix" test.txt | sort | uniq -c
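
To show what this pipeline reports, here is a small sketch with invented sample text. LC_ALL=C is added to sort as an assumption, so that the ordering of differently cased spellings does not depend on the system locale.

```shell
# Throwaway file; the text is invented for this illustration.
sample=$(mktemp)
printf '%s\n' 'unix Unix unix' 'UNIX and unix' > "$sample"

# One output line per distinct spelling, each prefixed with its count.
grep -oi unix "$sample" | LC_ALL=C sort | uniq -c

# Number of distinct spellings found (unix, Unix, UNIX).
distinct=$(grep -oi unix "$sample" | LC_ALL=C sort -u | wc -l)

# Total matches regardless of case.
total=$(grep -oi unix "$sample" | wc -l)

echo "distinct=$distinct total=$total"

rm -f "$sample"
```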

6. Grep Count Recursive

The -r option searches for a pattern recursively, in every file of every directory below the starting location.

The current directory contains two files (test.txt and test2.txt) and a directory (opsys).

The following command will search for the string "unix" in all files within the current directory and its subdirectories. If a match is found, grep will display the line containing the matched pattern along with the corresponding file name.

$ grep -r "unix" *

To count the number of occurrences of the string "unix" in all files within the current directory and its subdirectories:

$ grep -or "unix" * | wc -l

An output of 9 indicates the total count of occurrences of the string "unix" in test.txt, test2.txt, and opsys/text2.txt.
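
If a per-file breakdown of a recursive count is wanted, one possible approach is to keep only the file-name prefix of each match and tally it with uniq -c. The tree below is hypothetical, and the sketch assumes file names contain no colon.

```shell
# Small throwaway tree (layout and contents are invented).
dir=$(mktemp -d)
mkdir -p "$dir/opsys"
printf 'unix unix\n' > "$dir/test.txt"
printf 'unix\n'      > "$dir/opsys/notes.txt"

# Total occurrences below the directory.
total=$(cd "$dir" && grep -o -r unix . | wc -l)

# Each match is printed as "path:match"; keep the path and count repeats.
per_file=$(cd "$dir" && grep -o -r unix . | cut -d: -f1 | LC_ALL=C sort | uniq -c)
printf '%s\n' "$per_file"

echo "total=$total"

rm -rf "$dir"
```

Starting the search from . instead of * also covers hidden files in the top directory, which the shell does not include when it expands *.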

7. Count Commented / Empty Lines

grep is also a handy tool for counting commented lines and empty lines within a shell script. Here we will use a sample file named test.txt.

7.1. Commented Lines (#)

Use the following grep command to count the number of lines in a file that start with a hash symbol (#).

$ grep -c "^#" test.txt

Output 1 indicates that there is only 1 commented line that begins with a hash symbol.

7.2. Empty Lines

"^$" is a regular expression pattern that matches empty lines. The caret (^) represents the beginning of a line, and the dollar sign ($) represents the end of a line. When both are combined without any characters in between, it represents an empty line.

The following grep command counts the number of blank lines in a file named test.txt.

$ grep -c "^$" test.txt

The output indicates the file contains 4 blank lines.
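
The two patterns from this section can be tried together on a small script. The script contents below are invented. The [[:space:]] variants are an addition for illustration, covering indented comments and lines that contain only whitespace.

```shell
# Throwaway script; the contents are invented for this illustration.
script=$(mktemp)
printf '%s\n' \
  '#!/bin/sh' \
  '# a full-line comment' \
  '' \
  '   # an indented comment' \
  'echo hello   # trailing comment' \
  '   ' \
  '' > "$script"

comments=$(grep -c '^#' "$script")              # lines starting with # (shebang included)
indented=$(grep -c '^[[:space:]]*#' "$script")  # also counts indented comments
empty=$(grep -c '^$' "$script")                 # lines with no characters at all
blank=$(grep -c '^[[:space:]]*$' "$script")     # also counts whitespace-only lines

echo "comments=$comments indented=$indented empty=$empty blank=$blank"

rm -f "$script"
```

Note that "^#" also counts the shebang line, and that neither comment pattern counts a comment that follows code on the same line.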

8. Conclusion

In conclusion, grep is a powerful tool for efficiently counting matches in text files. It offers a straightforward syntax and a range of options to meet diverse matching needs. Whether it's counting matching lines, multiple matches within a line, or searching recursively in directories, grep provides an effective solution.

About The Author

Bobbin Zachariah

Bobbin Zachariah is an experienced Linux engineer who has supported infrastructure for many companies. He specializes in shell scripting, AWS Cloud, JavaScript, and Node.js. He holds a Master's degree in computer science, as well as the Red Hat Certified Engineer (RHCE) certification and RedHat Enable Sysadmin.
