Linux Pipe Command: Streamline Your Workflow with Examples

Written by: Bobbin Zachariah   |   Last updated: April 30, 2023

The pipe operator "|" in Linux is a nifty way to connect multiple commands into a single, larger command that produces the output you're after. The beauty of pipes in Linux is that they take inspiration from a concept we all understand - physical pipelines.

Think of a production line in a factory. Each machine in the production line takes raw materials, processes them, then passes them along to the next machine in the line, and so on. Similarly, with pipes in Linux, each command in a pipeline takes data, manipulates it, and passes it along to the next command.

No need to bother with manually feeding data from one command to another - pipes let you automate the process. This clever mechanism is a must for every user, whether you're a hobbyist or a newcomer to the Linux scene.

In this tutorial, we'll learn basic and advanced pipe usage through multiple examples and commonly used pipeline commands. Let's dive right in.

How Pipes Work in Linux

Linux commands typically read input from your keyboard and write output to your screen - but they can also handle input and output from other commands. Thanks to pipes, you can forward one command's output to the next command as input. In other words, the result of one command becomes the source of the next - this chain of commands is called a pipeline.

Understanding Input and Output

So, what do we mean exactly when we say "reading input" or "writing output"?

When you type input for a command in the terminal, you're feeding data to stdin, or standard input (reading input). On the other hand, when a command prints some kind of result, it appears on stdout, or standard output (writing output). These are the two fundamental data streams behind Linux pipes.
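
To see both streams in isolation, try running wc -l with no arguments. It reads lines from stdin - your keyboard - until you press Ctrl+D, then writes the line count to stdout - your screen:

$ wc -l
first line
second line
2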

The pipe operator simply redirects data on stdout of one command to the stdin of the other.
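
Here's a minimal illustration: echo writes its text to stdout, the pipe hands that text to the stdin of tr, and tr writes the uppercased result back to your screen:

$ echo "hello pipes" | tr 'a-z' 'A-Z'
HELLO PIPES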

Understanding Pipes

Let's delve even further! A pipe is essentially a small in-kernel buffer that comes equipped with two file descriptors - one for reading and the other for writing. The right end of the pipe is for reading and the left end is for writing.

Think of a pipe as a tool that directs the output of the first process to its right end, where the second process is ready and waiting to receive it as input. This allows the two processes to work together seamlessly, with the output of one being directly fed into the input of the other.

It's like a relay race, where the baton (data) passes from one runner (process) to the next until it reaches the finish line.

Pipes are in fact an inter-process communication mechanism provided by Linux. They're designed to allow programs (processes) to communicate with each other by forwarding the output of one program (the left side of the pipe) to the input of another program (the right side of the pipe).
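
On most Linux systems you can even observe this mechanism directly (assuming /proc is available, which it normally is): run a command on the right-hand side of a pipe and list its open file descriptors:

$ echo test | ls -l /proc/self/fd

In the listing, descriptor 0 (stdin) shows up as something like 0 -> pipe:[48365] rather than a terminal device - the number is the pipe's inode and will differ on your system - while descriptor 1 (stdout) still points at your terminal.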

Using Pipelines

As mentioned before, a pipeline is basically a sequence of two or more commands connected by the control operator "|", the vertical bar. So, when you see a command line with a pipe, it's called a pipeline.

To use pipes, keep this synopsis in mind: command1 | command2 | ... | commandN

Each command in a pipeline processes the data passing through it in a specific way. The pipeline passes the output of command1 to command2 as input, then passes the output of command2 to the next, and so on until it reaches commandN.

Typically, each command should be responsible for performing a specific operation on the data, such as filtering, sorting, transforming, or aggregating it.
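
As a small self-contained sketch of that division of labor - grep filters, sort orders, and uniq -c aggregates:

$ printf 'apple\nbanana\napple\ncherry\n' | grep -v banana | sort | uniq -c
      2 apple
      1 cherry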

Examples and Use Cases

The idea of using pipes to connect multiple commands may seem confusing in theory. But once you see some real-life examples, you'll realize it's actually quite simple.

Let's explore pipes in action through a variety of examples and possibilities that you're likely to encounter in the future.

Getting Started

Suppose you want to search a large directory for filenames containing the string "zaaiy" using the grep command. To achieve this, take it one step at a time as you manipulate data through multiple commands - especially when dealing with long or confusing pipelines.

First, let's list the directory in question:

$ ls thousand_files/
{1000 lines}

Your next step is to feed the above command's output to the grep command as input. Just add the pipe operator | and then the grep command with the desired options:

$ ls thousand_files | grep zaaiy

Done! The command prints all entries that include the "zaaiy" string.


While we're at it, let's add another command to the mix to print out how many lines the grep command has found. Simply add another pipe symbol and the wc -l command, like so:

$ ls thousand_files | grep zaaiy | wc -l

Just like that, you can build customizable commands to achieve a specific goal. It's a simple concept actually: one command writes data to the standard output, the pipe operator forwards it to the standard input of the next command, and that command reads it.

But before jumping into experimenting with pipes on your own, let's check out further examples that introduce other commands which often work hand in hand with pipes.

Dealing with Lengthy Output

Many commands and files in Linux produce lengthy output, requiring you to scroll back to the top - this can be annoying. To streamline this process, you can pipe any type of long output to the less command line utility. This allows you to view it line by line or one page at a time.

For example, Linux offers the set command to view shell variables, environment variables, and shell functions. But this command can print an overwhelming amount of output. To view the result in a more friendly way, let's pipe the output to the less command:

$ set | less

The pipeline mentioned above is a handy tool to help you sift through long output right from your terminal. Simply press Enter to view the next line, or hit the spacebar to move to the next page. You can also use the arrow keys on your keyboard to intuitively navigate through the output. Finally, to exit the view, simply press Q.
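
Another trick worth knowing: less can also search. While the pager is open, type /PATH and press Enter to jump to the first line containing "PATH", then press n to jump to the next match.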

Let's check one more example so you'll get the hang of this. Running the command ls -la in the /etc directory produces a long output that can be challenging to navigate. But, as with the previous example, you can simplify things by piping the output to the less command:

$ ls -la /etc | less

Sorting Output

Another common command paired with pipes is the sort command, which you can use to print output in a specific order. Suppose we want to print the following list of companies:

$ cat companies.txt
SolarZU 250M
PostalLight 32M
Bouhannana 351M
RoyalOak 45M
Almada 274M
Streamo 142M
Bingboom 210M
...

To sort the above output by name, simply introduce the sort command after the pipe operator:

$ cat companies.txt | sort

By default, the sort command sorts lines alphabetically from the start of each line - in this case, by company name (the first column).
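
Note that this default ordering is lexicographic, not numeric - for instance, the string "10" sorts before "9". A quick self-contained demonstration:

$ printf '9\n10\n2\n' | sort
10
2
9
$ printf '9\n10\n2\n' | sort -n
2
9
10

That's why the numeric examples below pass the -n option.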

Let's make it more interesting and add the head command to capture only the top 3 companies by value (second column):

$ cat companies.txt | sort -nrk2 | head -n3

The sort command in the above pipeline uses the -k2 option to sort the output based on the second column, and the -nr options to sort it numerically in descending order. The head command, in turn, uses the -n3 option to return only the first 3 lines.
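
One caveat: -n only looks at the leading digits, so it can't tell 250M from 250G apart. All the values in this file happen to end in M, so -n works here, but if your data mixes suffixes such as K, M, and G, GNU sort's human-numeric mode is the safer choice:

$ cat companies.txt | sort -hrk2 | head -n3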

Updating Output

A common tool you can use to modify output on the fly is the sed command. Using the previous file example, let's print the top 3 companies by value, but replace "Bouhannana" with "Smara" in the final output:

$ cat companies.txt | sort -nrk2 | head -n3 | sed 's/Bouhannana/Smara/g'

The sed part of our pipeline uses the s command to substitute "Bouhannana" with "Smara" and the g flag to replace all occurrences on each line (if there are any).
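
To see exactly what the g flag changes, compare a substitution with and without it on a line that contains the pattern twice:

$ echo "aa" | sed 's/a/b/'
ba
$ echo "aa" | sed 's/a/b/g'
bb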

The idea of pipes should be a little bit clearer now.

Dealing With Advanced Pipelines

Combining commands with pipes can simplify complex operations into just one line without wasting time or effort. For instance, say you want to list all the groups that own files in a directory, e.g. the /etc directory. This may seem complicated at first, but you can always break the pipeline down into small parts.

First, list the content of the directory using the ls -la command. Then, extract only the results in the fourth column (group names) using the awk command, sort them, and finally, eliminate duplicate lines with the uniq command.

Check out the example pipeline below for a better understanding:

$ ls -la | awk '{ print $4 }' | sort | uniq

The $4 in the above awk command grabs the fourth column and discards the rest of the output. Keep in mind that you should sort data before piping it to the uniq command - uniq only collapses adjacent duplicate lines, so unsorted input won't return unique entries.
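
As a natural next step, uniq's -c option prefixes each unique line with a count of its occurrences; piping that through a second, numeric, descending sort ranks the groups by how many files each one owns:

$ ls -la | awk '{ print $4 }' | sort | uniq -c | sort -nr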

To take it up a notch, say you want to get the names of all files (the ninth column) that were last modified in April ("Apr"). Once again, construct the pipeline command by command so you won't get confused:

$ ls -la | grep "Apr" | awk '{ print $9 }'

Same as before, but this time you want to list the filenames along with their size (the fifth column), sorted by the day of the month (the seventh column). To achieve this, simply modify the previous pipeline by sorting the grep results by the seventh column and displaying the desired columns using the awk command:

$ ls -lah | grep "Apr" | sort -nk7 | awk '{ print $5 " ~> " $9 }'

The command prints filenames and their sizes in a human-readable format thanks to the -h option. It also sorts the entries based on the day of the month (the seventh column) before grabbing the fifth and ninth columns with the awk command.

To be sure that the sort command actually sorted the output, let's exclude the awk part from the previous pipeline:

$ ls -lah | grep "Apr" | sort -nk7
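
A final word of caution about these examples: grep "Apr" matches the string anywhere in the line, including inside filenames. A stricter variant lets awk test only the month column (the sixth) - assuming your ls prints dates in the usual "Mon DD" format, which is locale-dependent:

$ ls -lah | awk '$6 == "Apr" { print $5 " ~> " $9 }'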

Conclusion

After diving into the world of pipes in Linux, it's clear that understanding the mechanics of stdin and stdout is paramount. But it's not just about mastering the fundamentals - it's also important to explore frequently used pipe-friendly commands to unlock even more potential. By getting comfortable with pipes and related tools like awk, grep, and sort, you'll be able to streamline your workflows and get your command-line work done effectively and efficiently.

About The Author

Bobbin Zachariah

Bobbin Zachariah is an experienced Linux engineer who has supported infrastructure for many companies. He specializes in shell scripting, AWS Cloud, JavaScript, and Node.js. He holds a Master's degree in computer science and the Red Hat Certified Engineer (RHCE) certification, and contributes to Red Hat's Enable Sysadmin.
