Most of us use the Linux sort command without thinking about it: we just type “sort” followed by the name of the text file we want to sort. However, the sort command can do much more than that!
In this tutorial, we will explore the sort command in more detail, including how to use it to sort data in various ways and how to customize the sort using options.
Sort Command
The Linux sort command is a utility that sorts the lines of text files alphabetically or numerically. It is often used with other commands, such as grep and uniq, to analyze and manipulate text data. The sort command compares lines according to the current locale; in the C (POSIX) locale it compares them byte by byte, which for ASCII text means ASCII order.
Syntax:
sort [options] <filename>
By default, the sort command arranges the lines of a file in alphabetical order. Let’s work with a sample text file named textfile.txt. First, we view the file’s content using the cat command.
cat textfile.txt
Now, let’s use the sort command on the file.
sort textfile.txt
The output has been sorted alphabetically.
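As a minimal sketch of the default behaviour, the snippet below builds a small sample file and sorts it. The file name fruits.txt and its contents are made up for illustration and are not the article's textfile.txt. Setting LC_ALL=C is an addition of this example: it forces byte-order comparison so the result is the same on every system.

```shell
# Hypothetical sample data, created in a temporary directory.
tmpdir=$(mktemp -d)
printf 'pear\napple\ncherry\nbanana\n' > "$tmpdir/fruits.txt"

# With no options, sort orders whole lines alphabetically.
sorted=$(LC_ALL=C sort "$tmpdir/fruits.txt")
printf '%s\n' "$sorted"

rm -r "$tmpdir"
```

The input file itself is not modified; only the output is sorted.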
How to Use Sort Command
Here are a few practical examples of sort commands you will find helpful in various situations.
Input and Output
By default, the sort command writes the sorted lines to standard output (the screen) and leaves the input file unchanged. To save the sorted output to a file instead, use the -o option.
sort -o <sorted file> <filename>
The command sorts the contents in textfile.txt and saves it to an output file named sorted_textfile.txt. Running the cat command outputs the contents of the new file.
Note that the name of the output file must come immediately after the -o option.
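The following sketch shows -o in action, using a hypothetical words.txt created on the spot. It also shows that, unlike shell redirection with "sort file > file", -o can safely name the input file itself, because sort reads all of its input before it writes the output.

```shell
tmpdir=$(mktemp -d)
printf 'delta\nalpha\ncharlie\nbravo\n' > "$tmpdir/words.txt"

# Save the sorted lines to a separate file; words.txt is left untouched.
LC_ALL=C sort -o "$tmpdir/sorted_words.txt" "$tmpdir/words.txt"
saved=$(cat "$tmpdir/sorted_words.txt")

# Sort a file in place by naming it as both input and output.
LC_ALL=C sort -o "$tmpdir/words.txt" "$tmpdir/words.txt"
in_place=$(cat "$tmpdir/words.txt")

printf '%s\n' "$saved"
rm -r "$tmpdir"
```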
With the sort command in Linux, we can also merge files that are already sorted. The -m option combines the input files into a single sorted output without sorting each file again.
sort -m file1 file2 file3
Using the cat command, we first display the contents of sorted_numberfile.txt. The sort -m command then merges the contents of sorted_textfile.txt and sorted_numberfile.txt. The result is correctly ordered only if every input file is already sorted.
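Here is a small sketch of merging, assuming two hypothetical files a.txt and b.txt that are each already in sorted order.

```shell
tmpdir=$(mktemp -d)
# Both inputs are already sorted; -m relies on that.
printf 'apple\ncherry\nmango\n' > "$tmpdir/a.txt"
printf 'banana\ngrape\npeach\n' > "$tmpdir/b.txt"

# Interleave the two sorted inputs into one sorted stream.
merged=$(LC_ALL=C sort -m "$tmpdir/a.txt" "$tmpdir/b.txt")
printf '%s\n' "$merged"

rm -r "$tmpdir"
```

Because -m only interleaves its inputs, it is cheaper than a full sort, but it does not fix inputs that are out of order.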
The sort command can also be used with the -i (ignore non-printing) option. With this option, sort considers only printable characters when comparing lines and ignores non-printing ones. This is useful when sorting files that contain stray control characters.
sort -i <filename>
Numerical sorting
To sort numerical values, pass the -n option to the sort command. This sorts from the lowest to the highest value, and the result is written to standard output.
sort -n <filename>
The cat command outputs the contents of numberfile.txt. The file contains only numbers, but without any options sort compares them character by character, so 10, for example, is placed before 9. When the -n option is used, the numbers are sorted by their numeric value.
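The difference between the default comparison and -n is easy to see on a few made-up numbers (these are not the contents of the article's numberfile.txt):

```shell
# Default sort compares character by character, so "10" and "100" come before "2".
lexical=$(printf '10\n9\n100\n2\n' | LC_ALL=C sort)

# -n compares by numeric value.
numeric=$(printf '10\n9\n100\n2\n' | LC_ALL=C sort -n)

printf 'default:\n%s\n' "$lexical"
printf 'with -n:\n%s\n' "$numeric"
```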
Field-based sorting
This sorting method lets us specify the fields we want to sort by. This is especially useful when sorting large files by multiple criteria. To sort the output of a command by a field, use the -k option followed by the field number. We can also use the -t option to specify the field separator.
Sort by Columns
Sorting is not limited to single-column files; it also works on files with multiple columns. The -k option specifies the column to sort by. Without the -k option, sorting is performed using the entire line. Several examples are included to show the range of possible outputs.
Sample file with multiple columns:
To sort by a single column, for example the first column, use -k1,1. This means the sort key runs from field 1 to field 1, i.e. only the first field.
sort -k1,1 textfile.txt
The output shows the command sorted the lines of textfile.txt in ascending alphabetical order based on the first column of each line.
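In the same way, to sort numerically by the second column, type:
sort -k2,2n textfile.txt
To compare the second column as text instead, omit the n modifier:
sort -k2,2 textfile.txt
With -k2,2n, textfile.txt is sorted by the numeric value of the second column. Without n, the column is compared character by character, so 100, for example, is placed before 25.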
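The three key forms above can be compared side by side. This is a sketch on a hypothetical two-column file (a name and a score); the data is invented for illustration.

```shell
tmpdir=$(mktemp -d)
printf 'carol 9\nalice 25\nbob 100\n' > "$tmpdir/scores.txt"

# Key is field 1 only: alphabetical by name.
by_name=$(LC_ALL=C sort -k1,1 "$tmpdir/scores.txt")

# Key is field 2, compared as a number.
by_score=$(LC_ALL=C sort -k2,2n "$tmpdir/scores.txt")

# Key is field 2, compared as text: "100" sorts before "25" and "9".
by_score_text=$(LC_ALL=C sort -k2,2 "$tmpdir/scores.txt")

printf '%s\n\n%s\n\n%s\n' "$by_name" "$by_score" "$by_score_text"
rm -r "$tmpdir"
```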
Column field separator
By default, columns are separated by blanks (spaces or tabs). We can specify a custom delimiter with the -t or --field-separator option in sort.
Our sample textfile.txt now contains multiple fields: a person’s name, age, and country, all separated by colons.
For example, to sort numerically based on the second field, which is separated from the first field by a colon (:), type:
sort -t ':' -nk2 textfile.txt
Here, the -t option specifies the field separator to use when sorting. The -n option sorts the lines numerically. The -k option specifies the field to sort on. The 2 after -k specifies the second field.
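A self-contained sketch of the same idea follows, using invented name:age:country records. It writes the key as -k2,2n, which limits the key to the second field only; this is a slightly stricter variant of the -nk2 form, chosen here as an assumption about what is usually wanted.

```shell
tmpdir=$(mktemp -d)
printf 'Maria:34:Spain\nJohn:9:Canada\nAisha:27:Kenya\n' > "$tmpdir/people.txt"

# -t ':' sets the separator; -k2,2n sorts numerically on the age field alone.
by_age=$(LC_ALL=C sort -t ':' -k2,2n "$tmpdir/people.txt")
printf '%s\n' "$by_age"

rm -r "$tmpdir"
```

Note that the quotes around the colon must be plain ASCII quotes; typographic quotes copied from a web page are passed to sort as part of the separator and cause an error.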
Month Sort
Use the -M option with the sort command to arrange data by the month names listed in a file.
To illustrate, we have a sample file whose sixth column contains month names, currently arranged alphabetically.
To sort the data by month, enter the following in the command line:
sort -Mk6 monthfile.txt
The output shows the file has been sorted in calendar order (January to December) based on the sixth column, which contains the month.
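To make the calendar ordering concrete, here is a sketch on a simplified, hypothetical file where the month abbreviation is in the second column rather than the sixth. It assumes GNU sort, which recognises three-letter English month abbreviations in the C locale.

```shell
tmpdir=$(mktemp -d)
printf 'report Mar\ninvoice Jan\nsummary Dec\nbudget Feb\n' > "$tmpdir/months.txt"

# The M modifier on the key compares month names as Jan < Feb < ... < Dec.
by_month=$(LC_ALL=C sort -k2,2M "$tmpdir/months.txt")
printf '%s\n' "$by_month"

rm -r "$tmpdir"
```

A plain alphabetical sort on the same column would instead put Dec first and Mar last.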
Reverse sorting
With the -r option, we can sort the data in a file in reverse order: reverse alphabetical order by default, or from highest to lowest when combined with -n.
sort -r <filename>
The output shows that the data has been sorted in reverse order.
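Both kinds of reversal can be sketched with a few made-up lines:

```shell
# Reverse alphabetical order.
words_desc=$(printf 'banana\napple\ncherry\n' | LC_ALL=C sort -r)

# Combine -n and -r to sort numbers from highest to lowest.
numbers_desc=$(printf '3\n20\n1\n' | LC_ALL=C sort -nr)

printf '%s\n\n%s\n' "$words_desc" "$numbers_desc"
```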
Unique Lines
With the -u option, sort removes duplicate entries from its output; the input file itself is not changed. This option can also be used with multiple input files.
sort -u <filename>
In monthfile.txt, some months appear more than once. We can combine the -u option with a month key so that only one line per month is kept.
sort -u -Mk6 monthfile.txt
From the output of the command, there are no more duplicate months.
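The sketch below shows the two ways -u behaves, on invented data. Without a key, whole lines are compared; with a key, lines are treated as duplicates whenever their keys are equal, even if the rest of the line differs.

```shell
# Whole-line comparison: repeated lines collapse to a single copy.
unique=$(printf 'red\nblue\nred\ngreen\nblue\n' | LC_ALL=C sort -u)
printf '%s\n' "$unique"

# Key comparison: two lines share the month Jan, so only one of them survives.
per_month=$(printf 'rent Jan\ngas Feb\nfood Jan\n' | LC_ALL=C sort -u -k2,2M | wc -l)
echo "lines kept: $per_month"
```

When the goal is to drop only fully identical lines, omit the key and use plain sort -u.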
Human Readable Numbers
The -h flag tells sort to compare human-readable numbers, such as 2K or 1G, by the size they represent rather than lexicographically (alphabetically).
Example:
sort -h <filename>
In this example, the sort command rearranges the lines in the file from smallest to largest, taking the unit suffixes (K, G) into account.
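As an illustration, the sizes below are invented values of the kind printed by du -h or ls -lh. The -h option is assumed to be available, as it is in GNU sort.

```shell
# A value with no suffix sorts before K, which sorts before M, then G.
by_size=$(printf '2K\n1G\n512\n3M\n' | LC_ALL=C sort -h)
printf '%s\n' "$by_size"
```

With -n instead of -h, the suffixes would be ignored and 1G would sort before 2K.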
Check the input file for sortedness
The -c option is useful for quickly checking whether a file is already sorted, without having to sort the file again. There will be no output if the file has already been sorted. If a line is out of order, the command reports the first such line and exits with a non-zero status.
sort -c <filename>
The first command shows that the contents of /etc/passwd are not sorted, and it also tells us which line is out of place.
For the second command, we sort the contents of /etc/passwd and pipe the output to sort -c to check whether it is sorted. Nothing is printed on the screen, which confirms that the input is sorted.
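Because sort -c reports its result through the exit status, it fits naturally into a shell conditional. The sketch below uses a hypothetical list.txt rather than /etc/passwd so that it behaves the same on any machine; the diagnostic message is sent to /dev/null because only the status matters here.

```shell
tmpdir=$(mktemp -d)
printf 'apple\ncherry\nbanana\n' > "$tmpdir/list.txt"

# Exit status is non-zero because "banana" follows "cherry".
if LC_ALL=C sort -c "$tmpdir/list.txt" 2>/dev/null; then
    before=sorted
else
    before=unsorted
fi

# Sort the file in place, then check again.
LC_ALL=C sort -o "$tmpdir/list.txt" "$tmpdir/list.txt"
if LC_ALL=C sort -c "$tmpdir/list.txt" 2>/dev/null; then
    after=sorted
else
    after=unsorted
fi

echo "before: $before, after: $after"
rm -r "$tmpdir"
```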
Combine with Pipe Operator
The sort command can be used in combination with the | (pipe) operator to sort the output of another command.
Example:
tail /etc/passwd | sort
We have used the tail command to output the last ten lines of the /etc/passwd file. The result is then piped to the sort command, which sorts the lines alphabetically.
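Pipelines can also chain sort with uniq, mentioned earlier as a common companion. The following sketch, on invented input, finds the most frequent line: the first sort groups identical lines together, uniq -c counts each group, and the second sort orders the counts from highest to lowest.

```shell
# Input lines are hypothetical; "b" appears three times.
most_common=$(printf 'b\na\nb\nc\nb\n' \
    | LC_ALL=C sort \
    | uniq -c \
    | sort -rn \
    | head -n 1 \
    | awk '{print $2}')
echo "most common line: $most_common"
```

The first sort is required because uniq only collapses adjacent duplicate lines.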
Conclusion
The sort command is often used in combination with other commands (such as ls or ps) to sort the output of those commands.
You can use the sort command with many different options to customize the way it sorts the data. For more information about the sort command, you can consult the sort man page by running man sort in your terminal.