The sort command in Linux is a command-line utility for sorting text files or data streams. It reads lines of text, orders them according to the options you give, and writes the result to standard output.
sort supports a wide range of operations, such as alphabetical or numerical sorting, ascending or descending order, and sorting by specific fields within each line.
Syntax
The basic syntax for the sort command is as follows:
sort [OPTION]... [FILE]...
Options
Here are some commonly used options:
-r: Sort in reverse order (descending).
-n: Sort numerically instead of lexicographically (for numbers).
-k: Sort by specific fields within each line.
-b: Ignore leading whitespace when sorting.
-f: Perform a case-insensitive sort.
-u: Remove duplicate lines, displaying only unique lines.
-t: Specify a custom field delimiter.
-c: Check if a file is already sorted (exit status indicates sorted or not).
-o: Output to a file instead of displaying on the terminal.
-M: Sort by month name.
-h: Sort human-readable numbers (e.g., 1K, 2M) correctly.
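Two of the options listed above, -c and -h, have no example of their own in this article. The following is a minimal sketch, assuming GNU sort (-h is a GNU extension); the sample values are made up for illustration.

```shell
# -h: compare human-readable sizes (GNU sort). Sample values are illustrative.
sizes=$(printf '2M\n1K\n512\n1G\n' | sort -h)
echo "$sizes"

# -c: check whether the input is already sorted. Nothing is sorted here;
# only the exit status is used (0 = sorted, non-zero = not sorted).
if printf 'apple\nbanana\ncherry\n' | LC_ALL=C sort -c 2>/dev/null; then
    sorted_check="sorted"
else
    sorted_check="not sorted"
fi
echo "$sorted_check"

if printf 'banana\napple\n' | LC_ALL=C sort -c 2>/dev/null; then
    unsorted_check="sorted"
else
    unsorted_check="not sorted"
fi
echo "$unsorted_check"
```

Because -c reports through its exit status, it fits naturally into an if statement or a script guard.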
sort Command Examples
Sorting in Ascending Order
By default, the sort command sorts lines in ascending order (A to Z or 0 to 9).
Example 1: Sorting from a File
Let's assume we have a file named sampledata.txt with the following content:
Banana 10
Cherry 5
Apple 7
Grapes 2
You can sort it in ascending order with this command:
sort sampledata.txt
Output:
Apple 7
Banana 10
Cherry 5
Grapes 2
sort compares whole lines character by character, so here the order is decided by the first word in each line. The numbers in the second column play no role.
Example 2: Sorting from Standard Input
You can also provide input to sort via the standard input, which is useful when using pipelines. For example:
echo -e "banana\napple\ncherry" | sort
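The -e flag of echo is not understood by every shell, so a pipeline built on printf is a somewhat more portable way to feed test lines to sort. A minimal sketch using the same three words (the variable name fruits is only for illustration):

```shell
# printf interprets \n itself, so no shell-specific echo flag is needed.
fruits=$(printf 'banana\napple\ncherry\n' | sort)
echo "$fruits"
```

This prints apple, banana, and cherry on separate lines.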
Sorting in Descending Order
To sort in descending order, where the highest value or latest character comes first, use the -r option.
Example 3: Sorting in Descending Order
sort -r sampledata.txt
Output:
Grapes 2
Cherry 5
Banana 10
Apple 7
Sorting Numerically
To sort numerically, you can use the -n option.
Example 4: Numerical Sorting
Suppose we have a file named numbers.txt with the following content:
42
5
18
73
10
You can sort it numerically with this command:
sort -n numbers.txt
Output:
5
10
18
42
73
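To see why -n matters, it may help to sort the same numbers without it. The sketch below uses the contents of numbers.txt from the example and forces the C locale for the plain sort so the byte-wise comparison is predictable; the variable names are illustrative.

```shell
numbers=$(printf '42\n5\n18\n73\n10\n')

# Without -n, lines are compared as text: "10" sorts before "5"
# because the character '1' comes before '5'.
lexical=$(printf '%s\n' "$numbers" | LC_ALL=C sort)

# With -n, lines are compared by numeric value.
numeric=$(printf '%s\n' "$numbers" | sort -n)

echo "Text order:    $(echo $lexical)"
echo "Numeric order: $(echo $numeric)"
```

The text order is 10 18 42 5 73, while the numeric order is 5 10 18 42 73.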
Sorting by Specific Field
You can sort lines based on a specific field (column) within each line using the -k option. Fields are separated by a delimiter (default is whitespace).
Example 5: Sorting by Second Field (Using Space as Delimiter)
Let's consider a file named samplefile.txt with the following content:
Banana sweet 10
Cherry sour 34
Apple yummy 5
Grapes lovely 40
To sort it by the second field, you can use this command:
sort -k2 samplefile.txt
Output:
Grapes lovely 40
Cherry sour 34
Banana sweet 10
Apple yummy 5
To sort by the third column, which contains numbers, combine the -n (numerical order) option with the -k option:
sort -n -k3 samplefile.txt
Output:
Apple yummy 5
Banana sweet 10
Cherry sour 34
Grapes lovely 40
To sort in descending order instead (still based on the third field), add -r: sort -n -k3 -r samplefile.txt.
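A key written as -k2 runs from the second field to the end of the line, whereas -k2,2 stops at the end of the second field. The difference only shows when the chosen field contains ties. The sketch below uses two made-up rows (not from the files in this article) to show it, then applies the compact form -k3,3nr to the samplefile.txt data; it assumes a POSIX-style sort and uses the C locale for predictable text comparison.

```shell
# Hypothetical rows in which the second field ties.
rows=$(printf 'a x 2\nb x 1\n')

# -k2: the key is "x 2" vs "x 1", so the row ending in 1 comes first.
open_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -k2)

# -k2,2: the key is field 2 only; the tie is broken by comparing whole lines.
closed_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -k2,2)

# Modifiers can be attached to a key: field 3 only, numeric (n), reversed (r).
data=$(printf 'Banana sweet 10\nCherry sour 34\nApple yummy 5\nGrapes lovely 40\n')
third_desc=$(printf '%s\n' "$data" | sort -k3,3nr)

printf '%s\n---\n%s\n---\n%s\n' "$open_key" "$closed_key" "$third_desc"
```

Limiting a key with the start,end form is usually the safer habit when later columns should not influence the order.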
Example 6: Sorting by Third Field (Using a Different Delimiter)
When fields are separated by something other than whitespace, use the -t option to specify the delimiter.
Contents of data.txt:
Alice:28:New York
Bob:22:Los Angeles
Charlie:35:Chicago
David:30:San Francisco
Let's sort this file by the third field (city) using the colon : as the delimiter:
sort -t':' -k3 data.txt
Output:
Charlie:35:Chicago
Bob:22:Los Angeles
Alice:28:New York
David:30:San Francisco
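The -t option combines with the other key modifiers. As a sketch, the same data.txt records can be ordered by the second field (age) numerically; the data is piped in here so the snippet is self-contained, and the variable names are illustrative.

```shell
people=$(printf 'Alice:28:New York\nBob:22:Los Angeles\nCharlie:35:Chicago\nDavid:30:San Francisco\n')

# -t ':' sets the delimiter; -k2,2n sorts on field 2 only, numerically.
by_age=$(printf '%s\n' "$people" | sort -t ':' -k2,2n)
echo "$by_age"
```

This lists Bob (22) first and Charlie (35) last.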
Example 7: Sorting a Column Containing Month Names
Let's take another example and sort by a column that holds month names.
Contents of monthdata.txt:
Emily February
John January
Bob March
Alice May
Alan March
Command:
sort -k2M -k1,1 monthdata.txt
-k2M sorts the lines by the second column (-k2), treating its values as month names (M) so that they follow calendar order rather than alphabetical order.
-k1,1 uses the first column as a secondary sort key when the months are the same.
Output:
John January
Emily February
Alan March
Bob March
Alice May
Ignoring Leading Characters
The -b option allows you to ignore leading whitespace characters when sorting.
Example 8: Ignoring Leading Whitespace
The following file spacedata.txt has this content:
Banana sweet
Cherry sour
Apple yummy
Grapes lovely
Command:
sort -b spacedata.txt
Output:
Apple yummy
Banana sweet
Cherry sour
Grapes lovely
The lines are sorted alphabetically, and leading whitespace characters are ignored, which is why "Apple" and "Banana" appear first in the sorted list.
What if you need to delete leading white space and then sort lines? Use something like this:
sed 's/^ *//' spacedata.txt | sort
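The effect of -b is easiest to see in the C locale, where a leading space sorts before any letter; some other locales already give whitespace little weight when collating. The sketch below assumes GNU sort and uses hypothetical indentation, since the exact spacing of spacedata.txt is an assumption here.

```shell
# Hypothetical indentation: Cherry has three leading spaces, Apple has one.
indented=$(printf 'Banana sweet\n   Cherry sour\n Apple yummy\nGrapes lovely\n')

# Plain byte-wise sort: indented lines come first because ' ' sorts before letters.
plain=$(printf '%s\n' "$indented" | LC_ALL=C sort)

# -b skips the leading blanks for comparison but keeps them in the output.
ignore_blanks=$(printf '%s\n' "$indented" | LC_ALL=C sort -b)

printf '%s\n---\n%s\n' "$plain" "$ignore_blanks"
```

Unlike the sed pipeline, -b leaves the original indentation in place and only changes how lines are compared.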
Case-Insensitive Sorting
To perform a case-insensitive sort, you can use the -f option.
Example 9: Case-Insensitive Sorting
Content of case_sensitive.txt:
shell
kernel
terminal
GNU
Debian
Linux
Command:
sort -f case_sensitive.txt
Output:
Debian
GNU
kernel
Linux
shell
terminal
From the output, you can see the lines are sorted alphabetically while ignoring the case of the letters.
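How much -f changes depends on the locale. In the C locale, a plain sort places every uppercase letter before every lowercase one, so the difference is clear. A minimal sketch with the same six words (variable names are illustrative):

```shell
words=$(printf 'shell\nkernel\nterminal\nGNU\nDebian\nLinux\n')

# Byte-wise order: uppercase letters sort before lowercase letters.
case_sensitive=$(printf '%s\n' "$words" | LC_ALL=C sort)

# -f folds lowercase to uppercase before comparing.
case_folded=$(printf '%s\n' "$words" | LC_ALL=C sort -f)

printf '%s\n---\n%s\n' "$case_sensitive" "$case_folded"
```

Without -f the order is Debian, GNU, Linux, kernel, shell, terminal; with -f, kernel moves ahead of Linux.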
Handling Duplicate Lines
To sort the file and remove duplicate lines, you can use the -u option.
Example 10: Removing Duplicate Lines
Contents of duplicates.txt:
shell
shell
kernel
terminal
Kernel
GNU
gnu
Debian
Linux
Command:
sort -u duplicates.txt
Output:
Debian
gnu
GNU
kernel
Kernel
Linux
shell
terminal
In this example, "shell" appeared twice in the original file, but it is displayed only once in the sorted and de-duplicated output. To also treat lines that differ only in case as duplicates, use sort -f -u duplicates.txt.
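As a rough check of the case-insensitive variant, the sketch below pipes the same nine lines through sort -f -u and counts what remains. Only the count is shown; which spelling of each pair is kept is not relied on here.

```shell
unique_ci=$(printf 'shell\nshell\nkernel\nterminal\nKernel\nGNU\ngnu\nDebian\nLinux\n' | sort -f -u)

# Count the surviving lines: kernel/Kernel and GNU/gnu each collapse to one entry.
count=$(printf '%s\n' "$unique_ci" | wc -l)
echo "Unique lines (ignoring case): $count"
```

Six lines remain, compared with eight for sort -u alone.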
Saving Sorted Output to a File
The -o option allows you to specify an output file directly.
Example 11: Saving Sorted Output to a File
sort -u duplicates.txt -o removed.txt
The above command removes any duplicate lines, and saves the sorted, unique lines into the removed.txt file.
The following command produces the same result using shell redirection (>) instead of -o:
sort -u duplicates.txt > removed.txt
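One practical difference between the two forms: -o can name the input file itself, because sort reads all of its input before it writes the output, whereas redirecting with > to the same file would empty it before sort reads it. A minimal sketch in a temporary directory (the file name list.txt and its contents are made up):

```shell
workdir=$(mktemp -d)
printf 'pear\napple\nfig\n' > "$workdir/list.txt"

# Sort the file in place: input and output are the same path.
sort -o "$workdir/list.txt" "$workdir/list.txt"

in_place=$(cat "$workdir/list.txt")
echo "$in_place"

rm -r "$workdir"
```

The file now contains apple, fig, and pear in that order.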