The diff command, derived from the term "difference," is a command-line utility in Unix and Unix-like operating systems such as Linux. It compares files line by line and outputs the differences between them.
diff does not check for partial matches within a line; it checks if two lines are exactly the same or not. If they aren't, diff reports the line as different.
It is primarily used for:
- File Comparison: Identifying the variations between two files or sets of files.
- Patch Creation: Generating patches by redirecting output, which can later be applied to files using the patch command.
- Conditional Logic: Implementing conditional logic in scripts based on whether files differ.
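The conditional-logic use relies on diff's exit status: 0 when the files are identical, 1 when they differ, and 2 when diff runs into trouble. A minimal sketch, assuming two throwaway files created under a mktemp directory (the file names and contents are made up for illustration):

```shell
# Illustrative files; the names and contents are assumptions for this sketch.
tmpdir=$(mktemp -d)
printf 'alpha\nbeta\n'  > "$tmpdir/a.txt"
printf 'alpha\ngamma\n' > "$tmpdir/b.txt"

# -q (--brief) skips the line-by-line report; only the exit status is used.
diff -q "$tmpdir/a.txt" "$tmpdir/b.txt" > /dev/null 2>&1
status=$?

case $status in
    0) result="identical" ;;
    1) result="different" ;;
    *) result="error" ;;      # 2 means diff itself ran into trouble
esac
echo "Files are $result (exit status $status)"

rm -rf "$tmpdir"
```

Checking the status explicitly, rather than using a bare if/else, keeps a missing or unreadable file from being mistaken for a genuine difference.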
Syntax
Below is the general syntax of the diff command:
diff [options] file1 file2
Where:
- file1 and file2 are the files that you want to compare.
- [options] are optional flags that modify how diff behaves.
Common Options
Here’s a rundown of some commonly used options:
- -y, --side-by-side: Show output in two columns.
- --suppress-common-lines: Do not output common lines.
- -i, --ignore-case: Ignore case differences in file contents.
- -w, --ignore-all-space: Ignore all white space when comparing lines.
- -B, --ignore-blank-lines: Ignore changes whose lines are all blank.
- -r, --recursive: Recursively compare any subdirectories found.
- --exclude=PATTERN: Exclude files that match PATTERN.
- -e, --ed: Output an ed script.
- -s, --report-identical-files: Report when two files are the same.
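Options can be combined. The sketch below, which uses made-up file contents, compares two lines that differ only in letter case and spacing, first with no options and then with -i and -w together:

```shell
# Illustrative files: same words, different letter case and spacing.
tmpdir=$(mktemp -d)
printf 'Linux is  versatile.\n' > "$tmpdir/a.txt"
printf 'linux is versatile.\n'  > "$tmpdir/b.txt"

# Without options the lines are reported as different (exit status 1).
diff "$tmpdir/a.txt" "$tmpdir/b.txt" > /dev/null
plain_status=$?

# -i ignores case and -w ignores all white space, so the lines now match.
diff -i -w "$tmpdir/a.txt" "$tmpdir/b.txt" > /dev/null
relaxed_status=$?

echo "plain: $plain_status, relaxed: $relaxed_status"
# plain: 1, relaxed: 0

rm -rf "$tmpdir"
```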
Output formats
diff has three primary output formats: normal, unified, and context.
1. Normal format
The default format produced by diff is known as the normal format. It provides instructions on how to change the first file (file1) to make it identical to the second file (file2).
Example output of classic normal format:
1,3c1,3
< Linux is a powerful OS.
< It is open-source and secure.
< Popular distributions include Ubuntu and Fedora.
---
> Linux is a potent operating system.
> It is renowned for its open-source nature and security.
> Famous distros include Ubuntu, Fedora, and Debian.
Interpretation of the above output:
- 1,3c1,3: Indicates that lines 1 to 3 in file1.txt need to be changed (c) to make them identical to lines 1 to 3 of file2.txt.
- Lines starting with <: Show the original content in file1.txt.
- ---: Serves as a separator between the content of file1.txt and file2.txt.
- Lines starting with >: Display the corresponding content in file2.txt.
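Besides c (change), normal-format headers also use a (add) and d (delete). A small sketch, with file names and contents invented for illustration, that triggers both by swapping the argument order:

```shell
# Illustrative files: new.txt has one extra line at the end.
tmpdir=$(mktemp -d)
printf 'Linux is great.\nIt is open-source.\n' > "$tmpdir/old.txt"
printf 'Linux is great.\nIt is open-source.\nIt is secure.\n' > "$tmpdir/new.txt"

# The line exists only in the second file, so diff reports an "a" (add) command.
added=$(diff "$tmpdir/old.txt" "$tmpdir/new.txt")
echo "$added"
# 2a3
# > It is secure.

# Swapping the arguments turns the same difference into a "d" (delete) command.
deleted=$(diff "$tmpdir/new.txt" "$tmpdir/old.txt")
echo "$deleted"
# 3d2
# < It is secure.

rm -rf "$tmpdir"
```

Here 2a3 reads as "after line 2 of the first file, add line 3 of the second file", and 3d2 reads as "delete line 3 of the first file to match the second file up to its line 2".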
2. Unified Format
The unified format (-u option) is commonly used for creating patches. It displays the differences with a few lines of context.
Example output of unified format:
--- file1.txt 2023-10-14 23:27:55.581336320 +0000
+++ file2.txt 2023-10-14 23:28:14.729796852 +0000
@@ -1,3 +1,3 @@
-Linux is a powerful OS.
-It is open-source and secure.
-Popular distributions include Ubuntu and Fedora.
+Linux is a potent operating system.
+It is renowned for its open-source nature and security.
+Famous distros include Ubuntu, Fedora, and Debian.
Interpretation of the above output:
- --- file1.txt 2023-10-14 23:27:55.581336320 +0000: Indicates that file1.txt is the original file and provides its timestamp.
- +++ file2.txt 2023-10-14 23:28:14.729796852 +0000: Indicates that file2.txt is the modified file, or the file to compare against, and provides its timestamp.
- @@ -1,3 +1,3 @@: This hunk header locates the differences in the files. -1,3 indicates lines 1 to 3 in file1.txt, and +1,3 indicates lines 1 to 3 in file2.txt.
- Lines starting with -: Indicate lines from file1.txt that will be removed or modified.
- Lines starting with +: Indicate lines from file2.txt that are added or modified.
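The amount of context in unified output can be adjusted with -U NUM (plain -u uses three lines). The sketch below uses made-up five-line files and a small hypothetical helper, count_context, to count the unchanged lines that diff prints around a single changed line:

```shell
# Illustrative files: only the third of five lines differs.
tmpdir=$(mktemp -d)
printf 'one\ntwo\nthree\nfour\nfive\n' > "$tmpdir/a.txt"
printf 'one\ntwo\nTHREE\nfour\nfive\n' > "$tmpdir/b.txt"

# Hypothetical helper: count context lines, which start with a single space.
# "|| true" keeps the pipeline's status clean when grep finds no match.
count_context() { grep -c '^ ' || true; }

default_ctx=$(diff -u   "$tmpdir/a.txt" "$tmpdir/b.txt" | count_context)
one_ctx=$(diff -U 1     "$tmpdir/a.txt" "$tmpdir/b.txt" | count_context)
zero_ctx=$(diff -U 0    "$tmpdir/a.txt" "$tmpdir/b.txt" | count_context)

echo "default: $default_ctx, -U 1: $one_ctx, -U 0: $zero_ctx"
# default: 4, -U 1: 2, -U 0: 0

rm -rf "$tmpdir"
```

The default run shows four context lines rather than six because only two unchanged lines exist on each side of the change.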
3. Context Format
The context format (-c option) provides a view with context around the changes, making it easier to see where the modifications occurred.
Example output of context format:
*** file1.txt 2023-10-14 23:27:55.581336320 +0000
--- file2.txt 2023-10-14 23:28:14.729796852 +0000
***************
*** 1,3 ****
! Linux is a powerful OS.
! It is open-source and secure.
! Popular distributions include Ubuntu and Fedora.
--- 1,3 ----
! Linux is a potent operating system.
! It is renowned for its open-source nature and security.
! Famous distros include Ubuntu, Fedora, and Debian.
Interpretation of the above output:
- *** file1.txt 2023-10-14 23:27:55.581336320 +0000: Specifies the first file being compared (file1.txt) and its timestamp.
- --- file2.txt 2023-10-14 23:28:14.729796852 +0000: Specifies the second file being compared (file2.txt) and its timestamp.
- ***************: A separator that demarcates the header from the content section.
- *** 1,3 ****: Indicates that lines 1 to 3 from file1.txt are displayed below.
- Lines starting with !: Identify lines that are different between file1.txt and file2.txt.
- --- 1,3 ----: Indicates that the corresponding lines 1 to 3 from file2.txt are displayed below.
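To see how the three formats describe the same change, the following sketch runs diff three times on a pair of made-up one-line files and prints only the line that locates the change in each format:

```shell
# Illustrative one-line files that differ.
tmpdir=$(mktemp -d)
printf 'Linux is a powerful OS.\n' > "$tmpdir/file1.txt"
printf 'Linux is a potent operating system.\n' > "$tmpdir/file2.txt"

normal=$(diff     "$tmpdir/file1.txt" "$tmpdir/file2.txt")
unified=$(diff -u "$tmpdir/file1.txt" "$tmpdir/file2.txt")
context=$(diff -c "$tmpdir/file1.txt" "$tmpdir/file2.txt")

# Print only the locator line from each format.
printf '%s\n' "$normal"  | sed -n '1p'           # 1c1
printf '%s\n' "$unified" | grep '^@@'            # @@ -1 +1 @@
printf '%s\n' "$context" | grep '^\*\*\* [0-9]'  # *** 1 ****

rm -rf "$tmpdir"
```

When a range covers a single line, diff prints just the line number instead of a start,end pair, which is why the locators here are shorter than in the three-line examples above.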
Examples
Let's look at some practical examples of the diff command.
1. To create a side-by-side comparison while suppressing common lines
$ diff --side-by-side --suppress-common-lines file1.txt file2.txt
Linux is a powerful OS. | Linux is a potent operating system.
It is open-source and secure. | It is renowned for its open-source nature and security.
Popular distributions include Ubuntu and Fedora. | Famous distros include Ubuntu, Fedora, and Debian.
This is particularly useful when you are only interested in the lines that differ, because identical lines no longer distract from the visual comparison.
For an easier and more visual side-by-side comparison of two files, you can use the sdiff command.
2. Get only the additions between two files
diff -u oldfile.txt newfile.txt | grep '^\+' | sed -E -n '/^\+\+/!s/^\+//p'
$ cat oldfile.txt
Linux is great.
It is open-source.
$ cat newfile.txt
Linux is great.
It is open-source.
It is secure.
$ diff -u oldfile.txt newfile.txt
--- oldfile.txt 2023-10-15 04:06:37.184032784 +0000
+++ newfile.txt 2023-10-15 04:06:56.776471003 +0000
@@ -1,2 +1,3 @@
 Linux is great.
 It is open-source.
+It is secure.
$ diff -u oldfile.txt newfile.txt | grep '^\+' | sed -E -n '/^\+\+/!s/^\+//p'
It is secure.
$
We used grep and sed to remove the meta lines and the leading '+' from the unified diff output.
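The same idea can be mirrored to list only the removed lines. This is a sketch with made-up file contents, not a general-purpose tool: a removed line whose own text begins with "--" would be mistaken for the file header and dropped.

```shell
# Illustrative files: the second line of old.txt is missing from new.txt.
tmpdir=$(mktemp -d)
printf 'Linux is great.\nIt is closed.\nIt is open-source.\n' > "$tmpdir/old.txt"
printf 'Linux is great.\nIt is open-source.\n' > "$tmpdir/new.txt"

# Keep lines starting with "-", skip the "--- file" header, strip the marker.
removed=$(diff -u "$tmpdir/old.txt" "$tmpdir/new.txt" \
    | grep '^-' \
    | sed -E -n '/^---/!s/^-//p')
echo "$removed"
# It is closed.

rm -rf "$tmpdir"
```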
3. Compare two files ignoring case
diff -i lowercase.txt uppercase.txt
If you run diff without the -i option, it will consider these lines different because of the case differences.
$ cat lowercase.txt
linux is versatile.
$ cat uppercase.txt
linux is VERSATILE.
$ diff -i lowercase.txt uppercase.txt
$ diff lowercase.txt uppercase.txt
1c1
< linux is versatile.
---
> linux is VERSATILE.
$
4. Check whether two files are identical
The -s or --report-identical-files option can be used to report when two files are the same, which is something diff does not do by default.
Example:
$ cat exactfile1.txt
Linux is wonderful.
It is used widely.
$ cat exactfile2.txt
Linux is wonderful.
It is used widely.
$ diff -s exactfile1.txt exactfile2.txt
Files exactfile1.txt and exactfile2.txt are identical
$
5. Creating and applying patches
Here is a simple example to illustrate creating a patch using diff and then applying it with patch.
file_original.txt
Hello, World!
This is a simple file.
Have a great day!
file_modified.txt
Hello, Universe!
This is a simple file.
Have an awesome day!
Let's create the patch and apply it:
diff -u file_original.txt file_modified.txt > file.patch
patch file_original.txt < file.patch
Output:
$ cat file_original.txt
Hello, Universe!
This is a simple file.
Have an awesome day!
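A patch created this way can also be undone with patch -R, which applies the patch in reverse. The sketch below recreates the two example files in a temporary directory, so it assumes the patch utility is installed; backup.txt is a hypothetical extra copy kept only to verify the result:

```shell
# Recreate the example files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'Hello, World!\nThis is a simple file.\nHave a great day!\n' > "$tmpdir/file_original.txt"
printf 'Hello, Universe!\nThis is a simple file.\nHave an awesome day!\n' > "$tmpdir/file_modified.txt"
cp "$tmpdir/file_original.txt" "$tmpdir/backup.txt"   # hypothetical reference copy

# Create the patch (diff exits with status 1 here because the files differ).
diff -u "$tmpdir/file_original.txt" "$tmpdir/file_modified.txt" > "$tmpdir/file.patch"

# Apply the patch: file_original.txt should now match file_modified.txt.
patch "$tmpdir/file_original.txt" < "$tmpdir/file.patch" > /dev/null
if cmp -s "$tmpdir/file_original.txt" "$tmpdir/file_modified.txt"; then applied=yes; else applied=no; fi

# Reverse the patch: file_original.txt should match the reference copy again.
patch -R "$tmpdir/file_original.txt" < "$tmpdir/file.patch" > /dev/null
if cmp -s "$tmpdir/file_original.txt" "$tmpdir/backup.txt"; then reverted=yes; else reverted=no; fi

echo "applied: $applied, reverted: $reverted"
rm -rf "$tmpdir"
```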
6. Compare two directories using diff
$ diff -rq dir1 dir2
Only in dir1: file1.txt
Only in dir1: file2.txt
Only in dir2: file3.txt
Only in dir2: file4.txt
Here:
- -q (--brief): Report only whether files differ. With this option, diff doesn’t output the differences between the files; it simply indicates which files differ.
- -r (--recursive): Recursively compare any subdirectories found.
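The --exclude=PATTERN option listed earlier pairs naturally with -r. A minimal sketch, assuming a made-up directory layout in which dir1 contains a log file that should not take part in the comparison:

```shell
# Illustrative layout: one shared file, one differing file, one log file.
tmpdir=$(mktemp -d)
mkdir -p "$tmpdir/dir1/sub" "$tmpdir/dir2/sub"
printf 'same\n'  > "$tmpdir/dir1/sub/shared.txt"
printf 'same\n'  > "$tmpdir/dir2/sub/shared.txt"
printf 'v1\n'    > "$tmpdir/dir1/sub/config.txt"
printf 'v2\n'    > "$tmpdir/dir2/sub/config.txt"
printf 'noise\n' > "$tmpdir/dir1/debug.log"

# Run from inside the scratch directory so the report shows short paths.
# Without --exclude, debug.log would also be listed as "Only in dir1".
report=$(cd "$tmpdir" && diff -rq --exclude='*.log' dir1 dir2)
echo "$report"
# Files dir1/sub/config.txt and dir2/sub/config.txt differ

rm -rf "$tmpdir"
```

Quoting the pattern keeps the shell from expanding *.log before diff sees it.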