How to Remove Comments and Blank Lines from a Linux File

Written by: Dwijadas Dey   |   Last updated: March 13, 2024

Comments and blank lines are an important part of any Linux configuration file. A code or configuration file without them is hard to read. In some situations, however, you may want to remove the comments, the blank lines, or both from an entire file. This article walks through the options available on the Linux command line (the grep, sed, and awk commands) for finding such lines and filtering them out.

Why do we need this?

Reduced file size

Removing comments or blank lines can noticeably reduce the size of a large, heavily commented file, which helps when disk space is limited or when files are transferred over a slow network.

Performance

Parsing a file with plenty of comments or blank lines takes an application longer, because every line has to be read before it can be skipped. Removing them reduces the amount of text the application has to process.

Clarity

From a developer's or system administrator's point of view, comments and blank lines are necessary because they make a file easier to maintain. However, an excessive number of them can bury the lines that matter. For better readability, it is good practice to remove unnecessary comments and blank lines.

Using Grep

Grep is an excellent utility for searching for a text pattern in a file. It is useful when you need to find how often a pattern occurs or on which lines it appears. Remember that grep never modifies the file: it only prints the lines it selects. Use grep with the -v option to invert the sense of matching, that is, to select the non-matching lines.
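Before deleting anything, it can help to measure how much of a file is commentary. The sketch below relies on grep's standard -c (count) and -n (line number) options; the helper name count_comment_lines and the sample input are made up for illustration.

```shell
# Count the lines whose first non-space character is ';'.
# count_comment_lines is a hypothetical helper name.
count_comment_lines() {
    grep -Ec '^[[:space:]]*;' "$@"
}

sample='; first comment
user = www-data
   ; indented comment
group = www-data'

# Number of comment lines in the sample
printf '%s\n' "$sample" | count_comment_lines

# The comment lines themselves, each prefixed with its line number
printf '%s\n' "$sample" | grep -En '^[[:space:]]*;'
```

With this sample the count is 2, and the second command lists lines 1 and 3.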

Let us take a small snippet from a php-fpm configuration file, which looks like the following.

; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.  
      
    ; Defining 'extension' will load the corresponding shared extension from  
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not  
; overwrite previously defined php.ini values, but will append the new value  
    ; instead.  
      
; Note: path INI options can be relative and will be expanded with the prefix  
; (pool, global or /usr)  
      
; Default Value: nothing is defined by default except the values in php.ini and  
; specified at startup with the -d argument  
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected]
;php_flag[display_errors] = off  
;php_admin_value[error_log] = /var/log/fpm-php.www.log  
;php_admin_flag[log_errors] = on  
;php_admin_value[memory_limit] = 32M      
[www]  
user = www-data  
group = www-data  
listen = /run/php/php7.4-fpm.sock  
listen.owner = www-data  
listen.group = www-data  
pm = dynamic  
pm.max_children = 25  
pm.start_servers = 10  
pm.min_spare_servers = 5  
pm.max_spare_servers = 20  
pm.max_requests = 500

Now if we run the following grep command, it selects every comment line, that is, every line whose first non-space character is ';', and prints it in the terminal. The -E option enables extended regular expressions, and ^[[:space:]]* matches any amount of leading whitespace at the start of the line.

$ grep -E '^[[:space:]]*;' php-fpm.conf

Output:

; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.
    ; Defining 'extension' will load the corresponding shared extension from
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not
; overwrite previously defined php.ini values, but will append the new value
    ; instead.
; Note: path INI options can be relative and will be expanded with the prefix
; (pool, global or /usr)
; Default Value: nothing is defined by default except the values in php.ini and
; specified at startup with the -d argument
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected]
;php_flag[display_errors] = off
;php_admin_value[error_log] = /var/log/fpm-php.www.log
;php_admin_flag[log_errors] = on
;php_admin_value[memory_limit] = 32M

To select the non-matching lines instead, add the -v option, which inverts the selection and prints everything except the comment lines.

$ grep -Ev '^[[:space:]]*;' php-fpm.conf

Output:

      
      
      
[www]  
user = www-data  
group = www-data  
listen = /run/php/php7.4-fpm.sock  
listen.owner = www-data  
listen.group = www-data  
pm = dynamic  
pm.max_children = 25  
pm.start_servers = 10  
pm.min_spare_servers = 5  
pm.max_spare_servers = 20  
pm.max_requests = 500

To remove all blank lines, including lines that contain only whitespace, use the following command.

$ grep -Ev '^[[:space:]]*$' php-fpm.conf

Output:

; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no.  
    ; Defining 'extension' will load the corresponding shared extension from  
; extension_dir. Defining 'disable_functions' or 'disable_classes' will not  
; overwrite previously defined php.ini values, but will append the new value  
    ; instead.  
; Note: path INI options can be relative and will be expanded with the prefix  
; (pool, global or /usr)  
; Default Value: nothing is defined by default except the values in php.ini and  
; specified at startup with the -d argument  
;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected]
;php_flag[display_errors] = off  
;php_admin_value[error_log] = /var/log/fpm-php.www.log  
;php_admin_flag[log_errors] = on  
;php_admin_value[memory_limit] = 32M      
[www]  
user = www-data  
group = www-data  
listen = /run/php/php7.4-fpm.sock  
listen.owner = www-data  
listen.group = www-data  
pm = dynamic  
pm.max_children = 25  
pm.start_servers = 10  
pm.min_spare_servers = 5  
pm.max_spare_servers = 20  
pm.max_requests = 500

To remove both blank lines and comment lines (with or without leading whitespace), use the following command.

$ grep -Ev '^[[:space:]]*;|^[[:space:]]*$' php-fpm.conf

Output:

[www]  
user = www-data  
group = www-data  
listen = /run/php/php7.4-fpm.sock  
listen.owner = www-data  
listen.group = www-data  
pm = dynamic  
pm.max_children = 25  
pm.start_servers = 10  
pm.min_spare_servers = 5  
pm.max_spare_servers = 20  
pm.max_requests = 500

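A shorter grep command gives the same result for files in which comments are never indented and blank lines are completely empty. It keeps only the lines that have a first character other than ';', so an indented comment or a line containing only spaces would still be printed:

$ grep "^[^;]" /etc/php/7.4/fpm/pool.d/www.conf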

In some configuration files comments are denoted by # instead. In those cases, simply replace ; with # in the commands above.
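One way to avoid editing the pattern by hand is to pass the comment character as a parameter. The following is a minimal sketch; the helper name strip_config is hypothetical, and it assumes the character has no special meaning inside a regular expression, which holds for both ; and #.

```shell
# strip_config CHAR [FILE...]
# Hypothetical helper: print the input without blank lines and without
# lines whose first non-space character is CHAR. Reads standard input
# when no file is given.
strip_config() {
    comment_char=$1
    shift
    grep -Ev "^[[:space:]]*(${comment_char}|\$)" "$@"
}

# The two comment styles mentioned in this article:
printf '# a comment\n\nListen 80\n' | strip_config '#'
printf '; a comment\n   \npm = dynamic\n' | strip_config ';'
```

The pattern is the same one used earlier, written with a group so that the leading-whitespace part appears only once.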

Using Sed

The name ‘sed’ is derived from the words ‘stream editor’. Unlike grep, sed can modify the text that passes through it, for example by deleting lines or by searching for a pattern and replacing it.

When sed is run without any options, it does not change the file itself: it simply prints the processed text in the terminal.

We will use the following Apache configuration file to test the next set of sed commands.

#
  # This is the main Apache HTTP server configuration file.  It contains the
 # configuration directives that give the server its instructions.
    # See <URL:http://httpd.apache.org/docs/2.2> for detailed information.
ServerRoot "/usr"

#
# Listen: Allows you to bind Apache to specific IP addresses and/or
# ports, instead of the default. See also the <VirtualHost>
# directive.
#Listen 12.34.56.78:80

Listen 80
#
# Dynamic Shared Object (DSO) Support
#

# Example:
# LoadModule foo_module modules/mod_foo.so
#

LoadModule authn_file_module libexec/apache2/mod_authn_file.so
LoadModule authn_dbm_module libexec/apache2/mod_authn_dbm.so

The following sed command deletes all the blank lines from the input stream and prints the remaining lines in the terminal.

$ sed  '/^[[:space:]]*$/d' sample-httpd.conf

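Output:

#
  # This is the main Apache HTTP server configuration file.  It contains the
 # configuration directives that give the server its instructions.
    # See <URL:http://httpd.apache.org/docs/2.2> for detailed information.
ServerRoot "/usr"
#
# Listen: Allows you to bind Apache to specific IP addresses and/or
# ports, instead of the default. See also the <VirtualHost>
# directive.
#Listen 12.34.56.78:80
Listen 80
#
# Dynamic Shared Object (DSO) Support
#
# Example:
# LoadModule foo_module modules/mod_foo.so
#
LoadModule authn_file_module libexec/apache2/mod_authn_file.so
LoadModule authn_dbm_module libexec/apache2/mod_authn_dbm.so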
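Comments can also be removed on their own while the blank lines stay in place. The sketch below feeds a shortened, made-up excerpt in the style of the sample file to sed through a here-document, so it runs without any file on disk; the function name is hypothetical.

```shell
# Delete only the lines whose first non-space character is '#'.
# Blank lines are left untouched.
strip_hash_comments() {
    sed '/^[[:space:]]*#/d' "$@"
}

strip_hash_comments <<'EOF'
  # See the Apache documentation for details.
ServerRoot "/usr"

#Listen 12.34.56.78:80
Listen 80
EOF
```

This prints the two directives with the blank line between them preserved.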

To remove both blank lines and comments, use the following sed command.

$ sed  '/^[[:space:]]*$/d;/^[[:space:]]*#/d' sample-httpd.conf

Output:

ServerRoot "/usr"
Listen 80
LoadModule authn_file_module libexec/apache2/mod_authn_file.so
LoadModule authn_dbm_module libexec/apache2/mod_authn_dbm.so

So far, we have seen how to remove a comment line that starts with a specific character. Sometimes, however, comments are enclosed between a pair of delimiters and may span several lines, as in the following C program.

#include <stdio.h>
/* Sample C program
  that prints 'Hello World'
*/
int main()
{
	printf("Hello World");
        /* Return 0 */
	return 0;
}
 /* Program end here */

To remove comments using sed in such a situation, use the following sed command.

$ sed -r ':a; s%(.*)/\*.*\*/%\1%; ta; /\/\*/ !b; N; ba' sample-program.c

The ‘-r’ option instructs sed to use extended regular expressions in the script.

Output:

#include <stdio.h>

int main()
{
	printf("Hello World");
        
	return 0;
}
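The one-line script is dense, so here is the same logic spread over several lines with comments. This is a sketch that assumes GNU sed; it uses -E, which GNU sed treats the same as -r, and the function name is made up for the example.

```shell
# strip_block_comments [FILE...]
# Hypothetical wrapper around the sed script shown above.
strip_block_comments() {
    sed -E '
        :a
        # Remove a complete /* ... */ comment from the pattern space,
        # keeping the text in front of it.
        s%(.*)/\*.*\*/%\1%
        # If something was removed, go back and look for another comment.
        ta
        # No comment is left open: print the text and start the next cycle.
        /\/\*/!b
        # A comment is still open: append the next input line and retry.
        N
        ba
    ' "$@"
}

strip_block_comments <<'EOF'
int a; /* one-line comment */
/* a comment that
   spans two lines */
int b;
EOF
```

Lines that held only a comment come out empty or whitespace-only rather than disappearing, so a blank-line filter can be applied afterwards if needed.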

All the previous sed commands search for a pattern and remove the matching text, but only in the output stream that is printed in the terminal. To change the file itself, use the -i option, which tells sed to edit the file in place.

$ sed  -i '/^[[:space:]]*$/d;/^[[:space:]]*#/d' sample-httpd.conf
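Because -i overwrites the file, it is worth keeping a copy of the original. With GNU sed, a suffix attached to the option (for example -i.bak) saves the original under that extra extension before editing. The sketch below wraps this in a hypothetical helper and tries it on a throwaway file created with mktemp, so no real configuration file is touched.

```shell
# clean_with_backup FILE
# Strip blank lines and '#' comments in place; the original content is
# kept as FILE.bak. The helper name is made up for this sketch.
clean_with_backup() {
    sed -i.bak '/^[[:space:]]*$/d;/^[[:space:]]*#/d' "$1"
}

# Try it on a temporary file rather than a real configuration file.
demo=$(mktemp)
printf '# a comment\n\nListen 80\n' > "$demo"
clean_with_backup "$demo"
cat "$demo"          # the cleaned file
cat "$demo.bak"      # the untouched original
rm -f "$demo" "$demo.bak"
```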

Using Awk

AWK is another Linux utility for pattern searching and text processing. It scans a file one line at a time, matches each line against the user-supplied regular expression, and then runs the user-supplied action on it, such as printing, replacing, or skipping the line.

To remove all blank lines and print the remaining lines in the terminal, use the following ‘awk’ command.

$ awk '!/^[[:space:]]*$/' sample-httpd.conf

In the above command, the pattern ^[[:space:]]* matches any amount of whitespace at the beginning of the line, and the ‘$’ that follows anchors the match at the end of the line, so the whole pattern matches only lines that are empty or contain nothing but whitespace. Negating it (!/^[[:space:]]*$/) makes awk print every line that does not match, which means only the non-blank lines.

Output:

#
  # This is the main Apache HTTP server configuration file.  It contains the
 # configuration directives that give the server its instructions.
    # See <URL:http://httpd.apache.org/docs/2.2> for detailed information.
ServerRoot "/usr"
#
# Listen: Allows you to bind Apache to specific IP addresses and/or
# ports, instead of the default. See also the <VirtualHost>
# directive.
#Listen 12.34.56.78:80
Listen 80
#
# Dynamic Shared Object (DSO) Support
#
# Example:
# LoadModule foo_module modules/mod_foo.so
#
LoadModule authn_file_module libexec/apache2/mod_authn_file.so
LoadModule authn_dbm_module libexec/apache2/mod_authn_dbm.so

The following awk command skips every line whose first non-space character is ‘#’ and prints the rest in the terminal. Blank lines are not affected.

$ awk '!/^[[:space:]]*#/' sample-httpd.conf

Output:

ServerRoot "/usr"


Listen 80


LoadModule authn_file_module libexec/apache2/mod_authn_file.so
LoadModule authn_dbm_module libexec/apache2/mod_authn_dbm.so
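Both filters can also be combined in one awk program. The sketch below is an alternative formulation, not one of the commands used above: instead of a regular expression for blank lines it tests awk's built-in NF variable, which holds the number of whitespace-separated fields on the current line. The helper name is hypothetical.

```shell
# Print a line only if it has at least one field (NF is non-zero) and
# its first field does not begin with '#'.
strip_conf_awk() {
    awk 'NF && $1 !~ /^#/' "$@"
}

strip_conf_awk <<'EOF'
#
  # An indented comment

Listen 80
#Listen 12.34.56.78:80
LoadModule authn_file_module libexec/apache2/mod_authn_file.so
EOF
```

Only the Listen 80 and LoadModule lines are printed. Because awk strips leading whitespace when it splits a line into fields, indented comments are caught without an explicit [[:space:]]* prefix.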

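In many languages, comments start and end with specific delimiters, as in a C program where a comment starts with /* and ends with */. The following awk command is a rough line-based filter: it skips every line that contains a ‘*’ or starts with //, and prints the rest. That removes single-line comments such as /* Return 0 */, but it leaves the inner lines of a multi-line comment in place and would also drop code lines that use ‘*’ for multiplication or pointers.

$ awk '/\*|^\/\// {next} {print}' sample-program.c

Output:

#include <stdio.h>
  that prints 'Hello World'
int main()
{
	printf("Hello World");
	return 0;
}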

In the above command, {next} makes awk stop processing the current line and move on to the next one, so matching lines are never printed; {print} then prints every line that did not match.
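For comments that span several lines, awk needs to remember whether it is currently inside a comment. The following is a minimal sketch of that idea, not a full C parser: the helper name strip_c_comments is hypothetical, and the script deliberately ignores string literals and // comments.

```shell
# strip_c_comments [FILE...]
# Hypothetical helper: remove /* ... */ comments, including ones that
# span several lines, and skip lines that end up blank.
strip_c_comments() {
    awk '
    {
        rest = $0
        kept = ""
        while (rest != "") {
            if (in_comment) {
                pos = index(rest, "*/")
                if (pos == 0) {
                    rest = ""                     # comment continues on the next line
                } else {
                    rest = substr(rest, pos + 2)  # resume after the closing marker
                    in_comment = 0
                }
            } else {
                pos = index(rest, "/*")
                if (pos == 0) {
                    kept = kept rest              # no comment left on this line
                    rest = ""
                } else {
                    kept = kept substr(rest, 1, pos - 1)
                    rest = substr(rest, pos + 2)
                    in_comment = 1
                }
            }
        }
        if (kept ~ /[^ \t]/) print kept           # drop lines left blank
    }' "$@"
}

strip_c_comments <<'EOF'
#include <stdio.h>
/* Sample C program
  that prints 'Hello World'
*/
int main()
{
	printf("Hello World");
        /* Return 0 */
	return 0;
}
 /* Program end here */
EOF
```

Unlike the one-line filter, this version reacts only to the two-character markers /* and */, so a statement such as x = a * b; is kept.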

About The Author

Dwijadas Dey

Dwijadas Dey is an open-source software advocate and tech enthusiast. He has been writing blogs about emerging technology for more than 10 years for different cloud service providers and tech sites. He also provides hands-on training and coaching on varied subjects like Kubernetes, Ansible, OpenStack, and Data Streaming.
