Humans see files as readable text, but what about computers? To a process, an open file is just a small non-negative integer. Once you look at file descriptors from the computer's side, you may be surprised by how much can be done with those numbers!
File Descriptors
What is a file? From a process's point of view, the short answer is that an open file is just a number. In Linux, everything is a file. Even devices are represented as files. Of course, they are special compared to plain text files, but they are still treated as files.
Each process starts with 3 standard file descriptors for input-output operations: stdin, stdout and stderr. Note that a file descriptor only exists while the file behind it is open.
Every running process has a unique process id and a directory named after that id under '/proc', which also lists the files the process has open. Let's check the content of one of the directories in '/proc':
ls /proc/579
In the output, there is a directory called fd, short for file descriptor. Now, let us check the files in that directory:
ls -l /proc/579/fd
A process can have more than 3 file descriptors, and they can serve different purposes. For the scope of this article, only 3 of them are going to be explained.
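To try this without knowing any particular process id, a shell can inspect itself. The following is a minimal sketch, assuming a Linux system with /proc mounted; `$$` is the process id of the current shell and `fd_count` is only an illustrative variable name.

```shell
# List the descriptors of the current shell; $$ expands to its process id.
ls -l "/proc/$$/fd"

# Each entry is a symbolic link named after the descriptor number,
# so counting the entries gives the number of open descriptors.
fd_count=$(ls "/proc/$$/fd" | wc -l)
echo "open descriptors: $fd_count"
```

The exact count depends on how the shell was started, so the number will differ between a terminal session and a script.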
Types of File Descriptors
Which of these are stdin, stdout and stderr?
- 0: stdin
- 1: stdout
- 2: stderr
The first 3 file descriptors always have the same meaning. In this example, process id '579' reads its stdin from a pipe, its stdout goes to '/dev/null', which means the output is discarded, and its stderr goes to a file in the user's home directory.
After all the heavy technical knowledge, it is time for some fun with file descriptors!
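A similar setup can be reproduced on purpose. The sketch below is an illustration under assumptions, not the process from the listing above: it starts a child shell with stdin from a pipe, stdout discarded and stderr in a file, and lets the child report where its own descriptors 0, 1 and 2 point. The file names `report.txt` and `err.log` are made up for the example, and /proc is assumed to be available.

```shell
tmpdir=$(mktemp -d)

# Run a child shell with stdin from a pipe, stdout discarded and stderr
# sent to a file. Inside the single quotes, $$ is the child's own process id.
# The child writes its report to descriptor 3, which is opened on report.txt.
echo | sh -c '
    in=$(readlink "/proc/$$/fd/0")
    out=$(readlink "/proc/$$/fd/1")
    err=$(readlink "/proc/$$/fd/2")
    printf "0 -> %s\n1 -> %s\n2 -> %s\n" "$in" "$out" "$err" >&3
' >/dev/null 2>"$tmpdir/err.log" 3>"$tmpdir/report.txt"

cat "$tmpdir/report.txt"
```

The report should show a pipe entry for 0, /dev/null for 1 and the path of err.log for 2.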
Relation Between File Descriptors and Redirection
File descriptors other than the standard ones are user-defined, so they must be named explicitly in a redirection operator. Behind the scenes, commands use stdout and stdin in the same way; because those are the defaults, the redirection can be omitted.
To give an example:
echo 'linuxopsys'
Is the same as:
echo 'linuxopsys' >&1
For stdin:
read variable
Is the same as:
read variable <&0
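The equivalence can be checked by capturing what each form produces. This is a small sketch; the variable names (`implicit`, `explicit`, `plain`, `with_fd`) are illustrative only.

```shell
# Capture the output of both forms and compare them.
implicit=$(echo 'linuxopsys')
explicit=$(echo 'linuxopsys' >&1)
[ "$implicit" = "$explicit" ] && echo 'stdout: same result'

# Same idea for stdin: feed one line through a pipe to each form.
plain=$(echo 'linuxopsys' | { read -r variable; echo "$variable"; })
with_fd=$(echo 'linuxopsys' | { read -r variable <&0; echo "$variable"; })
[ "$plain" = "$with_fd" ] && echo 'stdin: same result'
```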
Redirection operators have been available in Unix shells since the early days of Unix. They are specified by POSIX, so they can be used on any Linux or Unix system, whichever POSIX-compliant shell it uses.
Redirection operators can be used for other purposes as well. They can discard the output of any command, which is useful when that output should be hidden from the user.
echo 'hello' >/dev/null
As expected, this command is not going to produce any output. Please note that the command above is the same as the next snippet:
echo 'hello' 1>/dev/null
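Because the descriptor number can be written explicitly, stdout and stderr can be silenced independently. The sketch below uses a hypothetical helper function, `noisy`, that writes one line to each stream.

```shell
# Hypothetical helper that writes one line to stdout and one to stderr.
noisy() {
    echo 'to stdout'
    echo 'to stderr' >&2
}

# Discard only stderr: the captured text is the stdout line.
only_out=$(noisy 2>/dev/null)

# Point stderr at the capture first, then discard stdout:
# the captured text is the stderr line.
only_err=$(noisy 2>&1 >/dev/null)

# Discard stdout, then send stderr to the same place: nothing is left.
nothing=$(noisy >/dev/null 2>&1)

echo "only_out=[$only_out] only_err=[$only_err] nothing=[$nothing]"
```

The order of the redirections matters, since each one is applied from left to right against the current state of the descriptors.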
There are some special files in /dev, and /dev/null is one of them.
- /dev/null: Used for silencing any process or command that generates output.
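- /dev/random and /dev/urandom: Both produce random bytes and are used to generate random numbers. They are not strictly identical: /dev/random may block until the kernel has gathered enough entropy, while /dev/urandom does not block.
- /dev/zero: Produces an endless stream of zero bytes. This is useful when bytes with the value 0 are needed, because echo '0' writes the character '0' rather than the byte value 0.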
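The special files above can be read like any other file. Here is a minimal sketch using dd, tr and od; the file name `zeros.bin` and the sizes are arbitrary choices for the example.

```shell
tmpdir=$(mktemp -d)

# Take 16 bytes from /dev/zero; every one of them has the value 0.
dd if=/dev/zero of="$tmpdir/zeros.bin" bs=16 count=1 2>/dev/null
zero_size=$(wc -c < "$tmpdir/zeros.bin")

# Deleting all zero bytes leaves nothing behind.
non_zero=$(tr -d '\000' < "$tmpdir/zeros.bin" | wc -c)

# Take 8 random bytes from /dev/urandom and print them as hex.
dd if=/dev/urandom bs=8 count=1 2>/dev/null | od -An -tx1

echo "size: $zero_size, non-zero bytes: $non_zero"
```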
General Usage of File Descriptors
Before playing around with file descriptors in Linux, let us learn how to create them.
Creating a file descriptor and redirecting to stdout
exec 3>&1
In the snippet above, our file descriptor is 3. It is the first number that is free to use, because as stated above, 0, 1 and 2 are already taken by stdin, stdout and stderr.
3>&1 means that file descriptor 3 becomes a copy of file descriptor 1, known as stdout, so anything written to 3 ends up there. Note that when a redirection targets a file descriptor rather than a file name, the & symbol must be used.
exec 3>&1
echo 'linuxopsys' >&3
As you can see, whatever is written to file descriptor 3 shows up in stdout. Note that a file descriptor must be opened with a redirection before it can be used. Once the work with the file descriptor is done, it should be closed:
exec 3>&-
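A common practical use of a copy like this is to park the original stdout while output is temporarily sent elsewhere. The sketch below assumes descriptor 3 is free and uses a made-up log file name, `log.txt`.

```shell
tmpdir=$(mktemp -d)

exec 3>&1                   # keep a copy of the current stdout in descriptor 3
exec 1>"$tmpdir/log.txt"    # from now on, stdout goes to the log file
echo 'captured in the log'
exec 1>&3                   # restore stdout from the saved copy
exec 3>&-                   # close the spare descriptor
echo 'back on the original stdout'
```

Only the first echo ends up in log.txt; the second one is printed where the script's output normally goes.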
Creating a file descriptor and redirecting to stdin
exec 3<&0
read variable <&3
echo $variable
To close it:
exec 3<&-
Creating a file descriptor and redirecting to stderr
For stderr, the process is nearly the same as stdout, since they are both used for producing output.
exec 3>&2
echo 'this is an error' >&3
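Writing to stderr is where the & symbol is most often forgotten. The sketch below uses a hypothetical `warn` helper to show that messages sent with >&2 stay out of captured stdout, and that leaving out the & silently creates a regular file named 2 instead.

```shell
tmpdir=$(mktemp -d)

# Hypothetical helper: print a message on stderr.
warn() {
    echo "warning: $*" >&2
}

# Command substitution captures stdout only, so the warning is not part
# of the result; here stderr is sent to a file to make that visible.
result=$(warn 'disk almost full' 2>"$tmpdir/err.log"; echo 'done')
echo "result: $result"
cat "$tmpdir/err.log"

# Without the ampersand, ">2" creates a regular file named "2".
( cd "$tmpdir" && echo 'this is an error' >2 )
ls "$tmpdir"
```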
File Descriptors to Create and Write to New Files
The use of file descriptors is not limited to the standard streams. They can also be used to create and write to new files. Let's proceed with some examples.
exec 4>file.txt
The syntax is pretty similar to the examples above. When a filename is used with a file descriptor, anything written to that file descriptor goes to that file.
exec 4>file.txt
echo 'hello' 1>&4
Note that the echo command printed nothing to the terminal. Redirecting to a file does not duplicate stdout, it only redirects it. How to duplicate stdout is discussed later in the article.
Do not forget to close the file descriptor:
exec 4>&-
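One advantage of keeping a descriptor open is that consecutive writes continue where the previous one stopped. The following sketch, with the same descriptor number 4 and a temporary copy of file.txt, also shows >> for reopening a file in append mode.

```shell
tmpdir=$(mktemp -d)

exec 4>"$tmpdir/file.txt"    # create (or empty) the file and open it on 4
echo 'first'  >&4
echo 'second' >&4            # lands after 'first': the descriptor keeps its position
exec 4>&-                    # close it

exec 4>>"$tmpdir/file.txt"   # reopen in append mode, existing content is kept
echo 'third' >&4
exec 4>&-

line_count=$(wc -l < "$tmpdir/file.txt")
cat "$tmpdir/file.txt"
```

Reopening with a plain > instead of >> would have emptied the file again before 'third' was written.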
Please do not delete file.txt that has been created above. We will use that file to create a new file descriptor, but this time it will be used to read input from it.
exec 4<file.txt
read variable <&4
echo $variable
As expected, a file descriptor can be treated the same as a file. If there were more lines in the file, each further read on the descriptor would pick up the next line, just like reading from stdin, and each line is split into variables according to IFS.
IFS: Internal field separator; the set of characters that read uses to split a line into separate variables.
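Both points can be seen in a short sketch: successive reads on one descriptor advance through the file, and IFS controls how each line is split. The file names and their contents are invented for the example.

```shell
tmpdir=$(mktemp -d)
printf 'alice 30\nbob 25\n' > "$tmpdir/people.txt"

exec 4<"$tmpdir/people.txt"
read -r name1 age1 <&4    # first line, split on whitespace (the default IFS)
read -r name2 age2 <&4    # second line: the descriptor remembers its position
exec 4<&-

# Setting IFS for a single read changes the separator for that command only.
printf 'a:b:c\n' > "$tmpdir/colon.txt"
IFS=: read -r x y z < "$tmpdir/colon.txt"

echo "$name1 is $age1, $name2 is $age2, fields: $x $y $z"
```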
Solving a mystery: "sort tmp.txt > tmp.txt" What happened to the content in the file?
Before explaining why this is a mystery, please see for yourself. Create a file and put some words in it.
echo I love Linux but file descriptors are confusing | tr ' ' '\n' > tmp.txt
This command will create a new file called tmp.txt and put a newline between each word, which is done by tr command by changing each space to a newline character.
Everything is going as expected so far. Let's look at the content of the file again, but this time sorted, by running sort tmp.txt. The words are printed in sorted order and the file itself is unchanged.
Now let us try to write the sorted content to the same file.
sort tmp.txt > tmp.txt
cat tmp.txt
The file is empty! What happened here?
The issue arises because the shell sets up the redirection before sort starts: it opens tmp.txt for writing and truncates it to zero length. By the time sort reads the file, it is already empty, so there is nothing to sort and nothing to write back.
This is where file descriptors come into play! The command can work correctly with the help of a file descriptor. Please create the file again with the same content.
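The emptying is done by the shell's > redirection itself, before the command gets a chance to run. A minimal sketch that makes this visible uses the : command, which does nothing and writes nothing; the file content here is shortened for the example.

```shell
tmpdir=$(mktemp -d)
printf 'b\na\n' > "$tmpdir/tmp.txt"
size_before=$(wc -c < "$tmpdir/tmp.txt")

# ":" produces no output at all, yet the file ends up empty,
# because the shell opens and truncates the target before running it.
: > "$tmpdir/tmp.txt"
size_after=$(wc -c < "$tmpdir/tmp.txt")

echo "before: $size_before bytes, after: $size_after bytes"
```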
exec 3<>tmp.txt && sort tmp.txt >&3
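And... it works! The <> operator opens tmp.txt on file descriptor 3 for both reading and writing without truncating it. sort reads the whole file before it prints anything, and the sorted output then overwrites the file from the beginning. This only works here because the sorted output is exactly as long as the original content; for the general case, sort -o tmp.txt tmp.txt is the safer choice.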
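It is worth knowing where this trick stops working. The sketch below is an illustration with invented file names: when the output is shorter than the input (here because sort -u drops a duplicate line), the <> approach leaves old bytes at the end of the file, while sort's own -o option handles the same case cleanly.

```shell
tmpdir=$(mktemp -d)

# Case 1: output shorter than input. "sort -u" drops the duplicate line,
# but "<>" never truncates, so the old tail of the file is left in place.
printf 'b\na\na\n' > "$tmpdir/short.txt"
exec 3<>"$tmpdir/short.txt"
sort -u "$tmpdir/short.txt" >&3
exec 3>&-
leftover=$(cat "$tmpdir/short.txt")

# Case 2: let sort write its own output file with -o, which is
# allowed to be the same as the input file.
printf 'b\na\na\n' > "$tmpdir/safe.txt"
sort -u -o "$tmpdir/safe.txt" "$tmpdir/safe.txt"
sorted=$(cat "$tmpdir/safe.txt")

printf 'with <>:\n%s\nwith -o:\n%s\n' "$leftover" "$sorted"
```

In the first case the file still ends with a stray line from the original content; in the second it contains only the two sorted lines.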
File Descriptors to Duplicate Output of Any Command
File descriptors can be used to duplicate the output of commands with the help of tee. This way, the output can both be seen in the terminal and saved to a file.
exec 3>file.txt
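echo hello | tee /dev/fd/3
This prints the output on the screen and also saves it to file.txt. tee copies its input to stdout and to every file named as an argument, and /dev/fd/3 refers to the file that is open on file descriptor 3. Note that tee >&3 would not work here, since it only redirects tee's stdout into the file and nothing reaches the screen. Remember that a helper command like tee is needed to do this.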
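tee can also be given the file name directly, which is often enough when no descriptor needs to stay open. A short sketch, with the captured variable `shown` standing in for what would appear on the screen:

```shell
tmpdir=$(mktemp -d)

# tee copies stdin to stdout and to the named file.
shown=$(echo hello | tee "$tmpdir/file.txt")

# -a appends instead of overwriting; stdout is discarded here so that
# only the file receives the second line.
echo world | tee -a "$tmpdir/file.txt" >/dev/null

echo "seen on stdout: $shown"
cat "$tmpdir/file.txt"
```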
Conclusion
In this guide, we have discussed the capabilities and usage of file descriptors, and looked at what files mean in Linux. They have many uses, and using them in scripts can make your life easier. There is still much more to do with file descriptors; what you do with them is up to your imagination!