Linux system programming: Open file, read file and write file
This is my first article in what I’m hoping will be a series on system programming for POSIX-compliant operating systems, with a focus on Linux. I actually touched on this topic a while ago when I wrote three articles about library programming on Linux (static libraries, dynamic libraries and dynamic libraries using POSIX API). In this series my goal is to go through the basics of Linux system programming, from the easiest topics like opening, reading and writing files to more complicated things like network programming with Berkeley sockets. So let’s get started with the environment setup and an example program that copies a source file into a destination file, using the POSIX API to demonstrate the open(), read() and write() system calls on Linux.
Configuring your environment
I’ll use my trustworthy Ubuntu Linux operating system, but you can use any POSIX-compliant operating system; the only difference will probably be that you need to configure your environment differently. What we need to begin with Linux system programming is the gcc compiler with its related packages and the POSIX man pages. Here’s how to install these packages on an Ubuntu-based operating system:
sudo apt-get install build-essential manpages manpages-dev manpages-posix manpages-posix-dev
Basically, that’s all you need to create serious system tools for the Linux operating system. Later we will probably need a few more libraries, but we will install them when necessary.
open(), read() and write() system calls
Let’s continue with our first system call, open(), whose purpose is to open a file for reading or writing, or to create a new file. If you haven’t already done so, open its man page with the man 2 open command and read through the basics (2 is the manual section number; use man man to read more about manual sections). In the following example we also use the read() and write() system calls to copy from one file descriptor to the other (both descriptors are returned by open()), so it is wise to open their man pages as well (man 2 read and man 2 write). Here’s the example code for a program that copies the input file passed as the first argument into the output file passed as the second argument:
/*
 ============================================================================
 Name        : sp_linux_copy.c
 Author      : Marko Martinović
 Description : Copy input file into output file
 ============================================================================
 */

#include <stdio.h>      /* fprintf(), perror() */
#include <stdlib.h>     /* EXIT_SUCCESS */
#include <fcntl.h>      /* open() and the O_* flags */
#include <unistd.h>     /* read(), write(), close() */
#include <sys/types.h>  /* ssize_t */
#include <sys/stat.h>   /* file permission bits */

#define BUF_SIZE 8192

int main(int argc, char* argv[]) {

    int input_fd, output_fd;    /* Input and output file descriptors */
    ssize_t ret_in, ret_out;    /* Number of bytes returned by read() and write() */
    char buffer[BUF_SIZE];      /* Character buffer */

    /* Are src and dest file name arguments missing */
    if(argc != 3){
        fprintf (stderr, "Usage: %s file1 file2\n", argv[0]);
        return 1;
    }

    /* Create input file descriptor */
    input_fd = open (argv[1], O_RDONLY);
    if (input_fd == -1) {
        perror ("open");
        return 2;
    }

    /* Create output file descriptor */
    output_fd = open (argv[2], O_WRONLY | O_CREAT, 0644);
    if (output_fd == -1) {
        perror ("open");
        return 3;
    }

    /* Copy process */
    while ((ret_in = read (input_fd, buffer, BUF_SIZE)) > 0) {
        ret_out = write (output_fd, buffer, (size_t) ret_in);
        if (ret_out != ret_in) {
            /* Write error */
            perror ("write");
            return 4;
        }
    }

    /* read() returns -1 on error and 0 at end of file */
    if (ret_in == -1) {
        perror ("read");
        return 5;
    }

    /* Close file descriptors */
    close (input_fd);
    close (output_fd);

    return (EXIT_SUCCESS);
}
If you have named this source file sp_linux_copy.c and want to name the executable sp_linux_copy, you can compile the program like this:
gcc -Wall -o sp_linux_copy sp_linux_copy.c
Then if your source file is named source_file.txt and if you want to name the destination file destination_file.txt you would run this program like this:
./sp_linux_copy source_file.txt destination_file.txt
Now let’s go through the code and explain the tricky parts. The first thing we must do is include the necessary header files; the man page of every system call tells you which headers you need in order to use it. Second, we define a constant for the size of our buffer in bytes. A smaller buffer saves memory but makes the copy slower, because more read() and write() calls are needed to move the same amount of data. Next we open the source and destination file descriptors: the source with O_RDONLY to make it read-only, and the destination with O_WRONLY | O_CREAT to make it writable and to create the file with 0644 permissions if it does not exist yet (the permissions actually applied are further restricted by the process umask). Note that these flags do not truncate an existing destination file, so if it is longer than the source its old trailing bytes remain; adding O_TRUNC avoids that. In case of an error we use perror() (see man 3 perror) to print a relatively user-friendly error message.
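To make the effect of the open() flags a bit more concrete, here is a minimal sketch of a helper that opens the destination file either in "overwrite" mode (O_TRUNC) or in "refuse to overwrite" mode (O_EXCL). The name open_dest and its overwrite parameter are my own invention for illustration and are not part of the program above; the sketch also prints the file name next to the error text using strerror(), which perror() alone does not do.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>     /* close(), which callers use on the returned descriptor */

/* Hypothetical helper (not from the article's program): open `path` for
 * writing and create it with mode 0644 if it does not exist.
 *   overwrite != 0 : an existing file is emptied first (O_TRUNC)
 *   overwrite == 0 : fail with EEXIST if the file already exists (O_EXCL)
 * Returns the file descriptor, or -1 with errno set by open(). */
int open_dest(const char *path, int overwrite)
{
    assert(path != NULL);

    int flags = O_WRONLY | O_CREAT | (overwrite ? O_TRUNC : O_EXCL);
    int fd = open(path, flags, 0644);

    if (fd == -1) {
        int saved = errno;  /* fprintf() may change errno, so keep a copy */
        fprintf(stderr, "open %s: %s\n", path, strerror(saved));
        errno = saved;
    }
    return fd;
}
```

A caller would use it like open(), for example output_fd = open_dest(argv[2], 1); followed by the same check for -1. Whether silently overwriting or refusing is the better default depends on the tool you are writing, so treat this as one possible design rather than the right one.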
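Now we are ready to start the copy process. We run read() and write() inside a loop (because the source file might be bigger than our buffer) to copy from one file into the other. It is important to notice that write() is given the number of bytes returned by read(), so it knows how much to write into the destination file. If the number of bytes read (ret_in) and the number of bytes written (ret_out) differ, we treat this as an error and once again use perror() to print the error description. Strictly speaking, write() can also return a smaller positive count without failing (a short write), for example when the disk is nearly full or the descriptor refers to a pipe or socket; in that case errno is not set, so the perror() message may be misleading, and a more robust program would retry with the remaining bytes. At the end, if all went well, we clean up by closing both file descriptors and return 0 (EXIT_SUCCESS) to indicate that the program ended without errors.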
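Since a short write is not necessarily an error, one common way to handle it is to keep calling write() until everything has been written. The following is only a sketch of that idea: the helper name write_all is hypothetical and does not appear in the program above, and retrying on EINTR is an assumption about how you want signal interruptions handled.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the article's program): call write()
 * repeatedly until all `count` bytes from `buf` have been written.
 * Returns 0 on success, or -1 on error with errno set by write(). */
int write_all(int fd, const char *buf, size_t count)
{
    assert(buf != NULL || count == 0);

    size_t done = 0;    /* bytes written so far */

    while (done < count) {
        ssize_t n = write(fd, buf + done, count - done);
        if (n == -1) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;          /* real error, errno describes it */
        }
        done += (size_t) n;     /* short write: advance and write the rest */
    }
    return 0;
}
```

In the copy loop this would replace the direct write() call, roughly as if (write_all(output_fd, buffer, (size_t) ret_in) == -1) { perror("write"); return 4; }, so that perror() is only reached when errno has really been set.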
That’s it for this introductory article on Linux system programming. In my next article I will show you a few more examples of POSIX input/output and then move on to memory-management-related system calls.