Fastest way of reading a file in Linux?
On Linux, what would be the fastest way of reading a file into an array of bytes and processing those bytes? This can include memory mapping, system calls, etc. I am not familiar with the many Linux-specific functions. In the past I have used Boost memory mapping, but I need faster Linux-specific performance rather than portability.
Perhaps you need readv(); it lets you read data from a file into an array of separate buffers with a single system call. See man 2 readv for details.
2 Answers
mmap should be the fastest way to access the contents of a file if the file is large enough. There’s an initial cost for setting up the memory mappings, but that’s offset by not needing to copy the data from the page cache into userland. And if you want all the contents of the file, the cost of allocating the memory in your program should be more or less the same as the cost of mmap.
Your best bet, as always, is to test and benchmark.
Don’t let yourself get fooled by lazy stuff like memory mapping. Rather, focus on what you really need. Do you really need to read the whole file into memory? Then the straightforward way of opening the file, reading chunks in a loop, and closing it will be as fast as it can be done.
But often you don’t really want that. Instead you might want to read specific parts, a block here, a block there, jump through the file, read a block at a specific position, etc.
Even then, fseek()ing to those positions and fread()ing the blocks won’t have overhead worth mentioning. But it can be more convenient to use memory mapping and let the operating system or a library deal with things like memory allocation. It won’t get the job done faster, though.
How to read a file into a C program on Linux? [closed]
a.out file_name passes file_name as an argument to the program, but a.out < file_name pipes the contents of file_name to a.out via stdin. This isn't really a programming question - unix.stackexchange.com might be a better place to ask. It also has nothing to do with C in particular.
@IskarJarak: the < redirects rather than pipes the contents of the file. The program reads directly from the file; it is not given a pipe to read.
3 Answers
a.out file_name_here passes «file_name_here» as an argument.
Many programs on Unix systems are filters. If they are given file names to process, those are read; if they are given no file names, they read standard input instead. An example of such a program is grep; other examples include cat and sort.
The general solution, in outline, is:
```c
#include <stdio.h>
#include <stdlib.h>

extern void process_file(FILE *fp);     // Where the real work is done

int main(int argc, char **argv)
{
    int rc = EXIT_SUCCESS;
    if (argc == 1)
        process_file(stdin);
    else
    {
        for (int i = 1; i < argc; i++)
        {
            FILE *fp = fopen(argv[i], "r");
            if (fp == 0)
            {
                fprintf(stderr, "%s: failed to open file %s for reading\n",
                        argv[0], argv[i]);
                rc = EXIT_FAILURE;
            }
            else
            {
                process_file(fp);
                fclose(fp);
            }
        }
    }
    return rc;
}
```
This will process any command line arguments as files to be read, resorting to reading standard input if no files are specified on the command line. There are legions of extra tweaks you can make to this outline. You can easily add option processing with getopt() (or getopt_long() if you’re using GNU), and you can treat a file name of - as standard input if you wish. You can exit on failure to open a file if you think that’s appropriate (sometimes it is; sometimes it isn’t; grep doesn’t, for example). You can pass the file name to the process_file() function. You can have the process_file() function report success or failure, track whether everything worked, and exit with zero only if all the operations were successful.