Linux open file no cache

Is it possible in linux to disable filesystem caching for specific files? [closed]

I have some large files and I am OK with them being read at disk I/O capacity. I wish to keep the file system cache free for other files. Is it possible to turn off file system caching for specific files in Linux?

SO is for programming questions, not questions about using or configuring Linux and its applications. SuperUser or Unix & Linux would be better places for questions like this.

2 Answers

Your question hints that you might not be the author of the program you wish to control. If that’s the case, the answer is “not easily”. If you are looking for a way to simply mark a particular set of files as “nocache” (e.g. via extended attributes), the answer is no. At best you are limited to putting an LD_PRELOAD wrapper around the program, and the wrapper would have to be written carefully so that it does not affect every file the program opens.

If you ARE the author of the program, take a look at posix_fadvise (or the equivalent madvise if you’re using mmap): after you have finished reading the data, you can hint to the kernel that it should discard the pieces it cached by passing the POSIX_FADV_DONTNEED advice. (Why not use POSIX_FADV_NOREUSE? Because with Linux kernels available at the time of writing it’s a no-op.)
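As a rough illustration of the posix_fadvise approach, here is a minimal sketch in C. The function name read_then_drop_cache and the 64 KiB buffer size are made up for this example, and the call is only a hint: the kernel is free to keep pages it still considers in use.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: reads the whole file through the page cache, then
 * asks the kernel to drop the cached pages for that file.
 * Returns the number of bytes read, or -1 on error. */
long long read_then_drop_cache(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    char buf[64 * 1024];            /* arbitrary buffer size for this sketch */
    long long total = 0;
    ssize_t n;
    while ((n = read(fd, buf, sizeof buf)) > 0) {
        /* ... process buf[0..n) here ... */
        total += n;
    }
    if (n < 0) {
        close(fd);
        return -1;
    }

    /* offset 0 with len 0 means "from the start to the end of the file".
     * posix_fadvise returns an error number directly instead of setting
     * errno; a failure is not fatal because this is advice only. */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
    if (rc != 0)
        fprintf(stderr, "posix_fadvise failed with error %d\n", rc);

    close(fd);
    return total;
}
```

Calling read_then_drop_cache("/path/to/large.file") once per file should leave the cache roughly as it was before the read, assuming nothing else has the same pages in use.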

Another technique, if you’re the author, is to open the file with the O_DIRECT flag set, but I do not recommend this unless you really know what you’re doing. O_DIRECT comes with a large set of constraints (which people often don’t notice, or don’t understand the impact of, until it’s too late):

  • You MUST do I/O in multiples of the disk’s block size (no smaller than 512 bytes, often 4 KiB, and sometimes a larger multiple), and you must only use file offsets that are aligned the same way.
  • Your program’s buffers must also obey an alignment rule in memory.
  • Filesystems can choose not to support O_DIRECT, so your program has to handle that.
  • Filesystems may simply choose to put your I/O through the page cache anyway (O_DIRECT is a “best effort” hint).
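To make these constraints concrete, here is a hedged sketch in C of a sequential O_DIRECT read. The name read_direct, the 4096-byte alignment and the 1 MiB chunk size are assumptions for illustration; the alignment a given device and filesystem actually require may differ. The sketch falls back to a normal cached open when the filesystem rejects O_DIRECT with EINVAL.

```c
#define _GNU_SOURCE             /* exposes O_DIRECT in <fcntl.h> on glibc */
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

#define ALIGNMENT 4096          /* assumed; must suit the device and filesystem */
#define CHUNK     (256 * ALIGNMENT)

/* Hypothetical helper: reads a file sequentially with O_DIRECT if possible.
 * Returns the number of bytes read, or -1 on error. *used_direct is set to 1
 * if the file was opened with O_DIRECT and to 0 otherwise. */
long long read_direct(const char *path, int *used_direct)
{
    int fd = open(path, O_RDONLY | O_DIRECT);
    *used_direct = (fd >= 0);
    if (fd < 0 && errno == EINVAL)
        fd = open(path, O_RDONLY);  /* filesystem refused O_DIRECT: fall back */
    if (fd < 0)
        return -1;

    /* The buffer address must be aligned, so plain malloc is not enough. */
    void *buf = NULL;
    if (posix_memalign(&buf, ALIGNMENT, CHUNK) != 0) {
        close(fd);
        return -1;
    }

    long long total = 0;
    ssize_t n;
    for (;;) {
        /* The request length is a multiple of ALIGNMENT, and the file offset
         * stays aligned as long as every read returns a full chunk. */
        n = read(fd, buf, CHUNK);
        if (n <= 0)
            break;
        /* ... process the n bytes in buf here ... */
        total += n;
        if ((size_t)n < CHUNK)
            break;  /* short read: treated as end of file, so no read is
                       issued from an unaligned offset */
    }

    free(buf);
    close(fd);
    return n < 0 ? -1 : total;
}
```

A caller can check used_direct to learn whether the data actually bypassed the page cache; when it is 0, the file was read through the cache as usual.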

NB: Not allowing caching to be used at all (i.e. not even on the initial read) may lead to the file being read in at speeds far below what the disk can achieve.

Read file without disk caching in Linux

I have a C program that runs only weekly, and reads a large number of files only once. Since Linux caches everything that’s read, these files fill up the cache needlessly, and this slows down the system a lot unless it has an SSD. So how do I open and read from a file without filling up the disk cache? Note: By disk caching I mean that when you read a file twice, the second time it’s read from RAM, not from disk. I.e. data once read from the disk is left in RAM, so subsequent reads of the same file will not need to reread the data from disk.

You’d think Linux would have some configuration regarding disk caching. Either way, is this really a C problem? You would have the same problem regardless of the programming language, wouldn’t you? Have you tried running the program in valgrind? It could be that you have memory leaks.

Well, if you hadn’t asked for C you would’ve got more “Linux” answers. Please answer all of my questions: Have you tried running your program in valgrind?

2 Answers

I believe passing O_DIRECT to open() should help:

O_DIRECT (Since Linux 2.4.10)

Try to minimize cache effects of the I/O to and from this file. In general this will degrade performance, but it is useful in special situations, such as when applications do their own caching. File I/O is done directly to/from user space buffers. The O_DIRECT flag on its own makes an effort to transfer data synchronously, but does not give the guarantees of the O_SYNC flag that data and necessary metadata are transferred. To guarantee synchronous I/O, O_SYNC must be used in addition to O_DIRECT.

There are further detailed notes on O_DIRECT towards the bottom of the man page, including a fun quote from Linus.

Is it possible in linux to disable filesystem caching for specific files?

I have some large files and I am OK with them being read at disk I/O capacity. I wish to keep the file-system cache free for other files. Is it possible to turn off file-system caching for specific files on Linux? I wish to do this programmatically via a native lib + Java.

2 Answers

You can do so for an opened instance of the file, but not persistently for the file itself. You do so per instance of the opened file by using direct IO. I’m not sure how to do this in Java, but in C and C++, you pass the O_DIRECT flag to the open() call.

Note however that this has a couple of potentially problematic implications, namely:

  • It’s downright dangerous on certain filesystems. Most notably, current versions of BTRFS have serious issues with direct IO when you’re writing to the file.
  • You can’t mix direct IO with regular cached I/O unless you use some form of synchronization. Cached writes are not guaranteed to be visible to direct IO reads until you call fsync() or fdatasync(), and direct IO writes may never become visible to cached IO reads.

There is, however, an alternative method if you can tolerate having the data temporarily in cache. You can use the POSIX fadvise interface (through the posix_fadvise system call on Linux) to tell the kernel you don’t need data from the file once you’re done reading it. By using the POSIX_FADV_DONTNEED flag, you can tell the kernel to drop a specific region of a particular file from cache. You can also do this while you are processing the file (by reading a chunk, and then immediately calling posix_fadvise on that region of the file), though the regions you call this on have to be aligned to the system’s page size, because Linux ignores requests to discard partial pages. This is generally the preferred portable method of handling things, as it works on any POSIX-compliant system with the real-time extensions (which is pretty much any POSIX-compliant system).
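The chunk-by-chunk variant described above could look roughly like the following C sketch. The name read_dropping_as_you_go and the chunk size of 256 pages are assumptions for this example. Because read() may return fewer bytes than requested, the sketch tracks a page-aligned "dropped" offset rather than assuming each chunk ends on a page boundary.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: reads a file in chunks and, after each chunk, asks the
 * kernel to drop the fully read, page-aligned part of the file from the cache.
 * Returns the number of bytes read, or -1 on error. */
long long read_dropping_as_you_go(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                     /* assumed fallback page size */
    size_t chunk = (size_t)page * 256;   /* arbitrary chunk size for this sketch */

    char *buf = malloc(chunk);
    if (buf == NULL) {
        close(fd);
        return -1;
    }

    off_t done = 0;     /* bytes read so far */
    off_t dropped = 0;  /* page-aligned offset up to which pages were dropped */
    ssize_t n;
    while ((n = read(fd, buf, chunk)) > 0) {
        /* ... process buf[0..n) here ... */
        done += n;

        /* Only whole pages can be discarded, so round down to a page boundary. */
        off_t aligned_end = done - (done % page);
        if (aligned_end > dropped) {
            posix_fadvise(fd, dropped, aligned_end - dropped, POSIX_FADV_DONTNEED);
            dropped = aligned_end;
        }
    }

    /* Drop whatever is left, including the final partial page:
     * len 0 means "to the end of the file". */
    posix_fadvise(fd, dropped, 0, POSIX_FADV_DONTNEED);

    free(buf);
    close(fd);
    return n < 0 ? -1 : (long long)done;
}
```

Compared with a single call at the end, this keeps the amount of cache occupied by the file close to one chunk at any time, at the cost of one extra system call per chunk.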
