Linux zero file size

In Linux, a file is not empty but its size is 0

I test whether a file is empty in the shell: test -s /sys/fs/cgroup/systemd/docker/d4e311735706485e748513bad611070e223cba76fdf4c72a1102d14b653da750/tasks. It returns false, and ls -lh shows that its size is 0, but when I cat the file I get 4071, which means the file is not empty. I thought the file might be too small, so I created a file in my home directory and echoed 4071 into it; its size is not 0. Is the file in /sys/fs/cgroup special?

doesn’t necessarily mean "at the same time"; maybe the file is written to and then truncated in succession? Is it a regular file at all? Can you show the ls -l output?

@Stefan Hegny Thanks for your answer, here is the ls -lh output: -rw-r--r-- 1 root root 0 Dec 11 15:47 tasks. I think you are right: this file may be written to and truncated repeatedly. It records the processes of the container, and it is updated all the time.

@cheon At least I’m quite sure that it’s not a matter of "too small". Greater than zero in computer language means greater than zero, and there is no other "too small" layer.

1 Answer

The file that you are dealing with is a special file which is a part of the cgroup file system.

To understand why this happens, let’s see what happens when you run test -s $filename.

We will use the strace command, which prints the system calls a command makes.

If you run strace test -s $filename, you will find a stat call on the file in the results. In this case it returns st_size = 0, which is the size reported for the file.

But the question is what actually happens on the other side, inside the kernel:

When you work with a file, you make a system call that goes to an intermediate layer in the kernel called the virtual file system, which in turn calls the file system code responsible for the information needed. A stat system call gets the status from the inode corresponding to the file. The file system can create and manipulate the inode as it wants.

Cgroup is a special file system. When it adds a file (using the cgroup_add_file function defined in kernel/cgroup.c), it always passes size 0 to __kernfs_create_file, which is why any file inside /sys/fs/cgroup (created by the cgroup fs) always has a size of zero regardless of its actual contents.

As for the other part, reading the file with cat: if you run strace cat $filename, this is what you will get:

open("$filename", O_RDONLY) = 3
read(3, ". ", 131072) = ###

The read system call goes through the virtual file system to the underlying file system in the kernel, which uses the file operations associated with the file to produce the data you asked for.
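Since the reported size and the readable content can disagree on such a file system, a check that actually reads from the file is more reliable than test -s. What follows is only a minimal sketch, assuming a POSIX shell and a head that supports -c (GNU and BSD both do). The function name has_content is made up for this illustration and is not part of any standard tool.

```shell
#!/bin/sh
# has_content FILE (hypothetical helper)
# Succeeds if at least one byte can actually be read from FILE,
# regardless of the st_size that stat() reports for it.
has_content() {
    # Read at most one byte and count how many bytes came back.
    n=$(head -c 1 "$1" 2>/dev/null | wc -c)
    [ "$((n))" -gt 0 ]
}

# Demo on an ordinary temporary file, first empty, then written to.
demo=$(mktemp)
if has_content "$demo"; then echo "empty file: has content"; else echo "empty file: no content"; fi
echo 4071 > "$demo"
if has_content "$demo"; then echo "written file: has content"; else echo "written file: no content"; fi
rm -f "$demo"
```

On a file like the tasks file from the question, test -s would fail while a read-based check like this one should succeed, as long as the cgroup actually contains tasks.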


The cgroup fs has functions that generate the data in its files. The tasks file is defined in kernel/cgroup.c with a set of callbacks for this purpose.

So seq_start, seq_next, seq_stop and seq_show are the functions responsible for generating the contents of the file. You can look in kernel/cgroup.c to check what they do.

Please note that if you want to know whether the cgroup still has tasks, an easier way is to use notify_on_release.

If the notify_on_release flag is enabled (1) in a cgroup, then whenever the last task in the cgroup leaves (exits or attaches to some other cgroup) and the last child cgroup of that cgroup is removed, then the kernel runs the command specified by the contents of the "release_agent" file in that hierarchy’s root directory, supplying the pathname (relative to the mount point of the cgroup file system) of the abandoned cgroup. This enables automatic removal of abandoned cgroups. The default value of notify_on_release in the root cgroup at system boot is disabled (0). The default value of other cgroups at creation is the current value of their parents’ notify_on_release settings. The default value of a cgroup hierarchy’s release_agent path is empty.
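Turning the flag on comes down to writing 1 into the cgroup's notify_on_release file. Here is a hedged sketch, assuming a cgroup v1 hierarchy like the one in the question and enough privileges to write there. The helper name enable_release_notify is hypothetical, and the path in the usage comment is a placeholder, not a real container ID.

```shell
#!/bin/sh
# enable_release_notify CGROUP_DIR (hypothetical helper)
# Writes 1 to CGROUP_DIR/notify_on_release and prints the value read back.
enable_release_notify() {
    flag="$1/notify_on_release"
    [ -w "$flag" ] || { echo "cannot write $flag" >&2; return 1; }
    echo 1 > "$flag" && cat "$flag"
}

# Hypothetical usage (cgroup v1, typically as root):
#   enable_release_notify /sys/fs/cgroup/systemd/docker/<container-id>
```

As the quoted documentation says, the kernel only has something to run if the release_agent file in the hierarchy's root directory names a command, and that path is empty by default.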


What is the concept of creating a file with zero bytes in Linux?

I could see the test file with 0 bytes in the directory. But how does the operating system handle the concept of 0 bytes? If I put it in layman’s terms:

7 Answers

A file is (roughly) three separate things:

  • An "inode", a metadata structure that keeps track of who owns the file, permissions, and a list of blocks on disk that actually contain the data.
  • One or more directory entries (the file names) that point to that inode.
  • The actual blocks of data themselves.

When you create an empty file, you create only the inode and a directory entry pointing to that inode. The same goes for sparse files (dd if=/dev/null of=sparse_file bs=10M seek=1).
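The difference between the size a file reports and the blocks it occupies can be inspected directly. This is a small sketch, assuming GNU coreutils (stat -c and the 10M suffix for dd are GNU conventions). The file names are only illustrative.

```shell
#!/bin/sh
# Create an empty file and a sparse file in a scratch directory,
# then print "name size-in-bytes allocated-blocks" for each.
dir=$(mktemp -d)

touch "$dir/empty_file"                                          # inode + directory entry only
dd if=/dev/null of="$dir/sparse_file" bs=10M seek=1 2>/dev/null  # seeks 10 MiB, writes no data

for f in "$dir/empty_file" "$dir/sparse_file"; do
    # %s = apparent size in bytes, %b = number of allocated blocks
    stat -c '%n %s %b' "$f"
done

empty_size=$(stat -c %s "$dir/empty_file")
sparse_size=$(stat -c %s "$dir/sparse_file")
rm -rf "$dir"
```

The sparse file reports a size of 10 MiB even though no data was written. How many blocks each file occupies depends on the file system, so the block column is best read as an observation, not a guarantee.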

When you create hardlinks to an existing file, you just create additional directory entries that point to the same inode.

I have simplified things here, but you get the idea.
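The hard-link case is easy to check from the shell: both names should report the same inode number, and the inode's link count should go up. This is a rough sketch using only ls and awk, with file names invented for the example.

```shell
#!/bin/sh
# Create an empty file, hard-link it, and compare inode numbers.
dir=$(mktemp -d)
touch "$dir/original"
ln "$dir/original" "$dir/alias"      # second directory entry, same inode

inode_original=$(ls -i "$dir/original" | awk '{print $1}')
inode_alias=$(ls -i "$dir/alias" | awk '{print $1}')
link_count=$(ls -l "$dir/original" | awk '{print $2}')   # 2nd column of ls -l

echo "original inode: $inode_original"
echo "alias inode:    $inode_alias"
echo "link count:     $link_count"
rm -rf "$dir"
```

Both names point at the inode, not at a list of blocks, so it does not matter that the file has no data blocks at all.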

Nicely stated, while raising one small conundrum with your "hard-links" paragraph: if one creates a hard link to an empty file, which you state has no list of blocks, how can that hard link point to the (same) list of blocks which doesn’t exist?

@Theophrastus Good point. I did my best to simplify things. Actually, between the list of blocks and the directory entries there is metadata pertaining to the file (referred to by an inode number), which contains file attributes (owner, permissions, ...) and extended attributes. The list of blocks is in there. So the directory entries do not point directly to the list of blocks (the FAT way), but to the metadata.


Should be three separate things: A list of blocks that contain data; the blocks themselves; and a directory entry (or entries) that points to the list of blocks.

@MontyHarder, actually, I liked the simplicity of the explanation before, without getting so deep into terminology. This is the problem with community edited solutions like Wikipedia: simple explanations that get the idea across get mangled to the point of unreadability for beginners simply because experts aren’t satisfied with the technical precision.

@Wildcard Even if you’re a beginner, understanding the difference between an inode and a directory is important. When someone changes permissions/ownership of "a directory name" and thinks other links to the same inode will retain the old permissions/ownership, Something Very Bad could happen. We don’t have to delve into the details of how inodes reference direct blocks, indirect blocks, doubly- and triply-indirect blocks to get that it’s a list of blocks. Or that a list can be empty.

touch will create an inode, and ls -i or stat will show info about the inode:

$ touch test
$ ls -i test
28971114 test
$ stat test
  File: ‘test’
  Size: 0          Blocks: 0          IO Block: 4096   regular empty file
Device: fc01h/64513d    Inode: 28971114    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/1000)   Gid: ( 1000/1000)
Access: 2017-03-28 17:38:07.221131925 +0200
Modify: 2017-03-28 17:38:07.221131925 +0200
Change: 2017-03-28 17:38:07.221131925 +0200
 Birth: -

Notice that test uses 0 blocks. To store the data displayed, the inode uses some bytes. Those bytes are stored in the inode table. Look at the ext2 page for an example of an inode structure.

ls (or well, the stat(2) system call) tells you the size of the contents of the file. How much space the filesystem needs for bookkeeping is not part of that, and as an implementation detail, it’s not something that programs in general should care or even know about. Making the implementation details visible would make the filesystem abstraction less useful.

The file itself does not occupy any space, but the file system does use space for it, storing the file name, location, access rights and the like.

If you look at the space occupied by the directory itself, a directory containing a thousand files that are 0 bytes in size will be bigger than a directory that has just 2 huge files.
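This can be tried out with two scratch directories. The sketch below assumes a POSIX shell and a head that supports -c. Two 1 MiB files stand in for the "huge" ones, and all names are illustrative. The sizes that ls -ld reports for the directories vary from one file system to another, so no particular numbers are promised here.

```shell
#!/bin/sh
# Compare a directory holding 1000 empty files with one holding 2 larger files.
many=$(mktemp -d)
few=$(mktemp -d)

i=1
while [ "$i" -le 1000 ]; do
    : > "$many/file_$i"              # create an empty file
    i=$((i + 1))
done
head -c 1048576 /dev/zero > "$few/big_1"
head -c 1048576 /dev/zero > "$few/big_2"

file_count=$(( $(ls "$many" | wc -l) ))
content_bytes=$(( $(cat "$many"/* | wc -c) ))
echo "files: $file_count, total content bytes: $content_bytes"

# The size column here belongs to the directories themselves, not their files.
ls -ld "$many" "$few"
rm -rf "$many" "$few"
```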

props for mentioning that a file is an abstract concept that is not tightly linked with its physical representation on e.g. a disk.

Simple answer: Because it’s defined that way.

Longer answer: It’s defined that way because some operations are conceptually simpler:

  • If a file contains 20 letters "A", and you remove all "A"s, then the file will become 20 bytes shorter. The same operation on a file that consisted of just "AAAAAAAAAAAAAAAAAAAA" would have to deal with the special case of a vanishing file.
  • More practically, deleting the last line of a text file would need to be special-cased.
  • Text editors that regularly make a backup would need special-case code to deal with the situation where the user deletes the last line, goes to lunch, then comes back and adds another line. Further complications arise if some other user created a file with that name in the meantime.

You can do more things:

  • Error log files tend to be created empty, to be filled if and only if an error happens.
  • To find out how many errors happened, you count the number of lines in the log files. If the log file is empty, the number of errors is zero, which makes perfect sense.
  • Sometimes you see files where all the relevant text is in the file name, e.g. this-is-the-logging-directory. This prevents overeager administrators from deleting empty directories after installation, and it also prevents bugs where a program or a user accidentally creates a file where the program would like to see a directory later. The git program (and others) tend to ignore empty directories, and if a project/administrator/user wants to have a record that the directory exists even though it has no useful content (yet), you may see an empty file named empty or empty.directory.
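The error-counting point can be shown in a few lines of shell. This is only an illustration: the temporary file stands in for a real log, and count_errors is a made-up helper that assumes one error per line.

```shell
#!/bin/sh
# count_errors FILE (hypothetical helper): one error per line is assumed.
count_errors() {
    echo "$(( $(wc -l < "$1") ))"
}

log=$(mktemp)                        # stands in for a log created empty at startup
before=$(count_errors "$log")
printf 'disk full\nconnection lost\n' >> "$log"
after=$(count_errors "$log")

echo "errors before: $before, after: $after"   # prints "errors before: 0, after: 2"
rm -f "$log"
```

The empty log needs no special handling: counting its lines gives zero, which is the right answer.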

No operations become more complicated:

  • Concatenating files: this is just a no-op with an empty file.
  • Searching for a string in a file: this is covered by the standard case of "if the file is shorter than the search term, it cannot contain the search term".
  • Reading from the file: programs need to deal with hitting the end of the file before they got what they expected, so again the case of a zero-length file does not involve extra thinking for the programmer: he’ll just hit end-of-file from the beginning.
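The three cases in this list can be run against an empty file with ordinary tools and no special-case code. What follows is a small sketch, with file names and contents made up for the example.

```shell
#!/bin/sh
dir=$(mktemp -d)
printf 'hello\nworld\n' > "$dir/data"
: > "$dir/empty"

# Concatenating: appending an empty file changes nothing.
cat "$dir/data" "$dir/empty" > "$dir/joined"
if cmp -s "$dir/data" "$dir/joined"; then concat_result=unchanged; else concat_result=changed; fi

# Searching: grep simply finds no match in the empty file.
if grep -q hello "$dir/empty"; then grep_result=found; else grep_result="not found"; fi

# Reading: the first read hits end-of-file, so the loop body never runs.
lines_read=0
while IFS= read -r line; do
    lines_read=$((lines_read + 1))
done < "$dir/empty"

echo "concat: $concat_result, grep: $grep_result, lines read: $lines_read"
rm -rf "$dir"
```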

In the case of files, the "there is a file recorded somewhere" aspect (inode and/or file name) comes on top of the above considerations, but file systems would not do that if empty files were useless.

In general, all of the above reasons except those related to file names apply to sequences. Most notably to strings, which are sequences of characters: zero-length strings are commonplace inside programs. Empty strings are usually disallowed at the user level if they don’t make sense: a file name is a string, and most file systems do not allow an empty string as a file name; internally, when creating file names from fragments, the program may well have an empty string as one of the fragments.
