Zero length file linux

What is the concept of creating a file with zero bytes in Linux?

I can see the test file with 0 bytes in the directory, but how does the operating system handle the concept of 0 bytes? If I put it in layman's terms:


A file is (roughly) three separate things:

  • An «inode», a metadata structure that keeps track of who owns the file, permissions, and a list of blocks on disk that actually contain the data.
  • One or more directory entries (the file names) that point to that inode
  • The actual blocks of data themselves

When you create an empty file, you create only the inode and a directory entry pointing to that inode. The same goes for sparse files (for example, dd if=/dev/null of=sparse_file bs=10M seek=1).

When you create hardlinks to an existing file, you just create additional directory entries that point to the same inode.

I have simplified things here, but you get the idea.
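The points above can be checked with a small sketch. The file names are made up for illustration, and `stat -c` assumes GNU coreutils; the block counts reported may differ between filesystems.

```shell
#!/bin/sh
# Sketch only: file names are hypothetical; `stat -c` assumes GNU coreutils.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

: > empty_file                 # new inode + one directory entry, no data
ln empty_file second_name      # second directory entry for the same inode

# Sparse file as in the answer: dd copies nothing, it only sets the length.
dd if=/dev/null of=sparse_file bs=10M seek=1 2>/dev/null

# %s = size in bytes, %b = allocated blocks, %i = inode number, %h = link count
stat -c '%n: size=%s blocks=%b inode=%i links=%h' empty_file second_name sparse_file
```

Both names of the empty file should report the same inode number and a link count of 2, which is why a hard link to an empty file is unproblematic: the link points at the inode, not at any data blocks.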

Nicely stated, though your «hard-links» paragraph raises one small conundrum: if one creates a hard link to an empty file, which you state has no list of blocks, how can that hard link point to the (same) list of blocks, which doesn’t exist?

@Theophrastus Good point. I did my best to simplify things. Actually, between the list of blocks and the directory entries there is metadata pertaining to the file (referred to by an inode number), which contains file attributes (owner, permissions, ...) and extended attributes. The list of blocks is in there. So the directory entries do not point directly to the list of blocks (the FAT way), but to the metadata.

Should be three separate things: A list of blocks that contain data; the blocks themselves; and a directory entry (or entries) that points to the list of blocks.

@MontyHarder, actually, I liked the simplicity of the explanation before, without getting so deep into terminology. This is the problem with community edited solutions like Wikipedia: simple explanations that get the idea across get mangled to the point of unreadability for beginners simply because experts aren’t satisfied with the technical precision.

@Wildcard Even if you’re a beginner, understanding the difference between an inode and a directory is important. When someone changes permissions/ownership of «a directory name» and thinks other links to the same inode will retain the old permissions/ownership, Something Very Bad could happen. We don’t have to delve into the details of how inodes reference direct blocks, indirect blocks, doubly- and triply-indirect blocks to get that it’s a list of blocks. Or that a list can be empty.

touch will create an inode, and ls -i or stat will show info about the inode:

$ touch test
$ ls -i test
28971114 test
$ stat test
  File: ‘test’
  Size: 0          Blocks: 0          IO Block: 4096   regular empty file
Device: fc01h/64513d    Inode: 28971114    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/1000)   Gid: ( 1000/1000)
Access: 2017-03-28 17:38:07.221131925 +0200
Modify: 2017-03-28 17:38:07.221131925 +0200
Change: 2017-03-28 17:38:07.221131925 +0200
 Birth: -

Notice that test uses 0 blocks. To store the data displayed, the inode uses some bytes. Those bytes are stored in the inode table. Look at the ext2 page for an example of an inode structure.


ls (or well, the stat(2) system call) tells you the size of the contents of the file. How much space the filesystem needs for bookkeeping is not part of that, and as an implementation detail, it’s not something that programs in general should care or even know about. Making the implementation details visible would make the filesystem abstraction less useful.

The file, itself, does not occupy any space, but the file system does, storing the filename, location, access rights to it and the like.

If you look at the space occupied by the directory itself: a directory containing a thousand files that are 0 bytes in size will be bigger than a directory that contains just 2 huge files.
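A rough way to see this for yourself, as a sketch: the directory and file names are invented, `truncate` assumes GNU coreutils (or busybox), and the exact sizes printed depend on the filesystem.

```shell
#!/bin/sh
# Sketch only: names are hypothetical; sizes shown by ls vary by filesystem.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir many_empty two_huge

# 1000 zero-byte files: no file data, but 1000 directory entries.
i=1
while [ "$i" -le 1000 ]; do
    : > "many_empty/file_$i"
    i=$((i + 1))
done

# Two large (sparse) files: a lot of apparent data, only 2 directory entries.
truncate -s 100M two_huge/a two_huge/b

# Size of each directory itself (not of its contents), in bytes.
ls -ld many_empty two_huge
```

On most filesystems the size column for many_empty should come out larger than for two_huge, because the directory has to store a thousand names.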

props for mentioning that a file is an abstract concept that is not tightly linked with its physical representation on e.g. a disk.

Simple answer: Because it’s defined that way.

Longer answer: It’s defined that way because some operations are conceptually simpler:

  • If a file contains 20 letters «A», and you remove all «A»s, then the file will become 20 bytes shorter. The same operation on a file that consisted of just «AAAAAAAAAAAAAAAAAAAA» would have to deal with the special case of a vanishing file.
  • More practically, deleting the last line of a text file would need to be special-cased.
  • Text editors that regularly make a backup would need special-case code to deal with the situation that the user might delete the last line, go to lunch, then come back and add another line. Further complications arise if some other users created a file with that name in the mean time.

You can do more things:

  • Error log files tend to be created empty, to be filled if and only if an error happens.
  • To find out how many errors happened, you count the number of lines in the log files. If the log file is empty, the number of errors is zero, which makes perfect sense.
  • Sometimes you see files where all the relevant text is in the file name, e.g. this-is-the-logging-directory. This prevents overeager administrators from deleting empty directories after installation, and it also prevents bugs where a program or a user accidentally creates a file where the program would like to see a directory later. The git program (and others) tend to ignore empty directories, and if a project/administrator/user wants to have a record that the directory exists even though it has no useful content (yet), you may see an empty file named empty or empty.directory.

No operations become more complicated:

  • Concatenating files: this is just a no-op with an empty file.
  • Searching for a string in a file: this is covered by the standard case of «if the file is shorter than the search term, it cannot contain the search term».
  • Reading from the file: programs need to deal with hitting the end of the file before they got what they expected, so again the case of a zero-length file does not involve extra thinking for the programmer: he’ll just hit end-of-file from the beginning.
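The cases above can be tried out in a few lines of shell. This is only a sketch; errors.log and data.txt are hypothetical names chosen for the example.

```shell
#!/bin/sh
# Sketch only: an empty file needs no special handling in common operations.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

: > errors.log                       # hypothetical empty error log
printf 'one line\n' > data.txt

# Counting lines in an empty log gives 0, with no special case.
error_count=$(( $(wc -l < errors.log) ))
echo "errors: $error_count"

# Concatenating with an empty file changes nothing.
cat data.txt errors.log > combined.txt
cmp -s data.txt combined.txt && echo "concatenation was a no-op"

# Searching an empty file simply finds nothing (grep exits with status 1).
grep -q 'needle' errors.log
grep_status=$?
echo "grep status: $grep_status"

# Reading hits end-of-file on the very first attempt.
if IFS= read -r first_line < errors.log; then
    read_result="got a line"
else
    read_result="end of file"
fi
echo "read: $read_result"
```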

In the case of files, the «there is a file recorded somewhere» aspect (inode and/or file name) comes on top of the above considerations, but file systems would not do that if empty files were useless.

In general, all of the above reasons except those related to file names apply to sequences. Most notably to strings, which are sequences of characters: zero-length strings are commonplace inside programs. Empty strings are usually disallowed at the user level if they don’t make sense: a file name is a string, and most file systems do not allow an empty string as a file name; internally, however, when building file names from fragments, a program may well have an empty string as one of the fragments.


Create empty multiple files in Unix that contain 0 bytes using awk or bash?

I am trying to create multiple empty files on Unix, but I do not know whether awk can do that or whether I have to do it in bash. If so, could you help me, please? At the moment I use this command:

however every file that is created is 1 byte. Is there a way to create 0 byte file using awk or bash?


or touch $(seq 2 | sed 's/.*/empty&.txt/') to eliminate the extra sed, or even touch $(seq -f 'empty%.0f.txt' 2)? Why use sed at all?

If you really want to do it in awk :

seq 2 | awk '{ system("touch " "empty" FNR ".txt") }'

That’s not very clever though because it creates a whole new process for every line that awk reads.

@EdMorton’s suggestion is therefore preferable, as he uses awk’s internal printf rather than creating a separate process:

You would be better to use this:

which only creates one process. However, that will not reduce to zero the size of any pre-existing files you may have — I don’t generally like to assume folk are happy to lose data unless they explicitly say so in their question.

The touch {1..2}.txt is IMO the best answer so far! One process and it achieves the ultimate goal of zero byte size files in the convention asked for. +1

You could also try: seq 2 | awk '{ printf "" > "empty" NR ".txt" }'.

As pointed out elsewhere, touch won’t guarantee the files are empty. If you do use touch, you might want to test whether any of the filenames already correspond to (non-empty) files.
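One possible way to do that test, as a sketch in plain sh: the emptyN.txt names follow the question, and the pre-existing empty2.txt is created here only to demonstrate the check.

```shell
#!/bin/sh
# Sketch only: create emptyN.txt files, but leave existing non-empty files alone.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'keep me\n' > empty2.txt      # pretend this file already existed

for i in 1 2 3; do
    name="empty$i.txt"
    if [ -s "$name" ]; then
        # -s is true when the file exists and is larger than zero bytes
        echo "skipping $name: already has content" >&2
    else
        touch "$name"
    fi
done

ls -l empty*.txt
```

Whether to skip, warn, or truncate in the non-empty case is a policy decision; this sketch only skips and reports.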

The touch command will create a file if it doesn’t exist. So you could do:

for i in $(seq 1 2); do
    touch "empty$i.txt"
done

If you don’t want to use any other commands like touch, and want to do it purely using bash commands, you could use the redirection operator > without a command in front of it:

for i in $(seq 1 2); do
    > "empty$i.txt"
done

> filename
# The > truncates file "filename" to zero length.
# If file not present, creates zero-length file (same effect as 'touch').
# (Same result as ": >", above, but this does not work with some shells.)
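The "same effect as touch" remark only holds when the file does not exist yet. A short sketch of the difference on an existing file (the file names are invented for the example):

```shell
#!/bin/sh
# Sketch only: touch keeps existing content, a bare redirection discards it.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'data\n' > touched.txt
printf 'data\n' > truncated.txt

touch touched.txt        # only updates timestamps; the 5 bytes stay
: > truncated.txt        # ":" is a no-op command; ">" truncates to 0 bytes

wc -c touched.txt truncated.txt
```

The ": >" form is used here instead of a bare ">" because it behaves the same way in every POSIX shell.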

The awk command can indeed create empty files on its own, without spawning something in system() :

$ awk 'BEGIN { printf("") > "test1" }'
$ ls -l test1
-rw-r--r-- 1 ghoti staff 0 Jan 9 09:58 test1
$

Note that each output file is recorded in memory upon use, and that memory is not given back until the output file is closed. You can use the close() function to close a file.
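Putting the two points together, a sketch that creates several empty files from a single awk process and closes each one as it goes. The emptyN.txt names and the count of 3 are arbitrary; note that awk's > truncates any existing file of the same name.

```shell
#!/bin/sh
# Sketch only: one awk process, no system() calls, each file closed after use.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

awk 'BEGIN {
    for (i = 1; i <= 3; i++) {
        name = "empty" i ".txt"
        printf "" > name      # opening the file for output creates it
        close(name)           # release it before opening the next one
    }
}'

ls -l empty*.txt
```

Building the name in a variable first avoids relying on how a particular awk parses string concatenation to the right of the > operator.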

But you by no means need awk for this: touch works nicely from multiple shells (including bash, of course), and the mechanisms you can use to generate file names for it are numerous and varied.


Find and delete the zero size files and empty directories


How to use the find utility in the terminal to find, and optionally delete, all zero-size files and empty directories in a specified directory, including its subdirectories.

Zero size files

To find all zero size files, simply use either of:

find ./ -type f -size 0
find ./ -type f -empty

Either command finds all zero size files in the current directory and its subdirectories and prints the full pathname of each file to the screen.

  • The ./ means start searching from the current directory. If you want to find files in another directory, replace ./ with the path to that directory. For example, to search everything under the system log directory, replace ./ with /var/log.
  • The -type f flag restricts the search to regular files.
  • The -size 0 and -empty flags both match zero length files.

To find and then delete all zero size files, there are variants you can use:

find ./ -type f -size 0 -exec rm -f {} \;
find ./ -type f -size 0 | xargs rm -f 
find ./ -type f -size 0 -delete 

The xargs variant passes all the file names to rm -f as arguments, which avoids forking a new process for every file, as -exec rm -f {} \; does. However, it fails on file names that contain spaces or other special characters.

The -delete variant is the best choice when the find you are using supports it, because it avoids the overhead of executing the rm command by making the unlink() call inside find itself.
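If -delete is not available, the standard -exec ... {} + form is a possible middle ground: like xargs it batches many names into one rm call, but each name is passed as its own argument, so spaces are harmless. A sketch with invented file names:

```shell
#!/bin/sh
# Sketch only: batch deletion of zero size files that copes with spaces in names.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
: > 'zero with spaces.txt'
: > plain_zero.txt
printf 'x\n' > 'non empty.txt'

# "{} +" hands many file names to a single rm, each as a separate argument.
find . -type f -size 0 -exec rm -f {} +

ls -A
```

Where both tools support it, find ... -print0 | xargs -0 rm -f is another commonly used way to make the xargs variant safe.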

Empty directories

To find all empty directories, simply use:

find ./ -type d -empty

This command finds all empty directories in the current directory and its subdirectories and prints the full pathname of each empty directory to the screen.

  • The ./ means start searching from the current directory. If you want to find directories under another directory, replace ./ with the path to that directory. For example, to search everything under the system log directory, replace ./ with /var/log.
  • The -type d flag restricts the search to directories.
  • The -empty flag matches only empty directories.

To find and then delete all empty directories, use:

find ./ -depth -type d -empty -exec rmdir {} \;
find ./ -type d -empty -delete 

The -delete variant is the best choice when the find you are using supports it.
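A sketch of how this behaves on nested empty directories. The tree layout is made up for the example; -empty, -delete and -mindepth are assumed to be available, as they are in GNU and BSD find.

```shell
#!/bin/sh
# Sketch only: removing a chain of nested empty directories in one pass.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
mkdir -p tree/a/b/c tree/keep
: > tree/keep/file            # an empty file still makes its directory non-empty

# -delete implies -depth, so c is removed before b, and b before a.
# -mindepth 1 keeps the starting directory itself out of consideration.
find tree -mindepth 1 -type d -empty -delete

find tree
```

Because children are handled before their parents, a and b count as empty by the time find tests them, so the whole a/b/c chain disappears while keep survives.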


Arthur Gareginyan

Arthur is a designer and full stack software engineer. He is the founder of Space X-Chimp and the blog My Cyber Universe. His personal website can be found at arthurgareginyan.com.
