Linux file system naming

Unix file naming convention [closed]


I was wondering what the naming convention for files in Unix is. I am not sure, but I think there may be a universal naming convention that one should follow. For example, say I want to name a file made up of the parts backup, part 2 and random. Should I do it like this: backup_part2_random OR backup-part2-random OR backup.part2.random? I hope the question is clear. Basically, I want to choose a format that conforms to the Unix philosophy.

As a general comment re the «conventions»: I’ve just read all the answers so far, and it struck me how odd it is that there is almost an obsession with using only one case in a system where (I think) one of the strengths is the ability to meaningfully use both cases. Was the original (case-sensitive) design an over-design? Just musing.

14 Answers

. is used to separate a filetype extension, e.g. foo.txt .

Either - or _ is used to separate logical words, e.g. my-big-file.txt or sometimes my_big_file.txt. Some consider - better because you don’t have to press the Shift key (at least with a standard US English PC keyboard); others prefer _ because it looks more like a space.

So if I understand your example, backup-part2-random or backup_part2_random would be closest to the normal Unix convention.
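To make the two separator styles concrete, here is a minimal sketch that turns a free-form phrase into a lowercase name. The function name to_filename and its optional separator argument are hypothetical, invented for this illustration; they are not part of any standard tool.

```shell
# Hypothetical helper: join the words of a phrase with a separator.
# Usage: to_filename "phrase" [separator]   (separator defaults to "-")
to_filename() {
    sep=${2:--}
    # Lower-case the phrase, then turn each run of spaces into one separator.
    name=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | tr -s ' ' "$sep")
    printf '%s\n' "$name"
}
```

Calling to_filename 'backup part2 random' gives the hyphenated form, and passing _ as the second argument gives the underscore form.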

CamelCase is normally not used on Linux/Unix systems. Have a look at file names in /bin and /usr/bin . CamelCase is the exception rather than the rule on Unix and Linux systems.

( NetworkManager is the only example I can think of that uses CamelCase, and it was written by a Mac developer. Many have complained about this choice of name. On Ubuntu, they have actually renamed the script to network-manager .)

For example, on /usr/bin on my system:

    $ ls -d [A-Z]* | wc -w # files starting with a capital
    6
    $ ls -d *_* | wc -w # files containing an underscore
    178
    $ ls -d *-* | wc -w # files containing a minus/dash
    409

and even then, none of the files starting with a capital uses CamelCase:

    $ ls -d [A-Z]*
    GET HEAD POST X11 Xvnc Xvnc4
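If you want to repeat this kind of count on your own system, note that piping ls into wc -w counts words, so a name containing a space is counted more than once. The sketch below counts directory entries instead. The function name count_names is hypothetical, and it uses [[:upper:]] rather than [A-Z] on the assumption that your locale may treat the range differently.

```shell
# Count directory entries whose name matches a shell pattern.
# Usage: count_names DIR PATTERN
count_names() (
    n=0
    for path in "$1"/*; do
        # Skip the unexpanded glob left behind when DIR is empty.
        [ -e "$path" ] || [ -L "$path" ] || continue
        case ${path##*/} in
            $2) n=$((n + 1)) ;;   # unquoted, so $2 is treated as a pattern
        esac
    done
    echo "$n"
)

# Example calls (results depend on your system):
#   count_names /usr/bin '[[:upper:]]*'
#   count_names /usr/bin '*_*'
#   count_names /usr/bin '*-*'
```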

The . character is also used to number rotated files, not only to mark an extension. For example: my.log, my.log.1, my.log.2.gz.
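A rough sketch of how such numbered names come about is below. It is only an illustration of the naming scheme, not how any particular log rotation tool is implemented; the function name rotate_log and the default of three old copies are assumptions, and compression of older copies (the .gz part) is left out.

```shell
# Hypothetical helper: shift FILE -> FILE.1 -> FILE.2 ..., keeping KEEP old copies.
# Usage: rotate_log FILE [KEEP]   (KEEP defaults to 3)
rotate_log() (
    file=$1
    i=${2:-3}
    while [ "$i" -gt 1 ]; do
        prev=$((i - 1))
        # Move each older copy up one slot; the oldest one is overwritten.
        if [ -e "$file.$prev" ]; then
            mv -f "$file.$prev" "$file.$i"
        fi
        i=$prev
    done
    if [ -e "$file" ]; then
        mv -f "$file" "$file.1"
    fi
    : > "$file"    # start a fresh, empty log
)
```

After two calls on my.log with content written in between, the newest old copy is my.log.1 and the one before it is my.log.2.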

+1 for the references. (@Proletariat, the ls output from /usr/bin is a reference. This is a question about conventions.)

Far more important than a particular convention is being consistent. Pick a style, and stick with it.

My take on Unix/Linux filename conventions:

  • Unix/Linux filesystems don’t inherently support the notion of an extension. The concept of a file extension exists entirely as a convention honored by utilities such as cp, ls, or the shell you are using. I believe it is this way on NTFS as well, but I could be wrong.
  • Executables, including shell scripts, usually have no extension at all. Scripts have a hashbang line (e.g. #!/bin/bash) that identifies which program should interpret them.
  • Any executable that is two letters long is super important. So don’t give your executables two-letter filenames. Any file in /etc ending in tab is also super important, such as fstab, mtab, inittab.
  • Sometimes .d is appended to directory names, particularly in /etc, but this isn’t widespread (UPDATE: https://serverfault.com/questions/240181/what-does-the-suffix-d-mean-in-linux).
  • rc is widely used for configuration scripts or files, either as a prefix (e.g., rc.local) or as a suffix (e.g., .vimrc).
  • The Unix/Linux community has never had a three-character limit on extensions and frowns upon shortening well-known extensions to fit. For example, don’t use .htm at the end of HTML files on Unix/Linux; use .html.
  • In a set of files, a filename is sometimes capitalized, or in all caps, so it appears at the head of a directory listing. The classic example is Makefile in source packages. Only do this for stuff like README.
  • ~ is used to identify a backup file or a directory, as in important_stuff~, or /etc~. Many shells will expand a lone ~ to $HOME.
  • Library files almost always begin with lib. An exception is zlib, and there are probably a few others.
  • Scripts that are called by inetd are sometimes tagged with a leading in., such as in.tftpd.
  • The ending z in vmlinuz means zipped, but I’ve never seen any other file named this way.
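Since the interpreter is identified by the hashbang line rather than by an extension, you can read it straight from the file. Here is a minimal sketch; the function name interpreter_of is hypothetical, and it only looks at the first line without checking that the named program exists.

```shell
# Print the interpreter named on a script's first line, or fail if there is none.
# Usage: interpreter_of FILE
interpreter_of() (
    first=
    # Read the first line; accept a final line that lacks a trailing newline.
    IFS= read -r first < "$1" || [ -n "$first" ] || exit 1
    case $first in
        '#!'*)
            rest=${first#'#!'}
            printf '%s\n' "${rest# }"   # tolerate "#! /bin/sh"
            ;;
        *)
            exit 1
            ;;
    esac
)
```

For a script that starts with #!/bin/bash this prints /bin/bash, whatever the file is called.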

I often see shell scripts with a .sh «extension» on them. I personally find it somewhat annoying, but I have to admit that I may be ignorant of some good reason for using the .sh .

One reason that comes to mind: it emphasizes that the file is a text-based script and not a binary.

@DanMoulding, personally, I use .sh on scripts which are (1) not intended to be run interactively, but only from other scripts/programs, or (2) are designed for sourcing rather than execution. For the former they must be executable; for the latter I leave the executable bit off and use the shebang line only for documentation of what shell the functions are written for.

@Wildcard I have since (6 years ago) gotten into this same habit. The extension actually makes a lot of sense for sourcing script bits. For instance, from an executable script written for zsh (i.e. #!/bin/zsh at the top) you know you can safely source another file with the .zsh extension and be sure that it contains legal zsh code. If your executable script is strictly Bourne Shell compliant (i.e. #!/bin/sh at the top), then you’d know that sourcing that .zsh file is going to be problematic.

I find using «.sh», «.py», «.pl», etc., is convenient, and some text editors (e.g., Geany) use those to make a first guess at the proper syntax highlighting scheme.

    In the Naming Variables, Functions, and Files section of the GNU Coding Standards you’ll find:

Please use underscores to separate words in a name, so that the Emacs word commands can be useful within them. Stick to lower case;

  • 44.6% of the time only dash is used
  • 54.1% of the time only underscore
  • 1.2% of the time a file uses both.

Interestingly, the source for git weighs in at 85% for dashes, 3.8% for underscores, and 11.1% for both.
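If you want to gather this kind of statistic for a source tree of your own, something like the sketch below would do. This is not the script used for the figures above, which is not shown in the thread. It assumes the percentages are taken over file names that contain at least one separator, the function name separator_stats is hypothetical, and names containing newlines would be miscounted.

```shell
# Report how often file names under DIR use only dashes, only underscores, or both.
# Usage: separator_stats DIR
separator_stats() {
    find "$1" -type f | awk -F/ '
        {
            name = $NF                      # last path component
            has_dash = (name ~ /-/)
            has_under = (name ~ /_/)
            if (has_dash && has_under) {
                both++
            } else if (has_dash) {
                dash++
            } else if (has_under) {
                under++
            }
        }
        END {
            total = dash + under + both
            if (total == 0) { print "no separators found"; exit }
            printf "dash %.1f%% underscore %.1f%% both %.1f%%\n",
                100 * dash / total, 100 * under / total, 100 * both / total
        }'
}
```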

The choice is clear, debate over. 😉

Personal opinion: I use dashes for aesthetic and shift key reasons. If you’re working on a team, take a vote. But to reiterate what’s been said, be consistent.

* or «be_all and end_all» if you like

In Unix, a filename is just a string, unlike DOS, where a filename was composed of a name and an extension. So any of the given filenames is completely acceptable.

But many programs still use file suffixes beginning with a dot to distinguish different file types, e.g. Apache Web Server uses suffixes to set the correct MIME type in response headers.
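To illustrate the idea of picking a type from the suffix, here is a toy sketch. It is not how Apache does it; the function name mime_for and the handful of suffixes are chosen purely for the example.

```shell
# Toy lookup: guess a MIME type from the file suffix alone.
# Usage: mime_for FILENAME
mime_for() {
    case $1 in
        *.html) echo text/html ;;
        *.txt)  echo text/plain ;;
        *.png)  echo image/png ;;
        *)      echo application/octet-stream ;;   # unknown suffix or none
    esac
}
```

Note that the decision is based on the name only: the content of the file is never inspected.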

While gelraen is 100% correct: Unix/Linux as such does not care about file extensions, modern flavours of Linux do care in so far as some shell extensions provide special identification (colours or otherwise) of certain file types and file managers provide automatic associations with programs. But just as important is for the human user to know which file is what type. To that end it is convenient to stick to a standard scheme not just consistent for yourself but with others. In this respect things should not be overly different than MS Windows (or MIME).

That said sometimes several different extension styles can match the same purpose. Thus .tar.gz is equivalent to .tgz, .tar.bz2 = .tbz, .ps.gz is often shortened as .ps (confusingly) and I’m sure there are many more.
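If you prefer the long forms for consistency, the shorthand suffixes can be expanded mechanically. A small sketch, with a hypothetical function name long_suffix and covering only the two pairs mentioned above:

```shell
# Expand shorthand archive suffixes to their long equivalents.
# Usage: long_suffix FILENAME
long_suffix() {
    case $1 in
        *.tgz) printf '%s\n' "${1%.tgz}.tar.gz" ;;
        *.tbz) printf '%s\n' "${1%.tbz}.tar.bz2" ;;
        *)     printf '%s\n' "$1" ;;               # leave anything else alone
    esac
}
```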

@jonescb, yes of course. My point about it being confusing is that when I see .ps I expect a non compressed file (which I should be able to cat or less), but often .ps files are compressed and should in fact be .ps.gz for clarity (as they require zcat or zless for source code viewing). Some people decided to just suffix compressed PostScript files with .ps anyway because some common ps viewers actually don’t mind whether they are compressed or not.

Stick to alphanumeric filenames. Avoid spaces or replace spaces with underscores ( _ ). Limit punctuation in file names to periods (.), underscores ( _ ), and hyphens (-). Generally filenames are lowercase, but I use CamelCase when I have multiple words in the filename.
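A quick way to check a name against this advice is sketched below. The function name is_portable_name is hypothetical, and rejecting a leading hyphen is my own added assumption (such names are easily mistaken for command options), not something stated above.

```shell
# Succeed only if NAME uses just ASCII letters, digits, periods,
# underscores and hyphens, and does not start with a hyphen.
# Usage: is_portable_name NAME
is_portable_name() (
    LC_ALL=C    # make the bracket ranges below mean plain ASCII
    case $1 in
        '' | -* | *[!A-Za-z0-9._-]*) exit 1 ;;
    esac
    exit 0
)
```

For example, is_portable_name backup-part2-random succeeds, while is_portable_name 'foo bar' fails because of the space.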

Use extensions which indicate the type of file. Programs do not need extensions, as the execute bit is used to indicate programs, and the shells know how to run programs of various types. It is common but not required to use .sh for shell scripts and .pl for Perl scripts. The Windows executable extensions .bat, .com, .scr, and .exe indicate Windows executables on Unix.

Pick a standard and stick to it. But it won’t break things if you deviate from it.

Hidden (or dot) files have names starting with a period. These normally don’t show up in directory listings. Use ‘ls -a’ to include the dot files in the list.
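The same rule applies to shell globs: a plain * does not match names starting with a period. If you need the dot files in a script, one common approach is the pair of patterns used in the sketch below. The function name list_hidden is hypothetical.

```shell
# Print the names of hidden (dot) entries in DIR, one per line.
# Usage: list_hidden DIR
list_hidden() (
    cd "$1" || exit 1
    # .[!.]* matches .name; ..?* matches names like ..name.
    # Neither pattern matches the special entries . and ..
    for name in .[!.]* ..?*; do
        [ -e "$name" ] || [ -L "$name" ] || continue   # skip unmatched patterns
        printf '%s\n' "$name"
    done
)
```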

It’s not «bad» versus «good». It’s «this is how it’s usually done». It’s a convention the OP was asking for. The reason? It could be because Unix people don’t like pressing Shift, it could be because old systems only had UPPERCASE, or for another reason. I’m not sure.


@Mikel I also program Java where CamelCase is a convention. Sometimes patterns and conventions conflict.

@ultrasawblade Thanks, shows how often I script Windows. I tried to skip the rarer executable extensions like cmd, pif, vb*, wsh, and the rest of them.

Characters you should not use in filenames:

Character delimiters you should use to make names easier to read:

(In some cases «:» has special meaning though)

Of course, you can’t even use «/» in filenames. Everything else is possible. And if you want to make it hard to access, even useful 😉

The list is actually a lot longer, including control and non-ASCII characters. Yes, you can have a backspace as part of a *nix file name.

More to the point, most *nix systems only disallow two specific characters in file names: the / path separator, and the \0 (ASCII zero) string terminator.

To add to what others have said, I’d just say that while accented letters and many special characters are legal in filenames they can cause issues in any of the following scenarios:

  • You share your filesystem with other computers, particularly with different operating systems;
  • You share files with others (and although email tends to be quite good with conversions, sometimes it just does not work);
  • You use shell scripts to automate some tasks (spaces are particularly problematic, though there are many ways to deal with them);
  • You use a file share from another computer.
    cat foo_bar-env.sh
    foo_bar() { echo baz; }
    EOF

One convention is to use «_» to replace spaces as separators between words. Other characters could be used to replace spaces, but there are slightly stronger conventional uses for «-» and «.» in pathnames, so «_» is usually preferred.

Spaces are legal in pathnames, but are conventionally avoided, because they require quoting the pathname («foo bar») or escaping the spaces (foo\ bar). A properly written shell script will quote variables that may include spaces, particularly pathnames, but failing to do so is a common oversight, and it’s a lot of extra typing when doing a one-off command entered at the command line.
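The difference is easy to see with a tiny experiment. This sketch assumes a POSIX-style shell such as sh or bash (zsh, for one, does not split unquoted variables by default); count_args is a throwaway helper defined just for the demonstration.

```shell
# Report how many arguments were received.
count_args() {
    echo "$#"
}

file='foo bar'        # a legal name containing a space

count_args $file      # unquoted: split into foo and bar, prints 2
count_args "$file"    # quoted: passed through as one name, prints 1
count_args foo\ bar   # escaped: also one name, prints 1
```

The first call is the common oversight: a command given $file unquoted would look for two files, foo and bar, neither of which is the one you meant.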

Using «-» to separate clusters of numbers, as in timestamps or serial numbers, is a convention commonly used outside the context of filesystems. Using «.» to separate «file extensions» that indicate the type of file is very common, and some important tools depend on it. For instance, the package management system on Red Hat Enterprise Linux and its derivatives, RPM, expects package files to end with «.rpm». The traditional tarball is a tar file («.tar») that has been gzipped («.gz»), and so ends in «.tar.gz».

So putting these together, you often end up with filenames that look like, «home_backup_2017-07-01.tar.gz»
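A name of that shape can be generated rather than typed. A minimal sketch, where the function name backup_name and the fixed _backup_ middle part are assumptions made for the example:

```shell
# Build a name like LABEL_backup_YYYY-MM-DD.tar.gz
# Usage: backup_name LABEL [DATE]   (DATE defaults to today)
backup_name() {
    printf '%s_backup_%s.tar.gz\n' "$1" "${2:-$(date +%Y-%m-%d)}"
}
```

With the year first, then month, then day, a plain alphabetical listing of such files is also in chronological order.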
