Перевод строки
Перевод строки, или разрыв строки — продолжение печати текста с новой строки, то есть с левого края на строку ниже, или уже на следующей странице.
Терминология
Возврат каретки (англ. carriage return, CR) — управляющий символ ASCII (0x0D, 1310, ‘\r’), при выводе которого курсор перемещается к левому краю поля, не переходя на другую строку. Этот управляющий символ вводится клавишей «Enter». Будучи записан в файле, в отдельности рассматривается как перевод строки только в системах Macintosh.
Подача строки (от англ. line feed, LF — «подача [бумаги] на строку») — управляющий символ ASCII (0x0A, 10 в десятичной системе счисления, ‘\n’), при выводе которого курсор перемещается на следующую строку. В случае принтера это означает сдвиг бумаги вверх, в случае дисплея — сдвиг курсора вниз, если ещё осталось место, и прокрутку текста вверх, если курсор находился на нижней строке. Возвращается ли при этом курсор к левому краю или нет, зависит от реализации.
Способы представления
Способ представления перевода строки в текстовом файле часто зависит от используемой операционной системы:
LF ( ASCII 0x0A) используется в Multics, UNIX, UNIX-подобных операционных системах (GNU/Linux, AIX, Xenix, Mac OS X, FreeBSD и др.), BeOS, Amiga UNIX, RISC OS и других;
CR ( ASCII 0x0D) используется в 8-битовых машинах Commodore, машинах TRS-80, Apple II, системах Mac OS до версии 9 и OS -9;
CR+LF ( ASCII 0x0D 0x0A) используется в DEC RT-11 и большинстве других ранних не-UNIX- и не-IBM-систем, а также в CP/M, MP/M (англ.), MS -DOS, OS /2, Microsoft Windows, Symbian OS , протоколах Интернет.
Конвертирование
Способы конвертирования файла:
Файл → Сохранить как… → Конец строки → …
tr -d '\r' dos.file >unix.file tr -d '\015' dos.file >unix.file sed --in-place 's/$/\r/' unix2dos.file sed --in-place 's/\x0d$//' dos2unix.file
How can I have a newline in a string in sh?
What should I do to have a newline in a string? Note: This question is not about echo. I’m aware of echo -e , but I’m looking for a solution that allows passing a string (which includes a newline) as an argument to other commands that do not have a similar option to interpret \n ‘s as newlines.
13 Answers 13
If you’re using Bash, you can use backslash-escapes inside of a specially-quoted $’string’ . For example, adding \n :
STR=$'Hello\nWorld' echo "$STR" # quotes are required here!
If you’re using pretty much any other shell, just insert the newline as-is in the string:
Bash recognizes a number of other backslash escape sequences in the $» string. Here is an excerpt from the Bash manual page:
Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard. Backslash escape sequences, if present, are decoded as follows: \a alert (bell) \b backspace \e \E an escape character \f form feed \n new line \r carriage return \t horizontal tab \v vertical tab \\ backslash \' single quote \" double quote \nnn the eight-bit character whose value is the octal value nnn (one to three digits) \xHH the eight-bit character whose value is the hexadecimal value HH (one or two hex digits) \cx a control-x character The expanded result is single-quoted, as if the dollar sign had not been present. A double-quoted string preceded by a dollar sign ($"string") will cause the string to be translated according to the current locale. If the current locale is C or POSIX, the dollar sign is ignored. If the string is translated and replaced, the replacement is double-quoted.
Difference between CR LF, LF and CR line break types
I’d like to know the difference (with examples if possible) between CR LF (Windows), LF (Unix) and CR (Macintosh) line break types.
Very similar, but not an exact duplicate. \n is typically represented by a linefeed, but it’s not necessarily a linefeed.
CR and LF are ASCII and Unicode control characters while \r and \n are abstractions used in certain programming languages. Closing this question glosses over fundamental differences between the questions and perpetuates misinformation.
@AdrianMcCarthy It’s a problem with the way close votes act as answers in a way; an answer claiming the two were the same could be downvoted and then greyed out as very, very wrong, but it only takes 4 agreeing votes (comparable to upvotes) to have a very wrong close happen, with no way to counter the vote until after it’s happened.
This formulation of the question is admittedly better, but it is still for all practical purposes the same question.
10 Answers 10
CR and LF are control characters, respectively coded 0x0D (13 decimal) and 0x0A (10 decimal).
They are used to mark a line break in a text file. As you indicated, Windows uses two characters the CR LF sequence; Unix (and macOS starting with Mac OS X 10.0) only uses LF; and the classic Mac OS (before 10.0) used CR.
An apocryphal historical perspective:
As indicated by Peter, CR = Carriage Return and LF = Line Feed, two expressions have their roots in the old typewriters / TTY. LF moved the paper up (but kept the horizontal position identical) and CR brought back the «carriage» so that the next character typed would be at the leftmost position on the paper (but on the same line). CR+LF was doing both, i.e., preparing to type a new line. As time went by the physical semantics of the codes were not applicable, and as memory and floppy disk space were at a premium, some OS designers decided to only use one of the characters, they just didn’t communicate very well with one another 😉
Most modern text editors and text-oriented applications offer options/settings, etc. that allow the automatic detection of the file’s end-of-line convention and to display it accordingly.
so actually Windows is the only OS that uses these characters properly, Carriage Return, followed by a Line Feed.
Would it be accurate, then, to say that a text file created on Windows is the most compatible of the three i.e. the most likely to display on all three OS subsets?
@Hashim it might display properly but trying to run a textual shell script with carriage returns will usually result in an error
Rolf — that statement assumes that keeping old terminology/technology in new technology is correct. CRLF = 2 bytes. CR = 1, LF = 1. With as often as they are used, that actually translates to a huge amount of data. Once again, Windows has chosen to be different from the entirety of the *NIX world.
This is a good summary I found:
The Carriage Return (CR) character ( 0x0D , \r ) moves the cursor to the beginning of the line without advancing to the next line. This character is used as a new line character in Commodore and early Macintosh operating systems (Mac OS 9 and earlier).
The Line Feed (LF) character ( 0x0A , \n ) moves the cursor down to the next line without returning to the beginning of the line. This character is used as a new line character in Unix-based systems (Linux, Mac OS X, etc.)
The End of Line (EOL) sequence ( 0x0D 0x0A , \r\n ) is actually two ASCII characters, a combination of the CR and LF characters. It moves the cursor both down to the next line and to the beginning of that line. This character is used as a new line character in most other non-Unix operating systems including Microsoft Windows, Symbian and others.
The «vertical tab»-character moves the cursor down and keep the position in the line, not the LF-character. The LF is EOL.
@Vicrobot Developers will often split a string or perform other operations with the exact sequence \r\n , so \n\r would not match. Also one would think that text editors also treat the two characters as one sequence and don’t separatly go «oh, now I have to go down one line» and «oh, now I have to move to the front». Were it so then yes, you could freely swap the order around
It’s really just about which bytes are stored in a file. CR is a bytecode for carriage return (from the days of typewriters) and LF similarly, for line feed. It just refers to the bytes that are placed as end-of-line markers.
There is way more information, as always, on Wikipedia.
I think it’s also useful to mention that CR is the escape character \r and LF is the escape character \n . In addition, Wikipedia:Newline.
In Simple words CR and LF is just end of line and new line according to this link , is this correct ?
@shaijut CR stands for Carriage Return. That was what returned the carriage on typewriters. So, mostly correct.
The superior LFCR option is sadly missing. Its benefit is that by doing the Line Feed first, the Selectric golfball can’t smear the just printed line with still fresh ink upon executing the Carriage Return
Actually, it’s not a typewriter but «teletype», old computer client terminals with mechanical print heads and paper, where CR/LF were required for computers to behave properly. If you just did CR, you would have a bunch of characters on top of each other on a paper. If you just did LF, your text lines would slowly migrate to the right on the paper. CR/LF were required for proper teletype based computing. An old Star Trek game would dump «F
Carriage Return (Mac pre-OS X)
Line Feed (Linux, Mac OS X)
Carriage Return and Line Feed (Windows)
If you see ASCII code in a strange format, they are merely the number 13 and 10 in a different radix/base, usually base 8 (octal) or base 16 (hexadecimal).
The \r and \n only works in some programming languages, although it seems to be universal among programming languages that use backslash to indicate special characters.
Jeff Atwood has a blog post about this: The Great Newline Schism
The sequence CR+LF was in common use on many early computer systems that had adopted teletype machines, typically an ASR33, as a console device, because this sequence was required to position those printers at the start of a new line. On these systems, text was often routinely composed to be compatible with these printers, since the concept of device drivers hiding such hardware details from the application was not yet well developed; applications had to talk directly to the teletype machine and follow its conventions. The separation of the two functions concealed the fact that the print head could not return from the far right to the beginning of the next line in one-character time. That is why the sequence was always sent with the CR first. In fact, it was often necessary to send extra characters (extraneous CRs or NULs, which are ignored) to give the print head time to move to the left margin. Even after teletypes were replaced by computer terminals with higher baud rates, many operating systems still supported automatic sending of these fill characters, for compatibility with cheaper terminals that required multiple character times to scroll the display.