Removing line breaks in Linux

Contents

Remove carriage return in Unix
Removing \r on any UNIX® system:
With sed :
With tr :
Difference between sed and tr :
Testing:
Removing line breaks in bash
Removing line breaks and rejoining the line

Remove carriage return in Unix

I think using tr -d command is the simplest method, but I am wondering how to remove just the last carriage return?

21 Answers

I’m going to assume you mean carriage returns (CR, "\r", 0x0d) at the ends of lines rather than just blindly within a file (you may have them in the middle of strings for all I know). Using this test file with a CR at the end of the first line only:

$ cat infile
hello
goodbye
$ cat infile | od -c
0000000   h   e   l   l   o  \r  \n   g   o   o   d   b   y   e  \n
0000017

dos2unix is the way to go if it’s installed on your system:

$ cat infile | dos2unix | od -c
0000000   h   e   l   l   o  \n   g   o   o   d   b   y   e  \n
0000016

If for some reason dos2unix is not available to you, then sed will do it:

$ cat infile | sed 's/\r$//' | od -c
0000000   h   e   l   l   o  \n   g   o   o   d   b   y   e  \n
0000016

If for some reason sed is not available to you, then ed will do it, in a complicated way:

$ echo ',s/\r\n/\n/
> w !cat
> Q' | ed infile 2>/dev/null | od -c
0000000   h   e   l   l   o  \n   g   o   o   d   b   y   e  \n
0000016

If you don’t have any of those tools installed on your box, you’ve got bigger problems than trying to convert files 🙂

To fix the issue on a Mac you can also prefix the single-quoted sed string with $ like so: sed $'s@\r@@g' | od -c (but if you wanted to replace with \n you would need to escape it)

Not great: 1. doesn’t work inplace, 2. can replace \r also not at EOL (which may or may not be what you want. ).

1. Most unixy tools work that way, and it’s usually the safest way to go about things since if you screw up you still have the original. 2. The question as stated is to remove carriage returns, not to convert line endings. But there are plenty of other answers that might serve you better.

If your tr does not support the \r escape, try '\015' or perhaps a literal '^M' (in many shells on many terminals, ctrl-V ctrl-M will produce a literal ctrl-M character).
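A minimal sketch of the octal fallback mentioned above. The helper name `strip_cr` is made up for illustration; the only real command involved is `tr -d` with the octal escape `\015`.

```shell
#!/bin/sh
# strip_cr: hypothetical helper name, not part of any standard tool.
# Deletes every CR byte (octal 015) from stdin, wherever it appears.
strip_cr() {
    tr -d '\015'
}

# Example: two CRLF-terminated lines in, two LF-terminated lines out.
printf 'hello\r\ngoodbye\r\n' | strip_cr | od -c
```

Note that this removes CRs in the middle of lines as well, which may or may not be what you want.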

The simplest way on Linux is, in my humble opinion,

-i will edit the file in place, while the .bak will create a backup of the original file by making a copy of your file and adding the extension .bak at the end. (You can specify what ever you want after the -i , or specify only -i to not create a backup.)

The strong quotes around the substitution operator 's/\r//' are essential. Without them the shell will interpret \r as an escape+r and reduce it to a plain r , and remove all lower case r . That’s why the answer given above in 2009 by Rob doesn’t work.

And adding the /g modifier ensures that even multiple \r will be removed, and not only the first one.

I would advise not to use the -i flag since it modifies the original file, and it might be dangerous if you wish to keep it unchanged
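For readers who want in-place editing but share this concern, here is a small sketch that keeps a backup copy. It assumes a sed that supports the non-POSIX `-i` option with an attached suffix (GNU and BSD sed both do). The file name is a placeholder created only for the demonstration, and the CR is generated with printf so the script does not depend on sed understanding `\r`.

```shell
#!/bin/sh
# Sketch only: assumes a sed with the non-POSIX -i option (GNU or BSD sed).
# The file below is a placeholder created just for this demonstration.
workdir=$(mktemp -d)
file="$workdir/example.txt"
printf 'hello\r\ngoodbye\r\n' > "$file"

# Produce a literal CR so the sed script does not rely on \r being understood.
cr=$(printf '\r')

# -i.bak edits the file in place and keeps the original as example.txt.bak.
sed -i.bak "s/${cr}\$//" "$file"

od -c "$file"        # converted file
od -c "$file.bak"    # untouched original
```

If the result looks wrong, the `.bak` copy can simply be moved back over the edited file.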

If you’re matching on \r$ then /g does nothing, because it will only replace the last character before the end of the line. For example printf 'foo\r\r\r\n' | sed 's/\r$//g' | od -c keeps two \r s. 's/\r\+$//' will do what you want (though I don’t know that repeated carriage returns is really something to be concerned about).
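The point about repeated carriage returns can be sketched as follows. The helper name `strip_trailing_crs` is hypothetical. It uses `*` instead of `\+`, because `\+` in a basic regular expression is a GNU extension, and it builds the CR with printf so it should not matter whether the local sed knows `\r`.

```shell
#!/bin/sh
# strip_trailing_crs: hypothetical helper name.
# Removes any run of CRs at the end of each line; CRs elsewhere are kept.
strip_trailing_crs() {
    cr=$(printf '\r')          # literal CR, avoids relying on \r in sed
    sed "s/${cr}*\$//"
}

printf 'foo\r\r\r\n' | strip_trailing_crs | od -c
printf 'a\rb\r\n'    | strip_trailing_crs | od -c   # the inner CR survives
```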

tr -d '\r' < filewithcarriagereturns >filewithoutcarriagereturns

There’s a utility called dos2unix that exists on many systems, and can be easily installed on most.

sed -i s/\r// or somesuch; see man sed or the wealth of information available on the web regarding use of sed .

One thing to point out is the precise meaning of "carriage return" in the above; if you truly mean the single control character "carriage return", then the pattern above is correct. If you meant, more generally, CRLF (carriage return and a line feed, which is how line feeds are implemented under Windows), then you probably want to replace \r\n instead. Bare line feeds (newline) in Linux/Unix are \n .
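To find out which of the two situations applies to a given file, one option is to count the lines that end in a CR. This is only a sketch: `count_crlf_lines` is a made-up helper name, and the sample file is generated on the spot.

```shell
#!/bin/sh
# count_crlf_lines: hypothetical helper; prints how many lines of the given
# file end in CR (that is, were CRLF-terminated).
count_crlf_lines() {
    cr=$(printf '\r')
    grep -c "${cr}\$" "$1"
}

# Demonstration file: two CRLF lines and one plain LF line.
sample=$(mktemp)
printf 'tiger\r\nlion\nbear\r\n' > "$sample"
count_crlf_lines "$sample"
```

A count equal to the total number of lines suggests a regular DOS file; a smaller non-zero count suggests mixed line endings.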

I am trying to use sed 's/\r\n/=/' countryNew.txt > demo.txt which does not work. "tiger" "Lion."

are we to take that to mean you’re on a mac? I’ve noticed Darwin sed seems to have different commands and feature sets by default than most Linux versions.

FYI, the s/\r// doesn’t seem to remove carriage returns on OS X, it seems to remove literal r chars instead. I’m not sure why that is yet. Maybe it has something to do with the way the string is quoted? As a workaround, using CTRL-V + CTRL-M in place of \r seems to work.

If you are a Vi user, you may open the file and remove the carriage return with:

:%s/^M//g

Note that you should type ^M by pressing ctrl-v and then ctrl-m.

Not great: if the file has CR on every line (i.e. is a correct DOS file), vim will load it with fileformat=dos, and not show the ^M characters at all. Getting around this is a ton of keystrokes, which is not what vim is made for ;). I’d just go for sed -i , and then -e 's/\r$//g' to limit the removal to CRs at EOL.

Someone else recommended dos2unix and I strongly recommend it as well. I’m just providing more details.

If installed, jump to the next step. If not already installed, I would recommend installing it via yum like:

dos2unix fileIWantToRemoveWindowsReturnsFrom.txt
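The "check whether it is installed" step can be sketched in a script. The wrapper name `to_unix_eol` is hypothetical. It assumes that the installed dos2unix acts as a filter when given no file names (the widely packaged implementation does), and falls back to sed when dos2unix is missing.

```shell
#!/bin/sh
# to_unix_eol: hypothetical wrapper. Uses dos2unix as a stdin/stdout filter
# when it is installed; otherwise strips a trailing CR from each line with sed.
to_unix_eol() {
    if command -v dos2unix >/dev/null 2>&1; then
        dos2unix
    else
        cr=$(printf '\r')      # literal CR for the sed fallback
        sed "s/${cr}\$//"
    fi
}

printf 'hello\r\ngoodbye\r\n' | to_unix_eol | od -c
```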

Once more a solution. Because there’s always one more:

It’s nice because it’s in place and works in every flavor of unix/linux I’ve worked with.

Removing \r on any UNIX® system:

Most existing solutions in this question are GNU-specific, and wouldn’t work on OS X or BSD; the solutions below should work on many more UNIX systems, and in any shell, from tcsh to sh , yet still work even on GNU/Linux, too.

Tested on OS X, OpenBSD and NetBSD in tcsh , and on Debian GNU/Linux in bash .

With sed :

In tcsh on OS X, the following sed snippet can be used together with printf , since neither sed nor echo handles \r specially the way the GNU versions do:

sed `printf 's/\r$//g'` input > output

With tr :

tr -d '\r' < input > output

Difference between sed and tr :

It would appear that tr preserves a lack of a trailing newline from the input file, whereas sed on OS X and NetBSD (but not on OpenBSD or GNU/Linux) inserts a trailing newline at the very end of the file even if the input is missing any trailing \r or \n at the very end of the file.

Testing:

Here’s some sample testing that could be used to ensure this works on your system, using printf and hexdump -C ; alternatively, od -c could also be used if your system is missing hexdump :

% printf 'a\r\nb\r\nc' | hexdump -C
00000000  61 0d 0a 62 0d 0a 63                              |a..b..c|
00000007
% printf 'a\r\nb\r\nc' | ( sed `printf 's/\r$//g'` /dev/stdin > /dev/stdout ) | hexdump -C
00000000  61 0a 62 0a 63 0a                                 |a.b.c.|
00000006
% printf 'a\r\nb\r\nc' | ( tr -d '\r' < /dev/stdin >/dev/stdout ) | hexdump -C
00000000  61 0a 62 0a 63                                    |a.b.c|
00000005
%

If you’re using an OS (like OS X) that doesn’t have the dos2unix command but does have a Python interpreter (version 2.5+), this command is equivalent to the dos2unix command:

python -c "import sys; import fileinput; sys.stdout.writelines(line.replace('\r', '\n') for line in fileinput.input(mode='rU'))"

This handles both named files on the command line as well as pipes and redirects, just like dos2unix . If you add this line to your ~/.bashrc file (or equivalent profile file for other shells):

alias dos2unix="python -c \"import sys; import fileinput; sys.stdout.writelines(line.replace('\r', '\n') for line in fileinput.input(mode='rU'))\""

...the next time you log in (or run source ~/.bashrc in the current session) you will be able to use the dos2unix name on the command line in the same manner as in the other examples.

Don’t know why someone gave ‘-1’. This is a perfectly good answer (and the only one which worked for me).

@FractalSpace This is a terrible idea! It completely wrecks all the spacing in the file and leaves all the contents of the file subject to interpretation by the shell. Try it with a file that contains one line a * b .

%0d is the carriage return character. To make the file compatible with Unix, use the command below.

dos2unix fileName.extension fileName.extension

try this to convert dos file into unix file:

For UNIX: I’ve noticed dos2unix removed Unicode headers from my UTF-8 file. Under git bash (Windows), the following script seems to work nicely. It uses sed. Note it only removes carriage returns at the ends of lines, and preserves Unicode headers.

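#!/bin/bash
# Usage: pass the file to convert as the first argument.
inOutFile="$1"
backupFile="${inOutFile}~"
# Keep the original as a backup with a trailing ~ in its name.
mv --verbose "$inOutFile" "$backupFile"
# Read the backup, strip a CR (written here as \015) at end of line, write the result back.
sed -e 's/\015$//g' <"$backupFile" >"$inOutFile"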

If you are running an X environment and have a proper editor (Visual Studio Code), then I would follow this recommendation:

Just go to the bottom right corner of your screen: Visual Studio Code shows both the file encoding and the end-of-line convention used by the file, and with a simple click you can switch it.

Just use Visual Studio Code as your replacement for Notepad++ in a Linux environment and you are set to go.

Or use Notepad++ ‘s command to Edit / EOL Conversion / Unix (LF) on your Windows system before copying the file to your Linux system.

Using sed on Git Bash for Windows

The first version uses ANSI-C quoting and may require escaping \ if the command runs from a script. The second version exploits the fact that sed reads the input file line by line by removing \r and \n characters. When writing lines to the output file, however, it only appends a \n character. A more general and cross-platform solution can be devised by simply modifying IFS

IFS=$'\r\n' # or IFS+=$'\r' if the lines do not contain whitespace
printf "%s\n" $(cat infile) > outfile
IFS=$' \t\n' # not necessary if IFS+=$'\r' is used

Warning: This solution performs filename expansion ( * , ? , [...] and more if extglob is set). Use it only if you are sure that the file does not contain special characters or you want the expansion.
Warning: None of the solutions can handle \ in the input file.
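A possible way around both warnings is to drop the unquoted `$(cat infile)` and read the input line by line instead. This is a different technique from the IFS splitting shown above, sketched here with a made-up helper name (`strip_cr_lines`). It should leave `*`, `?` and backslashes untouched, at the cost of being slow on large files.

```shell
#!/bin/sh
# strip_cr_lines: hypothetical helper. Reads stdin line by line and removes
# one trailing CR from each line, without word splitting or globbing.
strip_cr_lines() {
    cr=$(printf '\r')
    # IFS= keeps leading/trailing whitespace; -r keeps backslashes literal.
    # The "|| [ -n ... ]" part handles a last line with no trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        printf '%s\n' "${line%"$cr"}"
    done
}

printf 'a * b\r\nc\\d\r\n' | strip_cr_lines
```

Unlike tr, this always ends the output with a newline, even when the last input line had none.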


Removing line breaks in bash

A regular expression extracts a chunk of multi-line text from a file. The next task is to turn it into a single line. I tried sed "s/\r\n//" in various combinations. I keep googling different notations for the line break character, but nothing works. I would be very grateful for a pointer in the right direction. The text is UTF-8.

stroka3 stroka2 stroka1 stroka1 stroka2 stroka3 (the result). I have tried \n in various ways without success, along with various sed and tr flags.

You must be doing something wrong 🙂
$ cat in.txt
stroka3
stroka2
stroka1
stroka1
stroka2
stroka3
$ cat in.txt | tr -s '\r\n' ' '
stroka3 stroka2 stroka1 stroka1 stroka2 stroka3
$ cat in.txt | tr -d '\r\n'
stroka3stroka2stroka1stroka1stroka2stroka3

3 Answers

cat in.txt | tr -s '\r\n' ' ' > out.txt

Or, to join the lines together (in the example above, \r\n is replaced with a space):

cat in.txt | tr -d '\r\n' > out.txt

P.S. Replace '\r\n' with '\n' for Unix line endings.

Look at the contents of in.txt:

~ $ cat in.txt
stroka3
stroka2
stroka1
stroka1
stroka2
stroka3

Load everything in the file into a variable

Print the contents of the variable with echo:

~ $ echo "$string" # a quoted variable is printed with line breaks
stroka3
stroka2
stroka1
stroka1
stroka2
stroka3
~ $ echo $string # an UNQUOTED variable is printed WITHOUT line breaks
stroka3 stroka2 stroka1 stroka1 stroka2 stroka3
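The command that fills the variable is not shown above, so the sketch below assumes plain command substitution (`string=$(cat in.txt)`). The input file is recreated in a temporary directory for the demonstration.

```shell
#!/bin/sh
# Assumption: the variable is filled with command substitution; the original
# command is not shown, so string=$(cat ...) is a guess.
workdir=$(mktemp -d)
printf 'stroka3\nstroka2\nstroka1\n' > "$workdir/in.txt"

string=$(cat "$workdir/in.txt")

# Quoted: the embedded newlines reach echo untouched.
quoted=$(echo "$string")
# Unquoted: the shell splits on whitespace and echo rejoins with spaces.
unquoted=$(echo $string)

printf '%s\n' "$quoted"
printf '%s\n' "$unquoted"
```

The unquoted form also performs filename expansion, so it is probably only safe for text without `*`, `?` or `[` characters.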


Removing line breaks and rejoining the line

A bug during an export added a line break \n after data3, so every record got split across two lines. data3 is enclosed in " (double quotes). The file is very large (1 million lines), so fixing it by hand is not an option. Please tell me how to remove the line break with sed so that the displaced tail is pulled back onto a single line. The file currently looks like this:

data1,data2,"data3
",data4
data1,data2,"data3
",data4
data1,data2,"data3
",data4

and it should look like this:

data1,data2,"data3",data4
data1,data2,"data3",data4
data1,data2,"data3",data4

4 Answers

[VladD@Kenga] [00:59:25] [~] $> cat xx.txt
data1,data2,"data3
",data4
data1,data2,"data3
",data4
data1,data2,"data3
",data4
[VladD@Kenga] [00:59:32] [~] $> sed 'N;s/\n"/"/' xx.txt
data1,data2,"data3",data4
data1,data2,"data3",data4
data1,data2,"data3",data4

For more complex cases (where "ordinary" lines may also occur), try this:

[VladD@Kenga] [01:35:47] [~] $> cat xx.txt
header
"data1",data2,"data3
",data4
intermediate data
data1,"data2
","data3
",data4
data1,data2,"data3
",data4
[VladD@Kenga] [01:35:52] [~] $> sed '/^",/{H;x;s/\n//;x;d}; x' xx.txt

header
"data1",data2,"data3",data4
intermediate data
data1,"data2","data3",data4
data1,data2,"data3",data4
[VladD@Kenga] [01:35:57] [~] $> sed '/^",/{H;x;s/\n//;x;d}; x' xx.txt | sed '1d'
header
"data1",data2,"data3",data4
intermediate data
data1,"data2","data3",data4
data1,data2,"data3",data4

Caution: the last line must end with a line break, otherwise it will be "swallowed"!

Explanation: when we see a line that starts with a quote, we need to know the previous line in order to join the two. To do that, we "delay" the output of lines by sending them to the hold space instead of printing them, and printing the previous line stored there instead ( x ).

When a line starts with a quote ( /^"/ ), we take action. The hold space contains the previous line; we append the current line to it ( H ) and swap the hold space with the pattern space ( x ) so that the text can be processed. We remove the \n ( s/\n// ) and send the line back to the hold space so that it can be examined and printed on the next cycle. The leftover fragment that ends up in the pattern space is deleted, which ends this iteration ( d ).
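The same "delay the output by one line" idea can also be written in awk instead of sed, which some readers may find easier to follow. This is only a sketch, not the answer author's method. The helper name `join_split_records` is invented, and it assumes that a continuation line always starts with a double quote followed by a comma.

```shell
#!/bin/sh
# join_split_records: hypothetical helper. Holds each line back until the
# next one shows whether it is a continuation, then prints the joined record.
# Assumption: continuation lines start with a double quote and a comma.
join_split_records() {
    awk '
        /^",/  { held = held $0; next }     # glue continuation onto held line
        NR > 1 { print held }               # previous record is complete
               { held = $0 }                # hold the current line back
        END    { if (NR > 0) print held }   # flush the final record
    '
}

printf 'header\n"data1",data2,"data3\n",data4\ndata1,"data2\n","data3\n",data4\n' |
    join_split_records
```

Because the held line is flushed in the END block, the final record is printed whether or not the file ends with a line break.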
