Linux remove windows line endings

Find and remove DOS line endings on Ubuntu

I have found that many of my files have DOS line endings. In VI they look like this: "^M". I don’t want to modify files that don’t have these DOS line endings. How do I do this using a bash script? Thanks! EV

8 Answers

grep -URl ^M . | xargs fromdos 

grep gets you a list of all files under the current directory that have DOS line endings.

-U makes grep consider line endings instead of stripping them away by default

-l makes it list only the filenames and not the matching lines

then you’re piping that list into the converter command (which is fromdos on ubuntu, dos2unix where i come from).

NOTE: don’t actually type ^M. Instead, press Ctrl-V then Ctrl-M to insert the ^M character so that grep understands what you’re going for. Or you could type $'\r' in place of ^M (but I think that may only work in bash).

On tcsh, and probably on csh too, you can get the same effect with grep -URl "\r" . | xargs fromdos

If you need it to work for files with spaces in their names, try grep -URl ^M . | xargs -I{} dos2unix "{}" instead.
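NUL-delimited file names are another way to cope with spaces (and even newlines) in names. What follows is only a sketch, assuming GNU grep, GNU xargs and GNU sed are available; strip_cr_tree is a made-up name, and sed stands in for fromdos/dos2unix so that nothing extra has to be installed:

```shell
# Hypothetical helper: convert CRLF to LF in every text file under a directory,
# rewriting only the files that actually contain a carriage return.
# Assumes GNU grep (-Z), GNU xargs (-0, -r) and GNU sed (-i, \r).
strip_cr_tree() {
    # grep: -r recurse, -l names only, -I skip binary files, -Z NUL-terminate names.
    # printf '\r' produces a literal carriage return in any POSIX shell.
    # xargs: -0 reads NUL-terminated names, -r does nothing if the list is empty.
    grep -rlIZ "$(printf '\r')" "${1:-.}" | xargs -0 -r sed -i 's/\r$//'
}
```

Calling strip_cr_tree with no argument works on the current directory; strip_cr_tree path/to/project limits it to one tree. Files without a carriage return are never opened for writing.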

One way using GNU coreutils:

On ubuntu, you use the fromdos utility

The above example would take a MS-DOS or Microsoft Windows file or other file with different line separators and format the file with new line separators to be read in Linux and Unix.

cat origin_file.txt | sed "s/^M//" > dest_file.txt 

You have to do the same thing mentioned above, ctl-V then ctl-M to get that character. This is preferable for me because it is portable across many platforms and keeps it simple within bash.

on ubuntu I also find this works:

cat origin_file.txt | sed "s/\r//" > dest_file.txt
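A small variation, sketched here under the assumption that GNU sed is in use (the \r escape is a GNU extension): anchoring the match with $ removes the carriage return only at the end of a line, while tr deletes every carriage return wherever it appears. The sample content and the dest_file_tr.txt name are made up for illustration.

```shell
# Sample input with DOS line endings.
printf 'one\r\ntwo\r\n' > origin_file.txt

# Remove a carriage return only when it is the last character of the line.
sed 's/\r$//' origin_file.txt > dest_file.txt

# Delete every carriage return in the stream, regardless of position.
tr -d '\r' < origin_file.txt > dest_file_tr.txt
```

For a file whose only carriage returns sit at line ends, both outputs are identical.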

Note if you’re converting multi-byte files you need to take extra care, and should probably try to use the correct iconv or recode from-encoding specifications.

If it’s a plain ASCII file, both of the below methods would work.

The flip program (the Debian package is also called flip) can handle line endings. From the manual:

When asked to convert a file to the same format that it already has, flip causes no change to the file. Thus to convert all files to **IX format you can type flip -u * and all files will end up right, regardless of whether they were in MS-DOS or in **IX format to begin with. This also works in the opposite direction. 

Or you could use GNU recode:

a: ASCII text, with CRLF line terminators
b: ASCII text, with CRLF line terminators

Convert to unix line-endings:

a: ASCII text
b: ASCII text

recode abbreviates dos line-endings as pc , so the logic with pc.. is: convert from pc format to the default, which is latin1 with unix line-endings.

Remove Line Endings From a File

1. Overview

A common task when working on the Linux command-line is searching for a string or a pattern and then replacing or deleting it. However, there are special characters that can cause this common task to be less trivial than we anticipate.

In this tutorial, we’ll explore several approaches to remove newline characters using tools such as tr, awk, Perl, paste, sed, Bash, and the Vim editor.

2. Preparing Our Example File

Before we start, let’s create a text file named some_names.txt that we’ll use to apply all our strategies:
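A minimal sketch for creating the file, assuming it holds four names, each followed by a comma, one per line (the layout shown when the file is opened later in the tutorial):

```shell
# Write one "name," entry per line; printf repeats the format for each argument.
printf '%s\n' 'Martha,' 'Charlotte,' 'Diego,' 'William,' > some_names.txt

# Show the result.
cat some_names.txt
```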

The goal is to end up with a CSV-like file with the content:

Martha,Charlotte,Diego,William,

3. Using tr

To delete characters or replace them with others, tr is a natural first choice because it’s easy to use.

The command tr uses the standard input ( stdin ), performs some operations (translate, squeeze, delete), and then copies the result to the standard output ( stdout ).

We’ll now focus on the “delete” operation. With the parameter -d, we define a set of characters that we want tr to remove.

Since we just want to delete the newlines, we place only this character in the set and then redirect the standard output to a new CSV file:
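A sketch of such a command, assuming the example file described in the previous section (it is recreated here so the snippet runs on its own):

```shell
# Recreate the example input: one "name," entry per line.
printf '%s\n' 'Martha,' 'Charlotte,' 'Diego,' 'William,' > some_names.txt

# -d deletes every character in the given set; here the set is just the newline.
tr -d '\n' < some_names.txt > some_names.csv
```

Note that tr reads only from standard input, which is why the file is supplied through a redirection rather than as an argument.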

Now, let’s see the content of our CSV file:

$ cat some_names.csv
Martha,Charlotte,Diego,William,

4. Using awk

The awk program is a well-known, powerful, and useful tool that allows us to process text using patterns and actions.

It lets us perform some operations in a very straightforward way, with the help of some tricks:

$ awk 1 ORS='' some_names.txt > some_names.csv

Let’s see the content of our CSV file:

$ cat some_names.csv
Martha,Charlotte,Diego,William,

Let’s take a closer look to understand how we solved the problem.

We wrote the pattern “1” because it evaluates to true (allowing the record to be processed). Since no action is given, awk performs the default action, which is to print the entire record terminated with the value of the ORS variable.

Then we define the ORS (Output Record Separator) variable, which is set to newline by default, to be the empty string.

Following these two steps, we consumed every record, then printed them using the empty string as the output record separator. In other words, we simply ignored the newline.

Another way is to use it as an awk program text:

And an extended version of that would be:

Here, we do the same, but this time, we use the BEGIN pattern, which executes the action of defining the ORS variable before any of the input is read, and then, printing the $0 variable, which contains the whole record (usually an entire line of the input).
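The two variants described above might look like the following. This is a sketch under the assumption that the short form sets ORS inside the program text and the extended form uses BEGIN with an explicit print; the second output file name is made up so the results can be compared.

```shell
# Recreate the example input: one "name," entry per line.
printf '%s\n' 'Martha,' 'Charlotte,' 'Diego,' 'William,' > some_names.txt

# Short form: ORS is assigned in the program text, then the pattern 1 prints each record.
awk '{ ORS = "" } 1' some_names.txt > some_names.csv

# Extended form: BEGIN sets ORS once before any input is read,
# and an explicit action prints $0 (the whole record).
awk 'BEGIN { ORS = "" } { print $0 }' some_names.txt > some_names_begin.csv
```

Both files end up with the same single line of text and no trailing newline.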

5. Using Perl

Perl is a language that has a great set of features for text processing.

We’ll use the Perl interpreter in a sed-like way:

$ perl -pe 's/\n//' some_names.txt > some_names.csv

Let’s take a look at how this command works:

  • -p tells Perl to wrap our program in an implicit loop that reads the input line by line and prints each line after the script has run
  • -e tells Perl to use the next string as a one-line script
  • 's/\n//' is the script that instructs Perl to remove the \n character

And now, let’s review our CSV file:

$ cat some_names.csv
Martha,Charlotte,Diego,William,

6. Using paste

The paste program is a utility that merges lines of files, but we can also use it to remove newlines.

Let’s try with the next one-liner:

$ paste -sd "" some_names.txt > some_names.csv 

Now, let’s check our CSV file:

$ cat some_names.csv
Martha,Charlotte,Diego,William,

We’re able to achieve this because paste has the parameters -s, which pastes one file at a time leaving each one as a row, and -d, which allows us to define the empty string as the delimiter.

With these two paste options, we can get what we want without mentioning the newline.
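Some paste implementations reject an empty delimiter list. POSIX defines '\0' in the delimiter list to mean the empty string, so a more portable spelling of the same idea could look like this sketch:

```shell
# Recreate the example input: one "name," entry per line.
printf '%s\n' 'Martha,' 'Charlotte,' 'Diego,' 'William,' > some_names.txt

# -s joins all lines of the file into one; '\0' means "no delimiter" (not a NUL byte).
paste -s -d '\0' some_names.txt > some_names.csv
```

Unlike tr -d '\n', paste still terminates its single output line with a newline.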

7. Using sed

When we talk about processing text, the sed stream editor usually comes to mind, regardless of the problem.

The script 's/pattern/replacement/' is commonly used in sed.

Let’s use it to replace the line endings and see what happens:

$ sed 's/\n//g' some_names.txt
Martha,
Charlotte,
Diego,
William,

And there’s no change because sed reads one line at a time, and then the newline is always stripped off before it’s placed into the pattern space.

Let’s try with this new one-liner:

$ sed ':label1 ; N ; $! b label1 ; s/\n//g' some_names.txt > some_names.csv

Next, let’s see what’s inside our CSV file:

$ cat some_names.csv
Martha,Charlotte,Diego,William,

Now we have what we wanted.

Let’s break down each section (separated by the semicolon) of the script to understand how it works:

  • :label1 creates a label named label1
  • N tells sed to append the next line into the pattern space
  • $! b label1 tells sed to branch (go to) our label label1 if not the last line
  • s/\n//g removes the \n character from what is in the pattern space

In other words, with all these pieces together, we construct a loop that finishes when sed is in the last line of the input.
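The same loop can be spelled with one -e option per command, which some find easier to read and which avoids relying on semicolons after a label. This is a sketch of an equivalent form rather than a different technique:

```shell
# Recreate the example input: one "name," entry per line.
printf '%s\n' 'Martha,' 'Charlotte,' 'Diego,' 'William,' > some_names.txt

# :label1     define the loop label
# N           append the next input line to the pattern space
# $!b label1  if this is not the last line, jump back to the label
# s/\n//g     once everything is collected, delete the embedded newlines
sed -e ':label1' -e 'N' -e '$!b label1' -e 's/\n//g' some_names.txt > some_names.csv
```

Because sed prints the pattern space followed by a newline, the output file ends with a single newline.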

8. Using a Bash Command-Line Script

Bash is installed in most Linux distributions, so we could try to use it to get what we want.

One option that we could use is a while loop:

$ while IFS= read -r row; do printf '%s' "$row"; done < some_names.txt > some_names.csv

Here, in the while loop and with the help of the Bash built-in read, we read the content of the file some_names.txt, and then we assign each line to the variable row.

After that, the built-in printf prints that line without the newline. And finally, we redirect the output to our CSV file.

We can achieve the same with the help of the readarray built-in, the IFS variable, and the parameter expansion mechanism:

$ OLDIFS=$IFS ; IFS='' ; readarray -t file_array < some_names.txt ; echo "${file_array[*]}" > some_names.csv ; IFS=$OLDIFS

Bash is full of tricks, and we’re using a few of them here. Let’s understand it section by section:

  • OLDIFS=$IFS: We save the current value of the IFS variable in the OLDIFS variable.
  • IFS='': We set IFS to the empty string.
  • With readarray -t file_array, we assign the content of the some_names.txt file to the array file_array, removing the newline from each row.
  • With "${file_array[*]}", Bash expands every value of the array file_array into a single string, separated by the first character of the IFS variable (here, nothing).
  • Finally, we restore the IFS variable.

But we can be a little trickier using a subshell:

$ ( readarray -t file_array < some_names.txt; IFS=''; echo "${file_array[*]}" > some_names.csv; )

This is equivalent while keeping our current IFS variable safe, thanks to the fact that variables inside the subshell aren’t visible outside of it.

It’s worth mentioning that the IFS variable is special. The default value of the Bash IFS variable is <space><tab><newline>, or $' \t\n'.
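The joining rule is easy to observe with "$*", which follows the same convention as the [*] form of an array expansion: the pieces are glued together with the first character of IFS, or with nothing when IFS is empty. The helper name join_with below is hypothetical, and the sketch sticks to POSIX shell features:

```shell
# Hypothetical helper: join the remaining arguments using the first argument as separator.
# The body is wrapped in ( ), so it runs in a subshell and the caller's IFS is untouched.
join_with() (
    IFS=$1
    shift
    printf '%s\n' "$*"
)

join_with ',' Martha Charlotte Diego William   # prints Martha,Charlotte,Diego,William
join_with '' Martha, Charlotte,                # prints Martha,Charlotte,
```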

Finally, let’s see what is now inside our CSV file:

$ cat some_names.csv
Martha,Charlotte,Diego,William,

9. Using the Vim Editor

In Linux, we have many editor flavors, but let’s focus on one of the most famous.

Vim (Vi Improved) is an editor equipped with a lot of useful utilities.

Let’s open our example file into the Vim editor:

$ vim some_names.txt
Martha,
Charlotte,
Diego,
William,

Next, let’s type the command :%s/\n// and press Enter.

Right now, we have something like this:

Martha,Charlotte,Diego,William,

Now, let’s save the content into a file called some_names.csv with :w some_names.csv.

To finish this section, let’s understand what happened. With the command s/\n//, we remove every \n character. And with the % sign, Vim applies this in all the lines of the file.

10. Conclusion

Removing newlines leads us to think about strategies beyond those that delete common characters. In this article, we’ve reviewed some of these strategies using commands like tr, awk, Perl, paste, sed, Bash, and the Vim editor.
