Linux remove trailing spaces

How to remove extra spaces in bash?

Do you really want to remove the whitespace in HEAD, or just provide the expansion of $HEAD without whitespace to another command? The shell provides better tools for controlling the output of an expansion than for mutating a variable in place.


or maybe you want to save it in a variable:

To remove leading and trailing whitespaces, do this:

NEWHEAD=$(echo "$HEAD" | tr -s " ")
NEWHEAD=${NEWHEAD# }
NEWHEAD=${NEWHEAD% }

What’s the point of calling tr when you don’t quote $HEAD , so bash does word splitting and therefore collapses the whitespace on itself?

$ echo "$HEAD" | awk '$1=$1'
how to remove extra spaces

Take advantage of the word-splitting effects of not quoting your variable

$ HEAD="    how to  remove    extra        spaces                     "
$ set -- $HEAD
$ HEAD=$*
$ echo ">>>$HEAD<<<"
>>>how to remove extra spaces<<<

If you don't want to use the positional parameters, use an array
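The array variant might look like the following sketch. The name `words` is chosen here for illustration, and the default IFS is assumed.

```shell
HEAD="   how to  remove    extra        spaces     "
# Unquoted expansion splits on IFS; the array holds the pieces
# without touching the positional parameters.
words=( $HEAD )
# "${words[*]}" joins the elements with the first character of IFS
# (a space by default).
HEAD="${words[*]}"
printf '>>>%s<<<\n' "$HEAD"
```

Filename expansion still applies to the unquoted `$HEAD` here, so a value containing `*` or `?` needs extra care.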

One dangerous side-effect of not quoting is that filename expansion will be in play. So turn it off first, and re-enable it after:
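A minimal sketch of that guard, assuming globbing was enabled beforehand (so `set +f` restores the previous state); `words` is again an illustrative name.

```shell
HEAD="   how to  remove  *  extra    spaces   "
set -f                 # disable filename expansion so '*' stays literal
words=( $HEAD )        # word splitting still happens
set +f                 # re-enable filename expansion
HEAD="${words[*]}"
printf '>>>%s<<<\n' "$HEAD"
```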

Note that the behavior depends on the value of IFS. With the default IFS, namely $' \t\n', tabs and newlines are also replaced by a space.

This horse isn't quite dead yet: Let's keep beating it!*

Read into array

Other people have mentioned read , but since using unquoted expansion may cause undesirable expansions all answers using it can be regarded as more or less the same. You could do
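A sketch of the read-based approach, assuming the default IFS and passing `-r` and `-d ''` before `-a`; the array name `words` is illustrative and not taken from the original answer.

```shell
HEAD="   how to  remove    extra        spaces     "
# -r     : do not treat backslashes as escapes
# -d ''  : read up to NUL/EOF, so newlines count as separators too
# read returns non-zero when it hits EOF, hence '|| true'
read -r -d '' -a words <<< "$HEAD" || true
HEAD="${words[*]}"
printf '"%s"\n' "$HEAD"
```

Because `read` does the splitting itself, no unquoted expansion is involved and filename expansion never comes into play.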

Extended Globbing with Parameter Expansion

$ shopt -s extglob
$ HEAD="${HEAD//+( )/ }" HEAD="${HEAD# }" HEAD="${HEAD% }"
$ printf '"%s"\n' "$HEAD"
"how to remove extra spaces"

*No horses were actually harmed – this was merely a metaphor for getting six+ diverse answers to a simple question.

This is the only answer that really makes sense here. It's sad that ugly hacks and semi-broken answers are upvoted instead…

In honor of this answer (which avoids echo, shell expansion and all the other pitfalls all the other answers went into), I deleted my own incorrect answer.

Two more notes: read needs -r to prevent processing backslash escapes, and -d "" to normalize newlines rather than truncating at first newline. Both options need to be passed before -a .

You introduced an error when applying @MichałGórny's suggestion: read -rd'' -a HEAD will not work, it needs to be read -r -d '' -a HEAD or read -rd '' -a HEAD . Also, IFS should be mentioned, as this will only work if IFS contains the space character.

Here's how I would do it with sed:

string='    how to  remove    extra        spaces                     '
echo "$string" | sed -e 's/  */ /g' -e 's/^ *//;s/ *$//'
=> how to remove extra spaces   # (no spaces at beginning or end)

The first sed expression replaces each run of one or more spaces with a single space, and the second expression removes any trailing or leading spaces.

echo -e " abc \t def "|column -t|tr -s " "

tr -s " " will squeeze multiple spaces to single space

BTW, to see the whole output you can use cat - -A, which shows all special characters, including tabs and end-of-line markers:

echo -e " abc \t def "|column -t|tr -s " "|cat - -A

Whitespace can take the form of both spaces and tabs. Although they are non-printing characters and invisible to us, sed and other tools treat them as different characters and only operate on what you ask for. I.e., if you tell sed to delete some number of spaces, it will do so, but the expression will not match tabs. The inverse is also true: give sed a tab and it will not match spaces, even if there are as many of them as a tab is wide.

A more extensible solution that will work for removing either/both additional space in the form of spaces and tabs (I've tested mixing both in your specimen variable) is:
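One possibility, as a sketch rather than the original poster's exact command, is to match the POSIX class `[[:blank:]]` (space or tab). The variable names are illustrative.

```shell
string=$'  how \t to   remove \t\t extra   spaces \t '
# First expression: collapse every run of spaces/tabs to one space.
# Second and third: drop the single space left at either end.
result=$(printf '%s\n' "$string" |
    sed -e 's/[[:blank:]][[:blank:]]*/ /g' -e 's/^ //' -e 's/ $//')
printf '>>>%s<<<\n' "$result"
```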

or we can tighten-up @Frontear 's excellent suggestion of using xargs without the tr :

However, note that xargs would also remove newlines. So if you were to cat a file and pipe it to xargs , all the extra space- including newlines- are removed and everything put on the same line ;-).
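The xargs behaviour described above can be seen in a small sketch; the variable names are made up for the example.

```shell
HEAD="   how to  remove    extra        spaces     "
# With no command given, xargs runs echo on the words it reads,
# so runs of blanks collapse to single spaces.
trimmed=$(printf '%s\n' "$HEAD" | xargs)
# Newlines are word separators too, so separate lines get joined.
joined=$(printf 'line one\n   line   two\n' | xargs)
printf '%s\n' "$trimmed" "$joined"
```

Keep in mind that xargs also interprets quotes and backslashes in its input, so this shortcut is only safe for text that contains neither.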

Both of the foregoing achieved your desired result in my testing.


How to remove trailing whitespaces with sed?

I have a simple shell script that removes trailing whitespace from a file. Is there any way to make this script more compact (without creating a temporary file)?

sed 's/[ \t]*$//' $1 > $1__.tmp
cat $1__.tmp > $1
rm $1__.tmp

I used the knowledge I learned from this question to create a shell script for recursively removing trailing whitespace.

Your solution is actually better when using MinGW due to a bug in sed on Windows: stackoverflow.com/questions/14313318/…

Note that using cat to overwrite the original file rather than mv will actually replace the data in the original file (ie, it will not break hard links). Using sed -i as proposed in many solutions will not do that. IOW, just keep doing what you're doing.


You can use the in place option -i of sed for Linux and Unix:
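A sketch of the in-place form, assuming GNU sed (where `-i` takes no separate argument and `\t` inside a bracket expression means a tab). The scratch file exists only for the demonstration.

```shell
file=$(mktemp)                          # scratch file for the demo
printf 'foo  \nbar\t\nbaz\n' > "$file"
# GNU sed: edit the file in place, deleting trailing spaces and tabs.
sed -i 's/[ \t]*$//' "$file"
result=$(cat "$file")
rm -f "$file"
printf '%s\n' "$result"
```

In a script that receives the file name as its first argument, the command would be applied to `"$1"` instead of `"$file"`.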

Be aware the expression will delete trailing t's on OSX (you can use gsed to avoid this problem). It may delete them on BSD too.

If you don't have gsed, here is the correct (but hard-to-read) sed syntax on OSX:

Three single-quoted strings ultimately become concatenated into a single argument/expression. There is no concatenation operator in bash, you just place strings one after the other with no space in between.

The $'\t' resolves as a literal tab-character in bash (using ANSI-C quoting), so the tab is correctly concatenated into the expression.
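The concatenation described above can be sketched as follows. The example runs through a pipe so that it behaves the same with GNU and BSD sed; the in-place variant in the comment assumes BSD/macOS sed.

```shell
# 's/[ '  +  $'\t'  +  ']*$//' are adjacent, so the shell hands sed a single
# argument that contains a real tab character; no \t escape is involved.
result=$(printf 'ends with t  \t\n' | sed 's/[ '$'\t'']*$//')
printf '>>>%s<<<\n' "$result"
# In-place form for BSD/macOS sed (empty backup suffix after -i):
#   sed -i '' 's/[ '$'\t'']*$//' "$1"
```

The trailing letter t survives because the bracket expression now holds a space and a tab, not the characters `\` and `t`.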

"sed: Not a recognized flag: i –" This happens on OSX. You need to add an extension for the backup file after -i on Macs. e.g.: sed -i .bak 's/[ \t]*$//' $1

@GoodPerson If you weren't kidding, you likely forgot to escape the t 🙂 \t is a tab, for those who may not already know.

@SeanAllred was not kidding: it's utterly broken unless you happen to be using GNU sed (which is broken in so many other ways)

At least on Mountain Lion, Viktor's answer will also remove the character 't' when it is at the end of a line. The following fixes that issue:

codaddict's answer has the same problem on OS X (now macOS). This is the only solution on this platform.

Thanks to codaddict for suggesting the -i option.

The following command solves the problem on Snow Leopard

Like @acrollet says, you cannot use \t with any sed other than GNU sed; it gets interpreted as a literal letter t. The command only appears to work, probably because your file has no tabs in the trailing whitespace and no t at the end of a line. Using '' without specifying a backup suffix is not recommended.

If the resolution is indicated for Snow Leopard only, maybe the question should be 'how to remove trailing whitespace on macOS.'

It is best to also quote $1:
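A sketch of why the quotes matter, using a hypothetical file name that contains a space. The helper `strip_trailing` is invented for this example and follows the temp-file approach from the question, not any particular answer.

```shell
dir=$(mktemp -d)
file="$dir/my notes.txt"                # hypothetical name with a space in it
printf 'hello   \nworld\t\n' > "$file"

strip_trailing() {
    # "$1" stays a single word even when the path contains spaces;
    # an unquoted $1 would be split into two arguments.
    sed 's/[[:blank:]]*$//' "$1" > "$1.tmp" &&
        cat "$1.tmp" > "$1" &&
        rm -f "$1.tmp"
}

strip_trailing "$file"
result=$(cat "$file")
rm -rf "$dir"
printf '%s\n' "$result"
```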

var1=$'\t\t Test String trimming   '   # $'...' so that \t is a real tab
echo "$var1"
Var2=$(echo "${var1}" | sed 's/^[[:space:]]*//;s/[[:space:]]*$//')
echo "$Var2"

Hey, that's just what I needed! The other sed solutions posted had issue integrating with a piped (and piped and piped. ) variable assignment in my bash script, but yours worked out of the box.

I have a script in my .bashrc that works under OSX and Linux (bash only !)

function trim_trailing_space() {
  if [[ $# -eq 0 ]]; then
    echo "$FUNCNAME will trim (in place) trailing spaces in the given file (remove unwanted spaces at end of lines)"
    echo "Usage :"
    echo "$FUNCNAME file"
    return
  fi
  local file=$1
  unamestr=$(uname)
  if [[ $unamestr == 'Darwin' ]]; then
    #specific case for Mac OSX
    sed -E -i '' 's/[[:space:]]*$//' "$file"
  else
    sed -i 's/[[:space:]]*$//' "$file"
  fi
}

SRC_FILES_EXTENSIONS="js|ts|cpp|c|h|hpp|php|py|sh|cs|sql|json|ini|xml|conf"

function find_source_files() {
  if [[ $# -eq 0 ]]; then
    echo "$FUNCNAME will list sources files (having extensions $SRC_FILES_EXTENSIONS)"
    echo "Usage :"
    echo "$FUNCNAME folder"
    return
  fi
  local folder=$1
  unamestr=$(uname)
  if [[ $unamestr == 'Darwin' ]]; then
    #specific case for Mac OSX
    find -E "$folder" -iregex '.*\.('$SRC_FILES_EXTENSIONS')'
  else
    #Rhahhh, lovely
    local extensions_escaped=$(echo $SRC_FILES_EXTENSIONS | sed s/\|/\\\\\|/g)
    #echo "extensions_escaped:$extensions_escaped"
    find "$folder" -iregex '.*\.\('$extensions_escaped'\)$'
  fi
}

function trim_trailing_space_all_source_files()

For those who look for efficiency (many files to process, or huge files), using the + repetition operator instead of * makes the command more than twice as fast.

sed -Ei 's/[ \t]+$//' "$1"
sed -i 's/[ \t]\+$//' "$1" # The same without extended regex

I also quickly benchmarked something else: using [ \t] instead of [[:space:]] also significantly speeds up the process (GNU sed v4.4):

sed -Ei 's/[ \t]+$//' "$1"
real    0m0,335s
user    0m0,133s
sys     0m0,193s

sed -Ei 's/[[:space:]]+$//' "$1"
real    0m0,838s
user    0m0,630s
sys     0m0,207s

sed -Ei 's/[ \t]*$//' "$1"
real    0m0,882s
user    0m0,657s
sys     0m0,227s

sed -Ei 's/[[:space:]]*$//' "$1"
real    0m1,711s
user    0m1,423s
sys     0m0,283s

In the specific case of sed , the -i option that others have already mentioned is far and away the simplest and sanest one.

In the more general case, sponge , from the moreutils collection, does exactly what you want: it lets you replace a file with the result of processing it, in a way specifically designed to keep the processing step from tripping over itself by overwriting the very file it's working on. To quote the sponge man page:

sponge reads standard input and writes it out to the specified file. Unlike a shell redirect, sponge soaks up all its input before writing the output file. This allows constructing pipelines that read from and write to the same file.
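A sketch of how sponge could replace the temporary file from the question. Since moreutils may not be installed, the example falls back to the temp-file approach when `sponge` is missing; the scratch file is only for the demonstration.

```shell
file=$(mktemp)
printf 'foo  \nbar\t\n' > "$file"

if command -v sponge >/dev/null 2>&1; then
    # sponge soaks up all of sed's output before it opens "$file" for writing
    sed 's/[[:blank:]]*$//' "$file" | sponge "$file"
else
    # fallback without moreutils: temporary file, as in the question
    sed 's/[[:blank:]]*$//' "$file" > "$file.tmp" &&
        cat "$file.tmp" > "$file" &&
        rm -f "$file.tmp"
fi

result=$(cat "$file")
rm -f "$file"
printf '%s\n' "$result"
```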


How do I trim leading and trailing whitespace from each line of some output?

I would like to remove all leading and trailing spaces and tabs from each line in an output. Is there a simple tool like trim I could pipe my output into? Example file:

test space at back 
 test space at front
TAB at end	
	TAB at front
sequence of some    space in the middle
some empty lines with differing TABS and spaces:

 test space at both ends 

To anyone looking here for a solution to remove newlines, that is a different problem. By definition a newline creates a new line of text. Therefore a line of text cannot contain a newline. The question you want to ask is how to remove a newline from the beginning or end of a string: stackoverflow.com/questions/369758, or how to remove blank lines or lines that are just whitespace: serverfault.com/questions/252921


awk '{$1=$1};1'

would trim leading and trailing space or tab characters [1] and also squeeze sequences of tabs and spaces into a single space.

That works because when you assign something to one of the fields, awk rebuilds the whole record (as printed by print) by joining all fields ($1, ..., $NF) with OFS (space by default).

To also remove blank lines, change it to awk '{$1=$1};NF' (where NF tells awk to only print the records for which the Number of Fields is non-zero). Do not do awk '$1=$1' as sometimes suggested, as that would also remove lines whose first field is any representation of 0 supported by awk (0, 00, -0e+12, ...).

[1] (and possibly other blank characters depending on the locale and the awk implementation)

The only thing I don't like about this approach is that you lose repeating spaces within the line. For example, echo -e 'foo \t bar' | awk '{$1=$1};1'

The command can be condensed like so if you're using GNU sed:

$ sed 's/^[ \t]*//;s/[ \t]*$//'

Example

Here's the above command in action.

$ echo -e " \t blahblah \t " | sed 's/^[ \t]*//;s/[ \t]*$//'
blahblah

You can use hexdump to confirm that the sed command is stripping the desired characters correctly.

$ echo -e " \t blahblah \t " | sed 's/^[ \t]*//;s/[ \t]*$//' | hexdump -C
00000000  62 6c 61 68 62 6c 61 68  0a                       |blahblah.|
00000009

Character classes

You can also use character class names instead of literally listing the sets like this, [ \t]:

$ sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//'

Example

$ echo -e " \t blahblah \t " | sed 's/^[[:blank:]]*//;s/[[:blank:]]*$//' 

Most of the GNU tools that make use of regular expressions (regex) support these classes. They are listed here with their equivalents in the typical C locale of an ASCII-based system (the equivalents hold only there).

[[:alnum:]]  - [A-Za-z0-9]      Alphanumeric characters
[[:alpha:]]  - [A-Za-z]         Alphabetic characters
[[:blank:]]  - [ \t]            Space or tab characters only
[[:cntrl:]]  - [\x00-\x1F\x7F]  Control characters
[[:digit:]]  - [0-9]            Numeric characters
[[:graph:]]  - [!-~]            Printable and visible characters
[[:lower:]]  - [a-z]            Lower-case alphabetic characters
[[:print:]]  - [ -~]            Printable (non-Control) characters
[[:punct:]]  - [!-/:-@[-`{-~]   Punctuation characters
[[:space:]]  - [ \t\v\f\n\r]    All whitespace characters
[[:upper:]]  - [A-Z]            Upper-case alphabetic characters
[[:xdigit:]] - [0-9a-fA-F]      Hexadecimal digit characters

Using these instead of literal sets always seems like a waste of space, but if you're concerned with your code being portable, or having to deal with alternative character sets (think international), then you'll likely want to use the class names instead.
