Linux replace with regex

Find and replace text within a file using commands

@Akiva If you include regex special characters in your search, sed will interpret them as regex syntax rather than as literal text. Add the -r flag if you want to use extended REs instead.

@mcExchange If it’s specifically the / character that you need to match, you can just use some other character as the separator (e.g. 's_old/text_new/text_g'). Otherwise, you can put a \ before any of $ * . [ \ ^ to get the literal character.
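A minimal sketch of both suggestions, using made-up sample strings (the variable names are only for illustration):

```shell
# Use _ as the separator so the slashes in the pattern need no escaping.
with_underscore=$(printf '%s\n' 'path old/text here' | sed 's_old/text_new/text_g')

# Escape the dot so it matches a literal "." rather than any character.
literal_dot=$(printf '%s\n' 'a.b axb' | sed 's/a\.b/X/g')

printf '%s\n' "$with_underscore" "$literal_dot"
```

The first command prints the line with old/text replaced; the second leaves axb alone because the escaped dot no longer matches x.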

@BrianZ As far as the file system is concerned, the output of sed is a new file with the same name. It’s one of the commonly reported bugs that are not bugs.

The OSX command sed -i '.bak' 's/original/new/g' file.txt can also be run with a zero-length extension, sed -i '' 's/original/new/g' file.txt, which will generate no backup.

MacOS users will have to add '' after -i as a parameter for -i (ed.gs/2016/01/26/os-x-sed-invalid-command-code) so that the file will be overwritten.

There’s a multitude of ways to achieve it. Depending on the complexity of the string replacement and on the tools the user is familiar with, some methods may be preferable to others.

In this answer I am using a simple input.txt file, which you can use to test all the examples provided here. The file contents:

    roses are red , violets are blue
    This is an input.txt and this doesn't rhyme

BASH

Bash isn’t really meant for text processing, but simple substitutions can be done via parameter expansion; in particular, here we can use the simple structure ${parameter/old_string/new_string}.

    #!/bin/bash
    # Print every line of input.txt, replacing "blue" with "azure" where it occurs
    while IFS= read -r line
    do
        case "$line" in
            *blue*) printf "%s\n" "${line/blue/azure}" ;;
            *) printf "%s\n" "$line" ;;
        esac
    done < input.txt

This small script doesn't do in-place replacement, meaning that you would have to save the new text to a new file and then either get rid of the old file or overwrite it with mv new.txt old.txt
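As a small sketch of the expansion itself (the sample string is made up, and this needs bash rather than plain sh): a single slash replaces only the first match, while a double slash replaces every match.

```shell
#!/bin/bash
line='blue sky, blue sea'

# One slash: only the first "blue" is replaced.
first_only="${line/blue/azure}"

# Two slashes: every "blue" is replaced.
all_matches="${line//blue/azure}"

printf '%s\n' "$first_only" "$all_matches"
```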

AWK

AWK, being a text processing utility, is quite appropriate for such a task. It can do simple replacements and much more advanced ones based on regular expressions. It provides two functions: sub() and gsub(). The first one replaces only the first occurrence, while the second replaces all occurrences in the whole string. For instance, if we have the string one potato two potato, this would be the result:

    $ echo "one potato two potato" | awk '{gsub(/potato/,"banana")}1'
    one banana two banana
    $ echo "one potato two potato" | awk '{sub(/potato/,"banana")}1'
    one banana two potato

AWK can take an input file as an argument, so doing the same thing with input.txt is just as easy.
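A minimal sketch of that, recreating the sample input.txt in a temporary directory (the directory and variable names are assumptions made for the example):

```shell
awk_file_dir=$(mktemp -d)
printf '%s\n' 'roses are red , violets are blue' \
    "This is an input.txt and this doesn't rhyme" > "$awk_file_dir/input.txt"

# sub() replaces the first match on each line; the trailing 1 prints every line.
awk_result=$(awk '{sub(/blue/,"azure")}1' "$awk_file_dir/input.txt")
printf '%s\n' "$awk_result"
```

The file itself is left unchanged; only the printed output contains azure.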

Depending on the version of AWK you have, it may or may not have in-place editing, hence the usual practice is to save the new text to a temporary file and then replace the original with it. For instance, something like this:

    awk '{gsub(/blue/,"azure")}1' input.txt > temp.txt && mv temp.txt input.txt

SED

Sed is a stream editor that processes its input line by line. It also uses regular expressions, but for simple substitutions it's sufficient to run sed 's/blue/azure/' input.txt

What's good about this tool is that it has in-place editing, which you can enable with -i flag.
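A minimal sketch of in-place editing, using a throwaway copy of the sample line (the temporary directory is an assumption for the example). Attaching a suffix to -i keeps a backup of the original:

```shell
sed_dir=$(mktemp -d)
printf '%s\n' 'roses are red , violets are blue' > "$sed_dir/input.txt"

# -i.bak edits the file in place and keeps the original as input.txt.bak
sed -i.bak 's/blue/azure/' "$sed_dir/input.txt"

cat "$sed_dir/input.txt"      # edited file
cat "$sed_dir/input.txt.bak"  # untouched backup
```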

Perl

Perl is another tool which is often used for text processing, but it's a general-purpose language and is used in networking, system administration, desktop apps, and many other places. It borrowed a lot of concepts and features from other languages such as C, sed, awk, and others. A simple substitution can be done like so:

perl -pe 's/blue/azure/' input.txt 

Like sed, perl also has the -i flag.

Python

This language is very versatile and is also used in a wide variety of applications. It has a lot of functions for working with strings, among which is replace(), so if you have a variable like var="Hello World", you could do var.replace("Hello","Good Morning")

A simple way to read a file and replace a string in it would be:

    python -c "import sys; sys.stdout.write(sys.stdin.read().replace('blue', 'azure'))" < input.txt

With Python, however, you also need to write the output to a new file, which you can do from within the script itself. For instance, here's a simple one:

    #!/usr/bin/env python
    import sys
    import os
    import shutil
    import tempfile

    # mkstemp() returns an open file descriptor and the path of the temp file
    fd, tmp_path = tempfile.mkstemp()
    with open(sys.argv[1]) as fd1, os.fdopen(fd, 'w') as fd2:
        for line in fd1:
            line = line.replace('blue', 'azure')
            fd2.write(line)
    # shutil.move also works when the temp file is on a different filesystem
    shutil.move(tmp_path, sys.argv[1])

This script is to be called with input.txt as command-line argument. The exact command to run python script with command-line argument would be

$ python ./myscript.py input.txt 

Of course, make sure that ./myscript.py is in your current working directory. To run it directly as ./myscript.py input.txt instead, first make it executable with chmod +x ./myscript.py

Python also supports regular expressions: the re module provides the re.sub() function, which can be used for more advanced replacements.

@TapajitDey Yes, tr is another great tool, but note that it is for replacing sets of characters (for example, tr abc cde would translate a to c, b to d, and c to e). It is a bit different from replacing whole words as with sed or python.
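A small sketch of that difference, on a made-up input string:

```shell
# tr maps characters position by position: a->c, b->d, c->e
tr_result=$(printf '%s\n' 'a cab' | tr abc cde)

# sed replaces the whole word instead
sed_result=$(printf '%s\n' 'a cab' | sed 's/cab/taxi/')

printf '%s\n' "$tr_result" "$sed_result"
```

tr rewrites every a, b and c wherever it appears, while sed only touches the word cab.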

There are a number of different ways to do this. One is using sed and Regex. SED is a Stream Editor for filtering and transforming text. One example is as follows:

    marco@imacs-suck: ~$ echo "The slow brown unicorn jumped over the hyper sleeping dog" > orly
    marco@imacs-suck: ~$ sed s/slow/quick/ < orly > yarly
    marco@imacs-suck: ~$ cat yarly
    The quick brown unicorn jumped over the hyper sleeping dog

Another way which may make more sense than < strin and >strout is with pipes!

    marco@imacs-suck: ~$ cat yarly | sed s/unicorn/fox/ | sed s/hyper/lazy/ > nowai
    marco@imacs-suck: ~$ cat nowai
    The quick brown fox jumped over the lazy sleeping dog

Indeed this can be reduced further: sed -i'.bak' -e 's/unicorn/fox/g;s/hyper/brown/g' yarly will take the file yarly and make the 2 changes in place whilst keeping a backup. Using time bash -c "$COMMAND" to time it suggests that this version is ~5 times faster.

You can use Vim in Ex mode, with a command of the form ex -sc '%s/OLD/NEW/g|x' file, where:

  1. % select all lines
  2. s substitute
  3. g replace all instances in each line
  4. x write if changes have been made (they have) and exit
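The pieces listed above combine into a single command. A minimal sketch, assuming ex (shipped with Vim) is installed and using a made-up sample file:

```shell
ex_dir=$(mktemp -d)
printf '%s\n' 'blue sky, blue sea' > "$ex_dir/input.txt"

if command -v ex >/dev/null 2>&1; then
    # -s: silent batch mode; -c: run the Ex command; x writes the file and exits
    ex -s -c '%s/blue/azure/g|x' "$ex_dir/input.txt"
    cat "$ex_dir/input.txt"
else
    echo "ex is not installed; skipping"
fi
```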

Through awk's gsub command:

    awk '{gsub(/1/,"0")}1' file

In the above example, all the 1's are replaced by 0's irrespective of the column in which they are located.

If you want to do a replacement on a specific column, pass that column as the third argument:

    awk '{gsub(/1/,"0",$1)}1' file

It replaces 1 with 0 in the first column only.
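To see the difference between the two forms, here is a small sketch run on made-up two-line data (the file name data.txt is an assumption for the example):

```shell
awk_dir=$(mktemp -d)
printf '%s\n' '1 1 0' '0 1 1' > "$awk_dir/data.txt"

# Replace every 1 with 0, in any column.
all_columns=$(awk '{gsub(/1/,"0")}1' "$awk_dir/data.txt")

# The third argument limits gsub to the first field only.
first_column=$(awk '{gsub(/1/,"0",$1)}1' "$awk_dir/data.txt")

printf '%s\n' "$all_columns" "---" "$first_column"
```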

    $ echo 'foo' | perl -pe 's/foo/bar/g'
    bar

sed is the stream editor, in that you can use | (pipe) to send standard streams (STDIN and STDOUT specifically) through sed and alter them programmatically on the fly, making it a handy tool in the Unix philosophy tradition; but it can also edit files directly, using the -i parameter mentioned below.
Consider the following:

sed -i -e 's/few/asd/g' hello.txt 

s/ is used to substitute the found expression few with asd :

/g stands for "global", meaning to do this for the whole line. If you leave off the /g (with s/few/asd/ , there always needs to be three slashes no matter what) and few appears twice on the same line, only the first few is changed to asd :

The few men, the few women, the brave.

The asd men, the few women, the brave.
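The same before-and-after can be reproduced from the command line. A minimal sketch that runs both forms on the sample sentence:

```shell
line='The few men, the few women, the brave.'

# Without g: only the first "few" on the line is replaced.
without_g=$(printf '%s\n' "$line" | sed 's/few/asd/')

# With g: every "few" on the line is replaced.
with_g=$(printf '%s\n' "$line" | sed 's/few/asd/g')

printf '%s\n' "$without_g" "$with_g"
```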

This is useful in some circumstances, like altering special characters at the beginnings of lines (for instance, replacing the greater-than symbols some people use to quote previous material in email threads with a horizontal tab while leaving a quoted algebraic inequality later in the line untouched), but in your example where you specify that anywhere few occurs it should be replaced, make sure you have that /g .

The command uses two separate options (flags), -i and -e:

-i option is used to edit in place on the file hello.txt .

-e option indicates the expression/command to run, in this case s/ .

Note: It's important that you use -i -e to search/replace. If you do -ie , you create a backup of every file with the letter 'e' appended.
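A small sketch of that pitfall, run in a throwaway directory (the directory name is an assumption for the example):

```shell
ie_dir=$(mktemp -d)
printf '%s\n' 'few' > "$ie_dir/hello.txt"

# sed reads -ie as -i with the backup suffix "e", so this leaves hello.txte behind.
sed -ie 's/few/asd/g' "$ie_dir/hello.txt"

ls "$ie_dir"
```

The listing shows both hello.txt (edited) and the unintended backup hello.txte.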

Examples: to replace all occurrences of [logdir', ''] (without the []) with [logdir', os.getcwd()] in all files returned by the locate command, do:

locate tensorboard/program.py | xargs sed -i -e "s/old_text/NewText/g" 
locate tensorboard/program.py | xargs sed -i -e "s/logdir', ''/logdir', os.getcwd()/g" 

where [tensorboard/program.py] is the file to search for.

Hi. Your choice of strings ( logdir', '' -> /logdir', os.getcwd() ) makes this answer hard to parse. Also, it's worth specifying that your answer first locates the files to use sed on, because it's not part of the question.

I wrote this answer for everyone who uses tensorboard with keras and wants to change the command from tensorboard --logdir='/path/to/log/folder/' to just tensorboard when already in the logs folder. It is very convenient.

Finding and replacing across many files

In order to achieve as close to lightning speed or warp velocity as possible when doing find and replace across multiple files (maybe even thousands or millions) in massive filesystems such as huge code repos, I recommend using Ripgrep ( rg ) which is awesome and incredibly fast. Unfortunately, it doesn't support find and replace in files, and according to the author probably never will (update: definitely never will), so we have to use some work-arounds.

Ripgrep-based Method 1 (the easy, but more-limited 1-liner)

Use Ripgrep ( rg ) to find files containing the matches, then pipe a list of those files to sed to do the text replacement in those files:

    # Replace `foo` with `bar` in all matching files in the current
    # directory and down
    rg 'foo' --files-with-matches | xargs sed -i 's|foo|bar|g'
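File names containing spaces can trip up a plain xargs pipeline. A hedged variant of the same idea passes NUL-separated names instead; this sketch assumes GNU sed and xargs, runs in a throwaway directory with made-up files, and falls back to GNU grep only so that it still runs where rg is not installed:

```shell
rg_dir=$(mktemp -d)
printf '%s\n' 'foo one' > "$rg_dir/a file.txt"
printf '%s\n' 'nothing here' > "$rg_dir/b.txt"

if command -v rg >/dev/null 2>&1; then
    # -l lists matching files; -0 separates the names with NUL for xargs -0
    rg -l -0 'foo' "$rg_dir" | xargs -0 sed -i 's|foo|bar|g'
else
    # Fallback (assumption, not part of the original method): GNU grep equivalent
    grep -rlZ 'foo' "$rg_dir" | xargs -0 sed -i 's|foo|bar|g'
fi

cat "$rg_dir/a file.txt" "$rg_dir/b.txt"
```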

Here is the 2nd work-around: use rgr (Ripgrep Replace), a wrapper script I have written around Ripgrep which adds the -R option to replace contents on your disk. Installation is simple: just follow the instructions at the top of that file. Usage is super simple too:

    # Replace `foo` with `bar` in all matching files in the current
    # directory and down
    rgr 'foo' -R bar

Since it's a wrapper around rg , it also provides access to all of Ripgrep's other options and features. Here is the full help menu at this moment, with more usage examples:

    rgr ('rgr') version 0.1.0

    RipGrep Replace (rgr). This program is a wrapper around RipGrep ('rg') in order to allow
    some extra features, such as find and replace in-place in files. It doesn't rely on 'sed'.
    It uses RipGrep only. Since it's a wrapper around 'rg', it forwards all options to 'rg', so
    you can use it as a permanent replacement for 'rg' if you like.

    Currently, the only significant new option added is '-R' or '--Replace', which is the same
    as RipGrep's '-r' except it MODIFIES THE FILES IN-PLACE ON YOUR FILESYSTEM! This is great!
    If you think so too, go and star this project (link below).

    USAGE (exact same as 'rg')
        rgr [options] [paths...]

    OPTIONS
        ALL OPTIONS ACCEPTED BY RIPGREP ('rg') ARE ALSO ACCEPTED BY THIS PROGRAM. Here are just
        a few of the options I've amended and/or would like to highlight. Not all Ripgrep
        options have been tested, and not all of them ever will be by me at least.

        -h, -?, --help
            Print help menu
        -v, --version
            Print version information.
        --run_tests
            Run unit tests (none yet).
        -d, --debug
            Turn on debug prints. '-d' is not part of 'rg' but if you use either of these
            options here it will auto-forward '--debug' to 'rg' under-the-hood.
        -r, --replace
            Do a dry-run to replace all matches of regular expression 'regex' with
            'replacement_text'. This only does the replacement in the stdout output; it does
            NOT modify your disk!
        -R, --Replace
            THIS IS THE ONE! Bingo! This is the sole purpose for the creation of this wrapper.
            This option will actually replace all matches of regular expression 'regex' with
            'replacement_text' ON YOUR DISK. It actually modifies your file system! This is
            great for large code-wide replacements when programming in large repos, for
            instance.
        --stats
            Show detailed statistics about the ripgrep search and replacements made.

        SEE ALSO 'rg -h' AND 'man rg' FOR THE FULL HELP MENU OF RIPGREP ITSELF, WHICH OPTIONS
        ARE ALSO ALLOWED AND PASSED THROUGH BY THIS 'rgr' WRAPPER PROGRAM.

    EXAMPLE USAGES:
        rgr foo -r boo
            Do a *dry run* to replace all instances of 'foo' with 'boo' in this folder and down.
        rgr foo -R boo
            ACTUALLY REPLACE ON YOUR DISK all instances of 'foo' with 'boo' in this folder and
            down.
        rgr foo -R boo file1.c file2.c file3.c
            Same as above, but only in these 3 files.
        rgr foo -R boo -g '*.txt'
            Use a glob filter to replace on your disk all instances of 'foo' with 'boo' in .txt
            files ONLY, inside this folder and down. Learn more about RipGrep's glob feature
            here: https://github.com/BurntSushi/ripgrep/blob/master/GUIDE.md#manual-filtering-globs
        rgr foo -R boo --stats
            Replace on your disk all instances of 'foo' with 'boo', showing detailed statistics.

    Note to self: the only free lowercase letters not yet used by 'rg' as of 3 Jan. 2021 are:
    -d, -k, -y

    This program is part of eRCaGuy_dotfiles: https://github.com/ElectricRCAircraftGuy/eRCaGuy_dotfiles
    by Gabriel Staples.

Ripgrep installation

sudo apt update && sudo apt install ripgrep 
