Shell script Bash, Check if string starts and ends with single quotes
It works for a sentence like 'an amazing apa', so I believe it has to do with the single quote sign. Any ideas on how I can solve it?
Explanation for a*a is:
- a matches the character a literally (case sensitive)
- * Quantifier: matches between zero and unlimited times, as many times as possible, giving back as needed (greedy)
- a matches the character a literally (case sensitive)
It will match any a in the text.
@chichi: These are valid Bash solutions. Run [[ "'ab'" =~ ^\'.*\'$ ]] && echo yes and [[ "'ab'" == \'*\' ]] && echo yes in your shell. Unless both of them output yes, you must be using a shell other than Bash.
I am writing the complete bash script so you won’t have any confusion:
#!/bin/bash

text1="'helo there"
if [[ $text1 =~ ^\'.*\'$ ]]; then
    echo "text1 match"
else
    echo "text1 not match"
fi

text2="'hello babe'"
if [[ $text2 =~ ^\'.*\'$ ]]; then
    echo "text2 match"
else
    echo "text2 not match"
fi
Save the above script as matchCode.sh and run it. The output is:
text1 not match
text2 match
Ask if you have any confusion.
I think your solution would be more readable if you put the check in an extra function. That way you don’t have duplicate code and people can grasp what the solution does at first glance.
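Following that suggestion, a minimal sketch of such a helper might look like the one below. The function name is_single_quoted is hypothetical and does not come from the original answer; it simply wraps the same regex test so the check is written once.

```shell
#!/usr/bin/env bash

# Hypothetical helper (name chosen for illustration): succeeds when the
# argument starts and ends with a literal single quote.
is_single_quoted() {
    [[ $1 =~ ^\'.*\'$ ]]
}

for text in "'helo there" "'hello babe'"; do
    if is_single_quoted "$text"; then
        echo "match:     $text"
    else
        echo "not match: $text"
    fi
done
```

Because the pattern needs both an opening and a closing quote, a string that consists of a single ' character does not match.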
However, I suspect you may be confused over quotes that are part of the shell syntax vs. quotes that are actually part of the string:
- In a POSIX-like shell such as Bash, 'My name is Mozart' is a single-quoted string whose content is the literal My name is Mozart, without the enclosing ' characters. That is, the enclosing ' characters are syntactic elements that tell the shell that everything between them is the literal content of the string.
- By contrast, to create a string whose content is actually enclosed in ' (i.e., has embedded ' instances), you'd have to use something like "'My name is Mozart'". Now it is the enclosing " instances that are the syntactic elements that bookend the string content.
- Note, however, that using a "..." string (double quotes) makes the contents subject to string interpolation (expansion of embedded variable references, arithmetic and command substitutions; none in the case at hand, however), so it's important to know when to use '...' (literal strings) vs. "..." (interpolated strings).
- Embedding ' instances in '...' strings is actually not supported at all in POSIX-like shells, but in Bash, Ksh, and Zsh there's another string type that allows you to do that: ANSI C-quoted strings, $'...', in which you can embed ' escaped as \': $'\'My name is Mozart\''
- Another option is to use string concatenation: in POSIX-like shells, you can place substrings employing different quoting styles (including unquoted tokens) directly next to one another in order to form a single string: "'"'My Name is Mozart'"'" would also give you a string with contents 'My Name is Mozart'.
POSIX-like shells also allow you to escape individual, unquoted characters (meaning: neither part of a single- nor a double-quoted string) with \; therefore, \''My name is Mozart'\' yields the same result.
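As a quick check of the quoting styles described above, the following sketch builds the same value four ways and compares the results. The variable names s1 to s4 and plain are made up for illustration.

```shell
#!/usr/bin/env bash

# Four ways to build the same value: 'My name is Mozart' (quotes included).
s1="'My name is Mozart'"        # double-quoted, embedded single quotes
s2=$'\'My name is Mozart\''     # ANSI C-quoted string (Bash, Ksh, Zsh)
s3="'"'My name is Mozart'"'"    # concatenation of differently quoted parts
s4=\''My name is Mozart'\'      # individually escaped quotes around '...'

# Here the single quotes are shell syntax only, not part of the content.
plain='My name is Mozart'

printf '%s\n' "$s1" "$s2" "$s3" "$s4" "$plain"
```

The first four lines printed include the surrounding single quotes; the last one does not.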
The behavior of Bash's == operator inside [[ ... ]] (conditionals) may have added to the confusion:
If the RHS (right-hand side, the operand to the right of operator ==) is quoted, Bash treats it like a literal; only unquoted strings (or variable references) are treated as (glob-like) patterns:
'*' matches literal *, whereas * (unquoted!) matches any sequence of characters, including none.
- [[ $TEXT == '*' ]] would only ever match the single, literal character *.
- [[ $TEXT == /'*/' ]], because it mistakes / for the escape character (which in reality is \), would only match literal /*/ (/'*/' is effectively a concatenation of unquoted / and single-quoted literal */).
- [[ $TEXT == a*a ]], due to using an unquoted RHS, is the only variant that actually performs pattern matching: any string that starts with a and ends with a is matched, including aa (because unquoted * represents any sequence of characters).
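The three variants can be compared side by side. The sketch below is illustrative only: the result variables r1 to r6 are invented names, and the last two tests are an added contrast between glob matching with == and regex matching with =~ for the same a*a text.

```shell
#!/usr/bin/env bash

TEXT="'ab'"

[[ $TEXT == '*' ]]   && r1=yes || r1=no   # quoted RHS: literal * only
[[ $TEXT == /'*/' ]] && r2=yes || r2=no   # literal /*/ only
[[ $TEXT == \'*\' ]] && r3=yes || r3=no   # escaped quotes, unquoted *: a pattern
[[ aa == a*a ]]      && r4=yes || r4=no   # unquoted *: matches aa

# As a glob, a*a must match the whole string; as a regex, it may match anywhere.
[[ xay == a*a ]]     && r5=yes || r5=no
[[ xay =~ a*a ]]     && r6=yes || r6=no

echo "$r1 $r2 $r3 $r4 $r5 $r6"   # prints: no no yes yes no yes
```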
To verify that Cyrus’ commands do work with strings whose content is enclosed in (embedded) single quotes, try these commands, which — on Bash, Ksh, and Zsh — should both output yes .
[[ "'ab'" == \'*\' ]] && echo yes    # pattern matching, individually escaped ' chars
[[ "'ab'" =~ ^\'.*\'$ ]] && echo yes # regex operator =~
How to check if a string ends with another in Bash
In this tutorial, we are going to learn about how to check if a string ends with another string in Bash or UNIX shell.
Consider the following string:
name="ruby"
Now, we need to check whether the last two characters (“by”) of the above string match another substring/word.
Checking the string ends with another
We can use the double equals ( == ) comparison operator in bash, to check if a string ends with another substring.
name="ruby"
if [[ $name == *by ]]  # * is used for pattern matching
then
  echo "true"
else
  echo "false"
fi
In the above code, if a $name variable ends with by then the output is “true” otherwise it returns “false”.
Similarly, you can also check the last three characters of a string like this:
name="ruby"
if [[ $name == *uby ]]  # * is used for pattern matching
then
  echo "true"
else
  echo "false"
fi
We can also use parameter expansion to extract the last n characters of a string instead of writing the pattern manually.
name="ruby"
if [[ ${name: -3} == "uby" ]]  # compare the last three characters with "uby"
then
  echo "true"
else
  echo "false"
fi
${name: -3} : it gets the last three characters from a string. The space before -3 is required; without it, Bash parses :- as the default-value operator.
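When the suffix is held in a variable, one possible way to package the check is a small function. This is a sketch under assumptions: the name ends_with is hypothetical, and quoting the suffix is a deliberate choice so that glob characters inside it are compared literally.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds when string $1 ends with suffix $2.
# The unquoted * is a wildcard; the quoted "$2" is taken literally.
ends_with() {
    [[ $1 == *"$2" ]]
}

name="ruby"
if ends_with "$name" "by";  then echo "true"; else echo "false"; fi   # true
if ends_with "$name" "uby"; then echo "true"; else echo "false"; fi   # true
if ends_with "$name" "?by"; then echo "true"; else echo "false"; fi   # false
```

The last call prints false because the ? is matched as a literal question mark; left unquoted, it would act as a single-character wildcard and the test would succeed.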
Grep for a string that ends with specific character
Is there a way to use extended regular expressions to find a specific pattern that ends with a string? I mean, I want to match the first 3 lines but not the last:
file_number_one.pdf # comment
file_number_two.pdf # not interesting
testfile_number____three.pdf # some other stuff
myfilezipped.pdf.zip some comments and explanations
I know that in grep, metacharacter $ matches the end of a line but I’m not interested in matching a line end but string end. Groups in grep are very odd, I don’t understand them well yet. I tried with group matching, actually I have a similar REGEX but it does not work with grep -E
Your example works with matching the space after the string also:
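A minimal sketch of such a command, assuming the question's sample lines are saved to a file (a temporary file stands in for it here, and the variable names are illustrative):

```shell
# Assumption: a temporary file stands in for the question's input file.
input=$(mktemp)
cat > "$input" <<'EOF'
file_number_one.pdf # comment
file_number_two.pdf # not interesting
testfile_number____three.pdf # some other stuff
myfilezipped.pdf.zip some comments and explanations
EOF

# Match ".pdf" only when a literal space follows it.
matches=$(grep -E '\.pdf ' "$input")
printf '%s\n' "$matches"
rm -f "$input"
```

This prints the first three lines and skips the .pdf.zip line, because there the character after .pdf is a dot rather than a space.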
What you call "string" is similar to what grep calls "word". A word is a run of alphanumeric characters. The nice thing with words is that you can match a word end with the special \>, which matches a word end with a match of zero characters in length. That also matches at the end of a line. But the set of word characters cannot be changed, and it does not contain punctuation, so we cannot use it.
If you need to match at the end of line too, where there is no space after the word, use:
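One plausible form of that command is an alternation between a trailing space and the end of the line. The sketch below is an assumption rather than the answer's exact command, and the line last_file.pdf is a made-up example of a name with nothing after it.

```shell
# Assumption: hypothetical sample lines in a temporary file.
input=$(mktemp)
printf '%s\n' \
    'file_number_one.pdf # comment' \
    'last_file.pdf' \
    'myfilezipped.pdf.zip some comments' > "$input"

# Match ".pdf" followed by a space, or ".pdf" at the end of the line.
matches=$(grep -E '\.pdf( |$)' "$input")
printf '%s\n' "$matches"
rm -f "$input"
```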
To include cases where the character after the file name is not a space character ‘ ‘, but other whitespace, like a tab, \t , or the name is directly followed by a comment, starting with # , use:
grep -E '\.pdf[[:space:]#]|\.pdf$' input.txt
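To see which cases this pattern covers, it can be run against a few hand-made lines. The file names below are hypothetical and exist only to exercise each branch: a space, a tab, a # with no gap, the end of the line, and a .pdf that is followed by more of the file name.

```shell
input=$(mktemp)
# One line per case; \t in the printf format is a tab character.
printf 'a.pdf # space\nb.pdf\t# tab\nc.pdf# no gap\nd.pdf\ne.pdf.zip note\n' > "$input"

matches=$(grep -E '\.pdf[[:space:]#]|\.pdf$' "$input")
printf '%s\n' "$matches"
rm -f "$input"
```

The first four lines are printed; the e.pdf.zip line is not, since neither branch of the alternation accepts a dot after .pdf.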
I will illustrate the matching of word boundaries too, because that would be the perfect solution, except that we cannot use it here because we cannot change the set of characters that are seen as parts of a word.
The input contains foo as separate word, and as part of longer words, where the foo is not at the end of the word, and therefore not at a word boundary:
$ printf 'foo bar\nfoo.bar\nfoobar\nfoo_bar\nfoo\n'
foo bar
foo.bar
foobar
foo_bar
foo
Now, to match the boundaries of words, we can use \< for the beginning and \> to match the end:
$ printf 'foo bar\nfoo.bar\nfoobar\nfoo_bar\nfoo\n' | grep 'foo\>'
foo bar
foo.bar
foo
Note how _ is matched as a word char — but otherwise, wordchars are only the alphanumerics, [a-zA-Z0-9] .
Also note how foo at the end of a line is matched, in the line containing only foo. We do not need a special case for the end of the line.
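To see why the word boundary does not solve the original question, consider the sketch below. It assumes a grep that supports the \> extension (GNU grep does); the two sample lines are taken from the question.

```shell
# Both lines match: the "." that follows "pdf" in ".pdf.zip" is not a word
# character, so \> sees a word end there just as it does before a space.
matches=$(printf '%s\n' \
    'file_number_one.pdf # comment' \
    'myfilezipped.pdf.zip some comments and explanations' | grep '\.pdf\>')
printf '%s\n' "$matches"
```

Since the unwanted .pdf.zip line is matched as well, the explicit character class after \.pdf remains the more suitable choice here.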