bash - Substitution in text file **without** regular expressions
2013-11
I need to substitute some text inside a text file with a replacement. Usually I would do something like
sed -i 's/text/replacement/g' path/to/the/file
The problem is that both text and replacement are complex strings containing dashes, slashes, backslashes, quotes and so on. If I escape all the necessary characters inside text, the whole thing quickly becomes unreadable. On the other hand, I do not need the power of regular expressions: I just need to substitute the text literally.
Is there a way to do text substitution without using regular expressions with some bash command?
It would be rather trivial to write a script that does this, but I figure there should exist something already.
When you don't need the power of regular expressions, don't use them. That is fine.
But the slash is not really a regular expression issue; it is only the delimiter of the s command, and sed accepts other delimiters:
sed 's|literal_pattern|replacement_string|g'
So, if / is your problem, use | and you don't need to escape the former.
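To make the delimiter point concrete, here is a small sketch. The sample sentence and the two paths are made up for illustration. Note that the pattern is still a regular expression either way, so characters such as . * [ and \ keep their special meaning; only the slashes stop needing escapes.

```shell
# Hypothetical sample text; the path-like strings are invented for this example.
input='see /usr/local/bin for details'

# With / as the delimiter, every slash in pattern and replacement needs a backslash.
escaped=$(printf '%s\n' "$input" | sed 's/\/usr\/local\/bin/\/opt\/bin/g')

# With | as the delimiter, the slashes can be written as they are.
plain=$(printf '%s\n' "$input" | sed 's|/usr/local/bin|/opt/bin|g')

# Both commands produce the same line.
printf '%s\n' "$escaped" "$plain"
```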
PS: regarding the comments, also see this Stack Overflow answer on Escape a string for sed search pattern.
Update: If you are fine using Perl, try it with \Q and \E like this:
perl -pe 's|\Qliteral_pattern\E|replacement_string|g'
RedGrittyBrick has also suggested a similar trick with stronger Perl syntax in a comment here.
You could also use perl's \Q mechanism to "quote (disable) pattern metacharacters":
perl -pe 'BEGIN {$text = q{your */text/?goes"here"}} s/\Q$text\E/replacement/g'
I pieced together a few other answers and came up with this:
function unregex {
    # This is a function because dealing with quotes is a pain.
    # http://stackoverflow.com/a/2705678/120999
    sed -e 's/[]\/()$*.^|[]/\\&/g' <<< "$1"
}
function fsed {
    local find=$(unregex "$1")
    local replace=$(unregex "$2")
    shift 2
    # sed -i is only supported in GNU sed.
    #sed -i "s/$find/$replace/g" "$@"
    perl -p -i -e "s/$find/$replace/g" "$@"
}
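If escaping the strings feels fragile, another option is to skip regular expressions entirely. The following is only a sketch of that idea, not part of the answer above: it uses awk's index() and substr() to do a plain string search, and reads the two strings from the environment because awk -v would interpret backslash escapes. The function name literal_replace and the variable names are hypothetical.

```shell
# literal_replace FIND REPLACE: filter stdin, replacing every literal
# occurrence of FIND with REPLACE. No regular expressions are involved.
literal_replace() {
    FIND=$1 REPLACE=$2 awk '
        BEGIN {
            find = ENVIRON["FIND"]
            repl = ENVIRON["REPLACE"]
            n = length(find)
        }
        {
            rest = $0
            out = ""
            # index() does a plain substring search and returns 0 when not found.
            while (n > 0 && (i = index(rest, find)) > 0) {
                out = out substr(rest, 1, i - 1) repl
                rest = substr(rest, i + n)
            }
            print out rest
        }'
}

# Usage with a hypothetical file name:
#   literal_replace 'a/b\c' 'new "text"' < input.txt > output.txt
```

Unlike fsed, this version writes to standard output rather than editing files in place.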
I have a file that looks something like this:
Heading -
    - Completed foo
        - More information
        - Still more
    * Need to complete bar
    - Did baz (comment blah blah) ***
Another -
    * Need to complete foo
    - Completed bar (blah comment blah) ***
    - Done baz
I need to run the text file through sed to remove all of the lines that start with spaces (number varies) and a hyphen, and another space.
What is the regex or pattern I need to use with sed to make the output look like this below?
Heading -
    * Need to complete bar
Another -
    * Need to complete foo
I used Phoshi's answer, assisted by Dennis Williamson, to help me come up with sed /^\s+-\s.*/d which works as expected.
"s/\s*-\s.*//g"
should do it, I think.
That's \s to match a whitespace character, * to match zero or more of the preceding character (the whitespace), a literal hyphen character, then another whitespace character, then .* to match everything after it.
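One thing worth checking with a substitution like this: s/...// erases the matched text but leaves an empty line behind, whereas the d command drops the line altogether. A minimal sketch of the difference follows, assuming a sample with space-indented lines; it uses the POSIX class [[:space:]] instead of \s, which is a GNU extension, and anchors the pattern with ^ so that a " - " in the middle of a line is not touched.

```shell
# Sample input modelled on the question's file; the exact indentation is assumed.
sample='Heading -
    - Completed foo
    * Need to complete bar'

# s///: the matching text is erased, but an empty line stays in the output.
blanked=$(printf '%s\n' "$sample" | sed 's/^[[:space:]][[:space:]]*-[[:space:]].*//')

# d: the matching lines are removed from the output entirely.
deleted=$(printf '%s\n' "$sample" | sed '/^[[:space:]][[:space:]]*-[[:space:]]/d')

printf '%s\n' '--- with s///:' "$blanked" '--- with d:' "$deleted"
```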
You should use egrep or grep for this task. sed is a stream editor; grep is more in line with the line-at-a-time philosophy.
You need a regex that matches the start of line, whitespace, hyphen, space. Sounds like this would work:
egrep -v '^[ ]+-[ ]' filename
The -v option causes egrep to REMOVE the matching lines -- this is easier than building a regex that rejects the lines.
Example:
nobody$ egrep -v '^[ ]+-[ ]' /tmp/foof
Heading -
    * Need to complete bar
Another -
    * Need to complete foo
nobody$ cat /tmp/foof
Heading -
    - Completed foo
        - More information
        - Still more
    * Need to complete bar
    - Did baz (comment blah blah) ***
Another -
    * Need to complete foo
    - Completed bar (blah comment blah) ***
    - Done baz
nobody$ _
Dealing with tab characters only means you need them in the bracket expressions, but that's hard to show online.
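One way around typing a literal tab is the POSIX character class [[:blank:]], which matches both space and tab. The sketch below assumes a sample where some lines are indented with tabs and others with spaces; it uses grep -E, the current spelling of egrep.

```shell
# Build a sample with tab- and space-indented lines; printf turns \t into a real tab.
sample=$(printf 'Heading -\n\t- Completed foo\n  - Done baz\n\t* Need to complete bar\n')

# [[:blank:]] matches space or tab, so no literal tab has to appear in the pattern.
filtered=$(printf '%s\n' "$sample" | grep -Ev '^[[:blank:]]+-[[:blank:]]')

printf '%s\n' "$filtered"
```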