bash - Substitution in text file **without** regular expressions

25
2013-11
  • Andrea

    I need to substitute some text inside a text file with a replacement. Usually I would do something like

    sed -i 's/text/replacement/g' path/to/the/file
    

    The problem is that both text and replacement are complex strings containing dashes, slashes, backslashes, quotes and so on. If I escape all the necessary characters inside text, the command quickly becomes unreadable. On the other hand, I do not need the power of regular expressions: I just need to substitute the text literally.

    Is there a way to do text substitution without using regular expressions with some bash command?

    It would be rather trivial to write a script that does this, but I figure there should exist something already.

  • Answers
  • nik

    When you don't need the power of regular expressions, don't use it. That is fine.
    But, this is not really a regular expression.

    sed 's|literal_pattern|replacement_string|g'

    So, if / is your problem, use | and you don't need to escape the former.
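
    Note that changing the delimiter only removes the need to escape /; sed still parses the pattern as a regular expression. A small sketch of both effects (the function names and sample strings are made up for illustration):

```shell
# With '|' as the delimiter, '/' in the pattern needs no escaping...
slash_demo() { printf '%s\n' 'a/b a/b' | sed 's|a/b|X|g'; }

# ...but '.' is still a regex metacharacter and matches any character,
# so "axb" is replaced as well as "a.b".
dot_demo() { printf '%s\n' 'a.b axb' | sed 's|a.b|X|g'; }

slash_demo
dot_demo
```

    So this helps with slashes, but other metacharacters such as ., *, [ and \ still have to be escaped.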

    PS: regarding the comments, also see the Stack Overflow answer "Escape a string for sed search pattern".


    Update: If you are fine with using Perl, try it with \Q and \E, like this:

    perl -pe 's|\Qliteral_pattern\E|replacement_string|g'

    RedGrittyBrick has also suggested a similar trick with stronger Perl syntax in a comment.

  • glenn jackman

    You could also use perl's \Q mechanism to "quote (disable) pattern metacharacters"

    perl -pe 'BEGIN {$text = q{your */text/?goes"here"}} s/\Q$text\E/replacement/g'
    
  • Xiong Chiamiov

    I pieced together a few other answers and came up with this:

    function unregex {
       # This is a function because dealing with quotes is a pain.
       # http://stackoverflow.com/a/2705678/120999
       sed -e 's/[]\/()$*.^|[]/\\&/g' <<< "$1"
    }
    function fsed {
       local find=$(unregex "$1")
       local replace=$(unregex "$2")
       shift 2
       # sed -i is only supported in GNU sed.
       #sed -i "s/$find/$replace/g" "$@"
       perl -p -i -e "s/$find/$replace/g" "$@"
    }
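
    As a sketch of a different approach that avoids escaping altogether, bash's own pattern substitution treats a quoted pattern literally. This is not what the function above does; replace_literal is a hypothetical name, and the example assumes a reasonably recent bash:

```shell
#!/usr/bin/env bash
# replace_literal FIND REPLACEMENT   (reads stdin, writes stdout)
replace_literal() {
    local find=$1 replace=$2 line out
    # IFS= and -r keep leading whitespace and backslashes intact; the || test
    # also processes a final line that lacks a trailing newline.
    while IFS= read -r line || [ -n "$line" ]; do
        # Quoting "$find" disables glob matching. Quoting "$replace" keeps '&'
        # literal on bash versions where an unquoted '&' stands for the match.
        out=${line//"$find"/"$replace"}
        printf '%s\n' "$out"
    done
}

printf '%s\n' 'a.b axb a*b' | replace_literal 'a.b' 'x&y/z'
```

    It works line by line, so it cannot match text that spans a newline, and it always terminates the last line with a newline.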
    

  • Related Question

    text - using sed to remove lines in a file
  • eleven81

    I have a file that looks something like this:

    Heading - 
      - Completed foo
        - More information
        - Still more
      * Need to complete bar
      - Did baz (comment blah blah) ***
    
    Another - 
      * Need to complete foo
      - Completed bar (blah comment blah) ***
      - Done baz
    

    I need to run the text file through sed to remove all of the lines that start with spaces (number varies) and a hyphen, and another space.

    What is the regex or pattern I need to use with sed to make the output look like this below?

    Heading - 
      * Need to complete bar
    
    Another - 
      * Need to complete foo
    

  • Related Answers
  • eleven81

    I used Phoshi's answer, assisted by Dennis Williamson, to help me come up with sed -E '/^\s+-\s.*/d' (GNU sed, where -E enables the + quantifier and \s is a supported extension), which works as expected.

  • Phoshi

    "s/\s*-\s.*//g" should do it, I think.

    That's \s to match a whitespace character, * to match zero or more of the preceding item (the whitespace), a literal hyphen character, then another whitespace character, then .* to match everything after it.
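
    The difference between substituting and deleting can be sketched as follows. An unanchored substitution empties the matching lines without removing them and also strips the trailing " - " from the heading, whereas an anchored address with the d command removes whole lines. POSIX character classes are used here instead of \s as an assumption, so that the sketch does not depend on GNU sed; the function names and sample text are made up:

```shell
sample() {
    printf '%s\n' 'Heading - ' '  - Completed foo' '  * Need to complete bar'
}

# Unanchored substitution: matching lines become empty but stay in the output,
# and the heading loses its trailing " - ".
blank_out() { sed 's/[[:space:]]*-[[:space:]].*//'; }

# Anchored delete: drops lines that start with whitespace, "-", whitespace.
delete_lines() { sed '/^[[:space:]][[:space:]]*-[[:space:]]/d'; }

sample | blank_out
sample | delete_lines
```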

  • Ryan Thompson

    You should use egrep or grep for this task. sed is a stream editor, whereas grep is more in line with the line-at-a-time philosophy.

    You need a regex that matches the start of line, whitespace, hyphen, space. Sounds like this would work:

    egrep  -v  '^[ ]+-[ ]' filename
    

    The -v option causes egrep to REMOVE the matching lines -- this is easier than building a regex that rejects the lines.

    Example:

     nobody$ egrep -v  '^[ ]+-[ ]' /tmp/foof
     Heading - 
       * Need to complete bar
    
     Another - 
       * Need to complete foo
     nobody$ cat /tmp/foof
     Heading - 
       - Completed foo
         - More information
         - Still more
       * Need to complete bar
       - Did baz (comment blah blah) ***
    
     Another - 
       * Need to complete foo
       - Completed bar (blah comment blah) ***
       - Done baz
     nobody$ _
    

    Dealing with tab characters only means you need them in the bracket expressions, but that's hard to show online.
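
    One way to make the tab visible is the POSIX class [[:blank:]], which matches both a space and a tab. A minimal sketch, with a made-up helper name and sample input (egrep is equivalent to grep -E):

```shell
# keep_non_dash_lines: hypothetical helper that drops lines beginning with
# blanks (spaces or tabs), a hyphen, and another blank.
keep_non_dash_lines() { grep -Ev '^[[:blank:]]+-[[:blank:]]' "$@"; }

# printf turns \t into a real tab, so the second line is tab-indented.
printf 'Heading - \n\t- Done (tab-indented)\n  - Done (space-indented)\n  * Need to complete bar\n' |
    keep_non_dash_lines
```

    In bash, the same bracket expression can also be written with ANSI-C quoting, as in $'^[ \t]+-[ \t]'.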