search - Using vim to delete all lines except those that match an arbitrary set of strings

08
2014-07
  • Dave

    I use vim to remove all lines except those that match a particular string, e.g.

    :g!/[string that I want to remain in the editor]/d

    Works great. But what I really want, and haven't found anywhere, is a way to use vim to remove all except for multiple strings.

    For example, let's say I have a file open with the following information:

    Dave came at 12PM
    Lucy came at 11AM
    Trish came at 5PM
    John ate lunch at 2PM
    Virgil left at 3PM
    Dave left at 6PM
    

    and I want to only be left with events that mention Dave and John -- what vim command could I use to just end up with:

    Dave came at 12PM
    John ate lunch at 2PM
    Dave left at 6PM
    

    I realize I can use command-line tools like findstr in Windows and others in *nix, but I'm in vim pretty often and haven't been able to come up with any regex or vim command that will do this. Thank you!

  • Answers
  • Ingo Karkat

    The :global command that you reference in your question actually doesn't just take literal strings, it handles any regular expression. So, you just need to come up with one that has two branches, one for John and one for Dave. Voila:

    :g!/Dave\|John/d
    

    Note that this simplistic one would also match Johnny; you probably want to limit the matches to whole keywords:

    :g!/\<\(Dave\|John\)\>/d
    

    Regular expressions are a powerful feature of Vim; it's worthwhile to learn more about them. Get started at :help regular-expression.
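
    To see the difference between the two patterns on concrete input, here is a rough shell sketch that uses grep as a stand-in for Vim's :g!. The "Johnny" line is made up for illustration, and the -w option is assumed to be available (GNU and BSD grep both have it).

```shell
#!/bin/sh
# Sample input: lines from the question plus a hypothetical "Johnny" line.
sample() {
    printf '%s\n' \
        'Dave came at 12PM' \
        'Lucy came at 11AM' \
        'Johnny called at 1PM' \
        'John ate lunch at 2PM' \
        'Dave left at 6PM'
}

# Substring match, like :g!/Dave\|John/d : the Johnny line is kept as well.
keep_substring() {
    sample | grep -E 'Dave|John'
}

# Whole-word match, like :g!/\<\(Dave\|John\)\>/d : the Johnny line is dropped.
keep_whole_word() {
    sample | grep -E -w 'Dave|John'
}

keep_whole_word
```

    The substring version keeps four of the five sample lines; the whole-word version keeps only the three that mention Dave or John as separate words.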

  • Lieven Keersmaekers

    The following should do it:

    :v/\v(Dave|John)/d
    

    Breakdown

    :v                  run a command on every line that does NOT match the pattern (same as :g!)
    /\v(Dave|John)      the pattern; \v ("very magic") lets ( | ) be written without backslashes
    /d                  the command to run on those lines: delete
    
  • drk.com.ar

    Use this:

    :%s/^[^Dave|John].*\n//
    

    Meaning:

    %            means search the whole file
    ^            at the beginning of the line
    [^Dave|John] one character that is not D, a, v, e, |, J, o, h or n (a character class, not the words Dave or John)
    .*           match anything
    \n           new line character
    //           replace with nothing
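
    One caveat: [^Dave|John] is a negated character class, so this command only looks at the first character of each line. It happens to work on the sample in the question, but not in general. The sketch below uses grep -v as a rough stand-in for the :%s deletion (an assumption for illustration; the "Jane" and "Lucy met Dave" lines are made up) to show which lines survive.

```shell
#!/bin/sh
# Hypothetical sample: a wanted line, an unwanted line starting with "J",
# a wanted line where Dave is not the first word, and an unwanted line.
sample() {
    printf '%s\n' \
        'Dave came at 12PM' \
        'Jane arrived at 4PM' \
        'Lucy met Dave at 1PM' \
        'Virgil left at 3PM'
}

# Stand-in for :%s/^[^Dave|John].*\n// : drop every line whose first
# character is not one of D a v e | J o h n.
keep_by_char_class() {
    sample | grep -v '^[^Dave|John]'
}

# Matching the names themselves keeps exactly the lines that mention them.
keep_by_name() {
    sample | grep -E -w 'Dave|John'
}

keep_by_char_class
```

    On this input the character-class version keeps the Jane line and drops "Lucy met Dave at 1PM", while matching on the names does the opposite.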
    

  • Related Question

    need a regex that would delete line endings in text file, except those in blank lines
  • NotSuper

    What regex would delete the line endings from non-blank lines only, leaving the line endings of blank lines in place? This is for a text file of over 8000 lines.

    64-bit Vista.


  • Related Answers
  • Dan

    My messy method would be to open it in Word, do a find and replace on ^p^p (two end paragraphs in a row) with some character not used in the file, like "|". Then I would replace all ^p with just a space. Then I would go back and replace the "|" with ^p.

  • Dennis Williamson

    If you're trying to convert paragraphs that have line breaks at the end of each line to continuous text within each paragraph:

    Now is the time for all good\n
    men to come to the aid of their\n
    country\n
    \n

    into

    Now is the time for all good men to come to the aid of their country\n
    \n

    Then something like this should work:

    sed -n '1{x;d};H;${x;s|\([^\n]\)\n\([^\n]\)|\1 \2|gp}' file
    

    or

    sed ':a;$!N;s|^\n||;s|\n\([^\n]\+\)$| \1|;ta;p;D' file
    
  • Rich Homolka

    It kind of depends on what regex package you have, whether you have lookahead or not.

    I'd personally do:

    -- strip the whitespace from whitespace-only lines, so that 'blank' lines really are \n\n

    s/^[ \t][ \t]*$//

    -- if a single linefeed sits between two non-linefeed characters, replace it with a space

    s/([^\n])\n([^\n])/\1 \2/

    It really depends on your regex package.
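
    For comparison, here is a rough sketch of the same two-step idea that avoids multi-line regex support altogether by using awk instead of a substitution. This is a swapped-in technique, not the commands above, and the function name join_paragraphs is made up: whitespace-only lines are treated as blank and printed as empty lines, and each run of non-blank lines is joined with single spaces.

```shell
#!/bin/sh
# join_paragraphs (hypothetical name): read text on stdin, join each run of
# non-blank lines into one line, and keep every blank line as an empty line.
join_paragraphs() {
    awk '
        # Blank or whitespace-only line: flush the pending paragraph, keep the break.
        /^[ \t]*$/ { if (buf != "") print buf; print ""; buf = ""; next }
        # Non-blank line: append it to the pending paragraph with one space.
        { buf = (buf == "" ? $0 : buf " " $0) }
        # End of input: flush whatever is still pending.
        END { if (buf != "") print buf }
    '
}

printf 'Now is the time for all good\nmen to come to the aid of their\ncountry\n\nSecond paragraph\nends here\n' | join_paragraphs
```

    To process a file, redirect it into the function, e.g. join_paragraphs < file.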

  • nmuntz

    With sed I would do something like:

    sed 's/[ \t]*$//'