search - Using vim to delete all lines except those that match an arbitrary set of strings
2014-07
I use vim to remove all lines except those that match a particular string, e.g.
:g!/[string that I want to remain in the editor]/d
Works great. But what I really want, and haven't found anywhere, is a way to use vim to remove all except for multiple strings.
For example, let's say I have a file open with the following information:
Dave came at 12PM
Lucy came at 11AM
Trish came at 5PM
John ate lunch at 2PM
Virgil left at 3PM
Dave left at 6PM
and I want to only be left with events that mention Dave and John -- what vim command could I use to just end up with:
Dave came at 12PM
John ate lunch at 2PM
Dave left at 6PM
I realize I can use command-line tools like findstr in Windows and others in *nix, but I'm in vim pretty often and haven't been able to come up with any regex or vim command that will do this. Thank you!
The :global command that you reference in your question doesn't just take literal strings; it handles any regular expression. So you just need to come up with one that has two branches, one for John and one for Dave. Voila:
:g!/Dave\|John/d
Note that this simplistic one would also match Johnny; you probably want to limit the matches to whole keywords:
:g!/\<\(Dave\|John\)\>/d
Regular expressions are a powerful feature of Vim; it's worthwhile to learn more about them. Get started at :help regular-expression.
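For comparison outside Vim, here is a rough shell sketch of the same two filters, assuming a grep that supports -E and -w (GNU grep does). The sample text and the function names keep_loose and keep_whole_words are made up for illustration; the extra "Johnny" line is there only to show the difference between the two patterns.

```shell
# Hypothetical sample; the "Johnny" line is added to show the difference.
sample='Dave came at 12PM
Lucy came at 11AM
Johnny called at 1PM
John ate lunch at 2PM'

# Roughly like :g!/Dave\|John/d : keeps any line containing either string.
keep_loose() { grep -E 'Dave|John'; }

# Roughly like :g!/\<\(Dave\|John\)\>/d : -w only accepts whole-word matches.
keep_whole_words() { grep -Ew 'Dave|John'; }

printf '%s\n' "$sample" | keep_loose
echo '---'
printf '%s\n' "$sample" | keep_whole_words
```

The first filter keeps the "Johnny" line as well; the second drops it and leaves only the Dave and John lines.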
The following should do it:
:v/\v(Dave|John)/d
Breakdown:
:v selects all lines that do not match the search expression
/\v(Dave|John) the search expression; \v ("very magic") lets ( | ) be written without backslashes
/d deletes each of those lines
Use this:
:%s/^[^Dave|John].*\n//
Meaning:
% means search the whole file
^ at the beginning of the line
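[^Dave|John] a single character that is not one of D, a, v, e, |, J, o, h, n (a character class, not the two names)
.* match anything
\n new line character
// replace with nothing
Note that this only looks at the first character of each line. It happens to work on the sample, but it would keep a line starting with "Jane" and delete one such as "Lucy met Dave".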
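To see why the bracket expression is not an alternation, here is a small shell sketch, assuming a POSIX sed and a grep with -E and -w. With case-sensitive matching, sed '/^[^Dave|John]/d' should delete the same lines as the substitution above. The sample lines and the function names by_class and by_names are invented for the demonstration.

```shell
# Hypothetical sample chosen to expose the difference.
sample='Dave came at 12PM
Jane left at 4PM
Lucy met John at 1PM'

# Bracket expression: delete lines whose FIRST character is not one of
# D a v e | J o h n. The names are never compared as whole strings.
by_class() { sed '/^[^Dave|John]/d'; }

# Alternation: keep lines that mention Dave or John as whole words.
by_names() { grep -Ew 'Dave|John'; }

printf '%s\n' "$sample" | by_class   # keeps the "Dave" and "Jane" lines
echo '---'
printf '%s\n' "$sample" | by_names   # keeps the "Dave" and "Lucy met John" lines
```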
What regex would delete all line endings only from non-blank lines (not deleting them from blank lines)? This is for a text file of over 8000 lines.
64-bit Vista.
My messy method would be to open it in Word, do a find and replace on ^p^p (two end paragraphs in a row) with some character not used in the file, like "|". Then I would replace all ^p with just a space. Then I would go back and replace the "|" with ^p.
If you're trying to convert paragraphs that have line breaks at the end of each line to continuous text within each paragraph:
Now is the time for all good\n
men to come to the aid of their\n
country\n
\n
into
Now is the time for all good men to come to the aid of their country\n
\n
Then something like this should work:
sed -n '1{x;d};H;${x;s|\([^\n]\)\n\([^\n]\)|\1 \2|gp}' file
or
sed ':a;$!N;s|^\n||;s|\n\([^\n]\+\)$| \1|;ta;p;D' file
It kind of depends on what regex package you have, whether you have lookahead or not.
I'd personally do:
-- empty out whitespace-only lines; this makes sure that 'blank' lines are really \n\n
s/^[ \t][ \t]*$//
-- replace a single linefeed (one with text on both sides) with a space
s/([^\n])\n([^\n])/\1 \2/
It really depends on your regex package.
With sed I would do something like this to strip trailing whitespace:
sed 's/[ \t]*$//'
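As a rough sketch of the two steps described above (first make blank lines truly empty, then join the lines of each paragraph), the second step can be handed to awk instead of a multi-line regex. This swaps in awk's paragraph mode and is not what the answers above use. It assumes a POSIX sed and awk, and the function name join_paragraphs and the sample text are made up for illustration.

```shell
# join_paragraphs: read text on stdin, write each paragraph as one line,
# with one empty line after every paragraph.
join_paragraphs() {
  # Step 1: strip trailing whitespace so "blank" lines are truly empty.
  sed 's/[[:space:]]*$//' |
  # Step 2: with RS="" awk reads one paragraph per record and splits it on
  # newlines as well as blanks; $1 = $1 rebuilds the record with single
  # spaces between fields (this also collapses runs of spaces in a line).
  awk 'BEGIN { RS = ""; ORS = "\n\n" } { $1 = $1; print }'
}

# Hypothetical input: two paragraphs separated by a whitespace-only line.
printf 'Now is the time for all good\nmen to come to the aid of their\ncountry\n  \nSecond paragraph,\nline two\n' |
  join_paragraphs
```

Several blank lines in a row come out as a single blank line, so this is only suitable when the exact number of blank lines between paragraphs does not matter.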