Answers: linux - can sed remove every line that contains 'foo'?

isildur

I read somewhere that:

sed -n -e '/foo/d' myinputfile.txt

would remove all occurrences of 'foo' from myinputfile.txt.

However, this does not seem to work for me. I am a sed noob and cannot work this out. I am trying to run a bash script that calls sed on each line to remove a word from the input file, and nothing happens when I run it.

Thanks :)

Ignacio Vazquez-Abrams

You read incorrectly.

However, while the sed expression itself is correct, the flags are not. sed normally prints each line to stdout after processing it, but -n suppresses this, so no lines are output at all. Remove the -n if you want the proper output. You can then redirect the output into another file and move that file into place.
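The redirect-and-move step can be sketched as follows. This is a minimal sketch: the sample contents and the temporary file name myinputfile.txt.tmp are assumptions made up for illustration, not part of the question.

```shell
# Hypothetical sample data, created here only so the example is self-contained.
printf 'foo one\nbar two\nfood three\nbaz four\n' > myinputfile.txt

# No -n: sed prints every line that /foo/d does not delete.
# Write to a temporary file, then move it over the original if sed succeeded.
sed -e '/foo/d' myinputfile.txt > myinputfile.txt.tmp &&
    mv myinputfile.txt.tmp myinputfile.txt

cat myinputfile.txt
```

Note that /foo/d drops whole lines, so "food three" goes too. If the goal is to remove only the text foo and keep the rest of each line, the expression would be s/foo//g instead.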

command line - remove words containing non-alpha characters

command-line regex sed

dnkb

Given a text file where each line holds space-separated strings followed by a tab-separated integer, I'd like to get rid of all words that contain non-alpha characters, but keep the words that consist of alpha characters only, along with the tab and the integer after it.

My attempts, like the ones below, didn't get me anywhere. What I was trying to express is something like: "replace anything within word boundaries that starts and ends with 0 or more of whatever and has at least one :digits: or :punct: in between".

sed 's/\b.*[:digits::punct:]+.*\b//g'
sed 's/\b.*[^:alpha:]+.*\b//g'

What am I missing? See sample input data below.

Thank you!

Input:

asdf 754m   563  
a2a 754mm   291  
754n    463  
754 ppp 1409  
754pin  4652  
pin pin 462  
754pins 652  
754 ppp </D>    1409  
<D> 754pin  4652  
pi$n pin    462  
754/p ins   652  
754 pp+p    1409  
754 p=in    4652

Desired output:

asdf    563  
    291  
    463  
ppp 1409  
    4652  
pin pin 462  
    652  
 ppp    1409  
    4652  
 pin    462  
 ins    652  
    1409  
    4652

Related Answers

Dennis Williamson

Basically this becomes a long list of things to delete:

sed -r 's/(^[[:digit:]]+\b|\b[[:digit:]]+[[:punct:]]*[[:alpha:]]+\b|\b[[:alpha:]]+[[:digit:]]+[[:alpha:]]+\b|\b[[:alpha:]]+[[:punct:]]+[[:alpha:]]+\b|[[:punct:]]+.*[[:punct:]]+)//g' file

Delete these:

digits at the beginning of the line
words that start with digits, may include punctuation, and end in alpha characters
words that consist of alpha chars, followed by digits, followed by alpha
words that consist of alpha, punct, alpha
sequences that begin and end with punct chars
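To see which rules fire, the command can be tried on a couple of lines. This is only a sketch: the two sample lines are modelled on the question's input, the file names sample.txt and cleaned.txt are made up here, and GNU sed is assumed because both -r and \b are GNU features.

```shell
# Two sample lines modelled on the question's input; \t is the tab before the integer.
printf 'asdf 754m\t563\n754 ppp\t1409\n' > sample.txt

# The same expression as above, applied to the sample file.
sed -r 's/(^[[:digit:]]+\b|\b[[:digit:]]+[[:punct:]]*[[:alpha:]]+\b|\b[[:alpha:]]+[[:digit:]]+[[:alpha:]]+\b|\b[[:alpha:]]+[[:punct:]]+[[:alpha:]]+\b|[[:punct:]]+.*[[:punct:]]+)//g' sample.txt > cleaned.txt

# Print the result unambiguously: sed's l command shows tabs as \t and marks line ends with $.
sed -n 'l' cleaned.txt
```

On the first line, 754m falls under the second rule (digits followed by alpha); on the second line, the leading 754 falls under the first rule. The spaces that surrounded the deleted words are left in place, so the output keeps stray blanks before the tab or at the start of the line.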

Daisetsu

Wouldn't this best be solved with regular expressions?

([A-Z]+tab[0-9]+ ) or something like that

Area 51

So if I understand correctly, you want to keep words that consist entirely of letters or entirely of digits, and nothing else. If so, something like this should work:

(^|\s+)([A-Za-z]+|\d+)((?=\s)|(?=$))

(Use with the multiline flag)

When run over your example input, it will find every token that is either all digits or all letters. This is easier than finding every word that doesn't match. Note, however, that it extracts the valid data rather than removing the invalid data.
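The expression above relies on a lookahead, which sed's regular expressions do not offer. The same "keep what matches" idea can be sketched on the command line with awk instead. This is a swapped-in technique, not the regex above, and it rests on assumptions: each line has exactly one tab before the integer, only all-letter words are kept (as in the desired output), and sample.txt and kept.txt are made-up file names.

```shell
# Three sample lines modelled on the question's input; \t is the tab before the integer.
printf 'asdf 754m\t563\npi$n pin\t462\n754 p=in\t4652\n' > sample.txt

# Split each line at the tab, then keep only the words made up entirely of letters.
awk -F '\t' '{
    n = split($1, words, " ")          # words before the tab
    kept = ""
    for (i = 1; i <= n; i++) {
        if (words[i] ~ /^[A-Za-z]+$/) {
            kept = (kept == "" ? words[i] : kept " " words[i])
        }
    }
    print kept "\t" $2                 # surviving words, tab, integer
}' sample.txt > kept.txt

cat kept.txt
```

Unlike the deletion approach, this rebuilds the word list from scratch, so no stray spaces are left behind where words were dropped. A line with no surviving words comes out as just the tab and the integer.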
