bash - Is there a way to print 2 lines (if any) before and after a target or print a placeholder if not?
2014-04
I'm not a native English speaker, so I hope this is clear.
I know about grep -C 2 "TARGET" inputfile
which selects the 2 rows before and after each row containing TARGET, but I can't make it solve my problem.
I have files structured like this:
1 0 value1 value2 value3
2 H value1 value2 value3
3 H value1 value2 value3
4 H value1 value2 value3
5 H value1 value2 value3
6 0 value1 value2 value3
7 0 value1 value2 value3
8 H value1 value2 value3
9 0 value1 value2 value3
with many more rows. The required output would be a file like this:
X X X X X
1 0 value1 value2 value3
2 H value1 value2 value3 *
3 H value1 value2 value3
4 H value1 value2 value3
1 0 value1 value2 value3
2 H value1 value2 value3
3 H value1 value2 value3 *
4 H value1 value2 value3
5 H value1 value2 value3
... and so on for every other target, until
6 0 value1 value2 value3
7 0 value1 value2 value3
8 H value1 value2 value3 *
9 0 value1 value2 value3
X X X X X
where the TARGET is "H", * marks the selected row (I don't need the * in the output file), and the X rows are placeholders that pad the missing rows before or after the target. I also tried awk and sed, with no results.
The same approach as Glenn Jackman's, but with a circular buffer instead of rotating the buffer on every input:
awk -v N=2 -v TARGET=" H " -v PLACE="X X X X X" '
function check(n, s, i) {
a[n%NN]=s
if (n>N&&a[(n-N)%NN]~TARGET) {
for (i=n+1;i<=n+NN;++i)
print a[i%NN]
print ""
}
}
BEGIN{
NN=2*N+1
a[0]=PLACE
for (i=1;i<=N;++i) { getline a[i]; a[i+N]=PLACE }
}
{ check(NR,$0) }
END{
for (i=NR+1;i<=NR+N;++i) check(i,PLACE)
}' inputfile
This will get you most of the way there:
awk -v n=2 -v target=" H " '
BEGIN {
lines[0]=""
for (i=1; i<=n; i++) {
lines[i]="X X X X X"
getline; lines[n+i]=$0
}
}
function rotate(i) {
for (i=1; i<=n*2; i++)
lines[i-1] = lines[i]
lines[n*2]=$0
}
function check(i) {
if (lines[n] ~ target) {
for (i=0; i<=n*2; i++)
print lines[i]
print ""
}
}
{ rotate(); check() }
END {
for (i=1; i<=n; i++) {
$0 = "X X X X X"; rotate(); check()
}
}
' inputfile
outputs
X X X X X
1 0 value1 value2 value3
2 H value1 value2 value3
3 H value1 value2 value3
4 H value1 value2 value3

1 0 value1 value2 value3
2 H value1 value2 value3
3 H value1 value2 value3
4 H value1 value2 value3
5 H value1 value2 value3

2 H value1 value2 value3
3 H value1 value2 value3
4 H value1 value2 value3
5 H value1 value2 value3
6 0 value1 value2 value3

3 H value1 value2 value3
4 H value1 value2 value3
5 H value1 value2 value3
6 0 value1 value2 value3
7 0 value1 value2 value3

6 0 value1 value2 value3
7 0 value1 value2 value3
8 H value1 value2 value3
9 0 value1 value2 value3
X X X X X
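If the input fits in memory, the same output can also be sketched without a rolling buffer: store every line, then print the window around each target and substitute the placeholder for positions that fall outside the file. This is a different technique from the two answers above, not a rewrite of them. The function name context_padded is hypothetical, and N, TARGET and PLACE simply reuse the values from the first answer.

```shell
# Hypothetical helper: print N lines of context around every line matching
# TARGET, padding with PLACE where the window runs past either end of the file.
# Assumes the whole input fits in memory.
context_padded() {
    awk -v N=2 -v TARGET=" H " -v PLACE="X X X X X" '
    { line[NR] = $0 }                          # remember every input line
    END {
        for (t = 1; t <= NR; t++) {
            if (line[t] !~ TARGET) continue    # not a target row
            for (i = t - N; i <= t + N; i++)   # window around the target
                print ((i >= 1 && i <= NR) ? line[i] : PLACE)
            print ""                           # blank line between groups
        }
    }' "$@"
}
```

Called as context_padded inputfile, it reads the named file; with no argument it reads standard input.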
Given a log file, I will usually do something like this:
grep 'marker-1234' filter_log
What is the difference in using '' or "" or nothing in the pattern?
The above grep command will yield many thousands of lines, which is what I want. Within those lines, there is usually one chunk of data I am after. Sometimes I use awk to print the fields I need. In this case the log format changes, so I can't rely on field position alone; besides, the logged data itself can shift the fields.
To make this understandable, let's say the log line contains an IP address and that is all I am after, so that I can later pipe it to sort and uniq and get some tally counts.
An example may be:
2010-04-08 some logged data, indetermineate chars - [marker-1234] (123.123.123.123) from: [email protected] to [email protected] [stat-xyz9876]
The first grep command will give me many thousands of lines like the one above. From there, I want to pipe the output to something, probably sed, that can pull out a pattern within each line and print only that pattern.
For this example, extracting the IP address would suffice. I tried. Is sed not able to understand [0-9]{1,3}. as a pattern? I had to use [0-9][0-9][0-9]. instead, which yielded strange results until the entire pattern was written out.
This is not specific to IP addresses; the pattern will change, but I can use this as a learning template.
Thank you all.
I don't know what OS you're on, but on FreeBSD 7.0+ grep has a -o option to return only the part that matches the pattern. So you could run:
grep "marker-1234" filter_log | grep -oE "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
This returns a list of just the IP addresses from filter_log.
This works on my system, but again, I don't know what your version of grep supports.
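On the sed part of the question: by default sed uses basic regular expressions, in which interval braces are written \{1,3\} and an unescaped . matches any character. That is most likely why [0-9]{1,3}. did not behave as expected. Below is a minimal sketch of two equivalent forms, one in basic syntax and one using -E (extended syntax, available in GNU and BSD sed). The function names are hypothetical, and both patterns assume the address is wrapped in parentheses as in the sample log line.

```shell
# Hypothetical helpers: print only the parenthesised dotted quad from each line.
# Lines without such an address print nothing (-n plus the p flag).

# Basic regular expressions: groups and intervals need backslashes,
# while a bare ( or ) is a literal parenthesis.
extract_ip_bre() {
    sed -n 's/.*(\([0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}\)).*/\1/p' "$@"
}

# Extended regular expressions (-E): groups and intervals are unescaped,
# so the literal parentheses are the ones that need backslashes.
extract_ip_ere() {
    sed -nE 's/.*\(([0-9]{1,3}(\.[0-9]{1,3}){3})\).*/\1/p' "$@"
}
```

Either one can take the place of the second grep, for example grep 'marker-1234' filter_log | extract_ip_ere | sort | uniq -c.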
You can do all of this in a single awk command, with no need for any other tools:
$ awk '/marker-1234/{for(o=1;o<=NF;o++){if($o~/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)print $o } }' file
(123.123.123.123)
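The command above prints the whole field, parentheses included. Below is a minimal sketch that prints only the matched text, using awk's built-in match(), which sets RSTART and RLENGTH. The function name extract_ip_awk is hypothetical, and the pattern keeps the loose [0-9]+ form from the command above.

```shell
# Hypothetical helper: for lines containing marker-1234, print only the first
# dotted-quad-looking substring instead of the whole field.
extract_ip_awk() {
    awk '/marker-1234/ && match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) {
        # RSTART and RLENGTH describe the text that match() just found
        print substr($0, RSTART, RLENGTH)
    }' "$@"
}
```

For the tally counts mentioned in the question, the output can be piped on: extract_ip_awk filter_log | sort | uniq -c.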
You can shorten the second grep a little, like this:
grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}'
To answer your first question, double quotes allow the shell to do various things like variable expansion, but protect some metacharacters from needing to be escaped. Single quotes prevent the shell from doing those expansions. Using no quotes leaves things wide open: the shell also splits the text into words and expands wildcards.
$ empty=""
$ text1="some words"
$ grep $empty some_file
(It seems to hang, but it's just waiting for input since it thinks "some_file" is
the pattern and no filename was entered, so it thinks input is supposed to come
from standard input. Press Ctrl-d to end it.)
$ grep "$empty" some_file
(The whole file is shown since a null pattern matches everything.)
$ grep $text1 some_file
grep: words: No such file or directory
some_file:something
some_file:some words
(The shell splits the contents of the variable into two words: the first is
taken as the pattern, the second as a file name, and some_file as a second file name.)
$ grep "$text1" some_file
some words
(Expected results.)
$ grep '$text1' some_file
(No results. The variable isn't expanded and the file doesn't contain a
string that consists of literally those characters (a dollar sign followed
by "text1"))
You can learn more in the "QUOTING" section of man bash.
Look up the xargs command. You should be able to do something like:
grep 'marker-1234' filter_log|xargs grep "("|cut -c1-15
This may not be it exactly, but xargs is the command you want to use.
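For context, xargs turns the lines on its standard input into arguments for the command it runs, so it fits best when the pipe carries a list of file names rather than the log lines themselves. Below is a minimal sketch with throwaway files; the function name and the file contents are made up for illustration.

```shell
# Hypothetical demo: xargs hands each input line to grep as a file-name argument.
demo_xargs() {
    demo_dir=$(mktemp -d) || return 1
    printf 'marker-1234 (10.0.0.1)\n' > "$demo_dir/a.log"
    printf 'nothing here\n' > "$demo_dir/b.log"
    # grep receives both paths as arguments; -h suppresses the file-name prefix
    printf '%s\n' "$demo_dir/a.log" "$demo_dir/b.log" | xargs grep -h 'marker-1234'
    rm -rf "$demo_dir"
}
```

When the matching lines are already on the pipe, as with grep 'marker-1234' filter_log, a plain pipe into the next filter is enough.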