bash - Is there a way to print 2 lines (if any) before and after a target or print a placeholder if not?

06
2014-04
  • bonimba3

    I'm not a native English speaker, so I hope this is clear.

    I know that grep -C 2 "TARGET" inputfile selects the 2 rows before and after the row containing TARGET, but I can't make it solve my problem. My files are structured like this:

    1 0 value1 value2 value3
    2 H value1 value2 value3
    3 H value1 value2 value3
    4 H value1 value2 value3
    5 H value1 value2 value3
    6 0 value1 value2 value3
    7 0 value1 value2 value3
    8 H value1 value2 value3
    9 0 value1 value2 value3
    

    and so on for many more rows. The output I need is a file like this:

    X X X X X
    1 0 value1 value2 value3
    2 H value1 value2 value3 *
    3 H value1 value2 value3 
    4 H value1 value2 value3
    
    1 0 value1 value2 value3
    2 H value1 value2 value3
    3 H value1 value2 value3 *
    4 H value1 value2 value3 
    5 H value1 value2 value3
    
    ... all the other till
    
    6 0 value1 value2 value3
    7 0 value1 value2 value3
    8 H value1 value2 value3 *
    9 0 value1 value2 value3
    X X X X X
    

    where the TARGET is "H", * marks the selected row (I don't need the * in the output file), and the X rows are placeholders that pad a block when fewer than 2 rows exist before or after the target. I also tried awk and sed, without success.

  • Answers
  • rici

    The same approach as Glenn Jackman's, but with a circular buffer instead of rotating the buffer on every input:

    awk -v N=2 -v TARGET=" H " -v PLACE="X X X X X" '
      function check(n, s,     i) {
        a[n%NN]=s
        if (n>N&&a[(n-N)%NN]~TARGET) {
          for (i=n+1;i<=n+NN;++i)
            print a[i%NN]
          print ""
        }
      }
    
      BEGIN{
        NN=2*N+1
        a[0]=PLACE
        for (i=1;i<=N;++i) { getline a[i]; a[i+N]=PLACE }
      }
    
      { check(NR,$0) }
    
      END{
        for (i=NR+1;i<=NR+N;++i) check(i,PLACE)
      }' inputfile
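
The circular-buffer indexing used above can be seen in isolation in a smaller sketch. It is not part of rici's answer: last_lines is a hypothetical helper that keeps only the most recent N lines by writing each line into slot NR % N, so nothing is ever shifted.

```shell
# Hypothetical helper: print the last N lines of stdin using a circular buffer.
last_lines() {
  awk -v N="$1" '
    { buf[NR % N] = $0 }               # slot NR % N is reused every N lines
    END {
      start = (NR > N) ? NR - N + 1 : 1
      for (i = start; i <= NR; ++i)    # walk line numbers from oldest to newest
        print buf[i % N]
    }'
}

printf '%s\n' a b c d e | last_lines 3
```

The answer's check() function applies the same modulo trick with a window of 2*N+1 slots and tests the middle slot against TARGET.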
    
  • glenn jackman

    This will get you most of the way there:

    awk -v n=2 -v target=" H " '
        BEGIN {
            lines[0]=""
            for (i=1; i<=n; i++) {
                lines[i]="X X X X X"
                getline; lines[n+i]=$0
            }
        }
        function rotate(i) {
            for (i=1; i<=n*2; i++) 
                lines[i-1] = lines[i]
            lines[n*2]=$0
        }
        function check(i) {
            if (lines[n] ~ target) {
                for (i=0; i<=n*2; i++) 
                    print lines[i]
                print ""
            }
        }
        { rotate(); check() }
        END {
            for (i=1; i<=n; i++) {
                $0 = "X X X X X"; rotate(); check()
            }
        }
    ' inputfile
    

    outputs

    X X X X X
    1 0 value1 value2 value3
    2 H value1 value2 value3
    3 H value1 value2 value3
    4 H value1 value2 value3
    
    1 0 value1 value2 value3
    2 H value1 value2 value3
    3 H value1 value2 value3
    4 H value1 value2 value3
    5 H value1 value2 value3
    
    2 H value1 value2 value3
    3 H value1 value2 value3
    4 H value1 value2 value3
    5 H value1 value2 value3
    6 0 value1 value2 value3
    
    3 H value1 value2 value3
    4 H value1 value2 value3
    5 H value1 value2 value3
    6 0 value1 value2 value3
    7 0 value1 value2 value3
    
    6 0 value1 value2 value3
    7 0 value1 value2 value3
    8 H value1 value2 value3
    9 0 value1 value2 value3
    X X X X X
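
For files small enough to hold in memory, the same blocks can be produced without any buffer bookkeeping. This is only a sketch of an alternative, not the method of either answer: context_blocks is a hypothetical name, and the defaults (N=2, target " H ", placeholder "X X X X X") are taken from the question.

```shell
# Hypothetical helper: context_blocks FILE [N] [TARGET_REGEX] [PLACEHOLDER]
context_blocks() {
  awk -v N="${2:-2}" -v TARGET="${3:- H }" -v PLACE="${4:-X X X X X}" '
    { line[NR] = $0 }                        # keep every line, indexed from 1
    END {
      for (t = 1; t <= NR; ++t) {
        if (line[t] !~ TARGET) continue      # only rows matching the target
        for (i = t - N; i <= t + N; ++i)     # window of N rows on each side
          print ((i >= 1 && i <= NR) ? line[i] : PLACE)
        print ""                             # blank line between blocks
      }
    }' "$1"
}
```

Calling it as context_blocks inputfile should print one block per matching row, padded with the placeholder wherever the window runs past the start or end of the file. The trade-off is memory: the whole file is stored before anything is printed.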
    

  • Related Question

    sed - grepping a substring from a grep result
  • user17245

    Given a log file, I will usually do something like this:

    grep 'marker-1234' filter_log
    

    What is the difference between using '', "", or no quotes around the pattern?

    The above grep command yields many thousands of lines, which is what I want. Within those lines there is usually one chunk of data I am after. Sometimes I use awk to print the fields I need, but in this case the log format changes, so I can't rely on position alone; besides, the logged data itself can push the field positions forward.

    To make this concrete, let's say the log line contains an IP address and that is all I am after, so that I can later pipe it to sort and uniq and get some tally counts.

    An example may be:

    2010-04-08 some logged data, indetermineate chars - [marker-1234] (123.123.123.123) from: [email protected] to [email protected] [stat-xyz9876]
    

    The first grep command gives me many thousands of lines like the one above. From there, I want to pipe them to something, probably sed, that can pull out a pattern within each line and print only the matching text.

    For this example, extracting the IP address would suffice. I tried, but is sed not able to understand [0-9]{1,3}. as a pattern? I had to write [0-9][0-9][0-9]. instead, which yielded strange results until the entire pattern was in place.

    This is not specific to IP addresses; the pattern will change, but I can use this as a learning template.

    Thank you all.


  • Related Answers
  • Chris S

    I don't know what OS you're on, but on FreeBSD 7.0+ grep has a -o option that returns only the part of the line that matches the pattern. So you could run:
    grep "marker-1234" filter_log | grep -oE "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"

    This returns a list of just the IP addresses from filter_log.

    This works on my system, but again, I don't know what your version of grep supports.
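
On the sed part of the question: by default sed uses basic regular expressions, where the interval has to be written \{1,3\}, and an unescaped . matches any character rather than a literal dot. With -E (accepted by GNU and BSD sed) the {1,3} form works as written. The sketch below shows both spellings; the function names are hypothetical, and it assumes the address is preceded by a non-digit character, as the "(" in the sample line is. If a line holds several addresses, the greedy .* makes it print the last one.

```shell
# Hypothetical helpers: print one dotted quad per matching input line.

# Extended syntax: {1,3} and ( ) need no backslashes.
extract_ip() {
  sed -nE 's/.*[^0-9]([0-9]{1,3}(\.[0-9]{1,3}){3}).*/\1/p'
}

# Basic syntax (sed's default): the same pattern with \{ \} and \( \).
extract_ip_bre() {
  sed -n 's/.*[^0-9]\([0-9]\{1,3\}\(\.[0-9]\{1,3\}\)\{3\}\).*/\1/p'
}
```

A possible use, following the question's plan to tally the results: grep 'marker-1234' filter_log | extract_ip | sort | uniq -c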

  • user31894

    You can do all of this in a single awk command, with no need for any other tools:

    $ awk '/marker-1234/{for(o=1;o<=NF;o++){if($o~/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/)print $o }  }' file
    (123.123.123.123)
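
The command above prints the whole field, parentheses included. If only the address itself is wanted, a variation using awk's match() and substr() might look like the sketch below. extract_with_awk is a hypothetical name, and the marker is hard-coded from the question.

```shell
# Hypothetical helper: print only the matched text, not the surrounding field.
extract_with_awk() {
  awk '/marker-1234/ && match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) {
         # match() sets RSTART and RLENGTH to the position and length of the match
         print substr($0, RSTART, RLENGTH)
       }' "$@"
}
```

It reads the named files, or standard input when none are given, so both extract_with_awk filter_log and some_command | extract_with_awk should work.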
    
  • Dennis Williamson

    You can shorten the second grep a little like this:

    grep -Eo '([0-9]{1,3}\.){3}[0-9]{1,3}'
    

    To answer your first question, double quotes allow the shell to do various things like variable expansion, but protect some metacharacters from needing to be escaped. Single quotes prevent the shell from doing those expansions. Using no quotes leaves things wide open.

    $ empty=""
    $ text1="some words"
    $ grep $empty some_file
    (It seems to hang, but it's just waiting for input since it thinks "some_file" is 
    the pattern and no filename was entered, so it thinks input is supposed to come
    from standard input. Press Ctrl-d to end it.)
    $ grep "$empty" some_file
    (The whole file is shown since a null pattern matches everything.)
    $ grep $text1 some_file
    grep: words: No such file or directory
    some_file:something
    some_file:some words
    (It sees the contents of the variable as two words, the first is seen as the 
    pattern, the second as one file and the filename as a second file.)
    $ grep "$text1" some_file
    some_file:some words
    (Expected results.)
    $ grep '$text1' some_file
    (No results. The variable isn't expanded and the file doesn't contain a
    string that consists of literally those characters (a dollar sign followed
    by "text1"))
    

    You can learn more in the "QUOTING" section of man bash.
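
The same three cases can be tried without grep or a test file. In this sketch show_args is a hypothetical helper that prints each argument it receives in angle brackets, which makes the word splitting visible; it assumes the default IFS.

```shell
# Hypothetical helper: print every argument on its own line, wrapped in < >.
show_args() {
  printf '<%s>\n' "$@"
}

text1="some words"
show_args $text1      # unquoted: the shell splits the value into two arguments
show_args "$text1"    # double quotes: expanded, but kept as one argument
show_args '$text1'    # single quotes: no expansion, the literal text $text1
```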

  • Josh K

    Look up the xargs command. You should be able to do something like:

    grep 'marker-1234' filter_log|xargs grep "("|cut -c1-15

    This may not be it exactly, but xargs is the command you want to use.