Answers: shell script - how to find number of substrings between delimiter?

Dinesh Dabhi

I have the following string:

something(1)^^^something(2)^^^something(3)^^^ ... ^^^something(n)

How can I find the number of something(s) in the string?

MariusMatutiae

This command will do it for you:

 awk -F " " '{print NF}' filename

and you can substitute your favorite field separator for the space. If you insist on using ^^^ as the field separator, then you should use

 awk -F '\\^\\^\\^' '{print NF}' filename
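To make the ^^^ case concrete, here is a minimal sketch that feeds a single string to awk instead of a file. The function name count_fields is a hypothetical helper introduced only for illustration, and the sample string is shortened from the one in the question.

```shell
# count_fields is a hypothetical helper name, not part of the original answer.
count_fields() {
    # Treat the literal sequence ^^^ as the field separator and
    # print the number of fields (NF) for each input line.
    printf '%s\n' "$1" | awk -F '\\^\\^\\^' '{ print NF }'
}

count_fields 'something(1)^^^something(2)^^^something(3)'   # prints 3
```

Note that NF is reported per line, so a multi-line file yields one count per line rather than a single total.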

terdon

I can give you a better answer if you show your real data, but assuming you are looking for the longest run of non-whitespace characters that ends with a non-empty pair of parentheses, you could do this:

$ string="foo(bar)blah blah bob harry(baz) more stuff this(one) not that one ()"
$ echo "$string" | grep -Po '[^\s]+\([^\)]+?\)' | wc -l
3

Explanation

The -P flag enables Perl-compatible regular expressions (PCRE), and -o makes grep print only the matching strings, each on a separate line.

The regular expression matches:

[^\s]+ : as many non-whitespace characters as possible
\( : an opening parenthesis
[^\)]+?\) : one or more characters other than ), up to and including the first ).

This would print:

$ echo "$string" | grep -Po '[^\s]+\([^\)]+?\)'
foo(bar)
harry(baz)
this(one)

You then pass that through wc -l to count the number of lines.
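If your grep lacks -P, the same idea can be sketched with extended regular expressions instead. This is an assumption-laden variant, not the command from the answer: it relies on -o, which is widely available but not required by POSIX, and count_calls is a hypothetical helper name.

```shell
# count_calls is a hypothetical helper name used only for this sketch.
count_calls() {
    # -E: extended regular expressions; -o: print each match on its own line.
    # [^[:space:]]+  one or more non-whitespace characters
    # \(             a literal opening parenthesis
    # [^)]+\)        one or more non-) characters, then the closing )
    printf '%s\n' "$1" | grep -Eo '[^[:space:]]+\([^)]+\)' | wc -l
}

string="foo(bar)blah blah bob harry(baz) more stuff this(one) not that one ()"
count_calls "$string"
```

For the sample string this should report the same three matches as the PCRE version, although some wc implementations pad the number with leading spaces.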

bash - how to use grep, sed, and awk to parse tags?

bash grep sed awk

mechko

I want to write a script that finds an open/close tag pair in a text file and prepends a fixed string to each line between the pair. I figure I could use grep to find the tag line numbers and either awk or sed to add the prefix, but I'm not sure exactly how to do it.

Can someone help?

Related Answers

mpez0

In awk:

BEGIN                  {noprefix="true"}
/<close tag regex>/    {noprefix="true"}
noprefix=="false"      {print "prefix", $0}
noprefix=="true"       {print $0}
/<open tag regex>/     {noprefix="false"}
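As a rough, concrete version of the same idea, the sketch below wraps the awk program in a shell function. The tag names <note> and </note>, the helper name prefix_between, and the "> " prefix are all placeholder assumptions, since the question does not say what the real tags look like.

```shell
# prefix_between is a hypothetical helper; the tags and prefix are assumptions.
prefix_between() {
    awk -v prefix="$1" '
        /<\/note>/ { inside = 0 }             # close tag: stop prefixing
        inside     { print prefix $0; next }  # line between the tags
                   { print }                  # tag lines and everything else
        /<note>/   { inside = 1 }             # open tag: prefix the lines after it
    '
}

printf '%s\n' 'before' '<note>' 'one' 'two' '</note>' 'after' | prefix_between '> '
```

Only the lines strictly between the tags get the prefix; the tag lines themselves are printed unchanged. It assumes each tag sits on its own line and that the pairs are not nested.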

9tat

This should be done with one of the traditional syntax-aware tools (yacc, etc.). Doing it with grep and the like may be okay for specific cases, but regular expressions are simply not powerful enough to catch the subtleties of HTML.

user18151

You should consider using yacc for this. It is NOT possible to do this with sed, awk or grep without a considerable amount of effort. As for learning yacc, it shouldn't take more time than learning sed/awk/grep did, and it will be really easy that way.
