shell script - how to find number of substrings between delimiter?
2014-04
I have the following string:
something(1)^^^something(2)^^^something(3)^^^ ... ^^^something(n)
How can I find the number of something(s) in the string?
This command will do it for you:
awk -F " " '{print NF}' filename
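and you can substitute your favorite field separator for the space. If you insist on using ^^^ as a field separator, each caret has to be escaped, because awk treats a multi-character separator as a regular expression in which ^ is an anchor. In that case you should use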
awk -F '\\^\\^\\^' "{print NF}" filename
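When the data is in a shell variable rather than a file, the same count can be had by piping the string into awk. The sketch below is a variant, not the command above: it uses awk's split() with a regex literal instead of -F, which sidesteps the double escaping. The function name count_fields and the sample string are made up for illustration.

```shell
# Hypothetical helper: count the pieces of a string separated by a literal ^^^.
count_fields() {
    # split() returns the number of pieces it produced;
    # each caret is escaped so it is not read as an anchor.
    printf '%s\n' "$1" | awk '{ print split($0, parts, /\^\^\^/) }'
}

count_fields 'something(1)^^^something(2)^^^something(3)'   # prints 3
```

A string with no ^^^ in it counts as one piece, and an empty string as zero.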
I can give you a better answer if you show your real data, but assuming you are looking for the longest non-whitespace strings that end with a non-empty pair of parentheses, you could do this:
$ string="foo(bar)blah blah bob harry(baz) more stuff this(one) not that one ()"
$ echo "$string" | grep -Po '[^\s]+\([^\)]+?\)' | wc -l
3
Explanation
The -P flag for grep enables PCREs (Perl Compatible Regular Expressions), and -o causes it to print only the matching strings, each on a separate line.
The regular expression matches:
[^\s]+ : as many non-whitespace characters as possible
\( : an opening parenthesis
[^\)]+?\) : one or more non-) characters, up to the first )
This would print:
$ echo "$string" | grep -Po '[^\s]+\([^\)]+?\)'
foo(bar)
harry(baz)
this(one)
You then pass that through wc -l to count the number of lines.
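The -P flag is a GNU grep extension, so it may not be available everywhere. A rough equivalent with extended regular expressions (-E) could look like the sketch below. It assumes a grep that supports -o, as GNU and BSD grep do. The function name count_calls is made up for illustration, and [[:space:]] stands in for \s.

```shell
# Hypothetical helper: count non-whitespace runs followed by a non-empty (...) group.
count_calls() {
    # -o prints each match on its own line; wc -l counts those lines;
    # tr strips the padding that some wc implementations add.
    printf '%s\n' "$1" | grep -oE '[^[:space:]]+\([^)]+\)' | wc -l | tr -d ' '
}

string="foo(bar)blah blah bob harry(baz) more stuff this(one) not that one ()"
count_calls "$string"   # prints 3
```

The lazy +? is not needed here, because [^)]+ cannot run past the first closing parenthesis anyway.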
I want to write a script that finds an open/close tag pair in a text file and prepends a fixed string to each line between the pair. I figure I could use grep to find the tag line numbers and either awk or sed to place the prefix; however, I'm not sure exactly how to do it.
Can someone help?
In awk:
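BEGIN {noprefix="true"}                 # start outside any tag pair
/<close tag regex>/ {noprefix="true"}   # the close tag line itself is not prefixed
noprefix=="false" {print "prefix", $0}  # between the tags: prepend the string
noprefix=="true" {print $0}             # everywhere else: print unchanged
/<open tag regex>/ {noprefix="false"}   # start prefixing on the next line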
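To make the rule ordering concrete, here is a minimal runnable sketch of the same idea. The tags <pre> and </pre>, the prefix "> " and the function name prefix_between are assumptions chosen for illustration, and a numeric flag replaces the "true"/"false" strings.

```shell
# Hypothetical example: prefix every line between <pre> and </pre> with "> ".
prefix_between() {
    awk '
        /<\/pre>/ { inside = 0 }            # close tag: stop before printing this line
        inside    { print "> " $0; next }   # between the tags: add the prefix
                  { print }                 # outside the tags, and the tag lines
        /<pre>/   { inside = 1 }            # open tag: prefix from the next line on
    '
}

printf '%s\n' 'before' '<pre>' 'one' 'two' '</pre>' 'after' | prefix_between
```

With that input, only the lines one and two come out prefixed. The tag lines stay unprefixed because the close-tag rule runs before the printing rules and the open-tag rule runs after them.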
It should be done with one of the traditional syntax-aware tools (yacc, etc.). Doing it with grep and the like may be okay for specific cases, but regular expressions are simply not powerful enough to catch the subtleties of HTML.
You should consider using yacc for it. It is NOT possible to do this with sed, awk, or grep without a considerable amount of effort. As for learning yacc, it wouldn't take more time than learning sed/awk/grep did, and it will be really easy that way.