regex - Which regular expressions (POSIX or extended) are allowed in awk?
2014-07
I'm wondering which kind of regular expressions (POSIX basic, as in grep, or extended, as in egrep) I am using when I type regular expressions in awk.
In AWK you can use extended regular expressions (as in egrep). From http://www.math.utah.edu/docs/info/gawk_5.html:
The regular expressions in awk are a superset of the POSIX specification for Extended Regular Expressions (EREs).
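As a rough illustration of the point above, the following sketch uses alternation, grouping, and `+`, which are ERE features, in a plain awk pattern with no extra flags. The sample input lines are made up for the example.

```shell
# Hypothetical sample input; only lines of the form
# "error <number>" or "warning <number>" should match.
matches=$(printf '%s\n' 'error 42' 'warning 7' 'info' 'error x' |
  awk '/^(error|warning) [0-9]+$/')

# Print the matching lines.
printf '%s\n' "$matches"
```

With basic regular expressions (plain grep without -E), the `|`, `(`, `)` and `+` would be treated as literal characters, so the same pattern would not behave this way.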
I have the following regular expression:
([:digit:]{4})-([:digit:]{1,2})-([:digit:]{1,2})
It should get dates in this format:
2010-12-19
And I am using it on filenames that look like this:
2010-12-19-xxx-xxx-xxx.markdown
And when I use it with grep like this:
echo $POST | grep -oE "([:digit:]{4})-([:digit:]{1,2})-([:digit:]{1,2})" # $POST is the filename
It doesn't work; I just get empty output.
Try this:
echo $POST | grep -oE "[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}"
If I try it here, I get:
[andys@daedalus ~]$ echo "2010-12-19-aaa-bbb-ccc-ddd.markdown" | grep -oE "[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}"
2010-12-19
Hope that's what you're looking for.
Andy's answer is fine, but if you want something closer to your original syntax, you could try:
echo $POST | grep -oE "([[:digit:]]{4})-([[:digit:]]{1,2})-([[:digit:]]{1,2})"
You need extended regular expressions here (grep -E, which is equivalent to egrep, so combining egrep with -E is redundant), and the double brackets for character classes.
You don't need the parentheses, but you do need more square brackets. Character classes have the same characteristics as individual characters. Just as you might search for vowels like this: [aeiou], or digits like this: [0123456789] or this: [0-9], you need to enclose a class such as [:digit:] or [:upper:] in a bracket expression as well: [[:xdigit:]] (hex digits).
grep -oE "[[:digit:]]{4}-[[:digit:]]{1,2}-[[:digit:]]{1,2}"
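Putting the corrected pattern together with the filename from the question, a minimal sketch might look like this. The variable name date_part is made up for the example, and -o is a GNU/BSD grep extension rather than a POSIX option, so this assumes one of those implementations.

```shell
# Filename taken from the question; quoted so the shell passes it through unchanged.
POST='2010-12-19-xxx-xxx-xxx.markdown'

# [[:digit:]] is the class [:digit:] placed inside a bracket expression.
# -E enables extended regular expressions, -o prints only the matched text.
date_part=$(printf '%s\n' "$POST" |
  grep -oE '[[:digit:]]{4}-[[:digit:]]{1,2}-[[:digit:]]{1,2}')

printf '%s\n' "$date_part"
```

Single quotes around the pattern keep the shell from interpreting any of its characters, and quoting "$POST" avoids word splitting if a filename ever contains spaces.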