regex - Which regular expressions (POSIX or extended) are allowed in awk?

2014-07-08
  • Zen

    I'm wondering which kind of regular expressions (POSIX, as in grep, or extended, as in egrep) I am using when I type regular expressions in awk.

  • Answers
  • Matteo

    In AWK you can use extended regular expressions (as in egrep). From http://www.math.utah.edu/docs/info/gawk_5.html:

    The regular expressions in awk are a superset of the POSIX specification for Extended Regular Expressions (EREs).
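
    As a quick check of the claim above, here is a minimal sketch that uses operators found in EREs but not in basic regular expressions (unescaped alternation, grouping and +). The sample words and the variable name awk_hits are made up for illustration, and the behaviour is assumed to hold for any POSIX-conforming awk.

    ```shell
    # Alternation (|), grouping and + are ERE operators; awk takes them unescaped.
    # A pattern with no action prints every matching line.
    awk_hits=$(printf '%s\n' 'foo' 'bar' 'baaar' 'baz' | awk '/^(foo|ba+r)$/')
    printf '%s\n' "$awk_hits"
    ```

    This should print foo, bar and baaar, but not baz.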


  • Related Question

    regex - Regular expression and grep not working
  • Wuffers

    I have the following regular expression:

    ([:digit:]{4})-([:digit:]{1,2})-([:digit:]{1,2})
    

    It should get dates in this format:

    2010-12-19
    

    And I am using it on filenames that look like this:

    2010-12-19-xxx-xxx-xxx.markdown
    

    And, when I use it with grep like this:

    echo $POST | grep -oE "([:digit:]{4})-([:digit:]{1,2})-([:digit:]{1,2})" # $POST is the filename
    

    It doesn't work; I just get empty output.


  • Related Answers
  • Andy Smith

    Try this:-

    echo $POST | grep -oE "[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}"
    

    If I try it here, I get:-

    [andys@daedalus ~]$ echo "2010-12-19-aaa-bbb-ccc-ddd.markdown" | grep -oE "[0-9]{4}-[0-9]{1,2}-[0-9]{1,2}"
    2010-12-19
    

    Hope that's what you're looking for.

  • frabjous

    Andy's answer is fine, but if you want something closer to your original syntax, you could try:

    echo $POST | grep -oE "([[:digit:]]{4})-([[:digit:]]{1,2})-([[:digit:]]{1,2})"

    You need extended regular expressions here (grep -E, or equivalently egrep), and the double brackets for character classes.

  • Dennis Williamson

    You don't need the parentheses, but you do need more square brackets. A character class is used the same way as an individual character: just as you might search for vowels with [aeiou], or digits with [0123456789] or [0-9], you need to enclose a class such as [:digit:] or [:upper:] in a bracket expression as well, for example [[:xdigit:]] (hex digits).

    grep -oE "[[:digit:]]{4}-[[:digit:]]{1,2}-[[:digit:]]{1,2}"
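
    To illustrate the point that a class name is just one item inside a bracket expression, here is a small sketch that puts two classes and a literal underscore in the same brackets. The sample strings and the variable name class_hits are hypothetical; only grep -E is assumed.

    ```shell
    # [[:upper:][:digit:]_] lists two classes and a literal underscore
    # inside a single bracket expression.
    class_hits=$(printf '%s\n' 'ABC_123' 'abc' 'X9' 'x9' | grep -E '^[[:upper:][:digit:]_]+$')
    printf '%s\n' "$class_hits"
    ```

    Only ABC_123 and X9 should come back, since the other two lines contain lowercase letters.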