BUG #1687: Regular expression problem (II)
Halley Pacheco de Oliveira

2005-05-31, 9:24 am


The following bug has been logged online:

Bug reference: 1687
Logged by: Halley Pacheco de Oliveira
Email address: halleypo@yahoo.com.br
PostgreSQL version: 7.4 and 8.0
Operating system: Linux and Windows
Description: Regular expression problem (II)
Details:

Maybe it would be easier to see the problem I'm having with regular
expressions this way:

SELECT '192.168.0.15' SIMILAR TO
'([[:alnum:]_-]+).([[:alnum:]_-]+).([[:alnum:]_]+)';
?column?
----------
t

SELECT '192.168.0.15' SIMILAR TO '([\\w-]+).([\\w-]+).([\\w]+)';
?column?
----------
f

SELECT '192.168.0.15' ~
'^([[:alnum:]_-]+)\\.([[:alnum:]_-]+)\\.([[:alnum:]_]+)$';
?column?
----------
f

SELECT '192.168.0.15' ~ '^(([[:alnum:]_-]+)\\.){2}([[:alnum:]_]+)$';
?column?
----------
f

SELECT '192.168.0.15' ~ '^([\\w-]+)\\.([\\w-]+)\\.([\\w]+)$';
?column?
----------
f

SELECT '192.168.0.15' ~ '^(([\\w-]+)\\.){2}([\\w]+)$';
?column?
----------
f

Why does the first query give a different output? Isn't it exactly the
same as the second query, and similar to the others?

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

Tom Lane

2005-05-31, 11:24 am

"Halley Pacheco de Oliveira" <halleypo@yahoo.com.br> writes:
> Maybe it would be easier to see the problem I'm having with regular
> expressions this way:

> SELECT '192.168.0.15' SIMILAR TO
> '([[:alnum:]_-]+).([[:alnum:]_-]+).([[:alnum:]_]+)';
> ?column?
> ----------
> t


> SELECT '192.168.0.15' SIMILAR TO '([\\w-]+).([\\w-]+).([\\w]+)';
> ?column?
> ----------
> f


SIMILAR TO patterns are required to match the whole data string; so
the above fails because it only matches 3 digit groups not 4. The
others all fail because you put explicit ^ and $ into them.

The reason the first one works is that you put _ into the pattern, which
means "match anything" in SIMILAR-TO land; so it gets translated to "."
to be fed to the regular regexp engine. (Arguably that should not
happen inside square brackets, but similar_escape() isn't smart enough
to distinguish.) And that makes it possible for one of the
[]-expressions to match two digit groups plus the intervening dot.

regards, tom lane
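
As an illustration of the explanation above, here is a minimal Python sketch of the translation step. It is not PostgreSQL's similar_escape() code: naive_similar_to_regex and similar_to are hypothetical helpers that assume the translation simply turns every _ into . (and % into .*), brackets included, and anchors the result so the whole string must match. Python's re module has no [:alnum:] class, so A-Za-z0-9 stands in for it here.

```python
import re


def naive_similar_to_regex(pattern: str) -> str:
    """Simplified, hypothetical model of the SIMILAR TO translation.

    Every '_' becomes '.' and every '%' becomes '.*', even inside
    bracket expressions. The result is anchored because SIMILAR TO
    must match the whole data string.
    """
    translated = pattern.replace("_", ".").replace("%", ".*")
    return "^(?:" + translated + ")$"


def similar_to(value: str, pattern: str) -> bool:
    """Return True if value matches pattern under the naive model."""
    return re.search(naive_similar_to_regex(pattern), value) is not None


# Stand-in for [:alnum:], which Python's re module does not support.
ALNUM = "A-Za-z0-9"

# First query: the underscores inside the brackets get rewritten to dots.
first = f"([{ALNUM}_-]+).([{ALNUM}_-]+).([{ALNUM}_]+)"

# Second query: no underscore in the pattern, so nothing is rewritten.
second = r"([\w-]+).([\w-]+).([\w]+)"

print(naive_similar_to_regex(first))
print(similar_to("192.168.0.15", first))   # True
print(similar_to("192.168.0.15", second))  # False
```

Under these assumptions the first pattern's bracket expressions become [A-Za-z0-9.-], which accepts a dot, so a single group can swallow "192.168". The \w version has no underscore to translate, and its two wildcard dots cannot cover the three dots in the address.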
