[relaxng-user] Implementing regular expression

jcowan at reutershealth.com
Thu May 6 11:18:03 ICT 2004


David Tolpin scripsit:

> I am not sure it is important, or even usable,
> because Posix regexps are not Unicode-aware,
> and because implementing full XML Schema regular expressions
> took just one evening (and under one thousand lines in C).

Hmm.  DFA or NFA?  One advantage of not allowing extensions to regularity
is that the compactness and speed of a DFA become available.
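
To make the DFA point concrete, here is a minimal sketch of a table-driven
matcher. The pattern a(b|c)*d, the state numbering, and the names TRANSITIONS,
ACCEPTING and dfa_match are all made up for illustration; none of them come
from any implementation discussed in this thread. The point is only that a
strictly regular pattern can be run with one table lookup per input character
and no backtracking.

```python
# Hypothetical hand-built DFA for the regular expression a(b|c)*d.
# States: 0 = start, 1 = seen 'a' (loops on 'b'/'c'), 2 = accepting.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 1,
    (1, "c"): 1,
    (1, "d"): 2,
}
ACCEPTING = {2}


def dfa_match(text: str) -> bool:
    """Return True if the whole of text is accepted by the DFA."""
    state = 0
    for ch in text:
        # One dictionary lookup per character; a missing entry is a dead state.
        state = TRANSITIONS.get((state, ch))
        if state is None:
            return False
    return state in ACCEPTING
```

Matching time is linear in the input length and the only memory needed at run
time is the current state, which is what is lost once non-regular extensions
(backreferences, for instance) are admitted.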

W3C Schema regexes are broken in a number of ways: they are
massively incomplete, they provide facilities (like matching
by block name) that look useful but are actually very dangerous --
e.g. \p{IsSinhala} does not match all characters required for
Sinhalese text -- and generally should be replaced by something
more general and useful for RNG purposes.
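
One plausible illustration of the block-name problem (the message above does
not say which characters it has in mind, so this is an assumption): Sinhala
conjuncts are commonly written with ZERO WIDTH JOINER (U+200D), which lives in
the General Punctuation block, not the Sinhala block. Python's re module has no
block-name escapes, so the sketch below spells the Sinhala block out as the
range U+0D80..U+0DFF; the names SINHALA_BLOCK, SRI and only_block_chars are
hypothetical.

```python
import re

# Assumption: the Sinhala block written as an explicit range, standing in
# for a block-name escape that Python's re module does not provide.
SINHALA_BLOCK = re.compile(r"[\u0D80-\u0DFF]+")

# The word "Sri": SHA, AL-LAKUNA, ZERO WIDTH JOINER, RA, vowel sign II.
SRI = "\u0DC1\u0DCA\u200D\u0DBB\u0DD3"


def only_block_chars(text: str) -> bool:
    """Return True if every character of text lies in the Sinhala block."""
    return SINHALA_BLOCK.fullmatch(text) is not None


def outside_block(text: str) -> list:
    """List the code points of text that fall outside the Sinhala block."""
    return [f"U+{ord(ch):04X}" for ch in text if not SINHALA_BLOCK.fullmatch(ch)]
```

Here only_block_chars(SRI) is False and outside_block(SRI) reports the joiner,
so a schema that constrains a value to the block alone would reject ordinary
text.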

-- 
John Cowan  www.ccil.org/~cowan  www.reutershealth.com  jcowan at reutershealth.com
Mr. Henry James writes fiction as if it were a painful duty.  --Oscar Wilde

