[relaxng-user] Documentation tokenization

Jeff Rafter lists at jeffrafter.com
Mon May 3 21:35:08 ICT 2004


Bob, you are my hero : )

> > I could also add that I am slightly confused by the additional [^Chars]
> > productions. Does this mean that those chars, if encountered, end the
> > production? Or that they may not appear in that position (or both?).
>
> None of the above. E.g., [^"&newline;] means it matches any character
> except  " or &newline;.

I get it I think. So for example, in the separator production we see:

| "#"  [^&newline;#]  restOfLine

I now understand the "#" in there. Originally I thought &newline; was
redundant because restOfLine doesn't allow these characters anyway. I was
thinking the following would be equivalent:

| "#"  [^#]  restOfLine

But now, based on what you said, I see that these are *not* equivalent. The
plain text translation of the second form would be, "...or you can have a
"#" character. If you have a "#" character it can be followed by _any_
character except #, then it will be followed by the rest of the line." Of
course, _any_ character would include the newline and #xA, which would be
erroneous if followed by the rest of the line. This is why you need to
include those characters in the exception.
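
For the archives, here is a rough sketch of the difference using Python
regular expressions rather than the notation of the specification. It rests
on assumptions of mine: &newline; is modelled as the single character "\n",
restOfLine is modelled as "any run of non-newline characters", and the names
with_newline_excluded, without_newline_excluded and matched_text are made up
for illustration.

```python
import re

# Assumption: restOfLine is approximated as a run of non-newline characters,
# and &newline; is approximated as the single character "\n".
REST_OF_LINE = r"[^\n]*"

# Models:  "#"  [^&newline;#]  restOfLine
with_newline_excluded = re.compile(r"#[^\n#]" + REST_OF_LINE)

# Models:  "#"  [^#]  restOfLine
without_newline_excluded = re.compile(r"#[^#]" + REST_OF_LINE)


def matched_text(pattern, text):
    """Return the text matched at the start of `text`, or None if no match."""
    m = pattern.match(text)
    return m.group(0) if m else None


# An ordinary comment: both forms stop at the end of the line.
ordinary = "# a comment\nelement foo { text }"
print(repr(matched_text(with_newline_excluded, ordinary)))
print(repr(matched_text(without_newline_excluded, ordinary)))

# A bare "#" followed directly by a newline: the two forms disagree.
bare = "#\nelement foo { text }"
print(repr(matched_text(with_newline_excluded, bare)))
print(repr(matched_text(without_newline_excluded, bare)))
```

In this model the second form lets the newline itself be consumed by [^#],
so the "rest of the line" that follows is actually the next line of the
schema. The first form simply does not match at that point.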

This all now seems plainly obvious, and I include it here mainly for the
sake of the archives. But for some reason I couldn't make it click without
your explanation. I don't know if this can be clarified with an example in
the specification at some point (or if it even needs to be).

Thanks again,
Jeff Rafter


