[DFDL-WG] First draft of appendix describing string literal matching
Steve Hanson
smh at uk.ibm.com
Tue Sep 3 09:54:39 EDT 2013
Thinking on 1 again, the combined table should be in section 6.3.1.
Appendices are usually non-normative. And the syntax table for DFDL
expressions is part of the main spec, as an analogy.
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM at IBMGB
To: dfdl-wg at ogf.org,
Date: 03/09/2013 09:49
Subject: Re: [DFDL-WG] First draft of appendix describing string
literal matching
Sent by: dfdl-wg-bounces at ogf.org
Thanks for reviewing.
1. Let's drop tables 2 and 4 and replace with refs to the appendix, as
suggested
2. Agreed
3. Good point. I think the intention of %ES; was that it should be used on
its own. I don't see any point in allowing it to be a part of a
non-zero-length DFDL string literal. So I think your modification to the
grammar should be put into the spec.
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert at uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Steve Hanson/UK/IBM
To: Mike Beckerle <mbeckerle.dfdl at gmail.com>,
Cc: Tim Kimber/UK/IBM at IBMGB
Date: 02/09/2013 17:46
Subject: Re: First draft of appendix describing string literal
matching
Good description. My comments:
1) Apart from the first three rows, the grammar table is pretty much
duplicating existing tables 2 and 4 in section 6.3.1. Suggest either that
the table is dropped from the appendix and anything that is missing is
added back into 6.3.1, or tables 2 and 4 are dropped and replaced by refs
to appendix. I think the latter is preferable as everything is then in a
single table.
2) There is a bug in the grammar for DfdlStringLiteral - there should not
be '{' and '}' - that's expression syntax.
3) For recognising ES, you say "The string part is recognized if the data
available for matching is zero-length". That's true if we insist that ES,
if present, must be present on its own. I'm not sure we actually say that.
If that is the intent, we should police this in the grammar. (Note IBM
DFDL does not give an error if it finds '%ES;abc'.)
For 2) and 3) that would give:
DfdlStringLiteral ::= (DfdlStringLiteralPart)+ | DfdlESEntity
DfdlCharClassName ::= DfdlNLEntity | DfdlWSPEntity | DfdlWSPStarEntity | DfdlWSPPlusEntity
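As a rough illustration of what policing this in the grammar would mean for an implementation, here is a minimal Python sketch. It is not taken from the spec or from IBM DFDL: the function names and the simplified entity pattern ("%" name ";", with "%%" as an escaped percent sign) are assumptions made for the example. It only enforces the rule that %ES;, if present, must be the whole literal.

```python
import re

# Hypothetical, simplified token pattern: "%%" is an escaped percent sign,
# anything else of the form %name; is treated as an entity.
_TOKEN = re.compile(r"%%|%([^%;\s]+);")

def entity_names(literal: str) -> list:
    """Return the entity names found in the literal, in order (e.g. ['NL', 'WSP*'])."""
    return [m.group(1) for m in _TOKEN.finditer(literal) if m.group(1) is not None]

def is_valid_string_literal(literal: str) -> bool:
    """Check the proposed rule: (DfdlStringLiteralPart)+ | DfdlESEntity."""
    if literal == "%ES;":
        return True   # DfdlESEntity on its own
    if literal == "":
        return False  # at least one string literal part is required
    # ES may not be combined with any other part.
    return "ES" not in entity_names(literal)

print(is_valid_string_literal("%ES;"))     # ES alone
print(is_valid_string_literal("%ES;abc"))  # ES mixed with other characters
```

Under this rule '%ES;abc' would be rejected at schema validation time, so the recognition step for ES only ever has to handle the zero-length case.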
It still needs an erratum, as it is a change to the spec document.
Needs references from 6.3.1.
Regards
Steve Hanson
Architect, IBM Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: Mike Beckerle <mbeckerle.dfdl at gmail.com>
To: Tim Kimber/UK/IBM at IBMGB, Steve Hanson/UK/IBM at IBMGB,
Date: 30/08/2013 00:16
Subject: Re: First draft of appendix describing string literal
matching
I added this in current form as appendix D.
Will be in draft r14.4.
I did not create an erratum for this. It's a whole new section, not an
error correction or clarification. But we can add one if we think it
useful to point out this section.
There are no cross references to this section currently in the document.
We might find a few places we want to reference this from.
Mike Beckerle | OGF DFDL Workgroup Co-Chair | Tresys Technology |
www.tresys.com
Please note: Contributions to the DFDL Workgroup's email discussions are
subject to the OGF Intellectual Property Policy
On Wed, Aug 28, 2013 at 10:43 AM, Tim Kimber <KIMBERT at uk.ibm.com> wrote:
Thanks Mike.
I agree that the wording could be misinterpreted. Revised draft attached:
regards,
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert at uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Mike Beckerle <mbeckerle.dfdl at gmail.com>
To: Tim Kimber/UK/IBM at IBMGB,
Cc: Steve Hanson/UK/IBM at IBMGB
Date: 20/08/2013 17:33
Subject: Re: First draft of appendix describing string literal
matching
I'm not sure I agree with the algorithm in the 1.3 section for the string
literal part "LiteralString".
I believe this algorithm is independent of what encoding the schema itself
is written in, i.e., what is on the <? xml encoding="..." ?> slug line at
the top of the schema file.
What you write in the schema file is read into memory, all characters are
converted to unicode codepoints by way of that reading process.
So these two statements in the Recognition Algorithm for LiteralString are
of concern:
"The characters in the DFDL schema will be encoded using the defined
encoding for the schema in which they appear."
I think this just muddies the waters. Elsewhere we should state that the
encoding used when authoring a DFDL schema file does not affect the
behavior of the schema. All schemas behave as if authored in utf-8, etc.
"The recognition algorithm must be able to compare character sequences
that are encoded using different encodings."
To me that says that if I write my schema in EBCDIC but set
dfdl:encoding="ascii", then some algorithm other than mapping both into
Unicode codepoints first and then comparing them is needed. I don't think
this is or should be true.
I think the division of things into what you call string literal parts is
needed due to raw byte, and due to character class entities. Outside of
that I think translation of everything to unicode should be sufficient.
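To make the point concrete, here is a minimal Python sketch of the "map both to Unicode codepoints, then compare" approach. The function names are hypothetical, and it covers only plain character parts of a literal (not raw bytes or character class entities). It assumes Python's built-in "cp037" codec as a stand-in for an EBCDIC-authored schema file.

```python
def read_schema_literal(raw: bytes, xml_encoding: str) -> str:
    """Decode the literal once, as part of reading the schema file."""
    return raw.decode(xml_encoding)

def matches_at_start(literal: str, data: bytes, dfdl_encoding: str) -> bool:
    """Decode the data using dfdl:encoding, then compare codepoints."""
    text = data.decode(dfdl_encoding, errors="replace")
    return text.startswith(literal)

# The same literal authored in two differently encoded schema files.
from_utf8 = read_schema_literal("END".encode("utf-8"), "utf-8")
from_ebcdic = read_schema_literal("END".encode("cp037"), "cp037")

# After reading, the schema file encoding no longer matters.
print(from_utf8 == from_ebcdic)

# An EBCDIC-authored literal matched against ASCII data.
print(matches_at_start(from_ebcdic, b"END,rest", "ascii"))
```

Once the schema has been read, both literals are the same sequence of codepoints, so the comparison never needs to know how the schema file was encoded.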
...mike
On Thu, Aug 15, 2013 at 7:19 PM, Tim Kimber <KIMBERT at uk.ibm.com> wrote:
Steve, Mike,
Please take a look. Comments on high-level stuff like structure/level of
detail are welcome.
regards
Tim Kimber, DFDL Team,
Hursley, UK
Internet: kimbert at uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU