[DFDL-WG] How to choose the correct choice branch when serializing
Steve Hanson
smh at uk.ibm.com
Fri Apr 13 06:52:35 EDT 2012
Hi Tim
I've made some minor corrections to your summary of the problem.
If the user restructures his model to wrap the sequences in elements then
the problem goes away. So I think we should keep the solution to this as
simple as we can while not being unnecessarily restrictive.
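
A minimal sketch of why wrapping helps, assuming each sequence in the schema quoted below is wrapped in its own element. The wrapper names addressContact and phoneContact are hypothetical and do not appear in the original model. Once the infoset carries the wrapper element, the branch is identified by a single name lookup and no look-ahead is needed:

```python
# Hypothetical wrapper element names mapped to the child elements of each branch.
WRAPPED_BRANCHES = {
    "addressContact": ["firstname", "lastname", "postcode"],
    "phoneContact": ["lastname", "telephoneNumber"],
}


def select_branch(wrapper_name):
    """Return the child element names of the branch named by the wrapper element."""
    if wrapper_name not in WRAPPED_BRANCHES:
        raise ValueError(f"{wrapper_name!r} is not a branch of the choice")
    return WRAPPED_BRANCHES[wrapper_name]


# The infoset would now be root > phoneContact > (lastname, telephoneNumber).
print(select_branch("phoneContact"))
```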
Regards
Steve Hanson
Architect, Data Format Description Language (DFDL)
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM at IBMGB
To: dfdl-wg at ogf.org
Date: 13/04/2012 10:54
Subject: [DFDL-WG] How to choose the correct choice branch when serializing
Sent by: dfdl-wg-bounces at ogf.org
There is an interesting edge case which arises when the serializer
encounters a choice group.
A DFDL xsd is structured as follows:
<root>
  <choice>
    <sequence>
      <firstname/>
      <lastname/>
      <postcode/>
    </sequence>
    <sequence>
      <lastname/>
      <telephoneNumber/>
    </sequence>
  </choice>
</root>
Note that both branches of the choice are sequences, not elements.
The infoset is
<root>
  <lastname/>
  <telephoneNumber/>
</root>
The likely action of the serializer is:
- pick the first branch of the choice (because it contains lastname)
- output the default value of firstname (assuming that firstname has
minOccurs = 1 and has a default)
- output lastname
- issue a processing error because telephoneNumber is found in the
infoset but is not in the first branch.
...but from the infoset the user clearly intended:
- select the second branch of the choice and successfully process the
entire infoset
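
The steps above can be sketched as follows. This is only an illustrative model, not taken from any DFDL implementation: the branch representation, the function names, the default value "UNKNOWN" and the sample values are all assumptions made for the example.

```python
# Each branch is an ordered list of (element name, default value or None).
# The default "UNKNOWN" for firstname is a hypothetical value.
BRANCHES = [
    [("firstname", "UNKNOWN"), ("lastname", None), ("postcode", None)],
    [("lastname", None), ("telephoneNumber", None)],
]

# The infoset as an ordered list of (element name, value); values are made up.
INFOSET = [("lastname", "Smith"), ("telephoneNumber", "01234 567890")]


class ProcessingError(Exception):
    pass


def pick_first_matching_branch(branches, first_element):
    """Return the index of the first branch that contains first_element."""
    for index, branch in enumerate(branches):
        if any(name == first_element for name, _ in branch):
            return index
    raise ProcessingError(f"no branch contains {first_element!r}")


def serialize_branch(branch, infoset):
    """Walk the branch in schema order, consuming infoset items in order."""
    names = [name for name, _ in branch]
    output = []
    position = 0
    for name, default in branch:
        if position < len(infoset) and infoset[position][0] not in names:
            raise ProcessingError(
                f"{infoset[position][0]!r} is not in the chosen branch")
        if position < len(infoset) and infoset[position][0] == name:
            output.append(infoset[position])
            position += 1
        elif default is not None:
            output.append((name, default))  # required element absent: use default
        else:
            raise ProcessingError(f"required element {name!r} is missing")
    if position < len(infoset):
        raise ProcessingError(
            f"{infoset[position][0]!r} is not in the chosen branch")
    return output


chosen = pick_first_matching_branch(BRANCHES, INFOSET[0][0])
try:
    serialize_branch(BRANCHES[chosen], INFOSET)
except ProcessingError as error:
    print("first branch fails:", error)
print("second branch succeeds:", serialize_branch(BRANCHES[1], INFOSET))
```

In this model the first branch is chosen, firstname is defaulted, lastname is output, and the error is raised only when telephoneNumber is reached, whereas the second branch accepts the whole infoset.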
The DFDL specification does not state what the behaviour should be. I
think the options are:
a) state explicitly that the serializer will choose the first branch that
contains a matching element, regardless of minOccurs
b) invent a new rule that causes the serializer to back out of a branch and
try another branch if there is a minOccurs error while processing the
branch
c) disallow sequences and choices as immediate children of a choice group
Currently I'm leaning toward a) by process of elimination, for the
following reasons:
b) would make this scenario work, but I think it would impose a lot of
work on implementers because it would require the serializer to do
backtracking.
c) would simplify a lot of things, but I think it's too restrictive - I
can imagine complex data formats where it might be useful to have a choice
as the direct child of a choice because the discrimination rules might be
easier to express in a two-level structure.
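
To make the cost of option b) more concrete, here is a rough sketch of what serializer backtracking might look like. It is an assumption-laden illustration, not a proposal for the spec: the data model, names and sample values are hypothetical, and for simplicity it backs out on any processing error rather than only on a minOccurs error. The point to notice is that output for a branch has to be held in a buffer and discarded if the branch fails.

```python
# Hypothetical model: each branch is an ordered list of (name, default or None).
BRANCHES = [
    [("firstname", "UNKNOWN"), ("lastname", None), ("postcode", None)],
    [("lastname", None), ("telephoneNumber", None)],
]
INFOSET = [("lastname", "Smith"), ("telephoneNumber", "01234 567890")]


class ProcessingError(Exception):
    pass


def try_branch(branch, infoset):
    """Serialize the infoset against one branch into a private buffer."""
    names = [name for name, _ in branch]
    buffer = []
    position = 0
    for name, default in branch:
        if position < len(infoset) and infoset[position][0] not in names:
            raise ProcessingError(f"{infoset[position][0]!r} not in branch")
        if position < len(infoset) and infoset[position][0] == name:
            buffer.append(infoset[position])
            position += 1
        elif default is not None:
            buffer.append((name, default))
        else:
            raise ProcessingError(f"required element {name!r} is missing")
    if position < len(infoset):
        raise ProcessingError(f"{infoset[position][0]!r} not in branch")
    return buffer


def serialize_choice_with_backtracking(branches, infoset):
    """Try each branch in schema order; return (branch index, output)."""
    errors = []
    for index, branch in enumerate(branches):
        try:
            return index, try_branch(branch, infoset)
        except ProcessingError as error:
            errors.append(str(error))  # partial output is discarded here
    raise ProcessingError(f"no branch accepts the infoset: {errors}")


print(serialize_choice_with_backtracking(BRANCHES, INFOSET))
```

With the infoset above, the first branch is attempted and abandoned, and the second branch is then selected.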
regards,
Tim Kimber, Common Transformation Team,
Hursley, UK
Internet: kimbert at uk.ibm.com
Tel. 01962-816742
Internal tel. 246742
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg at ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg