[DFDL-WG] DFDL implementation support for element refs

Steve Hanson smhdfdl at gmail.com
Tue May 7 12:51:10 PDT 2024


Mike

IBM DFDL, as used by ACE, has supported element refs since day one. They are
really useful, as shown in the DFDL schemas for EDIFACT. Each EDIFACT
message is a global element, so it can be parsed on its own. But there is also
the EDIFACT interchange global element, which is a collection of EDIFACT
messages, so the natural approach is to use element refs to pull in the
EDIFACT messages.
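
A minimal sketch of that pattern follows. The namespace, the type name,
and the placeholder content are hypothetical and are not taken from the
actual EDIFACT schemas; DFDL format annotations are omitted, so this shows
only the XSD structure:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ex="urn:example:edifact-sketch"
           targetNamespace="urn:example:edifact-sketch">

  <!-- Each message is a global element, so it can be a parse root on its own. -->
  <xs:element name="ORDERS" type="ex:MessageType"/>
  <xs:element name="INVOIC" type="ex:MessageType"/>

  <!-- The interchange is also a global element; it pulls the messages in by reference. -->
  <xs:element name="Interchange">
    <xs:complexType>
      <xs:sequence>
        <xs:element ref="ex:ORDERS" minOccurs="0" maxOccurs="unbounded"/>
        <xs:element ref="ex:INVOIC" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>

  <!-- Placeholder content; a real message type would describe its segments. -->
  <xs:complexType name="MessageType">
    <xs:sequence>
      <xs:element name="body" type="xs:string"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
```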

I'll try to join on Thursday, but I am away Wed and Thurs, so it all depends
on when I get home.

Regards
Steve

On Mon, May 6, 2024 at 11:20 PM Mike Beckerle <mbeckerle at apache.org> wrote:

> I'd like to know which DFDL implementations support element references.
>
> IBM ACE?
> IBM zTPF?
> DFDL4Space?
>
> Can you let me know whether these implementations support element refs?
>
> The reason I ask is below, which may be of interest or perhaps TL;DR.
>
> We support element references in Daffodil, but I'm coming around to the
> view that element refs are a bad idea in DFDL schemas.
>
> They're not needed to express any particular data format. That
> suggests we should have left them out of DFDL, but for some reason we
> didn't.
>
> The problem is that most data languages have nothing like element
> references, nor the element namespace management machinery that comes
> with them.
>
> So as soon as you want to use a DFDL schema but not use it to interchange
> data as XML, element refs become a problem.
>
> I'm playing around with a best practice/subset/profile suggestion where:
>
> * The only global element declarations in the schema are for root elements.
> * Element references are disallowed
> * The root elements are declared in a root schema file that contains ONLY
> the root elements
> * Root elements should always be declared by one-liners like this:
> `<element name="rootElement" type="prefix:rootElementType"/>`
> * The root elements schema file has no target namespace.
> * All group, type, and DFDL format/escapeScheme/variable definitions must
> be declared in different schema files that may (and probably should) have a
> target namespace.
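>
> A minimal sketch of a root schema file under these rules. The file names
> and the `urn:example:format-types` namespace are hypothetical, and the
> sketch assumes the imported schema leaves elementFormDefault at its
> default of "unqualified" so that local elements carry no namespace:
>
> ```xml
> <?xml version="1.0" encoding="UTF-8"?>
> <!-- root.dfdl.xsd (hypothetical name): no targetNamespace, root elements only -->
> <schema xmlns="http://www.w3.org/2001/XMLSchema"
>         xmlns:fmt="urn:example:format-types">
>
>   <!-- Types, groups, and DFDL format definitions live in a separate,
>        namespaced schema file (hypothetical name and namespace). -->
>   <import namespace="urn:example:format-types"
>           schemaLocation="format-types.dfdl.xsd"/>
>
>   <!-- One-liner root element declaration; everything else is in the type. -->
>   <element name="rootElement" type="fmt:rootElementType"/>
> </schema>
> ```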
>
> The benefit of these restrictions is that the elements in the nest of a
> DFDL infoset never have any namespaces.
> This makes them compatible with non-namespaced data systems like JSON,
> Apache Drill, Apache NiFi, generated C code, etc.
> This makes integration with those things *massively* simpler.
>
> Such schemas are still easily reused by reusing the type of the root
> element, so there is no need to ever use an element reference, and a nice
> composition property occurs - you don't need element references to assemble
> schemas from component schemas, and the assembled component has the same
> characteristic.
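>
> As a sketch of that composition, a larger component can reuse another
> component's root type through a local element declaration rather than an
> element reference. All names, file names, and namespaces here are
> hypothetical:
>
> ```xml
> <?xml version="1.0" encoding="UTF-8"?>
> <!-- envelope-types.dfdl.xsd (hypothetical): reuses a type from another component -->
> <schema xmlns="http://www.w3.org/2001/XMLSchema"
>         xmlns:fmt="urn:example:format-types"
>         targetNamespace="urn:example:envelope-types">
>
>   <import namespace="urn:example:format-types"
>           schemaLocation="format-types.dfdl.xsd"/>
>
>   <complexType name="envelopeType">
>     <sequence>
>       <!-- Local element with a reused type, instead of <element ref="..."/>.
>            With the default elementFormDefault ("unqualified"), this element
>            has no namespace. -->
>       <element name="payload" type="fmt:rootElementType" maxOccurs="unbounded"/>
>     </sequence>
>   </complexType>
> </schema>
> ```
>
> The assembled component's own root schema file would then declare its root
> with a one-liner whose type is `envelopeType`, keeping the same shape.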
>
> There are a few other things this discipline also simplifies. Reusing test
> data becomes simpler if namespace URIs aren't getting embedded in every
> test infoset XML file, for example.
>
> All comments are welcome.
>
> Mike Beckerle
> Apache Daffodil PMC | daffodil.apache.org
> OGF DFDL Workgroup Co-Chair | www.ogf.org/ogf/doku.php/standards/dfdl/dfdl
> Owl Cyber Defense | www.owlcyberdefense.com
>
>
> --
>   dfdl-wg mailing list
>   dfdl-wg at lists.ogf.org
>   https://lists.ogf.org/mailman/listinfo/dfdl-wg
>