[DFDL-WG] Action 259 - Consider allowing more flexible escapeBlock schemes
Steve Hanson
smh at uk.ibm.com
Tue May 20 12:58:12 EDT 2014
As discussed on the call, there is an import case that is not covered in
the table, namely where quotes surround a delimiter but the opening quote
is not at the start of the data. I imported the following text string into
Excel:
This is "," two separate fields
Two columns were indeed created, meaning the comma was treated as a
delimiter and not escaped. This matches DFDL behaviour, which is good.
Interestingly, the first column was as expected...
This is "
...but the second was not:
two separate fields
Notice the leading quote was removed without error, meaning that the
absence of the closing quote is permitted!
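As a point of comparison only, a similar lenient treatment of a missing
closing quote can be reproduced with Python's standard csv module. This is a
minimal sketch and not a model of Excel or DFDL; the helper name split_record
is hypothetical, and the comma separator is taken from the example above.

```python
import csv

def split_record(line, strict=False):
    """Split one record with Python's csv reader (comma-separated, '"' quotes)."""
    return next(csv.reader([line], strict=strict))

line = 'This is "," two separate fields'

# Lenient mode: the quote in the first field is ordinary data because the
# field did not start with a quote; the second field starts with a quote
# that is never closed, and the reader accepts it at end of input.
fields = split_record(line)
print(fields)
```

In lenient mode this yields two fields, with the opening quote of the second
field dropped and the space after it kept. Passing strict=True makes the same
input raise csv.Error instead, because the closing quote is missing.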
Regards
Steve Hanson
Architect, IBM DFDL
Co-Chair, OGF DFDL Working Group
IBM SWG, Hursley, UK
smh at uk.ibm.com
tel:+44-1962-815848
From: Tim Kimber/UK/IBM at IBMGB
To: dfdl-wg at ogf.org,
Date: 13/05/2014 15:37
Subject: Re: [DFDL-WG] Action 259 - Consider allowing more flexible
escapeBlock schemes
Sent by: dfdl-wg-bounces at ogf.org
That looks fairly conclusive to me. DFDL should fall into line with
established practice.
regards,
Tim Kimber,
IBM Integration Bus Development (Industry Packs)
Hursley, UK
Internet: kimbert at uk.ibm.com
Tel. 01962-816742
Internal tel. 37246742
From: Steve Hanson/UK/IBM at IBMGB
To: dfdl-wg at ogf.org,
Date: 13/05/2014 11:50
Subject: [DFDL-WG] Action 259 - Consider allowing more flexible
escapeBlock schemes
Sent by: dfdl-wg-bounces at ogf.org
Action 259 was raised last call to decide what to do about the following,
as minuted:
Steve has an example of an escape block where the escape block end is not
at the end of the un-trimmed data. This gives a processing error. Another
IBM product accepts this usage. Should DFDL allow this? Or should there be
a new escapeKind that allows escapeBlockStart/End anywhere?
I tried importing these values from a CSV file into an Excel spreadsheet and
a Symphony spreadsheet (i.e. the successor to Lotus 1-2-3), and also accessing
them via ODBC using a Microsoft driver, to compare with IBM DFDL and IBM Cast
Iron behaviour.
Test  Data                  IBM DFDL           IBM Cast Iron      MS Excel           Lotus Symphony     ODBC
1     This is normal        This is normal     This is normal     This is normal     This is normal     This is normal
2     "This is OK"          This is OK         This is OK         This is OK         This is OK         This is OK
3     "This| is expected"   This| is expected  This| is expected  This| is expected  This| is expected  This| is expected
4     This too "is OK"      This too "is OK"   This too "is OK"   This too "is OK"   This too "is OK"   This too
5     Even "this" is OK     Even "this" is OK  Even "this" is OK  Even "this" is OK  Even "this" is OK  Even
6     "This" is NOT OK      PARSE FAILED       This is NOT OK     This is NOT OK     This is NOT OK     This
7     "This"" is still OK"  This" is still OK  This" is still OK  This" is still OK  This" is still OK  This" is still OK
The case under discussion is test 6. It looks like DFDL is out of step with
the behaviour of the Excel and Symphony spreadsheets, and Cast Iron has
adopted that behaviour too.
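The difference between the strict and the lenient treatment of test 6 can be
illustrated with Python's standard csv module, whose strict flag switches
between the two. This is only a sketch for comparison: it does not model any
of the products in the table, the helper name parse_field is hypothetical,
and the use of "|" as the separator is an assumption based on test 3.

```python
import csv

# Test data from the table above; "|" as the separator is an assumption.
SAMPLES = [
    'This is normal',
    '"This is OK"',
    '"This| is expected"',
    'This too "is OK"',
    'Even "this" is OK',
    '"This" is NOT OK',
    '"This"" is still OK"',
]

def parse_field(text, strict):
    """Parse a one-field record; return the field, or 'PARSE FAILED' on error."""
    reader = csv.reader([text], delimiter="|", quotechar='"', strict=strict)
    try:
        row = next(reader)
    except csv.Error:
        return "PARSE FAILED"
    return row[0]

for number, sample in enumerate(SAMPLES, start=1):
    strict_result = parse_field(sample, strict=True)
    lenient_result = parse_field(sample, strict=False)
    print(number, repr(strict_result), repr(lenient_result))
```

With strict=True, test 6 is rejected because data follows the closing quote,
while strict=False drops the quotes and keeps the trailing text. Whether DFDL
should offer the lenient behaviour is the question raised by the action.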
Out of interest I also checked the output behaviour of Excel. It escaped all
instances of embedded quotes in the same way as DFDL, so there are no issues
there.
Regards
Steve Hanson
Unless stated otherwise above:
IBM United Kingdom Limited - Registered in England and Wales with number
741598.
Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6 3AU
--
dfdl-wg mailing list
dfdl-wg at ogf.org
https://www.ogf.org/mailman/listinfo/dfdl-wg