taazz: yes, if you want to write a csv dataset, be my guest, but sdf is different than csv (sigh).
In my opinion, the sdf data format is just plain )$*(%#()*$% ugly. Interoperability with Delphi is (I think) about the only reason anybody would want this abomination.
I would definitely try to support everything a normal Delphi app would spit out - suggested: with strictdelimiter:=false, as that is the default.
Unfortunately, for lack of a better alternative, loads of people (including me in the beginning) insisted on using sdfdataset to load their csv files, which often works but breaks horribly on boundary conditions where
- sdf specs and/or
- the Delphi implementation which deviates a bit from the spec and/or
- the FPC implementation, which differs rather more from the spec, see bug 19610
differ from what any sane csv format would be.
Any link for the SDF Specs? See at the bottom what I found.
Looking at the Delphi output for bug report 19610:
normal_string;quoted_string;"quoted;delimiter";quoted and space;"""quoted_and_starting_quote";"""quoted, starting quote, and space";quoted_with_tab character;quoted_multi
line; UnquotedSpacesInfront;UnquotedSpacesAtTheEnd ;" ""Spaces before quoted string""";Spaces after quoted string; ;
The delimiter is ';' on this record?
gives:
(The numbers below indicate field number)
Resulting elements with strictdelimiter false:
0normal_string
1quoted_string
2quoted;delimiter
3quoted and space
4"quoted_and_starting_quote
5"quoted, starting quote, and space
6quoted_with_tab character
7quoted_multi
line
8UnquotedSpacesInfront
9UnquotedSpacesAtTheEnd
10Spaces before quoted string
11Spaces after quoted string
12
Well, perhaps supporting the spaces after quoted string thing is too much.
No, that's the easy part. Understanding the sdf specs without reading them is a bit tedious.
I'm almost done cleaning up the test cases to closely match the Delphi test program in 19610.
I'll separate out the Spaces after quoted string case, and remove some of your added quote tests.
I don't mind the spaces. What I do want to know is the quote character used, and the data values both in memory (i.e. how the program sees them) and on file, so I can understand the problem clearly.
From what I have seen so far, by changing the delimiter to ';' all those lines should work as is with the current implementation.
Understand and agree with your further changes, but I think those may actually be better done in a CSV dataset.
I would really like to see an RFC 4180 compliant CSV dataset, and I would *strongly* suggest you take a look at combining it with csvdocument (see the wiki), as its csvparser beautifully supports all the intricacies of RFC 4180, as well as Excel mode etc.
This means we don't need to implement a parser of our own.
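To make the quoting cases concrete, here is a minimal sketch of what an RFC 4180-style parser has to cope with. It uses Python's standard `csv` module purely as a stand-in; it is not csvdocument and not FPC code, and the sample record and the `parse_records` helper are made up for illustration (loosely modelled on the bug 19610 test line, with ';' as delimiter).

```python
import csv
import io

# Hypothetical sample: a quoted delimiter, a field starting with a doubled
# quote, and a quoted field containing a line break.
SAMPLE = 'normal_string;"quoted;delimiter";"""starts with quote";"multi\nline"\r\n'


def parse_records(text, delimiter=";"):
    """Parse delimited text using RFC 4180-style quoting rules."""
    # newline="" keeps line breaks untranslated so the csv module can
    # handle breaks inside quoted fields itself.
    source = io.StringIO(text, newline="")
    reader = csv.reader(source, delimiter=delimiter, quotechar='"')
    return [row for row in reader]


if __name__ == "__main__":
    for record in parse_records(SAMPLE):
        for index, field in enumerate(record):
            print(index, repr(field))
```

Note that this says nothing about how Delphi behaves with strictdelimiter false; it only shows the stricter CSV rules.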
I really don't like SDF/CSV datasets. I would never use one. They are a memory hog unless they hold only a few hundred records, and their performance is terrible because they use strings as the buffer and constantly convert from and to string on each operation.
I haven't looked at csvDocument myself, but when it comes to csv I prefer a record parser that I can use to import the data into a more robust database, be it Access, MySQL, Firebird or any other database engine that does not require me to load all the data in memory to work with it.
The only true use of csv files is data exchange (import and export), plus some extreme cases where the import might take a while, e.g. a DTS service with heavy calculations before or during inserts. I had to work with csv documents of 800M to 1.4G each; just loading those in memory (no import, mind you) would take 3x the time it took to import them.
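As a rough sketch of the record-parser approach (read one record at a time and push it into a database, never holding the whole file in memory), something like the following would do. It is Python with SQLite only for illustration; the `items` table, its two columns and the `import_csv` helper are assumptions, not anything from sdfdataset or csvdocument.

```python
import csv
import io
import sqlite3


def import_csv(lines, conn, batch_size=1000):
    """Stream records from an iterable of lines into SQLite in batches.

    Assumes every record has at least two fields (hypothetical layout).
    Returns the number of records inserted.
    """
    conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT, value TEXT)")
    reader = csv.reader(lines, delimiter=";")
    batch = []
    total = 0
    for row in reader:
        batch.append(row[:2])
        # Flush a full batch so memory use stays bounded by batch_size.
        if len(batch) >= batch_size:
            conn.executemany("INSERT INTO items VALUES (?, ?)", batch)
            total += len(batch)
            batch.clear()
    if batch:
        conn.executemany("INSERT INTO items VALUES (?, ?)", batch)
        total += len(batch)
    conn.commit()
    return total


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    data = io.StringIO('a;1\nb;"x;y"\nc;3\n', newline="")
    print(import_csv(data, connection, batch_size=2))
```

For a real file you would pass an open file object (opened with `newline=""`) instead of the `StringIO`, so only one batch of records is ever held in memory.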
The rest of the dataset support could be built on this, e.g. by using memds or bufdataset or possibly ripping out the sdfdataset code (which I have my doubts about but by now you know much more about it than I).
Writing out the csv to file should once again be easy as csvdocument has a class for that as well.
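For comparison, this is roughly what the writing side looks like with a ready-made writer. Again a sketch using Python's standard `csv` module as a stand-in, not the csvdocument class itself; `write_records` and the sample fields are hypothetical.

```python
import csv
import io


def write_records(records, delimiter=";"):
    """Serialise records to delimited text, quoting only where needed."""
    target = io.StringIO(newline="")
    writer = csv.writer(
        target,
        delimiter=delimiter,
        quotechar='"',
        quoting=csv.QUOTE_MINIMAL,  # quote fields with delimiter, quote or line break
        lineterminator="\r\n",      # RFC 4180 uses CRLF between records
    )
    writer.writerows(records)
    return target.getvalue()


if __name__ == "__main__":
    fields = ["normal_string", "quoted;delimiter", '"starts with quote', "multi\nline"]
    text = write_records([fields])
    print(text)
    # Reading the text back should give the original fields again.
    round_trip = list(csv.reader(io.StringIO(text, newline=""), delimiter=";"))
    print(round_trip == [fields])
```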
I'll polish up the test cases for sdfdataset and post them...
Awaiting with interest to hear your opinion!
Thanks,
BigChimp
I'm here to make the sdf dataset work as expected. I'm not going to use it myself at all, so if no one else is going to use it either, I think I will just implement the Delphi compatibility so everyone who uses it can still do so, and leave it at that.
I'm against any kind of memory dataset for anything more than a glorified memory container for extremely complex forms, e.g. booking tickets on a ship with various port stops in between, or travel planning with linked ships, flights, trains etc. Even those are only for performance reasons, which means no data editing, no inserting and no deleting except in extreme cases.
While bufDataset is nice to have and helps a lot with the SQLDB framework, it needs to be as slim and fast as possible to avoid bottlenecks. I wouldn't build an un-cached dataset on top of it.
Any chance of getting my hands on the Delphi documentation for sdf? The following link seems to indicate that an extra file with header information about the data file is required.
http://www.delphigroups.info/2/76/115068.html