Recent

Author Topic: Error in ZMSQL with column names containing diacritics  (Read 5598 times)

maurobio

  • Hero Member
  • *****
  • Posts: 623
  • Ecology is everything.
    • GitHub
Error in ZMSQL with column names containing diacritics
« on: October 19, 2021, 10:57:03 pm »
Dear ALL,

It seems that there is an issue concerning column names with diacritics in the otherwise great ZMSQL component (https://wiki.freepascal.org/ZMSQL).

When attempting to execute a SQL query on a CSV file in which the column names contain diacritics, I got an error "Operation cannot be performed on an active dataset". This error, however, does not appear when I execute the same SQL query on a CSV not containing diacritics in column names.

Attached is some sample code illustrating the problem.

Is there any workaround for this annoying problem?

Thanks in advance for any assistance you can provide.

With best wishes,
UCSD Pascal / Burroughs 6700 / Master Control Program
Delphi 7.0 Personal Edition
Lazarus 2.0.12 - FPC 3.2.0 on GNU/Linux Mint 19.1, Lubuntu 18.04, Windows XP SP3, Windows 7 Professional, Windows 10 Home

wp

  • Hero Member
  • *****
  • Posts: 11833
Re: Error in ZMSQL with column names containing diacritics
« Reply #1 on: October 20, 2021, 11:17:44 am »
Are diacritics allowed in field names at all? Did you try to quote the field names?

BTW, you do not need the CSVExporter and the temporary CSV file to get the data from the worksheetdataset into the zmsqldataset. Simply call ZMSQLDataset.CopyFromDataset(sWorksheetDataset1).

maurobio

  • Hero Member
  • *****
  • Posts: 623
  • Ecology is everything.
    • GitHub
Re: Error in ZMSQL with column names containing diacritics
« Reply #2 on: October 20, 2021, 01:24:55 pm »
Hi, @wp!

Thanks for the reply.

In a truly localized application, diacritics should be allowed in field names - especially when the fields are the columns of a spreadsheet as in the case of the sample program (as well as my large application). Quoting the fields using QuotedStr() solves the error but not solves the problem - the query returns a blank record.

Thanks for the much useful tip concerning CSVExporter! BTW, is there a simple way of exporting an entire FPSpreadsheetDataset to a CSV file? I presume it can be done by accessing directly the underlaying spreadsheet, as you mentioned in another post, but I cannot remember exactly how!

With best wishes,
UCSD Pascal / Burroughs 6700 / Master Control Program
Delphi 7.0 Personal Edition
Lazarus 2.0.12 - FPC 3.2.0 on GNU/Linux Mint 19.1, Lubuntu 18.04, Windows XP SP3, Windows 7 Professional, Windows 10 Home

wp

  • Hero Member
  • *****
  • Posts: 11833
Re: Error in ZMSQL with column names containing diacritics
« Reply #3 on: October 20, 2021, 05:11:06 pm »
Not sure whether ZMSQL supports non-ASCII characters in field names. Look at the SQL Tokenizer, TjanSQLTokenizer.GetToken in unit janSQLTokenizer: it analyzes the SQL string, detects the command, the field names, the order-by field names etc. While scanning over the SQL string it reckognizes these characters: single quote, comma, 'A'..'Z', 'a'..'z', '0'..'9', brackets, '+', '-', '*', '/', '=', '<', '>', space. Non-ASCII characters are not in this range and will terminate for example the field name extraction. However, when the field names are single-quoted the scanner accepts every byte until it finds the closing single quote - so, non-ASCII field names may be detected here.

But this is not all; further processing must handle these characters correctly as well. In procedure TZMQueryDataSet.InsertDataFromJanSQL, there is a line converting a string from a guessed code-page to UTF8. I don't know whether this applies to fieldnames or fieldvalues, but it is an indication that janSQL (on which the ZMSQL is based) internally works with ANSI codepages. Therefore, my idea is that you should try to test the SQL with field names converted to the codepage of your system. You are from Western Europe? Then CP1252 could be correct.

Next question: How to convert the spreadsheet to CP1252? This is answered together with your other question about how to convert a spreadsheet file to csv directly: For this you must read the spreadsheet with the TsWorkbook method ReadFromFile() and then save it as a CSV file by specifying sfCSV as output format in WriteToFile(). By default, the CSV file is written as UTF8, but you can specify any other code page in the CSVParams record.

Therefore, you could use the following code to convert an xls file directly to a CSV file in your system's codepage. Load this into ZMSQLDataset, and repeat your tests.

Code: Pascal  [Select][+][-]
  1. uses
  2.   fpsSpreadsheet, fpsTypes, fpsCSV, xlsBIFF8;
  3.  
  4. procedure TForm1.XlsToCSV(XlsFileName, CSVFileName: String);
  5. var
  6.   workbook: TsWorkbook;
  7. begin
  8.   workbook := TsWorkbook.Create;
  9.   try
  10.     workbook.ReadFromFile(XlsFileName, sfExcel8);
  11.     CSVParams.Encoding := 'cp1252';   // Write the CSV with codepage 1252 (Western Europe)
  12.     workbook.WriteToFile(CSVFileName, sfCSV, true);
  13.   finally
  14.     workbook.Free;
  15.   end;
  16. end;
« Last Edit: October 20, 2021, 05:30:06 pm by wp »

maurobio

  • Hero Member
  • *****
  • Posts: 623
  • Ecology is everything.
    • GitHub
Re: Error in ZMSQL with column names containing diacritics
« Reply #4 on: October 20, 2021, 09:14:48 pm »
Hi, @wp!

Again, thank you very much for your insights.

In fact, what I need is a normalization function, which takes strings containing diacritics and replaces them by their ASCII equivalents (eg., 'á' becomes 'a', 'é' becomes 'e', and so on). In this way, I could translate on the fly the field names from, eg.. 'Espécies' to 'Especies'.

But I could not find a way of achieving that in FPC/Lazarus. There are some potentially useful suggestions in Stackoverflow:

https://stackoverflow.com/questions/1891196/convert-hi-ansi-chars-to-ascii-equivalent-%c3%a9-e
https://stackoverflow.com/questions/66733768/any-rtl-function-to-remove-accents-from-a-char

and Daniweb:

https://www.daniweb.com/programming/software-development/threads/492700/how-to-use-normalizestring-function-in-delphi


However they work only on Delphi.

In this forum, there is a discussion on this topic: https://forum.lazarus.freepascal.org/index.php?topic=13098.0, but the function suggested there just replaces the diacritics by a fixed character.

I have also tried function NormalizeNFD from the unicodedata unit, but it simply returns the same string with diacritics!

Any suggestions?

With best wishes,
« Last Edit: October 20, 2021, 09:35:44 pm by maurobio »
UCSD Pascal / Burroughs 6700 / Master Control Program
Delphi 7.0 Personal Edition
Lazarus 2.0.12 - FPC 3.2.0 on GNU/Linux Mint 19.1, Lubuntu 18.04, Windows XP SP3, Windows 7 Professional, Windows 10 Home

wp

  • Hero Member
  • *****
  • Posts: 11833

maurobio

  • Hero Member
  • *****
  • Posts: 623
  • Ecology is everything.
    • GitHub
Re: Error in ZMSQL with column names containing diacritics
« Reply #6 on: October 20, 2021, 10:58:13 pm »
Hi, @wp!

The solution suggested by @howardpc in the post you pointed out in fact does what I need, that is, removing diacritics from field names - but the SQL query still does not work! It keeps returning an empty record (see my modified code).

It looks like this problem is more complex than it appeared at first.  :o

With best wishes,
UCSD Pascal / Burroughs 6700 / Master Control Program
Delphi 7.0 Personal Edition
Lazarus 2.0.12 - FPC 3.2.0 on GNU/Linux Mint 19.1, Lubuntu 18.04, Windows XP SP3, Windows 7 Professional, Windows 10 Home

wp

  • Hero Member
  • *****
  • Posts: 11833
Re: Error in ZMSQL with column names containing diacritics
« Reply #7 on: October 20, 2021, 11:43:33 pm »
You replace only the non-ASCII fieldnames in the SQL text, but the fields in the dataset still have their old names. This way the result set must be empty.

To fix the fieldnames in the dataset itself, you could recreate the FieldDefs and replace the field names. But this is not straightforward because the Name field of a TFieldDef cannot be changed any more.

A relatively easy option is to read the xls file with fpspreadsheet, replace the strings in the first row by their ASCII counter-parts and then rewrite the file, before you begin your own operations.

Code: Pascal  [Select][+][-]
  1. uses
  2.   FPSpreadsheet;
  3.  
  4. procedure FixFieldNames(SrcFileName, DestFileName: String);
  5. var
  6.   workbook: TsWorkbook;
  7.   worksheet: TsWorksheet;
  8.   c: Integer;
  9.   s: String;
  10. begin
  11.   workbook := TsWorkBook.create;
  12.   try
  13.     workbook.ReadFromFile(SrcFileName);
  14.     worksheet := workbook.GetFirstWorksheet;
  15.     for c := 0 to worksheet.GetLastColIndex(true) do
  16.     begin
  17.       s := worksheet.ReadAsText(0, c);
  18.       if s <> '' then
  19.         worksheet.WriteText(0, c, AccentedNameToAscii(s));
  20.     end;
  21.     workbook.WriteToFile(DestFileName, true);
  22.   finally
  23.     workbook.Free;
  24.   end;
  25. end;
  26.  
  27. procedure TForm1.FormShow(Sender: TObject);
  28. begin
  29.   FixFieldNames('PlantList.xls', 'PlantList1.xls');
  30.   sWorkSheetDataset1.FileName := 'PlantList1.xls';
  31.   sWorkSheetDataset1.Open;
  32.   ZMConnection1.DatabasePath := GetCurrentDir;
  33.   ZMQueryDataSet1.TableName := 'PlantList';
  34. end;
  35.  
  36. procedure TForm1.Button1Click(Sender: TObject);
  37. begin
  38.   ZMQueryDataset1.CopyFromDataset(sWorksheetDataset1);
  39. end;
  40.  
  41. procedure TForm1.Button2Click(Sender: TObject);
  42. var
  43.   FAMILY, SPECIES: string;
  44. begin
  45.   FAMILY := QuotedStr(ZMQueryDataset1.Fields[2].FieldName);
  46.   SPECIES :=QuotedStr(ZMQueryDataset1.Fields[3].FieldName);
  47.  
  48.   ZMQueryDataset1.SQL.Text := 'SELECT ' + FAMILY + ', ' + SPECIES +
  49.     ' FROM PlantList GROUP BY ' + FAMILY + ', ' + SPECIES + ' ORDER BY ' +
  50.     FAMILY + ', ' + SPECIES;
  51.   ShowMessage(ZMQueryDataset1.SQL.Text);
  52.   ZMQueryDataset1.Close;   // This removes the empty columns from the final result
  53.   ZMQueryDataset1.QueryExecute;
  54.   ZMQueryDataset1.First;
  55. end;

This way the result is not empty, however, does not contain the correct field values. I don't know... Are you sure that SQL SELECT is working in ZMSQL? Please test simpler queries first.

maurobio

  • Hero Member
  • *****
  • Posts: 623
  • Ecology is everything.
    • GitHub
Re: Error in ZMSQL with column names containing diacritics
« Reply #8 on: October 20, 2021, 11:57:40 pm »
Hi, @wp!

I was musing here exactly that the solution should pass through FPSpreadSheetDataset itself, as you just pointed out.

I have been using TZMSQL heavily in my application in connection with FPSpreadSheetDataset without any further trouble (except for this problem with diacritics in column names).

For the time being, I will consider to simply instruct users to avoid using diacritics in column names until a proper solution (if any) can be found.

Thank you very much for your time and patience!

With best wishes,
UCSD Pascal / Burroughs 6700 / Master Control Program
Delphi 7.0 Personal Edition
Lazarus 2.0.12 - FPC 3.2.0 on GNU/Linux Mint 19.1, Lubuntu 18.04, Windows XP SP3, Windows 7 Professional, Windows 10 Home

wp

  • Hero Member
  • *****
  • Posts: 11833
Re: Error in ZMSQL with column names containing diacritics
« Reply #9 on: October 21, 2021, 12:09:42 am »
Can you post for me a modification of the program using the non-diagcritics fieldnames to demonstrate that the SQL is working. I am asking because after modification according to my previous commit there are no diacritics any more, and I don't see why this simply duplicates the field names in the rows.

maurobio

  • Hero Member
  • *****
  • Posts: 623
  • Ecology is everything.
    • GitHub
Re: Error in ZMSQL with column names containing diacritics
« Reply #10 on: October 21, 2021, 12:34:58 am »
@wp,

Here it its. As you can see, this works fine (without diacritics in the column names).

Thank you!

With best wishes,
UCSD Pascal / Burroughs 6700 / Master Control Program
Delphi 7.0 Personal Edition
Lazarus 2.0.12 - FPC 3.2.0 on GNU/Linux Mint 19.1, Lubuntu 18.04, Windows XP SP3, Windows 7 Professional, Windows 10 Home

wp

  • Hero Member
  • *****
  • Posts: 11833
Re: Error in ZMSQL with column names containing diacritics
« Reply #11 on: October 21, 2021, 11:38:47 am »
Yes, and when I add the FixFieldNames code that I sent in a previous post this code works also with diacritics in the xls file - of course this is as expected.

But when I remove the auxiliary csv file and copy the worksheetdataset to the zmsqlquerydataset directly, like in the following snippet, there is the "inactive dataset" error again.
Code: Pascal  [Select][+][-]
  1. procedure TForm1.Button1Click(Sender: TObject);
  2. begin
  3.   (*
  4.   CSVExporter1.FileName := 'PlantList.csv';
  5.   CSVExporter1.Execute;
  6.   ZMQueryDataset1.LoadFromTable;
  7.   *)
  8.   ZMQueryDataset1.CopyFromDataset(sWorksheetDataset1);
  9. end;

Something must be different in the ZMSqlDataset depending on whether it is populated via a CSV file or via CopyFromDataset.


maurobio

  • Hero Member
  • *****
  • Posts: 623
  • Ecology is everything.
    • GitHub
Re: Error in ZMSQL with column names containing diacritics
« Reply #12 on: October 21, 2021, 01:06:43 pm »
Hi, @wp!

Something must be different in the ZMSqlDataset depending on whether it is populated via a CSV file or via CopyFromDataset.

I did already suspect that, because when I tried to use CopyFromDataset in my application, it broke, so I did stick to CSVExporter which works fine.

I will try again with your FixedFieldNames routine.

Thanks a lot!

With best wishes,
UCSD Pascal / Burroughs 6700 / Master Control Program
Delphi 7.0 Personal Edition
Lazarus 2.0.12 - FPC 3.2.0 on GNU/Linux Mint 19.1, Lubuntu 18.04, Windows XP SP3, Windows 7 Professional, Windows 10 Home

 

TinyPortal © 2005-2018