
Error in ZMSQL with column names containing diacritics


maurobio:
Dear ALL,

It seems that there is an issue concerning column names with diacritics in the otherwise great ZMSQL component (https://wiki.freepascal.org/ZMSQL).

When attempting to execute a SQL query on a CSV file in which the column names contain diacritics, I got an error "Operation cannot be performed on an active dataset". This error, however, does not appear when I execute the same SQL query on a CSV not containing diacritics in column names.

Attached is some sample code illustrating the problem.

Is there any workaround for this annoying problem?

Thanks in advance for any assistance you can provide.

With best wishes,

wp:
Are diacritics allowed in field names at all? Did you try to quote the field names?

BTW, you do not need the CSVExporter and the temporary CSV file to get the data from the worksheetdataset into the zmsqldataset. Simply call ZMSQLDataset.CopyFromDataset(sWorksheetDataset1).

maurobio:
Hi, @wp!

Thanks for the reply.

In a truly localized application, diacritics should be allowed in field names - especially when the fields are the columns of a spreadsheet, as in the sample program (and in my larger application). Quoting the field names with QuotedStr() removes the error but does not solve the problem: the query returns a blank record.
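For reference, this is a minimal sketch of the quoted form that removes the error ('Espécies' and 'table1' are placeholder names for illustration, not from the actual program):

```pascal
program QuoteFieldName;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  SQL: string;
begin
  // QuotedStr wraps the name in single quotes, so the SQL parser reads
  // it as one token instead of stopping at the first non-ASCII byte.
  SQL := 'SELECT ' + QuotedStr('Espécies') + ' FROM table1';
  WriteLn(SQL);   // SELECT 'Espécies' FROM table1
end.
```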

Thanks for the very useful tip concerning CSVExporter! BTW, is there a simple way to export an entire FPSpreadsheetDataset to a CSV file? I presume it can be done by directly accessing the underlying spreadsheet, as you mentioned in another post, but I cannot remember exactly how!

With best wishes,

wp:
Not sure whether ZMSQL supports non-ASCII characters in field names. Look at the SQL tokenizer, TjanSQLTokenizer.GetToken in unit janSQLTokenizer: it analyzes the SQL string and detects the command, the field names, the ORDER BY field names, etc. While scanning the SQL string it recognizes only these characters: single quote, comma, 'A'..'Z', 'a'..'z', '0'..'9', brackets, '+', '-', '*', '/', '=', '<', '>', and space. Non-ASCII characters are outside this range and will terminate, for example, the extraction of a field name. However, when a field name is single-quoted the scanner accepts every byte until it finds the closing single quote - so non-ASCII field names may be detected in that case.
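A rough approximation of that character test (this is NOT the actual janSQLTokenizer code, just an illustration of why an unquoted field name breaks at the first non-ASCII byte):

```pascal
program TokenCharSketch;
{$mode objfpc}{$H+}

// Simplified stand-in for the scanner's identifier-character check.
function IsPlainTokenChar(c: Char): Boolean;
begin
  Result := c in ['A'..'Z', 'a'..'z', '0'..'9'];
end;

begin
  // The UTF-8 encoding of 'é' is the two bytes $C3 $A9; both fail the
  // test, so an unquoted "Espécies" is cut off after "Esp".
  WriteLn(IsPlainTokenChar('E'));    // TRUE
  WriteLn(IsPlainTokenChar(#$C3));   // FALSE
end.
```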

But this is not all; the further processing must handle these characters correctly as well. In procedure TZMQueryDataSet.InsertDataFromJanSQL there is a line converting a string from a guessed codepage to UTF8. I don't know whether this applies to field names or field values, but it is an indication that janSQL (on which ZMSQL is based) internally works with ANSI codepages. Therefore, my suggestion is to test the SQL with the field names converted to the codepage of your system. You are from Western Europe? Then CP1252 could be correct.
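One way to do that conversion of the SQL string is a sketch like the following, assuming the LConvEncoding unit from the Lazarus LazUtils package (the query text itself is a placeholder):

```pascal
program SqlToCP1252;
{$mode objfpc}{$H+}
uses
  LConvEncoding;   // ConvertEncoding and the Encoding* constants
var
  sqlUTF8, sqlCP1252: string;
begin
  sqlUTF8 := 'SELECT ''Espécies'' FROM table1';   // placeholder query
  // Convert the whole SQL string (and thus the field names in it)
  // from UTF-8 to Windows-1252 before passing it to ZMSQL.
  sqlCP1252 := ConvertEncoding(sqlUTF8, EncodingUTF8, EncodingCP1252);
  // In CP1252 'é' occupies one byte instead of two.
  WriteLn(Length(sqlUTF8), ' -> ', Length(sqlCP1252));
end.
```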

Next question: How to convert the spreadsheet to CP1252? This is answered together with your other question about how to convert a spreadsheet file to csv directly: For this you must read the spreadsheet with the TsWorkbook method ReadFromFile() and then save it as a CSV file by specifying sfCSV as output format in WriteToFile(). By default, the CSV file is written as UTF8, but you can specify any other code page in the CSVParams record.

Therefore, you could use the following code to convert an xls file directly to a CSV file in your system's codepage. Load this into ZMSQLDataset, and repeat your tests.


--- Code: Pascal ---
uses
  fpsSpreadsheet, fpsTypes, fpsCSV, xlsBIFF8;

procedure TForm1.XlsToCSV(XlsFileName, CSVFileName: String);
var
  workbook: TsWorkbook;
begin
  workbook := TsWorkbook.Create;
  try
    workbook.ReadFromFile(XlsFileName, sfExcel8);
    CSVParams.Encoding := 'cp1252';   // Write the CSV with codepage 1252 (Western Europe)
    workbook.WriteToFile(CSVFileName, sfCSV, true);
  finally
    workbook.Free;
  end;
end;
--- End code ---

maurobio:
Hi, @wp!

Again, thank you very much for your insights.

In fact, what I need is a normalization function which takes strings containing diacritics and replaces them with their ASCII equivalents (e.g., 'á' becomes 'a', 'é' becomes 'e', and so on). In this way, I could translate field names on the fly from, e.g., 'Espécies' to 'Especies'.

But I could not find a way of achieving that in FPC/Lazarus. There are some potentially useful suggestions on Stack Overflow:

https://stackoverflow.com/questions/1891196/convert-hi-ansi-chars-to-ascii-equivalent-%c3%a9-e
https://stackoverflow.com/questions/66733768/any-rtl-function-to-remove-accents-from-a-char

and Daniweb:

https://www.daniweb.com/programming/software-development/threads/492700/how-to-use-normalizestring-function-in-delphi


However, they work only in Delphi.

In this forum there is a discussion on this topic: https://forum.lazarus.freepascal.org/index.php?topic=13098.0, but the function suggested there just replaces the diacritics with a fixed character.

I have also tried function NormalizeNFD from the unicodedata unit, but it simply returns the same string with diacritics!
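NormalizeNFD may actually be doing its job here: NFD decomposes 'é' (U+00E9) into 'e' (U+0065) plus a combining acute accent (U+0301), so the result only *looks* unchanged. Stripping the combining marks (U+0300..U+036F) afterwards yields the plain-ASCII form. A minimal sketch, assuming FPC's unicodedata unit and the standard UTF8Decode/UTF8Encode routines:

```pascal
program StripDiacritics;
{$mode objfpc}{$H+}
uses
  SysUtils, unicodedata;

// Decompose with NFD, then drop the combining diacritical marks.
function RemoveDiacritics(const S: string): string;
var
  Decomposed, Clean: UnicodeString;
  i: Integer;
begin
  // 'é' becomes 'e' + combining acute (U+0301) after decomposition
  Decomposed := NormalizeNFD(UTF8Decode(S));
  Clean := '';
  for i := 1 to Length(Decomposed) do
    // U+0300..U+036F is the "Combining Diacritical Marks" block
    if (Ord(Decomposed[i]) < $0300) or (Ord(Decomposed[i]) > $036F) then
      Clean := Clean + Decomposed[i];
  Result := UTF8Encode(Clean);
end;

begin
  WriteLn(RemoveDiacritics('Espécies'));   // Especies
end.
```

Note that this only handles characters that decompose into a base letter plus combining marks; ligatures such as 'ß' or 'æ' would need explicit replacement.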

Any suggestions?

With best wishes,
