Author Topic: Listbox and Listboxtofile  (Read 467 times)

JLWest

  • Hero Member
  • *****
  • Posts: 614
Listbox and Listboxtofile
« on: November 14, 2019, 07:49:12 am »
I have a program with a listbox. The listbox has 43397 items. It is written to disk with the code shown below.

After writing the file out, I get this in the middle of the file (about 3,000 lines of this stuff):

 "¬Ãƒâ€šÃ‚¢ÃƒÆ’ƒÆ’†â€™ÃƒÂÃâ€Ã"

Has anyone seen this before?



Code: Pascal
procedure TForm1.ListBoxToFile(AFile : String; ABOX : TListbox);
var
  i       : integer = -1;
  OutFile : Textfile;
  Line    : String = '';
begin
  AssignFile(OutFile, AFile);
  try
    Rewrite(OutFile);
    for i := 0 to ABOX.Items.Count - 1 do begin
      Line := ABOX.Items[i];
      Line := Trim(Line);
      WriteLn(OutFile, Line);
    end;
  finally
    CloseFile(OutFile);
  end;
end;

« Last Edit: November 14, 2019, 05:51:20 pm by JLWest »
FPC 3.2.0, Lazarus IDE v2.0.4
 Windows 10 Pro 32-GB
 Intel i7 770K CPU 4.2GHz 32702MB Ram
GeForce GTX 1080 Graphics - 8 Gig
4.1 TB

sstvmaster

  • Full Member
  • ***
  • Posts: 132
Re: Listbox and Listboxtofile
« Reply #1 on: November 14, 2019, 08:41:29 am »
Hi,

ListBox can save to file by itself.

You can use:
Code: Pascal
procedure TForm1.ListBoxToFile(AFile : String; ABOX : TListbox);
begin
  ABOX.Items.SaveToFile(AFile);
end;
Lazarus 2.0.6 x32
Lazarus 2.1.0 Trunk x32
OS Win 7 32bit

Thaddy

  • Hero Member
  • *****
  • Posts: 9309
Re: Listbox and Listboxtofile
« Reply #2 on: November 14, 2019, 09:12:35 am »
Yes, and that is the recommended way, but the original code should almost work...
I did a quick rewrite to see what was going on:
Code: Pascal
procedure TForm1.ListBoxToFile(const AFile : String; const ABOX : TListbox); // use const here
var
  OutFile : Textfile;
  Line    : String = '';
begin
  AssignFile(OutFile, AFile);
  try
    Rewrite(OutFile);
    for Line in ABOX.Items do  // iterate the listbox that was passed in as ABOX
      WriteLn(OutFile, Line.Trim);
  finally
    Flush(OutFile); // <---- !!! flush() before closing !!!
    CloseFile(OutFile);
  end;
end;

But again: SaveToFile is the preferred way, although the original code should work. My example - besides flush() - is just to demo some modern syntax possibilities, and it works.
« Last Edit: November 14, 2019, 09:17:04 am by Thaddy »
also related to equus asinus.

wp

  • Hero Member
  • *****
  • Posts: 6502
Re: Listbox and Listboxtofile
« Reply #3 on: November 14, 2019, 09:47:55 am »
Looks like some incorrect conversion from/to UTF-8. When you load the file into Notepad++, what does it display in the status bar? ANSI? Play with its encoding settings and see if you can get a readable display. Or post the file to some cloud storage (it is probably too large for a forum attachment) so that we can have a look.
Lazarus trunk / fpc 3.0.4 / all 32-bit on Win-10

JLWest

Re: Listbox and Listboxtofile
« Reply #4 on: November 14, 2019, 05:49:31 pm »
It was late last night, but before I gave up here is what I did.

Wrote the file to a different disk. Same results.

Then I noticed the junk was in the middle of a record, and always the same record (attached).
Haven't tried the file without the record yet. Notepad++ says UTF-8.

WP - I think you're on to something.

"Looks like some incorrect conversion from/to UTF8. When you load the file into NotePad++ what does it display in the statusline? ANSI?"

Yes, I can post the file and program on my Google Drive if need be.

« Last Edit: November 14, 2019, 05:52:07 pm by JLWest »

winni

  • Hero Member
  • *****
  • Posts: 609
Re: Listbox and Listboxtofile
« Reply #5 on: November 14, 2019, 06:30:45 pm »
Hi!

In your trash record there is some conversion "tutti frutti", as wp assumed.
It looks like this:

Code: Text
Nil|1651|LF|Montmorelien|Aignes-et-PuypÃ©roux|Angoul\u00eame|France|45.46269883|000.15584462|I|A|

It should be:
Aignes-et-Puypéroux     and
Angoulême

So both times the é and ê were converted wrongly.
The UTF-8 routines got confused and did not recover before the end of the file.

Is the above line your original data?

Winni

JLWest

Re: Listbox and Listboxtofile
« Reply #6 on: November 14, 2019, 07:08:24 pm »
Yes, that is the original line.


wp

Re: Listbox and Listboxtofile
« Reply #7 on: November 14, 2019, 07:39:29 pm »
Where did you get this line from? I am rather sure that it is already encoded incorrectly in its source, because the two wrong characters are displayed inconsistently in your string, and Lazarus would not normally produce that:

'Ã©' -- the Lazarus Character Map (menu "Edit") shows that these two characters are the representation of the byte values $C3 ('Ã') and $A9 ('©') in codepages 1250, 1252, 1254, 1258 and maybe some more. On the Unicode page of the Character Map, range "Latin-1 Supplement", the UTF-8 character corresponding to $C3 $A9 can be identified as 'é'.

'\u00ea' -- a Google search shows that this is the Unicode character U+00EA ('ê'), written as a \uXXXX escape. This notation indicates that this part of the string originates in C/C++/Java.
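
To check which bytes a suspicious field really contains, a small hex dump can help. This is only a minimal sketch using the plain FPC RTL; the program name and the helper BytesToHex are made up for illustration, and the two sample strings are typed in by hand rather than read from the real data file.

```pascal
program HexDumpField;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: returns the bytes of S as space-separated hex pairs.
function BytesToHex(const S: String): String;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
  begin
    if i > 1 then
      Result := Result + ' ';
    Result := Result + IntToHex(Ord(S[i]), 2);
  end;
end;

begin
  // 'é' stored correctly as UTF-8 is the two bytes C3 A9.
  WriteLn(BytesToHex(#$C3#$A9));   // prints: C3 A9
  // The escaped form is plain ASCII: backslash, 'u' and four hex digits.
  WriteLn(BytesToHex('\u00ea'));   // prints: 5C 75 30 30 65 61
end.
```

Feeding the real field through such a helper would show whether the file holds C3 A9 (a correct 'é' that is only displayed wrongly) or a longer sequence (text that was converted once too often).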

JLWest

Re: Listbox and Listboxtofile
« Reply #8 on: November 14, 2019, 08:51:31 pm »
@WP

I'm not sure.

The complete file is 43,396 lines of airport data. About 33,000 of the lines come from my Apt.Dat file, which is probably generated by C++.

After some research: the record came from the Apt.Dat file, but only the following part:
 '1   612 1 0 1651  Montmorelien |45.46269883|000.15584462|I|A|'

I think this information in the record, 'Montmorelien|Aignes-et-PuypÃ©roux|Angoul\u00eame|France', came from a JSON file I picked up on GitHub.

« Last Edit: November 14, 2019, 08:53:43 pm by JLWest »

winni

Re: Listbox and Listboxtofile
« Reply #9 on: November 14, 2019, 08:53:01 pm »
Hello!

The old rule: Garbage in - garbage out.

As wp pointed out, there are two different conversion errors in one record.
And for the first error there are a lot of different code pages that fit.
An awful amount of code is necessary to detect these errors.

What you can do:
* Delete the record and hope that there are no further mistakes
or
* Change the record manually and
* Refine your code to detect the lines with bad UTF-8 encoding
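
A rough sketch of such a detection, covering only the two error patterns seen in this thread: a literal \uXXXX escape, and a 'Ã' (bytes C3 83) directly followed by a C2-prefixed character, which is the typical trace of UTF-8 that was encoded a second time. This is a heuristic, not a general encoding validator; the program name and both function names are made up for illustration, and it assumes the file itself is stored as UTF-8.

```pascal
program FindSuspectLines;
{$mode objfpc}{$H+}

uses
  SysUtils;

// True if S contains a literal \uXXXX escape (backslash, 'u', four hex digits).
function HasUnicodeEscape(const S: String): Boolean;
var
  i, j: Integer;
  AllHex: Boolean;
begin
  Result := False;
  for i := 1 to Length(S) - 5 do
    if (S[i] = '\') and (S[i + 1] = 'u') then
    begin
      AllHex := True;
      for j := i + 2 to i + 5 do
        if not (S[j] in ['0'..'9', 'a'..'f', 'A'..'F']) then
          AllHex := False;
      if AllHex then
        Exit(True);
    end;
end;

// Heuristic: the byte sequence C3 83 C2 appears when a two-byte UTF-8
// character was read as ANSI and then converted to UTF-8 again.
function LooksDoubleEncoded(const S: String): Boolean;
const
  Marker: String = #$C3#$83#$C2;
begin
  Result := Pos(Marker, S) > 0;
end;

var
  InFile: TextFile;
  Line: String;
  LineNo: Integer = 0;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: FindSuspectLines <file>');
    Halt(1);
  end;
  AssignFile(InFile, ParamStr(1));
  Reset(InFile);
  try
    while not Eof(InFile) do
    begin
      ReadLn(InFile, Line);
      Inc(LineNo);
      // Report the line number so the record can be fixed or deleted by hand.
      if HasUnicodeEscape(Line) or LooksDoubleEncoded(Line) then
        WriteLn(LineNo, ': ', Line);
    end;
  finally
    CloseFile(InFile);
  end;
end.
```

Run it with the data file as the only argument. It will miss other kinds of damage and may flag a legitimate line now and then, so treat the output as a list of candidates to look at.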


If you plan to work with geo databases I really recommend GeoNames:

https://www.geonames.org/export/

Their master DB, called allCountries.zip, is a CSV file with more than 10 million records:
cities, villages, mountain peaks, airports, ...
with name, name in ASCII, international name, inhabitants, sea level, timezone and ...
and of course lat, long and nation.

And there are no UTF-8 errors. Or say: I did not find any.

Winni

JLWest

Re: Listbox and Listboxtofile
« Reply #10 on: November 14, 2019, 10:21:56 pm »
Thanks, all!
I just deleted the record and the junk in the file.

@Winni

I'll check out GeoNames, thanks.