
Author Topic: Eliminating empty strings in TStringList when reading from text file.  (Read 21302 times)

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #30 on: March 12, 2020, 10:52:10 pm »
Hi!

Question @lucamar:

Is such an error like LFCR instead of CRLF consistent in those files?

Or are there other surprises?

Winni

MoCityMM

  • Jr. Member
  • **
  • Posts: 72
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #31 on: March 13, 2020, 12:52:27 am »
As far as I know, 'empty' means zero length, but to each his own on that.

Use whatever works for you and that you feel comfortable with.  :D

-Mo

jamie

  • Hero Member
  • *****
  • Posts: 6802
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #32 on: March 13, 2020, 12:52:35 am »
Let's see...
I couldn't use a TStringList because it offers no direct pointer to the underlying text: GetText and the Text property rebuild the string on every call, so that's a bust, and it also wastes processing time. There should be a way to get at the raw text and then call a method so that the string list updates itself if values were changed directly.

In any case, I did an example using a TMemoryStream. The stream can be used to load the file and then filter out the blank lines quickly; after that you can hand the memory over to a TStringList if you want to use its facilities.
 
Code: Pascal
// Removes empty lines in place. The buffer must end with a #0 terminator;
// the test code below writes one. S.Size itself is left unchanged.
procedure RemoveEmptyLines(S: TMemoryStream);
var
  R, W: PChar;       // read and write cursors into the same buffer
  CC, LEC: Integer;  // character count, line-ending count
begin
  if (S = nil) or (S.Size = 0) then Exit;
  R := PChar(S.Memory);
  W := R;
  while R^ <> #0 do
  begin
    CC := 0;
    while not (R^ in [#13, #10, #0]) do // move line content, if any
    begin
      W^ := R^;
      Inc(R); Inc(W);
      Inc(CC);
    end;
    if CC <> 0 then
    begin
      LEC := 0;
      // Move the line ending of a non-empty line (at most two characters).
      while (R^ in [#13, #10]) and (LEC < 2) do
      begin
        Inc(LEC);
        W^ := R^;
        Inc(R); Inc(W);
      end;
    end
    else
      while R^ in [#13, #10] do Inc(R); // skip over blank lines
  end;
  W^ := #0; // terminate the shortened text
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  S: TMemoryStream;
  Txt: string;
begin
  // Test code: a memo supplies the test data; you can load from a file here.
  Txt := Memo1.Lines.Text;
  S := TMemoryStream.Create;
  try
    S.Write(PChar(Txt)^, Length(Txt) + 1); // +1 copies the #0 terminator too
    RemoveEmptyLines(S);                   // the real workhorse
    Memo1.Lines.Text := PChar(S.Memory);   // copies up to the #0 terminator
  finally
    S.Free;
  end;
end;

I still need to work out the line-ending issue in case there are repeated single-character line endings. I can work that out later, I guess, but this works like streak lightning compared to the others.

The only true wisdom is knowing you know nothing

jamie

  • Hero Member
  • *****
  • Posts: 6802
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #33 on: March 13, 2020, 01:04:01 am »
Code: Pascal
// Removes empty lines in place. The buffer must end with a #0 terminator;
// the test code below writes one. S.Size itself is left unchanged.
procedure RemoveEmptyLines(S: TMemoryStream);
var
  R, W: PChar;       // read and write cursors into the same buffer
  CC, LEC: Integer;  // character count, line-ending count
  L: Char;           // last line-ending character copied
begin
  if (S = nil) or (S.Size = 0) then Exit;
  R := PChar(S.Memory);
  W := R;
  while R^ <> #0 do
  begin
    CC := 0;
    while not (R^ in [#13, #10, #0]) do // move line content, if any
    begin
      W^ := R^;
      Inc(R); Inc(W);
      Inc(CC);
    end;
    if CC <> 0 then
    begin
      LEC := 0;
      L := #0;
      // Move the line ending of a non-empty line: at most two characters
      // and never the same character twice, so #10#10 or #13#13 is read
      // as one line ending followed by an empty line.
      while (R^ in [#13, #10]) and (L <> R^) and (LEC < 2) do
      begin
        L := R^;
        Inc(LEC);
        W^ := R^;
        Inc(R); Inc(W);
      end;
    end
    else
      while R^ in [#13, #10] do Inc(R); // skip over blank lines
  end;
  W^ := #0; // terminate the shortened text
end;

procedure TForm1.Button1Click(Sender: TObject);
var
  S: TMemoryStream;
  Txt: string;
begin
  // Test code: a memo supplies the test data; you can load from a file here.
  Txt := Memo1.Lines.Text;
  S := TMemoryStream.Create;
  try
    S.Write(PChar(Txt)^, Length(Txt) + 1); // +1 copies the #0 terminator too
    RemoveEmptyLines(S);                   // the real workhorse
    Memo1.Lines.Text := PChar(S.Memory);   // copies up to the #0 terminator
  finally
    S.Free;
  end;
end;

Sorry for the noise on the last one. This one handles single-character line endings too, and it does not matter whether the file uses #13#10 pairs or singles of either.
The only true wisdom is knowing you know nothing

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #34 on: March 13, 2020, 08:45:28 am »
Quote
Is such an error like LFCR instead of CRLF consistent in those files?

Or are there other surprises?

Depends on what you mean by "consistent". It's consistent in the sense that if a file uses it as line separator, it uses it for all lines. But, of course, you can always count on finding surprises such as partially or incorrectly converted texts, etc.

Note that using LF+CR is not really an "error" per se; some printers needed that exact combination rather than CR+LF to advance a line, so some editors catered to that. Also some (most?) systems don't really define what is to be considered a line terminator and what not, so almost anything goes in there. :)
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.
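
Since the goal of the thread is to drop empty lines anyway, one way to sidestep the question of which terminator a file uses is to treat any run of #13 and #10 characters as a single separator. The following is only a minimal sketch under that assumption; AddNonEmptyLines is a hypothetical helper name, not something from the RTL or from any post above.

```pascal
program SplitAnyEnding;
{$mode objfpc}{$H+}
uses
  Classes;

// Hypothetical helper: appends every non-empty line of Text to Dest.
// Any run of #13/#10 characters, in any order, counts as one separator.
procedure AddNonEmptyLines(const Text: string; Dest: TStrings);
var
  I, Start: Integer;
begin
  I := 1;
  while I <= Length(Text) do
  begin
    // Skip a run of line-ending characters.
    while (I <= Length(Text)) and (Text[I] in [#10, #13]) do
      Inc(I);
    Start := I;
    // Collect characters up to the next line-ending character.
    while (I <= Length(Text)) and not (Text[I] in [#10, #13]) do
      Inc(I);
    if I > Start then
      Dest.Add(Copy(Text, Start, I - Start));
  end;
end;

var
  SL: TStringList;
  I: Integer;
begin
  SL := TStringList.Create;
  try
    // CRLF, LFCR, bare LF and bare CR mixed in one string.
    AddNonEmptyLines('one'#13#10#13#10'two'#10#13'three'#10#10'four'#13'five', SL);
    for I := 0 to SL.Count - 1 do
      WriteLn(SL[I]);
  finally
    SL.Free;
  end;
end.
```

Note that a line consisting only of spaces still counts as non-empty here; wrap the copied text in Trim if such lines should go as well.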

dbannon

  • Hero Member
  • *****
  • Posts: 3230
    • tomboy-ng, a rewrite of the classic Tomboy
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #35 on: March 13, 2020, 09:47:43 am »
Quote
1) #13#10;
2) #10#13;
3) #10;
4) #13.
I haven't seen them in any of my files. Not even the classic mac one ........
I definitely have seen text files with #13 line endings. At the time they were believed to have come from a Mac attached to a scientific instrument. Long time ago ...

Davo
Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

Thaddy

  • Hero Member
  • *****
  • Posts: 16580
  • Kallstadt seems a good place to evict Trump to.
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #36 on: March 13, 2020, 11:32:31 am »
Quote
Hi!

Question @lucamar:

Is such an error like LFCR instead of CRLF consistent in those files?

Or are there other surprises?

Winni
The surprise is maybe that the LineEnding const is cross-platform. You may run into trouble, though, when reading files created on different platforms.
But I am sure they don't want the Trumps back...
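
For files created on another platform, as mentioned in reply #36, one possible precaution is to normalize the line breaks before handing the text to a TStringList. The sketch below uses AdjustLineBreaks from SysUtils for that; the sample string is made up for illustration, and LFCR pairs are not treated specially by this approach.

```pascal
program NormalizeEndings;
{$mode objfpc}{$H+}
uses
  SysUtils, Classes;

var
  Raw, Normalized: string;
  SL: TStringList;
begin
  // A CRLF-terminated line, then an LF-terminated and a CR-terminated one.
  Raw := 'alpha'#13#10'beta'#10'gamma'#13'delta';
  // Rewrite every line break in the platform's default style.
  Normalized := AdjustLineBreaks(Raw);
  SL := TStringList.Create;
  try
    SL.Text := Normalized;
    WriteLn('Lines read: ', SL.Count);
  finally
    SL.Free;
  end;
end.
```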

Bart

  • Hero Member
  • *****
  • Posts: 5516
    • Bart en Mariska's Webstek
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #37 on: March 13, 2020, 06:49:10 pm »
Anyone up for a solution using a RegEx?

Bart

MaxCuriosus

  • Full Member
  • ***
  • Posts: 136
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #38 on: March 13, 2020, 11:16:28 pm »
avk,

what's the name of the library and where can I find it?

MaxCuriosus

  • Full Member
  • ***
  • Posts: 136
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #39 on: March 13, 2020, 11:19:54 pm »
Before checking the various character-based solutions, I will try another approach that avoids LineEnding altogether.

The code below has some similarities with the bubble sort algorithm. The empty strings ripple through and accumulate towards the tail end of the list, then the tail is cut off.

I would welcome your comments on the performance.

Code: Pascal
var
  SL: TStringList;
  I, J, K, N: Integer;

begin
  SL := TStringList.Create;
  try
    SL.LoadFromFile(SomeTextFile);

    N := SL.Count;

    for I := 0 to N - 2 do
      if SL.Strings[I] = '' then
        for J := I + 1 to N - 1 do
          if SL.Strings[J] <> '' then
          begin
            // Pull the next non-empty string forward into the empty slot.
            SL.Strings[I] := SL.Strings[J];
            SL.Strings[J] := '';
            Break;
          end;

    // K = number of entries in front of the trailing run of empty strings.
    // Counting it here also covers lists where nothing had to be moved.
    K := N;
    while (K > 0) and (SL.Strings[K - 1] = '') do
      Dec(K);

    // Cut off the tail.
    while SL.Count > K do
      SL.Delete(SL.Count - 1);

    {...}
  finally
    SL.Free;
  end;
end;
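
On the performance question: the inner J loop can rescan the same stretch of empty strings many times, so the ripple approach may degrade toward quadratic time on lists with long empty runs. A possible alternative, sketched here with separate read and write indexes (the same idea as the R and W pointers in the stream version earlier in the thread), visits each entry once. RemoveEmptyStrings is a hypothetical name, and the sketch assumes an unsorted list without attached Objects.

```pascal
program CompactStrings;
{$mode objfpc}{$H+}
uses
  Classes;

// Hypothetical helper: removes empty strings from SL in a single pass.
procedure RemoveEmptyStrings(SL: TStrings);
var
  R, W: Integer;
begin
  W := 0; // next slot that receives a non-empty string
  for R := 0 to SL.Count - 1 do
    if SL[R] <> '' then
    begin
      if W <> R then
        SL[W] := SL[R];
      Inc(W);
    end;
  // Entries from W onward are leftovers. Deleting from the end means
  // no remaining entries have to be shifted.
  while SL.Count > W do
    SL.Delete(SL.Count - 1);
end;

var
  SL: TStringList;
  I: Integer;
begin
  SL := TStringList.Create;
  try
    SL.Add('a'); SL.Add(''); SL.Add('b');
    SL.Add(''); SL.Add(''); SL.Add('c');
    RemoveEmptyStrings(SL);
    for I := 0 to SL.Count - 1 do
      WriteLn(SL[I]);
  finally
    SL.Free;
  end;
end.
```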

MaxCuriosus

  • Full Member
  • ***
  • Posts: 136
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #40 on: March 13, 2020, 11:22:00 pm »
Bart,

what's the matter with you?
It seems you stick your RegEx question anywhere at random.
Are you obsessed with RegEx?

(just kidding)

Bart

  • Hero Member
  • *****
  • Posts: 5516
    • Bart en Mariska's Webstek
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #41 on: March 13, 2020, 11:31:31 pm »
Quote
what's the matter with you?

It's kind of Godwin's law.
If a discussion about strings goes on long enough, a RegEx solution will pop up eventually.
At which point the thread spirals out of control.

Bart

eljo

  • Sr. Member
  • ****
  • Posts: 468
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #42 on: March 14, 2020, 01:38:24 am »
Quote
what's the matter with you?

It's kind of Godwin's law.
If a discussion about strings goes on long enough, a RegEx solution will pop up eventually.
At which point the thread spirals out of control.

Bart
And you are what? Its reminder to make an appearance?

avk

  • Hero Member
  • *****
  • Posts: 771
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #43 on: March 14, 2020, 05:13:21 am »
Quote
what's the name of the library and where can I find it?

It is LGenerics

egsuh

  • Hero Member
  • *****
  • Posts: 1534
Re: Eliminating empty strings in TStringList when reading from text file.
« Reply #44 on: March 14, 2020, 01:14:38 pm »
Well, the following is quite an old way, but the most PASCALISTIC one, I think.  :D

Code: Pascal
function StringsFromTextFile(const fn: string): TStringList;
var
  f: TextFile;
  s: string;
begin
  Result := TStringList.Create;
  AssignFile(f, fn);
  Reset(f);
  try
    while not Eof(f) do
    begin
      ReadLn(f, s);
      // Keep only lines that contain something besides whitespace.
      if Trim(s) <> '' then
        Result.Append(s);
    end;
  finally
    CloseFile(f);
  end;
end;
     
 

 

 
