Author Topic: extreamly slow text file loading (55,000 lines text) into listbox  (Read 24142 times)

luthepa1

  • New Member
  • *
  • Posts: 15
Hi all,

I was hoping someone could help me understand why this is happening and/or advise on the best way to carry out my task.  I service an instrument for my company that writes text log files of every device command sent and every reply received, and it starts a new log once the working log file reaches 4MB in size.  Each field is comma separated, so I guess you would call this a CSV file.  The instrument records roughly 6-10 commands per millisecond.  I have software on Windows to review and filter log files, which I use to track down the root cause of instrument errors/failures.  I am writing my own log reviewer app in Lazarus because I want extra features to help speed up troubleshooting, and I think I can make a better app.

I am developing under Linux Ubuntu 8.1.  My problem is that loading a 4MB text file takes forever!  Using a TextFile variable with the AssignFile, Reset, ReadLn and CloseFile method is very slow!!!  I just tried this method to open a 4MB text file (this file is 56,000 lines) and it took 27 mins to get to the 26,000th line.  I added a counter to update a label object to tell me what line it was up to.  ListBox.Items.LoadFromFile is also painfully slow.  I also found an example on delphi3000 for a line-by-line read (http://www.delphi3000.com/articles/article_1363.asp?SK=), but this is still too slow for me (although it was faster).  I run an Asus mobo with a Core 2 Duo E8400 3GHz, 4GB RAM and a SATA 2 HDD.  Meanwhile on my laptop, which is a Centrino 1.7GHz with 500MB RAM and maybe a SATA HDD, the service software's log reviewer app can read this 4MB text file into its stringlist/stringgrid in 2 sec!!!!  Way, way, way faster.

Can anyone shed some light on what my problem could be, or on a method more suitable for my task?  I started to look at memory-mapped file methods, but my programming experience is not that high and it would require me to do some more learning.  But I am up for learning!  Below is the fastest routine I have made so far.  I want a 4MB text file to load in seconds, not minutes to hours.

  AssignFile(F, fName); //Open up the text file as an untyped file
  Reset(F, 1);
  fSize := FileSize(fName);    //Get the file size
  pb.Max := fSize;
  NewBuffer := '';
  buffer := '';
  lineCount := 0;

  //while FilePos(F) <> fSize do
  Repeat
    //read in up to an 8K block
    BlockRead(F, buf, BlockSize, BlockCount);
    // Allocate the new buffer
    GetMem(NewBuffer, BlockSize + 1);
    // Copy the New Data Into The Buffer
    StrPLCopy(NewBuffer, buf, BlockCount);
    // Concat the buffers
    Buffer := Buffer + NewBuffer;
    // Return the buffer memory
    FreeMem(NewBuffer);
    // Step 2. Chop the data into a stringlist
    while (Pos(#10, buffer) <> 0) do
    begin
      //showmessage('ok');
      Application.ProcessMessages;
      CutAt := Pos(#10, buffer);
      Line := Copy(buffer, 1, CutAt);
      Line := Trim(Line);
      ListBox1.Items.Add(Line);
      Delete(buffer, 1, CutAt);
      Inc(lineCount);
      Label1.Caption := 'Total Lines: ' + IntToStr(lineCount);
    end;

    //showmessage(IntToStr(FilePos(F)));

    Inc(logCount);
    pb.Position := BlockSize * logCount;
  Until (BlockCount=0); //(logCount=20);

  CloseFile(F);                              //close the file, you're done
  showmessage(IntToStr(logCount));


Thanks in advance,

Paul.
« Last Edit: March 19, 2009, 03:32:46 am by luthepa1 »

Leledumbo

  • Hero Member
  • *****
  • Posts: 8835
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #1 on: March 19, 2009, 05:01:29 am »
Some people recommend using TFileStream together with TMemoryStream. I'm not very familiar with them though.
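
For reference, a minimal sketch of what the stream approach might look like, assuming the goal is simply to get the whole file into memory in one go and let TStrings split it into lines. The program name and the LoadLinesViaStreams helper are made up for illustration; the stream classes and methods are the standard ones from the Classes unit. This says nothing yet about how fast the listbox itself accepts the lines.

```pascal
program StreamLoadDemo; // hypothetical program name

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

// Hypothetical helper: copies the file into memory, then lets
// TStrings.LoadFromStream split the in-memory data into lines.
procedure LoadLinesViaStreams(const FileName: string; Lines: TStrings);
var
  Source: TFileStream;
  Mem: TMemoryStream;
begin
  Source := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    Mem := TMemoryStream.Create;
    try
      Mem.CopyFrom(Source, Source.Size); // pull the whole file into memory
      Mem.Position := 0;                 // rewind before parsing
      Lines.LoadFromStream(Mem);
    finally
      Mem.Free;
    end;
  finally
    Source.Free;
  end;
end;

var
  Lines: TStringList;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: streamloaddemo <logfile>');
    Halt(1);
  end;
  Lines := TStringList.Create;
  try
    LoadLinesViaStreams(ParamStr(1), Lines);
    WriteLn(Lines.Count, ' lines loaded');
  finally
    Lines.Free;
  end;
end.
```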

arnoldb

  • Jr. Member
  • **
  • Posts: 97
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #2 on: March 19, 2009, 06:48:49 am »
I have had a similar problem in the past...  It turned out that updating screen display labels & the listbox takes a lot of time.

A good way to increase speed is to hide the listbox when loading, and not to call processmessages and a label update for every line, but on every -- say 1000th line.  Then once loaded, show the listbox again.
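
A rough sketch of that idea, assuming a TListBox and a TLabel are passed in from the form. The unit name, the LoadLinesThrottled procedure and the UpdateEvery constant are made up for illustration and are not from the original code.

```pascal
unit LogLoader; // hypothetical unit name

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, Forms, StdCtrls;

procedure LoadLinesThrottled(const FileName: string; Box: TListBox;
  Status: TLabel);

implementation

const
  UpdateEvery = 1000; // assumed interval: refresh the UI once per 1000 lines

procedure LoadLinesThrottled(const FileName: string; Box: TListBox;
  Status: TLabel);
var
  F: TextFile;
  Line: string;
  Count: Integer;
begin
  Count := 0;
  AssignFile(F, FileName);
  Reset(F);
  try
    Box.Visible := False; // hide the listbox while it is being filled
    try
      while not Eof(F) do
      begin
        ReadLn(F, Line);
        Box.Items.Add(Line);
        Inc(Count);
        // Touch the label and the message queue only every UpdateEvery lines
        if Count mod UpdateEvery = 0 then
        begin
          Status.Caption := 'Total Lines: ' + IntToStr(Count);
          Application.ProcessMessages;
        end;
      end;
    finally
      Box.Visible := True; // show it again, even if reading failed
    end;
  finally
    CloseFile(F);
  end;
  Status.Caption := 'Total Lines: ' + IntToStr(Count);
end;

end.
```

From a form it could be called as LoadLinesThrottled(fName, ListBox1, Label1).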

luthepa1

  • New Member
  • *
  • Posts: 15
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #3 on: March 19, 2009, 07:21:30 am »
Thanks for the feedback guys. 

Good points about the label update and Application.ProcessMessages.  However, I originally did not have these lines.  I added them later to get some kind of indication that my app was actually doing something, because before adding them the program just appeared to be frozen, although I could see the CPU at full throttle.  But the thing that really interests me is the comment about hiding the listbox until loaded.  I had been thinking of this; in Delphi there is a command to disable component updating, and you re-enable updating once the task is complete.  But Lazarus does not recognise this command.  Anyway, I just tried running again without these commands and with listbox.visible := false.  2-3 mins passed and it was still going.  Still too slow.

I am thinking I will need to do something more technical like TFileStream and TMemoryStream.  I keep thinking I should install Delphi to compare, but I don't want to because I think Lazarus has some real potential, especially since it runs on so many OSes.  Also, the code on that delphi3000 site, which loads line by line in 8K chunks, uses TFileStream.  That's still not as quick as I want, or as quick as I think it can be, although it is quicker, as I mentioned originally.

More help from anyone, please.

Thanks

Vincent Snijders

  • Administrator
  • Hero Member
  • *
  • Posts: 2661
    • My Lazarus wiki user page
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #4 on: March 19, 2009, 07:51:55 am »
If just reading is slow (without adding the lines to the listbox), which I doubt, then use SetTextBuf.

If the problem is with the listbox, make sure you use ListBox.Items.BeginUpdate.
I am not sure if ListBox.Items.BeginUpdate has been implemented for gtk; I am pretty sure it is implemented for win32.
Two examples:
Easy coding:
Code:
begin
  Listbox1.Items.BeginUpdate;
  try
    ListBox1.Items.LoadFromFile(fName);
  finally
    Listbox1.Items.EndUpdate;
  end;
end;

Possibly faster code:
Code:
const
  BufSize = 1024 * 1024; // 1 MB buffer, experiment with it; bigger than 64 kB probably doesn't show any speedup.
var
  F: Text;
  Line: string;
  Buffer: array[1..BufSize] of byte;
begin
  AssignFile(F, FName);
  Reset(F);
  SetTextBuf(F, Buffer);
  Listbox1.Items.BeginUpdate;
  try
    while not eof(F) do  // test before reading, so an empty file adds no line
    begin
      readln(F, Line);
      ListBox1.Items.Add(Line);
    end;
    CloseFile(F);
  finally
    Listbox1.Items.EndUpdate;
  end;
end;

Instead of assigning directly to the listbox, you can first read it into a stringlist and then do ListBox1.Items.Assign(MyStringList).
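
A minimal sketch of that variant, assuming the listbox is passed in from the form. The unit name and the LoadViaStringList procedure are made up for illustration. Whether this is actually faster depends on how the widgetset implements Assign, so treat it as something to measure rather than a guaranteed fix.

```pascal
unit StringListLoad; // hypothetical unit name

{$mode objfpc}{$H+}

interface

uses
  Classes, StdCtrls;

procedure LoadViaStringList(const FileName: string; Box: TListBox);

implementation

procedure LoadViaStringList(const FileName: string; Box: TListBox);
var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // Parse the file entirely in memory first; no widget is involved here
    Lines.LoadFromFile(FileName);
    Box.Items.BeginUpdate;
    try
      // Hand all lines to the listbox in a single call
      Box.Items.Assign(Lines);
    finally
      Box.Items.EndUpdate;
    end;
  finally
    Lines.Free;
  end;
end;

end.
```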

If you try any of these methods, please post timings. I am interested to see what is fastest.

Vincent Snijders

  • Administrator
  • Hero Member
  • *
  • Posts: 2661
    • My Lazarus wiki user page
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #5 on: March 19, 2009, 07:53:46 am »
Quote from: arnoldb
A good way to increase speed is to hide the listbox when loading, and not to call processmessages and a label update for every line, but on every -- say 1000th line.  Then once loaded, show the listbox again.

Hiding the listbox should not be necessary if you use BeginUpdate. Application.ProcessMessages and the Label updates are a waste of time, indeed.

luthepa1

  • New Member
  • *
  • Posts: 15
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #6 on: March 19, 2009, 12:29:28 pm »
OK, well, the simple code from the recommendation takes 9:15 (min:sec).  Slow.

As for the faster code option, I'm not sure yet.  I keep getting the compile error "wrong number of parameters specified for call to SetTextBuf".  I even tried a 3rd param for SetTextBuf with the size of the buffer.  I don't understand what's wrong.  Also, I had to use "System.Text" for the F variable.  It does not like "Text".


Vincent Snijders

  • Administrator
  • Hero Member
  • *
  • Posts: 2661
    • My Lazarus wiki user page
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #7 on: March 19, 2009, 01:01:45 pm »
System.Text or TextFile. Apparently you are using this code in a method of an object that has a Text property.

Vincent Snijders

  • Administrator
  • Hero Member
  • *
  • Posts: 2661
    • My Lazarus wiki user page
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #8 on: March 19, 2009, 02:08:10 pm »
Try Jump to declaration to see what Lazarus (not the compiler) thinks the number of parameters should be.

chrnobel

  • Sr. Member
  • ****
  • Posts: 283
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #9 on: March 19, 2009, 10:48:13 pm »
A little off topic, but my experience is that text reading as such is very, very fast.

I have some programs where I read line by line typically like this:
Code:
  Reset(F);
  While SeekEof(F)=false do
  begin
     Readln(F, S);
     // ....
     // Do a lot of validating of S, and store the result in a MySQL database
     // ...
  end;
  CloseFile(F);
And it is very fast, typically a few seconds for 50,000+ lines.

Also I have made an XML reader (which basically is the same), and it takes around 15 seconds to read a 12M XML file and save into a database on a virtual machine on my laptop.

So my conclusion is that the file read as such is not the problem; it has to be caused by something inside your loop.
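
One way to check this is to time the read with no GUI involved at all. Below is a small console sketch under that assumption; the program name is made up, and the lines go into a plain TStringList instead of a listbox. If this finishes in well under a second on the same file, the time is being lost in the widget and not in the file read.

```pascal
program ReadTiming; // hypothetical program name

{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

var
  F: TextFile;
  Line: string;
  Lines: TStringList;
  StartTick: QWord;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: readtiming <logfile>');
    Halt(1);
  end;

  Lines := TStringList.Create;
  try
    StartTick := GetTickCount64;
    AssignFile(F, ParamStr(1));
    Reset(F);
    try
      // Plain line-by-line read into memory, no visual components
      while not Eof(F) do
      begin
        ReadLn(F, Line);
        Lines.Add(Line);
      end;
    finally
      CloseFile(F);
    end;
    WriteLn(Lines.Count, ' lines read in ',
      GetTickCount64 - StartTick, ' ms');
  finally
    Lines.Free;
  end;
end.
```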

/Chris
« Last Edit: March 19, 2009, 10:53:47 pm by chrnobel »

luthepa1

  • New Member
  • *
  • Posts: 15
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #10 on: March 19, 2009, 11:18:57 pm »
Firstly, thanks for all your help so far Vincent.  I appreciate it greatly.

Second: I am not sure how to "Jump" to the declaration to see what Lazarus thinks.  I am coming back to Pascal coding after years of not coding, although I probably still would not have known how to do this.  Sorry.

So what I had is: SetTextBuf(F, Buffer);
This gives the error that there is the wrong number of parameters for the call.
It also gives a hint of "Found Declaration: TControl.SetTextBuf(PChar)".  If I click this hint I get a message dialog stating that "control.inc" can't be found and to check my search paths for other units.  I am not sure if I have screwed Lazarus up a bit by trying to add components, all of which always error (HistoryFiles, multilog).

But if I change it to System.SetTextBuf(F, Buffer);
there are no errors and it compiles.  But I am not sure if it's working, because my load times are now worse.

This code is executed from the Open item of a menubar component (TMenuItem).

So using this code example with a 1MB buffer, I found it is still slow and I ended up terminating the app after 11 mins.  I then changed the buffer size to 64KB and then 8KB, but it was the same scenario and I again terminated after 11 mins.  So the fastest so far is the Listbox1.Items.LoadFromFile function at just over 9 mins.

Do you agree that this should load a text file of 55,000 lines in under a minute, or even in seconds?

Thanks.

luthepa1

  • New Member
  • *
  • Posts: 15
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #11 on: March 19, 2009, 11:21:59 pm »
@ chrnobel.

I was so glad to hear your reply because you confirm what I would have thought.  A few seconds to load such a large text file.  I think I will start a new project and just simply code to perform this task of text file loading.  Then I can figure out (if there is a sudden speed up) if something else in my app is slowing things down.

I will post up my code and results of the new project/app for the purpose of troubleshooting this issue I am having.

Cheers!

Vincent Snijders

  • Administrator
  • Hero Member
  • *
  • Posts: 2661
    • My Lazarus wiki user page
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #12 on: March 19, 2009, 11:43:38 pm »
Use System.SetTextBuf, so you don't use the method of TControl.
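
A small console sketch of the name clash, with no LCL needed. TLoader and its empty SetTextBuf method are made up here to stand in for TControl.SetTextBuf; inside a method of such a class, the unqualified name resolves to the method, so the call has to be written as System.SetTextBuf. The 64 kB heap-allocated buffer is my own choice for the sketch (it keeps a large array off the stack) and is not taken from the code above.

```pascal
program TextBufDemo; // hypothetical program name

{$mode objfpc}{$H+}

uses
  SysUtils;

type
  // Hypothetical class; its SetTextBuf method plays the role of
  // TControl.SetTextBuf and hides the System routine of the same name.
  TLoader = class
  public
    procedure SetTextBuf(Buffer: PChar);
    function CountLines(const FileName: string): Integer;
  end;

procedure TLoader.SetTextBuf(Buffer: PChar);
begin
  // intentionally empty, only here to reproduce the clash
end;

function TLoader.CountLines(const FileName: string): Integer;
var
  F: TextFile;
  Line: string;
  Buf: array of Byte; // dynamic array, so the buffer lives on the heap
begin
  Result := 0;
  SetLength(Buf, 64 * 1024);
  AssignFile(F, FileName);
  Reset(F);
  try
    // Qualified call: the unqualified SetTextBuf would pick the method above
    System.SetTextBuf(F, Buf[0], Length(Buf));
    while not Eof(F) do
    begin
      ReadLn(F, Line);
      Inc(Result);
    end;
  finally
    CloseFile(F); // close before Buf goes out of scope
  end;
end;

var
  Loader: TLoader;
begin
  if ParamCount < 1 then
  begin
    WriteLn('Usage: textbufdemo <logfile>');
    Halt(1);
  end;
  Loader := TLoader.Create;
  try
    WriteLn(Loader.CountLines(ParamStr(1)), ' lines');
  finally
    Loader.Free;
  end;
end.
```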

If you post your application, with a test data generator, I will run it on windows.

luthepa1

  • New Member
  • *
  • Posts: 15
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #13 on: March 20, 2009, 12:08:33 am »
OK, well, I found something interesting.  It looks like Ubuntu 8.1 may be the issue.  Below I will put the demo app code for this troubleshooting.

I ran this on a Ubuntu 8.1 64-bit OS with Lazarus.  I am also wondering if 64-bit is the difference causing problems here.  After 7 min I terminated it, since it again looked like it would take forever to load a simple text file.

So I then took this code/project to my laptop, which runs WinXP Pro 32-bit with Lazarus.  When I click Open on the File menu in the app, I get an error message popup: "Project Raised exception class 'External: SIGSEGV'.".  So I then commented out the variable "Buffer : Array[1..BufferSize] of byte;", as I am not even using it anyway.  The project then runs and loads the text file of 55,000 lines in 20 sec!!!! Massive improvement!

I tried again on Ubuntu x64 with the buffer variable commented out, but there was no difference.

So it looks like an OS thing?

The Code
-----------------

unit main;

{$mode objfpc}{$H+}

interface

uses
  Classes, SysUtils, FileUtil, LResources, Forms, Controls, Graphics, Dialogs,
  StdCtrls, Menus;

const
     BufferSize = 1024 * 1024; //1MB buffer, experiment with it, bigger than
                               //64Kb probably doesn't show any speedup..

type

  { TfrmMain }

  TfrmMain = class(TForm)
    lstBox: TListBox;
    MainMenu1: TMainMenu;
    mnuOpen: TMenuItem;
    mnuMain: TMenuItem;
    OpenDiag: TOpenDialog;
    procedure mnuOpenClick(Sender: TObject);
  private
    { private declarations }
  public
      { public declarations }
  end;

var
  frmMain: TfrmMain;

implementation

{ TfrmMain }

procedure TfrmMain.mnuOpenClick(Sender: TObject);
var
  fName, line: string;
  F: System.Text;
  //Buffer : Array[1..BufferSize] of byte;
begin
  openDiag.Filter:='Tigris Log Files (*.txt)|*.txt';
  if openDiag.Execute then
  begin
     fName := openDiag.FileName;

     AssignFile(F, fName);
     Reset(F);
     lstBox.Items.BeginUpdate;
     try
        While SeekEOF(F)=False do
        begin
             ReadLn(F, line);
             lstBox.Items.Add(line);
        end;
     finally
        lstBox.Items.EndUpdate;
        CloseFile(F);  //close the file even if reading raises an exception
     end;
  end;
end;

initialization
  {$I main.lrs}

end. 

Vincent Snijders

  • Administrator
  • Hero Member
  • *
  • Posts: 2661
    • My Lazarus wiki user page
Re: extreamly slow text file loading (55,000 lines text) into listbox
« Reply #14 on: March 20, 2009, 12:11:25 am »
My guess: BeginUpdate is not implemented in the gtk widgetset interface.

 
