Recent

Author Topic: [SOLVED] SQLite - I've created a really slow implementation. How to speed up?  (Read 6159 times)

Gizmo

  • Hero Member
  • *****
  • Posts: 831
UPDATE - FOR THE ANSWER, GO TO REPLY 3 BY RVK, WHICH IS VERIFIED IN MY LAST ANSWER ON PAGE 2, #16

Hi all

Just got into SQLIte for the first time. I have a program that is being used increasingly for large scale industrial work so StringGrids are proving not ideal for these users. So I am exploring migrating it to SQLite, utilisng DBGrids to display the content. To explore and play, I have created a new simple project based on the /examples/database/sqlite demo in Lazarus.

So far, a folder is chosen, all the files are found, they are then (in this instance) MD5 hashed, and the filename, path and hash is then output to the SQLite DB and displayed to the user. This is closely replicating how the real program works at the moment, except the real program displays the data in StringGrids and uses various hash algorithms. Anyway, this little demo project works. However, there is a problem. It's MUCH MUCH slower than it should be, so I am guessing I am doing something horribly wrong, and I'm not experienced enough yet to work out which bit is the problem.

Basically, this demo project takes 1m10s to explore 526 files in a folder, hash them and output to SQLIte and display in the DBGrid.

The real program, that still uses StringGrids, does it in less than 1 second!!

So 1 second, vs 1m10s...you can see why I think I've gone wrong somewhere.

Would anyone be good enough to take a look and see if they can spot an issue? The only difference between this demo and the real thing is that the real thing doesn't go through the 'TotalFilesToExamine' stringlist one line at a time. It has an @OnFileFound handler as part of TFileSearcher. But surely that alone wouldn't make such a big difference. So I'm suspecting at the moment my database technique is poor and I am repeating things that don't need to be repeated, or something? But, if you database experts can't see anything wrong, then it must be something else.

Code: Pascal  [Select][+][-]
  1. ...
  2. type
  3.   { TListFileSearcher }
  4.  
  5.   TListFileSearcher = class(TFileSearcher)
  6.   private
  7.     FList: TStrings;
  8.   protected
  9.     procedure DoFileFound; override;
  10.   public
  11.     constructor Create(AList: TStrings);
  12.   end;
  13. ...
  14.  
  15. // This will compute the hash of each file and then input the result to the SQLite DB
  16. procedure TForm1.btnHashSingleFileClick(FullName, Filename, Filepath : string);
  17. var
  18.   hashval : string;
  19. begin
  20.   SQLite3Connection1.Close;
  21.  
  22.   //SQLite3Connection1.Password := txtPass.Text; // The current password
  23.     try
  24.       SQLite3Connection1.Open;
  25.       SQLTransaction1.Active := True;
  26.       hashval := MD5Print(MD5File(FullName));
  27.       // Insert the values into the database
  28.       // We're using ParamByName which prevents SQL Injection
  29.       // http://wiki.freepascal.org/Working_With_TSQLQuery#Parameters_in_TSQLQuery.SQL
  30.       SQLQuery1.SQL.Text := 'Insert into TBL_FILES (Filename, Path, Hash) values (:Filename,:Path,:Hash)';
  31.       SQLQuery1.Params.ParamByName('Filename').AsString := Filename;
  32.       SQLQuery1.Params.ParamByName('Path').AsString := FilePath;
  33.       SQLQuery1.Params.ParamByName('Hash').AsString := hashval;
  34.       SQLQuery1.ExecSQL;
  35.       SQLTransaction1.Commit;
  36.     except
  37.       ShowMessage('Failed');
  38.     end;
  39. end;
  40.  
  41. // This will find all files in a folder, and then call btnHashSingleFileClick for each file it finds
  42. procedure TForm1.btnHashManyFilesClick(Sender: TObject);
  43. var
  44.   TotalFilesToExamine : TStringList;
  45.   i : integer;
  46.   Filename, FilePath, FullName : string;
  47.   TmStarted, TmEnded : TDateTime;
  48. begin
  49.   i := 0;
  50.   TotalFilesToExamine := TStringList.Create;
  51.   if SelectDirectoryDialog1.Execute then
  52.   begin
  53.   TmStarted := Now;
  54.   lblStarted.Caption:= FormatDateTime('HH:MM:SS', Now);
  55.   TotalFilesToExamine := FindAllFilesEx(SelectDirectoryDialog1.Filename, '*', True, True);
  56.   for i := 0 to TotalFilesToExamine.Count -1 do
  57.   begin
  58.     FullName := TotalFilesToExamine.Strings[i];
  59.     FileName := ExtractFileName(TotalFilesToExamine.Strings[i]);
  60.     FilePath := ExtractFilePath(TotalFilesToExamine.Strings[i]);
  61.     btnHashSingleFileClick(FullName, FileName, FilePath);
  62.   end;
  63.     TotalFilesToExamine.Free;
  64.   end;
  65.   // Now let's update the grid to show the new values to the user:
  66.   UpdateGridFILES(nil);  // Now update and display the DBGrid with all the SQLite data in it
  67.   TmEnded := Now;
  68.   lblEnded.Caption:= FormatDateTime('HH:MM:SS', Now);
  69.   lblTimeTaken.Caption:= FormatDateTime('HH:MM:SS', TmEnded-TmStarted);  // Reports 1m 10s for 500+ regular files!!
  70. end;
  71.  
  72. procedure TForm1.UpdateGridFILES(Sender: TObject);
  73. begin
  74.    SQLite3Connection1.Close;
  75.  
  76.   // Try to perform query
  77.   try
  78.     SQLite3Connection1.Connected := True;
  79.  
  80.     SQLQuery1.SQL.Clear;
  81.     SQLQuery1.SQL.Text := 'Select * from TBL_FILES';
  82.     SQLQuery1.Open;
  83.  
  84.     // Allow the DBGrid to view the results of our query
  85.     DataSource1.DataSet := SQLQuery1;
  86.     DBGrid1.DataSource := DataSource1;
  87.     DBGrid1.AutoFillColumns := true;
  88.  
  89.   except
  90.     ShowMessage('Unable to query the database');
  91.   end;
  92.  
  93. { TListFileSearcher }
  94.  
  95. procedure TListFileSearcher.DoFileFound;
  96. begin
  97.   FList.Add(FileName);
  98. end;
  99.  
  100. constructor TListFileSearcher.Create(AList: TStrings);
  101. begin
  102.   inherited Create;
  103.   FList := AList;
  104. end;
  105.  
  106. function TForm1.FindAllFilesEx(const SearchPath: string; SearchMask: string;
  107.   SearchSubDirs: boolean; IncludeHiddenDirs: boolean): TStringList;
  108. var
  109.   Searcher: TListFileSearcher;
  110. begin
  111.   Result   := TStringList.Create;
  112.   Searcher := TListFileSearcher.Create(Result);
  113.   Searcher.DirectoryAttribute := Searcher.DirectoryAttribute or faHidden;
  114.   try
  115.     Searcher.Search(SearchPath, SearchMask, SearchSubDirs);
  116.   finally
  117.     Searcher.Free;
  118.   end;
  119. end;
  120.  
  121. end;
  122.  
« Last Edit: September 20, 2017, 02:39:57 pm by Gizmo »

CharlyTango

  • New Member
  • *
  • Posts: 46
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #1 on: September 19, 2017, 03:29:30 pm »
hi,

first of all I'd suggest not to open and close Database connections at each "file-found" operation.

e.g. put
SQLite3Connection1.Open;
in the TForm1.OnCreate and the SQLite3Connection1.Open; in TForm1.OnCloseQuery.

that reduces the count of the time consuming opening and closing of the database to 1. All other operations can be done while the database connection is open/active.

Furthermore i might remember that you don't need extra transaction management using SQLite -- but recheck that. I think its buildt in.

in your code you do not write any dependent information at one time so transactions might not be needes.


Removing the open/close procedures of the database should speed up yor code.

Regards





Lazarus stable, Win32/64

Gizmo

  • Hero Member
  • *****
  • Posts: 831
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #2 on: September 19, 2017, 03:49:20 pm »
Thanks Charly

Good tips. That has improved things. Down from 1m10s to 0m38s. So that's about a 50% boost! But it's still 37 times longer than my existing program, that can go through the same files, compute bigger hashes and output the results to a stringgrid in less than 1 second.

I'm starting to think that perhaps the traversal of the StringGrid in this demo is perhaps the problem afterall, because my other program doesn't do that. Maybe I just need to try implementing SQLIte into it now I've got a bit of a grasp of how it works. I'm just worried that I could spend several days implementing SQLIte only to find it makes the program worse. A fellow FPC user (molly) advised, sensibly, for me to get a good grasp of the database system before incorporating it into me real project, so that is what I'm trying to do. But so far, it has resulted in only worry and concern that the mix of SQLIte with DBGrids is slower than simple StringGrid output.   

rvk

  • Hero Member
  • *****
  • Posts: 6056
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #3 on: September 19, 2017, 04:01:31 pm »
Not sure how much faster it will be but you shouldn't do a SQLTransaction1.Commit; on every insert.

Every 1000 records would be fine.
You can also do it just before the SQLite3Connection1.Close;

mangakissa

  • Hero Member
  • *****
  • Posts: 1131
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #4 on: September 19, 2017, 04:45:46 pm »
Use TSQLQuery.Disablecontrols at te beginning and TQuery.Enablecontrols at the end.
Code: Pascal  [Select][+][-]
  1. begin
  2.   try
  3.     SQLQuery1.disablecontrols;
  4.     .....
  5.   finally
  6.     SQLQuery1.enablecontrols;
  7.   end;
  8.  
Lazarus 2.06 (64b) / FPC 3.0.4 / Windows 10
stucked on Delphi 10.3.1

rvk

  • Hero Member
  • *****
  • Posts: 6056
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #5 on: September 19, 2017, 05:02:57 pm »
Use TSQLQuery.Disablecontrols at te beginning and TQuery.Enablecontrols at the end.
SQLQuery1.ExecSQL is used with a INSERT statement here.
So I don't expect there will be any controls connected to that.

(There is also no SQLQuery1.Open done but ExecSQL so connected controls won't be an issue)

ASerge

  • Hero Member
  • *****
  • Posts: 2212
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #6 on: September 19, 2017, 05:06:58 pm »
On my machine this code displays about 300 milliseconds:
Code: Pascal  [Select][+][-]
  1. procedure TForm1.Button1Click(Sender: TObject);
  2. var
  3.   AllFiles: TStringList;
  4.   StartTime: TDateTime;
  5.   FullName: string;
  6. begin
  7.   StartTime := Now;
  8.   Label1.Caption := DateTimeToStr(StartTime);
  9.   SQLTransaction1.Active := True;
  10.   try
  11.     sqlQueryClear.ExecSQL;
  12.     AllFiles := TStringList.Create;
  13.     try
  14.       FindAllFiles(AllFiles, 'c:\windows\system32', '*', False);
  15.       for FullName in AllFiles do
  16.       begin
  17.         sqlQueryInsert.ParamByName('FileName').AsString := ExtractFileName(FullName);
  18.         sqlQueryInsert.ParamByName('Path').AsString := ExtractFileDir(FullName);
  19.         sqlQueryInsert.ParamByName('Hash').AsString := '(none)';
  20.         sqlQueryInsert.ExecSQL;
  21.       end;
  22.     finally
  23.       AllFiles.Free;
  24.     end;
  25.   finally
  26.     SQLTransaction1.Commit;
  27.   end;
  28.   sqlQueryForGrid.Open;
  29.   Label2.Caption := FormatDateTime('hh:nn:ss:zzz', Now - StartTime);
  30. end;
  31.  
  32. procedure TForm1.FormCreate(Sender: TObject);
  33. begin
  34.   sqlQueryForGrid.SQL.Text :=
  35.     'select ' +
  36.       'cast(FileName as varchar(30)) FileName, ' +
  37.       'cast(Path as varchar(256)) Path, ' +
  38.       'cast(Hash as varchar(10)) Hash ' +
  39.     'from TBL_FILES ' +
  40.     'order by FileName';
  41.   sqlQueryInsert.SQL.Text := 'insert into TBL_FILES (FileName, Path, Hash) ' +
  42.     'values(:FileName, :Path, :Hash)';
  43.   sqlQueryClear.SQL.Text := 'delete from TBL_FILES';
  44. end;

CharlyTango

  • New Member
  • *
  • Posts: 46
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #7 on: September 19, 2017, 05:09:04 pm »
Hi

all these suggestions are more or less obvious quickhacks.

Do you really know where did you loose the time?

Did you measure it  ?

Regards
Lazarus stable, Win32/64

ASerge

  • Hero Member
  • *****
  • Posts: 2212
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #8 on: September 19, 2017, 05:12:50 pm »
Do you really know where did you loose the time?
MD5Print(MD5File(FullName)) and frequent commit.

rvk

  • Hero Member
  • *****
  • Posts: 6056
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #9 on: September 19, 2017, 05:14:48 pm »
The real program, that still uses StringGrids, does it in less than 1 second!!
I also want to see that program, that calculates MD5 for 526 files in 1 second  ::)

Or are these all relative small files? Or maybe that real program stores the MD5 from a previous run.
« Last Edit: September 19, 2017, 05:16:42 pm by rvk »

Gizmo

  • Hero Member
  • *****
  • Posts: 831
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #10 on: September 19, 2017, 05:21:28 pm »
Quote
I also want to see that program, that calculates MD5 for 526 files in 1 second

www.quickhash-gui.org --> FileS tab.

Yes, the files were fairly small (between 10Kb to 200Kb kind of size), but the point is if one program can process them all in < 1 second, and another tool takes over 1 minute, the other tool is 60 times slower.

rvk

  • Hero Member
  • *****
  • Posts: 6056
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #11 on: September 19, 2017, 05:32:33 pm »
You could strip out any database stuff to see how much time the processing of those files take with FPCs MD5File.

Use ASerges method with FindAllFiles to find the files and process them (so replace the DB-stuff with just the MD5File line).
If it also just takes a second, you know it's the saving to the database.

If it also takes a long time then you need to loop at the FindFirst/FindNext method. Maybe quickhash uses another faster method of retrieving the file-list. (I already saw quickhash uses \\?\ direct access, to hash the files)

(I take it you've already tried moving the commit and it's still slow.)
« Last Edit: September 19, 2017, 05:37:05 pm by rvk »

CharlyTango

  • New Member
  • *
  • Posts: 46
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #12 on: September 19, 2017, 08:23:34 pm »
you're a lucky guy  :)

the program you mentioned is obviousley a lazarus programm hosted
on https://github.com/tedsmith/quickhash

As it is opensource you can check why it is so quickly and can use parts of the code to speed up yours.

Regards
Lazarus stable, Win32/64

Gizmo

  • Hero Member
  • *****
  • Posts: 831
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #13 on: September 19, 2017, 09:50:52 pm »
Charly

It's better than that. I am its developer! :-) (see profile URL).

I'm looking at switching it from StringGrids to SQLite because for big data sets, the StringGrids are too slow and not flexible enough for the users. I want more choices that SQLIte and DBGrid seem to offer (faster sorting, searching, filtering etc) but a quick new project I made from the examples folder of Lazarus seems to be a bit slow for my liking. Thus my post

Jurassic Pork

  • Hero Member
  • *****
  • Posts: 1228
Re: SQLite - I've created a really slow implementation. How to speed up?
« Reply #14 on: September 20, 2017, 02:20:02 am »
hello,
Gizmo, instead of SQLite you can also try to use the TBufDataset .
Quote
TBufDataset is a component that provides the link to buffered data in a database. A TDataset descendant acts like a cursor on a table or query-result.
With this component you can sort your fields and filter your records.
120 ms to read , calculate md5 and display  428 records on a tdbgrid .
Friendly, J.P

« Last Edit: September 20, 2017, 02:25:12 am by Jurassic Pork »
Jurassic computer : Sinclair ZX81 - Zilog Z80A à 3,25 MHz - RAM 1 Ko - ROM 8 Ko

 

TinyPortal © 2005-2018