Recent

Author Topic: Loading Large Sqlite DB file - hangs - performance advice  (Read 642 times)

marcusadamski

  • Newbie
  • Posts: 4
Loading Large Sqlite DB file - hangs - performance advice
« on: October 24, 2020, 09:58:36 pm »

Hi there,

After using Delphi for many years, moving to C, Java,Python .... now I'm returning to Lazarus. My first project was to test out loading a large Sqlite file (~ 700mb, 1.7 million rows). With the sqlQuery limited to a few thousand rows, everything was very fast. However, it simply hangs on the full file. The following set up runs for around 30 mins, takes 3GB in memory, but still doesn't finish.

I want to load this manually, rather than DBGrid.  Can someone please review my code to see if I've done something stupid? (Note - connection, transaction and sqlquery all created and freed in main form create/ destroy. POSTCODE_COL etc are just INT consts).

I've written similar test apps in Java and C++, using a similar structure, same db - and they're taking around 60 secs. So maybe I'm missing something here?


Code: Pascal  [Select][+][-]
  1.  
  2. procedure TfrmMain.btnLoadClick(Sender: TObject);
  3. var
  4.   RowIndex: integer;
  5.   StartTime: TDateTime;
  6. begin
  7.   Self.Cursor:= crHourGlass;
  8.   sgPostcodes.BeginUpdate;
  9.   StartTime:= Now;
  10.   try
  11.     btnLoad.Caption:= 'Working ...';
  12.     btnLoad.Enabled:= False;
  13.  
  14.     // connect to the db
  15.     FConn.DatabaseName:= ConcatPaths([extractfiledir(paramstr(0)), 'data', 'postcodes.dat']);
  16.     FConn.ConnectorType:='SQLite3';
  17.     FConn.Open;
  18.  
  19.     FQuery.SQL.Text:='select ' +
  20.       'cast( "Postcode3" as TEXT), ' +
  21.       'cast( "LocalAuthorityName" as TEXT), ' +
  22.       'cast( "WardName" as TEXT), ' +
  23.       '"Longitude", ' +
  24.       '"Latitude" ' +
  25.       'from postcodes';
  26.     FQuery.Open;
  27.  
  28.     // iterate through all the postcodes
  29.     RowIndex:= 1;
  30.     while not FQuery.EOF do begin
  31.       sgPostcodes.RowCount:= RowIndex +1;
  32.  
  33.       sgPostcodes.Cells[0, RowIndex]:= FQuery.Fields[POSTCODE_COL].AsString;
  34.       sgPostcodes.Cells[1, RowIndex]:= FQuery.Fields[LOCAUTH_COL].AsString;
  35.       sgPostcodes.Cells[2, RowIndex]:= FQuery.Fields[WARDNAME_COL].AsString;
  36.       sgPostcodes.Cells[3, RowIndex]:= FloatToStr(FQuery.Fields[LONG_COL].AsFloat);
  37.       sgPostcodes.Cells[4, RowIndex]:= FloatToStr(FQuery.Fields[LAT_COL].AsFloat);
  38.  
  39.       FQuery.Next;
  40.       inc(RowIndex);
  41.     end;
  42.  
  43.   finally
  44.     sgPostcodes.EndUpdate();
  45.     Self.Cursor:= crDefault;
  46.  
  47.     // update stats
  48.     lblTimeTaken.Caption:= lblTimeTaken.Caption + '   ' + FormatDateTime('nn : ss : z ', Now - StartTime);
  49.     lblCount.Caption:= lblCount.Caption + '   ' + IntToStr(RowIndex -1);
  50.  
  51.   end;
  52. end;
  53.  
  54.  

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8894
  • FPC developer.
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #1 on: October 24, 2020, 10:02:25 pm »
Simple grids are fairly slow for anything above thousands/ten-thousands rows.  Store the results in memory and use some virtual grid that only touches actually shown data.
(Like virtual string tree that can also be used in grid mode)

First thing is to test if that hypothesis is correct. If you don't assign (and thus enlarge) the grid, is it faster then?

marcusadamski

  • Newbie
  • Posts: 4
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #2 on: October 24, 2020, 10:42:54 pm »
hi marcov,

I understand this is not a real world scenario, I was just trying to get a comparison with other languages. I carried out the following two tests:

1)
Code: Pascal  [Select][+][-]
  1.     while not FQuery.EOF do begin
  2. {
  3.       sgPostcodes.RowCount:= RowIndex +1;
  4.  
  5.       sgPostcodes.Cells[0, RowIndex]:= FQuery.Fields[POSTCODE_COL].AsString;
  6.       sgPostcodes.Cells[1, RowIndex]:= FQuery.Fields[LOCAUTH_COL].AsString;
  7.       sgPostcodes.Cells[2, RowIndex]:= FQuery.Fields[WARDNAME_COL].AsString;
  8.       sgPostcodes.Cells[3, RowIndex]:= FloatToStr(FQuery.Fields[LONG_COL].AsFloat);
  9.       sgPostcodes.Cells[4, RowIndex]:= FloatToStr(FQuery.Fields[LAT_COL].AsFloat);
  10.  }
  11.       FQuery.Next;
  12.       // inc(RowIndex);
  13.     end;
  14.  

which took around 60 seconds

2)
Code: Pascal  [Select][+][-]
  1.     while not FQuery.EOF do begin
  2.  
  3.       // sgPostcodes.RowCount:= RowIndex +1;
  4.  
  5.       sgPostcodes.Cells[0, RowIndex]:= FQuery.Fields[POSTCODE_COL].AsString;
  6.       sgPostcodes.Cells[1, RowIndex]:= FQuery.Fields[LOCAUTH_COL].AsString;
  7.       sgPostcodes.Cells[2, RowIndex]:= FQuery.Fields[WARDNAME_COL].AsString;
  8.       sgPostcodes.Cells[3, RowIndex]:= FloatToStr(FQuery.Fields[LONG_COL].AsFloat);
  9.       sgPostcodes.Cells[4, RowIndex]:= FloatToStr(FQuery.Fields[LAT_COL].AsFloat);
  10.  
  11.       FQuery.Next;
  12.       // inc(RowIndex);
  13.     end;
  14.  

where the stringgrid was used, but overwrote the same first row each time.  Taking around 90 seconds


I understand the "proper" approach would be to display items as they are viewed (e.g. virtual grid), but I was initially interested in a quick comparison with other languages (and to jog my memory on the FreePascal syntax)

It's nice to discover FreePascal again. I'll persevere with a few mini apps, to see how I find the IDE, and language, these many years later...

« Last Edit: October 24, 2020, 10:50:55 pm by marcusadamski »

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8894
  • FPC developer.
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #3 on: October 24, 2020, 11:24:12 pm »
This is why comparing languages with a helicopter view is such bad idea. If one language has a virtual grid by default, and the other not, you get a lopsided impression, while it has nearly nothing to do with either language (and libraries) capabilities.

It is a minor choice. Lazarus is modeled after Delphi which is older than both C# and Java, so its default choices are a bit more conservative.

Same for the 30 seconds extra. It is dangerous to assume that it is merely a difference in string performance. Some languages might detect that you don't do anything with the assignment and eliminate it, some not.

korba812

  • Full Member
  • ***
  • Posts: 142
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #4 on: October 25, 2020, 12:01:21 am »
Do you really need to display 1.7 million rows? No man can get through all these records. The strength of SQL databases is that you can select a subset of the data that you really need.
But if you need to process this amount of data, it would be better to use native SQLite API without using TDataset.

PierceNg

  • New Member
  • *
  • Posts: 37
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #5 on: October 25, 2020, 02:32:21 am »
Do you really need to display 1.7 million rows? No man can get through all these records. The strength of SQL databases is that you can select a subset of the data that you really need.

+1

OP's program loads all 1.7 million rows to display, what, 50 or 200 rows at a time? For such cases, best to use what is known as database paging or database cursor, no need to load all rows into memory

Sieben

  • Full Member
  • ***
  • Posts: 140
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #6 on: October 25, 2020, 02:38:46 am »
-1 - did you take the trouble of reading the thread...?
Lazarus 2.0.10, FPC 3.2.0, .deb install on Ubuntu Xenial 32

winni

  • Hero Member
  • *****
  • Posts: 1993
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #7 on: October 25, 2020, 02:55:45 am »
Hi!

OP has to read the 1.7 Mio rows. Row by row.
To check that there are no errors.

After that we carry on with the discussion.

Shake of the head.

Winni

marcusadamski

  • Newbie
  • Posts: 4
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #8 on: October 25, 2020, 12:15:35 pm »
Thanks for the feedback - as I say, this is not a real world application. After 10-15 years away from Delphi, I'd like to start using a flavour of it again ... I.e. Lazarus.  Wanted to re-familiarise myself with the core packages and syntax through some basic apps - and what better way than a simple DB app.  In this case, there was a performance discrepancy and I was curious why.

Btw - if using SQLite, what's the preferred component set and why: TSQLConnector or TSQLite3Connection ?


Due to limited time, it will take me a little while to get up to speed - but I'm hoping the forum will be patient with me  :)

marcov

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 8894
  • FPC developer.
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #9 on: October 25, 2020, 12:33:36 pm »
I use sqlite3conn, which is sqldb, which I assume TSQLConnector is a wrapper for.

But I only use sqlite for small peanut local db.

korba812

  • Full Member
  • ***
  • Posts: 142
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #10 on: October 25, 2020, 06:22:32 pm »
TSQLQuery component from the SQLDB package uses TBufDataSet as base - downloaded data from database is additionally buffered inside TBufDataSet and hence such memory consumption and lower performance.

You can try other SQLite database access components.
See https://wiki.freepascal.org/SQLite

The idea of TDataSet was borrowed from Delphi and is intended for rapid development of database applications linked to GUI. Therefore, it includes a large number of additional functionalities (data validation, GUI controls notification, code page conversion, etc.) that may slow down performance.

As mentioned earlier, use the direct SQLite API for maximum performance.

PierceNg

  • New Member
  • *
  • Posts: 37
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #11 on: October 26, 2020, 05:54:35 pm »
Btw - if using SQLite, what's the preferred component set and why: TSQLConnector or TSQLite3Connection ?

For a web microservice I use https://github.com/plashenkov/SQLite3-Delphi-FPC. I will be looking at Zeos next and also relook at what comes with FPC. My code doesn't use any abstract database API, just SQLite C API style prepare/step/finalize operations.

Thaddy

  • Hero Member
  • *****
  • Posts: 10572
Re: Loading Large Sqlite DB file - hangs - performance advice
« Reply #12 on: October 26, 2020, 05:56:10 pm »
Never load the whole database. Leave the performance to the DB engine.
You do not load a database but simply open it. As said many times (also here) use a virtual control like TVirtualListview.
TDbGrid is fine for smaller (<1000) solutions.

The TVirtualXXX controls are windowed - not MsWindows! but an algorithm family - controls and perform therefor much faster since the file stays on disk and only what needs to be displayed is actually in memory. The Java grids work in the same way.
There is more to it but that is the simplest explanation.
« Last Edit: October 26, 2020, 06:05:58 pm by Thaddy »

 

TinyPortal © 2005-2018