* * *

Author Topic: How to speed up Delete  (Read 2508 times)

HatForCat

  • Sr. Member
  • ****
  • Posts: 260
How to speed up Delete
« on: July 04, 2016, 06:55:14 pm »
I am trying to help a friend speed up his old software. It uses ZEOS 6.6 (too much work to change to ZEOS-v7) and SQLite3. Running the program for a few hours will result in about 10,000 records being gathered.

We need to clean out records as nothing older than last 1000 records is needed. Using the following code it is abysmally slow even if called several times a day. Could someone please give me some SQL language to do this as I am sure it will be quicker, but I am not yet very skilled with SQL.

Code: [Select]
procedure Tdm.ClearToLatest;
var
  AllDone : Boolean;
begin
  tblData.DisableControls;
  try
    AllDone:=False;
    tblData.First;
    repeat
        tblData.Delete;
        AllDone:=tblData.RecordCount <= 1000;
    until AllDone;
  finally
    tblData.EnableControls;
  end;
end;
Acer-i5, 2.6GHz, 6GB, 256-SSD, Ubuntu 14.04-LTS, Unity Desktop, Lazarus 1.6.2, SQLite3 -- Retired: Programming for my own use for Ubuntu.

rvk

  • Hero Member
  • *****
  • Posts: 2730
Re: How to speed up Delete
« Reply #1 on: July 04, 2016, 07:06:25 pm »
If there is some index or ID generated, this can be done with one SQL-statement, in which case it will be executed locally by the server which will be much much faster.

Something like this, but it depends on the table-definition:
Code: SQL  [Select]
  1. DELETE FROM TABLE
  2. WHERE id NOT IN
  3.   (SELECT id FROM TABLE ORDER BY id DESC LIMIT 1000)
(there are other options like older than a certain time or if ID is generated sequentially you can just delete everything with id < last_id - 1000)
« Last Edit: July 04, 2016, 07:08:24 pm by rvk »

HatForCat

  • Sr. Member
  • ****
  • Posts: 260
Re: How to speed up Delete
« Reply #2 on: July 04, 2016, 07:46:19 pm »
Thanks RVK, but I got a bit more creative with searching and found this for a Query...

Code: [Select]
  tStr:='DELETE FROM tblData WHERE inDate <= date(''now'',''-2 day'')';

Took out 20,000 records in about two seconds. By the way, anyone using that line, the ''now'' and ''-2 day'' are using multiple single quotes (un-shifted quote-key)  and NOT the keyboard double-quotes "now" and "-2 day" (shifted quote-key) although they look pretty much the same on this page. :)
Acer-i5, 2.6GHz, 6GB, 256-SSD, Ubuntu 14.04-LTS, Unity Desktop, Lazarus 1.6.2, SQLite3 -- Retired: Programming for my own use for Ubuntu.

rvk

  • Hero Member
  • *****
  • Posts: 2730
Re: How to speed up Delete
« Reply #3 on: July 04, 2016, 08:18:46 pm »
... but I got a bit more creative with searching and found this for a Query...
Code: [Select]
  tStr:='DELETE FROM tblData WHERE inDate <= date(''now'',''-2 day'')';
Yep, that will work too.

But you said:
Running the program for a few hours will result in about 10,000 records being gathered.
If you have 10.000 records in a few hours, you're still left with a lot of records if you only delete everything older than 2 days. You could also use "-4 hours" or something in which case you come closer to those 1000 records you wanted.

The problem is that if you use -x time in the query you never know how many records you are left with. If for instance in your case the last 2 days only 2 records were added you would end up with just 2 records (which could be a problem or too few). Using the ID-method I showed ensures exactly 1000 records after the statement.

But if the -2 days works (it depends on the situation) this is fine too.

HatForCat

  • Sr. Member
  • ****
  • Posts: 260
Re: How to speed up Delete
« Reply #4 on: July 05, 2016, 12:52:47 am »
Thanks again RVK, yup, sorry I used the test program I wrote which was deleting by days. Should have come back and changed it. I appreciate your speedy and detailed reply though.

Your ID-suggestion is not an easy one as there is a lot of data streaming in and not very well formatted -- and with a large number if ID type items. There could be 10,000 of "001" one day then none for another week. The only current Index is on Date. :)

The 1000 records remaining was an arbitrary minimum and your point is taken there could be very few items left, but given the constant stream, using the -2 or -1 days will have to suffice. It would be nice to include the 1000-limit with the Date, but I can't find anything that relates to "Remaining Record Count" within SQLite3. So, no way I can tell it to stop at 1000, 5000 or whatever.
« Last Edit: July 05, 2016, 12:55:32 am by HatForCat »
Acer-i5, 2.6GHz, 6GB, 256-SSD, Ubuntu 14.04-LTS, Unity Desktop, Lazarus 1.6.2, SQLite3 -- Retired: Programming for my own use for Ubuntu.

rvk

  • Hero Member
  • *****
  • Posts: 2730
Re: How to speed up Delete
« Reply #5 on: July 05, 2016, 01:07:00 am »
It would be nice to include the 1000-limit with the Date, but I can't find anything that relates to "Remaining Record Count" within SQLite3. So, no way I can tell it to stop at 1000, 5000 or whatever.
You can (kinda). There is no way to tell which records are the last 1000 (maybe there is but I'm on mobile now so can't check) but you can limit it to the date of the previous 1000 records (which might be more but never less).

Code: SQL  [Select]
  1. DELETE FROM TABLE
  2. WHERE DATE NOT IN
  3.   (SELECT DATE FROM TABLE ORDER BY DATE DESC LIMIT 1000)

And of course you can combine both your where with this one (with an and) so you keep all records which are newer than two days and at least 1000 (or more records). But if you have 50.000 records on that day (exactly 1000 records ago) it will keep all 50.000 records because it can't know which are "first" on that day.
« Last Edit: July 05, 2016, 01:15:02 am by rvk »

shobits1

  • Sr. Member
  • ****
  • Posts: 278
  • .
Re: How to speed up Delete
« Reply #6 on: July 05, 2016, 06:14:30 am »
Quote from: sqlite
https://www.sqlite.org/lang_createtable.html#rowid

Except for WITHOUT ROWID tables, all rows within SQLite tables have a 64-bit signed integer key that uniquely identifies the row within its table. This integer is usually called the "rowid". The rowid value can be accessed using one of the special case-independent names "rowid", "oid", or "_rowid_" in place of a column name. If a table contains a user defined column named "rowid", "oid" or "_rowid_", then that name always refers the explicitly declared column and cannot be used to retrieve the integer rowid value.

as far as i understood right, you can use rvk's sql script without problem (with little mod)
Code: SQL  [Select]
  1. DELETE FROM `TABLE`
  2. WHERE rowid NOT IN
  3.   (SELECT rowid FROM `TABLE` ORDER BY id DESC LIMIT 1000)
  4.  

rvk

  • Hero Member
  • *****
  • Posts: 2730
Re: How to speed up Delete
« Reply #7 on: July 05, 2016, 04:10:44 pm »
as far as i understood right, you can use rvk's sql script without problem (with little mod)
Yes, I kinda suspected there was something internal but couldn't check at the time.

So the final result could be:
Code: SQL  [Select]
  1. DELETE FROM tblData
  2. WHERE
  3.   (inDate <= DATE('now','-2 day') AND
  4.   (rowid NOT IN (SELECT rowid FROM tblData ORDER BY id DESC LIMIT 1000))
  5.  
Will delete ALL records EXCEPT records of the last 2 days (today and yesterday) and IF that's less than 1000 it will keep that 1000 records (regardless of age).

Note: The rowid-method only works for SQLite. Other DB-engines might have other options. But it's always better to have a unique ID/primary key anyway !!!

HatForCat

  • Sr. Member
  • ****
  • Posts: 260
Re: How to speed up Delete
« Reply #8 on: July 06, 2016, 12:06:53 am »
Thanks guys, I remember reading about "rowid" quite some time back but had forgotten all about it as I don't use SQL much at all.

I am working with stuff that was written a few years back so want to keep as much as possible and the Serial-Data-In table has no unique identifier, I guess they just wanted to grab everything as they don't appear to using rowid anywhere.

Back then volume coming in was not a big problem but the load has slowly been increasing to where it was taking an hour to clear old stuff using the OP routine every day.

Your help is very much much appreciated.
Acer-i5, 2.6GHz, 6GB, 256-SSD, Ubuntu 14.04-LTS, Unity Desktop, Lazarus 1.6.2, SQLite3 -- Retired: Programming for my own use for Ubuntu.

 

Recent

Get Lazarus at SourceForge.net. Fast, secure and Free Open Source software downloads Open Hub project report for Lazarus