Recent

Author Topic: Using Tdbf for searching an lookup table  (Read 1401 times)

chenyuchih

  • Jr. Member
  • **
  • Posts: 71
Using Tdbf for searching an lookup table
« on: February 23, 2019, 05:25:50 am »
Hello,

I am programming a fluid properties calculator, which interpolates from a lookup table, and I selected xBase as database system for storing the data.

The original data is based on a 2-D(pressure and temperature) lookup table. I re-arranged it to a long list and saved it to a DBF file.

In general cases, a state with pressure P & temperature T would be enveloped by 4 neighbor points: PLTL,PLTH,PHTL, and PHTH.(L for lower, H for higher)

I write the following code for test:

//---------------------------------------------------------------------
var
DataDBF:TDBF;
P,T,PL,PH:Double; //Fluid states
PLTL,PLTH,PHTL,PHTH:Integer; //Record number of neighbor points
T1,T2:Integer; //For executing time measurement

begin
DataDBF:=TDbf.Create(Nil);
DataDBF.FilePathFull:=ExtractFilePath(ParamStr(0));
DataDBF.ReadOnly:=True;
DataDBF.TableName:='Sample.DBF';
DataDBF.TableLevel:=4;
DataDBF.Open;

P:=37.945;
T:=-62.066;

T1:=GetTickCount();

DataDBF.Filter:='P>='+floattostr(P);
DataDBF.Filtered:=True;
DataDBF.First; //This line relies on the raw data sequence. I feel that it's not so ideal, but I don't know how to get the minimum P from the search result.... :(
PH:=DataDBF.FieldByName('P').AsFloat;

DataDBF.Filter:='P<='+floattostr(P);
DataDBF.Filtered:=True;
DataDBF.Last;
PL:=DataDBF.FieldByName('P').AsFloat;

DataDBF.Filter:='P='+floattostr(PL)+' and T<='+floattostr(T);
DataDBF.Filtered:=True;
DataDBF.Last;
PLTL:=DataDBF.PhysicalRecNo;

DataDBF.Filter:='P='+floattostr(PL)+' and T>='+floattostr(T);
DataDBF.Filtered:=True;
DataDBF.First;
PLTH:=DataDBF.PhysicalRecNo;

DataDBF.Filter:='P='+floattostr(PH)+' and T<='+floattostr(T);
DataDBF.Filtered:=True;
DataDBF.Last;
PHTL:=DataDBF.PhysicalRecNo;

DataDBF.Filter:='P='+floattostr(PH)+' and T>='+floattostr(T);
DataDBF.Filtered:=True;
DataDBF.First;
PHTH:=DataDBF.PhysicalRecNo;

T2:=GetTickCount();

DataDBF.Close;
DataDBF.Free;
end;
//---------------------------------------------------------------------

I can get the correct result from the above code. The only issue is that the execution speed. On my laptop it takes more than 1 second(T2-T1) to perform once.

I tried using index by P(Primary), but the result is not correct. Is there any suggestion for improving it?

The sample DBF file can be downloaded here:
https://drive.google.com/file/d/1bGrwKyE5oMHsmdxnpDyHh4507-c_QSW1/view?usp=sharing

BTW, does anyone have recommend website for Tdbf tips? I just know the official site and Lazarus tutorial.

Thanks!

Chen, Yu-Chih

(Lazarus 1.8.4 official release)

Handoko

  • Hero Member
  • *****
  • Posts: 3041
  • My goal: build my own game engine using Lazarus
Re: Using Tdbf for searching an lookup table
« Reply #1 on: February 23, 2019, 05:48:52 am »
First of all, I'm not a database expert.

I found that:
Code: Pascal  [Select]
  1.   T1:=GetTickCount();
  2.  
  3.   DataDBF.First;
  4.   while not(DataDBF.EOF) do
  5.     DataDBF.Next;
  6.  
  7.   T2:=GetTickCount();
Run much faster than your code.

So I think you should manually find all the data you need by a single loop.

chenyuchih

  • Jr. Member
  • **
  • Posts: 71
Re: Using Tdbf for searching an lookup table
« Reply #2 on: February 23, 2019, 06:39:46 am »
OMG!! How could it happen? :o

Before your example, I deeply believed that the benefit of database manage system is to search data corresponding to the expression efficiently through some internal special algorithm or structure. If it is (much) slower than the whole file scanning, the database becomes just a storing media.

Could anyone make a brief explanation? I must misunderstand something...

Thank you Handoko. Your post changed my concept.

Regards,
ChenYuChih

valdir.marcos

  • Hero Member
  • *****
  • Posts: 724
Re: Using Tdbf for searching an lookup table
« Reply #3 on: February 23, 2019, 07:40:19 am »
Hello,
I am programming a fluid properties calculator, which interpolates from a lookup table, and [b[I selected xBase as database system for storing the data[/b].
The original data is based on a 2-D(pressure and temperature) lookup table. I re-arranged it to a long list and saved it to a DBF file.
Why didn't you select a modern, robust, reliable, more flexible and faster solution such as Firebird embedded, SQLite, etc?

Quote
In general cases, a state with pressure P & temperature T would be enveloped by 4 neighbor points: PLTL,PLTH,PHTL, and PHTH.(L for lower, H for higher)
I write the following code for test:

Code: Pascal  [Select]
  1. var
  2.   DataDBF:TDBF;
  3.   P,T,PL,PH:Double; //Fluid states
  4.   PLTL,PLTH,PHTL,PHTH:Integer; //Record number of neighbor points
  5.   T1,T2:Integer; //For executing time measurement
  6. begin
  7.   DataDBF:=TDbf.Create(Nil);
  8.   DataDBF.FilePathFull:=ExtractFilePath(ParamStr(0));
  9.   DataDBF.ReadOnly:=True;
  10.   DataDBF.TableName:='Sample.DBF';
  11.   DataDBF.TableLevel:=4;
  12.   DataDBF.Open;
  13.  
  14.   P:=37.945;
  15.   T:=-62.066;
  16.  
  17.   T1:=GetTickCount();
  18.  
  19.   DataDBF.Filter:='P>='+floattostr(P);
  20.   DataDBF.Filtered:=True;
  21.   DataDBF.First; //This line relies on the raw data sequence. I feel that it's not so ideal, but I don't know how to get the minimum P from the search result.... :(
  22.   PH:=DataDBF.FieldByName('P').AsFloat;
  23.  
  24.   DataDBF.Filter:='P<='+floattostr(P);
  25.   DataDBF.Filtered:=True;
  26.   DataDBF.Last;
  27.   PL:=DataDBF.FieldByName('P').AsFloat;
  28.  
  29.   DataDBF.Filter:='P='+floattostr(PL)+' and T<='+floattostr(T);
  30.   DataDBF.Filtered:=True;
  31.   DataDBF.Last;
  32.   PLTL:=DataDBF.PhysicalRecNo;
  33.  
  34.   DataDBF.Filter:='P='+floattostr(PL)+' and T>='+floattostr(T);
  35.   DataDBF.Filtered:=True;
  36.   DataDBF.First;
  37.   PLTH:=DataDBF.PhysicalRecNo;
  38.  
  39.   DataDBF.Filter:='P='+floattostr(PH)+' and T<='+floattostr(T);
  40.   DataDBF.Filtered:=True;
  41.   DataDBF.Last;
  42.   PHTL:=DataDBF.PhysicalRecNo;
  43.  
  44.   DataDBF.Filter:='P='+floattostr(PH)+' and T>='+floattostr(T);
  45.   DataDBF.Filtered:=True;
  46.   DataDBF.First;
  47.   PHTH:=DataDBF.PhysicalRecNo;
  48.  
  49.   T2:=GetTickCount();
  50.  
  51.   DataDBF.Close;
  52.   DataDBF.Free;
  53. end;
I can get the correct result from the above code. The only issue is that the execution speed. On my laptop it takes more than 1 second(T2-T1) to perform once.
You should pick faster options than xBase.

Quote
I tried using index by P(Primary), but the result is not correct. Is there any suggestion for improving it?
Yes, use Integer number as Primary Key.

Quote
The sample DBF file can be downloaded here:
https://drive.google.com/file/d/1bGrwKyE5oMHsmdxnpDyHh4507-c_QSW1/view?usp=sharing
BTW, does anyone have recommend website for Tdbf tips? I just know the official site and Lazarus tutorial.
Thanks!
Chen, Yu-Chih
(Lazarus 1.8.4 official release)
Sorry, It's been a long time that I just migrate xBase to reliable solutions.

valdir.marcos

  • Hero Member
  • *****
  • Posts: 724
Re: Using Tdbf for searching an lookup table
« Reply #4 on: February 23, 2019, 07:46:16 am »
OMG!! How could it happen? :o
Before your example, I deeply believed that the benefit of database manage system is to search data corresponding to the expression efficiently through some internal special algorithm or structure. If it is (much) slower than the whole file scanning, the database becomes just a storing media.
Again, please, forget about xBase and MS Access and move to something better.

Quote
Could anyone make a brief explanation? I must misunderstand something...
Still talking about xBase, Primary Key should be of an Integer type.

Quote
Thank you Handoko. Your post changed my concept.
Regards,
ChenYuChih
« Last Edit: February 23, 2019, 09:23:27 am by valdir.marcos »

chenyuchih

  • Jr. Member
  • **
  • Posts: 71
Re: Using Tdbf for searching an lookup table
« Reply #5 on: February 23, 2019, 09:03:19 am »
Okay, it seems that I selected a database which is not so fit my application.

The reason why I choose Tdbf is just because it's full sourced in pascal and doesn't need any other binary library, no matter linking statically or dynamically. This feature is very attractive to me.

Thanks for your advise, valdir.marcos. I will evaluate another one.

Sincerely,
ChenYuChih

valdir.marcos

  • Hero Member
  • *****
  • Posts: 724
Re: Using Tdbf for searching an lookup table
« Reply #6 on: February 23, 2019, 09:26:03 am »
Okay, it seems that I selected a database which is not so fit my application.
The reason why I choose Tdbf is just because it's full sourced in pascal and doesn't need any other binary library, no matter linking statically or dynamically. This feature is very attractive to me.
Thanks for your advise, valdir.marcos. I will evaluate another one.
Sincerely,
ChenYuChih
There are better Pascal in-memory database alternatives such as TBufDataset or ZMSQL, among others:

ZMSQL - TBufDataset SQL enhanced in-memory database
http://forum.lazarus.freepascal.org/index.php/topic,13821.0.html

http://wiki.freepascal.org/ZMSQL

http://wiki.freepascal.org/How_to_write_in-memory_database_applications_in_Lazarus/FPC#ZMSQL

http://wiki.freepascal.org/How_to_write_in-memory_database_applications_in_Lazarus/FPC#TBufDataSet

http://wiki.freepascal.org/How_to_write_in-memory_database_applications_in_Lazarus/FPC

chenyuchih

  • Jr. Member
  • **
  • Posts: 71
Re: Using Tdbf for searching an lookup table
« Reply #7 on: February 23, 2019, 09:35:57 am »
Thank you so much!

I'll give them a try. :)

wp

  • Hero Member
  • *****
  • Posts: 5723
Re: Using Tdbf for searching an lookup table
« Reply #8 on: February 23, 2019, 12:41:20 pm »
You use a filtering process to find the neighboring records; this returns a lot of unneeded records, and I think copying these records to the result set is the reason why your code is so slow.

To decide how this can be avoided please answer these questions:
  • Are the pressure and temperature values of the table always equally spaced and ordered like in the sample dbf file, and do you know the spacing?
  • Is the point for which you seek the neighbors an element of the table or somewhere between the grid points?
« Last Edit: February 23, 2019, 12:57:48 pm by wp »
Lazarus trunk / fpc 3.0.4 / all 32-bit on Win-10

chenyuchih

  • Jr. Member
  • **
  • Posts: 71
Re: Using Tdbf for searching an lookup table
« Reply #9 on: February 23, 2019, 03:13:00 pm »
You use a filtering process to find the neighboring records; this returns a lot of unneeded records, and I think copying these records to the result set is the reason why your code is so slow.

To decide how this can be avoided please answer these questions:
  • Are the pressure and temperature values of the table always equally spaced and ordered like in the sample dbf file, and do you know the spacing?
  • Is the point for which you seek the neighbors an element of the table or somewhere between the grid points?

1. No. Because of some vapor's saturation characteristic, the interval between values is not always equal. Besides, the data file may be constructed by others, the 4 neighbor points on P-T plane may even not be a rectangle.

2. The target may be anywhere on the P-T plane. I need to get the neighbor points first then interpolate the properties of target point. The calculation part is not a problem. Now the issue is how to get the neighbor points more efficiently.

I guess one possible reason of this performance issue is the file size. It contains over 20,000 records for good resolution. If there's no effective remedy, I think I have to do some trade off.

Thanks!

ChenYuChih

wp

  • Hero Member
  • *****
  • Posts: 5723
Re: Using Tdbf for searching an lookup table
« Reply #10 on: February 23, 2019, 04:20:51 pm »
the 4 neighbor points on P-T plane may even not be a rectangle.
Then the problem is different. You need to find the three (or four) P-T points among the given ones which have the closest distance to the given P-T point. Because pressure and temperature have different units and cover different ranges I'd normalize them at first to the range 0..1. Then I'd calculate the "distance" of each record point from the P-T point, add a column for that and sort the table by this distance column. The first three (or four) points will be the ones that you seek.

I think that this will be a bit complicated for the dbf dataset. You should move to an SQL database, where you can do the calculations by SQL statements. Maybe SQLite3, but this requires an external DLL. I would not recommend ZMSQL because this package is not actively maintained.

Or, my own preference would be to load the entire dbf file into an array of P-T-V-H-S records and do the calculation and sorting in Pascal. It shouldn't be too difficult.

[EDIT]
The latter idea is implemented in the attached demo. First click on "Load dbf" to load the records from the dbf file into a pascal list, then click on "Find neighbors" to find a given count of records which have the closest "distance" to the pressure-temperature value pair given in the edit boxes. Loading the db file and populating the list takes a few 1/10 sec, searching completes within the ms time frame.
« Last Edit: February 23, 2019, 07:09:04 pm by wp »
Lazarus trunk / fpc 3.0.4 / all 32-bit on Win-10

chenyuchih

  • Jr. Member
  • **
  • Posts: 71
Re: Using Tdbf for searching an lookup table
« Reply #11 on: February 24, 2019, 02:50:50 am »
Thanks for your hint and help, wp.

I am going to investigate your code. :)

ChenYuChih

avra

  • Hero Member
  • *****
  • Posts: 1530
    • Additional info
Re: Using Tdbf for searching an lookup table
« Reply #12 on: February 24, 2019, 07:21:36 pm »
I am programming a fluid properties calculator, which interpolates from a lookup table, and I selected xBase as database system for storing the data.

The original data is based on a 2-D(pressure and temperature) lookup table. I re-arranged it to a long list and saved it to a DBF file.

The biggest speed loss is in database lookup table. Simple floating point formula calculation is so much faster on modern hardware. In case of simple fluid flow calculation correction which depends on measured pressure (much more) and measured temperature (much less) it is trivial and very fast to calculate. Flow measurement devices are calibrated for some reference pressure and temperature, and you can usually read this data on measurement device label or device calibration certification paper. You read this data, enter them in a simple sqrt like formula and you get actual corrected flow. In case you need more then just flow calculation, then things might not be that simple, but for flow they are.
ct2laz - Conversion between Lazarus and CodeTyphon
bithelpers - Bit manipulation for standard types
pasettimino - Siemens S7 PLC lib