Recent

Author Topic: Most efficient way of text-search in a record array?  (Read 4570 times)

sniperton

  • New Member
  • *
  • Posts: 18
Most efficient way of text-search in a record array?
« on: April 13, 2016, 05:38:18 pm »
Hi, I returned to hobby-programming after a while and feel more stupid than ever…

I have data which requires a record-like structure with various fields, something like this (but could be different as well; it’s just what I functionally need):

Code: Pascal  [Select][+][-]
  1. Type  
  2.   myRecord = Record  
  3.     myString : Array [0..n] of String;  
  4.     myData : Array [0..n] of Integer;
  5.     …
  6.  
  7. Var  
  8.   myR : Array [0..n] of myRecord;

What is the most efficient way for a beginner (beyond the primeval for..next approach in a function) to check

A. 1. whether one of the myString fields (where I can specify the field index, meaning that it could also be myString1, myString2, ...)
   a) matches a text pattern (or, optionally,)
   b) contains a text pattern;
A. 2. if it does, return the record index;

B. 1. whether the myData fields (where I can’t specify the field index, meaning that a full sub-array has to be searched)
   a) is equal to a numeric value (a value is already contained in the array);
B.2. if it is, return the record index

Thank you for your time,

Cheers, and please be generous, the shame is mine...  :-[

Leledumbo

  • Hero Member
  • *****
  • Posts: 8783
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Most efficient way of text-search in a record array?
« Reply #1 on: April 13, 2016, 05:58:44 pm »
A. 1. whether one of the myString fields (where I can specify the field index, meaning that it could also be myString1, myString2, ...)
   a) matches a text pattern (or, optionally,)
=, (Ansi|Wide)Compare(Text|Str)
   b) contains a text pattern;
AnsiContains(Text|Str) (no Wide variant as of now), use RegExpr unit (or other units implementing the same) functionality.
A. 2. if it does, return the record index;
Do it in a loop. If you can ensure they're sorted, use binary search. Otherwise, populate them in a TStringList, then use IndexOf. Series of Insert followed by Sorted := true before that may speed up the code a bit.
B. 1. whether the myData fields (where I can’t specify the field index, meaning that a full sub-array has to be searched)
   a) is equal to a numeric value (a value is already contained in the array);
B.2. if it is, return the record index
No different from above, even simpler since the only = required, no fancy function required.

sniperton

  • New Member
  • *
  • Posts: 18
Re: Most efficient way of text-search in a record array?
« Reply #2 on: April 13, 2016, 06:38:44 pm »
Thank you, the Compare/Contains thingy is clear, but the loop proves to be pretty slow, even if I exit the function when a match is found (I merge data from various sources and I have to check them on each import).

I tried, but failed, with the IndexOf function; probably I missed something.

Code: Pascal  [Select][+][-]
  1. Type  
  2.   myRecord = Record  
  3.     myString : Array [0..n] of TStringList;  
  4.     myData : Array [0..n] of Integer;
  5.     …
  6.  
  7. Var  
  8.   myR : Array [0..n] of myRecord;

How would you use IndexOf in the above case for myString (array or not)?

Code: Pascal  [Select][+][-]
  1. i:= IndexOf(myR.myString[n]);
???

BTW, is the IndexOf function considerably faster than the loop?

balazsszekely

  • Guest
Re: Most efficient way of text-search in a record array?
« Reply #3 on: April 13, 2016, 07:12:30 pm »
Sort the TStringList after you populate with items, otherwise IndexOf will be very slow!
Code: Pascal  [Select][+][-]
  1. var
  2.   i, j: Integer;
  3.   p: Integer;
  4. begin
  5.   for i := 0 to n do //loop through myR
  6.   begin
  7.     for j := 0 to n do //loop through myString
  8.     begin
  9.       p := myR[i].myString[j].IndexOf('test');
  10.       if p <> -1 then
  11.       begin
  12.         ShowMessage('Found at position:' + sLineBreak +
  13.                     '   i = ' + IntToStr(i) + sLineBreak +
  14.                     '   j = ' + IntToStr(j));
  15.         Break;
  16.       end;
  17.     end;
  18.   end;
  19. end;

Leledumbo

  • Hero Member
  • *****
  • Posts: 8783
  • Programming + Glam Metal + Tae Kwon Do = Me
Re: Most efficient way of text-search in a record array?
« Reply #4 on: April 13, 2016, 07:34:56 pm »
I tried, but failed, with the IndexOf function; probably I missed something.

Code: Pascal  [Select][+][-]
  1. Type  
  2.   myRecord = Record  
  3.     myString : Array [0..n] of TStringList;  
  4.     myData : Array [0..n] of Integer;
  5.     …
  6.  
  7. Var  
  8.   myR : Array [0..n] of myRecord;

How would you use IndexOf in the above case for myString (array or not)?

Code: Pascal  [Select][+][-]
  1. i:= IndexOf(myR.myString[n]);
???
That's not the way I mean. Populate a TStringList means you loop over your current whatever data structure, taking all myString of index n and Add/Insert-it to a TStringList instance.
BTW, is the IndexOf function considerably faster than the loop?
IndexOf in Sorted = false is the same, it's just linear search. But with Sorted = true, it uses binary search which is a lot faster.

sniperton

  • New Member
  • *
  • Posts: 18
Re: Most efficient way of text-search in a record array?
« Reply #5 on: April 14, 2016, 03:52:29 pm »
Great, I think I got the picture. Thank you all for your help. Cheers

 

TinyPortal © 2005-2018