Recent

Author Topic: How can one optimise inserting an item into an array, anywhere but the end?  (Read 3584 times)

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1111
  • Professional amateur ;-P
Hi Y'all,

In order to do that, one increments the length of the array by one, then moves all the elements that stand on the insertion index, and to the right of it, one place to the right and then insert our element at the insertion index.

For me, the logic implementation of shift-the-elements-to-the-right would be:

Code: Pascal  [Select][+][-]
  1. type
  2.   TMyArray: Array of Integer; // As a quick example
  3.  
  4. procedure ShiftArrayRight(const AMyArray: TMyArray; AInsertionIndex: Integer);
  5. var
  6.   Index: Integer;
  7. begin
  8.   SetLength(AMyArray, Length(AMyArray)+1); // Would the const up there be an issue here?
  9.   for Index:= High(AMyArray) downto AInsertionIndex do
  10.   begin
  11.     AMyArray[Index]:= AMyArray[Pred(Index)];
  12.   end;
  13. end;

And for small amounts contained in the array, it will make sense.
But at what point is this approach gonna be inefficient?

So my original idea to optimise it is to have a low level call that would shift a whole block of mem, one byte to the right.

But then more questions appear:
  • Is this possible?
  • Is it the best solution?
  • When does the array pass the boundary of a page, and there need to be 2 mem copy operations?

Hence I'm asking the more experienced programmer in this forum to help me on this one.

As per usual, many, MANY thanks in advance!!

Cheers,
Gus
Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
http://github.com/gcarreno

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Don't use an array.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

AlexTP

  • Hero Member
  • *****
  • Posts: 2386
    • UVviewsoft
Yes, better use TFPGList<item_type_here>, it is generic. Unit FGL.
Item type must be (should) non-managed.

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1111
  • Professional amateur ;-P
Hey Mark,

Don't use an array.

I like your Mister Miyagi answer: Best way to block a punch, no be there. LOL !!!  :D

How should I implement "no be there" in your opinion?

Cuz the only way I see to avoid the array is to use a linked list.

At their core, what do most of the TList* implementations use? An array, or a linked list? Something else I'm missing?

Cheers,
Gus
Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
http://github.com/gcarreno

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1111
  • Professional amateur ;-P
Hey Alex,

Yes, better use TFPGList<item_type_here>, it is generic. Unit FGL.
Item type must be (should) non-managed.

Yes, that would also be my answer, but in it's core, does that generic use an array or a linked list?
And if it uses an array, how did the author manage the optimisation of inserting in the middle?

Cheers,
Gus
Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
http://github.com/gcarreno

JdeHaan

  • Full Member
  • ***
  • Posts: 115
First, you can use the standard Insert function for arrays.
But if you want to do it yourself, this is what I use:

System.Move(fItems[Index], fItems[Index + 1], SizeOf(Item) * (fCount - Index));

I separately keep track of the number items (fCount) and the capacity of the array, which may be more than fCount.

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1111
  • Professional amateur ;-P
Hey JdeHaan,

First, you can use the standard Insert function for arrays.

Hummm, this proves I should know my arrays a little bit more since I was unaware of such function :)
I thank you SOOO very much for teaching me something new :)

But the same questions apply to this:
  • Is this method optimised?
  • How is it implemented at the bare metal level?



But if you want to do it yourself, this is what I use:

System.Move(fItems[Index], fItems[Index + 1], SizeOf(Item) * (fCount - Index));

I separately keep track of the number items (fCount) and the capacity of the array, which may be more than fCount.

This is the exact thing I had in mind: A low level Move. THANK YOU so much for this answer!!!

My only fear is this, because at the bare metal level, the move/copy is one byte at a time:
  • What's the algorithm that implements that move?
  • Does it copy from left-to-right or form right-to-left? If it copies from left to right, then all values are now the same as insertion point
  • Does it copy the chunk to another place and then back to the original position+1?
  • What if the array is so big that it crosses page boundaries, is Move aware of that?

Cheers,
Gus
Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
http://github.com/gcarreno

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
How should I implement "no be there" in your opinion?

A linear array is the worst possible structure as far as insertions are concerned, hence the optimal solution cannot rely on them. Trust me: as an engineer, I tend to over-use arrays.

You need to look at what databases do internally, you need to consider the relative frequency of insertions and references, you need to consider the possibility of using e.g. an array of arrays or and array of linked lists, and you need to be prepared to build indexes and have some heuristic which indicates when it is time to rebuild or reallocate storage.

Having considered those points, you might find that it's most efficient to assemble your data into a database (even if it's then read into an array for rapid processing), or to use something based on a TList, or an explicit balanced tree, or whatever computersciency replacement is in vogue.

Quote
At their core, what do most of the TList* implementations use? An array, or a linked list? Something else I'm missing?

Who knows? It's an opaque data structure, so there is no way of knowing whether its implementation is the same across all platforms and all FPC (etc.) versions.

But your question boils down to (a) efficient insertion and (b) efficient indexed access... and to answer that you need to consider the relative number of times those occur.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Don't use an array.

Depends on what you want.   If you want deterministic walking of the datastructure and not necessarily infinity scalability, array of arrays can be quite ok, see e.g. http://www.stack.nl/~marcov/lightcontainers.zip

winni

  • Hero Member
  • *****
  • Posts: 3197
    • What's the algorithm that implements that move?
    • Does it copy from left-to-right or form right-to-left? If it copies from left to right, then all values are now the same as insertion point
    • Does it copy the chunk to another place and then back to the original position+1?
    • What if the array is so big that it crosses page boundaries, is Move aware of that?

    Cheers,
    Gus

    Hi!

    As you can see in the source move is done in Assembler.

    In the very early days of Pascal you had to take care of moving left or right, there was moveL and moveR.

    That was replaced long time ago by only one move that takes care of the direction.

    I have moved big pieces of data with fpc and I never had problems.

    In Turbo times move was often used to speed up execution.
    With todays machines there is no need anymore to do that.

    Winni

    Winni


    Gustavo 'Gus' Carreno

    • Hero Member
    • *****
    • Posts: 1111
    • Professional amateur ;-P
    Hey Y'all,

    Ultimately what I'm trying to do is avoid using binary trees.
    At Uni, when trees came along, I was able to quickly understand the population and the traversal of one.
    Alas, for the life of me, I could never understand the balancing of a tree. Well, the balancing I understood, but the implementation of it, NOPE!

    And if I want to do a proper hash map class, the best way to implement such a thing is on the back of a binary tree, a balanced binary tree!!!

    So, I know how to do a binary search on an ordered array that will land me an OLog N. But then I have an ON for the array insertion, ARGH!!
    If I do linked lists, I cannot implement a binary search since I have no indexes, and no way to quickly get at the middle element of a sub set of a linked list. So whatever algo i use for search, would always be ON, but I would have O1 in the insertion. ARGHHHH!!!

    If only the insertion of the array could be optimized out of ON, that would be GRAND!!

    Either that, or I'll have to copy/paste some binary tree balancing code that I really don't understand and for me that's a tad unacceptable!

    Cheers,
    Gus
    « Last Edit: August 11, 2022, 11:15:49 pm by Gustavo 'Gus' Carreno »
    Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
    Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
    http://github.com/gcarreno

    Gustavo 'Gus' Carreno

    • Hero Member
    • *****
    • Posts: 1111
    • Professional amateur ;-P
    Hey Winni,

    As you can see in the source move is done in Assembler.

    Ok, this gives me some hope :)

    In the very early days of Pascal you had to take care of moving left or right, there was moveL and moveR.

    That was replaced long time ago by only one move that takes care of the direction.

    Clever, exactly the assurance I wanted :)

    I have moved big pieces of data with fpc and I never had problems.

    Kewl, kewl, looks like I'll have to investigate and a add another app to my Test* series.

    Many thanks Winni!!

    Cheers,
    Gus
    Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
    Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
    http://github.com/gcarreno

    MarkMLl

    • Hero Member
    • *****
    • Posts: 6676
    Ultimately what I'm trying to do is avoid using binary trees.

    Mandating that is likely to be incompatible with your original "optimise" request.

    MarkMLl
    MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
    Pet hate: people who boast about the size and sophistication of their computer.
    GitHub repositories: https://github.com/MarkMLl?tab=repositories

    Gustavo 'Gus' Carreno

    • Hero Member
    • *****
    • Posts: 1111
    • Professional amateur ;-P
    Hey Marco,

    Depends on what you want.   If you want deterministic walking of the datastructure and not necessarily infinity scalability, array of arrays can be quite ok, see e.g. http://www.stack.nl/~marcov/lightcontainers.zip

    I've had a quick glance at the *.txt file and I'm intrigued by your solutions.
    Need to have a good glance at the code.

    Many thanks, Marco, for the code share!!

    Cheers,
    Gus
    Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
    Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
    http://github.com/gcarreno

    Gustavo 'Gus' Carreno

    • Hero Member
    • *****
    • Posts: 1111
    • Professional amateur ;-P
    Hey Mark,

    Ultimately what I'm trying to do is avoid using binary trees.

    Mandating that is likely to be incompatible with your original "optimise" request.

    MarkMLl

    I was afraid someone would say that, and quite frankly, I think I know it in my bones, but hey, if you don't ask you keep ignorant :)

    Thanks Mark!!

    Cheers,
    Gus
    Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
    Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
    http://github.com/gcarreno

     

    TinyPortal © 2005-2018