Print Page - How can one optimise inserting an item into an array, anywhere but the end?

Programming => General => Topic started by: Gustavo 'Gus' Carreno on August 11, 2022, 09:40:22 pm

Title: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 11, 2022, 09:40:22 pm

Hi Y'all,

In order to do that, one increments the length of the array by one, then moves all the elements that stand on the insertion index, and to the right of it, one place to the right and then insert our element at the insertion index.

For me, the logic implementation of shift-the-elements-to-the-right would be:

Code: Pascal [Select][+]

type
  TMyArray: Array of Integer; // As a quick example
 
procedure ShiftArrayRight(const AMyArray: TMyArray; AInsertionIndex: Integer);
var
  Index: Integer;
begin
  SetLength(AMyArray, Length(AMyArray)+1); // Would the const up there be an issue here?
  for Index:= High(AMyArray) downto AInsertionIndex do
  begin
    AMyArray[Index]:= AMyArray[Pred(Index)];
  end;
end;

And for small amounts contained in the array, it will make sense.
But at what point is this approach gonna be inefficient?

So my original idea to optimise it is to have a low level call that would shift a whole block of mem, one byte to the right.

But then more questions appear:

Is this possible?
Is it the best solution?
When does the array pass the boundary of a page, and there need to be 2 mem copy operations?

Hence I'm asking the more experienced programmer in this forum to help me on this one.

As per usual, many, MANY thanks in advance!!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 11, 2022, 09:49:46 pm

Don't use an array.

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: AlexTP on August 11, 2022, 09:54:14 pm

Yes, better use TFPGList<item_type_here>, it is generic. Unit FGL.
Item type must be (should) non-managed.

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 11, 2022, 09:59:47 pm

Hey Mark,

Quote from: MarkMLl on August 11, 2022, 09:49:46 pm

Don't use an array.

I like your Mister Miyagi answer: Best way to block a punch, no be there. LOL !!! :D

How should I implement "no be there" in your opinion?

Cuz the only way I see to avoid the array is to use a linked list.

At their core, what do most of the TList* implementations use? An array, or a linked list? Something else I'm missing?

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 11, 2022, 10:02:47 pm

Hey Alex,

Quote from: AlexTP on August 11, 2022, 09:54:14 pm

Yes, better use TFPGList<item_type_here>, it is generic. Unit FGL.
Item type must be (should) non-managed.

Yes, that would also be my answer, but in it's core, does that generic use an array or a linked list?
And if it uses an array, how did the author manage the optimisation of inserting in the middle?

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: JdeHaan on August 11, 2022, 10:21:01 pm

First, you can use the standard Insert function for arrays.
But if you want to do it yourself, this is what I use:

System.Move(fItems[Index], fItems[Index + 1], SizeOf(Item) * (fCount - Index));

I separately keep track of the number items (fCount) and the capacity of the array, which may be more than fCount.

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 11, 2022, 10:36:57 pm

Hey JdeHaan,

Quote from: JdeHaan on August 11, 2022, 10:21:01 pm

First, you can use the standard Insert function for arrays.

Hummm, this proves I should know my arrays a little bit more since I was unaware of such function :)
I thank you SOOO very much for teaching me something new :)

But the same questions apply to this:

Is this method optimised?
How is it implemented at the bare metal level?

Quote from: JdeHaan on August 11, 2022, 10:21:01 pm

But if you want to do it yourself, this is what I use:

System.Move(fItems[Index], fItems[Index + 1], SizeOf(Item) * (fCount - Index));

I separately keep track of the number items (fCount) and the capacity of the array, which may be more than fCount.

This is the exact thing I had in mind: A low level Move. THANK YOU so much for this answer!!!

My only fear is this, because at the bare metal level, the move/copy is one byte at a time:

What's the algorithm that implements that move?
Does it copy from left-to-right or form right-to-left? If it copies from left to right, then all values are now the same as insertion point
Does it copy the chunk to another place and then back to the original position+1?
What if the array is so big that it crosses page boundaries, is Move aware of that?

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 11, 2022, 10:41:28 pm

Quote from: Gustavo 'Gus' Carreno on August 11, 2022, 09:59:47 pm

How should I implement "no be there" in your opinion?

A linear array is the worst possible structure as far as insertions are concerned, hence the optimal solution cannot rely on them. Trust me: as an engineer, I tend to over-use arrays.

You need to look at what databases do internally, you need to consider the relative frequency of insertions and references, you need to consider the possibility of using e.g. an array of arrays or and array of linked lists, and you need to be prepared to build indexes and have some heuristic which indicates when it is time to rebuild or reallocate storage.

Having considered those points, you might find that it's most efficient to assemble your data into a database (even if it's then read into an array for rapid processing), or to use something based on a TList, or an explicit balanced tree, or whatever computersciency replacement is in vogue.

Quote

At their core, what do most of the TList* implementations use? An array, or a linked list? Something else I'm missing?

Who knows? It's an opaque data structure, so there is no way of knowing whether its implementation is the same across all platforms and all FPC (etc.) versions.

But your question boils down to (a) efficient insertion and (b) efficient indexed access... and to answer that you need to consider the relative number of times those occur.

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: marcov on August 11, 2022, 10:47:12 pm

Quote from: MarkMLl on August 11, 2022, 09:49:46 pm

Don't use an array.

Depends on what you want. If you want deterministic walking of the datastructure and not necessarily infinity scalability, array of arrays can be quite ok, see e.g. http://www.stack.nl/~marcov/lightcontainers.zip

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: winni on August 11, 2022, 10:52:50 pm

Quote from: Gustavo 'Gus' Carreno on August 11, 2022, 10:36:57 pm

What's the algorithm that implements that move?
Does it copy from left-to-right or form right-to-left? If it copies from left to right, then all values are now the same as insertion point
Does it copy the chunk to another place and then back to the original position+1?
What if the array is so big that it crosses page boundaries, is Move aware of that?

Cheers,
Gus

Hi!

As you can see in the source move is done in Assembler.

In the very early days of Pascal you had to take care of moving left or right, there was moveL and moveR.

That was replaced long time ago by only one move that takes care of the direction.

I have moved big pieces of data with fpc and I never had problems.

In Turbo times move was often used to speed up execution.
With todays machines there is no need anymore to do that.

Winni

Winni

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 11, 2022, 10:54:29 pm

Hey Y'all,

Ultimately what I'm trying to do is avoid using binary trees.
At Uni, when trees came along, I was able to quickly understand the population and the traversal of one.
Alas, for the life of me, I could never understand the balancing of a tree. Well, the balancing I understood, but the implementation of it, NOPE!

And if I want to do a proper hash map class, the best way to implement such a thing is on the back of a binary tree, a balanced binary tree!!!

So, I know how to do a binary search on an ordered array that will land me an O_{Log N}. But then I have an O_N for the array insertion, ARGH!!
If I do linked lists, I cannot implement a binary search since I have no indexes, and no way to quickly get at the middle element of a sub set of a linked list. So whatever algo i use for search, would always be O_N, but I would have O₁ in the insertion. ARGHHHH!!!

If only the insertion of the array could be optimized out of O_N, that would be GRAND!!

Either that, or I'll have to copy/paste some binary tree balancing code that I really don't understand and for me that's a tad unacceptable!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 11, 2022, 11:13:44 pm

Hey Winni,

Quote from: winni on August 11, 2022, 10:52:50 pm

As you can see in the source move is done in Assembler.

Ok, this gives me some hope :)

Quote from: winni on August 11, 2022, 10:52:50 pm

In the very early days of Pascal you had to take care of moving left or right, there was moveL and moveR.

That was replaced long time ago by only one move that takes care of the direction.

Clever, exactly the assurance I wanted :)

Quote from: winni on August 11, 2022, 10:52:50 pm

I have moved big pieces of data with fpc and I never had problems.

Kewl, kewl, looks like I'll have to investigate and a add another app to my Test* series.

Many thanks Winni!!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 11, 2022, 11:39:52 pm

Quote from: Gustavo 'Gus' Carreno on August 11, 2022, 10:54:29 pm

Ultimately what I'm trying to do is avoid using binary trees.

Mandating that is likely to be incompatible with your original "optimise" request.

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 11, 2022, 11:40:59 pm

Hey Marco,

Quote from: marcov on August 11, 2022, 10:47:12 pm

Depends on what you want. If you want deterministic walking of the datastructure and not necessarily infinity scalability, array of arrays can be quite ok, see e.g. http://www.stack.nl/~marcov/lightcontainers.zip

I've had a quick glance at the *.txt file and I'm intrigued by your solutions.
Need to have a good glance at the code.

Many thanks, Marco, for the code share!!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 11, 2022, 11:42:50 pm

Hey Mark,

Quote from: MarkMLl on August 11, 2022, 11:39:52 pm

Quote from: Gustavo 'Gus' Carreno on August 11, 2022, 10:54:29 pm
Ultimately what I'm trying to do is avoid using binary trees.

Mandating that is likely to be incompatible with your original "optimise" request.

MarkMLl

I was afraid someone would say that, and quite frankly, I think I know it in my bones, but hey, if you don't ask you keep ignorant :)

Thanks Mark!!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 11, 2022, 11:59:20 pm

Quote from: Gustavo 'Gus' Carreno on August 11, 2022, 11:42:50 pm

I was afraid someone would say that, and quite frankly, I think I know it in my bones, but hey, if you don't ask you keep ignorant :)

Well, there's B-Trees, B+-Trees and Bonsai Trees... and there was something about those that made IBM buy the company that invented them.

There's definitely hybrid approaches you could use. For example, if you knew that your overall array would never have more than 64K elephants you could implement it as 256 dynamic arrays each of up to 256 elements, each of which had an index, and an otter class wrapper to handle indexed access.

The absolutely crucial thing is to look at how often there is a mid-array insertion, and how far it is from the end, relative to the number of reads.

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 12, 2022, 12:11:24 am

Hey Mark,

Quote from: MarkMLl on August 11, 2022, 11:59:20 pm

Well, there's B-Trees, B+-Trees and Bonsai Trees... and there was something about those that made IBM buy the company that invented them.

Eheheh!! So now, it begs the question: Does FPC ship with an implementation of a binary tree kinda container?

Quote from: MarkMLl on August 11, 2022, 11:59:20 pm

There's definitely hybrid approaches you could use. For example, if you knew that your overall array would never have more than 64K elephants you could implement it as 256 dynamic arrays each of up to 256 elements, each of which had an index, and an otter class wrapper to handle indexed access.

You know what? Marco also mentioned arrays of arrays and it only clicked with this explanation of yours!! Thanks Mark, this helped a bit.

Quote from: MarkMLl on August 11, 2022, 11:59:20 pm

The absolutely crucial thing is to look at how often there is a mid-array insertion, and how far it is from the end, relative to the number of reads.

To be quite honest, a lot of the usage will see the array containing only around 1 to ~50 items. Yeah, again stuff from the coder I mentioned in the "non-blocking" post :(
But I like to have it future proof, where there could be items in the hundreds, thousands even...

I like my containers future proof and optimised. And maybe I'm suffering from early optimization sickness and I should just use a bloody array for the time being, and then when the load gets a bit too much for the array, then bite the bullet and just do the bloody binary tree !!

I dunno!!!!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 12, 2022, 09:00:41 am

Quote from: Gustavo 'Gus' Carreno on August 12, 2022, 12:11:24 am

To be quite honest, a lot of the usage will see the array containing only around 1 to ~50 items. Yeah, again stuff from the coder I mentioned in the "non-blocking" post :(
But I like to have it future proof, where there could be items in the hundreds, thousands even...

For something that small, have a maximal-sized array of integers as an index and an unsorted dynamic array. But be aware that extending a dynamic array can be slow, so do it in chunks.

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 12, 2022, 09:17:12 am

Hey Mark,

Quote from: MarkMLl on August 12, 2022, 09:00:41 am

For something that small, have a maximal-sized array of integers as an index and an unsorted dynamic array. But be aware that extending a dynamic array can be slow, so do it in chunks.

During the past few days I've been imagining ways of pulling it off and one of them was exactly that. Thanks for validating some of my brain wanderings :)

And you've now also gave me the answer to why there is a variable called capacity attached to dynamic arrays. We have to SetLengh() in chunks because it's costly!!

Again, many thanks for your share of knowledge !!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: 440bx on August 12, 2022, 09:23:05 am

Quote from: Gustavo 'Gus' Carreno on August 12, 2022, 09:17:12 am

And you've now also gave me the answer to why there is a variable called capacity attached to dynamic arrays. We have to SetLengh() in chunks because it's costly!!

Is this something that will be Windows only or does it have to be multiplatform ?

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: AlexTP on August 12, 2022, 10:04:43 am

Gustavo,
TList (and generics like TFPGList) are based on array internally. And yes, inserting to the middle - makes slow array moving - like Move( ) that you seen in this thread. So it is not optimal. What is optimal from the point of view of insertion - is linked list.

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Zvoni on August 12, 2022, 10:35:27 am

Staying with the original question (Inserting into an Array):
In Visual Basic 6 (Blast the Dot CRAP) i did it as follows

(Don't start with Array-Index beginning at 0, this is to visualize it)
You have a Source-Array with Length 100 (1..100)
Now you want to insert at Position 42
Create a Destination Array with Length 101 (1..101)
Copy the Memory starting at Pos 1 in Source-Array with Length 41*MemberSize (Member can be 2 Bytes, 4 Bytes, 8 Bytes, A "hidden" Pointer--> Strings, Records etc.) to DestArray[1]
Assign Array-Member[42] the new value
Copy the Memory starting at Pos 42 in Source-Array to Dest-Array[43] with Length 59*MemberSize
Discard/Destroy/Erase Source-Array

No idea about performance and what not.
In VBA it worked without problems, and i never noticed any performance glitches, though maybe i just didn't have enough data in my arrays
*shrug*

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 12, 2022, 10:45:59 am

Quote from: Zvoni on August 12, 2022, 10:35:27 am

Copy the Memory starting at Pos 1 in Source-Array with Length 41*MemberSize (Member can be 2 Bytes, 4 Bytes, 8 Bytes, A "hidden" Pointer--> Strings, Records etc.) to DestArray[1]
Assign Array-Member[42] the new value
Copy the Memory starting at Pos 42 in Source-Array to Dest-Array[43] with Length 59*MemberSize
Discard/Destroy/Erase Source-Array

If access assumed that the array was actually split into two fragments A and B (Hell, I wasn't the one who brought BASIC into it :-) with A containing 100 elements and B empty, then inserting element 42 would involve moving the top fragment into B, trimming A and then appending to A. An insertion of an element >42 would now involve appending the unaffected elements from B to A, and so on.

However the management and performance burden would be substantially worse than using e.g. index+store, or "doing it properly" with a linked list or tree.

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Zvoni on August 12, 2022, 10:48:40 am

Quote from: MarkMLl on August 12, 2022, 10:45:59 am

However the management and performance burden would be substantially worse than using e.g. index+store, or "doing it properly" with a linked list or tree.

MarkMLl

Mark, don't get me wrong: I actually agree with you, i just addressed the original question, and my answer was, how i did/do it in VBA
My answer describes more the "algorithm"/technique i use(d)

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: PascalDragon on August 12, 2022, 04:15:39 pm

Quote from: Gustavo 'Gus' Carreno on August 11, 2022, 09:40:22 pm

And for small amounts contained in the array, it will make sense.
But at what point is this approach gonna be inefficient?

So my original idea to optimise it is to have a low level call that would shift a whole block of mem, one byte to the right.

But then more questions appear:
Is this possible?
Is it the best solution?
When does the array pass the boundary of a page, and there need to be 2 mem copy operations?

For small arrays you can just as well use Insert and call it a day as it tries to be as performant as possible for what it can work with. Not to mention that it also deals with the nitty, gritty details of managed types.

For larger arrays you will likely need additional or different data structures (e.g. the mentioned binary trees).

Also as long as you aren't developing a database system that needs to be sure where each page is (in memory or on disk) then you don't need to deal with pages aside from "is it valid?".

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 13, 2022, 06:35:16 am

Hey 440bx,

Quote from: 440bx on August 12, 2022, 09:23:05 am

Is this something that will be Windows only or does it have to be multiplatform ?

The intent, at least with me, is to always aim at cross-platform.

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 13, 2022, 06:40:04 am

Hey Alex,

Quote from: AlexTP on August 12, 2022, 10:04:43 am

TList (and generics like TFPGList) are based on array internally. And yes, inserting to the middle - makes slow array moving - like Move( ) that you seen in this thread. So it is not optimal. What is optimal from the point of view of insertion - is linked list.

Ok, many thanks ALex, for confirming that a lot of the list classes are implemented over an array :)

And I do agree that for insertion, the linked list is the best, since it's O₍₁₎. The problem is locating the place to insert, and that can be O_(N), which is the worst :(

Still, you gave me an assurance that I wasn't saying bullshit about the lists being based in arrays and I cannot thank you enough for that!!!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Gustavo 'Gus' Carreno on August 13, 2022, 06:47:49 am

Hey PascalDragon,

Quote from: PascalDragon on August 12, 2022, 04:15:39 pm

For small arrays you can just as well use Insert and call it a day as it tries to be as performant as possible for what it can work with. Not to mention that it also deals with the nitty, gritty details of managed types.

Okidokes, many thanks for more complete tidbits surrounding arrays and the Insert in particular !!!

Quote from: PascalDragon on August 12, 2022, 04:15:39 pm

For larger arrays you will likely need additional or different data structures (e.g. the mentioned binary trees).

Hummm, confirms my suspitions, thanks!
Does Free Pascal come with a balanced binary tree implementation somewhere in it's bowels? <--- EDIT

Quote from: PascalDragon on August 12, 2022, 04:15:39 pm

Also as long as you aren't developing a database system that needs to be sure where each page is (in memory or on disk) then you don't need to deal with pages aside from "is it valid?".

Thanks, again, since this is rather good to know!!

As usual, my dear PascalDragon, a trove of information that is useful and presented in a clear and succinct way. Kudos mate!!

Cheers,
Gus

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: jollytall on August 13, 2022, 07:58:38 am

A bit off, but still worth considering.
Where does your data to insert come from, and when do you need to use it?
Typically, either you have the data upfront, before actually using it (e.g. on disk) or the data slowly comes while also being processed (e.g. user input, sensors, network).
In the first case, maybe first load all data without inserting to the right place and sort the array once before use.
In the second case, unless talking about very large data sets, the input is slow enough to do whatever insert method you want.

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: BrunoK on August 13, 2022, 09:33:13 am

In FPC 3.2.2, for example
\packages\fcl-base\src\avl_tree.pp
Explanations :
https://en.wikipedia.org/wiki/AVL_tree

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: jcmontherock on August 13, 2022, 10:34:59 am

Returning to the first question:
Insert for dynamic array now exists in 322:

Insert(Item, Array, Index);
at end:
Insert(Item, Array, Length(Array));

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 13, 2022, 10:42:28 am

And the timings look like... ?

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Ten_Mile_Hike on August 14, 2022, 12:31:48 am

Is an "array of arrays" the same thing as a multidimensional array ... Asking for a friend %)

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: alpine on August 14, 2022, 01:13:27 am

Quote from: Ten_Mile_Hike on August 14, 2022, 12:31:48 am

Is an "array of arrays" the same thing as a multidimensional array ... Asking for a friend %)

Now I can remember my old professor saying : "In the end it all boils down to a vector, be it to the vector of the main memory".

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 14, 2022, 09:38:35 am

Quote from: Ten_Mile_Hike on August 14, 2022, 12:31:48 am

Is an "array of arrays" the same thing as a multidimensional array ... Asking for a friend %)

Not necessarily.

With a small number of exceptions on historical systems, the content of an array will be contiguous. That applies to both the case of a single-dimensioned array, and a multidimensioned array.

You cannot necessarily rely on an array of arrays having contiguous content, particularly where for some reason an array is represented by a pointer or some form of descriptor with the actual content on the heap (e.g. FPC dynamic arrays).

The historical exceptions are some very old systems which divided even single-dimensioned arrays into page-size chunks, and systems such as OS/2 on the '286 which had convoluted rules for making large segmented arrays look as though they were contiguous even if they really weren't.

In addition, considering a language such as (any dialect of) Pascal a multidimensioned array has all elements the same type, while in the case of an array-of-arrays they obviously are not (the first dimension has elements which are arrays, the final dimension has elements which are e.g. numbers).

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: jollytall on August 14, 2022, 11:24:36 am

Quote from: Ten_Mile_Hike on August 14, 2022, 12:31:48 am

Is an "array of arrays" the same thing as a multidimensional array ... Asking for a friend %)

MarkMLI maybe found it too obvious, but one more difference: A normal multidimensional array has a rectangular shape (in 2D) and so on in larger dimensions, while (for dynamic arrays) an array of an array can have different size rows.

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: alpine on August 14, 2022, 01:34:36 pm

Quote from: jollytall on August 14, 2022, 11:24:36 am

Quote from: Ten_Mile_Hike on August 14, 2022, 12:31:48 am
Is an "array of arrays" the same thing as a multidimensional array ... Asking for a friend %)

MarkMLI maybe found it too obvious, but one more difference: A normal multidimensional array has a rectangular shape (in 2D) and so on in larger dimensions, while (for dynamic arrays) an array of an array can have different size rows.

Considering that, the dynamic arrays can give a memory size benefit, not a speed one. Splitting the array to smaller chunks is just marginally better, since the problem of shifting the tail remains, just the tail becomes smaller.

But the problem with the linearisation emerges - in which chunk to insert the current element given the 1D insertion index n? The chunk sizes must be extracted sequentially from n until n is smaller than the current one and that will be the 2-nd level index. Sounds optimal?

Back to the original question:

Quote

How can one optimise inserting an item into an array, anywhere but the end?

Here IMHO the most straightforward answer:

Quote

Don't use an array.

MarkMLl

Or with other words: "No one can."

The OP must give more details for the application in order to receive more beneficial answer. Other is just a speculation.
Being speculative, I would propose in turn the linear hashing (https://en.wikipedia.org/wiki/Linear_hashing).

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: Thaddy on August 14, 2022, 06:09:20 pm

Quote from: y.ivanov on August 14, 2022, 01:13:27 am

Quote from: Ten_Mile_Hike on August 14, 2022, 12:31:48 am
Is an "array of arrays" the same thing as a multidimensional array ... Asking for a friend %)
Now I can remember my old professor saying : "In the end it all boils down to a vector, be it to the vector of the main memory".

Strange professor, because a multiple dimension array has a linear layout in memory, same as a one dimensional array, so unless you sort, it is O(N) and the sort is costly. Basics.

In the case at hand, one should assume O(N). It is impossible to improve on that.

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: alpine on August 14, 2022, 07:09:46 pm

Quote from: Thaddy on August 14, 2022, 06:09:20 pm

Quote from: y.ivanov on August 14, 2022, 01:13:27 am
Quote from: Ten_Mile_Hike on August 14, 2022, 12:31:48 am
Is an "array of arrays" the same thing as a multidimensional array ... Asking for a friend %)
Now I can remember my old professor saying : "In the end it all boils down to a vector, be it to the vector of the main memory".
Strange professor, because a multiple dimension array has a linear layout in memory, same as a one dimensional array, so unless you sort, it is O(N) and the sort is costly. Basics.

In the case at hand, one should assume O(N). It is impossible to improve on that.

Why strange? Isn't it the same statement? Or maybe I didn't wrote it well in English %)

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: MarkMLl on August 14, 2022, 10:30:50 pm

Quote from: Thaddy on August 14, 2022, 06:09:20 pm

In the case at hand, one should assume O(N). It is impossible to improve on that.

That only applies if the insertion point is randomly-distributed: if it's predominantly near the end the workload will be proportionately less.

And for that matter, if it's predominantly near the beginning the obvious hack will also make it predominantly less.

MarkMLl

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: SymbolicFrank on August 15, 2022, 02:05:05 pm

If you want both fast inserts and fast retrieval, you can do it like a database. You use chunks that are an integer amount times the max size, a linked list to add new ones and an index to find the one you like.

To insert data: see if there is enough room left in the last chunk, otherwise just allocate a new one and put it in there. Do this until everything is stored. Now add the entries to the index.

The index can be a simple array, or you could use chunks as well. That gives you an index of the index, where only the blocks are sorted. And that is what you want for fast access.

And, of course, you can make a binary tree of the index.

Title: Re: How can one optimise inserting an item into an array, anywhere but the end?
Post by: BrunoK on August 20, 2022, 07:47:10 pm

I have created a TCountedTree = class(TAVLTree)
The joined graphic shows that using a Tree for insertion, even if suffering from the overhead of having node data instances, is much faster than inserting to a TFPList.
I didn't put the line for the total timing of TFPList because it shoots out of the graphic.
For the same 400'000 items, it takes a total of 10.170 seconds to insert to a FPList to compare to 325.023 ms for inserting to a balanced tree.

Lazarus

Programming => General => Topic started by: Gustavo 'Gus' Carreno on August 11, 2022, 09:40:22 pm