
Author Topic: Concat strings in parallel  (Read 1746 times)

LemonParty

  • Sr. Member
  • ****
  • Posts: 362
Concat strings in parallel
« on: September 09, 2025, 11:21:54 pm »
Hello.

I want to concatenate a TStringArray using multiple threads. I don't know how to send a signal from a worker thread to the main thread that the current string has been processed (so that the next string can start processing). How can I achieve that?
Lazarus v. 4.99. FPC v. 3.3.1. Windows 11

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 11812
  • Debugger - SynEdit - and more
    • wiki
Re: Concat strings in parallel
« Reply #1 on: September 10, 2025, 12:19:21 am »
I'm not sure exactly how you are planning to do that... But are you sure the bottleneck is CPU time?
You still need to copy all the text from different RAM locations into one contiguous destination. So there would be some constraints there.

Also, have you measured what takes the most time in your app? Getting the individual strings into the TStringArray, or copying the final result into one big memory chunk (not counting the time for allocating it)?

And then, in the end, there is potentially freeing the array and all the strings in it (all via ref count...). And even if you set the strings to nil in different threads, the memory manager will probably have some critical sections that it needs to enter...
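For measuring, a single-threaded baseline helps. The following is only a minimal sketch, not code from the original project: the function name ConcatAll is hypothetical. It computes the total length first, allocates the destination once, and then copies each part into place with Move.

```pascal
program ConcatBaseline;
{$mode objfpc}{$H+}

uses
  SysUtils;

// Hypothetical helper: joins all parts using a single allocation.
function ConcatAll(const Parts: TStringArray): string;
var
  i: Integer;
  Total: SizeInt;
  Dest: PChar;
begin
  // First pass: total length, so the destination is allocated only once.
  Total := 0;
  for i := 0 to High(Parts) do
    Inc(Total, Length(Parts[i]));

  Result := '';
  if Total = 0 then
    Exit;
  SetLength(Result, Total);

  // Second pass: copy each part to its place in the destination.
  Dest := @Result[1];
  for i := 0 to High(Parts) do
    if Length(Parts[i]) > 0 then
    begin
      Move(Pointer(Parts[i])^, Dest^, Length(Parts[i]));
      Inc(Dest, Length(Parts[i]));
    end;
end;

begin
  WriteLn(ConcatAll(TStringArray.Create('foo', 'bar', '', 'baz')));
end.
```

Only the second pass could be divided among threads, so timing the two passes separately shows how much there is to gain at all.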

---

I have done some similar work, building extremely large strings from thousands of tiny fragments. The time saving isn't that big, because the work to produce the strings takes most of the time.

See TStringBuilderPart, used in unit IdeDebuggerWatchResPrinter.

It's a nested list, and it works best if you know for each part/section how many parts it needs to contain, so you can allocate in advance.

Of course, setting each part still needs to do string refcounting. That could be further optimized.

Martin_fr

Re: Concat strings in parallel
« Reply #2 on: September 10, 2025, 12:23:02 am »
Also, if you do something in threads, every bit of communication between threads takes time of its own... So you need good savings to make up for that.

I don't know which communication mechanisms are fastest; that probably depends on the underlying system / OS, maybe even on its version.

440bx

  • Hero Member
  • *****
  • Posts: 5814
Re: Concat strings in parallel
« Reply #3 on: September 10, 2025, 12:50:20 am »
As @Martin_fr implied, it is unlikely a multi-threaded implementation of such string concatenation will lead to a result in less time.

Leaving any synchronization between the worker threads and the main thread aside, just creating the threads will take a significant amount of time relative to the time required to move bytes from one place to another.  IOW, in the time it takes to create one thread, it's likely that several hundred strings could have been concatenated; thread creation alone therefore slows the process down to an extent that may not be "recoverable".

Other problems come to mind, but the one mentioned above is enough to dispose of that idea.  OTOH, academically, it is an interesting exercise.  From a performance viewpoint, it looks like it will be a case of "too many cooks in the kitchen".

HTH.
FPC v3.2.2 and Lazarus v4.0rc3 on Windows 7 SP1 64bit.

d2010

  • Full Member
  • ***
  • Posts: 230
Re: Concat strings in parallel
« Reply #4 on: September 10, 2025, 04:27:17 am »
In my opinion, concatenating AnsiString or String values in parallel is a bad idea, because you cannot use multiple threads on the memory heap, and AnsiString concatenation touches the heap too many times. :D (e.g. memory leaks).

LemonParty

Re: Concat strings in parallel
« Reply #5 on: September 10, 2025, 09:20:06 am »
I sketched this code to check the concept.
I can't understand why the TMaster.OnTerminate event is not triggered.

Thaddy

  • Hero Member
  • *****
  • Posts: 18344
  • Here stood a man who saw the Elbe and jumped it.
Re: Concat strings in parallel
« Reply #6 on: September 10, 2025, 10:07:29 am »
Code: Pascal
procedure TConcatenationThread.Execute;
begin
  Move(FString[1], FDestination^, Length(FString));
  FString := '';
  Terminate; // <---- is very wrong, skip it...
end;
You already free the thread automatically because you set FreeOnTerminate to True (which is usually not a good idea). Once the task is done, the thread will terminate by itself. You can even cause a SIGSEGV or a deadlock there.
That is because termination may (and will) kick in before or after you call Terminate explicitly... The thread is already terminating, because its task is done.

In effect you call terminate twice...

I wonder if this - when corrected - leads to speed gains, btw.
« Last Edit: September 10, 2025, 10:25:00 am by Thaddy »
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

Martin_fr

Re: Concat strings in parallel
« Reply #7 on: September 10, 2025, 10:53:18 am »
Code: Pascal
Terminate;
This just sets a flag, by which the caller lets the thread know that it should (voluntarily) "stop" (return from the "Execute" method). It just sets a variable (field).

So your Execute method would need to check the flag and then exit => of course, since you aren't running a loop in the Execute method, you will return anyway. So you don't need to check.
=> And you don't need to signal it either; in any case, the signal via "Terminate" is meant to be given from "outside the thread".

OnTerminate will be called in the main thread, when
- the thread has terminated
- the terminated thread has been "collected" => for that you need to call "WaitFor".
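As a rough illustration of the points above, here is a minimal console sketch. It is not the original attachment: TCopyThread and the fields of TMaster are hypothetical stand-ins. Execute simply returns without calling Terminate, FreeOnTerminate is left off, and the main thread calls WaitFor so that the OnTerminate handler gets a chance to run there.

```pascal
program OnTerminateDemo;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils;

type
  // Hypothetical worker: copies one string into a caller-owned buffer.
  TCopyThread = class(TThread)
  private
    FSource: string;
    FDest: PChar;
  protected
    procedure Execute; override;
  public
    constructor Create(const ASource: string; ADest: PChar);
  end;

  // Hypothetical owner; OnTerminate needs a method, not a plain procedure.
  TMaster = class
  public
    Done: Boolean;
    procedure ThreadDone(Sender: TObject);
  end;

constructor TCopyThread.Create(const ASource: string; ADest: PChar);
begin
  inherited Create(True);     // create suspended, started later with Start
  FreeOnTerminate := False;   // the creator frees the thread after WaitFor
  FSource := ASource;
  FDest := ADest;
end;

procedure TCopyThread.Execute;
begin
  if Length(FSource) > 0 then
    Move(Pointer(FSource)^, FDest^, Length(FSource));
  // No Terminate call here: returning from Execute ends the thread.
end;

procedure TMaster.ThreadDone(Sender: TObject);
begin
  Done := True;
end;

var
  Master: TMaster;
  Worker: TCopyThread;
  Buffer: string;
  WasNotified: Boolean;
begin
  Buffer := '';
  SetLength(Buffer, 5);
  Master := TMaster.Create;
  Worker := TCopyThread.Create('hello', @Buffer[1]);
  Worker.OnTerminate := @Master.ThreadDone;
  Worker.Start;
  Worker.WaitFor;             // main thread waits and "collects" the worker
  WasNotified := Master.Done;
  WriteLn('Buffer = ', Buffer, ', OnTerminate called = ', WasNotified);
  Worker.Free;
  Master.Free;
end.
```

In a GUI application the main thread should not block in WaitFor for long; this sketch only shows the order of events.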

Martin_fr

Re: Concat strings in parallel
« Reply #8 on: September 10, 2025, 10:58:26 am »
So, once you have solved the "Terminate" issues ...

Your "OnTerminate" sets the thread's FDestination and FString to new values => but the thread has stopped, so it will not do anything with them.  You would need to start the thread again => very time consuming.

Instead you want to do a loop inside
  procedure TConcatenationThread.Execute;
But then you must "synchronize" for each finished item => again majorly expensive.

Though you could just use thread-safe interlocked statements, and get the next entry without waiting for the main thread to feed each thread. But that requires a lot of attention to detail.  Otherwise you get race conditions.
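A minimal sketch of the interlocked variant, under stated assumptions: TJob, TWorker and ParallelConcat are hypothetical names, the offsets are computed up front by the main thread, and the strings are not modified while the workers run. Each worker claims the next index with InterlockedIncrement, so the main thread does not have to feed it.

```pascal
program ConcatInterlocked;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils;

type
  // Shared job description; only Next is written while the workers run.
  TJob = record
    Parts: TStringArray;
    Offsets: array of SizeInt;  // where each part starts in Dest
    Dest: PChar;
    Next: LongInt;              // last claimed index, starts at -1
  end;
  PJob = ^TJob;

  TWorker = class(TThread)
  private
    FJob: PJob;
  protected
    procedure Execute; override;
  public
    constructor Create(AJob: PJob);
  end;

constructor TWorker.Create(AJob: PJob);
begin
  inherited Create(True);
  FreeOnTerminate := False;
  FJob := AJob;
end;

procedure TWorker.Execute;
var
  i: LongInt;
  P: PString;
begin
  while not Terminated do
  begin
    // Atomically claim the next index; no two workers get the same one.
    i := InterlockedIncrement(FJob^.Next);
    if i > High(FJob^.Parts) then
      Break;
    P := @FJob^.Parts[i];       // pointer only, the refcount is not touched
    if Length(P^) > 0 then
      Move(Pointer(P^)^, FJob^.Dest[FJob^.Offsets[i]], Length(P^));
  end;
end;

function ParallelConcat(const AParts: TStringArray; AThreads: Integer): string;
var
  Job: TJob;
  Workers: array of TWorker;
  i: Integer;
  Total: SizeInt;
begin
  Result := '';
  Job.Parts := AParts;
  SetLength(Job.Offsets, Length(AParts));
  Total := 0;
  for i := 0 to High(AParts) do
  begin
    Job.Offsets[i] := Total;
    Inc(Total, Length(AParts[i]));
  end;
  if Total = 0 then
    Exit;
  SetLength(Result, Total);
  Job.Dest := @Result[1];
  Job.Next := -1;

  if AThreads < 1 then
    AThreads := 1;
  SetLength(Workers, AThreads);
  for i := 0 to High(Workers) do
  begin
    Workers[i] := TWorker.Create(@Job);
    Workers[i].Start;
  end;
  for i := 0 to High(Workers) do
  begin
    Workers[i].WaitFor;         // Job must outlive all workers
    Workers[i].Free;
  end;
end;

begin
  WriteLn(ParallelConcat(
    TStringArray.Create('Hello', ', ', 'parallel', ' ', 'world'), 2));
end.
```

Whether this beats a plain loop depends on the size of the parts; with tiny strings the interlocked counter and the thread start-up are likely to dominate.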

Martin_fr

Re: Concat strings in parallel
« Reply #9 on: September 10, 2025, 11:02:04 am »
Code: Pascal
FString := FStringArray[CurrentIndexes[i]];

While that would need to be different anyway, due to the above considerations... This will increment the string's refcount. That is a (semi) thread-safe operation, so it takes a little extra time.

Since you know that the string is not changed while the thread is running, just pass a pointer to the string (PString).
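A small sketch of the difference, assuming the StringRefCount function from the FPC System unit is available in the compiler version used. Assigning the string to another variable changes the reference count, while taking a PString to the variable leaves it alone.

```pascal
program RefCountDemo;
{$mode objfpc}{$H+}

var
  S, Other: string;
  P: PString;
begin
  S := StringOfChar('x', 8);    // heap-allocated, reference-counted string
  WriteLn('fresh string:     ', StringRefCount(S));

  Other := S;                   // shares the data and bumps the count
  WriteLn('after assignment: ', StringRefCount(S));
  Other := '';                  // releases the extra reference again

  P := @S;                      // address of the variable, no refcount change
  WriteLn('after taking @S:  ', StringRefCount(S));
  WriteLn('length via P^:    ', Length(P^));
end.
```

The pointer is only safe as long as the string variable it points to stays alive and unchanged for the lifetime of the thread.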

Thaddy

Re: Concat strings in parallel
« Reply #10 on: September 10, 2025, 11:02:20 am »
Quote from: Martin_fr
    Terminate;
    This just sets a flag

You are confusing the Terminate method with the Terminated flag.

Martin_fr

Re: Concat strings in parallel
« Reply #11 on: September 10, 2025, 11:08:03 am »
Quote from: Thaddy
    Quote from: Martin_fr
        Terminate;
        This just sets a flag
    You confuse the terminate method with the flag terminated.

No, I don't.

I wrote: "sets a flag"

Here is the code (from the RTL):
Code: Pascal
procedure TThread.Terminate;
begin
  FTerminated := True;
  TerminatedSet;
end;

It does set FTerminated (the flag).

As for "terminated" that you mentioned: => that checks the flag (or "is the flag" to be exact).
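To show the flag in use, here is a minimal sketch with a hypothetical TLoopThread: the Execute loop polls Terminated, and the main thread calls Terminate from outside and then waits for the loop to notice.

```pascal
program TerminateFlagDemo;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils;

type
  // Hypothetical worker that keeps looping until it is asked to stop.
  TLoopThread = class(TThread)
  protected
    procedure Execute; override;
  public
    Iterations: Int64;
  end;

procedure TLoopThread.Execute;
begin
  // Terminated only reads the flag; the loop has to poll it itself.
  while not Terminated do
  begin
    Inc(Iterations);
    Sleep(1);
  end;
end;

var
  T: TLoopThread;
  Count: Int64;
begin
  T := TLoopThread.Create(False);  // not suspended, starts right away
  Sleep(50);
  T.Terminate;                     // from outside: sets the flag, nothing more
  T.WaitFor;                       // returns after Execute saw the flag and left
  Count := T.Iterations;
  T.Free;
  WriteLn('Loop iterations before stop: ', Count);
end.
```

Without the check in the loop, Terminate would have no visible effect and WaitFor would block indefinitely.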

Martin_fr

Re: Concat strings in parallel
« Reply #12 on: September 10, 2025, 11:11:08 am »
Off topic

I noticed you set the debugger for the project to GDB. Is there a reason? Was there a particular issue that FpDebug could not solve for this? (Unless you are running Windows on ARM)

LemonParty

Re: Concat strings in parallel
« Reply #13 on: September 10, 2025, 11:37:45 am »
No specific reason. I just picked the first option and this was GDB. I will try FpDebug.

Warfley

  • Hero Member
  • *****
  • Posts: 2021
Re: Concat strings in parallel
« Reply #14 on: September 10, 2025, 11:39:56 am »
You don't. Signaling is way too expensive to use per list entry, unless you are talking about gigabytes of strings. Instead, split the array up into regions (e.g. index 1..10, 11..20 etc.) and have the algorithm work on those in parallel.

Also note that strings are inherently not thread safe, so never touch the same string in different threads.
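The region split could look roughly like the sketch below. TRegionThread and ConcatInRegions are hypothetical names, and the code assumes that nobody changes the strings while the workers run. Every worker gets one contiguous index range plus the address where that range starts in the result, so no signaling per entry is needed and no string is touched by two threads.

```pascal
program ConcatRegions;
{$mode objfpc}{$H+}

uses
  {$ifdef unix}cthreads,{$endif}
  Classes, SysUtils;

type
  // Each worker owns one contiguous index range FFirst..FLast.
  TRegionThread = class(TThread)
  private
    FParts: TStringArray;
    FFirst, FLast: Integer;
    FDest: PChar;               // where this region starts in the result
  protected
    procedure Execute; override;
  public
    constructor Create(const AParts: TStringArray;
      AFirst, ALast: Integer; ADest: PChar);
  end;

constructor TRegionThread.Create(const AParts: TStringArray;
  AFirst, ALast: Integer; ADest: PChar);
begin
  inherited Create(True);
  FreeOnTerminate := False;
  FParts := AParts;             // one array reference per thread, not per string
  FFirst := AFirst;
  FLast := ALast;
  FDest := ADest;
end;

procedure TRegionThread.Execute;
var
  i: Integer;
  P: PChar;
begin
  P := FDest;
  for i := FFirst to FLast do
    if Length(FParts[i]) > 0 then
    begin
      Move(Pointer(FParts[i])^, P^, Length(FParts[i]));
      Inc(P, Length(FParts[i]));
    end;
end;

function ConcatInRegions(const AParts: TStringArray; AThreads: Integer): string;
var
  Workers: array of TRegionThread;
  Offsets: array of SizeInt;
  Base: PChar;
  i, N, First, Last: Integer;
  Total: SizeInt;
begin
  Result := '';
  N := Length(AParts);
  SetLength(Offsets, N);
  Total := 0;
  for i := 0 to N - 1 do
  begin
    Offsets[i] := Total;
    Inc(Total, Length(AParts[i]));
  end;
  if Total = 0 then
    Exit;
  SetLength(Result, Total);
  Base := @Result[1];

  // Never use more threads than parts, and at least one.
  if AThreads > N then AThreads := N;
  if AThreads < 1 then AThreads := 1;
  SetLength(Workers, AThreads);
  for i := 0 to AThreads - 1 do
  begin
    First := (i * N) div AThreads;          // even split, no empty regions
    Last := ((i + 1) * N) div AThreads - 1;
    Workers[i] := TRegionThread.Create(AParts, First, Last, Base + Offsets[First]);
    Workers[i].Start;
  end;
  for i := 0 to AThreads - 1 do
  begin
    Workers[i].WaitFor;
    Workers[i].Free;
  end;
end;

begin
  WriteLn(ConcatInRegions(
    TStringArray.Create('a', 'bb', 'ccc', 'dddd', 'eeeee'), 2));
end.
```

Splitting by index count is the simplest choice; if the parts differ a lot in size, splitting by accumulated byte offset would balance the work better.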
« Last Edit: September 10, 2025, 11:41:55 am by Warfley »
