
Author Topic: Mtprocs: computing with many threads and garbage results  (Read 837 times)

daringly

  • New member
  • *
  • Posts: 30
Mtprocs: computing with many threads and garbage results
« on: March 24, 2017, 04:55:13 pm »
I have a series of computations that I'd like to run in parallel to speed it up.

The program looks something like this:

uses
  MTProcs;

type
  InputData = record
    ....
  end;

var
  InputD: array[1..10000] of InputData;
  OutputD: array[1..10000] of Real;

procedure AnalyzeIt(Index: PtrInt; Data: Pointer; Item: TMultiThreadProcItem);
// This does computations using data from InputD[Index] and stores the output in OutputD[Index].
// The routine calls lots of other global procedures and functions, none of which use global
// variables except InputD and OutputD, and then only the array element [Index].

begin
  // other routines build up input data
  ProcThreadPool.DoParallel(@AnalyzeIt, 1, 10000, nil);
  // other routines store the output somewhere

The program runs, the data is saved, and it is garbage. If you run it twice, you get different results for each data point (and there are no random calls or changing inputs). Clearly I am missing something big picture here.

Any suggestions?

If I have a global function Add(x,y) for example, and Add gets called in two threads at the same time, I am assuming that Add is run locally in each thread. Is that incorrect?


Phil

  • Hero Member
  • *****
  • Posts: 2043
Re: Mtprocs: computing with many threads and garbage results
« Reply #1 on: March 24, 2017, 05:34:35 pm »
daringly wrote: "Any suggestions?"

Does it run okay in a single thread?

Maybe post a simple example project here that doesn't work in multiple threads.

Make sure you don't have any other collisions between threads, for example with temporary file names, etc.
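One way to act on this advice is to compute a reference result with a plain loop and then compare it, item by item, with what the thread pool produces. The sketch below assumes the MTProcs unit from the Lazarus multithreading package is available; `Compute`, `TResults`, `Reference` and `ParallelOut` are hypothetical names, and `Compute` is only a stand-in for the real per-item calculation.

```pascal
program CompareRuns;
{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  MTProcs;

const
  N = 10000;

type
  TResults = array[1..N] of Int64;
  PResults = ^TResults;

var
  Reference, ParallelOut: TResults;

// Hypothetical stand-in for the real per-item computation.
function Compute(Index: PtrInt): Int64;
begin
  Result := Int64(Index) * 3 + 1;
end;

// Called by the pool once per index; Data points at the array to fill.
procedure AnalyzeIt(Index: PtrInt; Data: Pointer; Item: TMultiThreadProcItem);
begin
  PResults(Data)^[Index] := Compute(Index);
end;

var
  i, Bad: Integer;
begin
  // Reference run: plain loop, no threads involved.
  for i := 1 to N do
    Reference[i] := Compute(i);

  // Parallel run over the same indices.
  ProcThreadPool.DoParallel(@AnalyzeIt, 1, N, @ParallelOut);

  // Any difference points at state shared between threads.
  Bad := 0;
  for i := 1 to N do
    if ParallelOut[i] <> Reference[i] then
      Inc(Bad);
  WriteLn('items that differ from the serial run: ', Bad);
end.
```

If the count is nonzero with the real computation plugged in, the indices that differ are a good place to start looking for a routine that touches shared state.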

daringly

  • New member
  • *
  • Posts: 30
Re: Mtprocs: computing with many threads and garbage results
« Reply #2 on: March 24, 2017, 06:02:06 pm »
It runs fine as a single thread (although 10% slower than if the computations are not run as a thread).

Thaddy

  • Hero Member
  • *****
  • Posts: 3434
Re: Mtprocs: computing with many threads and garbage results
« Reply #3 on: March 24, 2017, 08:14:25 pm »
If it is only 10% then there's probably something seriously wrong with your code.
It either doesn't lend itself very well to parallelization or the code is not written very well.
Can you give us a compilable example?

Phil

  • Hero Member
  • *****
  • Posts: 2043
Re: Mtprocs: computing with many threads and garbage results
« Reply #4 on: March 24, 2017, 09:10:18 pm »
daringly wrote: "It runs fine as a single thread (although 10% slower than if the computations are not run as a thread)."

I wouldn't bother with the additional complexity just to get a 10% speedup. (Although since you haven't gotten it to run correctly with threads yet, the actual difference may be greater or less than that when you do.)

I recall a rule of thumb I heard once: users won't even notice speed improvements of less than 20%.

Maybe focus on the code that takes the other 90% first?

daringly

  • New member
  • *
  • Posts: 30
Re: Mtprocs: computing with many threads and garbage results
« Reply #5 on: March 24, 2017, 09:40:11 pm »
Thaddy wrote: "If it is only 10% then there's probably something seriously wrong with your code. It either doesn't lend itself very well to parallelization or the code is not written very well. Can you give us a compilable example?"

Perhaps I was unclear.

If I run AnalyzeIt alone, without creating a thread, it takes about 6 minutes. If I create a single thread, it takes nearly 7 minutes.

If I run it with all my cores (eight), it takes less than 2 minutes, but the results are random, wrong, and impossible. For example, the output should always be between 0 and 1, and sometimes it is above 1.

This is a computational series I will run hundreds or even thousands of times, so I'm willing to spend 20 hours or more to get it right.
« Last Edit: March 24, 2017, 09:42:24 pm by daringly »

Cyrax

  • Sr. Member
  • ****
  • Posts: 495
Re: Mtprocs: computing with many threads and garbage results
« Reply #6 on: March 24, 2017, 11:32:31 pm »
You need to split your data and isolate each piece to one thread at a time. You can't modify data while another thread is processing it.

daringly

  • New member
  • *
  • Posts: 30
Re: Mtprocs: computing with many threads and garbage results
« Reply #7 on: March 25, 2017, 01:19:05 am »
Is it not enough to have thread one look only at array element [1] and thread two work only with element [2]? The index specifies which element of the output array to change, so no two threads should ever change the same element of the array... even if somehow all 10k ran at once.

HeavyUser

  • Full Member
  • ***
  • Posts: 104
Re: Mtprocs: computing with many threads and garbage results
« Reply #8 on: March 25, 2017, 06:26:26 am »
Erm, as everyone has said already, global variables must always be protected. Now, to answer two of your questions.
daringly wrote: "If I have a global function Add(x,y) for example, and Add gets called in two threads at the same time, I am assuming that Add is run locally in each thread. Is that incorrect?"
Each thread has its own stack, so every call to the function gets its own parameters and local variables. As long as the function touches nothing outside that stack, you are safe.
daringly wrote: "It's not enough to have thread one look only at array element [1], thread two to only work with element[2]? The index specifies which element of the output array to change, so no two threads should ever change the same element of the array... even if somehow all 10k ran at once."
If the threads stay in their own confined space, then no, it should not be a problem. The moment a thread's space changes from element 1 to element 2, you have a problem unless you use a lock.
The underlying problem is that you cannot know where each thread is at any given point. You can't rely on timing, because you do not know how long a thread has been running or how far through the code it got before it was switched out. In theory your design should work, as long as nothing else (the OS thread scheduler, for example) gets in your way. Locks and events are not there to guarantee that nothing bad happens to your data; they are there to help you synchronize access to it. Without them, there is no way to know whether what you think is happening is really happening.
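To make the point about staying inside the thread's own stack concrete, here is a minimal sketch contrasting a helper that keeps its intermediate value in a local variable with one that parks it in a global. It assumes the MTProcs unit is installed; `SquareSafe`, `SquareUnsafe` and `Scratch` are hypothetical names invented for the illustration, and integer data is used only to keep the example simple.

```pascal
program SharedScratchDemo;
{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  MTProcs;

const
  N = 10000;

var
  InputD: array[1..N] of Integer;
  OutputD: array[1..N] of Int64;
  Scratch: Int64;  // hypothetical hidden global, shared by every thread

// Unsafe: the intermediate value lives in a global, so another thread
// can overwrite it between the write and the reads below.
function SquareUnsafe(x: Integer): Int64;
begin
  Scratch := x;
  Result := Scratch * Scratch;
end;

// Safe: the intermediate value is a local, so it lives on the stack of
// whichever thread made the call.
function SquareSafe(x: Integer): Int64;
var
  Tmp: Int64;
begin
  Tmp := x;
  Result := Tmp * Tmp;
end;

procedure AnalyzeIt(Index: PtrInt; Data: Pointer; Item: TMultiThreadProcItem);
begin
  // Each index reads and writes only its own array elements.
  OutputD[Index] := SquareSafe(InputD[Index]);
end;

var
  i: Integer;
begin
  for i := 1 to N do
    InputD[i] := i;
  ProcThreadPool.DoParallel(@AnalyzeIt, 1, N, nil);
  WriteLn('OutputD[', N, '] = ', OutputD[N]);
end.
```

Swapping `SquareSafe` for `SquareUnsafe` inside `AnalyzeIt` introduces a data race of the kind that can produce results that change from run to run, even though every thread still writes only to its own `OutputD[Index]`.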

daringly

  • New member
  • *
  • Posts: 30
Re: Mtprocs: computing with many threads and garbage results
« Reply #9 on: March 25, 2017, 01:11:27 pm »
Thanks HeavyUser.  This was the concept I was completely missing.

How would you implement this?

HeavyUser

  • Full Member
  • ***
  • Posts: 104
Re: Mtprocs: computing with many threads and garbage results
« Reply #10 on: March 25, 2017, 10:15:18 pm »
daringly wrote: "Thanks HeavyUser. This was the concept I was completely missing. How would you implement this?"

I would start with a single critical section, but since your design is one thread per value, I would create a critical section for each cell in the array. A thread locks the section for the cell it is working on before it starts and releases it when the job is done. Any other thread that wants to work with that value calls EnterCriticalSection and either gets the lock or waits until the thread holding it releases it. From there I would build up to more refined control if needed, e.g. locking only while you read or write the value rather than for the entire calculation.
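A minimal sketch of the per-cell locking described above, in its refined form where the lock is held only while the value is written. It uses the critical-section routines from the Free Pascal System unit and assumes MTProcs is installed; `CellLock` is a hypothetical name, and the arithmetic is a placeholder for the real calculation.

```pascal
program PerCellLockDemo;
{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  MTProcs;

const
  N = 10000;

var
  OutputD: array[1..N] of Int64;
  CellLock: array[1..N] of TRTLCriticalSection;  // one lock per cell

procedure AnalyzeIt(Index: PtrInt; Data: Pointer; Item: TMultiThreadProcItem);
var
  Value: Int64;
begin
  // Do the expensive work on locals, outside the lock.
  Value := Int64(Index) * Index;

  // Hold the cell's lock only for the write itself.
  EnterCriticalSection(CellLock[Index]);
  try
    OutputD[Index] := Value;
  finally
    LeaveCriticalSection(CellLock[Index]);
  end;
end;

var
  i: Integer;
begin
  // Locks must be initialized before any thread uses them.
  for i := 1 to N do
    InitCriticalSection(CellLock[i]);

  ProcThreadPool.DoParallel(@AnalyzeIt, 1, N, nil);

  // Release the locks once all work has finished.
  for i := 1 to N do
    DoneCriticalSection(CellLock[i]);

  WriteLn('OutputD[', N, '] = ', OutputD[N]);
end.
```

The locks only protect the cells themselves. Any other state shared between the routines that `AnalyzeIt` calls would need its own protection, or would need to be moved into locals.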

 
