Author Topic: Best practices to use tRtlCriticalSection  (Read 1398 times)

jollytall

  • Sr. Member
  • ****
  • Posts: 366
Best practices to use tRtlCriticalSection
« on: November 30, 2023, 08:54:34 am »
I have a program with a number (~10) of threads. Every variable that can be accessed by multiple threads (a "shared variable") has its reads and writes protected with a tRtlCriticalSection, and it works fine. Still, it raised some questions for me. The first two have clear yes/no answers; the third is more a matter of opinion, taste and implementation, but I am interested in that one as well.

Question 1 - Do I need to protect every read?
E.g. I have two threads, A and B, and both access a variable V (let it be a large record for the example). A only reads it (Va := V;), while B both reads and writes it (Vb := V; and V := Vb;). Currently I protect all three operations, but I am wondering whether Vb := V; needs to be protected at all.
It is clear that V := Vb; needs to be protected, otherwise Va := V; might read a half-updated, inconsistent record. That protection only makes sense if Va := V; is protected with the same critical section.
However, do I need to protect Vb := V;? Thread A never updates V, so if A and B read V at the same time, that should not be a problem. B cannot read and write V at the same time, so there is no conflict there either. So why would I protect B reading V (other than to be future-proof, should I change my mind and let A write V as well, etc.)?

Question 2 - Do I need to protect every V?
In Q1, V was a large record. But what if SizeOf(V) is less than or equal to the processor word size, e.g. a boolean or a native integer type? I would assume that V := Vb; is such a simple processor operation that it cannot happen (even theoretically, not merely because it is "unlikely") that A starts reading while V is half written. E.g. on a 64-bit system with V being an Int64, can it happen that the first two bytes are already updated when A reads the whole variable? I cannot test this, because even if it is theoretically possible, it is so unlikely...
What if V is a 32-bit integer on a 64-bit system? What about larger things, e.g. a record or an AnsiString?

Question 3 - How many Critical Sections to declare?
Say I have V and W like above. Does it make sense to use CriticalSectionV and CriticalSectionW to protect the two separately? Obviously, if they are updated together and their consistency matters (e.g. the X and Y coordinates of a point), then they have to be updated, read and hence protected in one critical section.
But if they are totally independent, I can use two separate CSs, so that reading V does not block writing W. What should I consider when choosing the best option? I would assume something like this: if the operations are fast (just V := Va;), it is better to use one CS even for unrelated variables, so as not to make the whole program too complex and not to load the OS with many CSs to monitor. However, if the operations are large and long (e.g. reading a whole file from disk), it is better to use separate CSs, e.g. so that reading a variable is not blocked while a file is being written.
The same can be extended to arrays. E.g. I have an array of records. If the length of the array changes during operation, then I am sure we need one CS for the whole array (so that when B changes the length and the system moves the array in memory, A does not try to read it). But what if the array length is fixed? Would it make sense to include a tRtlCriticalSection field in the record declaration and hence protect every record independently? Is it technically doable, or are there cases where such an approach fails? If it is technically possible, how "expensive" is it to have hundreds of CSs? Is it worth it?
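On the per-record idea: a minimal sketch, assuming a fixed-size static array and the hypothetical names TSlot, Slots and TWorker (none of them come from this thread). The one assumption that matters is that a record holding an initialized critical section is never copied or moved afterwards, which is why the sketch avoids dynamic arrays and record assignment.

```pascal
program PerRecordLocks;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

const
  SlotCount   = 8;      // fixed array length (assumption)
  ThreadCount = 4;
  Iterations  = 50000;

type
  // Hypothetical record: every element carries its own lock.
  TSlot = record
    Lock: TRTLCriticalSection;
    Value: Int64;
  end;

  TWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  Slots: array[0..SlotCount - 1] of TSlot;

procedure TWorker.Execute;
var
  n, idx: Integer;
begin
  for n := 1 to Iterations do
  begin
    idx := n mod SlotCount;
    // Only this element is locked; the other elements stay available.
    EnterCriticalSection(Slots[idx].Lock);
    try
      Inc(Slots[idx].Value);
    finally
      LeaveCriticalSection(Slots[idx].Lock);
    end;
  end;
end;

var
  Workers: array[1..ThreadCount] of TWorker;
  i: Integer;
  Total: Int64;
begin
  // Initialize each lock in place; the records are never copied afterwards.
  for i := 0 to SlotCount - 1 do
  begin
    InitCriticalSection(Slots[i].Lock);
    Slots[i].Value := 0;
  end;
  for i := 1 to ThreadCount do
    Workers[i] := TWorker.Create(False);
  for i := 1 to ThreadCount do
  begin
    Workers[i].WaitFor;
    Workers[i].Free;
  end;
  Total := 0;
  for i := 0 to SlotCount - 1 do
  begin
    Inc(Total, Slots[i].Value);
    DoneCriticalSection(Slots[i].Lock);
  end;
  WriteLn('Total increments: ', Total);
  // Self-check: no increment may be lost.
  if Total <> Int64(ThreadCount) * Iterations then
    Halt(1);
end.
```

Whether hundreds of such locks pay off depends on how much contention there really is; for short assignments a single shared CS may well be good enough.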

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 859
Re: Best practices to use tRtlCriticalSection
« Reply #1 on: November 30, 2023, 09:34:49 am »
It's about atomic operations. A thread shouldn't access global state while another thread has left it in some intermediate state. A simple example:
Code: Pascal
  Dec(RefCount);
  // another thread may change RefCount right here
  if RefCount = 0 then Destroy;
RefCount can be changed by another thread between the "Dec" and the "if", which can produce unpredictable results. Therefore we have to make sure that RefCount stays the same between these two operations, so we guard them with a critical section.
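A minimal sketch of that guard, assuming the hypothetical names RefLock, ReleaseRef, TOwner and DestroyCalls (the counter only stands in for the Destroy call). The decrement and the test sit inside the same critical section, so no other thread can slip in between them.

```pascal
program RefCountGuard;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

const
  OwnerCount = 4;   // hypothetical number of threads holding a reference

type
  TOwner = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  RefLock: TRTLCriticalSection;
  RefCount: Integer = OwnerCount;
  DestroyCalls: Integer = 0;   // stands in for Destroy

procedure ReleaseRef;
begin
  EnterCriticalSection(RefLock);
  try
    // Decrement and test form one indivisible step under the lock.
    Dec(RefCount);
    if RefCount = 0 then
      Inc(DestroyCalls);
  finally
    LeaveCriticalSection(RefLock);
  end;
end;

procedure TOwner.Execute;
begin
  ReleaseRef;
end;

var
  Owners: array[1..OwnerCount] of TOwner;
  i: Integer;
begin
  InitCriticalSection(RefLock);
  for i := 1 to OwnerCount do
    Owners[i] := TOwner.Create(False);
  for i := 1 to OwnerCount do
  begin
    Owners[i].WaitFor;
    Owners[i].Free;
  end;
  DoneCriticalSection(RefLock);
  WriteLn('Destroy calls: ', DestroyCalls);
  // Self-check: exactly one thread may observe the count reaching zero.
  if DestroyCalls <> 1 then
    Halt(1);
end.
```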

So just think about which operations have to be atomic. But don't overuse synchronization. Remember that if two operations can never execute at the same time because of synchronization, then you don't even need two threads in the first place.
« Last Edit: November 30, 2023, 10:11:07 am by Mr.Madguy »
Is it healthy for project not to have regular stable releases?
Just for fun: Code::Blocks, GCC 13 and DOS - is it possible?

runewalsh

  • Jr. Member
  • **
  • Posts: 85
Re: Best practices to use tRtlCriticalSection
« Reply #2 on: November 30, 2023, 10:11:55 am »
Quote from: jollytall on November 30, 2023, 08:54:34 am
I am wondering if Vb := V; needs to be protected at all.
If B is the only thread that writes it, then of course it can read it without entering the CS. Also, if your variable is effectively read-only (e.g. is set up before the first reader starts, and is not written since then), it can be read without any synchronization.

Quote from: jollytall on November 30, 2023, 08:54:34 am
Do I need to protect every V?
Reads and writes of native types that are pointer-sized or smaller (pointer, Ptr(U)Int, or (u)intNN) and properly aligned (which is the default) are always atomic, i.e. if the old value was 0 and you write $FFFFFFFF from thread A, then the only thing thread B can see is the value changing from 0 to $FFFFFFFF at some point. (I don’t think there is such a guarantee for same-sized records, even correctly aligned ones.) With barriers, you can get more guarantees: let variables X = Y = 0; thread A sets X to $FFFFFFFF, issues a WriteBarrier, then sets Y to $EEEEEEEE. If thread B sees Y = $EEEEEEEE and then issues a ReadBarrier, it is guaranteed to see X = $FFFFFFFF (without the barriers it can still see X = 0). Entering and leaving a CS issues these barriers automatically.

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 859
Re: Best practices to use tRtlCriticalSection
« Reply #3 on: November 30, 2023, 10:30:25 am »
Quote from: runewalsh on November 30, 2023, 10:11:55 am
Reads and writes of (properly aligned, which is by default) native types, pointer-sized or smaller: pointer, Ptr(U)Int, or (u)intNN, are always atomic [...]
The problem is that this is true for the single-core situation only. On a single core, execution can only be interrupted between two CPU instructions, so a single instruction can always be assumed to be atomic. But I'm not sure about the multi-core situation. I guess it depends on the memory controller implementation, and we can't assume that a CPU instruction is atomic there, at least not without the LOCK prefix, which should lock the cache and memory bus for the duration of the instruction.

Overall there are three cases here:
1) Special atomic CPU instructions, like CMPXCHG, are guaranteed to be atomic on a single core.
2) The LOCK prefix causes a cache/memory bus lock and guarantees that the instruction is atomic in the multi-core situation.
3) Multi-instruction atomic operations require critical sections.

Thaddy

  • Hero Member
  • *****
  • Posts: 16200
  • Censorship about opinions does not belong here.
Re: Best practices to use tRtlCriticalSection
« Reply #4 on: November 30, 2023, 10:42:52 am »
If the global variables are simple types, it is recommended to use the InterlockedXXX functions, since they do not have the overhead of critical sections.
One restriction is strings, but that should be obvious from the documentation of the interlocked functions.

Interlocked usually means, depending on the platform, atomic operations; they are always thread-safe and cross-platform.
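For a plain shared counter, this could look like the following minimal sketch. It assumes the hypothetical names Counter and TWorker and uses InterlockedIncrement from the System unit on a LongInt; no critical section is involved.

```pascal
program InterlockedCounter;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

const
  ThreadCount = 4;
  Iterations  = 100000;

type
  TWorker = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  Counter: LongInt = 0;   // shared by all workers

procedure TWorker.Execute;
var
  n: Integer;
begin
  for n := 1 to Iterations do
    // Atomic read-modify-write; a plain Inc(Counter) could lose updates.
    InterlockedIncrement(Counter);
end;

var
  Workers: array[1..ThreadCount] of TWorker;
  i: Integer;
begin
  for i := 1 to ThreadCount do
    Workers[i] := TWorker.Create(False);
  for i := 1 to ThreadCount do
  begin
    Workers[i].WaitFor;
    Workers[i].Free;
  end;
  WriteLn('Counter: ', Counter);
  // Self-check: every increment must have been counted.
  if Counter <> ThreadCount * Iterations then
    Halt(1);
end.
```

This only covers single integer-sized variables; a group of values that must stay consistent with each other still needs a lock.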
« Last Edit: November 30, 2023, 10:50:58 am by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

jollytall

  • Sr. Member
  • ****
  • Posts: 366
Re: Best practices to use tRtlCriticalSection
« Reply #5 on: November 30, 2023, 10:52:38 am »
Thanks runewalsh,
It is very clear for Q1 and Q2 (before Madguy's second reply). Regarding the read and write barriers you mentioned, do they exist in FPC without using critical sections?
I am still wondering about Q3.

Thanks, Mr.Madguy,
Your first reply is very clear. I would not make that mistake.

Also thanks for your second reply, and for clarifying that in Q2 it is still better to use access control, as (a) nowadays most systems are multi-core (or at least hyper-threaded, which might behave one way or the other?) and (b) I can never be sure which processor instructions are used to implement a Pascal statement. So, to be on the safe side, my reading is that I should use access control even for the simplest variable access.

However, I am not sure I agree with the last sentence of your first reply. In my case, e.g., I need to read some data (from a solar inverter) at a certain time, to be in sync with another data source (an electricity meter) that delivers data only at certain times through a serial port. At the same time I process the data in a loop (switching devices on and off) and also read and display it through a web server (so I can check what is happening).
I need synchronization so as not to process a half-ready record, but I cannot do everything in one thread, otherwise I could not serve web requests at any time (i.e. wait for a connection request). Or I would need to wait for and accept connections with a very short timeout, to be able to read the serial data when it is due, and then try to wait for/accept a connection again.
It would also mean programming a general web server around the synchronization issues of one special case. It is/was much easier to write and use a general web server (which I wrote years ago) and only synchronize when it picks up the data from the new modules.
On the other hand, once a record has been read at the right time, it is no problem to wait for a critical section before the data (read at the right, synchronized time) is copied to the system. So there definitely are use cases where it makes sense (otherwise it would not have been invented, I guess).

Thanks Thaddy,
I did not even know the InterlockedXXX functions existed. I have checked them now, and they really seem very useful for integers. For larger animals, like records, they cannot be used, or do I misunderstand it?

runewalsh

  • Jr. Member
  • **
  • Posts: 85
Re: Best practices to use tRtlCriticalSection
« Reply #6 on: November 30, 2023, 10:57:27 am »
Mr.Madguy, AFAIK all multithreaded platforms in existence offer the guarantees I described.

Quote from: jollytall on November 30, 2023, 10:52:38 am
Regarding the mentioned read- and writebarriers, do they exist in FPC, without using critical sections?

Yes: System.ReadBarrier, System.WriteBarrier, and the full System.ReadWriteBarrier. But working with barriers is more complex and error-prone than working with a CS (not to mention that most things are impossible with barriers alone and require a CS anyway). I mentioned them for general information, not as a guide to action.
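Purely as an illustration of the X/Y scenario described earlier in the thread, here is a minimal sketch using WriteBarrier and ReadBarrier. The names WriterSide, ReaderSide and SeenX are hypothetical, and the polling loop with ThreadSwitch is an assumption made to keep the example short; it is not a pattern to copy into production code.

```pascal
program BarrierDemo;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

type
  TWriter = class(TThread)
  protected
    procedure Execute; override;
  end;

  TReader = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  X: LongWord = 0;       // payload
  Y: LongWord = 0;       // flag announcing that X is ready
  SeenX: LongWord = 0;   // what the reader observed

procedure WriterSide;
begin
  X := $FFFFFFFF;
  WriteBarrier;          // the store to X is ordered before the store to Y
  Y := $EEEEEEEE;
end;

procedure ReaderSide;
begin
  // Poll the flag; ThreadSwitch yields the CPU between polls.
  while Y <> $EEEEEEEE do
    ThreadSwitch;
  ReadBarrier;           // the load of X is ordered after the load of Y
  SeenX := X;
end;

procedure TWriter.Execute;
begin
  WriterSide;
end;

procedure TReader.Execute;
begin
  ReaderSide;
end;

var
  Writer: TWriter;
  Reader: TReader;
begin
  Reader := TReader.Create(False);
  Writer := TWriter.Create(False);
  Writer.WaitFor;
  Reader.WaitFor;
  Writer.Free;
  Reader.Free;
  WriteLn('Reader saw the new X: ', SeenX = $FFFFFFFF);
  // Self-check: having seen the flag, the reader must also see the payload.
  if SeenX <> $FFFFFFFF then
    Halt(1);
end.
```

A critical section around both accesses would give the same ordering with far less room for mistakes.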

Thaddy

  • Hero Member
  • *****
  • Posts: 16200
  • Censorship about opinions does not belong here.
Re: Best practices to use tRtlCriticalSection
« Reply #7 on: November 30, 2023, 11:04:24 am »
I explicitly wrote that the Interlocked functions are for simple types, including floating point types, but excluding strings, records, classes etc.
Also note, as Runewalsh wrote above, that barriers are an option but are much more involved to implement, so when the need arises I would use a critical section, unless performance warrants barriers and the extra development cycles. But barriers give you control over specific cores on your CPU and the pipelining of the caches.

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 859
Re: Best practices to use tRtlCriticalSection
« Reply #8 on: November 30, 2023, 11:05:54 am »
Quote from: jollytall on November 30, 2023, 10:52:38 am
[...] I am not sure I agree with your last sentence in the first reply. [...] I did not even know InterlockedXXX exist. Now I checked, and they really seem very useful for integers. [...]
As far as I know, interlocked operations do exactly this: they simply add the LOCK prefix to other instructions.

In some cases synchronization is overused. For example, not only the read/write operations are guarded but the whole processing. That causes the operations to be executed serially, which defeats the whole reason for using threads.

Thaddy

  • Hero Member
  • *****
  • Posts: 16200
  • Censorship about opinions does not belong here.
Re: Best practices to use tRtlCriticalSection
« Reply #9 on: November 30, 2023, 11:20:36 am »
Depending on the CPU, the LOCK prefix is not even necessary. We tested that for KOL many years ago and removed most locks.
That applies to all read locks and some, but not all, write locks.

jollytall

  • Sr. Member
  • ****
  • Posts: 366
Re: Best practices to use tRtlCriticalSection
« Reply #10 on: November 30, 2023, 11:25:20 am »
Thaddy, you mention floating point types, but I do not find it as an option: https://www.freepascal.org/docs-html/rtl/system/interlockedexchange.html, only integer types.

440bx

  • Hero Member
  • *****
  • Posts: 4751
Re: Best practices to use tRtlCriticalSection
« Reply #11 on: November 30, 2023, 11:52:08 am »
@Jollytall,

I think the question you have to answer for yourself and the forum members is: should the thread that is read-only process the same record multiple times?

IOW, if the writer has not updated the record since the last time it was read and the reading thread reads the record again, what should the reading thread do? a.) wait until the writer updates the record, or b.) do something new/different with the unchanged record?

It's important to know that because depending on the answer, a critical section may not even be the correct synchronization object.

Just for the record, a critical section does not ensure threads take turns reading/writing.  Using a critical section allows the reader to read multiple times and the writer to write multiple times before the other thread has a chance to do whatever it's supposed to do (i.e, read or write.)
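If the reader is supposed to wait for fresh data (option a.), one possibility is an event next to the lock. A minimal sketch, assuming the RTL event routines from the System unit (RTLEventCreate, RTLEventSetEvent, RTLEventWaitFor, RTLEventDestroy) and the hypothetical names DataLock, NewData, Shared and Received:

```pascal
program EventDemo;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

type
  TProducer = class(TThread)
  protected
    procedure Execute; override;
  end;

  TConsumer = class(TThread)
  protected
    procedure Execute; override;
  end;

var
  DataLock: TRTLCriticalSection;
  NewData: PRTLEvent;        // signalled when Shared has been updated
  Shared: Integer = 0;
  Received: Integer = 0;

procedure Produce;
begin
  EnterCriticalSection(DataLock);
  try
    Shared := 42;            // arbitrary example value
  finally
    LeaveCriticalSection(DataLock);
  end;
  RTLEventSetEvent(NewData); // wake the waiting consumer
end;

procedure Consume;
begin
  RTLEventWaitFor(NewData);  // block until the producer signals
  EnterCriticalSection(DataLock);
  try
    Received := Shared;
  finally
    LeaveCriticalSection(DataLock);
  end;
end;

procedure TProducer.Execute;
begin
  Produce;
end;

procedure TConsumer.Execute;
begin
  Consume;
end;

var
  Producer: TProducer;
  Consumer: TConsumer;
begin
  InitCriticalSection(DataLock);
  NewData := RTLEventCreate;
  Consumer := TConsumer.Create(False);
  Producer := TProducer.Create(False);
  Producer.WaitFor;
  Consumer.WaitFor;
  Producer.Free;
  Consumer.Free;
  RTLEventDestroy(NewData);
  DoneCriticalSection(DataLock);
  WriteLn('Consumer received: ', Received);
  // Self-check: the consumer must have read the value written by the producer.
  if Received <> 42 then
    Halt(1);
end.
```

The critical section still protects the data itself; the event only tells the consumer when it is worth looking.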

HTH.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

Thaddy

  • Hero Member
  • *****
  • Posts: 16200
  • Censorship about opinions does not belong here.
Re: Best practices to use tRtlCriticalSection
« Reply #12 on: November 30, 2023, 12:24:47 pm »
Quote from: jollytall on November 30, 2023, 11:25:20 am
Thaddy, you mention floating point types, but I do not find it as an option: https://www.freepascal.org/docs-html/rtl/system/interlockedexchange.html, only integer types.
No, it is simply the width of a register that matters: such a variable needs no read locks and, depending on the CPU, not even write locks. So if a global fits in a register, there is no need for a critical section at all.

jollytall

  • Sr. Member
  • ****
  • Posts: 366
Re: Best practices to use tRtlCriticalSection
« Reply #13 on: November 30, 2023, 12:35:06 pm »
Quote from: 440bx on November 30, 2023, 11:52:08 am
I think the question you have to answer to yourself and the forum members is, should the thread that is read-only process the same record multiple times ?
In my case it is simply a status request, so it can read the same data multiple times. Actually, this is also one of the reasons for doing them in separate threads. The data-collecting thread collects the data as often as makes sense. The reader thread may read it as often as it wants without triggering expensive and useless hardware reads. E.g. my DSMR electricity meter delivers data every ten seconds, and reading the status of digital inputs (e.g. thermostats in the house) does not make sense more often than every x seconds.
During that time my automation system might query that data, and I might check it through a web interface, even multiple times (pressing F5 impatiently, which I do a lot :) ). Both the automation loop and the web server request the data from the data-collecting thread. That thread always returns the latest available value, but does not try to read the hardware every time.
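The "latest value" pattern described here can be sketched roughly as follows. All names (TReading, PublishReading, SnapshotReading, TCollector, TStatusReader) are hypothetical, and the record layout is made up so that a torn read would be detectable: Doubled must always equal 2 * Seq. The collector and the reader only hold the lock for the duration of a record copy.

```pascal
program SnapshotDemo;
{$mode objfpc}{$H+}
uses
  {$ifdef unix}cthreads,{$endif}
  Classes;

const
  Updates = 100000;

type
  // Hypothetical reading: Doubled must always equal 2 * Seq.
  TReading = record
    Seq: Int64;
    Padding: array[0..15] of Int64;   // makes the record "large"
    Doubled: Int64;
  end;

  TCollector = class(TThread)
  protected
    procedure Execute; override;
  end;

  TStatusReader = class(TThread)
  public
    Inconsistent: Integer;
  protected
    procedure Execute; override;
  end;

var
  DataLock: TRTLCriticalSection;
  Latest: TReading;   // zero-initialized global, so it starts out consistent

procedure PublishReading(const R: TReading);
begin
  EnterCriticalSection(DataLock);
  try
    Latest := R;       // the whole record is replaced under the lock
  finally
    LeaveCriticalSection(DataLock);
  end;
end;

function SnapshotReading: TReading;
begin
  EnterCriticalSection(DataLock);
  try
    Result := Latest;  // copy out, then work on the private copy
  finally
    LeaveCriticalSection(DataLock);
  end;
end;

procedure TCollector.Execute;
var
  R: TReading;
  n: Integer;
begin
  FillChar(R, SizeOf(R), 0);
  for n := 1 to Updates do
  begin
    R.Seq := n;
    R.Doubled := 2 * Int64(n);
    PublishReading(R);
  end;
end;

procedure TStatusReader.Execute;
var
  R: TReading;
  n: Integer;
begin
  Inconsistent := 0;
  for n := 1 to Updates do
  begin
    R := SnapshotReading;
    if R.Doubled <> 2 * R.Seq then
      Inc(Inconsistent);   // would indicate a half-updated record
  end;
end;

var
  Collector: TCollector;
  Reader: TStatusReader;
  Bad: Integer;
begin
  InitCriticalSection(DataLock);
  Reader := TStatusReader.Create(False);
  Collector := TCollector.Create(False);
  Collector.WaitFor;
  Reader.WaitFor;
  Bad := Reader.Inconsistent;
  Reader.Free;
  Collector.Free;
  DoneCriticalSection(DataLock);
  WriteLn('Inconsistent snapshots: ', Bad);
  if Bad <> 0 then
    Halt(1);
end.
```

The reader may see the same reading many times, which matches the status-request use case; it just never sees a half-written one.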

Mr.Madguy

  • Hero Member
  • *****
  • Posts: 859
Re: Best practices to use tRtlCriticalSection
« Reply #14 on: November 30, 2023, 12:48:45 pm »
Quote from: jollytall on November 30, 2023, 12:35:06 pm
In my case it is simply a status request, so it can read the same data multiple times. [...]
It's not your case, but I guess he was trying to say that it's better to use events in a "wait for new data" scenario.
Quote from: Thaddy on November 30, 2023, 11:20:36 am
Depending on CPU the lock prefix is not even necessary. We tested that for KOL many years ago and removed most locks.
That is for all read locks and some but not all write locks,
Just curious: what kind of tests? Desync bugs are intermittent bugs that have a low chance of showing up. I wouldn't rely on memory operations on simple types always being atomic. Again, this may be true for home computers with a single CPU socket, but it's not guaranteed for server configurations. We use interlocked operations for a reason: we need a 100% guarantee.
« Last Edit: November 30, 2023, 01:06:15 pm by Mr.Madguy »