
Author Topic: Sleep(1) in thread's loop can reduce CPU usage much  (Read 12876 times)

scribly

  • Jr. Member
  • **
  • Posts: 80
Re: Sleep(1) in thread's loop can reduce CPU usage much
« Reply #30 on: May 22, 2019, 11:36:06 am »
When sleep(0) is executed, it checks which thread is currently not in a wait state and passes execution on to that thread.
Most of the time that results in the calling thread being picked again, because most other threads will be in a wait state (CPU % usage is just an inverted indication of how long the CPU is sleeping in a low-power state, waiting for a thread to wake up).

So unless your system is always running at 100%, sleep(0) is not that useful, and it is not meant to lower power usage.
« Last Edit: May 22, 2019, 11:37:40 am by scribly »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
« Reply #31 on: May 22, 2019, 02:10:32 pm »
Scribly: That's about what I know about it too.  Before multicore systems, there were more threads waiting, and it made more sense.

The first thing to do is to avoid sleeps altogether, since they are an indication of polling and latency. Waiting on a semaphore with a timeout still allows housekeeping once in a while, but if some other thread signals a change, the wait ends immediately, so the timeout can be much longer than a sleep would be.

However, this requires a waitfor with a timeout. I do most of my multithreaded programming on Windows currently, and there it is available. Afaik it is now also available for Linux (based on sem_timedwait or so), but that is relatively new, and may not be in released versions yet.
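
A minimal sketch of the wait-with-timeout idea, assuming FPC's SyncObjs unit is available on the target. The worker blocks on a TEvent for up to 500 ms, does its housekeeping when the wait times out, and reacts at once when another thread signals. TWorker, Wake and the 500 ms interval are made-up names and values for illustration, not something from this thread.

```pascal
program WaitWithTimeout;
{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  Classes, SysUtils, SyncObjs;

type
  { Hypothetical worker: blocks on an event instead of polling with Sleep }
  TWorker = class(TThread)
  private
    FWake: TEvent;
    FSignals, FTimeouts: Integer;
  protected
    procedure Execute; override;
  public
    constructor Create;
    destructor Destroy; override;
    procedure Wake;
    property Signals: Integer read FSignals;
    property Timeouts: Integer read FTimeouts;
  end;

constructor TWorker.Create;
begin
  FWake := TEvent.Create(nil, False, False, '');  // auto-reset, not signalled
  inherited Create(False);
end;

destructor TWorker.Destroy;
begin
  inherited Destroy;
  FWake.Free;
end;

procedure TWorker.Wake;
begin
  FWake.SetEvent;
end;

procedure TWorker.Execute;
begin
  while not Terminated do
    case FWake.WaitFor(500) of
      wrSignaled:
        if not Terminated then
          Inc(FSignals);    // another thread signalled a change: react at once
      wrTimeout:
        Inc(FTimeouts);     // nothing happened for 500 ms: do housekeeping
    else
      Break;                // wrAbandoned / wrError: give up
    end;
end;

var
  W: TWorker;
begin
  W := TWorker.Create;
  try
    W.Wake;       // ends the worker's wait without any polling interval
    W.Terminate;
    W.Wake;       // wake it once more so it notices Terminated
    W.WaitFor;
    WriteLn('signals: ', W.Signals, ', timeouts: ', W.Timeouts);
  finally
    W.Free;
  end;
end.
```

Because the main thread terminates the worker right after the first signal, the counts printed can differ from run to run; the point is only that the worker never needs a short sleep to stay responsive.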

Thaddy

  • Hero Member
  • *****
  • Posts: 14205
  • Probably until I exterminate Putin.
« Reply #32 on: May 22, 2019, 03:48:32 pm »
They are really distinct. Pseudo code:
Code: Pascal
  // Pseudocode only: RelinquishTimeSlice and RealSleep are not real routines
  procedure Sleep(Amount: Cardinal);
  begin
    if Amount = 0 then
      RelinquishTimeSlice   // this is not a sleep, but a signal
    else
      RealSleep(Amount);    // not a signal, but a timer, and it can block. Do not use in multi-threading if you do not know the number of cores
  end;
Sleep(0) has a different code path, as can be seen in a good debugger (Ice, OllyDbg).

I don't think this has changed since the 90's. If so, I am wrong for a very long time (since NT3.5/Win3.1).
In Unix-like systems this works differently: they simply do not misuse a timer signal as a shorthand for a semaphore...
« Last Edit: May 22, 2019, 03:56:36 pm by Thaddy »
Specialize a type, not a var.

marcov

« Reply #33 on: May 22, 2019, 04:36:46 pm »
Quote
I don't think this has changed since the 90's. If so, I am wrong for a very long time (since NT3.5/Win3.1).

What has changed is the chance that a thread is waiting (since other tasks can run on other cores and no longer have to compete with a running program), and the chance that the machine is kept from going into a low-power state because the same thread is continuously being rescheduled, only to relinquish its slice almost immediately again with sleep(0).

As a form of possible busy-waiting, the principle was just as bad in the nineties; it only mattered less.

Quote
In Unix like systems this works different: that simply does not misuse a timer signal for a shorthand to a semaphore...

Since Unix does not have a waitmultiple (an equivalent of WaitForMultipleObjects), it is even more important to be able to regularly pause waiting on one semaphore....

Thaddy

« Reply #34 on: May 22, 2019, 04:51:52 pm »
Those are the points. I agree with that.

440bx

  • Hero Member
  • *****
  • Posts: 3946
« Reply #35 on: May 22, 2019, 09:25:21 pm »
Quote
The first thing to do is avoid sleeps altogether, since they are an indication of polling and latency.

That is bad advice.  Sleep(n) where n > 0 is simply a mechanism to inform the scheduler that the thread does not require CPU cycles.

One of the nice things about Sleep is that it doesn't require the application to create an object to wait on and the O/S to maintain and pay attention to it.  Sleep is very low overhead and lets the O/S know that for n milliseconds, the thread doesn't need any attention.

In addition to that, there are events for which no object exists that can be waited on; e.g., at least in Windows, you can't wait for a window that belongs to another process to finish painting itself.

Polling, if done well, can be more efficient than waiting on an object (which the O/S scheduler has to pay attention to.)  A good combination of SRW locks and Sleep can produce much better results than using objects to wait on.  If you doubt that, compare the performance of mutexes and critical sections.   Mutexes have higher overhead, deliver roughly 1/20 of the performance, and are a lot more work for the O/S.

« Last Edit: May 22, 2019, 09:56:37 pm by 440bx »
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

Peter H

  • Sr. Member
  • ****
  • Posts: 272
« Reply #36 on: May 22, 2019, 10:38:38 pm »
Yes.
A good example is multithreadingexample1 from the Lazarus example code.
It refreshes the screen in an infinite loop and consumes a lot of CPU.
There is no reason to refresh the screen a thousand times per second.  8-)

scribly

« Reply #37 on: May 22, 2019, 11:42:02 pm »
I've attached an example project showing a scenario where sleep(0) has an effect, as opposed to no sleep at all.

When no sleep is used, the consumer thread will read the value 0 over and over until it relinquishes its allotted timeslice to the producer thread, basically wasting a lot of cycles doing nothing but eating CPU power.
When sleep(0) (or yield) is used, the two threads will swap between each other (and sometimes other threads in the system), wasting less time. It is of course slower than with no sleep at all.

440bx

« Reply #38 on: May 23, 2019, 01:42:26 am »
Quote
I've attached an example project showing a scenario where sleep(0) has an effect as opposed to no sleep at all

That is not a conclusion that can be derived from the example you attached.

I've modified your example to show what really happens.  You'll notice that calling sleep(0) can be said to have no effect at all.

To get the most out of the example, run it under the Lazarus debugger, set the console buffer to 9999 lines (the maximum) and make it run.

Pay attention to how much CPU is used in both cases (Task Manager or Process Hacker would help with that.)  You'll find that there really isn't a difference when Sleep(0) is used, nor does the flip-flopping between one thread and the other change.

Also, I didn't get rid of the array because I wanted you to see that the values shown in the listview are not even close to what is happening in reality.  The reality is what's displayed in the console.

You can get rid of the timer, the listview and the dynamic array.  They don't make any useful contribution.

scribly

« Reply #39 on: May 23, 2019, 08:36:26 am »
I can't see what you mean. I've attached an image with the output. Left is no sleep, right is with sleep.
As you can see, the left one reads the old value many times, while the right one reads more evenly.

This is on an Intel i9-7960X at 2.8GHz.

Also, I have to mention writeln, which uses locks to make sure the output buffer isn't accessed by other threads at the same time. When the buffer eventually fills up, it will cause a full sleep until the buffer is processed (while holding the lock), so I'm not sure you want to use it to show off threading, or ever use it in anything that's CPU-intensive.
(Changing the loops to infinite loops shows that with writeln the CPU usage is only 1/3 of its max, while with writeln removed it goes to 100%.
And yes, it'll also go to 100% with sleep(0), because that's what it is supposed to do: pick a thread that isn't sleeping.)

« Last Edit: May 23, 2019, 09:45:10 am by scribly »

marcov

« Reply #40 on: May 23, 2019, 09:45:53 am »
Quote
The first thing to do is avoid sleeps altogether, since they are an indication of polling and latency.
That is bad advice.  Sleep(n) where n > 0 is simply a mechanism to inform the scheduler the thread does not require CPU cycles.

Then it should either not be running at all (in case it only polls), or it should block until the next time it actually has something to do.

Sleep(n) is not just saying that you don't need cycles now, but at the same time a request to get them in "n". My reasoning is that THAT should be avoided if possible.

Quote
One of the nice things about Sleep is that it doesn't require the application to create an object to wait on and the O/S to maintain and pay attention to it.  Sleep is very low overhead and lets the O/S know that for n milliseconds, the thread doesn't need any attention.

Objects can be reused. For the variation in overhead I'd like a reference. Afaik in both cases it is a scheduler lock on a condition.

Quote
In addition to that, there are events for which there are no available objects that can be created to wait on.  e.g, at least in Windows, you can't wait on some window that belongs to another process to finish painting itself.

There are always exceptions, and that was never denied. We are talking about the general case here. (and even then there is a lot possible with waitmultiplemessage variants)

Quote
Polling, if done well, can be more efficient than waiting on an object (which the O/S scheduler has to pay attention to.)  A good combination of SRW locks and Sleep can produce much better results than using objects to wait on.  If you doubt that, compare the performance of Mutexes and Critical sections.   Using mutexes is higher overhead and roughly 1/20 the performance and a lot more work for the O/S.

Afaik mutexes that are cross-application (global) in nature are fairly slow. Unnamed ones used within one process afaik aren't.

440bx

« Reply #41 on: May 23, 2019, 12:34:35 pm »
@scribly

Quote
I can't see what you mean. I've attached an image with the output. Left is no sleep, right is with sleep.
As you see left is reading out the old value many times while right it reads more evenly

I cannot comment on the screenshot you posted because you didn't post the code that produced it.  The modified example I posted shows very clearly that sleep has little to no effect on thread switching in that case.

Quote
Also, I have to mention writeln, which uses locks to make sure the output buffer isn't accessed at the same time by other threads and when the buffer does go full eventually will cause a full sleep till the buffer is processed (while holding the lock), so not sure if you want to use that to show off threading or ever use in anything that's cpu intensive

There will not be a "full sleep" until the buffer is processed.  When both threads send text to conhost.exe (which is a separate process), one thread will be blocked until the write request from the other thread is complete, but that doesn't imply a thread switch to the other thread in the same process.

I agree that the writeln has an effect on how the results are presented, but the writeln(s) are not what causes the scheduler to decide which thread gets to run next.  As long as a thread has not used up its time slice, it will, generally speaking (because there are other threads running in the system), keep running until it uses it up.

Post the program that produced that output.   

@marcov

Quote
Then it should either not be running (in case it only polled),

That's what "sleep" is for: to "not run", to let the scheduler know that the thread doesn't need any clock cycles.

Quote
or it should block to the next time it should actually do something.

When the thread is awakened, it will do something.  In the meantime, it doesn't use up system resources and scheduler clock cycles with synchronization objects.

Quote
Sleep(n) is not just saying that you don't need cycles now, but at the same time a request to get them in "n".

No, that is not correct.  Sleep is not a request to get clock cycles after n milliseconds have elapsed.  The thread may, or may not, get clock cycles after n milliseconds.  This is the reason why sleep is considered by many to be "inaccurate": they mistakenly believe that the scheduler will run the thread as soon as the time has elapsed.

Quote
My reasoning is that THAT should be avoided if possible.

What should be avoided is causing the scheduler to waste clock cycles on a thread that doesn't need them. Sleep(n) is what prevents that.

Quote
Objects can be reused. For the variation in overhead I'd like a reference. Afaik in both cases it is a scheduler lock on a condition.

Yes, synchronization objects can definitely be reused.  The overhead is simple: a mutex causes a ring transition from ring 3 to ring 0, requires scheduler clock cycles to determine whether the mutex is signaled and, once it has been signaled, additional clock cycles to find which thread(s) was/were waiting for it.

If you want to measure the best-case overhead, simply code a loop that waits on and releases a mutex about 1,000,000 times.  That will give you an idea of what a mutex costs.  Do the same thing for critical sections and Sleep(1) (shorten the loop for that last one :)), then compare the results (and pay attention to the CPU consumption in all cases.)
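
A rough sketch of such a measurement, under a few assumptions: the mutex leg uses the Win32 calls CreateMutex, WaitForSingleObject and ReleaseMutex and is therefore compiled on Windows only, the critical section leg uses the RTL's InitCriticalSection family, and timing uses SysUtils.GetTickCount64, which is coarse. The loop counts and the program name are arbitrary choices for illustration, and everything runs uncontended on a single thread, so it only shows the best case.

```pascal
program SyncCost;
{$mode objfpc}{$H+}

uses
  {$IFDEF WINDOWS}Windows,{$ENDIF}
  SysUtils;

const
  FastLoops  = 1000000;  // iterations for the lock/unlock loops
  SleepLoops = 100;      // far fewer: each Sleep(1) lasts at least about 1 ms

var
  cs: System.TRTLCriticalSection;
  t0: QWord;
  i: Integer;
  {$IFDEF WINDOWS}
  h: THandle;
  {$ENDIF}
begin
  {$IFDEF WINDOWS}
  // Unnamed mutex: every wait and release goes through the kernel
  h := CreateMutex(nil, False, nil);
  t0 := SysUtils.GetTickCount64;
  for i := 1 to FastLoops do
  begin
    WaitForSingleObject(h, INFINITE);
    ReleaseMutex(h);
  end;
  WriteLn('mutex:            ', SysUtils.GetTickCount64 - t0, ' ms');
  CloseHandle(h);
  {$ENDIF}

  // Critical section: uncontended enter/leave, same number of iterations
  System.InitCriticalSection(cs);
  t0 := SysUtils.GetTickCount64;
  for i := 1 to FastLoops do
  begin
    System.EnterCriticalSection(cs);
    System.LeaveCriticalSection(cs);
  end;
  WriteLn('critical section: ', SysUtils.GetTickCount64 - t0, ' ms');
  System.DoneCriticalSection(cs);

  // Sleep(1): shortened loop, total time shows how long 1 ms really takes
  t0 := SysUtils.GetTickCount64;
  for i := 1 to SleepLoops do
    Sleep(1);
  WriteLn('Sleep(1) x ', SleepLoops, ':   ', SysUtils.GetTickCount64 - t0, ' ms');
end.
```

The elapsed times say nothing about CPU consumption, so the process still has to be watched in a task manager while each loop runs.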

Quote
There are always exceptions, and that was never denied. We are talking about the general case here. (and even then there is a lot possible with waitmultiplemessage variants)

Marco, the point was that using Sleep is a good thing, and not something to be avoided. A thread should let the processor know when it doesn't need attention.  The thread that definitely should NOT be calling "sleep" is the thread that pumps messages.

As far as the general case goes, I'm inclined to believe it is more about synchronization between threads within the same process than between different processes.  For threads within the same process, polling using sleep produces code that is simpler, isn't subject to deadlocks, and uses less CPU. That said, it can be a bit slower than using some sort of synchronization; in those cases critical sections are usually a good option.  A combination of TryEnterCriticalSection and Sleep often produces excellent results.
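
The combination mentioned last could look roughly like this. It is a sketch that assumes TCriticalSection.TryEnter from FPC's SyncObjs unit as a portable stand-in for TryEnterCriticalSection; TAdder, Lock and Counter are hypothetical names. Each thread polls for the lock and backs off with Sleep(1) when it is busy instead of blocking on it.

```pascal
program TryEnterPolling;
{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  Classes, SysUtils, SyncObjs;

var
  Lock: TCriticalSection;
  Counter: Integer = 0;

type
  { Hypothetical worker that increments the shared counter 1000 times }
  TAdder = class(TThread)
  protected
    procedure Execute; override;
  end;

procedure TAdder.Execute;
var
  i: Integer;
begin
  for i := 1 to 1000 do
  begin
    // Poll for the lock; if it is taken, give up the CPU for about 1 ms
    while not Lock.TryEnter do
      Sleep(1);
    try
      Inc(Counter);
    finally
      Lock.Leave;
    end;
  end;
end;

var
  A, B: TAdder;
begin
  Lock := TCriticalSection.Create;
  A := TAdder.Create(False);
  B := TAdder.Create(False);
  A.WaitFor;
  B.WaitFor;
  WriteLn(Counter);  // both threads did 1000 protected increments: 2000
  A.Free;
  B.Free;
  Lock.Free;
end.
```

Neither thread ever waits inside the lock call itself, which is why this pattern cannot deadlock on the lock; the price is up to a millisecond or more of extra latency whenever the lock happens to be busy.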

Quote
Afaik mutexes that are cross-application (global) in nature are fairly slow. Non named ones used within one process afaik aren't.

I haven't tested the performance of unnamed mutexes, but if this is correct https://stackoverflow.com/questions/1666653/are-mutexes-really-slower then unnamed mutexes are still kernel objects, with all the associated overhead of a named mutex.


scribly

« Reply #42 on: May 23, 2019, 12:58:42 pm »
Quote
Post the program that produced that output. 

the code you posted without any changes
https://youtu.be/3nHpf54_bwA

It's possible that Windows 10 'fixed' sleep. I tested it on an old Win7 machine, and there is no difference there between no sleep and sleep(0) (also a slow CPU, but not sure if that matters).
« Last Edit: May 23, 2019, 01:25:34 pm by scribly »

440bx

« Reply #43 on: May 23, 2019, 02:14:16 pm »
Quote
Post the program that produced that output. 

the code you posted without any changes
https://youtu.be/3nHpf54_bwA

It's possible that Windows 10 'fixed' sleep. I tested it on an old Win7 machine, and there is no difference there between no sleep and sleep(0) (also a slow CPU, but not sure if that matters).

Very nicely done :)

I was looking at the video you posted, and it does look like Sleep(0) causes a thread switch under Windows 10.  Under Windows 7 (which I am using) and older versions of Windows, it does not.

The CPU speed affects the results only at the top and bottom of the output.  One of the threads is going to start executing first, that will cause the other thread to output several lines at the bottom because it started later and couldn't catch up until the other one ended.

It would be really nice if they had fixed that problem with Sleep(0) in Windows 10 (about time they did too.)  There are a fair number of commercial programs that used Sleep(0) which caused them to use 100% of a core (the old Turbo Debugger - TD32.EXE - is one of them.)

But, even if they fixed it in Windows 10, I'd still use Sleep(1) to ensure the program is "friendly" to older versions of Windows.

Thank you for that video. 
