Author Topic: Benchmark test in nanoseconds  (Read 2599 times)

LeP

  • Full Member
  • ***
  • Posts: 197
Re: Benchmark test in nanoseconds
« Reply #30 on: March 03, 2026, 04:01:58 pm »
Sleep(x) works within the operating system's limits (1 ms with the MMSystem functions active).

This means that with x = 1000 you can have an error of "1 per thousand".

1 MHz out of 3600 MHz is less than "1 per thousand"... you cannot do better.

If you use HPET to calibrate, then you will have a better divisor, but that makes no sense on Windows or any other non-real-time OS.
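
As a rough illustration of the arithmetic above, the following Python sketch turns a timing error in the calibration window into a frequency uncertainty. The function name and the sample values (a 1 ms error on a 1000 ms window, a 3600 MHz nominal clock) are assumptions chosen to mirror the numbers in this post, not measurements.

```python
def frequency_uncertainty_mhz(nominal_mhz, window_ms, timer_error_ms):
    """Frequency uncertainty caused by a timing error in the calibration window.

    The relative error of the window (timer_error_ms / window_ms) carries over
    directly to the frequency computed from it.
    """
    if window_ms <= 0:
        raise ValueError("window_ms must be positive")
    return nominal_mhz * timer_error_ms / window_ms


# 1 ms of error on a 1000 ms window is "1 per thousand" of the nominal clock.
uncertainty = frequency_uncertainty_mhz(3600.0, 1000.0, 1.0)
print(f"about {uncertainty:.1f} MHz of uncertainty on 3600 MHz")
```

A shorter window with the same absolute error gives a proportionally larger uncertainty, which is why a long calibration window helps.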
« Last Edit: March 03, 2026, 04:05:46 pm by LeP »

backprop

  • Full Member
  • ***
  • Posts: 194
Re: Benchmark test in nanoseconds
« Reply #31 on: March 03, 2026, 04:06:17 pm »
Well sure, that is what I need with RDTSC measurements. Much better than the 500 microsecond error range I get with the solution based on clock_gettime().
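
To check what step size a given clock really delivers on a particular machine, something like the following Python sketch could be used. It is only an illustration: `smallest_observed_step_ns` is a hypothetical helper, and `time.perf_counter_ns` is used as a portable stand-in for the clock under test (on Unix, `time.clock_gettime_ns` could be passed in instead).

```python
import time


def smallest_observed_step_ns(read_ns=time.perf_counter_ns, samples=1000):
    """Return the smallest non-zero difference seen between consecutive readings."""
    best = None
    for _ in range(samples):
        t0 = read_ns()
        t1 = read_ns()
        while t1 == t0:  # spin until the clock visibly advances
            t1 = read_ns()
        step = t1 - t0
        if best is None or step < best:
            best = step
    return best


print("smallest observed step:", smallest_observed_step_ns(), "ns")
```

The value is an upper bound on the clock's granularity as seen from this process, since it also includes the cost of the call itself.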

LeP

  • Full Member
  • ***
  • Posts: 197
Re: Benchmark test in nanoseconds
« Reply #32 on: March 03, 2026, 04:27:00 pm »
Quote
The real problem here is how to know that the core in use is running at maximum speed. It was easy with single-core CPUs... My somewhat older Ryzen can go from 1.6 to 3.6 GHz. I would try creating a thread and giving it a higher execution priority, if that is possible...

The TSC on AMD has been invariant since 2007 (but you can test it with CPUID, like I said).
What I do know is that AMD processors newer than 2007, but older than today's, had a "problem" syncing the TSC between cores. A small offset was present (or could be present), so if you read the TSC from different cores you may get different values.
Intel doesn't use the core frequency to update the TSC, so it is invariant across cores.

But I really don't know how AMD behaves, because I haven't used it since the Athlon K6 (wonderful memories  :D ).
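
If TSC readings can differ between cores, one common precaution is to keep the measuring code on a single core. A minimal Python sketch, assuming a platform where `os.sched_setaffinity` is available (it is not on Windows, where a call such as SetThreadAffinityMask would play this role); `pin_to_single_cpu` is a hypothetical helper name.

```python
import os


def pin_to_single_cpu():
    """Restrict the current process to one CPU; return its index, or None if unsupported."""
    if not hasattr(os, "sched_setaffinity"):
        return None  # not available on this platform (e.g. Windows, macOS)
    try:
        cpu = min(os.sched_getaffinity(0))  # lowest-numbered CPU we may run on
        os.sched_setaffinity(0, {cpu})
    except OSError:
        return None  # the OS refused the request
    return cpu


cpu = pin_to_single_cpu()
print("pinned to CPU", cpu if cpu is not None else "(not supported here)")
```

Pinning does not fix the core's frequency; it only ensures that both counter readings of a measurement come from the same core.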

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12706
  • FPC developer.
Re: Benchmark test in nanoseconds
« Reply #33 on: March 03, 2026, 04:47:10 pm »
I use QueryPerformanceCounter on Windows, and it goes below 1 ms.

I started using Intel again with Core 2 (Conroe) and stayed until the 3rd-generation Core, then switched to AMD for desktop PCs. I think AMD Kaveri (A10) was already fine, and I'm sure about anything Ryzen. After the 6th-generation Core, even work laptops changed to AMD.


LeP

  • Full Member
  • ***
  • Posts: 197
Re: Benchmark test in nanoseconds
« Reply #34 on: March 03, 2026, 04:59:03 pm »
Quote
I use QueryPerformanceCounter on Windows, and it goes below 1 ms.

QueryPerformanceCounter, TStopWatch, HPET ... are the same: a resolution of 100 ns, but with some latency and less impact on the code.
RDTSC: a resolution of less than 1 ns, with little latency but with a possible impact on the code.
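
For reference, the resolution figures follow directly from the counter frequency. The Python sketch below assumes a 10 MHz counter for the 100 ns case and a 3.6 GHz TSC (the 3600 MHz mentioned earlier in the thread) for the sub-nanosecond case; both function names are hypothetical.

```python
NS_PER_SECOND = 1_000_000_000


def tick_period_ns(frequency_hz):
    """Duration of one counter tick in nanoseconds."""
    return NS_PER_SECOND / frequency_hz


def ticks_to_ns(ticks, frequency_hz):
    """Convert a tick count to whole nanoseconds (multiply first to keep precision)."""
    return ticks * NS_PER_SECOND // frequency_hz


print(tick_period_ns(10_000_000), "ns per tick at 10 MHz")
print(tick_period_ns(3_600_000_000), "ns per tick at 3.6 GHz")
```

Resolution is only the size of one tick; the latency of reading the counter is a separate cost.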

Quote
I started using Intel again with Core-2 (Conroe) till Core 3rd generation, then switched to AMD for desktop PCs, I think AMD Kaveri (A10) was already fine, and I'm sure about anything Ryzen.  After Core 6th generation, even work laptops changed to AMD.

I started with AMD until the Athlon K6 (desktop) and Intel (laptop). After that, only Intel (and Nvidia where I need a dedicated GPU).

440bx

  • Hero Member
  • *****
  • Posts: 6123
Re: Benchmark test in nanoseconds
« Reply #35 on: March 03, 2026, 06:46:22 pm »
Just for the record...

Depending on Sleep() delivering an accurate sleep time is a very bad idea.  The scheduler is under no obligation whatsoever to give clock cycles to a thread just because it has come out of a sleep state.  In addition to that, the default resolution of Sleep is rather low.  It can be manipulated with timeBeginPeriod and timeEndPeriod, but it's better and simpler to use the API they use internally (NtSetTimerResolution).

IOW, Sleep is _not_ a way to influence the O/S' scheduler.

In the best case, in a system that is running _very_ smoothly (and _not_ running some version of Win 10 or newer), its accuracy is roughly 1ms (that's in the very best case.)  Starting with some version of Win 10, its accuracy is lower, usually in the order of 2 ms (again, in the very best case.)  I know because, I didn't just read MS' documented wet dreams, I've tested Sleep many, many times.

FPC v3.2.2 and Lazarus v4.0rc3 on Windows 7 SP1 64bit.

LeP

  • Full Member
  • ***
  • Posts: 197
Re: Benchmark test in nanoseconds
« Reply #36 on: March 03, 2026, 07:46:31 pm »
@440bx

Sleep is only used for calibration with the Multimedia functions active (timeBeginPeriod ...).
I don't think anyone wants to "measure" times with Sleep.

The accuracy depends ... thousands of factors make Sleep (and not only Sleep) variable. On my PC (industrial environment), with all cores at 75% load, Sleep under timeBeginPeriod(1) still holds 1 ms 95% of the time. The other 5% is off by +/-0.2 ms. At 80% load Sleep starts to drift (1.5 ms), and with more load it goes to 2 ms.

Attached, just for reference, is a screenshot of one of my applications while it is running (light work): see the performance graph, now at 41%. When I said 75%, I meant all the cores at 75% or more (you don't find that in normal use of a PC).
« Last Edit: March 03, 2026, 07:49:57 pm by LeP »

backprop

  • Full Member
  • ***
  • Posts: 194
Re: Benchmark test in nanoseconds
« Reply #37 on: March 03, 2026, 07:59:36 pm »
Quote
Just for the record...

Depending on Sleep() delivering an accurate sleep time is a very bad idea.

This has nothing to do with "precise measuring" of the Sleep function. It is just a plain test to get figures close to the manufacturer's stated frequency and to prove that the RDTSC function gives correct values.

440bx

  • Hero Member
  • *****
  • Posts: 6123
Re: Benchmark test in nanoseconds
« Reply #38 on: March 03, 2026, 08:24:39 pm »
The point is that using Sleep, which is inherently inaccurate, for calibration purposes is a questionable choice. The calibration cannot be more accurate than what is being used to perform it.

It definitely will _not_ result in anything providing nanosecond accuracy.
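
One way to limit this problem, sketched here in Python, is to treat the sleep only as a delay and take the length of the calibration window from a reference clock read before and after it. This is only an illustration under assumptions: `estimate_counter_hz` and its parameters are hypothetical names, and `time.perf_counter_ns` stands in for whatever reference clock is available. The result can still be no more accurate than that reference.

```python
import time


def estimate_counter_hz(read_counter, read_ref_ns=time.perf_counter_ns,
                        wait=time.sleep, window_s=0.1):
    """Estimate the tick rate of read_counter against a reference clock.

    The requested window is never trusted: the elapsed time is whatever the
    reference clock reports, so an overshooting sleep only lengthens the window.
    """
    ref0 = read_ref_ns()
    c0 = read_counter()
    wait(window_s)  # may return late; that is fine
    c1 = read_counter()
    ref1 = read_ref_ns()
    elapsed_ns = ref1 - ref0
    if elapsed_ns <= 0:
        raise RuntimeError("reference clock did not advance")
    return (c1 - c0) * 1_000_000_000 / elapsed_ns


# Stand-in counter: time.monotonic_ns ticks once per nanosecond, so about 1e9 Hz.
print(estimate_counter_hz(time.monotonic_ns, window_s=0.05))
```

With a real cycle counter in place of the stand-in, the same structure applies; the remaining error comes from the reference clock and from the gap between the paired readings.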


LeP

  • Full Member
  • ***
  • Posts: 197
Re: Benchmark test in nanoseconds
« Reply #39 on: March 03, 2026, 10:30:29 pm »
Quote
The point is that using Sleep, which is inherently inaccurate, for calibration purposes is a questionable choice. The calibration cannot be more accurate than what is being used to perform it.

What are you talking about?
If I sleep 1 ms and it returns in 2 ms, the error is 100%.

If I sleep 1000 ms and it returns in 1002 ms, what is the error? Or do you think that if I sleep 1000 ms (with the Multimedia functions active), Sleep returns in 10 minutes?

There is no need to access the Cesium atomic clock to have good accuracy.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12706
  • FPC developer.
Re: Benchmark test in nanoseconds
« Reply #40 on: March 03, 2026, 10:33:50 pm »
Quote
I use QueryPerformanceCounter on Windows, and it goes below 1 ms.

QueryPerformanceCounter, TStopWatch, HPET ... are the same: a resolution of 100 ns, but with some latency and less impact on the code.
RDTSC: a resolution of less than 1 ns, with little latency but with a possible impact on the code.

AFAIK, on non-antique systems QueryPerformanceCounter is a layer over RDTSC that also works if your process moves from one CPU to the next during the measured interval (CPU, not core). You can tell it to use the HPET, though.

https://learn.microsoft.com/en-us/windows/win32/sysinfo/acquiring-high-resolution-time-stamps?redirectedfrom=MSDN


Quote
I started with AMD unitll Atlhon K6 (desktop) and Intel (laptop). After that only Intel (and Nvidia where I need dedicated GPU).

intel 386sx-20, AMD 486 DX-80, Cyrix P166+, AMD K6-2 500, AMD Athlon XP2000+, AMD Athlon 64 3700+, Core2 6600, i7-3770, A10-7850K , Ryzen 2600 upgraded later to 5700X

The A10 and the 2600 were mostly used as second computer, but with the upgrade it replaced the i7-3770

Graphics cards Trident 9000, S3 Trio+, Diamond Monster 3d (3dFX 1st gen), Nvidia TNT2, Nvidia GF2, GF5500, GF6200 ( bought for TV-OUT), GF7800, AMD 5770, AMD 7850 , NV 4060.

440bx

  • Hero Member
  • *****
  • Posts: 6123
Re: Benchmark test in nanoseconds
« Reply #41 on: March 03, 2026, 10:56:18 pm »
@LeP,

You're trying to contrive absurd conclusions. 

Sleep() is usually off by about 1ms and starting in some versions of Win 10 consistently by 2ms _regardless_ of the amount of sleep time requested.  I know because I have measured it.   This is _not_ a guess, nor an estimate, it is _fact_.

Feel free to conclude whatever you want from that but, the fact is crystal clear, using Sleep() to calibrate any measurement method with the objective of obtaining nanosecond resolution is simply absurd.

backprop

  • Full Member
  • ***
  • Posts: 194
Re: Benchmark test in nanoseconds
« Reply #42 on: March 03, 2026, 11:15:04 pm »
Quote
@LeP,

You're trying to contrive absurd conclusions.

Sleep() is usually off by about 1ms and starting in some versions of Win 10 consistently by 2ms _regardless_ of the amount of sleep time requested.  I know because I have measured it.   This is _not_ a guess, nor an estimate, it is _fact_.

I'm not interested in the Sleep() function. You may open a new thread about it, if you want.


LeP

  • Full Member
  • ***
  • Posts: 197
Re: Benchmark test in nanoseconds
« Reply #43 on: March 03, 2026, 11:54:35 pm »
@backprop

I think you have enough material to do your work.
If you don't have any other questions about the topic, I'll stop here.

Last suggestion: as @marcov suggests with his link, use the coreinfo (or better, coreinfo64) utility from Sysinternals to find out whether your system has an invariant TSC.

And if you decide to use RDTSC (which is not a bad choice at all), build something like TStopWatch on top of RDTSC to use in your applications, so you can adapt it to all the platforms you use with a single design in your sources.
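
The shape of such a wrapper might look like the following Python sketch. It is not TStopWatch itself: the class, its method names and the injectable `read_ticks` source are assumptions for illustration, and `time.perf_counter_ns` stands in for an RDTSC read, which Python cannot issue directly.

```python
import time


class Stopwatch:
    """Accumulating stopwatch over an injectable tick source."""

    def __init__(self, read_ticks=time.perf_counter_ns, ticks_per_second=1_000_000_000):
        self._read = read_ticks
        self._hz = ticks_per_second
        self._started_at = None  # None while stopped
        self._accumulated = 0

    def start(self):
        if self._started_at is None:
            self._started_at = self._read()

    def stop(self):
        if self._started_at is not None:
            self._accumulated += self._read() - self._started_at
            self._started_at = None

    def reset(self):
        self._started_at = None
        self._accumulated = 0

    def elapsed_ticks(self):
        if self._started_at is None:
            return self._accumulated
        return self._accumulated + (self._read() - self._started_at)

    def elapsed_ns(self):
        return self.elapsed_ticks() * 1_000_000_000 // self._hz


sw = Stopwatch()
sw.start()
sum(range(10_000))  # code under test
sw.stop()
print(sw.elapsed_ns(), "ns")
```

Only the tick source and its frequency change per platform; the calling code stays the same.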

creaothceann

  • Sr. Member
  • ****
  • Posts: 276
Re: Benchmark test in nanoseconds
« Reply #44 on: March 04, 2026, 12:30:09 am »
Have a look at TJclCounter from project Jedi. Its implementation is even more sophisticated than Delphi's TStopwatch by taking QueryPerformanceCounter call overhead into account.
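
The general idea of compensating for timer call overhead can be sketched in Python as follows. This is not TJclCounter's actual code, only an illustration under assumptions: the helper names are hypothetical and `time.perf_counter_ns` stands in for QueryPerformanceCounter.

```python
import time


def estimate_read_overhead_ns(read_ns=time.perf_counter_ns, samples=10_000):
    """Smallest gap seen between two back-to-back timer reads."""
    best = None
    for _ in range(samples):
        t0 = read_ns()
        t1 = read_ns()
        gap = t1 - t0
        if best is None or gap < best:
            best = gap
    return best


def time_call_ns(func, overhead_ns, read_ns=time.perf_counter_ns):
    """Time one call of func and subtract the estimated timer overhead."""
    t0 = read_ns()
    func()
    t1 = read_ns()
    return max(0, (t1 - t0) - overhead_ns)  # never report a negative duration


overhead = estimate_read_overhead_ns()
elapsed = time_call_ns(lambda: sum(range(1000)), overhead)
print("overhead:", overhead, "ns, compensated time:", elapsed, "ns")
```

Taking the minimum rather than the average keeps scheduler interruptions out of the overhead estimate.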

 
