Author Topic: For Loops and Registers  (Read 3629 times)

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: For Loops and Registers
« Reply #15 on: December 09, 2019, 10:32:59 pm »
Still getting the hang of things...

One tip for timing in milliseconds with little overhead:
Code: Pascal
uses
  SysUtils;  // provides GetTickCount64

var
  T1: QWord;
begin
  // ...
  T1 := GetTickCount64;
  // ...... code to be timed ......
  WriteLn(GetTickCount64 - T1);  // elapsed milliseconds
end.

Winni

syntonica

  • Full Member
  • ***
  • Posts: 120
Re: For Loops and Registers
« Reply #16 on: December 09, 2019, 10:41:21 pm »
Thanks! But it looks like this is ticks of indeterminate length. I need real time since I'm comparing FP with C. It'll be useful, though, when I need to start hand-optimizing.

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: For Loops and Registers
« Reply #17 on: December 09, 2019, 10:54:42 pm »
Hi!

GetTickCount64 is defined as milliseconds since system start.

It relies on the OS ticks.

It is just as 'exact' as Now is.
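
One way to check that claim is to time the same interval with both clocks. This is a minimal sketch, assuming the SysUtils and DateUtils units are available; the 250 ms Sleep is an arbitrary test interval and the program name is made up for illustration.

```pascal
program CompareClocks;
{$mode objfpc}

uses
  SysUtils,   // GetTickCount64, Now, Sleep
  DateUtils;  // MilliSecondsBetween

var
  TickStart: QWord;
  WallStart: TDateTime;
begin
  TickStart := GetTickCount64;
  WallStart := Now;

  Sleep(250);  // arbitrary interval to measure

  // Both values are in milliseconds; they should agree to within
  // the timer resolution of the platform.
  WriteLn('GetTickCount64: ', GetTickCount64 - TickStart, ' ms');
  WriteLn('Now:            ', MilliSecondsBetween(Now, WallStart), ' ms');
end.
```

The two figures will rarely match exactly, since Sleep itself is only approximate and each clock has its own resolution.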

Winni

PS: After about 5.8E8 years it will wrap to 0!! Take care!
« Last Edit: December 09, 2019, 11:12:53 pm by winni »

syntonica

  • Full Member
  • ***
  • Posts: 120
Re: For Loops and Registers
« Reply #18 on: December 09, 2019, 11:25:21 pm »
Oh, cool! The docs said: "It is useful for time measurements, but no assumptions should be made as to the interval between the ticks."

How trustworthy are the docs?  Apple's are pretty awful, even to the point of just being flat out wrong, and not because they're outdated.

The sun only has about 5E9 years before it dies...  Really puts things in perspective. :-\

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: For Loops and Registers
« Reply #19 on: December 09, 2019, 11:34:49 pm »
Quote
Does your FFT routine use complex numbers?  I have a routine that does it in place in a 1024 sample window, no complex numbers needed.

Yes. But I need mixed radix because I have a 400 sample window.  But it would always be good to see a different solution.

The assembler now parallelizes most re:im operations though, and even does two complex per instruction about half the time. It is not production ready yet though, just preparation for a planned move to 64-bit of our only remaining 32-bit application.

This application uses a lot of floating point calculations and is 140-200% slower than on 64-bit (using Delphi btw). Worse, the number of calculations is only going to increase, so buying yourself out of trouble with newer hardware is costly.

That said I benchmark with Delphi mostly and Delphi converts every single load to double and back and does all main operations in double.

Most of the calculations are not repeated a lot and don't parallelize, so I mostly started analysing (creating a good benchmark) and tackling a few simple but common primitives, FFT included.
« Last Edit: December 10, 2019, 12:02:02 am by marcov »

winni

  • Hero Member
  • *****
  • Posts: 3197
Re: For Loops and Registers
« Reply #20 on: December 10, 2019, 12:03:12 am »
@syntonica

I just had a look into the internals: Linux takes the ticks from libc, Windows uses kernel32.dll to get the ticks.

And the joke about the 5.8E8 years has this background: the predecessor of GetTickCount64 is GetTickCount, which is 32-bit.

In the old days a lot of companies were happy that they had Windows NT 3.5 - a "stable" Windows. But from time to time NT crashed. Nobody knew why. Then the logic became clear: MaxDWord milliseconds are 49.xx days - and then the server crashed. So you had to reboot the system after 48 days.  Then GetTickCount64 came around and everybody started computing when the overflow would happen ....
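
The wrap-around arithmetic can be worked out directly. This is only a sketch: the program name, constants, and the 365.25-day year are choices made here for illustration, and the counters are assumed to count whole milliseconds from zero.

```pascal
program TickWrap;
{$mode objfpc}

const
  MsPerDay    = 24 * 60 * 60 * 1000;  // 86,400,000 ms in a day
  DaysPerYear = 365.25;               // assumption: average Julian year

var
  Range32, Range64: Double;
begin
  Range32 := 4294967296.0;       // 2^32 values of a DWord counter
  Range64 := Range32 * Range32;  // 2^64 values of a QWord counter

  // 32-bit GetTickCount: wraps after about 49.7 days
  WriteLn('32-bit wrap: ', (Range32 / MsPerDay):0:2, ' days');

  // 64-bit GetTickCount64: same calculation, expressed in years
  WriteLn('64-bit wrap: ', (Range64 / MsPerDay / DaysPerYear):0:0, ' years');
end.
```

Because the counters are unsigned, a difference such as GetTickCount64 - T1 still gives the right elapsed time across a single wrap, as long as range and overflow checking are off.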

Winni
« Last Edit: December 10, 2019, 12:05:02 am by winni »

syntonica

  • Full Member
  • ***
  • Posts: 120
Re: For Loops and Registers
« Reply #21 on: December 10, 2019, 12:04:28 am »
Here's my code in C/C++. I just use it as static methods.  It's been specifically stripped down to only work on 1024 sample windows. Unfortunately, the original has no attribution.

syntonica

  • Full Member
  • ***
  • Posts: 120
Re: For Loops and Registers
« Reply #22 on: December 10, 2019, 12:10:44 am »
Quote
In the old days a lot of companies were happy that they had Windows NT 3.5 - a "stable" Windows. But from time to time NT crashed. Nobody knew why. Then the logic became clear: MaxDWord milliseconds are 49.xx days - and then the server crashed. So you had to reboot the system after 48 days.  Then GetTickCount64 came around and everybody started computing when the overflow would happen ....

Winni

I completely forgot about that!  When that was happening, the company I was working for was running Novell. Rock solid, but don't accidentally print 500 pages, as my sysadmin couldn't find a way to kill a print job.

jamie

  • Hero Member
  • *****
  • Posts: 6090
Re: For Loops and Registers
« Reply #23 on: December 10, 2019, 12:55:45 am »
Well, it seems I can't upload a PAS file...

I was going to offer you a "fourier.pas" file
oh well,...

http://jean-pierre.moreau.pagesperso-orange.fr/p_signal.html
« Last Edit: December 10, 2019, 01:01:34 am by jamie »
The only true wisdom is knowing you know nothing

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: For Loops and Registers
« Reply #24 on: December 10, 2019, 09:44:50 am »
Quote
Oh, cool! The docs said: "It is useful for time measurements, but no assumptions should be made as to the interval between the ticks."

For the upcoming 3.2 release the docs were updated some months ago (not yet published). The entry now reads like this:
Quote
GetTickCount64 returns an increasing clock tick count in milliseconds.
It is useful for time measurements, but no assumptions should be made as to the interval between the ticks.

This means that the unit is always considered to be milliseconds, but the accuracy depends on the platform. E.g. if it were used on a hypothetical platform that only supports a timer with a period of one second, then you'd only get multiples of 1000 as a result.
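
To see what step size a given platform actually delivers, one can spin until the counter changes twice and print the difference. This is a rough sketch rather than a proper benchmark; the program name is made up, and the busy-wait loops assume the counter does advance.

```pascal
program TickGranularity;
{$mode objfpc}

uses
  SysUtils;  // GetTickCount64

var
  Previous, Current: QWord;
begin
  // Wait for the next tick edge so the measurement starts on a boundary.
  Previous := GetTickCount64;
  repeat
    Current := GetTickCount64;
  until Current <> Previous;

  // Now wait for the following edge; the difference is the observed step.
  Previous := Current;
  repeat
    Current := GetTickCount64;
  until Current <> Previous;

  WriteLn('Observed tick step: ', Current - Previous, ' ms');
end.
```

A step of 1 suggests a millisecond-resolution source, while larger values (for example 10 or 16) indicate a coarser system timer. Intervals shorter than that step cannot be timed reliably with a single measurement.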

Quote
How trustworthy are the docs?  Apple's are pretty awful, even to the point of just being flat out wrong, and not because they're outdated.

Like everything in FPC and Lazarus, they are written by volunteers. So they may be incomplete or could be improved, but they shouldn't be wrong. However, unlike with Apple, you can post a bug report, and more often than not they are fixed rather quickly (though the changes will only be visible in the next release).

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: For Loops and Registers
« Reply #25 on: December 10, 2019, 09:51:09 am »
Quote
Here's my code in C/C++. I just use it as static methods.  It's been specifically stripped down to only work on 1024 sample windows. Unfortunately, the original has no attribution.

Thanks. I filed it for future reference. But yeah, it is really power of 2 only, not mixed radix.

Currently the 400 samples are related to camera and lighting (400/10s = 40/s), and while it would be a whale to change, you never know long term. In general, though, the various physical factors get priority over the sample size.

I got my mixed radix from this page https://www.simdesign.nl/components.html   but originally I did only 10/frame, so performance was not THAT important.

Meanwhile I tripled the number of cameras and it is going in the direction of hundreds/frame.
« Last Edit: December 10, 2019, 10:46:21 am by marcov »

syntonica

  • Full Member
  • ***
  • Posts: 120
Re: For Loops and Registers
« Reply #26 on: December 10, 2019, 05:08:06 pm »
40 seems an odd number for video, though. I'm used to the usual suspects when it comes to frame rates. Also, I'm assuming that FFT is used to transform luminance and color channels to the frequency domain? So much interesting stuff out there, so little time to look at it all!

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11383
  • FPC developer.
Re: For Loops and Registers
« Reply #27 on: December 10, 2019, 05:55:20 pm »
Quote
40 seems an odd number for video, though.

These are industrial cameras doing real frames, not compressed streams.

Quote
I'm used to the usual suspects when it comes to frame rates. Also, I'm assuming that FFT is used to transform luminance and color channels to the frequency domain? So much interesting stuff out there, so little time to look at it all!

No. It is to filter image primitives like edges. It works because the camera is looking at an object on a turntable, so there is some sinusoidal tendency in the data, and discarding the higher-order terms acts as a noise filter that doesn't change the amplitude too much (which most forms of averaging filters do). The exact amplitude is important to correct for the variable perspective of off-center objects.
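
The idea of keeping only the low-order terms can be sketched as follows. This is not the code from the thread: it uses a naive O(N^2) DFT instead of a mixed-radix FFT, the cutoff KeepTerms is a hypothetical value, and the test signal is synthetic.

```pascal
program HarmonicFilter;
{$mode objfpc}

const
  N         = 400;  // window size mentioned in the thread
  KeepTerms = 3;    // hypothetical cutoff: keep harmonics 0..KeepTerms

type
  TSamples = array[0..N - 1] of Double;

// Rebuilds Src from its lowest harmonics only (naive DFT, for illustration).
procedure LowOrderFilter(const Src: TSamples; out Dst: TSamples);
var
  K, T: Integer;
  Re, Im, Angle, Scale: Double;
begin
  for T := 0 to N - 1 do
    Dst[T] := 0.0;
  for K := 0 to KeepTerms do
  begin
    // Forward DFT for bin K: X[K] = sum of Src[T] * exp(-i * Angle)
    Re := 0.0;
    Im := 0.0;
    for T := 0 to N - 1 do
    begin
      Angle := 2.0 * Pi * K * T / N;
      Re := Re + Src[T] * Cos(Angle);
      Im := Im - Src[T] * Sin(Angle);
    end;
    // Inverse: for real input, bin K and its mirror N-K add up to a real term.
    if K = 0 then
      Scale := 1.0 / N
    else
      Scale := 2.0 / N;
    for T := 0 to N - 1 do
    begin
      Angle := 2.0 * Pi * K * T / N;
      Dst[T] := Dst[T] + Scale * (Re * Cos(Angle) - Im * Sin(Angle));
    end;
  end;
end;

var
  Clean, Noisy, Filtered: TSamples;
  T: Integer;
  MaxErr: Double;
begin
  // Synthetic data: one slow sinusoid plus a high-frequency disturbance.
  for T := 0 to N - 1 do
  begin
    Clean[T] := 2.0 * Sin(2.0 * Pi * T / N);
    Noisy[T] := Clean[T] + 0.3 * Sin(2.0 * Pi * 50 * T / N);
  end;

  LowOrderFilter(Noisy, Filtered);

  // Compare the filtered result with the undisturbed sinusoid.
  MaxErr := 0.0;
  for T := 0 to N - 1 do
    if Abs(Filtered[T] - Clean[T]) > MaxErr then
      MaxErr := Abs(Filtered[T] - Clean[T]);
  WriteLn('Max deviation from the clean signal: ', MaxErr:0:12);
end.
```

In this synthetic case the disturbance sits entirely in a discarded bin, so the deviation should be close to zero and the amplitude of the slow sinusoid is preserved. A moving average over the same data would also attenuate the sinusoid itself.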

 
