Author Topic: efficiency problem  (Read 51172 times)

Paolo

  • Hero Member
  • *****
  • Posts: 579
Re: efficiency problem
« Reply #135 on: March 14, 2025, 07:46:05 pm »
@LV is it multithreaded?

Thaddy

  • Hero Member
  • *****
  • Posts: 16763
  • Ceterum censeo Trump esse delendam
Re: efficiency problem
« Reply #136 on: March 14, 2025, 07:52:08 pm »
Did you even read the code?

Quote
This has nothing to do with static/dynamic memory allocation - it's about fixed-size, contiguous, true 2D arrays vs. dynamically-sized arrays of arrays,
Yes it does. It even implies what I wrote.
Quote
the latter of which requires an extra pointer dereference and may have terrible memory locality.
NO. There is no difference once the memory is allocated. None whatsoever. There are only indirections by way of the programmer, not the compiler.
Changing servers. thaddy.com may be temporarily unreachable but will be restored when the domain name transfer is done.

Thaddy

  • Hero Member
  • *****
  • Posts: 16763
  • Ceterum censeo Trump esse delendam
Re: efficiency problem
« Reply #137 on: March 14, 2025, 07:55:05 pm »
@LV is it multithreaded?
Multithreading is not the holy grail. If the number of threads exceeds the CPU count, the code is likely to become less efficient, especially if it relies on a combined result.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5930
  • Compiler Developer
Re: efficiency problem
« Reply #138 on: March 14, 2025, 07:57:49 pm »
Quote
the latter of which requires an extra pointer dereference and may have terrible memory locality.
NO. There is no difference once the memory is allocated. None whatsoever. There are only indirections by way of the programmer, not the compiler.

You are wrong. A static multidimensional array is one consecutive block in memory, so only a single memory access is needed. In a multidimensional dynamic array each sub-array is itself a pointer, so accessing a value in an N-dimensional dynamic array takes N memory accesses.
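
The difference described above can be observed directly. The following is only a minimal sketch, assuming FPC in objfpc mode; the program name, the type names and the 5x10 dimensions are made up for illustration. It prints the distance between consecutive rows of a static matrix and the start address of each row of a dynamic one.

```pascal
program ArrayLayoutDemo;
{$mode objfpc}

const
  Rows = 5;   // illustrative sizes, not taken from the benchmark
  Cols = 10;

type
  TStaticMatrix = array[0..Rows - 1, 0..Cols - 1] of Double;
  TDynMatrix = array of array of Double;

var
  S: TStaticMatrix;
  D: TDynMatrix;
  i: Integer;
begin
  SetLength(D, Rows, Cols);

  // Static matrix: a single block, so each row starts exactly
  // Cols * SizeOf(Double) bytes after the previous one.
  for i := 1 to Rows - 1 do
    WriteLn('static  row ', i, ' is ',
      PtrUInt(@S[i, 0]) - PtrUInt(@S[i - 1, 0]),
      ' bytes after row ', i - 1);

  // Dynamic matrix: D points to an array of row pointers, and every
  // D[i] points to its own heap block, so the addresses need not be
  // evenly spaced or even close to each other.
  for i := 0 to Rows - 1 do
    WriteLn('dynamic row ', i, ' starts at ', HexStr(Pointer(D[i])));

  WriteLn('SizeOf(S) = ', SizeOf(S), ' bytes (the whole matrix)');
  WriteLn('SizeOf(D) = ', SizeOf(D), ' bytes (a single pointer)');
end.
```

The static rows should always be the same distance apart. The dynamic row addresses depend on the heap manager and may look regular in a small fresh program, but nothing guarantees that.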

Thaddy

  • Hero Member
  • *****
  • Posts: 16763
  • Ceterum censeo Trump esse delendam
Re: efficiency problem
« Reply #139 on: March 14, 2025, 08:16:06 pm »
That should only matter on relocation (ragged array as result) but never on initialization. The memory layout should be the same.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5930
  • Compiler Developer
Re: efficiency problem
« Reply #140 on: March 14, 2025, 08:24:38 pm »
It has nothing to do with relocation. It's how dynamic arrays work.

If you initialize a 2-dimensional array, let's say to dimensions of length 5 and 10, then the RTL will do 6 memory allocations (1 for the first dimension and 5 for the arrays of the second dimension), and all of these might reside in different parts of the heap. Thus you will always have two dereferences (in this example).
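
As a rough illustration of the 5x10 example: the sketch below puts the nested dynamic array next to a flat one-dimensional array indexed as `i * Cols + j`. The flat layout is a common workaround and an assumption on my part, not something proposed in this thread; all identifiers are hypothetical.

```pascal
program FlatMatrixDemo;
{$mode objfpc}

const
  Rows = 5;
  Cols = 10;

var
  Nested: array of array of Double;  // 1 + Rows = 6 allocations
  Flat: array of Double;             // 1 allocation, contiguous
  i, j: Integer;
begin
  SetLength(Nested, Rows, Cols);
  SetLength(Flat, Rows * Cols);

  for i := 0 to Rows - 1 do
    for j := 0 to Cols - 1 do
    begin
      // Loads the row pointer Nested[i] first, then indexes into that row.
      Nested[i, j] := i * 100 + j;
      // Computes one offset into a single block.
      Flat[i * Cols + j] := i * 100 + j;
    end;

  // Both layouts must hold the same logical matrix.
  for i := 0 to Rows - 1 do
    for j := 0 to Cols - 1 do
      if Nested[i, j] <> Flat[i * Cols + j] then
        Halt(1);

  WriteLn('Nested and flat storage hold the same values.');
  WriteLn('Length(Nested) = ', Length(Nested),
    ', Length(Nested[0]) = ', Length(Nested[0]));
  WriteLn('Length(Flat)   = ', Length(Flat));
end.
```

The flat version keeps the run-time sizing of a dynamic array while storing all elements in one block, at the cost of writing the index arithmetic by hand.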

ALLIGATOR

  • Full Member
  • ***
  • Posts: 148
Re: efficiency problem
« Reply #141 on: March 15, 2025, 02:50:47 am »
If you initialize a 2-dimensional array, let's say to dimensions of length 5 and 10 then the RTL will do 6 memory allocations (1 for the first dimension and 5 for the arrays of the second dimension) and all these might reside in different parts of the heap.
I don't use n-dimensional dynamic arrays (n > 1), but I also thought that their memory was allocated once, in one contiguous chunk. Thank you, now I know about this behavior 🙂
(✍️ memorized)

photor

  • Jr. Member
  • **
  • Posts: 80
Re: efficiency problem
« Reply #142 on: March 15, 2025, 07:08:45 am »
As part of the exercise, I:
1. downloaded the libopenblas.dll file from https://sourceforge.net/projects/openblas/.
2. wrote a program.

Output (i7 8700; Windows 11; FPC 3.2.2):

Code:
OpenBLAS time: 16 ms
Naive time:    2140 ms
Max difference: 0.0000000000
Results are consistent!

It appears that I am approaching the performance limit (16 ms), at least on my machine.  ;)

OpenBLAS is automatically multi-threaded. Please set the environment variable
Quote
OPENBLAS_NUM_THREADS=1
to disable it.
« Last Edit: March 15, 2025, 07:11:56 am by photor »

Thaddy

  • Hero Member
  • *****
  • Posts: 16763
  • Ceterum censeo Trump esse delendam
Re: efficiency problem
« Reply #143 on: March 15, 2025, 07:15:44 am »
Amazed that I was wrong.

LV

  • Full Member
  • ***
  • Posts: 245
Re: efficiency problem
« Reply #144 on: March 15, 2025, 07:00:15 pm »
OpenBLAS is automatically multi-threaded. Please set the environment variable
Quote
OPENBLAS_NUM_THREADS=1
to disable it.

Yes, the OpenBLAS library does support multithreading. A procedure has been added to set the number of threads during execution, and the matrix sizes have been expanded to 3000x3000.

Code: Pascal
...
const
  N = 3000;

...
  procedure openblas_set_num_threads(num: cint); cdecl; external 'libopenblas.dll';

begin
...
  openblas_set_num_threads(1);
...
  openblas_set_num_threads(2);
...
  openblas_set_num_threads(3);
...
end.
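
The parts elided with `...` are not shown in the thread. For readers who want something to try, here is a separate, minimal sketch of calling OpenBLAS from FPC. It is not LV's program. It assumes that a `libopenblas.dll` matching the program's bitness is on the search path and that the library exports the standard CBLAS `cblas_dgemm` routine; the constants 101 and 111 are the usual CBLAS enum values for row-major order and "no transpose". The 2x2 matrices and the program name are made up for illustration.

```pascal
program DgemmSketch;
{$mode objfpc}

uses
  ctypes;  // provides cint and cdouble

const
  CblasRowMajor = 101;  // CBLAS_ORDER value for row-major storage
  CblasNoTrans  = 111;  // CBLAS_TRANSPOSE value for "do not transpose"
  N = 2;                // tiny size, for illustration only

// C := alpha * A * B + beta * C (standard CBLAS interface)
procedure cblas_dgemm(Order, TransA, TransB: cint;
  M, NCols, K: cint; alpha: cdouble;
  A: pcdouble; lda: cint;
  B: pcdouble; ldb: cint;
  beta: cdouble;
  C: pcdouble; ldc: cint); cdecl; external 'libopenblas.dll';

procedure openblas_set_num_threads(num: cint); cdecl; external 'libopenblas.dll';

var
  A: array[0..N * N - 1] of cdouble = (1, 2, 3, 4);  // row-major 2x2
  B: array[0..N * N - 1] of cdouble = (1, 0, 0, 1);  // identity
  C: array[0..N * N - 1] of cdouble = (0, 0, 0, 0);
  i: Integer;
begin
  openblas_set_num_threads(1);  // same call as in the post above

  cblas_dgemm(CblasRowMajor, CblasNoTrans, CblasNoTrans,
    N, N, N, 1.0, @A[0], N, @B[0], N, 0.0, @C[0], N);

  // B is the identity, so C should come back equal to A.
  for i := 0 to N * N - 1 do
    WriteLn('C[', i, '] = ', C[i]:0:1);
end.
```

Note that the matrices are passed as flat, contiguous buffers, which ties in with the array layout discussion earlier in the thread: a nested `array of array of Double` cannot be handed to `cblas_dgemm` directly.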

Output:  :-[

Code: Text
OpenBLAS NUM_THREADS=1: 890 ms
OpenBLAS NUM_THREADS=2: 469 ms
OpenBLAS NUM_THREADS=3: 312 ms
OpenBLAS NUM_THREADS=4: 250 ms
OpenBLAS NUM_THREADS=5: 235 ms
OpenBLAS NUM_THREADS=6: 234 ms
Naive Time : 111141 ms
Max difference: 0.0000000000
Results are consistent!
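
To read these timings more easily, the speedup relative to one thread and the per-thread efficiency can be derived from them. The small program below is just a convenience sketch; the numbers are copied from the output above, and the identifiers are hypothetical.

```pascal
program SpeedupTable;
{$mode objfpc}

const
  // Timings in ms for N = 3000, copied from the output above.
  TimesMs: array[1..6] of Integer = (890, 469, 312, 250, 235, 234);
  NaiveMs = 111141;

var
  t: Integer;
  Speedup: Double;
begin
  for t := Low(TimesMs) to High(TimesMs) do
  begin
    // Speedup relative to the single-threaded OpenBLAS run.
    Speedup := TimesMs[1] / TimesMs[t];
    // Efficiency = speedup divided by the number of threads used.
    WriteLn('threads=', t,
      '  speedup=', Speedup:0:2,
      '  efficiency=', (Speedup / t * 100):0:0, '%');
  end;

  WriteLn('naive / OpenBLAS (1 thread) = ',
    (NaiveMs / TimesMs[1]):0:0, 'x');
end.
```

The raw timings already show the trend: going from 4 to 6 threads saves only 16 ms, so most of the gain is reached by four threads on this machine.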

photor

  • Jr. Member
  • **
  • Posts: 80
Re: efficiency problem
« Reply #145 on: March 16, 2025, 05:26:02 am »
Yes, the OpenBLAS library does support multithreading. A procedure has been added to set the number of threads during execution, and the matrix sizes have been expanded to 3000x3000.

What's the result for N=1000, single-threaded?

LV

  • Full Member
  • ***
  • Posts: 245
Re: efficiency problem
« Reply #146 on: March 16, 2025, 08:57:13 am »
What's the result for N=1000, single-threaded?

For matrices with dimensions of N = 1000:

OpenBLAS NUM_THREADS=1: 31 ms

About two years ago, I chose FPC and Lazarus for various reasons. I have used Maxima for symbolic calculations and the DMath, LMath, and AlgLib libraries for algebraic computations. I now recognize the value of integrating OpenBLAS for tasks where performance is critical.

jamie

  • Hero Member
  • *****
  • Posts: 6867
Re: efficiency problem
« Reply #147 on: March 16, 2025, 12:33:38 pm »
Amazed that I was wrong.
Maybe you were.
Jamie
The only true wisdom is knowing you know nothing

PascalDragon

  • Hero Member
  • *****
  • Posts: 5930
  • Compiler Developer
Re: efficiency problem
« Reply #148 on: March 18, 2025, 08:48:11 pm »
Amazed that I was wrong.

Then you might be the only one that is amazed :P

 
