Recent

Author Topic: [SOLVED] Why can't I beat Python on generating 1 Billion Rows?  (Read 9778 times)

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1342
  • Professional amateur ;-P
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #30 on: March 03, 2024, 01:54:13 pm »
Hey Warfley,

I was able to get perf to work with FlameGraph and I'm posting the image on this post, because I'm too stooopid to get any insight from it, so maybe someone can give me some help...

Simple your code is a call graph bottom to top. So you can see your code spends most of it's time doing a syscall, which is the write syscall. So most of the time is spent writing to the filesystem. Thats pretty much it.

Which makes a ton of sense because I did the stooopid mistake of not altering my code to use the buffered write !!!!

I'm gonna test now with the REAL buffered stream !!!!

Cheers,
Gus

Warfley

  • Hero Member
  • *****
  • Posts: 2038
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #31 on: March 03, 2024, 02:14:41 pm »
But I think this is also then a very good example on how performance is much more than "language fast" or "optimizer good". Like if all you do is just a bunch of OS syscalls like the write in your case, your pascal code is not going to be faster than python or probably even bash for that matter.

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1342
  • Professional amateur ;-P
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #32 on: March 03, 2024, 02:25:50 pm »
Hey Warfley,

But I think this is also then a very good example on how performance is much more than "language fast" or "optimizer good". Like if all you do is just a bunch of OS syscalls like the write in your case, your pascal code is not going to be faster than python or probably even bash for that matter.

Yeap, makes a lot of sense !!

I think I need to devise a way to write chunks instead of every line!!

Thanks for that!!

Cheers,
Gus

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 12098
  • Debugger - SynEdit - and more
    • wiki
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #33 on: March 03, 2024, 02:47:24 pm »
https://apps.kde.org/kcachegrind/

And it has many other views...

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1342
  • Professional amateur ;-P
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #34 on: March 03, 2024, 03:12:17 pm »
Hey Martin_fr,

https://apps.kde.org/kcachegrind/

And it has many other views...

I'm aware of it since you mentioned it before.
But I'm on Ubuntu, which is GNOME bases and when I try to install that, it comes with a stooopid amount of KDE dependencies, which I would really like to avoid.

Are you aware of a Qt version that I could use?

Nonetheless, many thanks !!

Cheers,
Gus

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1342
  • Professional amateur ;-P
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #35 on: March 03, 2024, 03:18:41 pm »
Hey Y'all,

After correcting my mistake about not using the buffered stream, this is where I'm at.

With buffered stream's capacity of 4*1024:
Code: Bash  [Select][+][-]
  1. $ time ./bin/generator -i data/weather_stations.csv -o /tmp/measurements-1_000_000_000.txt -n 1_000_000_000
  2. Input File : /home/gcarreno/Programming/1brc-ObjectPascal/data/weather_stations.csv
  3. Output File: /tmp/measurements-1_000_000_000.txt
  4. Line Count : 1,000,000,000
  5.  
  6. Building Weather Stations...
  7. Done.
  8.  
  9. [##################################################] 100.00 % done.
  10.  
  11. real    20m53.633s
  12. user    20m29.230s
  13. sys     0m24.062s

With buffered stream's capacity of 64*1024:
Code: Bash  [Select][+][-]
  1. $ time ./bin/generator -i data/weather_stations.csv -o /tmp/measurements-1_000_000_000.txt -n 1_000_000_000
  2. Input File : /home/gcarreno/Programming/1brc-ObjectPascal/data/weather_stations.csv
  3. Output File: /tmp/measurements-1_000_000_000.txt
  4. Line Count : 1,000,000,000
  5.  
  6. Building Weather Stations...
  7. Done.
  8.  
  9. [##################################################] 100.00 % done.
  10.  
  11. real    20m46.110s
  12. user    20m29.452s
  13. sys     0m16.467s

With buffered stream's capacity of 20*1024*1024:
Code: Bash  [Select][+][-]
  1. $ time ./bin/generator -i data/weather_stations.csv -o /tmp/measurements-1_000_000_000.txt -n 1_000_000_000
  2. Input File : /home/gcarreno/Programming/1brc-ObjectPascal/data/weather_stations.csv
  3. Output File: /tmp/measurements-1_000_000_000.txt
  4. Line Count : 1,000,000,000
  5.  
  6. Building Weather Stations...
  7. Done.
  8.  
  9. [##################################################] 100.00 % done.
  10.  
  11. real    20m11.721s
  12. user    19m58.077s
  13. sys     0m13.559s

Nothing of note in the different value of the buffered stream's capacity.
I'll leave it at 64K for the time being.


I'm now gonna attempt to use a TStringStream to write the values, and then, for every 10k lines, dump it on the buffered stream.
And then, just for the sake of completion, remove the buffered stream as the middle man and see if that brings a difference.


Cheers,
Gus

Warfley

  • Hero Member
  • *****
  • Posts: 2038
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #36 on: March 03, 2024, 03:22:53 pm »
Try doing the profiling again after you now optimized around 2/3 of your runtime. This makes the things that previously where to "small" to notice (compared to the giant writing block) more visible.

Also wrt. your previous measurement, what I noticed is that it does not show any of your actual functions but only the FPC functions. Can it be that you have very high optimization on? Or that you don't have debug symbols enabled?

You could also possibly upload your profiling results directly, so we can inspect them directly rather than only through the summary graph.

JuhaManninen

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 4673
  • I like bugs.
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #37 on: March 03, 2024, 03:23:53 pm »
Keeping the list sorted will continuously move big chunks of memory to make room for add(). Maybe using of a hash map for checking duplicates and simply append to the end of the list? Sorting just once at the end (if needed)? 
+1
I would guess this was the slowest part. Adding items to a long sorted list is super-slow.
AvgLvlTree (in LazUtils) has TStringMap. It is a balanced tree. Adding items is very fast. It is more memory efficient than a hash map.
If you need the original order of your list, there is a nice hybrid container LookupStringList. It has the TStrings API but uses TStringMap for fast lookups.
Mostly Lazarus trunk and FPC 3.2 on Manjaro Linux 64-bit.

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #38 on: March 03, 2024, 03:29:08 pm »
or using settextbuf() on the output file descriptor ?
Old one but good one.
Also note the writestr/write/writeln formatting is MUCH faster than the formatxx functions.
Only almost everybody forgets that.
« Last Edit: March 03, 2024, 03:34:11 pm by Thaddy »
If Europe sells their USA bonds the USD will collapse. Europe can affort that given average state debts. The USA can't affort that. Just an advice...

vfclists

  • Hero Member
  • *****
  • Posts: 1157
    • HowTos Considered Harmful?
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #39 on: March 03, 2024, 03:34:30 pm »
>outputFileStream.WriteBuffer(line[1], Length(line));


You create an outputBufWriter, but do not use it

Doesn't this vindicate the question Why do Lazarus users share code via zip files rather than repo links?

I guess sharing may not be the right word because the intent is to get others to review code rather than give them a bundle of code they can incorporate into their own programs. But where reviewing the code is important repo links are probably better.

Not to mention, how can you get CoPilot to help with your Pascal code if you don't use Github or some other repo hosting facility? :) :)

Hopefully this comment will not digress this thread on a wild tangent. ;D
Lazarus 3.0/FPC 3.2.2

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1342
  • Professional amateur ;-P
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #40 on: March 03, 2024, 03:35:40 pm »
Hey Warfley,

Try doing the profiling again after you now optimized around 2/3 of your runtime. This makes the things that previously where to "small" to notice (compared to the giant writing block) more visible.

Yeaps, I will do that cuz it makes sense to now redo it due to my MASSIVE stupidity  :-[

Also wrt. your previous measurement, what I noticed is that it does not show any of your actual functions but only the FPC functions. Can it be that you have very high optimization on? Or that you don't have debug symbols enabled?

I was kinda doing stuff blindly and just used the stuff that comes with the Default Build Mode.
I'll have a better look at the switches and redo the perf run.

You could also possibly upload your profiling results directly, so we can inspect them directly rather than only through the summary graph.

I'll do that. The valgring file is not so massive(I think...), but the perf one is just too big to even think putting it here.

Cheers,
Gus

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1342
  • Professional amateur ;-P
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #41 on: March 03, 2024, 03:45:00 pm »
Hey Y'All,

For those of you still stuck on optimising the weather station list:
Code: Bash  [Select][+][-]
  1. $ time ./bin/generator -i data/weather_stations.csv -o /tmp/measurements-10_000_000.txt -n 10_000_000
  2. Input File : /home/gcarreno/Programming/1brc-ObjectPascal/data/weather_stations.csv
  3. Output File: /tmp/measurements-10_000_000.txt
  4. Line Count : 10,000,000
  5.  
  6. Building Weather Stations...
  7. Done: Processed 44691 entries from a total of 41343 weather stations in 183 ms
  8.  
  9. [##################################################] 100.00 % done.
  10.  
  11. real    0m12.597s
  12. user    0m12.480s
  13. sys     0m0.116s

Can we agree that 183 millisecs is a good time for ~44k entries ?
Even with the internal sorting being turned on right after creation and with a capacity of 50K?

The access of the weather station list is then accessed via a randomly generated index, not by it's name.
So it makes no sense that I use any sort of hash based container, since I'm not using a string to get the name, amirite?! :D

Cheers,
Gus

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1342
  • Professional amateur ;-P
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #42 on: March 03, 2024, 03:50:32 pm »
Hey Y'All,

While I DO appreciate, very much, all the wonderful tips about how to cut on time for Write/WriteLn, that is an issue I solved a long time ago:
  • I'm only dumping the progress bar at each 10% of the total done.
This means that, no matter, the total amount of rows, it will only be called 10 times, no more, no less!!!
I don't think that 10 total calls to the progress bar is gonna impact when we're performing 1 Billion rows, right?

Cheers,
Gus

Thaddy

  • Hero Member
  • *****
  • Posts: 18729
  • To Europe: simply sell USA bonds: dollar collapses
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #43 on: March 03, 2024, 03:52:21 pm »
Then I will do it myself as a console app. I am sure I can equal or better Python.
Job for tomorrow.
If Europe sells their USA bonds the USD will collapse. Europe can affort that given average state debts. The USA can't affort that. Just an advice...

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1342
  • Professional amateur ;-P
Re: Why can't I beat Python on generating 1 Billion Rows?
« Reply #44 on: March 03, 2024, 04:07:19 pm »
Hey Thaddy,

Then I will do it myself as a console app. I am sure I can equal or better Python.
Job for tomorrow.

This is AWESOME!!!

I welcome any competition, mainly because I'm definitely gonna learn something from your, or anyone else's, approach.

I'm now a bit stuck in my own rut, and maybe I can't see the trees for the forest :D

Oh, an please, then follow up with the actual challenge. Would be awesome to have peeps from this forum dropping in their entry!!!

Cheers,
Gus

 

TinyPortal © 2005-2018