[SOLVED] Speed-up massive file writing.
johnsson:
--- Quote from: miki on August 23, 2016, 08:42:59 pm ---OK, people, summary (all tests on an SSD drive):
My last approach: 580~610ms
Phil's approach: 620~630ms
johnsson's approach: 460~480ms (the result is wrong)
Lanksen's approach: 435ms (Phil's + SetTextBuf)
Notes:
Yes, it seems SetTextBuf makes a difference. I don't like the remarks in the docs, but it's the fastest!
Phil's approach: faster now? Yes, I made a profiling error yesterday, sorry for that.
johnsson's: what is wrong? The line "Move(LineEnding...)" didn't compile for some reason. I guess we are using different OSes, and this constant may be a string for you and a char for me. I made a little change in the code to make it work, but introduced some mistake, because the final file is a bit bigger and has broken line breaks [get the joke? broken breaks]. Anyway, very fast, too.
Lots of thanks to all of you.
Regards.
--- End quote ---
I'm working with Win 10 64-bit and Lazarus 1.6, where LineEnding is just the constant #13#10 in the System unit. Btw, here the SetTextBuf approach takes more time to execute than my approach; also, I don't get any compiler error and the final file is correct. Maybe a few differences between our OSes and PC specs cause this.
PC Spec
Laptop Asus S400CA
Core I3 1.5ghz (2 Cores + 2 HT)
6GB DDR3 1600mhz
SSD 120gb PNY
Anyway, the most important thing is the better execution time compared to the Java version.
Free Pascal Rocks 8-)
One last comment: there are a few optimized routines for converting Int types to strings using assembly and SIMD instructions; using them would probably give an even better execution time.
:D
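To give an idea of what a specialized Int-to-string routine can look like, here is a minimal sketch in plain Pascal. It is not the assembly/SIMD kind of routine mentioned above, just a hand-written conversion that puts the digits straight into a preallocated buffer, so no temporary string is created. The name WriteInt is hypothetical (it is not part of the RTL), and whether it actually beats Str on a given machine would need to be measured.

```pascal
program inttobuf;
{$mode objfpc}{$H+}

// Hypothetical helper, not part of the RTL.
// Writes the decimal form of N starting at P and returns the position
// just past the last character written (at most 11 characters).
function WriteInt(N: Integer; P: PChar): PChar;
var
  Tmp: array[0..9] of Char; // a 32-bit value has at most 10 digits
  U: Cardinal;
  Len: Integer;
begin
  if N < 0 then
  begin
    P^ := '-';
    Inc(P);
    U := Cardinal(-Int64(N)); // go through Int64 so Low(Integer) is handled
  end
  else
    U := Cardinal(N);

  // Produce the digits in reverse order, least significant first.
  Len := 0;
  repeat
    Tmp[Len] := Chr(Ord('0') + U mod 10);
    U := U div 10;
    Inc(Len);
  until U = 0;

  // Copy them out in the right order.
  while Len > 0 do
  begin
    Dec(Len);
    P^ := Tmp[Len];
    Inc(P);
  end;
  Result := P;
end;

var
  Buf: array[0..31] of Char;
  E: PChar;
begin
  E := WriteInt(-12345, @Buf[0]);
  E^ := #0; // terminate so the buffer can be printed as a PChar
  WriteLn(PChar(@Buf[0]));
end.
```

In a loop that fills a preallocated memory buffer, a call like P := WriteInt(numbers[i], P) would take the place of the Str + Move pair.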
marcov:
--- Quote from: User137 on August 24, 2016, 04:41:33 am ---Is there a typo in the docs? http://www.freepascal.org/docs-html/rtl/system/settextbuf.html
--- Quote ---The maximum size of the newly assigned buffer is 65355 bytes.
--- End quote ---
Should be 65535? (2^16 - 1)
--- End quote ---
Maybe not, if a 16-bit value is used to hold the buffer size. My guess, however, is that it is an old TP leftover and that FPC accepts larger ones. Delphi allows buffers of MBs and larger (not that it matters much).
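A quick way to try this out is to hand SetTextBuf a buffer well above 64 KB and check that the file comes out right. The sketch below assumes that FPC really accepts such a size, which is only the guess stated above, not something the docs promise. The buffer size and the file name are arbitrary choices for illustration.

```pascal
program bigtextbuf;
{$mode objfpc}{$H+}

const
  BufSize = 1024 * 1024; // 1 MiB: above the documented limit, so an assumption

var
  // Global on purpose: a local array of this size could overflow the stack,
  // and the buffer has to stay valid until the file is closed.
  Buf: array[0..BufSize - 1] of Byte;

procedure ArrayToFile(const Numbers: array of Integer; const FileName: string);
var
  F: TextFile;
  I: Integer;
begin
  AssignFile(F, FileName);
  SetTextBuf(F, Buf, SizeOf(Buf)); // assign the buffer before opening the file
  Rewrite(F);
  try
    for I := 0 to High(Numbers) do
      WriteLn(F, Numbers[I]);
  finally
    CloseFile(F); // flushes whatever is still in the buffer
  end;
end;

begin
  ArrayToFile([1, -2, 300000], 'numbers.txt');
end.
```

If the resulting file is complete and correct, the larger buffer was accepted; whether it is also faster than a 64 KB one is a separate question for the profiler.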
miki:
--- Quote from: johnsson on August 24, 2016, 05:58:55 am ---I'm working with Win 10 64-bit and Lazarus 1.6, where LineEnding is just the constant #13#10 in the System unit. Btw, here the SetTextBuf approach takes more time to execute than my approach; also, I don't get any compiler error and the final file is correct. Maybe a few differences between our OSes and PC specs cause this.
PC Spec
Laptop Asus S400CA
Core I3 1.5ghz (2 Cores + 2 HT)
6GB DDR3 1600mhz
SSD 120gb PNY
--- End quote ---
Similar computer here, but a desktop, not a laptop, running Manjaro Linux 64-bit. According to the documentation, LineEnding is system dependent.
In my case (Linux), it's just #10, so it may be seen as a char, while your Windows #13#10 is a string.
Anyway, the wrong file I was getting was my fault: a wrong adaptation of your code, as I realized later. Here is the correct one:
--- Code: Pascal ---
procedure arrayToFile(numbers: array of Integer; fileName: String);
var
  Data: TMemoryStream;
  I, Ll: Integer;
  P, S: Pointer;
  V, L: string;
begin
  Data := TMemoryStream.Create;
  L := LineEnding;   // <-- if it's a char, now it's a string
  Ll := Length(L);   // <-- 1 on *NIX, 2 on Windows
  // As numbers are signed 32-bit integers, they will never be longer than
  // 11 characters (sign + 10 digits) + LineEnding.
  Data.SetSize((11 + Ll) * Length(numbers));
  P := Data.Memory;
  S := P;
  for I := 0 to High(numbers) do
  begin
    System.Str(numbers[I], V);
    Move(V[1], P^, Length(V));
    P += Length(V);
    Move(L[1], P^, Ll); // copy the characters of L, not the string variable itself
    P += Ll;            // <--- my error was here, leaving your "2"; that's the wrong thing with magic numbers :)
  end;
  Data.SetSize(P - S);  // trim to the bytes actually written
  Data.SaveToFile(fileName);
  Data.Free;
end;
--- End code ---
FreePascal faster than Java? Well, I expected so, but it seems I have to do some tricks to achieve that!
I also did the reverse function (fileToArray), and to get a fast result, I had to write my own StrToInt. Special one, because it reads.
First attempt, with TStringList + StrToInt loop, 2.8s
--- Code: Pascal ---
function fileToArray(fileName: String): TIntArray;
var
  list: TStrings;
  i, len: Integer;
begin
  list := TStringList.Create;
  list.LoadFromFile(fileName);
  len := list.Count - 1;
  setLength(result, len + 1);
  for i := 0 to len do
  begin
    result[i] := StrToInt(list[i]);
  end;
  list.Free;
end;
--- End code ---
Second attempt, using AssignFile, ReadLn, ...: 1.2~1.6s. The magic number makes a big difference: a bigger one is faster. SetTextBuf is relevant, but not that much.
--- Code: Pascal ---
function fileToArray(fileName: String): TIntArray;
var
  F: TextFile;
  i, bufs, s: Integer;
  buf: array[0..65535] of char;
begin
  AssignFile(F, fileName);
  SetTextBuf(F, buf[0], sizeof(buf));
  Reset(F);
  i := 0;
  bufs := 1;
  s := bufs*10000000;
  SetLength(result, s);
  while not eof(F) do
  begin
    ReadLn(F, result[i]);
    Inc(i);
    if i = s then
    begin
      Inc(bufs);
      s := bufs*10000000;
      SetLength(result, s);
    end;
  end;
  CloseFile(F);
  setLength(result, i);
end;
--- End code ---
And the winner, using a stream: the fastest, but longer and with a larger memory footprint, 350ms!
--- Code: Pascal ---
function fileToArray(fileName: String): TIntArray;
var
  fs: TFileStream;
  i, bufs, s: Integer;
  n, num, sign: Integer;
  ss: string;
  c: char;
  skip: boolean;
begin
  // Load the whole file into a string with a single read.
  fs := TFileStream.Create(fileName, fmOpenRead);
  setLength(ss, fs.size);
  if fs.size > 0 then // ss[1] is not valid for an empty file
    fs.read(ss[1], fs.size);
  fs.free;
  i := 0;
  bufs := 1;
  s := bufs*10000000;
  SetLength(result, s);
  num := 0;
  sign := 1;
  skip := true; // true while we are between two numbers
  for n := 1 to Length(ss) do
  begin
    c := ss[n];
    if c = '-' then
    begin
      skip := false;
      sign := -1;
    end
    else if c in ['0'..'9'] then
    begin
      skip := false;
      num := num*10 + (ord(c) - 48);
    end
    else if not skip then
    begin
      // Any other character (e.g. a line break) ends the current number.
      skip := true;
      result[i] := num * sign;
      sign := 1;
      num := 0;
      Inc(i);
      if i = s then
      begin
        Inc(bufs);
        s := bufs*10000000;
        SetLength(result, s);
      end;
    end;
  end;
  // Store a last number that is not followed by a line break.
  if not skip then
  begin
    result[i] := num * sign;
    Inc(i);
  end;
  setLength(result, i);
end;
--- End code ---
The Java version is not that fast, but it still competes and is much cleaner. It takes about 900ms.
--- Code: Java ---
// Needs java.io.BufferedReader, java.io.FileReader and java.io.IOException.
public int[] fileToArray(String fileName) throws IOException {
    try (BufferedReader br = new BufferedReader(new FileReader(fileName))) {
        int[] res = br.lines()
                      .mapToInt(Integer::parseInt)
                      .toArray();
        return res;
    }
}
--- End code ---
Comparisons are not fair, anyway. Java doesn't have to keep compatibility with old code, Delphi and other stuff, and FreePascal doesn't have Java's budget.
What do you say?
Regards.
marcov:
--- Quote from: miki on August 24, 2016, 09:28:41 pm ---
Comparisons are not fair, anyway. Java doesn't have to keep compatibility with old code, Delphi and other stuff, and FreePascal doesn't have Java's budget.
What do you say?
--- End quote ---
And it is just one terribly small piece, aka microbenchmarking. Make an application that actually does something.
A task that maps (pun intended) nearly wholly onto a library or language feature will look shorter using that.
Leledumbo:
--- Quote from: miki on August 24, 2016, 09:28:41 pm ---The Java version is not that fast, but it still competes and is much cleaner. It takes about 900ms.
--- End quote ---
It's not difficult to make a Pascal version that looks like the Java one. TReadBufStream is practically the counterpart of BufferedReader, as TFileStream is of FileReader. Their interfaces are different, though. TReadBufStream has no lines() method, but extending the class with such a method is not difficult (just use array of String or TStrings for the return value): with the help of the StreamIO unit, just ReadLn until EOF. Likewise, mapToInt() can be implemented using a type helper (for array of String) or by extending TString(List) with such a method. There is no need for toArray(), as it's better for the method to directly return an array (of Integer) instead of another stream (of int).
Wanna try creating and benchmarking that version?
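For anyone who wants a starting point, the idea could be sketched roughly as below. This is untested against the benchmark and only one possible reading of the description above. The names TLineReader, Lines, TStringsIntHelper and MapToInt are hypothetical, TIntArray is assumed to be an array of Integer as in the earlier snippets, and the TStrings variant (with a class helper) is used instead of a type helper for array of String.

```pascal
program linesdemo;
{$mode objfpc}{$H+}

uses
  Classes, SysUtils, bufstream, StreamIO;

type
  TIntArray = array of Integer;

  // Hypothetical subclass: adds a Lines method in the spirit of
  // BufferedReader.lines(). The caller frees the returned list.
  TLineReader = class(TReadBufStream)
  public
    function Lines: TStrings;
  end;

  // Hypothetical class helper standing in for mapToInt().
  TStringsIntHelper = class helper for TStrings
    function MapToInt: TIntArray;
  end;

function TLineReader.Lines: TStrings;
var
  F: TextFile;
  S: string;
begin
  Result := TStringList.Create;
  try
    AssignStream(F, Self); // StreamIO: read this stream as a text file
    Reset(F);
    try
      while not EOF(F) do
      begin
        ReadLn(F, S);
        Result.Add(S);
      end;
    finally
      CloseFile(F);
    end;
  except
    Result.Free;
    raise;
  end;
end;

function TStringsIntHelper.MapToInt: TIntArray;
var
  I: Integer;
begin
  SetLength(Result, Count);
  for I := 0 to Count - 1 do
    Result[I] := StrToInt(Strings[I]);
end;

function FileToArray(const FileName: string): TIntArray;
var
  FS: TFileStream;
  Reader: TLineReader;
  L: TStrings;
begin
  FS := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    Reader := TLineReader.Create(FS);
    try
      L := Reader.Lines;
      try
        Result := L.MapToInt; // directly an array, no toArray() step
      finally
        L.Free;
      end;
    finally
      Reader.Free;
    end;
  finally
    FS.Free;
  end;
end;

var
  Src: TStringList;
  A: TIntArray;
  I: Integer;
begin
  // Write a tiny sample file, then read it back.
  Src := TStringList.Create;
  try
    Src.Add('10');
    Src.Add('-20');
    Src.Add('30');
    Src.SaveToFile('sample.txt');
  finally
    Src.Free;
  end;
  A := FileToArray('sample.txt');
  for I := 0 to High(A) do
    WriteLn(A[I]);
end.
```

It goes through an intermediate TStrings, much like the first TStringList attempt earlier in the thread, so it is meant to show the shape of the code rather than to win the benchmark.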