[SOLVED] Speed up massive file writing.
miki:
Hi everybody :D
I'm writing a program with some tests and PoCs, in both FreePascal [3.0] and Java [1.8].
I have a function to write integer arrays into text files (each line is a number). The Java code:
--- Code: Java ---
public void arrayToFile(int[] array, String fileName) throws IOException {
    BufferedWriter writer = new BufferedWriter(new FileWriter(new File(fileName)));
    for (int number : array) {
        writer.write(Integer.toString(number));
        writer.newLine();
    }
    writer.close();
}
--- End code ---
This code takes 700~800ms to save 10,000,000 integers, creating a 110MB file.
When writing this in FreePascal, I first came up with this solution:
--- Code: Pascal ---
procedure arrayToFile(numbers: Array of Integer; fileName: String);
var
  list: TStrings;
  i, len: Integer;
begin
  list := TStringList.Create;
  len := Length(numbers) - 1;
  for i := 0 to len do
  begin
    list.Add(IntToStr(numbers[i]));
  end;
  list.SaveToFile(FileName);
  list.Free;
end;
--- End code ---
It took 2.2~2.8 seconds. Setting list.Capacity or list.Begin/EndUpdate had no measurable effect.
Searching for something better, I tried this approach:
--- Code: Pascal ---
procedure arrayToFile(numbers: Array of Integer; fileName: String);
var
  data: TMemoryStream;
  i, len: Integer;
  Str: String;
begin
  data := TMemoryStream.Create;
  len := Length(numbers) - 1;
  for i := 0 to len do
  begin
    Str := IntToStr(numbers[i]) + lineEnding;
    data.write(str[1], Length(str));
  end;
  data.Position := 0;
  data.SaveToFile(fileName);
  data.Free;
end;
--- End code ---
I removed the sizeOf(char) part, because I expect this to be always 1, right?
This was better, 1.8~2.0 seconds. Still slower than Java.
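On the SizeOf(Char) question: as far as I know that holds as long as String is a single-byte string type, which is the default in {$mode objfpc}. With {$mode delphiunicode} or {$modeswitch unicodestrings}, Char becomes a 2-byte WideChar and the byte count would be wrong. A minimal sketch that stays correct either way (the program name and the sample string are made up for illustration):

```pascal
program CharSizeDemo; // hypothetical name, for illustration only
{$mode objfpc}{$H+}

uses
  Classes;

var
  data: TMemoryStream;
  s: String;
begin
  s := '12345';
  data := TMemoryStream.Create;
  try
    // Length(s) counts characters; multiplying by the element size gives
    // bytes, so the call stays correct if Char is ever wider than 1 byte.
    data.Write(s[1], Length(s) * SizeOf(s[1]));
    WriteLn('SizeOf(Char)     = ', SizeOf(Char));
    WriteLn('SizeOf(AnsiChar) = ', SizeOf(AnsiChar));
    WriteLn('SizeOf(WideChar) = ', SizeOf(WideChar));
    WriteLn('bytes written    = ', data.Size);
  finally
    data.Free;
  end;
end.
```

SizeOf is evaluated at compile time, so the extra multiplication should not cost anything in the loop.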
Searching further, I also found this other thing:
--- Code: Pascal ---
procedure arrayToFile(numbers: Array of Integer; fileName: String);
var
  data: TMemoryStream;
  i, len: Integer;
  str, t: String;
begin
  data := TMemoryStream.Create;
  len := Length(numbers) - 1;
  for i := 0 to len do
  begin
    System.Str(numbers[i], t);
    Str := t + lineEnding;
    data.write(str[1], Length(str));
  end;
  data.Position := 0;
  data.SaveToFile(fileName);
  data.Free;
end;
--- End code ---
Faster!!! 1.0~1.2 seconds. Still slower than Java, but very close.
The question: any ideas on how to speed up this process? I'd like it to be at least as fast as Java, probably reaching my HDD bottleneck.
Regards :)
PS - extra details:
Compiler options: -O4 -XX -CX -Xs
JVM options: -server -XX:CompileThreshold=2 -XX:+AggressiveOpts -XX:+UseFastAccessorMethods
Phil:
Try doing a WriteLn to the file.
You'll need AssignFile on a TextFile, then Rewrite, then the WriteLn calls, finally CloseFile.
See FPC docs.
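A minimal sketch of that sequence, assuming {$mode objfpc} (which pulls in the ObjPas unit where AssignFile and CloseFile live). The program name and the 'numbers.txt' path are placeholders, not anything from the original program:

```pascal
program WriteLnDemo; // hypothetical name, for illustration only
{$mode objfpc}{$H+}

procedure arrayToFile(const numbers: array of Integer; const fileName: String);
var
  f: TextFile;
  i: Integer;
begin
  AssignFile(f, fileName);
  Rewrite(f);                  // create or truncate the file for writing
  try
    for i := 0 to High(numbers) do
      WriteLn(f, numbers[i]);  // converts the integer and appends LineEnding
  finally
    CloseFile(f);              // flushes the text buffer and closes the file
  end;
end;

begin
  arrayToFile([1, 22, -333], 'numbers.txt'); // example path only
end.
```

The text file has its own small internal buffer, so each WriteLn does not hit the disk directly.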
miki:
Hi Phil, thanks for your fast answer. Today it's about speed 8)
--- Quote from: Phil on August 23, 2016, 12:31:46 am ---You'll need AssignFile on a TextFile, then Rewrite, then the WriteLn calls, finally CloseFile.
See FPC docs.
--- End quote ---
Don't need to see any docs, I learnt to use those functions when I was 14 (maaaany years ago).
For some reason, I expected them to be slower. But I was wrong: 900ms 620ms
Anyway, I found the answer to my own question. Just removing the string concatenation makes things much faster.
--- Code: Pascal ---
procedure arrayToFile(numbers: Array of Integer; fileName: String);
var
  data: TMemoryStream;
  i, len, lendl: Integer;
  str, lend: String;
begin
  data := TMemoryStream.Create;
  len := Length(numbers) - 1;
  lend := lineEnding;
  lendl := length(lend); // length of the line ending, not of the still-empty str
  for i := 0 to len do
  begin
    System.Str(numbers[i], str);
    data.write(Str[1], Length(Str));
    data.write(lend[1], lendl);
  end;
  data.Position := 0;
  data.SaveToFile(fileName);
  data.Free;
end;
--- End code ---
Time: 660~680ms. I'm happy for today :)
Of course, the WriteLn approach has one big advantage: a smaller memory footprint. But 100MB isn't that much on modern computers.
Regards.
EDIT: running this on an SSD, saving is a bit faster still, about 600ms. So hey, the code is faster than the HDD.
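If the 100MB ever does matter, one middle ground is to keep the System.Str conversion but flush a fixed-size buffer to a TFileStream instead of collecting everything in a TMemoryStream. This is a different technique from the code above, and only a sketch: the 64KB BufSize, the program name and the 'numbers.txt' path are assumptions, and I have not timed it.

```pascal
program ChunkedWriteDemo; // hypothetical name, for illustration only
{$mode objfpc}{$H+}

uses
  Classes, SysUtils;

const
  BufSize = 64 * 1024; // assumed chunk size; tune by measurement

procedure arrayToFile(const numbers: array of Integer; const fileName: String);
var
  outFile: TFileStream;
  buf: array[0..BufSize - 1] of Char;
  used, i: Integer;
  s, eol: String;

  // Append a non-empty string to the buffer, flushing to disk first
  // if it would not fit.
  procedure Append(const t: String);
  begin
    if used + Length(t) > BufSize then
    begin
      outFile.WriteBuffer(buf[0], used);
      used := 0;
    end;
    Move(t[1], buf[used], Length(t));
    Inc(used, Length(t));
  end;

begin
  eol := LineEnding;
  used := 0;
  outFile := TFileStream.Create(fileName, fmCreate);
  try
    for i := 0 to High(numbers) do
    begin
      System.Str(numbers[i], s);
      Append(s);
      Append(eol);
    end;
    if used > 0 then
      outFile.WriteBuffer(buf[0], used); // flush the final partial chunk
  finally
    outFile.Free;
  end;
end;

begin
  arrayToFile([1, 22, -333], 'numbers.txt'); // example path only
end.
```

Memory use stays at the buffer size no matter how long the array is.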
johnsson:
Can I suggest a modification?
I'm really sorry, I didn't see the string conversion, so here is the corrected version.
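--- Code: Pascal ---
procedure arrayToFile(const numbers: Array of Integer; fileName: String);
var
  Data: TMemoryStream;
  I: Integer;
  P, S: PChar;
  V, EOL: string;
begin
  EOL := LineEnding; // 1 byte on Unix, 2 on Windows, so don't hard-code the length
  Data := TMemoryStream.Create;
  // Worst case per entry: 11 characters ('-2147483648') plus the line ending
  Data.SetSize(15 * Length(numbers));
  P := Data.Memory;
  S := P;
  for I := 0 to High(numbers) do
  begin
    System.Str(numbers[I], V);
    Move(V[1], P^, Length(V));
    Inc(P, Length(V));
    Move(EOL[1], P^, Length(EOL));
    Inc(P, Length(EOL));
  end;
  Data.SetSize(P - S); // trim the stream to the bytes actually written
  Data.SaveToFile(fileName);
  Data.Free;
end;
--- End code ---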
Here I got a 1400ms execution time, but I'm running this on a laptop with a really poor CPU. The major cost is the conversion procedure.
:D
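If the conversion really is the major cost, one thing that might be worth trying is writing the digits straight into the buffer, which skips the temporary string that System.Str fills. A rough sketch: PutInt is a made-up helper, not an RTL routine, and whether it beats System.Str on a given machine would have to be measured.

```pascal
program IntToBufDemo; // hypothetical name, for illustration only
{$mode objfpc}{$H+}

// Writes the decimal text of value at p and returns the position just past it.
// The caller must guarantee at least 11 free bytes ('-2147483648').
function PutInt(p: PChar; value: LongInt): PChar;
var
  tmp: array[0..9] of Char;
  n: Integer;
  u: Cardinal;
begin
  if value < 0 then
  begin
    p^ := '-';
    Inc(p);
    // Negate in 64 bits so that Low(LongInt) does not overflow
    u := Cardinal(-Int64(value));
  end
  else
    u := Cardinal(value);
  // Produce the digits least-significant first into tmp
  n := 0;
  repeat
    tmp[n] := Chr(Ord('0') + u mod 10);
    u := u div 10;
    Inc(n);
  until u = 0;
  // Copy them out in reverse order
  while n > 0 do
  begin
    Dec(n);
    p^ := tmp[n];
    Inc(p);
  end;
  Result := p;
end;

var
  buf: array[0..31] of Char;
  stop: PChar;
begin
  stop := PutInt(@buf[0], -12345);
  stop^ := #0;             // terminate so the buffer can be printed as a PChar
  WriteLn(PChar(@buf[0])); // prints -12345
end.
```

In the loop above it would replace the System.Str and first Move pair with something like P := PutInt(P, numbers[I]).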
Laksen:
Try this. Much simpler and much faster
--- Code: Pascal ---
procedure arrayToFile3(const numbers: Array of longint; const fileName: String);
var
  i: longint;
  buf: array[0..65535] of char;
  f: Text;
begin
  AssignFile(f, fileName);
  Rewrite(f);
  SetTextBuf(f, buf[0], sizeof(buf));
  for i := 0 to high(numbers) do
    writeln(f, numbers[i]);
  CloseFile(f);
end;
--- End code ---