[SOLVED] Speed-up massive file writing.

miki:
Hi everybody  :D

I'm writing a program with some tests and PoCs, in both FreePascal [3.0] and Java [1.8].

I have a function to write integer arrays into text files (each line is a number). The Java code:

--- Code: Java ---
    public void arrayToFile(int[] array, String fileName) throws IOException
    {
        BufferedWriter writer = new BufferedWriter(new FileWriter(new File(fileName)));
        for (int number: array)
        {
            writer.write(Integer.toString(number));
            writer.newLine();
        }
        writer.close();
    }

--- End code ---

This code takes 700~800ms to save 10,000,000 integers, creating a 110MB file.

When writing this in FreePascal, I first came up with this solution:

--- Code: Pascal ---
procedure arrayToFile(numbers: Array of Integer; fileName: String);
var
    list: TStrings;
    i, len: Integer;
begin
    list := TStringList.Create;
    len := Length(numbers) - 1;
    for i := 0 to len do
    begin
        list.Add(IntToStr(numbers[i]));
    end;
    list.SaveToFile(FileName);
    list.Free;
end;

--- End code ---

It took 2.2~2.8 seconds. Setting list.Capacity or list.Begin/EndUpdate had no measurable effect.
Searching for something better, I tried this approach:

--- Code: Pascal ---
procedure arrayToFile(numbers: Array of Integer; fileName: String);
var
    data: TMemoryStream;
    i, len: Integer;
    Str: String;
begin
    data := TMemoryStream.Create;
    len := Length(numbers) - 1;
    for i := 0 to len do
    begin
        // Build "number + line ending" as one string, then append its bytes to the stream
        Str := IntToStr(numbers[i]) + lineEnding;
        data.write(str[1], Length(str));
    end;
    data.Position := 0;
    data.SaveToFile(fileName);
    data.Free;
end;

--- End code ---

I removed the sizeOf(char) part, because I expect this to be always 1, right?
This was better, 1.8~2.0 seconds. Still slower than Java.

Searching further, I also found this other thing:

--- Code: Pascal ---
procedure arrayToFile(numbers: Array of Integer; fileName: String);
var
    data: TMemoryStream;
    i, len: Integer;
    str, t: String;
begin
    data := TMemoryStream.Create;
    len := Length(numbers) - 1;
    for i := 0 to len do
    begin
        System.Str(numbers[i], t);
        Str := t + lineEnding;
        data.write(str[1], Length(str));
    end;
    data.Position := 0;
    data.SaveToFile(fileName);
    data.Free;
end;

--- End code ---

Faster!!! 1.0~1.2 seconds. Still slower than Java, but very close.

The question: any ideas on how to speed up this process? I'd like it to be at least as fast as Java, probably to the point where my HDD becomes the bottleneck.

Regards  :)

PS - extra details:
Compiler options: -O4 -XX -CX -Xs
JVM options: -server -XX:CompileThreshold=2 -XX:+AggressiveOpts -XX:+UseFastAccessorMethods
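
For anyone who wants to reproduce the Java number, a minimal timing harness could look like the sketch below. The class name `WriteBench`, the helper `timeWriteMillis`, the random test data and the temp file are assumptions for illustration, not the original test program. The write loop is the one from the post, wrapped in try-with-resources so the file is closed even if a write fails.

```java
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.util.Random;

final class WriteBench {
    // Same loop as in the post: one decimal number per line.
    static void arrayToFile(int[] array, String fileName) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(new File(fileName)))) {
            for (int number : array) {
                writer.write(Integer.toString(number));
                writer.newLine();
            }
        }
    }

    // Times one call in milliseconds. A warm-up call runs first so the JIT
    // has already compiled the loop when the measured call starts.
    static long timeWriteMillis(int[] array, String fileName) throws IOException {
        arrayToFile(array, fileName); // warm-up, result overwritten below
        long start = System.nanoTime();
        arrayToFile(array, fileName);
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        int count = 10_000_000; // array size quoted in the post
        int[] data = new Random(42).ints(count).toArray(); // hypothetical test data
        File out = File.createTempFile("numbers", ".txt");
        out.deleteOnExit();
        System.out.println(timeWriteMillis(data, out.getPath()) + " ms");
    }
}
```

The measured time depends heavily on the disk and the OS write cache, so results from different machines are only loosely comparable.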

Phil:
Try doing a WriteLn to the file.

You'll need AssignFile on a TextFile, then Rewrite, then the WriteLn calls, finally CloseFile.

See FPC docs.

miki:
Hi Phil, thanks for your fast answer. Today it's about speed  8)


--- Quote from: Phil on August 23, 2016, 12:31:46 am ---You'll need AssignFile on a TextFile, then Rewrite, then the WriteLn calls, finally CloseFile.
See FPC docs.

--- End quote ---

Don't need to see any docs, I learnt to use those functions when I was 14 (maaaany years ago).
For some reason, I expected them to be slower. But I was wrong: 900ms 620ms

Anyway, I found the answer to my own question: just removing the string concatenation makes things much faster.

--- Code: Pascal ---
procedure arrayToFile(numbers: Array of Integer; fileName: String);
var
    data: TMemoryStream;
    i, len, lendl: Integer;
    str, lend: String;
begin
    data := TMemoryStream.Create;
    len := Length(numbers) - 1;
    lend := lineEnding;
    lendl := Length(lend);  // byte length of the line ending
    for i := 0 to len do
    begin
        System.Str(numbers[i], str);
        // Two separate writes instead of concatenating number and line ending
        data.write(Str[1], Length(Str));
        data.write(lend[1], lendl);
    end;
    data.Position := 0;
    data.SaveToFile(fileName);
    data.Free;
end;

--- End code ---

Time: 660~680ms. I'm happy for today :)

Of course, the WriteLn approach has one big advantage: a smaller memory footprint. But 100MB isn't that much on modern computers.

Regards.

EDIT: running this on an SSD, saving is even a bit faster, about 600ms. So hey, the code is faster than the HDD.

johnsson:
Can I suggest a modification?

I'm really sorry, I didn't see the string conversion, so here is the corrected version.


--- Code: ---procedure arrayToFile(var numbers: Array of Integer; fileName: String);
var
  Data: TMemoryStream;
  I: Integer;
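  P, S: Pointer;
  V, EOL: string;
begin
  EOL := LineEnding; // 1 byte on Unix, 2 bytes on Windows
  Data := TMemoryStream.Create;
  Data.SetSize(15 * Length(numbers)); // upper bound: at most 11 characters plus the line ending per number
  P := Data.Memory;
  S := P;
  for I := 0 to High(numbers) do
  begin
    System.Str(numbers[i], V);
    Move(V[1], P^, Length(V));
    P += Length(V);
    Move(EOL[1], P^, Length(EOL));
    P += Length(EOL);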
  end;
  Data.SetSize(P - S);
  Data.SaveToFile(fileName);
  Data.Free;
end;

--- End code ---

Here I got a 1400ms execution time, but I'm running this on a laptop with a really poor CPU. The major cost is the conversion procedure.

 :D
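
Since the conversion is named as the major cost, the same idea (preallocated buffer, digits written in place, no intermediate strings) can be sketched on the Java side too. This is only an illustration: `FastIntLines` and `encode` are made-up names, the line ending is hard-coded to `\n`, and no timing was measured for it.

```java
import java.util.Arrays;

final class FastIntLines {
    // Longest possible line: "-2147483648" (11 characters) plus '\n'.
    private static final int MAX_LINE = 12;

    // Encodes each int as ASCII decimal digits followed by '\n'.
    // Note: numbers.length * MAX_LINE must fit in an int, which holds
    // for the 10,000,000 element case discussed in this thread.
    static byte[] encode(int[] numbers) {
        byte[] buf = new byte[numbers.length * MAX_LINE]; // worst-case size, trimmed at the end
        byte[] digits = new byte[10];                     // scratch space, digits in reverse order
        int pos = 0;
        for (int n : numbers) {
            if (n < 0) {
                buf[pos++] = '-';
            }
            // Use long so that the absolute value of Integer.MIN_VALUE does not overflow.
            long v = Math.abs((long) n);
            int len = 0;
            do {
                digits[len++] = (byte) ('0' + (int) (v % 10));
                v /= 10;
            } while (v != 0);
            // Copy the digits back in the right order.
            while (len > 0) {
                buf[pos++] = digits[--len];
            }
            buf[pos++] = '\n';
        }
        return Arrays.copyOf(buf, pos);
    }
}
```

The resulting array can be written in one call, for example with `Files.write(Paths.get(fileName), FastIntLines.encode(numbers))`. Like the TMemoryStream versions, this keeps the whole file in memory before writing.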

Laksen:
Try this. Much simpler and much faster.

--- Code: Pascal ---
procedure arrayToFile3(const numbers: Array of longint; const fileName: String);
var
  i: longint;
  buf: array[0..65535] of char;
  f: Text;
begin
  AssignFile(f, fileName);
  Rewrite(f);

  // Replace the small default text buffer with a 64 KB one
  SetTextBuf(f, buf[0], sizeof(buf));

  for i := 0 to high(numbers) do
    writeln(f, numbers[i]);

  CloseFile(f);
end;

--- End code ---
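
For comparison, the rough Java counterpart of a larger text buffer is the two-argument `BufferedWriter` constructor. The sketch below assumes a 64 KB buffer to mirror the Pascal code; the class name `BigBufferWriter` is made up, and whether the larger buffer actually helps on a given machine was not measured here.

```java
import java.io.BufferedWriter;
import java.io.FileWriter;
import java.io.IOException;

final class BigBufferWriter {
    // Streams the numbers to disk, one per line, through a 65536-character buffer
    // (the default BufferedWriter buffer is smaller).
    static void arrayToFile(int[] numbers, String fileName) throws IOException {
        try (BufferedWriter writer = new BufferedWriter(new FileWriter(fileName), 65536)) {
            for (int n : numbers) {
                writer.write(Integer.toString(n));
                writer.newLine();
            }
        }
    }
}
```

As with WriteLn, only the buffer is held in memory, not the whole file.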
