
Author Topic: concatenate multiple strings  (Read 13745 times)

tclfpc

  • Newbie
  • Posts: 5
concatenate multiple strings
« on: July 17, 2015, 10:28:49 pm »
Presently I'm trying to concatenate an array of ansistrings to one big string.
The first attempt was:

finalstr:='';
for element in array1 do finalstr:=finalstr+element;

Alas, this has quadratic runtime and requires continuous reallocation. To speed up the process, I now allocate space for one big string at the start, with its length equal to the sum of the lengths of all array elements. Then I try to paste the elements at the proper positions in the final string. The problem now is that trying to:

for i in array1 do begin
  finalstring[pos1..(pos1+length(I)-1)]:=i;
  updatepos;
end

doesn't work, as the start and end positions for the index range have to be literals, not variables. How would one proceed here?

lainz

  • Hero Member
  • *****
  • Posts: 4742
  • Web, Desktop & Android developer
    • https://lainz.github.io/
Re: concatenate multiple strings
« Reply #1 on: July 17, 2015, 10:38:24 pm »
Maybe you can use a TStringList and concatenate all of them just at the end.

http://wiki.freepascal.org/TStringList-TStrings_Tutorial

Code: [Select]
program StrList;
{$mode objfpc}
uses
 Classes, SysUtils;
var
  Str: TStringList;
begin
  Str := TStringList.Create; // This is needed when using this class(or most classes)
  Str.Add('Some String!');
  writeln('The stringlist now has ' + IntToStr(Str.Count) + ' string(s).');
  Readln;
  Str.Free; //Release the memory used by this stringlist instance
end.

You can get the text of all strings as a single string using the Text property.

tclfpc

  • Newbie
  • Posts: 5
Re: concatenate multiple strings
« Reply #2 on: July 17, 2015, 10:53:03 pm »
Thanks for the suggestion and in particular the quick response.

Despite that being a very convenient method I definitely do *not* want to use TStringlist as it is far slower! Compared to the TStringlist approach I could already save about 30% of time (i.e. 15-20 min per program run) by using the piecewise concatenation posted above.

Maybe I could use some pointer for accessing the middle of a string, but which type?

lainz

  • Hero Member
  • *****
  • Posts: 4742
  • Web, Desktop & Android developer
    • https://lainz.github.io/
Re: concatenate multiple strings
« Reply #3 on: July 17, 2015, 11:07:41 pm »
If you know the size of the array, you can set the list's capacity up front, so the storage for the strings is already allocated:

    StringList.Capacity := X;

Maybe that is faster?

Edit: Alright, you already have the array created, so check how TStringList does its concatenation; maybe it is fast. Otherwise we have found something that could be improved.

Code: [Select]
Function TStrings.GetTextStr: string;

Var P : Pchar;
    I,L,NLS : Longint;
    S,NL : String;

begin
  CheckSpecialChars;
  // Determine needed place
  Case FLBS of
    tlbsLF   : NL:=#10;
    tlbsCRLF : NL:=#13#10;
    tlbsCR   : NL:=#13;
  end;
  L:=0;
  NLS:=Length(NL);
  For I:=0 to count-1 do
    L:=L+Length(Strings[I])+NLS;
  Setlength(Result,L);
  P:=Pointer(Result);
  For i:=0 To count-1 do
    begin
    S:=Strings[I];
    L:=Length(S);
    if L<>0 then
      System.Move(Pointer(S)^,P^,L);
    P:=P+L;
    For L:=1 to NLS do
      begin
      P^:=NL[L];
      inc(P);
      end;
    end;
end;

balazsszekely

  • Guest
Re: concatenate multiple strings
« Reply #4 on: July 17, 2015, 11:08:25 pm »
Hi,

Try something like this:
Code: [Select]
uses LCLIntf;
{$R *.lfm}

{ TForm1 }

procedure TForm1.Button1Click(Sender: TObject);
var
  A: array of AnsiString;
  Res: AnsiString;
  Ms: TMemoryStream;
  I: integer;
  Start: DWord;
begin
  //init array
  SetLength(A, 100000);
  for I := Low(A) to High(A) do
    A[I] := IntToStr(I) + IntToStr(I) + IntToStr(I);
{  A[0] := '000';
   A[1] := '111';
    ...
   A[n] := 'nnn'}


  Start := GetTickCount;
  Ms := TMemoryStream.Create;
  try
    //copy to memory stream;
    for I := Low(A) to High(A) do
      Ms.WriteBuffer(Pointer(A[I])^, Length(A[I]));

    //copy memory stream to result
    Ms.Position := 0;
    SetLength(Res, Ms.Size);
    Ms.ReadBuffer(Pointer(Res)^, Ms.Size);
  finally
    Ms.Free;
  end;
  ShowMessage('Executed in: ' + IntToStr(GetTickCount - Start) + '  ms' + sLineBreak +
              'Result: ' + Res);
end;                       

regards,
GetMem

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 12288
  • Debugger - SynEdit - and more
    • wiki
Re: concatenate multiple strings
« Reply #5 on: July 17, 2015, 11:24:14 pm »
If you need speed, then "move" is what you want.
But be aware that it has no built-in safety: it does not check that you stay within the bounds of the allocated memory.

Incomplete sample:
Code: [Select]
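  SetLength(s, totallen);
  p := 1;  // current insert position in s
  for i := 0 to count-1 do
    if length(src[i]) > 0 then begin  // src[i][1] is not valid for an empty string
      Move(src[i][1],  // from first char of the i'th string
           s[p],       // p is the current insert pos
           length(src[i]));
      inc(p, length(src[i]));
    end;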

howardpc

  • Hero Member
  • *****
  • Posts: 4144
Re: concatenate multiple strings
« Reply #6 on: July 18, 2015, 12:25:20 am »
Based on Martin's suggestion this may be faster than simple concatenations:

Code: [Select]
uses types;

function Concatenate(aStrArray: TStringDynArray): string;
var
  a, i, w: integer;
  len: integer = 0;
  p: integer = 1;
  iArr: TIntegerDynArray;
begin
  a:=High(aStrArray);
  SetLength(iArr, Succ(a));
  for i:=0 to a do
    begin
      w:=Length(aStrArray[i]);
      iArr[i]:=w;
      Inc(len, w);
    end;
  SetLength(Result, len);
  for i:=0 to a do
    if iArr[i] > 0 then  // skip empty strings: aStrArray[i][1] would be out of range
      begin
        Move(aStrArray[i][1], Result[p], iArr[i]);
        Inc(p, iArr[i]);
      end;
end;

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 12288
  • Debugger - SynEdit - and more
    • wiki
Re: concatenate multiple strings
« Reply #7 on: July 18, 2015, 02:36:37 pm »
I wouldn't store a copy of the length in iArr.

It has no benefit. The length is already stored as an integer as part of the string, so "length()" does no calculation; it is just a "get one integer from memory", the same as accessing iArr.

But using iArr means storing the lengths at a separate memory location, and therefore potentially more cache misses for the CPU. So it could even be slower. (The string needs to be loaded into the cache anyway.)
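
For illustration, the same routine without the length cache could look roughly like the sketch below. This is not code posted in the thread: the name ConcatNoCache is hypothetical, an open array parameter is used instead of TStringDynArray so that a literal list can be passed in the demo, and a guard for empty strings is added.

```pascal
program ConcatDemo;
{$mode objfpc}{$H+}

// Hypothetical variant of Concatenate without the iArr length cache.
function ConcatNoCache(const aStrArray: array of AnsiString): AnsiString;
var
  i, w: Integer;
  len: Integer = 0;
  p: Integer = 1;
begin
  // First pass: sum the lengths (Length() only reads the stored length field).
  for i := 0 to High(aStrArray) do
    Inc(len, Length(aStrArray[i]));
  SetLength(Result, len);
  // Second pass: copy each non-empty string to its position in Result.
  for i := 0 to High(aStrArray) do
  begin
    w := Length(aStrArray[i]);
    if w > 0 then
    begin
      Move(aStrArray[i][1], Result[p], w);
      Inc(p, w);
    end;
  end;
end;

begin
  WriteLn(ConcatNoCache(['abc', '', 'de', 'f']));  // prints abcdef
end.
```

Length() is simply called twice per element here, which matches the point above that the value is read straight from the string header.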

howardpc

  • Hero Member
  • *****
  • Posts: 4144
Re: concatenate multiple strings
« Reply #8 on: July 19, 2015, 12:54:09 pm »
Quote from: Martin_fr
I wouldn't store a copy of the length in iArr.

It has no benefit.

Thanks for constructive feedback. Foolish of me to forget that this is not C where string lengths have to be constantly calculated, but sensible Pascal where strings carry their lengths around with them.

In informal testing I find that concatenation from a string array of reasonable numbers of strings of average length (up to say 20) is about 15 to 20 times faster using Move() on my machine. Pushing the average string length to 50 or higher, and testing with larger numbers of strings (upwards of a million) gives Move() more like a hundredfold advantage over straight string addition, though trying to concatenate a multi-million sized string array of longish strings can cause an out of memory exception...
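
A rough way to reproduce such a comparison is sketched below. It is only an assumption about how the informal test might be set up: the array size N and the test data are arbitrary, GetTickCount64 from SysUtils is used for timing, and the measured ratio will depend on the machine, compiler version and memory manager.

```pascal
program ConcatTiming;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  N = 200000;  // number of strings; an arbitrary choice for this sketch

var
  A: array of AnsiString;
  S1, S2: AnsiString;
  i, total, p: Integer;
  t0: QWord;
begin
  SetLength(A, N);
  for i := 0 to N - 1 do
    A[i] := IntToStr(i) + IntToStr(i) + IntToStr(i);  // never empty

  // Plain concatenation with "+".
  t0 := GetTickCount64;
  S1 := '';
  for i := 0 to N - 1 do
    S1 := S1 + A[i];
  WriteLn('"+" : ', GetTickCount64 - t0, ' ms');

  // Preallocate once, then copy each string with Move.
  t0 := GetTickCount64;
  total := 0;
  for i := 0 to N - 1 do
    Inc(total, Length(A[i]));
  SetLength(S2, total);
  p := 1;
  for i := 0 to N - 1 do
  begin
    Move(A[i][1], S2[p], Length(A[i]));
    Inc(p, Length(A[i]));
  end;
  WriteLn('Move: ', GetTickCount64 - t0, ' ms');

  // Sanity check that both approaches build the same string.
  WriteLn('Same result: ', S1 = S2);
end.
```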

tclfpc

  • Newbie
  • Posts: 5
Re: concatenate multiple strings
« Reply #9 on: July 19, 2015, 10:25:09 pm »
You made my day. "Move" was the way to go.

Thanks a lot to all responders, in particular to Martin, who additionally gave the hint to explicitly index the first characters of the respective strings. With ansistrings, passing only the string variable names/array elements to Move leads to access violations. This does not occur when the arguments are indexed with their first characters.

Best tclfpc
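
The difference described in the last post can be shown in a few lines. The sketch below is an illustration and not code from the thread; the variable names are made up. It relies on two facts: an AnsiString variable is itself a pointer to the character data, and Move takes untyped parameters, so it copies from whatever address the argument denotes.

```pascal
program MoveArgs;
{$mode objfpc}{$H+}
var
  src, dst: AnsiString;
begin
  src := 'hello';
  SetLength(dst, Length(src));

  // Wrong (do not run): Move(src, dst, Length(src));
  // This would copy bytes starting at the pointer variables themselves
  // rather than at the characters, corrupting the string references.

  // Correct: index the first character ...
  Move(src[1], dst[1], Length(src));
  WriteLn(dst);  // prints hello

  // ... or dereference the string pointer, as the GetTextStr code
  // quoted earlier in the thread does.
  dst := StringOfChar(' ', Length(src));
  Move(Pointer(src)^, Pointer(dst)^, Length(src));
  WriteLn(dst);  // prints hello
end.
```

Both correct forms need a non-empty source string: indexing [1] is out of range for an empty string, and the pointer form dereferences nil unless the count is zero.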

 
