[Solved:] Bug in concat function for AnsiStrings?

jwdietrich:
FPC's behaviour when concatenating long strings is somewhat unexpected and seems to depend on how the concatenation is written. I assume that this is a bug, but before I submit a bug report I would like to ask whether this behaviour could be intentional.

If we define 20 string constants with a length of 20 characters each with


--- Code: Pascal ---
const
  part1: string[20] = 'Part1_7890123456789_';
  part2: string[20] = 'Part2_7890123456789_';
  part3: string[20] = 'Part3_7890123456789_';
  part4: string[20] = 'Part4_7890123456789_';
  part5: string[20] = 'Part5_7890123456789_';
  part6: string[20] = 'Part6_7890123456789_';
  part7: string[20] = 'Part7_7890123456789_';
  part8: string[20] = 'Part8_7890123456789_';
  part9: string[20] = 'Part9_7890123456789_';
  part10: string[20] = 'Part10_890123456789_';
  part11: string[20] = 'Part11_890123456789_';
  part12: string[20] = 'Part12_890123456789_';
  part13: string[20] = 'Part13_890123456789_';
  part14: string[20] = 'Part14_890123456789_';
  part15: string[20] = 'Part15_890123456789_';
  part16: string[20] = 'Part16_890123456789_';
  part17: string[20] = 'Part17_890123456789_';
  part18: string[20] = 'Part18_890123456789_';
  part19: string[20] = 'Part19_890123456789_';
  part20: string[20] = 'Part20_890123456789_';
--- End code ---

and the following three functions to combine the 20 strings into one


--- Code: Pascal ---
function CombinedByMethodA: AnsiString;
var
  tempString: AnsiString;
begin
  tempString := part1 + part2 + part3 + part4 + part5 + part6 + part7 + part8 +
                part9 + part10 + part11 + part12 + part13 + part14 + part15 +
                part16 + part17 + part18 + part19 + part20;
  result := tempString;
end;

function CombinedByMethodB: AnsiString;
var
  tempString1, tempString2, tempString3: AnsiString;
begin
  tempString1 := part1 + part2 + part3 + part4 + part5 + part6 + part7 + part8 +
                 part9 + part10;
  tempString2 := part11 + part12 + part13 + part14 + part15 +
                 part16 + part17 + part18 + part19 + part20;
  tempString3 := tempString1 + tempString2;
  result := tempString3;
end;

function CombinedByMethodC: AnsiString;
var
  tempString: AnsiString;
begin
  tempString := concat(part1, part2, part3, part4, part5, part6, part7, part8,
                part9, part10, part11, part12, part13, part14, part15, part16,
                part17, part18, part19, part20);
  result := tempString;
end;
--- End code ---

then only method B seems to deliver the expected result, i.e. a string of length 400 with the content
--- Quote ---Part1_7890123456789_Part2_7890123456789_Part3_7890123456789_Part4_7890123456789_Part5_7890123456789_Part6_7890123456789_Part7_7890123456789_Part8_7890123456789_Part9_7890123456789_Part10_890123456789_Part11_890123456789_Part12_890123456789_Part13_890123456789_Part14_890123456789_Part15_890123456789_Part16_890123456789_Part17_890123456789_Part18_890123456789_Part19_890123456789_Part20_890123456789_
--- End quote ---

Methods A and C deliver shorter strings of 255 characters, which end in the middle of part 13:
--- Quote ---Part1_7890123456789_Part2_7890123456789_Part3_7890123456789_Part4_7890123456789_Part5_7890123456789_Part6_7890123456789_Part7_7890123456789_Part8_7890123456789_Part9_7890123456789_Part10_890123456789_Part11_890123456789_Part12_890123456789_Part13_89012345
--- End quote ---

I assume that this is a bug which occurs when a large number of substrings are combined. Or is there a reason for this behaviour?

A very simple Lazarus program demonstrating this effect is attached.

Thaddy:
string[20] is shortstring, not longstring. It will overflow on concats beyond a length of 255 of course.
If you declare the string constants simply as string or AnsiString, the code works as expected.
So I guess this is not a bug, but expected?
That is, the conversion to AnsiString is only done at the end of the concatenation; internally the parts are still shortstrings because of the length specifier.
Also be careful in Lazarus: string holds UTF-8 there, not ANSI. It is always better to specify AnsiString explicitly if you expect AnsiString.
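
The truncation described above can be reproduced with a much smaller program. This is only a minimal sketch, assuming FPC in objfpc mode with {$H+}; the program and variable names are made up for illustration and are not part of the attached demo. It also matches the numbers in the original post: 12 parts of 20 characters give 240, and the remaining 15 characters of part 13 bring the result to the shortstring limit of 255.

```pascal
program ShortStringTruncation;
{$mode objfpc}{$H+}

var
  // A length specifier makes these shortstrings, even under {$H+}.
  first, second: string[200];
  joined: AnsiString;
begin
  first := StringOfChar('a', 200);
  second := StringOfChar('b', 200);

  // Both operands are shortstrings, so the sum is built as a shortstring
  // and only converted to AnsiString when it is assigned.
  joined := first + second;

  // The thread reports that the result is cut off at 255 characters
  // instead of reaching the full 400.
  writeln('Length after shortstring concat: ', Length(joined));
end.
```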

jwdietrich:

--- Quote from: Thaddy on April 24, 2018, 08:58:11 am ---string[20] is shortstring, not longstring. It will overflow on concats beyond a length of 255 of course.
If you declare the string constants simply as string or AnsiString, the code works as expected.
So I guess this is not a bug, but expected?

--- End quote ---

Does this mean that the short strings are first concatenated to a combined short string and the resulting short string is then, in a second step only, converted to an AnsiString?

Thaddy:

--- Quote from: jwdietrich on April 24, 2018, 09:04:31 am ---
--- Quote from: Thaddy on April 24, 2018, 08:58:11 am ---string[20] is shortstring, not longstring. It will overflow on concats beyond a length of 255 of course.
If you declare the string constants simply as string or AnsiString, the code works as expected.
So I guess this is not a bug, but expected?

--- End quote ---

Does this mean that the short strings are first concatenated to a combined short string and the resulting short string is then, in a second step only, converted to an AnsiString?

--- End quote ---
Yes, because concatenating shortstrings is also legal. You were assuming something that the compiler can't foresee and ran into the 255 character limit for shortstrings.
AnsiStrings cannot be declared with a fixed length. See:

--- Code: Pascal ---
program testAnsiStrings;
{$ifdef fpc}{$mode delphi}{$H+}{$endif}
const
  part1: AnsiString = 'Part1_7890123456789_';
  part2: AnsiString = 'Part2_7890123456789_';
  part3: AnsiString = 'Part3_7890123456789_';
  part4: AnsiString = 'Part4_7890123456789_';
  part5: AnsiString = 'Part5_7890123456789_';
  part6: AnsiString = 'Part6_7890123456789_';
  part7: AnsiString = 'Part7_7890123456789_';
  part8: AnsiString = 'Part8_7890123456789_';
  part9: AnsiString = 'Part9_7890123456789_';
  part10: AnsiString = 'Part10_890123456789_';
  part11: AnsiString = 'Part11_890123456789_';
  part12: AnsiString = 'Part12_890123456789_';
  part13: AnsiString = 'Part13_890123456789_';
  part14: AnsiString = 'Part14_890123456789_';
  part15: AnsiString = 'Part15_890123456789_';
  part16: AnsiString = 'Part16_890123456789_';
  part17: AnsiString = 'Part17_890123456789_';
  part18: AnsiString = 'Part18_890123456789_';
  part19: AnsiString = 'Part19_890123456789_';
  part20: AnsiString = 'Part20_890123456789_';

function CombinedByMethodA: AnsiString;
var
  tempString: AnsiString;
begin
  tempString := part1 + part2 + part3 + part4 + part5 + part6 + part7 + part8 +
                part9 + part10 + part11 + part12 + part13 + part14 + part15 +
                part16 + part17 + part18 + part19 + part20;
  result := tempString;
end;

function CombinedByMethodB: AnsiString;
var
  tempString1, tempString2, tempString3: AnsiString;
begin
  tempString1 := part1 + part2 + part3 + part4 + part5 + part6 + part7 + part8 +
                 part9 + part10;
  tempString2 := part11 + part12 + part13 + part14 + part15 +
                 part16 + part17 + part18 + part19 + part20;
  tempString3 := tempString1 + tempString2;
  result := tempString3;
end;

function CombinedByMethodC: AnsiString;
var
  tempString: AnsiString;
begin
  tempString := concat(part1, part2, part3, part4, part5, part6, part7, part8,
                part9, part10, part11, part12, part13, part14, part15, part16,
                part17, part18, part19, part20);
  result := tempString;
end;

begin
  writeln(CombinedByMethodA);
  writeln(CombinedByMethodB);
  writeln(CombinedByMethodC);
end.
--- End code ---
This gives the correct output for all three.
You need a typecast to AnsiString *before* you attempt the concat operation.
This is mostly, if not entirely, documented.
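
A typecast before the concatenation could look like the following. This is a minimal sketch rather than code from the thread, assuming FPC in objfpc mode with {$H+}; the names are illustrative. Because `+` is evaluated left to right, casting only the first operand should be enough: every later step then has an AnsiString on its left-hand side.

```pascal
program CastBeforeConcat;
{$mode objfpc}{$H+}

var
  // Length-limited declarations, so both are shortstrings.
  first, second: string[200];
  joined: AnsiString;
begin
  first := StringOfChar('a', 200);
  second := StringOfChar('b', 200);

  // The cast turns the first operand into an AnsiString, so the
  // concatenation is carried out as AnsiString and is not limited
  // to 255 characters.
  joined := AnsiString(first) + second;

  writeln('Length after cast and concat: ', Length(joined));  // 400
end.
```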

Thaddy:
Btw, given String[20] you can also do:

--- Code: Pascal ---
function CombinedByMethodD: AnsiString;
var
  tempString: AnsiString;
begin
  tempString := (part1 + part2 + part3 + part4 + part5 + part6 + part7 + part8 +
                part9 + part10 + part11 + part12) + (part13 + part14 + part15 +
                part16 + part17 + part18 + part19 + part20);
  result := tempString;
end;
--- End code ---
The brackets each resolve to a maximum length of 240 and are then converted, giving two AnsiStrings without typecasts.
But that is implied logic and not documented directly.
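
An alternative that does not depend on how bracketed sub-expressions are typed is to accumulate into an AnsiString variable, as method B does with its intermediate variables. The sketch below assumes FPC in objfpc mode with {$H+}; the array `parts` and the filler content are made up for illustration and stand in for the 20 constants from the first post.

```pascal
program AccumulateParts;
{$mode objfpc}{$H+}

type
  TPart = string[20];  // shortstring with a maximum length of 20

var
  parts: array[1..20] of TPart;
  combined: AnsiString;
  i: Integer;
begin
  // Fill each part with 20 copies of one letter ('A' for part 1, 'B' for part 2, ...).
  for i := 1 to 20 do
    parts[i] := StringOfChar(Chr(Ord('A') + i - 1), 20);

  // The left operand of every "+" is an AnsiString variable, so each step
  // is an AnsiString concatenation and the 255 limit does not apply.
  combined := '';
  for i := 1 to 20 do
    combined := combined + parts[i];

  writeln('Combined length: ', Length(combined));  // 400
end.
```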
