### Topic: Quick question about const array of string

#### lucamar

• Hero Member
• Posts: 3435
##### Quick question about const array of string
« on: June 25, 2019, 12:13:21 am »
Just to test my understanding, in this construction:

```pascal
const
  DTFmt: array[DTFmtKind] of string =
    ('',
     'yyyy-mm-dd hh:nn', 'yyyy-mm-dd', 'yyyy-mm-dd', {ISO formats}
     'dddddd tt', 'dddddd', 'tt',  {Long formats}
     'ddddd t', 'ddddd', 't',  {Short formats}
     'f', {Mixed short date + long time}
     '');
```

the two empty strings (first and last) occupy no memory, right? Or are they just two Nil pointers?
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!)
Lazarus/FPC 2.0.8/3.0.4 & 2.0.10/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

#### howardpc

• Hero Member
• Posts: 3610
##### Re: Quick question about const array of string
« Reply #1 on: June 25, 2019, 12:21:54 am »
With {$H+} (AnsiString), empty strings are Nil pointers.

#### lucamar

• Hero Member
• Posts: 3435
##### Re: Quick question about const array of string
« Reply #2 on: June 25, 2019, 12:31:53 am »
So it is kind of like an array of pointers in which the first and last are Nil. Is that (kind of) it?

ETA Got a WTH question from outside, so I'll explain a little what I'm doing.

I have some menu/listbox/radiogroup items which select which format should be used to insert a date/time somewhere else. To select the format I'm using the component's Tag, and I've declared an enumeration:

```pascal
type
  DTFmtKind = (dtfNone,
      dtfISOFull, dtfISODate, dtfISOTime,
      dtfLongFull, dtfLongDate, dtfLongTime,
      dtfShortFull, dtfShortDate, dtfShortTime,
      dtfMixedFull,
      dtfLast);
```

The first and last members of the enumeration serve to test against (Sender as TComponent).Tag to see whether it's in or out of bounds. It's basically an elaborate Murphy-guard to prevent trying to access the array with an out-of-bounds index.

And if the number of possible formats changes in the future (very probable!) I'll have to change nothing but the array and the enumeration.

That's it: cautious laziness.
« Last Edit: June 25, 2019, 12:47:18 am by lucamar »

#### 440bx

• Hero Member
• Posts: 2096
##### Re: Quick question about const array of string
« Reply #3 on: June 25, 2019, 08:09:15 am »
> Just to test my understanding, in this construction:

The first thing to understand is that the way that construction is implemented depends on the setting of $LONGSTRINGS.

Consider the following very simple test program (using your array):
```pascal
{$APPTYPE CONSOLE}

{$LONGSTRINGS ON}
//{$LONGSTRINGS OFF}


program _PascalArrayOfStrings;

uses
  Windows,
  sysutils
  ;


const
  DTFmt: array[0..11] of string =
    ('',
     'yyyy-mm-dd hh:nn', 'yyyy-mm-dd', 'yyyy-mm-dd', {ISO formats}
     'dddddd tt', 'dddddd', 'tt',  {Long formats}
     'ddddd t', 'ddddd', 't',  {Short formats}
     'f', {Mixed short date + long time}
     '');


var
  i : integer;

begin

  for i := low(DtFmt) to high(DtFmt) do
  begin
    writeln(DtFmt[i]);
  end;

  writeln('press <enter>/<return> to end this program');
  readln;  // wait for the key press the prompt asks for
end.
```
The implementation and behavior are radically different depending on the setting of LONGSTRINGS.

Here is what happens when LONGSTRINGS is OFF:
```asm
PascalArrayOfStrings.lpr:32       writeln(DtFmt[i]);
00401563 e8b89c0000               call   0x40b220 <fpc_get_output>
00401568 89c3                     mov    %eax,%ebx
0040156A a100804100               mov    0x418000,%eax             ; load i into eax
0040156F c1e008                   shl    $0x8,%eax                 ; multiply by 256 (sizeof(shortstring))
00401572 8d8880304100             lea    0x413080(%eax),%ecx       ; load address of string into ecx
00401578 89da                     mov    %ebx,%edx                 ; the rest is writeln stuff
0040157A b800000000               mov    $0x0,%eax
0040157F e87c9e0000               call   0x40b400 <fpc_write_text_shortstr>
00401584 e8076f0000               call   0x408490 <fpc_iocheck>
00401589 89d8                     mov    %ebx,%eax
0040158B e8d09d0000               call   0x40b360 <fpc_writeln_end>
00401590 e8fb6e0000               call   0x408490 <fpc_iocheck>
00401595 833d008041000b           cmpl   $0xb,0x418000
0040159C 7cbe                     jl     0x40155c <main+28>
```

In that case the following is true:

1. there is no array of pointers pointing to individual strings
2. strings are in a READ/WRITE section of the executable, therefore they can be modified.
3. a '' string has 256 bytes reserved in the READ/WRITE section it resides in. (therefore it can be changed)
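
The three points above can be checked without a disassembler. The following is only a minimal sketch, assuming a reasonably recent FPC; the program name and the three-element `Fmt` array are hypothetical stand-ins for the array in the thread.

```pascal
program ShortStringSlots;

{$MODE OBJFPC}
{$LONGSTRINGS OFF}  // "string" now means ShortString (string[255])

const
  // Hypothetical three-element array, used only for illustration.
  Fmt: array[0..2] of string = ('', 'yyyy-mm-dd', '');

begin
  // One length byte plus 255 character slots, regardless of content.
  WriteLn('SizeOf(ShortString) = ', SizeOf(ShortString));
  // The array stores the strings inline: three slots of 256 bytes each.
  WriteLn('SizeOf(Fmt)         = ', SizeOf(Fmt));
  // An empty slot still exists; only its length byte (index 0) is zero.
  WriteLn('Length(Fmt[0])      = ', Length(Fmt[0]));
  WriteLn('Ord(Fmt[0][0])      = ', Ord(Fmt[0][0]));
end.
```

Since a ShortString is always 256 bytes, the sizes reported should be 256 and 768, which matches the `shl $0x8` (multiply by 256) used for indexing in the disassembly.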

Here is what happens for the very same array when LONGSTRINGS is on:
```text
                    v '' string
0x00000001000150D0  0000000000000000 0000000100017018  .........p......
0x00000001000150E0  0000000100017048 0000000100017070  Hp......pp......
0x00000001000150F0  0000000100017098 00000001000170c0  ˜p......Àp......
0x0000000100015100  00000001000170e0 0000000100017100  àp.......q......
0x0000000100015110  0000000100017120 0000000100017140   q......@q......
0x0000000100015120  0000000100017160 0000000000000000  `q..............
                                     ^ '' string

0x0000000100015130  000000010001f060 0001f07000000008  `ð..........pð..
0x0000000100015140  0000038000000001 000000010001f400  ........ô......
0x0000000100015150  0001f79000000380 0000038000000001  €....÷.........
```

The following is true in that case:

1. there is an array of pointers (to null terminated strings)
2. a '' is simply a null pointer
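
The same two points can be observed from Pascal code. This is a minimal sketch under the same assumptions as the dump ({$LONGSTRINGS ON}); the program name and the shortened `Fmt` array are hypothetical.

```pascal
program EmptyAnsiStringIsNil;

{$MODE OBJFPC}
{$LONGSTRINGS ON}  // "string" now means AnsiString (a managed pointer)

const
  // Hypothetical three-element array, used only for illustration.
  Fmt: array[0..2] of string = ('', 'yyyy-mm-dd', '');

begin
  // The array holds one pointer per element, not the characters themselves.
  WriteLn('SizeOf(Fmt) = ', SizeOf(Fmt), ' (3 * ', SizeOf(Pointer), ')');
  // Casting an AnsiString to Pointer exposes the raw reference.
  WriteLn('Fmt[0] is nil: ', Pointer(Fmt[0]) = nil);
  WriteLn('Fmt[1] is nil: ', Pointer(Fmt[1]) = nil);
  WriteLn('Fmt[2] is nil: ', Pointer(Fmt[2]) = nil);
end.
```

The first and last elements should report a nil pointer and the middle one should not, so the empty strings cost only the pointer slot in the array itself.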

For reference, note the address of the array of pointers: 0x1000150D0.

Following the pointers in the array shows:

```text
0x0000000100017018  79 79 79 79 2d 6d 6d 2d 64 64 20 68  yyyy-mm-dd h
0x0000000100017024  68 3a 6e 6e 00 00 00 00 00 00 00 00  h:nn........
0x0000000100017030  00 00 01 00 00 00 00 00 ff ff ff ff  ........ÿÿÿÿ
0x000000010001703C  ff ff ff ff 0a 00 00 00 00 00 00 00  ÿÿÿÿ........
0x0000000100017048  79 79 79 79 2d 6d 6d 2d 64 64 00 00  yyyy-mm-dd..
0x0000000100017054  00 00 00 00 00 00 01 00 00 00 00 00  ............
0x0000000100017060  ff ff ff ff ff ff ff ff 0a 00 00 00  ÿÿÿÿÿÿÿÿ....
0x000000010001706C  00 00 00 00 79 79 79 79 2d 6d 6d 2d  ....yyyy-mm-
0x0000000100017078  64 64 00 00 00 00 00 00 00 00 01 00  dd..........
0x0000000100017084  00 00 00 00 ff ff ff ff ff ff ff ff  ....ÿÿÿÿÿÿÿÿ
0x0000000100017090  09 00 00 00 00 00 00 00 64 64 64 64  ........dddd
0x000000010001709C  64 64 20 74 74 00 00 00 00 00 00 00  dd tt.......
0x00000001000170A8  00 00 01 00 00 00 00 00 ff ff ff ff  ........ÿÿÿÿ
0x00000001000170B4  ff ff ff ff 06 00 00 00 00 00 00 00  ÿÿÿÿ........
0x00000001000170C0  64 64 64 64 64 64 00 00 00 00 01 00  dddddd......
0x00000001000170CC  00 00 00 00 ff ff ff ff ff ff ff ff  ....ÿÿÿÿÿÿÿÿ
0x00000001000170D8  02 00 00 00 00 00 00 00 74 74 00 00  ........tt..
0x00000001000170E4  00 00 00 00 00 00 01 00 00 00 00 00  ............
0x00000001000170F0  ff ff ff ff ff ff ff ff 07 00 00 00  ÿÿÿÿÿÿÿÿ....
0x00000001000170FC  00 00 00 00 64 64 64 64 64 20 74 00  ....ddddd t.
0x0000000100017108  00 00 01 00 00 00 00 00 ff ff ff ff  ........ÿÿÿÿ
0x0000000100017114  ff ff ff ff 05 00 00 00 00 00 00 00  ÿÿÿÿ........
0x0000000100017120  64 64 64 64 64 00 00 00 00 00 01 00  ddddd.......
0x000000010001712C  00 00 00 00 ff ff ff ff ff ff ff ff  ....ÿÿÿÿÿÿÿÿ
0x0000000100017138  01 00 00 00 00 00 00 00 74 00 00 00  ........t...
0x0000000100017144  00 00 00 00 00 00 01 00 00 00 00 00  ............
0x0000000100017150  ff ff ff ff ff ff ff ff 01 00 00 00  ÿÿÿÿÿÿÿÿ....
0x000000010001715C  00 00 00 00 66 00 00 00 00 00 00 00  ....f.......
0x0000000100017168  2a 70 72 65 73 73 20 3c 65 6e 74 65  *press <ente
0x0000000100017174  72 3e 2f 3c 72 65 74 75 72 6e 3e 20  r>/<return>
0x0000000100017180  74 6f 20 65 6e 64 20 74 68 69 73 20  to end this
0x000000010001718C  70 72 6f 67 72 61 6d 00 00 00 00 00  program.....
```

A very important thing to notice is where the array of pointers is located and where the strings are located. They are NOT located in the same PE file section.

Because they are in different sections of the PE file, and those sections have different READ/WRITE attributes, the pointers can be altered but the strings the original pointers pointed to cannot, because they reside in a READ ONLY section of the PE file.

That's why the strings are writeable when LONGSTRINGS is OFF, but attempting to change the string contents in place results in an access violation when LONGSTRINGS is ON.

Here is the in-memory view of the mapped program:
```text
0x100000000, Image,        152 kB, WCX, lib\x86_64-win64\PascalArrayOfStrings.exe
0x100000000, Image: Commit,  4 kB, R,   lib\x86_64-win64\PascalArrayOfStrings.exe
0x100001000, Image: Commit, 80 kB, RX,  lib\x86_64-win64\PascalArrayOfStrings.exe

0x100015000, Image: Commit,  8 kB, RW,  lib\x86_64-win64\PascalArrayOfStrings.exe   // the pointer array is in a read/write section
0x100017000, Image: Commit, 32 kB, R,   lib\x86_64-win64\PascalArrayOfStrings.exe   // the strings are in a READ ONLY section
                                                                                    // that's the reason for access violations when
                                                                                    // attempting to change the string contents
0x10001f000, Image: Commit, 16 kB, RW,  lib\x86_64-win64\PascalArrayOfStrings.exe
0x100023000, Image: Commit,  4 kB, WC,  lib\x86_64-win64\PascalArrayOfStrings.exe
0x100024000, Image: Commit,  4 kB, RW,  lib\x86_64-win64\PascalArrayOfStrings.exe
0x100025000, Image: Commit,  4 kB, WC,  lib\x86_64-win64\PascalArrayOfStrings.exe
```

HTH.

ETA:

Just in case, it is necessary to scroll horizontally to see some comments on the right hand side.
« Last Edit: June 26, 2019, 02:25:30 pm by 440bx »
FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

#### PascalDragon

• Hero Member
• Posts: 2598
• Compiler Developer
##### Re: Quick question about const array of string
« Reply #4 on: June 25, 2019, 09:06:22 am »
> Just to test my understanding, in this construction:
>
> ```pascal
> const
>   DTFmt: array[DTFmtKind] of string =
>     ('',
>      'yyyy-mm-dd hh:nn', 'yyyy-mm-dd', 'yyyy-mm-dd', {ISO formats}
>      'dddddd tt', 'dddddd', 'tt',  {Long formats}
>      'ddddd t', 'ddddd', 't',  {Short formats}
>      'f', {Mixed short date + long time}
>      '');
> ```
>
> the two empty strings (first and last) occupy no memory, right? Or just two Nil pointers?

Just compile it with -al and you'll see how the array is laid out (in this case with {$H+}):

```asm
# Begin asmlist al_const

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld1$strlab:
    .short 0,1
    .long -1,16
.Ld1:
# [33] 'yyyy-mm-dd hh:nn', 'yyyy-mm-dd', 'yyyy-mm-dd', {ISO formats}
    .ascii "yyyy-mm-dd hh:nn\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld2$strlab:
    .short 0,1
    .long -1,10
.Ld2:
    .ascii "yyyy-mm-dd\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld3$strlab:
    .short 0,1
    .long -1,10
.Ld3:
    .ascii "yyyy-mm-dd\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld4$strlab:
    .short 0,1
    .long -1,9
.Ld4:
# [34] 'dddddd tt', 'dddddd', 'tt',  {Long formats}
    .ascii "dddddd tt\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld5$strlab:
    .short 0,1
    .long -1,6
.Ld5:
    .ascii "dddddd\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld6$strlab:
    .short 0,1
    .long -1,2
.Ld6:
    .ascii "tt\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld7$strlab:
    .short 0,1
    .long -1,7
.Ld7:
# [35] 'ddddd t', 'ddddd', 't',  {Short formats}
    .ascii "ddddd t\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld8$strlab:
    .short 0,1
    .long -1,5
.Ld8:
    .ascii "ddddd\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld9$strlab:
    .short 0,1
    .long -1,1
.Ld9:
    .ascii "t\000"

.section .rodata.n_.Ld1,"d"
    .balign 4
.Ld10$strlab:
    .short 0,1
    .long -1,1
.Ld10:
# [36] 'f', {Mixed short date + long time}
    .ascii "f\000"
# End asmlist al_const
# Begin asmlist al_typedconsts

.section .data.n_TC_$P$THELLOWORLD_$$_DTFMT,"d"
    .balign 4
TC_$P$THELLOWORLD_$$_DTFMT:
    .long 0
    .long .Ld1
    .long .Ld2
    .long .Ld3
    .long .Ld4
    .long .Ld5
    .long .Ld6
    .long .Ld7
    .long .Ld8
    .long .Ld9
    .long .Ld10
    .long 0
# End asmlist al_typedconsts
```

As you can see, the first and last elements are Nil.

But let's be honest: even if they weren't Nil, one element more or less doesn't really matter that much (except if you're playing around on AVR or some other memory-constrained hardware).

Sidenote: With {$H-} or explicit use of ShortString you'd always have 256 bytes used for each string.

#### lucamar

• Hero Member
• Posts: 3435
##### Re: Quick question about const array of string
« Reply #5 on: June 26, 2019, 05:37:08 pm »
Thanks everyone! Now I have a real understanding of how things work deep down.

Somehow I knew all that but in a tenuous, nebulous way, as in "it's probably something like this".

Much better to be able to map your declarations to what the compiler does with them and say: "it is like this".

Again, thank you.

#### Thaddy

• Hero Member
• Posts: 10684
##### Re: Quick question about const array of string
« Reply #6 on: June 26, 2019, 06:15:18 pm »
Just remember that typed consts have pointers to something, even nil. They have a location in memory.
Untyped consts are determined at compile time, do not necessarily have a memory location, and are read-only nowadays.
Sven forgot to mention that. Untyped consts can be replaced by a literal.
« Last Edit: June 26, 2019, 06:18:22 pm by Thaddy »

#### 440bx

• Hero Member
• Posts: 2096
##### Re: Quick question about const array of string
« Reply #7 on: June 26, 2019, 07:08:10 pm »
> Just remember typed consts have pointers to something, even nil.

That's a dangerous generalization.  For instance, in the case of a shortstring typed constant, there is no _separate_ pointer pointing to the contents of the string, e.g.:

```pascal
const
  s : shortstring = '';
```

Unlike for a longstring, that declaration does _not_ cause the compiler to save a pointer to s somewhere in the data segment and, in the above case, definitely not one that is nil or points to nil, since the compiler will reserve a full 256 bytes for the string even though it has been declared as empty.

What you said is the case for longstrings but not for all typed const variables.

HTH.

#### Thaddy

• Hero Member
• Posts: 10684
##### Re: Quick question about const array of string
« Reply #8 on: June 26, 2019, 07:28:58 pm »
Well, the compiler has to allow for {$J+} and {$J-} (these are local directives!), and that does NOT make any difference for the storage, shortstring or longstring.
What I meant is: untyped consts are read-only and, for small values, replaced with literal values. Typed consts are, in essence, different beasts.
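
To make the typed/untyped distinction concrete, here is a minimal sketch. The names `MaxItems` and `Counter` are hypothetical, and it assumes the usual FPC behaviour that a typed constant declared under {$J+} can be assigned to.

```pascal
program TypedVsUntypedConst;

{$MODE OBJFPC}
{$LONGSTRINGS ON}

const
  // Untyped: a compile-time value the compiler may substitute as a literal.
  MaxItems = 10;

{$J+}  // writeable typed constants for the declaration that follows
const
  // Typed: a real memory location holding an initial value.
  Counter: Integer = 10;

begin
  WriteLn('Untyped value: ', MaxItems);
  WriteLn('Typed const occupies ', SizeOf(Counter), ' bytes');
  Counter := Counter + 1;  // accepted because Counter was declared under {$J+}
  WriteLn('Typed value after increment: ', Counter);
  // MaxItems := 11;       // would not compile: untyped constants are read-only
end.
```

Declaring `Counter` under {$J-} instead should turn the assignment into a compile-time error, while the storage for `Counter` is still reserved either way, which is the point being made about storage.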
« Last Edit: June 26, 2019, 07:30:39 pm by Thaddy »

#### lucamar

• Hero Member
• Posts: 3435
##### Re: Quick question about const array of string
« Reply #9 on: June 26, 2019, 07:51:38 pm »
Thanks.

That's something I already had rather clear: the differences between "normal" and "typed" constants.

#### ASerge

• Hero Member
• Posts: 1720
##### Re: Quick question about const array of string
« Reply #10 on: June 26, 2019, 09:53:37 pm »
> ```pascal
> type
>   DTFmtKind = (dtfNone,
>       dtfISOFull, dtfISODate, dtfISOTime,
>       dtfLongFull, dtfLongDate, dtfLongTime,
>       dtfShortFull, dtfShortDate, dtfShortTime,
>       dtfMixedFull,
>       dtfLast);
> ```

It seems to me that the introduction of border elements is not necessary.
```pascal
{$APPTYPE CONSOLE}
{$MODE OBJFPC}
{$LONGSTRINGS ON}

type
  TDateTimeFormatKind = (
    dtfISOFull, dtfISODate, dtfISOTime, {ISO formats}
    dtfLongFull, dtfLongDate, dtfLongTime, {Long formats}
    dtfShortFull, dtfShortDate, dtfShortTime, {Short formats}
    dtfMixedFull {Mixed short date + long time}
  );

function IsValidFormatKind(Kind: Integer): Boolean; inline;
begin
  Result := Kind in [Ord(Low(TDateTimeFormatKind))..Ord(High(TDateTimeFormatKind))];
end;

function GetFormatString(Kind: TDateTimeFormatKind): string; inline;
const
  CDateTimeFormatString: array[TDateTimeFormatKind] of string = (
    'yyyy-mm-dd hh:nn', 'yyyy-mm-dd', 'yyyy-mm-dd',
    'dddddd tt', 'dddddd', 'tt',
    'ddddd t', 'ddddd', 't',
    'f'
  );
begin
  Result := CDateTimeFormatString[Kind];
end;

function GetFormatString(Kind: Integer): string;
begin
  if IsValidFormatKind(Kind) then
    Result := GetFormatString(TDateTimeFormatKind(Kind))
  else
    Result := '';
end;

begin
  Writeln(IsValidFormatKind(30));
  Writeln(IsValidFormatKind(-3));
  Writeln(IsValidFormatKind(3));
  Writeln(GetFormatString(30));
  Writeln(GetFormatString(-3));
  Writeln(GetFormatString(3));
end.
```
When you later want to add another format, you only need to add a value to the TDateTimeFormatKind enumeration and the matching entry to the "CDateTimeFormatString" array.

#### lucamar

• Hero Member
• Posts: 3435
##### Re: Quick question about const array of string
« Reply #11 on: June 26, 2019, 10:44:29 pm »
> It seems to me that the introduction of border elements is not necessary.
> [.. etc ...]

Yes, that's what I ended up doing (using Low() and High(), which is what I usually do) but that first quick-and-dirty iteration(*) got me thinking, so I headed over here and asked. As I said, just to test my understanding ... which seems to have been quite correct but rather shallow.

(*) It came out that way because that project uses a ton of constructions like that, so by the time I started adding my code my head was screwed to that "style". It happens <shrugs>

#### engkin

• Hero Member
• Posts: 2513
##### Re: Quick question about const array of string
« Reply #12 on: June 26, 2019, 11:14:50 pm »
I am confused.

What are these three numbers before each string:

```asm
.section .rodata.n_.Ld1,"d"
.balign 4
.Ld2$strlab:
.short 0,1
.long -1,10
.Ld2:
.ascii "yyyy-mm-dd\000"
```

The fourth number is the length of the string, and each string ends with a terminating zero.

Any idea about the other numbers?

#### Thaddy

• Hero Member
• Posts: 10684
##### Re: Quick question about const array of string
« Reply #13 on: June 26, 2019, 11:52:51 pm »
Educated guess:
```text
.section .rodata.n_.Ld1,"d"    ;read only data
.balign 4
.Ld2$strlab:
.short 0,1                     ; Shortstring declared, size is one to store length of zero.
.long -1,10                    ; Treat as PWideChar, length (payload) is 10.
.Ld2:
.ascii "yyyy-mm-dd\000"        ; terminator is WideChar.
```

I may be partially wrong here...
« Last Edit: June 26, 2019, 11:58:45 pm by Thaddy »

#### lucamar

• Hero Member
• Posts: 3435
##### Re: Quick question about const array of string
« Reply #14 on: June 27, 2019, 12:21:18 am »
> What are these three numbers before each string:
>
> ```asm
> .section .rodata.n_.Ld1,"d"
> .balign 4
> .Ld2$strlab:
> .short 0,1
> .long -1,10
> .Ld2:
> .ascii "yyyy-mm-dd\000"
> ```

Maybe it has something to do with how strings are represented in memory.
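
One plausible reading, consistent with the 64-bit dump earlier in the thread (two bytes `00 00`, two bytes `01 00`, then `ff..ff`, then the length), is that the values in front of each string form the AnsiString header: code page, element size, reference count and length. The sketch below is not taken from the thread; it assumes FPC 3.0 or later, where the System unit provides `StringCodePage`, `StringElementSize` and `StringRefCount`, and the program name is hypothetical.

```pascal
program AnsiStringHeader;

{$MODE OBJFPC}
{$LONGSTRINGS ON}

const
  S: string = 'yyyy-mm-dd';

begin
  // These System routines read fields stored just before the characters.
  WriteLn('Code page    : ', StringCodePage(S));     // would match the first .short
  WriteLn('Element size : ', StringElementSize(S));  // would match the second .short
  WriteLn('Ref count    : ', StringRefCount(S));     // would match the first .long
  WriteLn('Length       : ', Length(S));             // would match the second .long
end.
```

If that reading is right, a reference count of -1 marks a constant string that is never freed, and the element size of 1 means one byte per character, so the trailing `\000` would be an ordinary single-byte terminator.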