Lazarus

Free Pascal => General => Topic started by: 440bx on August 10, 2018, 02:10:02 am

Title: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 10, 2018, 02:10:02 am
hello,

Consider a simple definition like this one:
Code: Pascal
const
  somecharacters  : pchar = 'an array of characters';

The compiler knows at compile time how many characters are pointed to by the constant pointer "somecharacters" and, that's what I'd like to get.

In the above example, I'd like the compiler to tell me there are 22 characters (or 23 if it counts the null).  Is there a way to have the compiler return that value ?  (not using strlen, since that is a run time function, I want the size available at compile time.)

Thanks.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: engkin on August 10, 2018, 02:55:34 am
Looking at the generated assembly file (using -al), I don't think the compiler provides the length of somecharacters. It provides the characters:
Code: ASM
.section .data.n__$PROJECT1$_Ld1,"d"
        .balign 16
.globl  _$PROJECT1$_Ld1
_$PROJECT1$_Ld1:
# [13] somecharacters  : pchar = 'an array of characters';
        .ascii  "an array of characters\000"

and their address:
Code: ASM
.section .data.n_tc_$p$project1_$$_somecharacters,"d"
        .balign 4
TC_$P$PROJECT1_$$_SOMECHARACTERS:
        .long   _$PROJECT1$_Ld1

While for a string:
Code: Pascal
const
  somestring  : string = 'an array of characters in a string';

It has the length:
Code: ASM
.section .rodata.n__$PROJECT1$_Ld2,"d"
        .balign 4
        .short  0,1
        .long   -1,34
.globl  _$PROJECT1$_Ld2
_$PROJECT1$_Ld2:
# [16] somestring  : string = 'an array of characters in a string';
        .ascii  "an array of characters in a string\000"

But there does not seem to be a way to get it into another constant. However I think I might have found a bug:
Code: Pascal
const
  somestring  : string = 'an array of characters in a string';
  addr: pointer = @somestring[1];

addr holds @somestring+1, instead of @somestring[1]. Not sure if it is worth reporting.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 10, 2018, 03:29:35 am
Engkin, thank you for the thorough and detailed reply. 

I tried defining another constant immediately after that one and then taking the difference of the two pointers (a variation of $ - varname in assembly) but, unfortunately that didn't work and looking at the memory layout it was evident that it didn't work because of alignment. 

I tried {$packrecords 1} which the manuals say is the same as {$align 1} but, in memory, the fields / constants were still aligned on a 16 byte boundary (compiling for 64bit) which defeated the attempt.

I believe you are right.  There does not seem to be a way of coercing that value out of FPC.  can't win them all ... :)

Thank you again for the thorough reply.




Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Thaddy on August 10, 2018, 05:49:14 am
Well, for a string literal the compiler simply precalculates the length: no length-computing code ends up in the binary, and the generated code just loads the constant 22.
Code: Pascal
{$ifdef fpc}{$mode delphi}{$H+}{$I-}{$endif}
const
  somecharacters  = 'an array of characters';
begin
  writeln(Length(somecharacters));
end.

This is arm asm but you get the point:
Code: ASM
# [6] writeln(Length(somecharacters));
        bl      fpc_get_output
        mov     r4,r0
        mov     r2,#22   ;<------ pre-calculated by the compiler.
        mov     r1,r4
        mov     r0,#0
        bl      fpc_write_text_sint
        mov     r0,r4
        bl      fpc_writeln_end

So if possible do not use a typed const, but a literal string, i.e. an untyped const. x86_64 or i386 code will look very similar, with the pre-calculated length.
In the case of typed consts, length() will actually execute code: for a string it examines the length field, and for a PChar it effectively calls strlen.
 
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 10, 2018, 07:20:32 am
Quote from: Thaddy
So if possible do not use a typed const, but a literal string, i.e. an untyped const. x86_64 or i386 code will look very similar, with the pre-calculated length.
In the case of typed consts, length() will actually execute code: for a string it examines the length field, and for a PChar it effectively calls strlen.

Thanks Thaddy.  If the constant is not typed then simply sizeof(thecharacterconstant) returns the number of characters in the constant, which is nice but, unfortunately, doing it that way leads to some undesirable problems when using arrays.

Your mentioning non-typed constants gave me a few ideas to try.  Had some fun in the process, with the following code:
Code: Pascal
type
  Tarray = array[0..1] of pchar;

const
  // sizeof returns the size of these constants

  charconst1   = 'character constant 1';
  charconst2   = 'character constant 2 - made purposely larger';

// this variable declaration gives the compiler quite a headache.  :-)

var
  carray       : tarray absolute charconst1;

  ;            // for some reason the compiler wants an extra semicolon there.
               // it's happy to get the semicolon  but then it gets totally lost.

const
  // this works but, it becomes the programmer's responsibility to ensure that
  // the calculated character array sizes are in synch with the order of the
  // elements of the array.  It's a workaround that "works", but it just makes the
  // code unacceptably fragile.

  anarray   : array[0..1] of pchar = (charconst1, charconst2);

There doesn't seem to be a way to win.  I'll just use strlen. 

Thanks again.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Thaddy on August 10, 2018, 07:35:13 am
Better to simply use length(), which also works on PChars. The compiler generates about the same code as strlen.
I have another idea that seems to do what you want:
Code: Pascal
{$ifdef fpc}{$mode delphi}{$H+}{$I-}{$J-}{$endif}
const
  somecharacters:shortstring  = 'an array of characters';
begin
  writeln(Length(somecharacters));
end.

which does this:
Code: ASM
# [6] writeln(Length(somecharacters));
        bl      fpc_get_output
        mov     r4,r0
        ldr     r0,.Lj3
        ldrb    r2,[r0]
        mov     r1,r4
        mov     r0,#0
        bl      fpc_write_text_uint
        mov     r0,r4
        bl      fpc_writeln_end
Which references through .Lj3:
Code: ASM
TC_$P$UNTITLED_$$_SOMECHARACTERS:
        .byte   22 ; <--- shortstring length

Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 10, 2018, 08:18:23 am
Quote from: Thaddy
Better to simply use length(), which also works on PChars. The compiler generates about the same code as strlen.
Code: ASM
TC_$P$UNTITLED_$$_SOMECHARACTERS:
        .byte   22 ; <--- shortstring length

The compiler knows the length of the string but, it refuses to give it up at compile time.  It has no problem handing it over at run time.  I tried assigning s[0] to a constant.  It didn't like it.  Also tried length and a number of other "creative" ways of extracting the length at compile time, all to no avail.

You're right, I'll just resign myself to get the length at run time using a "compiler legal" way of getting it.  There doesn't seem to be any alternative anyway.

Thanks, that idea was worth checking out.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: PascalDragon on August 10, 2018, 09:10:27 am
Quote from: 440bx
Code: Pascal
type
  Tarray = array[0..1] of pchar;

const
  // sizeof returns the size of these constants

  charconst1   = 'character constant 1';
  charconst2   = 'character constant 2 - made purposely larger';

// this variable declaration gives the compiler quite a headache.  :-)

var
  carray       : tarray absolute charconst1;

No, just no. Such a declaration is an absolute accident waiting to happen, as the compiler is free to place the two constants where it wants, and if it thinks that a third, completely unrelated constant fits nicely between charconst1 and charconst2 then it will do so, thus breaking your array.

Maybe it's best for you to explain what you want to achieve in the end instead of posting strange abuses of array, absolute and PChar. This way we could try to find a solution that's efficient, but also maintainable.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 10, 2018, 10:01:35 am
Quote from: PascalDragon
No, just no. Such a declaration is an absolute accident waiting to happen, as the compiler is free to place the two constants where it wants, and if it thinks that a third, completely unrelated constant fits nicely between charconst1 and charconst2 then it will do so, thus breaking your array.

Maybe it's best for you to explain what you want to achieve in the end instead of posting strange abuses of array, absolute and PChar. This way we could try to find a solution that's efficient, but also maintainable.

What I want couldn't be any simpler, I want to have the count of characters in a constant array of characters _at compile time_  (the compiler has that information but, it won't give it up).    Nothing particularly extravagant, parallel to sizeof(sometype) but for constant arrays.    Having those constants would be helpful, just as having high/low of an array allows other arrays/data types to be "customized" to fit them.  Nothing "strange" in that.

As far as the use of absolute in that way, I agree with you but, I was willing to pull the compiler's teeth if necessary to, at least, find out if there was a way to get the values.

Also, I don't care where the compiler chooses to put the constants.  I simply wish it would provide a way for the programmer to obtain their size/(length in this case.)  Just as it does for most everything else.

It would be nice (and useful) if those values could be obtained at compile time but, since that is apparently not possible, I'll get their size at runtime (without abusing the compiler  :D)  pretty length or pretty strlen with a cherry on the top.




Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Thaddy on August 10, 2018, 10:14:31 am
Well, nothing strange in possibly {$J-} mode as a new optimization (even typed consts should be immutable if they are not assignable).
Anything else should rely on the "runtime" functions. E.g. C(++) doesn't even know any different. And length() already optimizes if possible, as I showed.
What I have seen is that:
- shortstring has an immediate length. (hardly any code)
- untyped consts as string literals have an immediate length. (even less code)
- Ansistrings and Unicode string consts have access to the length field and are not really inefficient.
- Pchar demands strlen (or the internal strlen from the compiler, just like C code)

I think that maybe it is possible for the compiler to optimize a typed string const in {$J-} mode, but certainly not in {$J+} mode.
And I wonder what happens since a) Pchar is a foreign type to support C and derivatives and b) the other problematic cases are managed types.

I think the gain is not worth the trouble, since both untyped literals and shortstring literals are assignment compatible with the other string types, even PChar.
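
As a rough illustration of the four cases listed above, the sketch below declares the same text four ways and calls Length() on each. The program and constant names are made up for this example, and only the untyped constant is expected to be folded into an immediate value, as in the ARM listing earlier in the thread.

```pascal
program LengthKinds;

{$mode objfpc}{$H+}

const
  // Untyped constant: a plain string literal as far as the compiler is concerned.
  Untyped = 'an array of characters';
  // Typed constants: initialised variables, so Length() is resolved at run time.
  Short: ShortString = 'an array of characters';
  Ansi: AnsiString = 'an array of characters';
  P: PChar = 'an array of characters';

begin
  WriteLn(Length(Untyped)); // length known to the compiler
  WriteLn(Length(Short));   // reads the length byte
  WriteLn(Length(Ansi));    // reads the length field of the managed string
  WriteLn(Length(P));       // scans for the null terminator, like strlen
end.
```

All four calls should print 22; they differ only in how much work is left for run time.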
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 10, 2018, 10:24:14 am
Quote from: Thaddy
I think the gain is not worth the trouble, since both untyped literals and shortstring literals are assignment compatible with the other string types, even PChar.

You're definitely right about that.  I asked the question to make sure there wasn't some feature in FPC, that I was not aware of, that would yield those values.  I'll just determine the length at runtime.  As you pointed out, it really won't take much code to get the lengths and use them.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: ASerge on August 10, 2018, 09:41:46 pm
Quote from: 440bx
Code: Pascal
const
  somecharacters  : pchar = 'an array of characters';
The compiler knows at compile time how many characters are pointed to by the constant pointer "somecharacters" and, that's what I'd like to get.

Code: Pascal
  1. program Project1;
  2. {$APPTYPE CONSOLE}
  3. {$OPTIMIZATION OFF}
  4.  
  5. const
  6.   CSomeCharacters = 'an array of characters';
  7.   SomeCharacters: PChar = CSomeCharacters;
  8. begin
  9.   if Length(CSomeCharacters) <> 22 then
  10.     Writeln('You won''t see it');
  11.   Writeln(Length(CSomeCharacters));
  12.   Writeln(Length(SomeCharacters));
  13.   Readln;
  14. end.
Even so: project1.lpr(10,5) Warning: unreachable code
There is only one copy of the string in the executable.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 10, 2018, 10:46:11 pm
Quote from: ASerge
Even so: project1.lpr(10,5) Warning: unreachable code
There is only one copy of the string in the executable.

The compiler obviously knows the length of the character array constant.  There just doesn't seem to be any way of getting it at compile time.  That's unfortunate, there are times when that information is useful.

Nice example Serge, proves beyond any doubt that the compiler knows the length and it is using it. Thanks.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: avra on August 10, 2018, 11:43:29 pm
How about letting the IDE deal with counting array elements instead of the compiler doing it?
Something like https://forum.lazarus.freepascal.org/index.php?topic=27186.15
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Martin_fr on August 10, 2018, 11:53:48 pm
If you do a shortstring (max len 255) and sizeof(), maybe that will do. (not tested, and if it works may be one extra)
Code: Pascal
const foo = shortstring('abc');

With ansistring that will not work, since sizeof(ansistring) = sizeof(pointer).

What would you want to do with the value, if you could get it?
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 11, 2018, 02:24:41 am
Quote from: avra
How about letting the IDE deal with counting array elements instead of the compiler doing it?
Something like https://forum.lazarus.freepascal.org/index.php?topic=27186.15

That is a nice idea but, it has one major downside.  If a string is changed (made smaller or longer) and the programmer forgets to tell the IDE to "update" the dependent locations then, things are no longer in sync.  Additionally, it's difficult to trust the IDE to do those things correctly since the IDE usually, unlike the compiler, has a limited view of the entire program.

Quote from: Martin_fr
If you do a shortstring (max len 255) and sizeof(), maybe that will do. (not tested, and if it works may be one extra)
Code: Pascal
const foo = shortstring('abc');

With ansistring that will not work, since sizeof(ansistring) = sizeof(pointer).

What would you want to do with the value, if you could get it?

Unfortunately, even with a shortstring, the compiler does not allow using the length of the string at compile time.  For instance, it will not accept "const Mylength = mystring[0];" or "const mylength = length(mystring);".  Taking sizeof(shortstring) will yield a value that is the size of a shortstring, not the size/character count of the string it holds (which is reasonable since, as you know, that is not a constant, it's just an initialized variable.)

As far as what I'd do with it, it would be a safe and convenient way of defining the size of dependent types, just as is commonly done with regular data types, e.g. "somecharbuffer = array[0..2 * sizeof(byte)] of char" to define a buffer that can hold a (single byte) character for each nibble in a byte plus the null terminator. Change "byte" to any other ordinal type to have a buffer that snugly holds the type converted to "array of char" (hex conversion). If the programmer needs a buffer to hold a converted qword, just change "byte" to "qword" and everything is updated automatically by the compiler.  That's just an example; as you know, there are countless cases where knowing the size of an item at compile time can be very useful in the definition of dependent types and in writing code that is automatically updated by the compiler if the target type (in this case byte) changes in the future.
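
The hex buffer idea can be sketched roughly as follows. TValue, THexBuffer and ToHex are names invented for this illustration, not anything from the port being discussed; the point is only that the buffer type follows SizeOf(TValue).

```pascal
program HexBufferSketch;

{$mode objfpc}{$H+}

type
  // Hypothetical target type: change Byte to Word, DWord or QWord
  // and the buffer below is resized by the compiler.
  TValue = Byte;
  // Two hex digits per byte, plus one element for the null terminator.
  THexBuffer = array[0..2 * SizeOf(TValue)] of Char;

procedure ToHex(Value: TValue; out Buf: THexBuffer);
const
  Digits: array[0..15] of Char = '0123456789ABCDEF';
var
  i: Integer;
begin
  // Fill the digits from the least significant nibble backwards.
  for i := High(Buf) - 1 downto 0 do
  begin
    Buf[i] := Digits[Value and $F];
    Value := Value shr 4;
  end;
  Buf[High(Buf)] := #0;
end;

var
  Buf: THexBuffer;
begin
  ToHex(TValue($A5), Buf);
  WriteLn(PChar(@Buf[0])); // prints A5
end.
```

Nothing in ToHex mentions a size directly: the loop bounds come from High(Buf), so the code keeps working when TValue changes.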

Specifically with null terminated strings: if you're going to build at run time a "composite" string made of various constant strings, and you know at compile time the character counts of each string that will make up the composite string, then it is possible to declare a buffer type along the lines of "TMyBuffer = array[0..sizeofastring + sizeofanotherstring + sizeofyetanotherstring] of char;".  For one thing, it spares the programmer from having to define a maximum size, thereby ensuring that the buffer is always large enough to accommodate the resulting string.  If a routine builds half a dozen of these strings and they reside on the stack before being output, it can make the difference between allocating a few hundred bytes and allocating a "max_buffer_size" for each string while still running the risk that one combination causes a buffer overflow, thereby corrupting the stack.

Basically, it allows to write cleaner and safer code when strings (char arrays) are involved.

It would be nice if it were available, particularly considering that the compiler has the information but, Length and strlen, it is.

Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Martin_fr on August 11, 2018, 02:37:05 am
Code: Pascal
program Project1;

const
  foo = shortstring('abc123def');
  l = sizeof(foo);

begin
  writeln(l);
  readln;
end.

Compiles and prints 9. (that is bytes, not chars, in case of utf8)

Which is interesting, because a shortstring also contains the length of the string, so I expected 10.

You can always add an "assert", that of course is runtime. But it would be only during testing, to alert you if any assumption went wrong.
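
A minimal sketch of that assert idea, assuming a hand-maintained length constant (SomeCharactersLen is a made-up name) and assertions switched on explicitly, since FPC leaves them off by default:

```pascal
program AssertSketch;

{$mode objfpc}{$H+}
{$ASSERTIONS ON}

uses
  Strings;

const
  SomeCharacters: PChar = 'an array of characters';
  // Hypothetical constant kept in sync by hand.
  SomeCharactersLen = 22;

begin
  // Run-time check: fails loudly during testing if the text was edited
  // without updating the constant.
  Assert(StrLen(SomeCharacters) = SomeCharactersLen,
    'SomeCharactersLen is out of date');
  WriteLn('length constant verified');
end.
```

The check costs one strlen call per constant and can be compiled out again by turning assertions off.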


Also note: shortstrings are like records passed by value, not by pointer (except for constref).
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 11, 2018, 03:22:58 am
Quote from: Martin_fr
Which is interesting, because a shortstring also contains the length of the string, so I expected 10.

As I am sure you know, the typecast doesn't convert the string into a shortstring.  It is still a null terminated array of characters.  If it had done a conversion, the sizeof it would have been 256. 

As you suggested, I am making sure the code checks that things "fit" and if they don't, the function returns false without "breaking" anything.

Having the size/length at compile time would be very useful.  It wouldn't require code to ensure everything is safe and it would consequently be cleaner.

back to strlen...  :-\
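
The run-time check described above (verify that things fit, otherwise return false without breaking anything) might look roughly like this. TryAppend is a hypothetical helper written for this illustration, not code from the port.

```pascal
program SafeAppendSketch;

{$mode objfpc}{$H+}

uses
  Strings;

// Appends Src to the null-terminated text already in Dest.
// Returns False and leaves Dest untouched when the result would not fit
// in a buffer of DestSize bytes.
function TryAppend(Dest: PChar; DestSize: SizeInt; Src: PChar): Boolean;
var
  Used, Needed: SizeInt;
begin
  Used := StrLen(Dest);
  Needed := StrLen(Src);
  // One extra byte is required for the null terminator.
  Result := Used + Needed + 1 <= DestSize;
  if Result then
    Move(Src^, Dest[Used], Needed + 1); // copies the terminator as well
end;

var
  Buffer: array[0..15] of Char;
begin
  Buffer[0] := #0;
  WriteLn(TryAppend(@Buffer[0], SizeOf(Buffer), 'some text')); // fits: TRUE
  WriteLn(TryAppend(@Buffer[0], SizeOf(Buffer), ' and more')); // too long: FALSE
  WriteLn(PChar(@Buffer[0]));                                  // some text
end.
```

This is the extra code a compile-time length would make unnecessary, which is the trade-off being discussed.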

Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Thaddy on August 11, 2018, 11:46:03 am
Without a typed const, indeed... the compiler simply translates it into a string literal if you use an untyped const declaration, with or without type indication by cast.
But isn't strlen your real issue? It is really not necessary unless you are interfacing with C code (and the like). It is also not correct for Pascal strings: Pascal strings can contain zeros somewhere in the middle.
Code: Pascal
var s: string = 'testme'#0'testmesomemore'; // C can only write 'testme'
begin
  writeln(s);
end.
or:
Code: Pascal
const s: string = 'testme'#0'testmesomemore';// C can only write 'testme'
begin
  writeln(s);
end.
or:
Code: Pascal
const s = 'testme'#0'testmesomemore';// C can only write 'testme'
begin
  writeln(s);
end.

All three output the same, because they are native Pascal string constructs.
Forget about PChars, unless you interface with dumb string languages.  :D
It seems to me you decided on strlen for all the wrong reasons: Pascal has a distinctive way to handle strings.
You should use length(), which is agnostic to the issue at hand. Strlen is just for PChars, and Length handles those too.
(Sorry if I offend you again, but I know something about programming languages.) Using strlen on Pascal strings is not Pascal and is bad programming, as the above demonstrates; it can have all kinds of side effects:
Code: Pascal
const s = 'testme'#0'testmesomemore';
begin
  writeln(Length(s)); // writes 21: correct
  writeln(strlen(s)); // writes 6: dumbed down C style
end.

This happens when you use strlen on pascal code...... < grumpy.. >:D >:D  :D 8-) O:-) >

Let's see what happens here:
Code: Pascal
const s:PChar { dumb, monkey style, I know }= 'testme'#0'testmesomemore';
begin
  writeln(Length(s)); // hey presto! 6! Where's my content?
  writeln(strlen(s)); //Hey presto! 6!
end.

DON'T use strlen. simple. Unless for interfacing.

Well, this should read like a comic book for most Pascal programmers... :-X 8) ;D O:-)


Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 11, 2018, 01:06:47 pm
Quote from: Thaddy
Without a typed const, indeed... the compiler simply translates it into a string literal if you use an untyped const declaration, with or without type indication by cast.
But isn't strlen your real issue? It is really not necessary unless you are interfacing with C code (and the like).

This is a port of a C program to FPC.  I could take all the C null terminated character arrays and convert them into Pascal strings.  That's always an option but, it makes the port more complicated because some of the strings are longer than 255 characters.  I could use AnsiStrings which can host more than 255 characters, if I go that route I open the door to potential problems since they are a managed type (reference counted.)

The idea is to keep the initial port as close to the original C code as possible to avoid surprises.  Once the initial port is running successfully then, closely examine the result to determine how it can be made cleaner and simpler using Pascal specific constructs/types (particularly objects and properties.)  I've already seen plenty of ways to improve the program with Pascal constructs but, if I start indulging, the port is no longer a port, it's a complete rewrite, which I intend to do after I have a ported version that works as it should.

All that said, I am using some Pascal features that don't really change anything but do make the program cleaner.  For instance, ranges, enumerated types and, var parameters.   Small things that have very little impact overall.

In the first step, the overriding goal is to eliminate all the dependencies on the C standard library.  That alone can occasionally be more delicate than it initially seems.

I'm also using the port to learn about FPC, its "personality" and quirks, and also about Lazarus as a development environment.

Thank you for the input.  I do agree that Pascal strings are a much better (and flexible) way of handling character arrays than the C pointers to char but, for now, I'm limiting changes to small improvements like using low/high/enumerated types/var and the like.



Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Thaddy on August 11, 2018, 01:41:56 pm
In mode $H+ pascal strings are practically unlimited in length.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Xor-el on August 11, 2018, 02:04:50 pm
Quote from: Thaddy
In mode $H+ pascal strings are practically unlimited in length.

@Thaddy, I guess you wanted to say limited to 2GB right?  :)
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Thaddy on August 11, 2018, 02:16:17 pm
Quote from: Xor-el
Quote from: Thaddy
In mode $H+ pascal strings are practically unlimited in length.

@Thaddy, I guess you wanted to say limited to 2GB right?  :)

On a 32 bit platform. Yes. And even that depends. :)
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Bart on August 11, 2018, 02:56:31 pm
Quote from: Martin_fr
Code: Pascal
program Project1;

const
  foo = shortstring('abc123def');
  l = sizeof(foo);

begin
  writeln(l);
  readln;
end.

Compiles and prints 9. (that is bytes, not chars, in case of utf8)

Which is interesting, because a shortstring also contains the length of the string, so I expected 10.

Delphi (7) prints 256 (which is what I expected, since shortstring is string[255] and it occupies 256 bytes of memory).
No idea about more modern Delphi's.

Bart
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 11, 2018, 04:03:47 pm
Quote from: Bart
Delphi (7) prints 256 (which is what I expected, since shortstring is string[255] and it occupies 256 bytes of memory).
No idea about more modern Delphi's.

Bart

I'm really surprised.  I just tried it with Delphi 2 and Delphi 10 Seattle and both agree on 256.

However, if one tries to assign a new value to "foo", both complain that the left side cannot be assigned to, which proves that Delphi didn't take the cast as a datatype. Yet a hex dump of the executable reveals that it did allocate space in the initialized data section (256 bytes' worth, so it did do a conversion; the extra space is filled with nulls). Also, if one tries to either take the address of foo or go to the address of foo in memory, Delphi returns an error in both cases.

Basically, it creates a read-only variable.   It probably does that, just in case, that somewhere in the code the programmer reads bytes in the string that are beyond the length of the string but still within the limits of its size (not exactly good programming.)  With Delphi that would result in a null being read.

FPC does it quite differently.  If the program is compiled with debug info then there is debugging information about the constant but no memory/section space is allocated to it.  If compiled without debugging information then, there is no trace anywhere in the executable of the constant's existence in the code.

What Delphi does is "safe" but what FPC does is what actually is correct.




Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Thaddy on August 11, 2018, 04:14:41 pm
Quote from: Bart
Quote from: Martin_fr
Code: Pascal
program Project1;

const
  foo = shortstring('abc123def');
  l = sizeof(foo);

begin
  writeln(l);
  readln;
end.

Compiles and prints 9. (that is bytes, not chars, in case of utf8)

Which is interesting, because a shortstring also contains the length of the string, so I expected 10.

Delphi (7) prints 256 (which is what I expected, since shortstring is string[255] and it occupies 256 bytes of memory).
No idea about more modern Delphi's.

Bart

Bart, Delphi and FPC reserve 256 bytes, but store the shortstring correctly, with its size. Hence length.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 11, 2018, 04:23:27 pm
Quote from: Thaddy
Bart, Delphi and FPC reserve 256 bytes, but store the shortstring correctly, with its size. Hence length.

FPC doesn't reserve 256 bytes.  It reserves 0 bytes.  Unlike in Delphi, after compiling, there is no trace of foo in the executable (except if debug info is requested, then its existence as a constant appears in the debugging information.)
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Thaddy on August 11, 2018, 07:15:17 pm
Quote from: 440bx
Quote from: Thaddy
Bart, Delphi and FPC reserve 256 bytes, but store the shortstring correctly, with its size. Hence length.

FPC doesn't reserve 256 bytes.  It reserves 0 bytes.  Unlike in Delphi, after compiling, there is no trace of foo in the executable (except if debug info is requested, then its existence as a constant appears in the debugging information.)

Uhhhhh. Wrong again.
I will show you the full asm example from one of the previous to illustrate that...
Wait a sec..
Code: ASM
TC_$P$SSTRING_$$_SOMECHARACTERS:
        .byte   22
# [4] somecharacters:Shortstring  = 'an array of characters';
        .ascii  "an array of characters\000                         "
        .ascii  "                                                   "
        .ascii  "                                                   "
        .ascii  "                                                   "
        .ascii  "                                                   "
        .ascii  "   "
.Le11:

Count them... that is 255 + the .byte 22.....
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 11, 2018, 08:54:08 pm
Quote from: Thaddy
Count them... that is 255 + the .byte 22.....

I have no doubt you see them, that along with a few pink elephants and a few cases of bottles of wine, which is probably the one thing you are good at, drinking it.

When you're done... _compile this_ and examine it with a hex viewer or have dumpbin disassemble it for you.

Code: Pascal
program WithShortstrings;

const
  foo   = shortstring('abc123def');


  l  = sizeof(foo);


begin
 writeln(l);

 readln;
end.

if you have any questions about where the example comes from, refer to your own message #25.  Hopefully, you are able to understand yourself (which does not guarantee in any way that your understanding is correct.)
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: avra on August 12, 2018, 06:37:46 pm
Quote from: 440bx
Quote from: avra
How about letting the IDE deal with counting array elements instead of the compiler doing it?
Something like https://forum.lazarus.freepascal.org/index.php?topic=27186.15

That is a nice idea but, it has one major downside.  If a string is changed (made smaller or longer) and the programmer forgets to tell the IDE to "update" the dependent locations then, things are no longer in sync.  Additionally, it's difficult to trust the IDE to do those things correctly since the IDE usually, unlike the compiler, has a limited view of the entire program.

I didn't say it's perfect. I just don't think you have a better solution at this moment. If it were up to me, I would make the IDE script even simpler: just count the commas in a selection, add 1 to the result and put it where it belongs in the code. Crude, but it should do the job.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 12, 2018, 06:48:46 pm
Don't get me wrong.  I appreciate the suggestion.  The goal of the question was to find out whether there is a reliable mechanism for determining the lengths at compile time, so they could be used to define buffers of the right size that the compiler would automatically adjust if any string changed.

At this point, I think the only "reasonable" solution is to determine the string length at run time.  Of course, not even that is as reliable as having the compiler make the size available at compile time, but it is a general solution that works every time (as long as it is bug-free, of course ;))
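
A minimal sketch of that run-time approach, assuming a heap buffer is acceptable. StrLen, GetMem, Move and FreeMem come from the System unit; the program and variable names are made up for this illustration.

```pascal
program RuntimeLength;

{$mode objfpc}{$H+}

const
  SomeText : PChar = 'an array of characters';

var
  Len    : SizeInt;
  Buffer : PChar;

begin
  // StrLen walks the characters until it finds the terminating #0,
  // so the length is only known at run time.
  Len := StrLen(SomeText);

  // Allocate one extra byte so the terminator fits as well.
  GetMem(Buffer, Len + 1);

  // Copy the characters together with the terminating #0.
  Move(SomeText^, Buffer^, Len + 1);

  WriteLn(Len);      // 22 for the text above
  WriteLn(Buffer);

  FreeMem(Buffer);
end.
```

The cost is a scan of the string and a heap allocation at run time; in exchange, the buffer always matches the current text.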
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 12, 2018, 08:24:16 pm
If You are looking for something else then it eludes me..

You have the right idea, but the compiler doesn't provide all the information I need to implement it.  Here is an example:

Code: Pascal
const
  SomeText    : pchar = 'some text here';
  MoreText    : pchar = ' more text here';
  ALittleMore : pchar = 'and some more here';

At run time, I have to create a character array composed of those 3 "constants" and some other text (to be determined at run time).  It would be nice, not to mention useful, to declare a buffer to hold the end result.  Something along the lines of

type
  TheBuffer : array[0..length(SomeText) + length(MoreText) + length(ALittleMore) + "maximum size of text that will be added at run time"] of char;

That way, the size of "TheBuffer" automatically adapts to any changes made to the 3 constants.  It spares the programmer from having to define some artificial "maximum size", which could become insufficient if the programmer changes the strings and doesn't notice that the defined "maximum size" is now too small, and which wastes memory in the meantime (though the wasted memory is not much of a concern; avoiding it is just a nice bonus.)

It's the same thing that is done with other structure sizes the compiler makes available at compile time.  The only difference is that the type is a const pchar, which makes the compiler unwilling to give its size, even though the compiler knows the array is a constant.
 
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 12, 2018, 09:48:32 pm
I just created 3 non-writable constants containing the text, then finalized it with a writable constant of the whole that works at compile time and also allows for a max size...

Code: Pascal
const
  SomeText    = ' Some text';
  MoreText    = 'MoreText';
  AlittleMore = ' AlittleMore';
  // SomeMaxSize is assumed to be declared elsewhere.
  TotalBuffer : array[0..SomeMaxSize] of char = SomeText + MoreText + AlittleMore;

So do whatever you want with it at runtime afterwards.

TotalBuffer that is..
Yes, that is no problem.  The problem is that the SomeMaxSize you declare is not based on the sizes of the three constants.  Someone can change the constants in such a way that they exceed SomeMaxSize, and then there is a problem.  If SomeMaxSize could be determined from the sizes of the constants, the code would always work because the buffer size, calculated by the compiler, would always be large enough.

Also, unlike in your example, the result will not simply be an addition of the constants.  There will be some dynamically determined text in addition to the character constants which will make up the "final" text to display/store.

Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: Martin_fr on August 12, 2018, 11:01:32 pm
Code: Pascal
program Project1;
const
  a = 'abcdefghi';
  l = length(a);
var
  b: array[0..l] of char;

begin
  move(a[1], b[0], l);
  writeln(a);
  writeln(l);
  writeln(b[l-1]);
  readln
end.

Compiles, runs and prints:
abcdefghi
9
i
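
Why this works: `a` is an untyped constant, so `length(a)` is a constant expression that can appear in an array bound. A typed constant such as `x : pchar = '...'` is an initialized variable and cannot be used that way. The sketch below applies the same idea to the three texts from earlier in the thread, declared as untyped constants. MAX_RUNTIME_TEXT, TTextBuffer, Used and Append are hypothetical names introduced for this illustration, and the program was not part of the original posts.

```pascal
program CompileTimeBufferSize;

{$mode objfpc}{$H+}

const
  // Untyped constants: usable inside constant expressions.
  SomeText    = 'some text here';
  MoreText    = ' more text here';
  ALittleMore = 'and some more here';

  // Hypothetical upper bound for the text that is added at run time.
  MAX_RUNTIME_TEXT = 16;

  // Evaluated by the compiler; follows any change made to the three texts.
  FixedLength = Length(SomeText) + Length(MoreText) + Length(ALittleMore);

type
  // Bounds 0..N give N + 1 elements, which leaves room for the terminating #0.
  TTextBuffer = array[0..FixedLength + MAX_RUNTIME_TEXT] of Char;

var
  Buffer : TTextBuffer;
  Used   : SizeInt = 0;

// Appends S to Buffer and stops early if only the slot for #0 is left.
procedure Append(const S: string);
var
  i: Integer;
begin
  for i := 1 to Length(S) do
  begin
    if Used >= High(Buffer) then
      Exit;
    Buffer[Used] := S[i];
    Inc(Used);
  end;
end;

begin
  Append(SomeText);
  Append(MoreText);
  Append(ALittleMore);
  Append(' + runtime');   // stands in for text determined at run time
  Buffer[Used] := #0;

  WriteLn(FixedLength);
  WriteLn(PChar(@Buffer[0]));
end.
```

Copying character by character avoids any question about how Move treats a string constant; a real program would likely use a faster copy.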
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: 440bx on August 13, 2018, 12:40:10 am
Martin, that is beautiful !!

I can use that.  I added some code in the example you provided to give a rough idea of how I will use it.  The actual code does more acrobatics with the text but, the essence is the same.

Code: Pascal
program lengthofcharacterarrayconstants2;

const
  a = 'abcdefghi';

  l = length(a);

type
  TINDEXES  = 0..3;

const
  // if any string is made longer than the currently longest string then the
  // constant "Longest" needs to be updated to be the length of the longest
  // string.

  const_1   = 'this is const 1 - a';
  const_2   = 'this is const 2 - ab';
  const_3   = 'this is const 3 - cba';
  const_4   = 'this is const 4 - 12';

const
  // comment here informing that Longest should be set to the length of the
  // longest string in the set of const_xx

  Longest   = length(const_3);    // const_3 randomly chosen for
                                  // the example's sake

  constants : array[TINDEXES] of pchar = (const_1, const_2, const_3, const_4);

const
  ADDITIONAL_SPACE_NEEDED = 16;   // for additional text in real life app.
                                  // determined by a combination of text.
  space     = ' ';
  comma     = ',';
  colon     = ':';

  // move wants variables for the above - I don't use move so that won't be a
  // problem.  The variables below are only for this example.

  spacev    : pchar = space;
  commav    : pchar = comma;
  colonv    : pchar = colon;

type
  // it accepts this!! -> excellent - this solves the problem!

  TEXT_BUFFER = array[0 ..
                      (2 * Longest) + ADDITIONAL_SPACE_NEEDED] of char;

var
  TheBuffer   : TEXT_BUFFER;
  Index       : sizeint       = 0;

var
  b : array[0..l] of char;
  i : integer;

begin
  // code below is similar/equivalent (and much simpler) to what I'm doing but,
  // it is representative.  Not included in the code below is code to ensure
  // that index doesn't go beyond high(TheBuffer).  With the definitions based
  // on the length of the character arrays, it should never happen but,
  // ensuring it doesn't, is so cheap, it's worth including in a real program.

  // initialize the buffer with spaces and null terminate it.

  FillChar(TheBuffer, sizeof(TheBuffer), space);
  TheBuffer[high(TheBuffer)] := #0;

  // now the buffer is ready - use it :-)

  move(const_2, TheBuffer, length(const_2));
  inc(index, length(const_2));

  // place a comma in there

  move(commav^, TheBuffer[Index], sizeof(comma));
  inc(index, sizeof(comma));

  // a space after a comma.  could do space_comma in one shot but
  // this example is just to show the mechanics.

  move(spacev^, TheBuffer[Index], sizeof(space));
  inc(index, sizeof(space));

  // now put the next string in the buffer

  move(const_4, TheBuffer[Index], length(const_4));
  inc(index, length(const_4));

  // null terminate it

  TheBuffer[Index] := #0;

  // write it out

  writeln(TheBuffer);


  move(a[1], b[0], l);
  writeln(a);
  writeln(l);
  writeln(b[l-1]);

  for i := low(constants) to high(constants) do writeln(constants[i]);

  readln
end.

Knowing the character array sizes will result in cleaner, simpler and safer code.   Thank you again. :)
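
One possible refinement, offered as an untested sketch rather than something from the thread: the "Longest" constant in the program above has to be updated by hand, but it could be computed with ordinary constant arithmetic. The trick relies on Ord(True) = 1 and Ord(False) = 0, and it assumes the compiler folds Ord of a constant comparison inside a const declaration. The names L1 to L4, Max12, Max34 and TLongestBuffer are made up for this illustration.

```pascal
program CompileTimeLongest;

{$mode objfpc}{$H+}

const
  const_1 = 'this is const 1 - a';
  const_2 = 'this is const 2 - ab';
  const_3 = 'this is const 3 - cba';
  const_4 = 'this is const 4 - 12';

  L1 = Length(const_1);
  L2 = Length(const_2);
  L3 = Length(const_3);
  L4 = Length(const_4);

  // Maximum of two constants: exactly one of the two Ord() factors is 1,
  // so exactly one term survives.
  Max12   = L1 * Ord(L1 >= L2) + L2 * Ord(L1 < L2);
  Max34   = L3 * Ord(L3 >= L4) + L4 * Ord(L3 < L4);
  Longest = Max12 * Ord(Max12 >= Max34) + Max34 * Ord(Max12 < Max34);

type
  // One element more than Longest, for the terminating #0.
  TLongestBuffer = array[0..Longest] of Char;

begin
  WriteLn(Longest);
  WriteLn(SizeOf(TLongestBuffer));
end.
```

With this, lengthening any of the four strings changes Longest without anyone having to remember to edit it.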
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: JdeHaan on June 22, 2021, 01:05:17 pm
I know it's an old thread, but I also needed the string length at compile time. I found the new generics with constant parameters very helpful.

Code: Pascal
program gentest;

generic function ConstString<const P,Q,S: string>: PChar;
var
  size: Integer;
begin
  Size := SizeOf(P) + SizeOf(Q) + SizeOf(S);
  writeln(Size);

  Result := P+Q+S;
end;

var
  s: PChar;
begin
  s := specialize ConstString<'Hello', ' world', ' !'>;
  writeln(s);
end.

Only 1 minor detail: The length of the string must be > 1. If I remove the space in front of the ' !', it gives a type mismatch (expected 'char').

I'm using trunk for both Laz + FPC on macOS Big Sur 11.4
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: PascalDragon on June 22, 2021, 01:38:54 pm
Please report that as a bug. That's a missing type check then as that should definitely be allowed.
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: JdeHaan on June 22, 2021, 03:09:37 pm
Done: 0039030
My first one, so hope it's clear
Title: Re: Is there some way to obtain the length of a character array at compile time ?
Post by: PascalDragon on June 23, 2021, 08:50:54 am
Looks good. Thank you.