Lazarus

Free Pascal => General => Topic started by: simsee on October 02, 2022, 11:18:31 am

Title: Curly brackets in strings
Post by: simsee on October 02, 2022, 11:18:31 am
Consider the following trivial program:

Code: Pascal
program Project1;
begin
  {
  writeln('}');
  }
  writeln;
end.

The compiler gives an error, as it is deceived by the presence of the closing brace inside the string in writeln.

Is this a bug? How should curly brackets be handled in strings? Is it mandatory to use #123 and #125?

In the documentation:

https://www.freepascal.org/docs-html/current/ref/refse8.html#x19-180001.8

there is no mention of this issue.

Shouldn't the compiler parser ignore what's inside the quotes?

Thanks.
Title: Re: Curly brackets in strings
Post by: Thaddy on October 02, 2022, 11:29:40 am
Code: Pascal
program Project1;
begin
  { <---- this is a multiline comment
  writeln('}'); // this would compile if not surrounded by the multiline comment....
  and this is the end of a multiline comment --->}
  writeln;
end.
Code: Pascal
begin
  writeln('{');
end.
Title: Re: Curly brackets in strings
Post by: MarkMLl on October 02, 2022, 11:31:01 am
Hoo boy :-(

Not strictly a bug: use (* *) for the outer comments.

MarkMLl
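
A minimal sketch of that suggestion, assuming FPC in its default mode: the original program compiles once the outer comment uses (* *), because a closing brace does not terminate that comment style. The second block is an assumption about what the #125 idea from the question would look like; it also compiles, since no literal brace appears inside the brace comment.

```pascal
program Project1;
begin
  (*
  writeln('}');
  *)
  {
  writeln(#125);
  }
  writeln;
end.
```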
Title: Re: Curly brackets in strings
Post by: Thaddy on October 02, 2022, 11:32:38 am
(* and *) are equal to { and } only older...
Title: Re: Curly brackets in strings
Post by: MarkMLl on October 02, 2022, 11:34:44 am
Quote
(* and *) are equal to { and } only older...

No, they were introduced at the same time. Wirth's original comments were /* */ but (I suspect) he realised that that style was also being used by the B language.

Somewhat later: https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/68910/eth-3059-01.pdf page 50 which shows that comments were in braces with /* */ as alternatives. I believe that at that time Wirth was working largely on CDC kit which had a fairly comprehensive character set, but swathes of the industry still relied on (rebadged) IBM 026 and 029 cardpunches for code entry.

MarkMLl
Title: Re: Curly brackets in strings
Post by: simsee on October 02, 2022, 12:53:16 pm
Thanks for the good explanations. But back to one of my initial questions: Shouldn't the compiler ignore what is in quotes?
Title: Re: Curly brackets in strings
Post by: Thaddy on October 02, 2022, 12:55:59 pm
Well, /* or divide then multiply has always been nonsense... :-X as is multiply then divide...
Title: Re: Curly brackets in strings
Post by: Thaddy on October 02, 2022, 12:58:53 pm
Quote
Thanks for the good explanations. But back to one of my initial questions: Shouldn't the compiler ignore what is in quotes?

But it does ignore it.... Always has.... It translates to a string literal and is stored as is. Except for escaped characters which are first evaluated and then stored.
Title: Re: Curly brackets in strings
Post by: MarkMLl on October 02, 2022, 01:03:25 pm
Short: No.

Longer: Any attempt to do this would mean that the lexer would have to scan the content of the comment and decide that the closing brace was inside a string. That is, broadly speaking, anathema since it implies that the content of the comment would have to be valid and correct Pascal source. I believe that the only situation in which anything inside the comment is interpreted is if the opening brace (or equivalent digraph) is followed by a $directive.

So if you want to exclude a block of code, either use a non-matching multiline comment (i.e. (* *) if your comment contains braces, or vice versa) or a {$ifdef ... $endif } pair. But even that can be fooled if there is an embedded $endif, so the "comment of last recourse" is a // at the start of every line.

MarkMLl
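
The other two options could look like the sketch below, assuming FPC in its default mode. NEVER_DEFINED is a hypothetical symbol name chosen for illustration; any symbol that is never defined would do.

```pascal
program DisableBlock;
begin
  {$ifdef NEVER_DEFINED}
  writeln('}');
  {$endif}

  // writeln('}');

  writeln('still compiles');
end.
```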
Title: Re: Curly brackets in strings
Post by: simsee on October 02, 2022, 01:04:17 pm
Thanks Thaddy.  I rephrase my question: are curly brackets, as markers for the start and end of comments, to be considered active also inside the quotes?
Title: Re: Curly brackets in strings
Post by: Thaddy on October 02, 2022, 01:09:42 pm
No. The quotes are parsed first, even if the string itself contains a curly bracket.
If it does not work, that would be a serious parser bug. (Even with nested comments)

@Mark: you are wrong, see the docs about chicken and egg: (* was first, {} is TP style:
https://www.freepascal.org/docs-html/ref/refse2.html
Title: Re: Curly brackets in strings
Post by: MarkMLl on October 02, 2022, 01:15:48 pm
Quote
@Mark: you are wrong, see the docs about chicken and egg: (* was first, {} is TP style:
https://www.freepascal.org/docs-html/ref/refse2.html

No Thaddy, YOU ARE WRONG: I cited a 1973 Wirth paper. Page 50 of the PDF, page 44 of the original.

MarkMLl
Title: Re: Curly brackets in strings
Post by: simsee on October 02, 2022, 01:16:48 pm
I'm sorry, I don't want to make you lose your patience, but from my example it would seem that the curly bracket inside the quotes is active and pairs with the previous open one, closing the multiline comment.
Title: Re: Curly brackets in strings
Post by: Thaddy on October 02, 2022, 01:32:52 pm
The code inside the start of a comment is ignored until the compiler sees a closing of the comment. The compiler does not evaluate complex comments (well, it evaluates nested comments.) So the compiler thinks: Oh start of a comment, just look for the closing.. Code inside comments is never compiled, not even evaluated.
In your case you can use proper real old school comments as already suggested: (*...*).
Title: Re: Curly brackets in strings
Post by: Bart on October 02, 2022, 01:36:21 pm
FWIW: Delphi 7 also does not compile the example code.

Bart
Title: Re: Curly brackets in strings
Post by: wp on October 02, 2022, 01:39:26 pm
Neither does Delphi XE 10.4 CE.
Title: Re: Curly brackets in strings
Post by: Thaddy on October 02, 2022, 01:39:39 pm
For the same reason.
Title: Re: Curly brackets in strings
Post by: wp on October 02, 2022, 01:48:14 pm
I think Thaddy is right. Suppose this:
Code: Pascal
{ This comment contains a curly brace }. }

Everybody would say that the comment ends at the first brace (even the highlighter of the forum software does so). If the parser does not analyse what is inside the braces, then there is no difference from this situation:
Code: Pascal
{
  WriteLn('}');
}'
Title: Re: Curly brackets in strings
Post by: simsee on October 02, 2022, 01:55:43 pm
So we can conclude that the curly brackets are always active, regardless of the surrounding context.
Title: Re: Curly brackets in strings
Post by: ASerge on October 02, 2022, 01:58:40 pm
Quote
So we can conclude that the curly brackets are always active, regardless of the surrounding context.

Yes.
Compilers scan from '{' to the nearest '}', and only then scan tokens, including quoted strings.
As a result,
Code: Pascal
{
  WriteLn('}');
}'
is then the same as
Code: Pascal
');
}'
and
Code: Pascal
{ This comment contains a curly brace }. }
is
Code: Pascal
. }
Both produce an error.
Title: Re: Curly brackets in strings
Post by: Fred vS on October 02, 2022, 02:02:02 pm
fpc doesn't like this one either:

Code: Pascal
program test;

{ I am a comment with this: { something... }

begin
end.

Result:

Quote
fred@fred-80m0 ~> fpc test.pas
Free Pascal Compiler version 3.2.2 [2021/07/09] for x86_64
Copyright (c) 1993-2021 by Florian Klaempfl and others
Target OS: Linux for x86-64
Compiling test.pas
test.pas(3,29) Warning: Comment level 2 found
test.pas(6,5) Fatal: Unexpected end of file
Fatal: Compilation aborted
Error: /usr/bin/ppcx64 returned an error exitcode


Title: Re: Curly brackets in strings
Post by: MarkMLl on October 02, 2022, 02:07:39 pm
Quote
In your case you can use proper real old school comments as already suggested: (*...*).

On that at least we agree. J&W-era Pascal had definitely moved from /* */ to (* *) as the digraph, I don't have a copy to hand but I suspect that it used that form in the code examples rather than braces. The fact that I habitually use (* *) for "long term" comments suggests that it's a habit I got into early, despite most of the systems I've used having the full ASCII character set.

I believe that braces got into ASCII circa 1965 and were definitely in the definitive ASCII-68, but the fact that Xerox PARC's Smalltalk of the late '70s and early '80s still had arrows in the character set suggests that some manufacturers stayed on an older version (replacing print drums in fast line printers was expensive). I believe that Wirth was using a CDC when he worked on Pascal, a quick trawl in Bitsavers suggests that their large-scale printers didn't have braces (but did have some of the ALGOL-style "funnies") but I've seen some CDC Pascal stuff in the past which implied that at least some of their terminals etc. had something like a 12-bit character set which included e.g. much if not all of the APL operators.

MarkMLl
Title: Re: Curly brackets in strings
Post by: dseligo on October 02, 2022, 02:16:12 pm
Quote
So we can conclude that the curly brackets are always active, regardless of the surrounding context.

Not only curly brackets. This also doesn't compile:
Code: Pascal
program Project1;
begin
  (*
  writeln('*)');
  *)
  writeln;
end.

Thaddy's explanation in post #13 is correct.
Title: Re: Curly brackets in strings
Post by: simsee on October 02, 2022, 02:33:54 pm
But this compiles:

Code: Pascal
program Project1;

begin
  writeln('ok//');
end.

So, for single-line comments the rule does not apply.
Title: Re: Curly brackets in strings
Post by: Fred vS on October 02, 2022, 02:38:01 pm
Quote
The code inside the start of a comment is ignored until the compiler sees a closing of the comment.
...

Hum, ok, but it seems that the compiler does care about other opening brackets.
See my post: https://forum.lazarus.freepascal.org/index.php/topic,60786.msg455827.html#msg455827
Title: Re: Curly brackets in strings
Post by: dseligo on October 02, 2022, 02:56:45 pm
Quote
But this compiles:

Code: Pascal
program Project1;

begin
  writeln('ok//');
end.

So, for single-line comments the rule does not apply.

It does apply, your '//' isn't inside a comment.
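
A short sketch of the distinction, assuming FPC in its default mode: a comment marker inside a string literal is just text, and a quote or // inside a comment is just text too. Whichever construct starts first decides how the rest is read.

```pascal
program Markers;
begin
  writeln('{ not a comment }');
  writeln('(* not a comment either *)');
  writeln('ok//');
  { a quote ' or a // in here is ignored }
  (* the same goes for ' and // in this style *)
end.
```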

Quote
The code inside the start of a comment is ignored until the compiler sees a closing of the comment.
<snip>
Title: Re: Curly brackets in strings
Post by: wp on October 02, 2022, 03:56:26 pm
It's always the same rule:
- Find a comment beginner '(*' --> search until the next '*)' is found, ignore everything else in between.
- Find a comment beginner '{' --> search until the next '}' is found, ignore everything else.
- Find a quote indicating that a string begins --> search until the next quote character is found (unless duplicated) and ignore everything else.
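
The three rules above can be sketched as a toy scanner. This is an illustration only, not FPC's actual scanner: StripComments is a hypothetical helper, and nested comments, // comments and directives are deliberately left out.

```pascal
program ScanDemo;
{$mode objfpc}{$H+}

// Hypothetical helper: returns S with comments dropped, applying the
// three rules in the order in which the markers are met.
function StripComments(const S: string): string;
var
  i, n: Integer;
begin
  Result := '';
  i := 1;
  n := Length(S);
  while i <= n do
  begin
    if S[i] = '{' then
    begin
      // Brace comment: skip to the next closing brace, whatever lies between
      while (i <= n) and (S[i] <> '}') do
        Inc(i);
      Inc(i); // step past the closing brace
    end
    else if (S[i] = '(') and (i < n) and (S[i + 1] = '*') then
    begin
      // Old-style comment: skip to the next asterisk + parenthesis pair
      Inc(i, 2);
      while (i < n) and not ((S[i] = '*') and (S[i + 1] = ')')) do
        Inc(i);
      Inc(i, 2); // step past the closing pair
    end
    else if S[i] = '''' then
    begin
      // String literal: copy everything up to and including the next quote
      Result := Result + S[i];
      Inc(i);
      while (i <= n) and (S[i] <> '''') do
      begin
        Result := Result + S[i];
        Inc(i);
      end;
      if i <= n then
      begin
        Result := Result + S[i];
        Inc(i);
      end;
    end
    else
    begin
      // Anything else is ordinary source text
      Result := Result + S[i];
      Inc(i);
    end;
  end;
end;

begin
  // Comment opened first: the brace inside the quotes ends it. Prints: ');}
  writeln(StripComments('{writeln(''}'');}'));
  // String opened first: the brace is plain text. Prints: writeln('}');
  writeln(StripComments('writeln(''}'');'));
end.
```

The first call leaves the same kind of unbalanced remainder that the compiler then rejects in the original example.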
Title: Re: Curly brackets in strings
Post by: Arioch on October 02, 2022, 04:10:57 pm
wp - no, not always.  In some dialects, more theoretical than practical, like ANSI Pascal, the following are correct and complete comments:

(*  bla-bla-vla }
{ bla-vla-vla *)


Actually, TP (if not UCSD) violated the standard and hardware history when making them incompatible. But in doing so they enabled ersatz nested comments, so they did good

On the contrary.  (. 1..2 ] and [ 3 .. 4 .) are still valid sets/ranges.  Inconsistent, but practical :-)
Title: Re: Curly brackets in strings
Post by: MarkMLl on October 02, 2022, 04:48:45 pm
Quote
wp - no, not always.  In some dialects, more theoretical than practical, like ANSI Pascal, the following are correct and complete comments:

(*  bla-bla-vla }
{ bla-vla-vla *)


Actually, TP (if not UCSD) violated the standard and hardware history when making them incompatible. But in doing so they enabled ersatz nested comments, so they did good

On the contrary.  (. 1..2 ] and [ 3 .. 4 .) are still valid sets/ranges.  Inconsistent, but practical :-)

Yes, depending on whether they are processed as digraphs at the lexer level or as separate elements.

I'd expect comments to always be discarded by the lexer, and that includes /everything/ between an opening and closing marker. I'd expect the (. and .) digraphs to be converted to single characters and then passed to the parser with no further attempt at interpretation.

Part of the issue with matching comments is because in FPC a comment marker closes a macro, and a macro definition, which could itself potentially include a (non-matching) comment.

MarkMLl
Title: Re: Curly brackets in strings
Post by: Arioch on October 02, 2022, 05:24:27 pm
My *blind guess* is that originally, in TP's descending parser FSM, it was just simpler to copy-paste the {/} and (*/*) pairs verbatim than to make special cases of "one closing another"

you meet '{' - search for the next '}'
you meet '(*' - search for the next '*)'

it is simple and largely error-safe, just calling one single search function, Pos :-)

REPNE SCASB is all you need, well, almost

a multivariant searcher is much more complex, and thus slow and error prone.

This, by accident, enabled nested comments.  "And programmers saw it, and they saw it was good" and demanded it to forever be so :-)
Title: Re: Curly brackets in strings
Post by: PascalDragon on October 03, 2022, 02:38:02 pm
Quote
fpc doesn't like this one either:

Code: Pascal
program test;

{ I am a comment with this: { something... }

begin
end.

Result:

Quote
fred@fred-80m0 ~> fpc test.pas
Free Pascal Compiler version 3.2.2 [2021/07/09] for x86_64
Copyright (c) 1993-2021 by Florian Klaempfl and others
Target OS: Linux for x86-64
Compiling test.pas
test.pas(3,29) Warning: Comment level 2 found
test.pas(6,5) Fatal: Unexpected end of file
Fatal: Compilation aborted
Error: /usr/bin/ppcx64 returned an error exitcode

That depends upon the mode. In modes FPC and ObjFPC FPC allows nested comments, so you can have { foo { bar } blubb }, while in modes TP and Delphi you need to alternate the comment style for this to compile: { foo (* bar *) blubb }. This is why your code fails: the compiler detects a second-level comment and expects as many closing brackets to finish off the comment(s).
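
A small sketch of both forms, assuming mode ObjFPC. The first comment relies on nesting (FPC 3.2.2 flags the inner brace with the "Comment level 2 found" warning shown in the output above); the second alternates styles, which is the form described for modes TP and Delphi.

```pascal
program NestedComments;
{$mode objfpc}

{ outer { inner } still inside the outer comment }

{ outer (* inner *) still inside the outer comment }

begin
  writeln('compiled');
end.
```

The failing example differs only in that it opens two brace comments and closes one, so the scanner reaches the end of the file while still inside the outer comment.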
Title: Re: Curly brackets in strings
Post by: MarkMLl on October 03, 2022, 03:00:01 pm
Quote
That depends upon the mode. In modes FPC and ObjFPC FPC allows nested comments, so you can have { foo { bar } blubb }, while in modes TP and Delphi you need to alternate the comment style for this to compile: { foo (* bar *) blubb }. This is why your code fails: the compiler detects a second-level comment and expects as many closing brackets to finish off the comment(s).

When you say "alternate" do you literally mean /alternate/ or is there still a two-deep limit?

MarkMLl
Title: Re: Curly brackets in strings
Post by: PascalDragon on October 05, 2022, 11:06:35 pm
Quote
That depends upon the mode. In modes FPC and ObjFPC FPC allows nested comments, so you can have { foo { bar } blubb }, while in modes TP and Delphi you need to alternate the comment style for this to compile: { foo (* bar *) blubb }. This is why your code fails: the compiler detects a second-level comment and expects as many closing brackets to finish off the comment(s).

When you say "alternate" do you literally mean /alternate/ or is there still a two-deep limit?

For both Delphi and modes Delphi and TP only two levels are supported. With the full blown nested comments of the other modes you can have arbitrary levels (well, I think there is some limit in the compiler).
Title: Re: Curly brackets in strings
Post by: MarkMLl on October 06, 2022, 09:29:12 am
Quote
For both Delphi and modes Delphi and TP only two levels are supported. With the full blown nested comments of the other modes you can have arbitrary levels (well, I think there is some limit in the compiler).

Thanks for that. I'd remark that I've always found coding for Wirth's later style (only (* *) comments) a right pain in the neck, particularly if they could contain pragmata. It's... amusing that Modula-2 repurposed braces but mandated that comments be nestable, while allowing both single- and double-quotes on strings.

ALGOL-60 had multi-line comments at the parser level (everything between COMMENT and ; ignored) and possibly at the lexer level (single-line starting with % ). I'm not sure but I think that Ada (hence VHDL etc.) only has single-line comments.

MarkMLl