Lazarus

Free Pascal => General => Topic started by: fpccn_ljz on June 19, 2021, 03:47:00 am

Title: It has a little bug in IDE
Post by: fpccn_ljz on June 19, 2021, 03:47:00 am
I found a little bug in the IDE:

if the last line is missing its ';', the IDE can't find it.
Title: Re: It has a little bug in IDE
Post by: speter on June 19, 2021, 03:54:27 am
That is not a bug.

Pascal has allowed that syntax since the 1980's.

cheers
S.
Title: Re: It has a little bug in IDE
Post by: lucamar on June 19, 2021, 05:32:25 am
In fact, that extra ";" we all put there really introduces a new empty statement ;)

And we might thank the gods it's "allowed", otherwise the little "else" problem* most of us have seen sometime or other would reach monster proportions :D


* I mean the "problem" that there can't be a ";" before the else in an if...else ... which, while no problem at all, can catch you unaware sometimes.
Title: Re: It has a little bug in IDE
Post by: 440bx on June 19, 2021, 07:36:33 am
I found a little bug in the IDE:

if the last line is missing its ';', the IDE can't find it.
The semicolon in Pascal is a statement separator and "end" is not a statement, therefore in strict Pascal, a semicolon should not precede the "end" keyword.

Many implementations of Pascal, including FPC and Delphi, allow "extra" semicolons because treating the ";" strictly as a separator (which is what it is in Pascal) proved confusing for many programmers.

Just for fun, declare a record (anything you want, doesn't matter) and don't put a semicolon at the end of the last field (which should be followed by the record terminator "end" keyword) and you'll notice that the semicolon isn't required there either.
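
A minimal sketch of the experiment suggested above, for anyone who wants to try it. The program name, the record TSample and its fields are made up for illustration; it is written for FPC in objfpc mode and leaves out the ";" both after the last record field and after the last statement:

```pascal
program SeparatorDemo;

{$mode objfpc}

type
  { Hypothetical record, for illustration only: no ";" after the last field. }
  TSample = record
    X: Integer;
    Y: Integer
  end;

var
  P: TSample;

begin
  P.X := 3;
  P.Y := 4;
  { No ";" after the last statement: ";" separates statements,
    and "end" is not a statement. }
  WriteLn(P.X + P.Y)
end.
```

It prints 7. Adding the "missing" semicolons back should compile just as well, since each one is then simply followed by nothing (or by an empty statement).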
Title: Re: It has a little bug in IDE
Post by: lucamar on June 19, 2021, 08:03:23 am
The semicolon in Pascal is a statement separator and "end" is not a statement, therefore in strict Pascal, a semicolon should not precede the "end" keyword.

As demonstrated by the fact that it's not allowed before an "else" even after an "end", as in:
Code: Pascal
if ... then
  begin
    {...}
  end {<- no ";" here!}
else
    {...};

The syntax of "if ... else" is much more strict (and demonstrative) in this sense.
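
For reference, a compilable version of the same shape might look like the sketch below. The variable N and the messages are assumptions made up for the example; the point is only where the ";" may and may not go:

```pascal
program ElseDemo;

{$mode objfpc}

var
  N: Integer;  { hypothetical variable, for illustration only }

begin
  N := 5;
  if N > 3 then
    begin
      WriteLn('big')
    end   { no ";" here: it would end the whole if statement }
  else
    WriteLn('small');

  { An "if" without "else" is a complete statement, so ";" may follow it. }
  if N > 3 then
    WriteLn('still big');
end.
```

It prints "big" and then "still big". Putting a ";" after the first "end" makes the compiler reject the "else", because the if statement has already ended at that point.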
Title: Re: It has a little bug in IDE
Post by: 440bx on June 19, 2021, 08:20:11 am
The syntax of "if ... else" is much more strict (and demonstrative) in this sense.
Yes, it is, because neither the "end" nor the "else" keyword is a statement, so a semicolon between them won't be welcome. :)
Title: Re: It has a little bug in IDE
Post by: alpine on June 19, 2021, 11:34:42 am
The syntax of "if ... else" is much more strict (and demonstrative) in this sense.
Yes, it is, because neither the "end" nor the "else" keyword is a statement, so a semicolon between them won't be welcome. :)
In fact, "end" is a statement, or at least, it is the end of a compound statement.

Let me humbly explain.
The "statement" is a non-terminal in the formal grammar of Pascal. The word "end" is a terminal used at the end of another non-terminal named "compound statement".
Code:
<compound statement> ::= begin <statement> {; <statement> } end

The if statement is defined as:
Code:
<if statement> ::= if <expression> then <statement> | if <expression> then <statement> else <statement>

There is no semicolon before the "else" terminal.

With the grammar formally defined in that way, you can put semicolons between two "end"s, and each semicolon simply introduces an empty statement.
Code: Pascal
if Cond then begin
  begin
  end ; {empty statement}
  ; {empty statement}
  ;;;;;;;;;;;;;
end

And this is not an error or feature, this is by design.
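
The fragment above can be wrapped into a complete program to check it. This is only a sketch: the program name, the variable Cond and the message are assumptions added for illustration:

```pascal
program EmptyStatements;

{$mode objfpc}

var
  Cond: Boolean;  { hypothetical flag, for illustration only }

begin
  Cond := True;
  if Cond then begin
    begin
    end;      { the ";" is followed by an empty statement }
    ;         { another empty statement }
    ;;;       { and three more }
    WriteLn('reached')
  end
end.
```

It prints "reached": every extra ";" just separates one more empty statement from the next.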

Requiring a semicolon between the statement and the "else" keyword (as it is in our beloved C) would put the Pascal grammar in another class - the class that cannot be parsed without backtracking.

Edit: but actually the OP was about the IDE, not about the Pascal syntax.
Title: Re: It has a little bug in IDE
Post by: MarkMLl on June 19, 2021, 12:23:43 pm
That is not a bug.

Pascal has allowed that syntax since the 1980's.

cheers
S.

Others have remarked on this but I want to emphasise a point: Pascal (from the early 1970s), and ALGOL before it (from the late 1950s), were *DEFINED* as using ; as a separator, not as a terminator. Most compilers *PERMIT* a *SUPERFLUOUS* ; where it doesn't introduce problems.

MarkMLl
Title: Re: It has a little bug in IDE
Post by: 440bx on June 19, 2021, 04:16:17 pm
In fact, "end" is a statement, or at least, it is the end of a compound statement.
As you pointed out, it is a keyword in the pair that defines a compound statement but, "end" itself is not a statement.

Let me humbly explain.
Your explanation is good but misleading because "end" is not a statement.  When the compiler looks for a statement (because it expects one), it looks for keywords such as "for", "while", "repeat", "begin", etc but, there is definitely no "end" in the list of possible statements.

The word "end" is a terminal used at the end of another non-terminal named "compound statement".
Code:
<compound statement> ::= begin <statement> {; <statement> } end
What you stated there is absolutely correct, and note that the definition of <statement> does not include "end".  The <begin><end> pair are compound statement (a statement list if you prefer) delimiters.  The pair, _together_, is a statement, a compound statement to be more precise, that starts with the keyword "begin" (which without a matching "end" isn't a statement either).  "end" by itself is _not_ a statement, it doesn't even start a statement.

Requiring a semicolon between the statement and the "else" keyword (as it is in our beloved C) would put the Pascal grammar in another class - the class that cannot be parsed without backtracking.
It would make the semicolon a statement terminator instead of separator.  I don't think that would make backtracking necessary (at least I don't see a case where the presence of a semicolon would make it necessary at this time.)
Title: Re: It has a little bug in IDE
Post by: alpine on June 19, 2021, 05:58:57 pm
In fact, "end" is a statement, or at least, it is the end of a compound statement.
As you pointed out, it is a keyword in the pair that defines a compound statement but, "end" itself is not a statement.
That was bad wording on my part: "end" is a terminal symbol in the sense of the BNF notation. Since we are discussing the "if" statement, the non-terminal symbol preceding "else" can be a "statement" (by the Pascal definition rules). The only "statement" non-terminal which ends on an "end" terminal symbol is a "compound statement". In other words, that particular "end" ends a compound statement.

Let me humbly explain.
Your explanation is good but misleading because "end" is not a statement.  When the compiler looks for a statement (because it expects one), it looks for keywords such as "for", "while", "repeat", "begin", etc but, there is definitely no "end" in the list of possible statements.
The compiler doesn't look for a statement. It takes the next lexeme and tries to follow its current production rule.

The word "end" is a terminal used at the end of another non-terminal named "compound statement".
Code:
<compound statement> ::= begin <statement> {; <statement> } end
What you stated there is absolutely correct, and note that the definition of <statement> does not include "end".  The <begin><end> pair are compound statement (a statement list if you prefer) delimiters.  The pair, _together_, is a statement, a compound statement to be more precise, that starts with the keyword "begin" (which without a matching "end" isn't a statement either).  "end" by itself is _not_ a statement, it doesn't even start a statement.
On the contrary: the definition of statement includes the "end". The BNF rules are too many to include here, but the "statement" non-terminal definition is a recursive one, like:
Code:
<statement> ::= <simple statement> | <if statement> | ... | <compound statement>

Note that <statement> includes <compound statement> and <compound statement> in turn includes <statement>.

Requiring a semicolon between the statement and the "else" keyword (as it is in our beloved C) would put the Pascal grammar in another class - the class that cannot be parsed without backtracking.
It would make the semicolon a statement terminator instead of separator.  I don't think that would make backtracking necessary (at least I don't see a case where the presence of a semicolon would make it necessary at this time.)
The separator is a term of the lexer. (What is a terminator?) The separators are whitespace and comments, they separate the lexemes. The semicolon is a terminal symbol in the BNF rules.

Putting a semicolon there will change the grammar from LL(1) to LL(k), k>1. AFAIK, the FPC compiler uses a recursive descent parser, which is the main reason for its speed. Such a parser will need to backtrack if it can't decide which production rule to follow from just one lexeme (k>1).
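
To make the recursive descent idea concrete, here is a toy sketch for the standard Pascal rules quoted earlier in the thread, with ";" as a separator. It is not FPC's actual parser: the token list, the single-letter stand-ins 'c' (any condition) and 's' (any simple statement), and all procedure names are assumptions made up for the example. Each procedure mirrors one BNF rule and looks only at the next token to choose what to do:

```pascal
program TinyDescent;

{$mode objfpc}{$H+}

{ Toy grammar (a simplification, not FPC's real grammar):
    statement ::= 'begin' statement { ';' statement } 'end'
                | 'if' 'c' 'then' statement [ 'else' statement ]
                | 's'
                | (empty) }

const
  { Hypothetical, pre-tokenised input. }
  Tokens: array[0..10] of string =
    ('begin', 'if', 'c', 'then', 'begin', 's', 'end', 'else', 's', ';', 'end');

var
  Cur: Integer = 0;  { index of the next unread token }

function Peek: string;
begin
  if Cur <= High(Tokens) then
    Result := Tokens[Cur]
  else
    Result := '<eof>'
end;

procedure Expect(const T: string);
begin
  if Peek <> T then
  begin
    WriteLn('syntax error: expected ', T, ' but found ', Peek);
    Halt(1)
  end;
  Inc(Cur)
end;

procedure ParseStatement; forward;

procedure ParseCompound;
begin
  Expect('begin');
  ParseStatement;
  while Peek = ';' do      { ";" announces that another statement follows }
  begin
    Expect(';');
    ParseStatement
  end;
  Expect('end')
end;

procedure ParseIf;
begin
  Expect('if');
  Expect('c');
  Expect('then');
  ParseStatement;
  if Peek = 'else' then    { one token of lookahead picks the production }
  begin
    Expect('else');
    ParseStatement
  end
end;

procedure ParseStatement;
begin
  if Peek = 'begin' then
    ParseCompound
  else if Peek = 'if' then
    ParseIf
  else if Peek = 's' then
    Expect('s')
  { anything else: the empty statement, which consumes no token }
end;

begin
  ParseStatement;
  if Peek = '<eof>' then
    WriteLn('parsed OK')
  else
    WriteLn('syntax error: unexpected ', Peek)
end.
```

With the token list above it prints "parsed OK"; the ";" before the final "end" is accepted because it is followed by an empty statement. Inserting a ';' token before 'else' makes the sketch report a syntax error, just as the compiler does.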
Title: Re: It has a little bug in IDE
Post by: 440bx on June 19, 2021, 06:29:24 pm
The only "statement" non-terminal which ends on an "end" terminal symbol is a "compound statement". In other words, that particular "end" ends a compound statement.
yes but, "end" does _not_ start a statement.

The compiler doesn't look for a statement. It takes the next lexeme and tries to follow its current production rule.
The compiler had better look for a statement; that's what tells the compiler the next production to execute.  When the compiler sees a "begin" it looks for a statement, which must start with one of the symbols that can start a statement, and "end" is _not_ one of them.

On the contrary: the definition of statement includes the "end". The BNF rules are too many to include here, but the "statement" non-terminal definition is a recursive one, like:
Code:
<statement> ::= <simple statement> | <if statement> | ... | <compound statement>

Note that <statement> includes <compound statement> and <compound statement> in turn includes <statement>.
Notice that in that BNF rule, there is no statement that starts with "end".  "end" is part of a compound statement but, it is _not_ a token that starts a statement.  It is neither found in <simple statement> nor in the start of <compound statement> or any other statement for that matter.  "end" is not a statement, it is a statement list delimiter.

The separator is a term of the lexer. (What is a terminator?) The separators are whitespace and comments, they separate the lexemes. The semicolon is a terminal symbol in the BNF rules.
The semicolon is a terminal symbol but, _how_ it's used is what makes it a separator instead of a terminator.  If the semicolon were a _terminator_, the <statement> production would include the semicolon in it. Instead, in Pascal it appears in a production such as { statement } { ";" statement }; if it were a terminator, a statement list would be specified as { statement ";" }.  That is what makes the semicolon a terminator (as in C).

Putting a semicolon there will change the grammar from LL(1) to LL(k), k>1.
I cannot think of an instance where that would be the case.  There is no reason for the compiler to have to backtrack because a semicolon terminates statements instead of separating them.

AFAIK, the FPC compiler uses a recursive descent parser, which is the main reason for its speed. Such a parser will need to backtrack if it can't decide which production rule to follow from just one lexeme (k>1).
FPC is a recursive descent compiler but, changing the semicolon from separator to terminator wouldn't affect that.  It would affect the grammar but, it wouldn't create a situation where the compiler would have to backtrack (at least not in Pascal.)
Title: Re: It has a little bug in IDE
Post by: alpine on June 19, 2021, 09:38:24 pm
yes but, "end" does _not_ start a statement.
Have I ever said the opposite?

The compiler had better look for a statement; that's what tells the compiler the next production to execute.  When the compiler sees a "begin" it looks for a statement, which must start with one of the symbols that can start a statement, and "end" is _not_ one of them.
A recursive descent parser is usually implemented as recursive procedures which closely resemble the BNF non-terminal rules. Those procedures use the next terminal symbol seen to decide which rule to follow (which procedure to call).  And BTW, the "end" is perfectly valid to follow "begin".

On the contrary: the definition of statement includes the "end". The BNF rules are too many to include here, but the "statement" non-terminal definition is a recursive one, like:
Code:
<statement> ::= <simple statement> | <if statement> | ... | <compound statement>

Note that <statement> includes <compound statement> and <compound statement> in turn includes <statement>.
Notice that in that BNF rule, there is no statement that starts with "end".  "end" is part of a compound statement but, it is _not_ a token that starts a statement.  It is neither found in <simple statement> nor in the start of <compound statement> or any other statement for that matter.
Again, did I ever say the opposite?

  "end" is not a statement, it is a statement list delimiter.
No. "end" is definitely not a delimiter. It is a terminal symbol which stands at the end of the <compound statement> rule (and some others too).

The separator is a term of the lexer. (What is a terminator?) The separators are whitespace and comments, they separate the lexemes. The semicolon is a terminal symbol in the BNF rules.
The semicolon is a terminal symbol but, _how_ it's used is what makes it a separator instead of a terminator.
I see. You named it (as separator, terminator) according to when it appears to be between or at the end of the "list". I would name it "promoter" instead, because if you look closer at the <compound statement> rule, you'll see that this is the token that tells the parser that another statement will follow, even if it is the empty statement. Don't be misled by the fact that it is usually the last symbol on the line.

If the semicolon were a _terminator_, the <statement> production would include the semicolon in it. Instead, in Pascal it appears in a production such as { statement } { ";" statement }; if it were a terminator, a statement list would be specified as { statement ";" }.  That is what makes the semicolon a terminator (as in C).
This definition { statement ";" } will introduce ambiguity which changes the grammar to LL(k). This is why you can't parse the C grammar with a (naive) recursive descent parser.

I cannot think of an instance where that would be the case.  There is no reason for the compiler to have to backtrack because a semicolon terminates statements instead of separating them.
Consider the following simple example: the if statement is being parsed, after the "then" token the if_rule procedure has called the statement_rule procedure.  The statement_rule procedure returns consuming the semicolon. Now what to expect? Another semicolon because "if" is also a statement (and must be terminated)? "else" because there is an else clause? Or some other statement?  You can't tell for sure whether the current if_rule is completed or not. You must make a decision and possibly backtrack if it was wrong.
 
Title: Re: It has a little bug in IDE
Post by: MarkMLl on June 19, 2021, 10:32:36 pm
I'm not sure whether the "terminal" and "non-terminal" concepts had been formalised when Wirth "migrated" (hurriedly, according to my research) from ALGOL-W to Pascal, and this might contribute to some of the sloppiness highlighted by the current discussion. In any event, Wirth's early compilers were table-driven recursive /ascent/, and I think we have to stick to the BNF and "railroad diagrams" of the era rather than trying to impute deeper structure.

Apart from that I'd observe that this discussion is substantially more "nerdish" than helps the OP.

MarkMLl
Title: Re: It has a little bug in IDE
Post by: alpine on June 19, 2021, 10:53:34 pm

Apart from that I'd observe that this discussion is substantially more "nerdish" than helps the OP.

Good point. It is written in the subject: it is about the IDE.
Maybe it must be discussed elsewhere.

Thanks,
Title: Re: It has a little bug in IDE
Post by: MarkMLl on June 19, 2021, 11:12:27 pm
Good point. It is written in the subject: it is about the IDE.
Maybe it must be discussed elsewhere.

Although in fairness, OP is ascribing behaviour to the IDE which is really down to the syntax of the language and the strictness of the compiler.

Hopefully he's got the message without being discouraged :-)

I'd throw in a couple of related points before this discussion completely winds down. First, at least some of the ALGOL implementations describe anything after END as being an implicit comment, with minimal consideration of the status of ; in that context. Second, the END. at the end of an ALGOL (hence Pascal) program was a non-standard "hack" from Burroughs that allowed the compiler to decide that it had seen enough without reading the next card which would potentially have screwed up the remainder of the batch job (and their compilers insist that nothing follow it).

I've spent a bit of time over the last few weeks looking at Verilog and VHDL, and it's an interesting exercise speculating which bits are unarguably ALGOL, which bits are unarguably Ada, and which bits are probably Ada but the sub-sub-sub-committee had their own idea about the END behaviour and didn't agree with the guys in the next office :-)

MarkMLl
Title: Re: It has a little bug in IDE
Post by: MarkMLl on June 19, 2021, 11:43:21 pm
Consider the following simple example: the if statement is being parsed, after the "then" token the if_rule procedure has called the statement_rule procedure.  The statement_rule procedure returns consuming the semicolon. Now what to expect? Another semicolon because "if" is also a statement (and must be terminated)? "else" because there is an else clause? Or some other statement?  You can't tell for sure whether the current if_rule is completed or not. You must make a decision and possibly backtrack if it was wrong.


ifSt = 'IF' expression 'THEN' .OUT('DUP' '0') .OUT('BRF' *1)
        statementSeq .LABEL *1
        ifElsePart 'END';

ifElsePart = ('ELSE' .OUT('BRT' *1) statementSeq .LABEL *1) |
        (.EMPTY .OUT('STO' '__undefined__'));

(* If statement. Note that care is required to make sure that the   *)
(* stack is consistent whether or not there is an else part.      *)


That's Meta-2 notation which you're probably unfamiliar with (Schorre, 1964), but basically "it just works" if you look for "else..." | nothing. My company's running on that code.

MarkMLl
Title: Re: It has a little bug in IDE
Post by: 440bx on June 20, 2021, 12:02:34 am
And BTW, the "end" is perfectly valid to follow "begin".
Yes, it is and the reason has nothing to do with semicolons, it has everything to do with the fact that Pascal accepts whitespace (or a comment) as a null statement (incidentally, just as C does but, in C if whitespace is used as a null statement then the null statement must be terminated just like any other statement)

No. "end" is definitely not a delimiter. It is a terminal symbol which stands at the end of the <compound statement> rule (and some others too).
It's a "delimiter" in the sense that it marks the end of a compound statement (unfortunately, among other things.) 

I see. You named it (as separator, terminator) according to when it appears to be between or at the end of the "list".
This isn't just a choice I made.  C uses the semicolon as a statement terminator whereas Pascal uses it as a statement separator.  In strict Pascal (the original McCoy) the last statement in a compound statement cannot be followed by a semicolon because if the compiler sees a semicolon it expects a keyword (or identifier) that starts another statement.  That is not the case in C.  In C, the last statement in a compound statement {...} must be ended by a semicolon just like any other statement because it explicitly marks the end of the statement.

This definition { statement ";" } will introduce ambiguity which changes the grammar to LL(k). This is why you can't parse the C grammar with a (naive) recursive descent parser.
It wouldn't.  It definitely would not cause the grammar to change to LL(k).  For the record, Pascal cannot be parsed with a "naive" recursive descent parser either (at least not strictly based on its grammar.)

Consider the following simple example: the if statement is being parsed, after the "then" token the if_rule procedure has called the statement_rule procedure.
In that case the compiler expects a _simple_ statement.

The statement_rule procedure returns consuming the semicolon. Now what to expect?
It must expect, either another statement (that is a new statement not part of the "if") or an "end" keyword that terminates a compound statement (or a procedure/function block.)

Another semicolon because "if" is also a statement (and must be terminated)? "else" because there is an else clause? Or some other statement?  You can't tell for sure whether the current if_rule is completed or not. You must make a decision and possibly backtrack if it was wrong.
No, there is no need for multiple semicolons.  If multiple semicolons appeared, they would either be separators between null statements (as in Pascal) or null statement terminators (as in C.)

Title: Re: It has a little bug in IDE
Post by: alpine on June 20, 2021, 02:53:55 am
@440bx
Well, let's argue whether to put a semicolon in C as follows {{};};
I'm bailing out of this...

Good point. It is written in the subject: it is about the IDE.
Maybe it must be discussed elsewhere.

I'd throw in a couple of related points before this discussion completely winds down. First, at least some of the ALGOL implementations describe anything after END as being an implicit comment, with minimal consideration of the status of ; in that context. Second, the END. at the end of an ALGOL (hence Pascal) program was a non-standard "hack" from Burroughs that allowed the compiler to decide that it had seen enough without reading the next card which would potentially have screwed up the remainder of the batch job (and their compilers insist that nothing follow it).

I've spent a bit of time over the last few weeks looking at Verilog and VHDL, and it's an interesting exercise speculating which bits are unarguably ALGOL, which bits are unarguably Ada, and which bits are probably Ada but the sub-sub-sub-committee had their own idea about the END behaviour and didn't agree with the guys in the next office :-)

Mark, thanks for sharing this. I'm personally very curious about such first-person views on the development of computers (and I say this with the utmost respect). At the time computers arrived in my country, or I came to computers, there was a gap of about 10 years relative to the IT-developed (Western) countries. That was the time when engineers reverse engineered the IBM/360, VAX PDP-8, VME, etc. The clones were declared genuine; the others were not mentioned at all.
So, I'm now truly interested in that development, which is unknown to me. Languages such as ALGOL, MPL, even COBOL are lost to me; I skipped them. But, as it turns out, every one of these has implications for what we do now.

As I previously said, maybe it is a subject for a separate discussion.
Title: Re: It has a little bug in IDE
Post by: waltfair on May 01, 2023, 03:56:45 pm
My first languages were Fortran and IBM 1620 assembly; then in grad school I learned and used Algol, so I could understand and implement algorithms from the ACM literature.
When I finally got a PC and found TurboPascal, the switch from Algol was painless.
After I left industry to start my own consulting business, I initially used TurboPascal and Delphi for my commercial software.

Later I used C/C++ and Prolog.

Now most of my development work is done in Lazarus or C#.

My PhD dissertation code was done in C#.