Author Topic: Syntax curiosity  (Read 8433 times)

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Syntax curiosity
« on: September 12, 2019, 01:04:06 pm »
I actually miss a general discussion section on this forum, so I will put this topic here (don't accuse me of putting it in the wrong place).

Ever since I started programming with Pascal I wondered about the IF construct, specifically the absence of a combined ELSE IF keyword for elseif blocks. There are languages that use one keyword for it (Visual Basic, Python, PHP), usually something like ELSEIF, ELSIF or ELIF.

When Wirth designed the language he probably figured that backtracking or two-pass techniques, which most compilers use today, would not be an option. However, there is a way to combine the ELSE IF keywords (which would prevent the inevitable nesting of IF statements) without backtracking or doing multiple passes.

As soon as the parser finds the ELSE keyword, a fixed branch-out label can be set. It doesn't matter how many ELSE IFs follow: each one either jumps to the next ELSE IF (if its condition is false) or, after executing its block, branches out to the fixed label. The only price to pay is a redundant label if the final ELSE IF block is not followed by an ELSE block.

To illustrate this design, here is a code snippet (not programmed in Pascal):
Code: FreeBasic
lex.getoken()

ep_evalbool()                ' condition of the leading IF

l1 = newlabel()              ' L0
emitln("JE " + l1)           ' condition false: skip the block

st_block(el)
expectoken(tkKeyword.nEnd)

ln = l1                      ' L0

if (lex.cutoken.id = tkKeyword.nElse) then

  l1 = newlabel()            ' L1 (fixed)

  do
    emitln("JMP " + l1)      ' previous block done: branch out
    postlabel(ln)            ' L0, L2, L3, ...

    lex.getoken()

    if lex.cutoken.id = tkKeyword.nIf then
      lex.getoken()

      ep_evalbool()

      ln = newlabel()
      emitln("JE " + ln)

      st_block(el)
      expectoken(tkKeyword.nEnd)

      if lex.cutoken.id <> tkKeyword.nElse then
        postlabel(ln)        ' redundant label: no final ELSE block
        exit do
      end if

    elseif lex.cutoken.id = tkKeyword.nDo then
      st_block(el)           ' final ELSE block
      expectoken(tkKeyword.nEnd)
      exit do
    end if
  loop

end if

expectoken(tkSymbol.nSemicolon)

postlabel(l1)                ' L0 without ELSE, otherwise the fixed L1
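
For readers who want something runnable, here is a minimal Python sketch of the same single-pass scheme. It is not the SharpBASIC code: the function name `compile_if_chain`, the pre-split token list, and the `<eval …>` / `<stmt …>` placeholders are all assumptions made for illustration, and conditions and statements are reduced to single tokens.

```python
from itertools import count


def compile_if_chain(tokens):
    """One pass over: if C do S end {else if C do S end} [else do S end] ;"""
    out = []
    pos = 0
    labels = count()

    def new_label():
        return f"L{next(labels)}"

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take(expected=None):
        nonlocal pos
        tok = peek()
        if expected is not None and tok != expected:
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        pos += 1
        return tok

    def guarded_block(false_label):
        out.append(f"<eval {take()}>")      # placeholder for the condition code
        out.append(f"JE {false_label}")     # condition false: skip the block
        take("do")
        out.append(f"<stmt {take()}>")      # placeholder for the block body
        take("end")

    take("if")
    next_label = new_label()                # L0
    guarded_block(next_label)
    exit_label = next_label                 # without ELSE, L0 is also the exit

    if peek() == "else":
        exit_label = new_label()            # L1, fixed for the whole chain
        while True:
            out.append(f"JMP {exit_label}")  # previous block done: branch out
            out.append(f"{next_label}:")
            take("else")
            if peek() == "if":
                take("if")
                next_label = new_label()
                guarded_block(next_label)
                if peek() != "else":
                    out.append(f"{next_label}:")  # the redundant label
                    break
            else:
                take("do")
                out.append(f"<stmt {take()}>")    # final ELSE block
                take("end")
                break

    take(";")
    out.append(f"{exit_label}:")
    return out


if __name__ == "__main__":
    source = "if a do s0 end else if b do s1 end else do s2 end ;"
    print("\n".join(compile_if_chain(source.split())))
```

The exit label is allocated the moment the first ELSE is seen, so every later block can jump to it without the compiler knowing how many ELSE IF blocks will follow.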

While slightly more complex than Pascal's IF .. THEN .. ELSE, it would still make for one of the fastest compilers today.
« Last Edit: September 15, 2019, 10:58:53 am by Munair »
keep it simple

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: Syntax curiosity
« Reply #1 on: September 12, 2019, 01:38:44 pm »
Quote
Ever since I started programming with Pascal I wondered about the IF construct, specifically the two-keywords ELSE IF construct for elseif blocks. Most languages use one keyword for it, usually something like ELSEIF or ELSIF.

Only Basic, I guess. It is not a natural combination of two keywords that are already in use. Probably they needed to fix something, but I'm not so deep into Basic details. That said, later compilable Basics are not original BASIC, so it might not even be part of the original Basic syntax. Afaik C=64 Basic V2 only had if and goto.

Quote
The curious Pascal construct probably goes back to the late 60s / early 70s, when the language was first designed.

So why did Wirth choose this specific language design? When studying/developing a compiler one will see the benefit (and beauty) of it, especially when considering the available hardware at the time. In order to make a compiler fast and (relatively) simple, it should be top-down and do a single pass. But with ELSEIF blocks there is no way to know how many blocks there are, which means there is also no way to know what label to branch out to, unless the compiler does a second pass.

The whole block structure is a child of its time. Language design was in its early stages in the sixties. He did better in the successor, Modula-2, but Pascal was already a hit.

Quote
But with Wirth's clever ELSE IF construct, a fixed branch-out label can be set as soon as the parser finds the ELSE keyword. It doesn't matter how many ELSE IFs will follow; they either jump to the next ELSE IF (if the condition is false) or they branch out to the fixed label. The only price to pay is a redundant label if the final ELSE IF block is not followed by an ELSE block.

That holds if the IF block is simple. You still need to cater for a lot of complexity if the part before the ELSE is very complex, or handle it recursively.

Most conditional branches have very limited distances they can jump, so it is nearly never this simple.



munair

Re: Syntax curiosity
« Reply #2 on: September 12, 2019, 02:07:35 pm »
Quote
Only Basic, I guess. It is not a natural combination of two keywords that are already in use. Probably they needed to fix something, but I'm not so deep into Basic details. That said, later compilable Basics are not original BASIC, so it might not even be part of the original Basic syntax. Afaik C=64 Basic V2 only had if and goto.

Not just Basic. Python has ELIF and PHP has ELSEIF, to name a few. But indeed, leading languages use ELSE IF, probably for the same reason.

Quote
That holds if the IF block is simple. You still need to cater for a lot of complexity if the part before the ELSE is very complex, or handle it recursively.

Most conditional branches have very limited distances they can jump, so it is nearly never this simple.

Recursive processing is key. Once the statements are well defined and blocks are handled properly, (deep) nesting will not be a problem.

For example:
Code: FreeBasic
' if-statement
if a = 0 do
  a = fnA();
  ' if-statement
  if a = 1 do
    a = -1;
    end; ' end if-statement
  end
else if b = 0 do
  b = fnB();
  end
else if c = 0 do
  c = fnC();
  end
else if d = 0 do
  d = fnD();
  end
else do
  z = fnZ();
  end; ' end if-statement

would output something like this (debugging):
Code:
if
<condition>

BEQ L0

a
BSR fnA

LEA a(PC), A0

MOVE D0, (A0)

if
<condition>

BEQ L1

a
MOVE #0, D0

LEA a(PC), A0

MOVE D0, (A0)

end
L1:
end
BRA L2

L0:
<condition>

BEQ L3

fnB
BSR fnB

end
BRA L2

L3:
<condition>

BEQ L4

fnC
BSR fnC

end
BRA L2

L4:
<condition>

BEQ L5

fnD
BSR fnD

end
BRA L2

L5:
fnZ
BSR fnZ

end
L2:
EOI

I find AST processing much more complex than recursive procedure handling.
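
The label order in the listing above (L1 posted first, then L0, L3, L4, L5 and finally the fixed L2) falls out naturally when the block parser calls the IF handler recursively. Below is a hedged Python sketch of that idea. The class name `IfCompiler`, the whitespace-split source, the single-token conditions and the `<eval …>` / `<stmt …>` placeholders are assumptions for illustration only, not the actual compiler.

```python
from itertools import count


class IfCompiler:
    """Recursive-descent sketch: nested IFs share a single label counter."""

    def __init__(self, source):
        self.tokens = source.split()
        self.pos = 0
        self.labels = count()
        self.out = []

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise SyntaxError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def new_label(self):
        return f"L{next(self.labels)}"

    def block(self):
        # Statements up to the closing END; a nested IF simply recurses.
        while self.peek() != "end":
            if self.peek() == "if":
                self.if_statement()
            else:
                self.out.append(f"<stmt {self.take()}>")
                self.take(";")
        self.take("end")

    def guarded_block(self, false_label):
        self.take("if")
        self.out.append(f"<eval {self.take()}>")
        self.out.append(f"BEQ {false_label}")   # condition false: skip block
        self.take("do")
        self.block()

    def if_statement(self):
        next_label = self.new_label()
        self.guarded_block(next_label)
        exit_label = next_label
        if self.peek() == "else":
            exit_label = self.new_label()        # fixed branch-out label
            while True:
                self.out.append(f"BRA {exit_label}")
                self.out.append(f"{next_label}:")
                self.take("else")
                if self.peek() == "if":
                    next_label = self.new_label()
                    self.guarded_block(next_label)
                    if self.peek() != "else":
                        self.out.append(f"{next_label}:")
                        break
                else:
                    self.take("do")
                    self.block()
                    break
        self.take(";")
        self.out.append(f"{exit_label}:")


if __name__ == "__main__":
    src = ("if a=0 do a=fnA ; if a=1 do a=-1 ; end ; end "
           "else if b=0 do b=fnB ; end "
           "else if c=0 do c=fnC ; end "
           "else if d=0 do d=fnD ; end "
           "else do z=fnZ ; end ;")
    compiler = IfCompiler(src)
    compiler.if_statement()
    print("\n".join(compiler.out))
```

The nested IF takes L1 while the outer statement is still inside its first block, which is why the outer chain's fixed exit label only becomes L2.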

marcov

Re: Syntax curiosity
« Reply #3 on: September 12, 2019, 02:13:19 pm »
Well, Pascal is not that simple, compare:

Code: Pascal
if a then
  if b then
      s
    else
      s2

with

if a then
  if b then
      s;
 else
   s2

I tried to illustrate the difference with indentation (which the compiler of course ignores)
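
To make the binding rule concrete, here is a small Python sketch of how a recursive-descent parser attaches an ELSE to the nearest open IF. It models only the binding, not real Pascal: `parse_statement` and `parse` are hypothetical names, conditions and statements are single tokens, semicolons are not handled, and `begin .. end` is simplified to wrap exactly one statement.

```python
def parse_statement(tokens, pos=0):
    """Parse one statement; return (tree, next_pos)."""
    tok = tokens[pos]
    if tok == "if":
        cond = tokens[pos + 1]
        if tokens[pos + 2] != "then":
            raise SyntaxError("expected 'then'")
        then_branch, pos = parse_statement(tokens, pos + 3)
        else_branch = None
        # The innermost IF that is still open gets first claim on an ELSE.
        if pos < len(tokens) and tokens[pos] == "else":
            else_branch, pos = parse_statement(tokens, pos + 1)
        return ("if", cond, then_branch, else_branch), pos
    if tok == "begin":
        inner, pos = parse_statement(tokens, pos + 1)
        if pos >= len(tokens) or tokens[pos] != "end":
            raise SyntaxError("expected 'end'")
        return inner, pos + 1
    return tok, pos + 1


def parse(source):
    tree, _ = parse_statement(source.split())
    return tree


if __name__ == "__main__":
    # ELSE binds to the inner IF, whatever the indentation suggests.
    print(parse("if a then if b then s else s2"))
    # Wrapping the inner IF in begin .. end hands the ELSE to the outer IF.
    print(parse("if a then begin if b then s end else s2"))
```

In this sketch the only way to attach the ELSE to the outer IF is to close the inner one explicitly with begin .. end.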

munair

Re: Syntax curiosity
« Reply #4 on: September 12, 2019, 02:33:16 pm »
Quote
Well, Pascal is not that simple, compare:

Code: Pascal
if a then
  if b then
      s
    else
      s2

with

if a then
  if b then
      s;
 else
   s2

I tried to illustrate the difference with indentation (which the compiler of course ignores)

Well, I suppose that's where BEGIN .. END comes in, to prevent potential ambiguity and also make the code more readable and understandable. In general, a language should leave no room for ambiguity (which may be among the hardest parts of compiler development).

440bx

  • Hero Member
  • *****
  • Posts: 3944
Re: Syntax curiosity
« Reply #5 on: September 12, 2019, 03:05:17 pm »
Quote
Ever since I started programming with Pascal I wondered about the IF construct, specifically the two-keywords ELSE IF construct for elseif blocks. Most languages use one keyword for it, usually something like ELSEIF or ELSIF.

The curious Pascal construct probably goes back to the late 60s / early 70s, when the language was first designed.
It's a bit amusing to see the Pascal if/then/else construct described as "curious".

In a language that has the if/else construct, an "elseif", or any other syntactic variant thereof, is an indication of poor language design.  First, it's completely redundant/unnecessary.  Second, imagine if there was an "elsefor" or "elsewhile", etc, that would legitimately be "curious". An "elseif" construct is as curious as any of those yet, programmers don't see it that way because it allows them to indulge in one of their most common poor programming practices, that is, deeply nested if/else constructions.  A very poor programming habit made even worse by the "preference" some programmers have not to align begin/end pairs.

"elseif"/"elsif" is simply "cheap design" that caters to poor programming.



(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: Syntax curiosity
« Reply #6 on: September 12, 2019, 03:21:23 pm »
Picking up on OP's

> Well, I suppose that's where BEGIN .. END come in to prevent potential ambiguity

More to the point, that's where Modula-2's IF .. THEN .. END (the END being mandatory) comes in. However how it got there is worth examining.

ALGOL-60 and ALGOL-W both used the Pascal-style  if <expression> then <single statement>  form. ALGOL-68 uses  if <expression> then <statement sequence> end  which eliminates the dangling-else problem. Wirth, as is well known, resigned from the ALGOL-68 standardisation process and shortly afterwards introduced Pascal, apparently intentionally breaking the ALGOL syntax by- as a specific example- changing the order of variable declaration. In addition to this, the earliest Pascal specification used /* */ for comments but Wirth gives the impression of having switched to (* *) when he noticed that he was suggesting compatibility with the B language.

Wirth fixed the dangling else problem in Modula-2, and it's also fixed in Ada (which is not, as a result, a "Pascal style" language whatever the unwashed say).

To its discredit, Pascal- as designed by Wirth with input from Hoare- has  record .. end  and  case .. end. The  then <single statement>  form might be excusable if ALGOL's  := if <expression> then <expression> else <expression>  form had been retained, but for some reason Wirth jettisoned it.

So I know that plenty of people are going to object to my saying this, but Pascal's syntax is a mess.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

munair

Re: Syntax curiosity
« Reply #7 on: September 12, 2019, 03:51:40 pm »
Quote
It's a bit amusing to see the Pascal if/then/else construct described as "curious".

In a language that has the if/else construct, an "elseif", or any other syntactic variant thereof, is an indication of poor language design.  First, it's completely redundant/unnecessary.  Second, imagine if there was an "elsefor" or "elsewhile", etc, that would legitimately be "curious". An "elseif" construct is as curious as any of those yet, programmers don't see it that way because it allows them to indulge in one of their most common poor programming practices, that is, deeply nested if/else constructions.  A very poor programming habit made even worse by the "preference" some programmers have not to align begin/end pairs.

"elseif"/"elsif" is simply "cheap design" that caters to poor programming.
I wasn't referring to "IF..THEN..ELSE" but to "ELSE IF". Python and PHP are widely used and support exactly this "poor" ELIF and ELSEIF design. I wouldn't call it poor per se, but from a compiler development point of view it requires more steps only to satisfy programmers' preferences. As far as I can trace it back, the ELSEIF clause was introduced in the 1980s (at least with QuickBASIC), when computers became more powerful and backtracking and two-pass techniques became acceptable.

munair

Re: Syntax curiosity
« Reply #8 on: September 12, 2019, 03:57:38 pm »
Quote
So I know that plenty of people are going to object to my saying this, but Pascal's syntax is a mess.
I partially agree. But regarding the two-keywords ELSE IF construct, it is found in leading programming languages, and from a compiler construction point of view, it is logical. This is exactly the reason why every serious programmer should know at least SOMETHING about compiler design.

440bx

Re: Syntax curiosity
« Reply #9 on: September 12, 2019, 04:11:00 pm »
Quote
I wasn't referring to " IF..THEN.ELSE" but to "ELSE IF".
I see what you're saying now.  I guess/believe that Wirth chose the "ELSE IF" syntax because things like "elseif"/"elif" are redundant when there is "IF" and "ELSE". 


Quote
So I know that plenty of people are going to object to my saying this, but Pascal's syntax is a mess.
I'd say it's got a few quirks, but I wouldn't describe the Pascal syntax as a mess. Unlike in other languages, some very popular ones, there are very few ways in Pascal of messing things up due to a syntactically unintended construct.

munair

Re: Syntax curiosity
« Reply #10 on: September 12, 2019, 04:18:12 pm »
Quote
I guess/believe that Wirth chose the "ELSE IF" syntax because things like "elseif"/"elif" are redundant when there is "IF" and "ELSE".

If ELSEIF could be implemented with the same ease and performance there would be no reason not to do so, except that it would add another keyword.

MarkMLl

Re: Syntax curiosity
« Reply #11 on: September 12, 2019, 04:19:38 pm »
Quote
I partially agree. But regarding the two-keywords ELSE IF construct, it is found in leading programming languages, and from a compiler construction point of view, it is logical. This is exactly the reason why every serious programmer should know at least SOMETHING about compiler design.

Modula-2 has ELSIF. Can't remember whether the original Modula had it.

The idea of  if <expression> then <single statement>  and so on, which was what necessitated  begin ... end  , is fine as far as elegance is concerned. But I don't think that reputable languages provide a more urgent example of the gulf between elegant and reliable.

Leaving aside the reliability issue of that form, it was an utter sod to explain in documentation in the days when most programming was done with punched cards and indented sourcecode was a comparative rarity. I suggest that anybody who doesn't believe me digs into early ALGOL manuals on Bitsavers.

My own suspicion is that Wirth was intent on producing Pascal quickly, so that he could present the community with a fait accompli before the ALGOL-68 standardisation effort had wound down. And I also suspect that his compiler writing technique in that era, which if Euler is anything to go by involved uncommented numeric tables and- yes- GOTOs, was sufficiently resistant to modification that he recognised that fixing the if statement's flaws without introducing additional problems was something to not be undertaken lightly.

MarkMLl

MarkMLl

Re: Syntax curiosity
« Reply #12 on: September 12, 2019, 04:33:10 pm »
Quote
I'd say its got a few quirks but, I wouldn't describe the Pascal syntax as a mess.  Unlike in other languages, some very popular ones, there are very few ways in Pascal of messing things up due to a syntactically unintended construct.

It appears to me that, 50 years on, most language designers are struggling to match Pascal's support for string handling and moderately strong type checking without leaving themselves open to accusations they're trying to persuade the World to program in Pascal.

But I'm afraid that I stick to my guns here and say that Pascal's mix of elegant if (not terminated by end) and reliable case (terminated by end) is a mess.

MarkMLl

munair

Re: Syntax curiosity
« Reply #13 on: September 12, 2019, 05:16:17 pm »
I almost forgot, RPG is another language that uses ELSEIF. Doing some more reading, it seems the 2-keywords ELSE IF construct can confuse programmers as to what belongs to what; does the IF belong to the ELSE or is it a new IF statement (it shouldn't be). Marcov's example demonstrates how confusing and harder-to-read it can get.

In my opinion constructs like
Code: Pascal
if <condition> then
  if <condition> then
    <consequence>
else
  <consequence>;
should not be possible.

Rather:
Code: FreeBasic
if <condition> do
  if <condition> do
    <consequence>;
    end; 'nested statement end
else
  <consequence>;
  end; ' statement end
whereby the statement ends are mandatory and the semicolon leaves no doubt as to which statement (block) is which.
« Last Edit: September 12, 2019, 08:14:09 pm by Munair »

ArtLogi

  • Full Member
  • ***
  • Posts: 184
Re: Syntax curiosity
« Reply #14 on: September 12, 2019, 10:41:33 pm »
Interesting topic. I actually had to go and try whether "ELSE IF" could be written as

Code: Pascal
IF arg1 oper arg2 THEN
  statement
ELSE
IF arg1 oper arg2 THEN
 statement
ELSE
Nope... but is this logically the same as IF - ELSE IF - ELSE?
Code: Pascal
begin
  if 1=0 then begin
    writeln('1=0')
  end
  else begin
    if 1=2 then begin
      writeln('1=2')
    end
    else begin
      writeln('else')
    end
  end
end.
... because if you count, there is exactly the same number of IF and ELSE keywords as in an IF - ELSE IF - ELSE statement.
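
A quick way to check that the two forms are logically the same is to write both and compare them over a range of inputs. The Python sketch below does that with hypothetical functions `chained` and `nested`. Python's `elif` stands in for the combined keyword, and the conditions are arbitrary examples rather than the `1=0` / `1=2` constants used above.

```python
def chained(x):
    # IF - ELSE IF - ELSE written with a combined keyword.
    if x == 0:
        return "zero"
    elif x == 2:
        return "two"
    else:
        return "else"


def nested(x):
    # The same logic with the second IF nested inside the ELSE branch.
    if x == 0:
        return "zero"
    else:
        if x == 2:
            return "two"
        else:
            return "else"


if __name__ == "__main__":
    for value in range(-3, 4):
        assert chained(value) == nested(value)
    print("chained and nested forms agree")
```

The combined keyword changes only how the source is laid out. Both functions take the same branch for every input.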

That said, doesn't the "original" Wirth Pascal require begin and end around every instruction, with no exceptions? (See next post.) The question needs to be analysed in the light of the original Wirth implementation of Pascal (because that is how it was asked), not a modern, streamlined and extended one.
« Last Edit: September 12, 2019, 11:05:03 pm by ArtLogi »
While Record is a drawer and method is a clerk, when both are combined to same space it forms an concept of office, which is alias for a great suffering.

 
