Author Topic: For variable being read after loop... Any way to get an error or warning on it?  (Read 13682 times)

kupferstecher

  • Hero Member
  • *****
  • Posts: 603
Quote
For many it is not immediately obvious what its purpose is, but consider:[...]
But how can you be sure whether the loop ran to completion or exited early? I.e. how do you know whether the variable's value is guaranteed after the loop? I think you need another variable, such as a flag, to track a loop exit by a BREAK statement. And then the whole thing isn't useful anymore; I'd rather copy the control variable when breaking.
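
A minimal sketch of "copy the control variable when breaking", assuming FPC in objfpc mode. The program name, the Data array and the FoundAt variable are made up for illustration and are not from the thread:

```pascal
program BreakCopy;
{$mode objfpc}

const
  // Hypothetical sample data, for illustration only
  Data: array[0..4] of Integer = (4, 8, 15, 16, 23);

var
  I, FoundAt: Integer;

begin
  FoundAt := -1;               // sentinel meaning "the loop was not left by Break"
  for I := Low(Data) to High(Data) do
    if Data[I] = 15 then
    begin
      FoundAt := I;            // copy the control variable before leaving
      Break;
    end;

  // Only FoundAt is read after the loop; I is treated as undefined here.
  if FoundAt >= 0 then
    WriteLn('found at index ', FoundAt)
  else
    WriteLn('not found');
end.
```

The copy doubles as the flag: the sentinel value tells you whether the loop was left early, so the control variable never has to be read after the loop.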

MarkMLl

  • Hero Member
  • *****
  • Posts: 7904
Quote
For many it is not immediately obvious what its purpose is, but consider:[...]
But how can you be sure whether the loop ran to completion or exited early? I.e. how do you know whether the variable's value is guaranteed after the loop? I think you need another variable, such as a flag, to track a loop exit by a BREAK statement. And then the whole thing isn't useful anymore; I'd rather copy the control variable when breaking.

I'm uncomfortable with the whole damn thing, since while the compiler can unambiguously detect whether there's a break or goto (which in intent probably includes LongJmp() and exception raising) within the loop it can't unambiguously detect whether it's been taken. Hence the ISO position, which is that if goto isn't taken then the control variable becomes invalidated, is untenable in practice.

However as a background detail I know that there was at least one mainframe Pascal implementation- at Stanford University, and well-publicised- which used hardware facilities to tag even integer values that had no defined value. I've not researched the ISO committee membership (which presumably did not include Wirth, who it appears was no longer on the best of terms with his former colleagues) but there's a possibility that Stanford's demonstration that this sort of thing /could/ be done influenced the standard's assumption that it /should/ be done.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Logitech, TopSpeed & FTL Modula-2 on bare metal (Z80, '286 protected mode).
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

440bx

  • Hero Member
  • *****
  • Posts: 4686
as they say... keep it simple (short.)

if someone needs to rely on an index value upon exiting/terminating a loop then a "for" is _not_ the right construct.   Simply use a "while" or a "repeat", no need to complicate things using a "for" loop for tasks it wasn't designed to perform.

It really isn't hard to use a "while" instead of a "for".
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

alpine

  • Hero Member
  • *****
  • Posts: 1271
Here it is:
From http://pascal.hansotten.com/uploads/books/Pascal_User_Manual_and_Report_Second_Edition.pdf, page 24, last paragraph:
Quote
The control variable, the initial value, and the final value must be of the same scalar type (excluding type real), and must not be altered by the for statement. The initial and final values are evaluated only once. If in the case of to (downto) the initial value is greater (less) than the final value, the for statement is not executed. The final value of the control variable is left undefined upon normal exit from the for statement.
(later came the clarification about the break statement)
"Must not be altered by the for statement" - and that is the sole reason you can't reuse the same control var in a nested proc or pass it as a var parameter.
And only $DEITY (and Wirth) knew why it's left undefined, but now it's set in stone.
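
A small sketch of two of the rules in that paragraph (the final value is evaluated only once, and the body is skipped when the initial value is greater than the final value), assuming FPC in objfpc mode. The names N, First, Last and Count are made up for illustration:

```pascal
program ForBounds;
{$mode objfpc}

var
  I, N, First, Last, Count: Integer;

begin
  // The final value is evaluated once, before the first iteration.
  N := 3;
  Count := 0;
  for I := 1 to N do
  begin
    N := 10;                 // N is not the control variable, so this is legal,
    Inc(Count);              // but it does not extend the loop
  end;
  WriteLn('iterations: ', Count);   // prints "iterations: 3"

  // With "to", an initial value greater than the final value skips the body.
  First := 5;
  Last := 1;
  Count := 0;
  for I := First to Last do
    Inc(Count);
  WriteLn('iterations: ', Count);   // prints "iterations: 0"
end.
```

In neither case does the program read I after the loop, which is the part the report leaves undefined.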

Quote
For many it is not immediately obvious what its purpose is, but consider:[...]
But how can you be sure whether the loop ran to completion or exited early? I.e. how do you know whether the variable's value is guaranteed after the loop? I think you need another variable, such as a flag, to track a loop exit by a BREAK statement. And then the whole thing isn't useful anymore; I'd rather copy the control variable when breaking.
Exactly! That whole break thing seems a nonsense to me.

I'm just stunned by the number of posts here, incl. attempts to interpret the ISO texts, C language refs, etc. It is the way it is, and if someone is not happy, let them use the while construct for the purpose (as K&R did in their for()).

BTW the OP asked just for a compiler hint about using the value after the for loop.

"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

MarkMLl

  • Hero Member
  • *****
  • Posts: 7904
Quote
if someone needs to rely on an index value upon exiting/terminating a loop then a "for" is _not_ the right construct.   Simply use a "while" or a "repeat", no need to complicate things using a "for" loop for tasks it wasn't designed to perform.

But the fact of the matter is that for and its anomalous behaviour is defined in the language ** , cannot be easily removed, and results in behaviour that the compiler cannot easily warn (or error) as being unsafe.

** Even if we blame break on an early non-Wirth extension, he didn't explicitly prohibit goto leading out of a for statement.

MarkMLl

440bx

  • Hero Member
  • *****
  • Posts: 4686
Quote
But the fact of the matter is that for and its anomalous behaviour is defined in the language ** , cannot be easily removed, and results in behaviour that the compiler cannot easily warn (or error) as being unsafe.
Yes, that's a fact.

The anomaly should be reason enough to choose a different kind of loop if it is necessary to rely on the index value.

IOW, the value of the "for" control variable should _always_ be thought of as being undefined after the loop ends no matter how it ended.  That's the simple solution.

I agree that there are deficiencies in the design, one that is particularly evident when it really isn't possible for the compiler to determine with certainty in all cases how the "for" loop will end.  The most the compiler could do is emit a warning if it detects the loop variable being used after the loop but, the warning may or may not be applicable which considerably lessens its value.

The OP asked for a warning or message about mis-using the control variable, personally, I think the compiler should issue a warning but, not about the control variable but about the programmer having selected the wrong construct for the task.  IOW, when the compiler sees the "for" control variable being used after the loop, it should issue a "syntax" error: "programmer expected.  compilation terminated."


MarkMLl

  • Hero Member
  • *****
  • Posts: 7904
Quote
The anomaly should be reason enough to choose a different kind of loop if it is necessary to rely on the index value.

In any event: it's part of the language, we're stuck with it, and there's many millions of lines of legacy Pascal that rely on it. And any attempt to machine-rewrite Legacy Pascal into something that uses the "right" control structure will fail for the reasons we're discussing.

I've just checked, and just about the same syntax is used in ALGOL-W (http://i.stanford.edu/pub/cstr/reports/cs/tr/71/230/CS-TR-71-230.pdf p56). Even flawed, it was still an improvement on the FORTRAN and ALGOL/C styles that preceded it.

---

I feel that we ought to draw a line under this, and if necessary assist the development process by disciplined discussion at https://gitlab.com/freepascal.org/fpc/source/-/issues/40879

MarkMLl

440bx

  • Hero Member
  • *****
  • Posts: 4686
Quote
there's many millions of lines of legacy Pascal that rely on it.
Maybe I'm misunderstanding but, I cannot even think of one instance where I've seen code that relied on the value of the "for" control variable after the loop ended.

I'm not against the compiler issuing a warning if it sees the control variable's value being used after the "for" ended but, I'm not sure of the value of such a warning because it's very unlikely that the compiler can reliably determine, in all cases, if the control variable has a useful value or is undefined.

As I stated in a previous post.  IMO, if a programmer needs to rely on the control variable, I would expect the programmer to use something other than a "for" loop and would regard the use of a "for" loop as being wrong even if the conditions are there to ensure the control variable is valid once the loop ends.  The reason I'd consider the code wrong is because that code is simply too fragile.

I think the compiler should guarantee (document) that the value of the "for" control variable is always undefined once the loop has ended.  I don't remember what the standard says and since I would not rely on that behavior, even if demanded by the standard, I have no incentive to look it up.

MarkMLl

  • Hero Member
  • *****
  • Posts: 7904
Quote
Maybe I'm misunderstanding but, I cannot even think of one instance where I've seen code that relied on the value of the "for" control variable after the loop ended.

Well OP obviously has, otherwise he wouldn't have started the thread, and TBH I'd say that it's a fairly common fault. The fact that it's a fault that is very often found (eventually) during debugging, and that the perpetrator learns by experience, doesn't detract from the fact that the syntax of the language permits it to happen so it's really down to the compiler to detect it to the extent possible: in the same way that the compiler warns about uninitialised variables and parameters provided that they are appropriately declared.

MarkMLl

BrunoK

  • Hero Member
  • *****
  • Posts: 620
  • Retired programmer
I don't understand why there is so much fuss about the for loop.
If control over the index variable is needed, just use a while.
Code: Pascal
  procedure Test();
  var
    I: Integer;
  begin
    I := 0;
    while I <= 999 do begin
      WriteLn(I);
      Inc(I);
    end;
    WriteLn(I);  // I is well defined here: 1000
  end;
The code is a bit messier, but it does what is (rarely) needed to get a post-loop value. It is longer in Pascal source, but nearly identical in terms of compiled code size (shorter?). Let for stay the way it was designed.
 

kupferstecher

  • Hero Member
  • *****
  • Posts: 603
Quote
In any event: it's part of the language, we're stuck with it, and there's many millions of lines of legacy Pascal that rely on it.

I don't see a problem (with regard to backward compatibility) with issuing a warning for every use after the loop, much as 440bx said. Perhaps: "Warning: Loop variable "foo" is accessed outside of the loop."
And since the compiler can't determine at compile time in all cases whether the loop variable is valid after the loop, the warning would be appropriate, I'd say. (And if the standard didn't have that special case for break, the warning could be the same as well.)

dbannon

  • Hero Member
  • *****
  • Posts: 3156
    • tomboy-ng, a rewrite of the classic Tomboy
Quote
.... and there's many millions of lines of legacy Pascal that rely on it. ...

I cannot imagine code that relies on a value being undefined.  Can anyone offer an example ?

Davo
Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

MarkMLl

  • Hero Member
  • *****
  • Posts: 7904
Quote
I cannot imagine code that relies on a value being undefined.  Can anyone offer an example ?

I thought it was fairly clear that I was referring to the for loop:

Quote
The anomaly should be reason enough to choose a different kind of loop if it is necessary to rely on the index value.

In any event: it's part of the language, we're stuck with it, and there's many millions of lines of legacy Pascal that rely on it. And any attempt to machine-rewrite Legacy Pascal into something that uses the "right" control structure will fail for the reasons we're discussing.

i.e. despite its unfortunate behaviour it cannot be quietly dropped from the language implementation.

MarkMLl

440bx

  • Hero Member
  • *****
  • Posts: 4686
I guess the problem is that someone who is new to Pascal may believe that using the "for" control variable after the loop ends is ok (as it is in C) even though it _may_ not be and, the compiler currently does not warn the programmer about it.

The real problem with issuing a warning is that, in most cases, the compiler cannot determine with certainty if using the variable after the loop has ended is ok or not (the compiler doesn't know what is going to happen at runtime.)

If the compiler takes the simple approach of emitting a warning whenever it sees the control variable being used after the loop, it will also emit warnings when the control variable is simply re-used in code that has nothing to do with the loop.  Avoiding that would require the compiler to do potentially sophisticated data flow analysis, which would slow down compilation, just to emit a warning that may still not be applicable to the situation.
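
To make the distinction concrete, here is a rough sketch of the cases such a check would have to tell apart, assuming FPC in objfpc mode. The program and the variable Sum are made up for illustration, and the comments describe a hypothetical warning, not anything FPC currently emits:

```pascal
program WarnCases;
{$mode objfpc}

var
  I, Sum: Integer;

begin
  Sum := 0;
  for I := 1 to 10 do
    Sum := Sum + I;

  // Case 1: I is read without being assigned first. Its value is undefined
  // here, so a warning would be justified. Left commented out on purpose.
  // WriteLn(I);

  // Case 2: I is assigned before it is read. This is harmless re-use, but a
  // check that flags every mention of I after the loop would still fire.
  I := 42;
  WriteLn(I);

  // Case 3: I is re-used as the control variable of an unrelated loop.
  for I := 1 to 3 do
    Sum := Sum + I;
  WriteLn(Sum);
end.
```

Telling case 1 apart from cases 2 and 3 needs at least a "read before next assignment" check, and once Break or goto is involved it needs the kind of data flow analysis described above.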

I'm all for the compiler helping the programmer avoid mistakes but, if the compiler is going to do something like that then, I would suggest such potential mistakes be detected only in a special "lint" mode (similar to optimization modes, a deep analysis mode) because the code required definitely has the potential to slow compilation speed.

Kind of like MS C/C++ does with "warning levels", the higher the warning level the more "sensitive" the compiler gets.


Thaddy

  • Hero Member
  • *****
  • Posts: 16018
  • Censorship about opinions does not belong here.
Quote
(as it is in C)
No, it isn't.
One would usually write a loop in C like this:
Code: C
  #include <stdio.h>
  int main() {
      for (int i = 0; i < 10; i++) {
          printf("%d\n", i);
      }
      printf("%d\n", i);  // compile error: 'i' is not declared in this scope
      return 0;
  }
Agree?... Well, I have news for you...: that won't compile....
i is out of scope at the second printf...
But you can write code that does compile  :D
Code: C
  #include <stdio.h>
  int main() {
      int i;
      for (i = 0; i < 10; i++) {
          printf("%d\n", i);
      }
      // Now 'i' is accessible here and holds 10
      printf("Final value of i: %d\n", i);
      return 0;
  }
My first example is basically how FreePascal interprets a loop variable, only it does not err.
My second example is what most people "expect", but is extremely sloppy C with total disregard for proper scoping rules. ( Hey, It is C..!)
Real C programmers know the difference and spot it immediately.

That shown, I wish FPC would support both with a local define or something.

« Last Edit: August 10, 2024, 03:45:48 pm by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

 
