Lazarus

Free Pascal => General => Topic started by: ebuxbaum on November 28, 2020, 08:12:03 am

Title: "unexpected behaviour" in loop
Post by: ebuxbaum on November 28, 2020, 08:12:03 am
Hi,

for a program that performs k-means clustering, I used the following procedure:

Code: Pascal  [Select][+][-]
  1. procedure CalculateNewCentroids (const Data      : MatrixTyp;
  2.                                        k         : word;
  3.                                  var   Group     : GroupTyp;
  4.                                  var   Centroids : MatrixTyp
  5.                                 );
  6.  
  7. var i, j, l, n, p, nk  : word;
  8.     Sums               : array [1..kNNMax] of float;
  9.  
  10.  
  11. begin
  12.   n := MatrixRows(Data);
  13.   p := MatrixColumns(Data);
  14.   for l := 1 to k do
  15.     begin
  16.       for j := 1 to k do Sums[j] := 0.0;
  17.       nk := 0;
  18.       for i := 1 to n do
  19.         if Group[i] = l
  20.           then
  21.             begin
  22.               for j := 1 to p do
  23.                 begin
  24.                   Sums[j] := Sums[j] + GetMatrixElement(Data, i, j);
  25.                   writeln(j:5, ' ', n:5, ' ', p:5, ' ', nk:5);
  26.                 end;
  27.               inc(nk);
  28.             end;
  29.       for j := 1 to p do
  30.         SetMatrixElement(Centroids, k, j, Sums[j]/nk);
  31.     end; // for l
  32. end; // CalculateNewCentroids
   

which is supposed to calculate the averages of all features for each group. Unexpectedly, the inner loop changes values of counters for the outer loop, the output of the writeln-statement is:

       1   101    23     0
       2   101    23     0
       3   101    23     0
       4   101    23     0
       5   101    23     0
       6   101    23     0
       7   101    23     0
       8   101    23     0
       9   101    23     0
      10   101    23     0
      11   101    23     0
      12   101    23     0
      13   101    23     0
      14   101    23     0
      15   101    23     0
      16   101    23  6291
      17 34078 60293  6291
   65108 34078 60293  6291
Title: Re: "unexpected behaviour" in loop
Post by: lucamar on November 28, 2020, 08:23:28 am
I'd start by checking what GetMatrixElement() is doing and how it's declared; it might somehow be changing i and/or j (and getting away with it!). Otherwise, check for unexpected overflows or run-away pointers.

Dfficult to say more without seeing all the related code. :-[
Title: Re: "unexpected behaviour" in loop
Post by: ebuxbaum on November 28, 2020, 08:44:10 am
GetMatrixElement is defined as

Code: Pascal  [Select][+][-]
  1. type
  2.   MatrixStruc = record
  3.     Columns, Rows: word;
  4.     Data: array[1..MaxVector] of float;
  5.   end;
  6.   MatrixTyp = ^MatrixStruc;  
  7.  
  8. function GetMatrixElement(const A: MatrixTyp; Row, Column: word): float;
  9.  
  10. var n, p : word;
  11.  
  12. begin
  13.   n := MatrixRows(A);
  14.   p := MatrixColumns(A);
  15.   if (Row <= n) and (Column <= p)
  16.     then
  17.       Result := A^.Data[pred(Row)*p + Column]
  18.     else
  19.       ch := WriteErrorMessage(' Attempt to read a non-existent matrix element');
  20. end;

and therefore should not change the counters (although n and p exist there too, they are different variables because of locality). In addition, a proposed side-effect would start with the first call, not the 16th.

Btw, the error occurs both under Win10 and Linux.
Title: Re: "unexpected behaviour" in loop
Post by: bytebites on November 28, 2020, 09:20:09 am
Still not compiƶable code.
Maybe kNNMax too small.

Good luck for guessing  :P
Title: Re: "unexpected behaviour" in loop
Post by: lucamar on November 28, 2020, 09:31:20 am
GetMatrixElement is defined as
[...]
and therefore should not change the counters (although n and p exist there too, they are different variables because of locality). In addition, a proposed side-effect would start with the first call, not the 16th.

Yes, of course. But then, we couldn't know that without seeing it, would we? :D

One new thing is that all your accesses to the matrix seems to be through a pointer; check then whether the "runaway pointer" theory migth hold some water. It doesn't seems probable since the structure itself shouldn't be anywhere near the stack--where all the locals are kept--and you don't seem to be doing anthing "strange" with the pointer, but one never knows ...

One strange thing is that the first to fail seems to be nk (or it might be all of them at once); it looks as if somehow the index of Sums[] is becoming negative and thrashing the vars declared before it, but I can't see how, sorry :-[
Title: Re: "unexpected behaviour" in loop
Post by: MarkMLl on November 28, 2020, 10:51:50 am
I very much dislike criticising code on account of well-intentioned indentation etc., but cutting out some of the cruft:

Code: Pascal  [Select][+][-]
  1. procedure CalculateNewCentroids (const Data      : MatrixTyp;
  2.                                        k         : word;
  3.                                  var   Group     : GroupTyp;
  4.                                  var   Centroids : MatrixTyp
  5.                                 );
  6.      
  7. var i, j, l, n, p, nk  : word;
  8.     Sums               : array [1..kNNMax] of float;
  9.      
  10.      
  11. begin
  12.   n := MatrixRows(Data);
  13.   p := MatrixColumns(Data);
  14.  
  15.   for l := 1 to k do begin
  16.     for j := 1 to k do
  17.       Sums[j] := 0.0;
  18.     nk := 0;
  19.  
  20.     for i := 1 to n do
  21.       if Group[i] = l then begin
  22.             for j := 1 to p do begin
  23.                 Sums[j] := Sums[j] + GetMatrixElement(Data, i, j);
  24.                 writeln(j:5, ' ', n:5, ' ', p:5, ' ', nk:5)
  25.             end;
  26.             inc(nk)
  27.       end;
  28.  
  29.     for j := 1 to p do
  30.       SetMatrixElement(Centroids, k, j, Sums[j]/nk)
  31.   end // for l
  32.  
  33. end; // CalculateNewCentroids
  34.  

Take a careful look at what's in the group array, and put i at the front of that WriteLn().

MarkMLl
Title: Re: "unexpected behaviour" in loop
Post by: ebuxbaum on November 28, 2020, 11:53:10 am
Ok, done. i stays 1, and Group is 1 (I checked), as higher groups will be handled only in further iterations of the outer loop. k is unaffected, too. The values returned by GetMatrixElement are correct.
Title: Re: "unexpected behaviour" in loop
Post by: MarkMLl on November 28, 2020, 12:45:39 pm
Allowing that I'm working on something else... does control reach the statement with the divide-by-zero in it?

MarkMLl
Title: Re: "unexpected behaviour" in loop
Post by: ebuxbaum on November 28, 2020, 02:07:38 pm
I am not sure what you mean by that. The only division in this procedure is the calculation of averages (Sums[j]/nk), and this is never reached. But you raise an important point, the situation nk = 0 is possible and needs to be taken care of (then Sums[j] would also be 0).
Title: Re: "unexpected behaviour" in loop
Post by: Jonas Maebe on November 29, 2020, 02:56:18 pm
Try enabling range checking.
Title: Re: "unexpected behaviour" in loop
Post by: jamie on November 29, 2020, 11:58:03 pm
looks like your n and p could be exceeding your kNNMax which would cause a clobbering of the stack there by messing up your counter variables, which are also on the stack

 looking at the use of DATA which is a pointer, most likely returning bad data or out of range data.
Title: Re: "unexpected behaviour" in loop
Post by: ebuxbaum on December 01, 2020, 10:20:22 am
Actually no: KNNMax is the maximal number of clusters, into which the data can be sorted. Following the conventional nomenclature in multivariate statistics, n is the number of items (say, persons participating in a study), p the number of variables. Thus, problems would occur only if k (actual number of clusters)  is larger than KNNmax, but that is prevented by the calling routine. Indeed, KNNmax is a global constant and set to 15, k is set by the calling routine to 2 for the time being. 

I have now modified the inner loop as follows:
Code: Pascal  [Select][+][-]
  1. for j := 1 to p do begin
  2.    x := GetMatrixElement(Data, i, j);
  3.    write(k:5, ' ', i:5, ' ', j:5, ' ', n:5, ' ', p:5, ' ', nk:5, ' ', Group[i]:5, ' ', x:3:6, ' ', Sums[j]:3:3, ' ');
  4.    Sums[j] := Sums[j] + x;
  5.    writeln(p:5, ' ', Sums[j]:3:3);
  6. end;
  7.  

and got the following result:

     k     i     j    n     p     nk  Group   x        Sums[j]  p  Sums[j]
    2     1     1   101    23     0        1   32.000000 0.000    23 32.000 
    2     1     2   101    23     0        1   -0.307000 0.000    23 -0.307 
    2     1     3   101    23     0        1   -0.081000 0.000    23 -0.081 
    2     1     4   101    23     0        1   -0.466000 0.000    23 -0.466 
    2     1     5   101    23     0        1    1.170000 0.000    23  1.170 
    2     1     6   101    23     0        1    1.894000 0.000    23  1.894 
    2     1     7   101    23     0        1   -0.913000 0.000    23 -0.913 
    2     1     8   101    23     0        1    0.628000 0.000    23  0.628 
    2     1     9   101    23     0        1   -0.904000 0.000    23 -0.904 
    2     1    10   101    23     0        1    0.458000 0.000    23  0.458 
    2     1    11   101    23     0        1   -1.160000 0.000    23 -1.160 
    2     1    12   101    23     0        1    0.360000 0.000    23  0.360 
    2     1    13   101    23     0        1    1.507000 0.000    23  1.507 
    2     1    14   101    23     0        1   -0.505000 0.000    23 -0.505 
    2     1    15   101    23     0        1    0.454000 0.000    23  0.454 
    2     1    16   101    23     0        1    0.753000 0.000  6291  0.753 
    2     1    17   101  6291 27263        1    0.305000 0.000  6291  0.305 
    2     1    18 60293  6291 27263        1   -0.453000 0.000  6291  0.000         


In other words, the guilty line is
Code: Pascal  [Select][+][-]
  1. Sums[j] := Sums[j] + x;
, because when j = 16, p is correct before and wrong afterwards.
Title: Re: "unexpected behaviour" in loop
Post by: MathMan on December 01, 2020, 12:58:08 pm
Actually no: KNNMax is the maximal number of clusters, into which the data can be sorted. Following the conventional nomenclature in multivariate statistics, n is the number of items (say, persons participating in a study), p the number of variables. Thus, problems would occur only if k (actual number of clusters)  is larger than KNNmax, but that is prevented by the calling routine. Indeed, KNNmax is a global constant and set to 15, k is set by the calling routine to 2 for the time being.

Hm, looking at your initial post your problems seem to start with j=16 and KNNMax = 15 as per your above statement. I'd say this smells like exceeding the array bounds of "Sums".

I can only advise like what has already been advised by Jonas - compile with range checks on and see what happens when the function is executed.

MathMan
Title: Re: "unexpected behaviour" in loop
Post by: dseligo on December 01, 2020, 02:31:26 pm
In other words, the guilty line is
Code: Pascal  [Select][+][-]
  1. Sums[j] := Sums[j] + x;
, because when j = 16, p is correct before and wrong afterwards.
j shouldn't be 16, because of this declaration:
    Sums               : array [1..kNNMax] of float;

and you said that kNNMax is set to 15.

So you have:     Sums:array [1..15] of float;
And you then do this: Sums[16]:=Sums[16]+x;
Title: Re: "unexpected behaviour" in loop
Post by: ebuxbaum on December 02, 2020, 09:40:59 am
Yes, that should have been pMax, not KNNmax. Thanks everybody for your help. I only wonder why this didn't cause a runtime error. In TP, it would have.
Title: Re: "unexpected behaviour" in loop
Post by: Bart on December 02, 2020, 10:51:11 am
By default range checking is off in fpc, that's why.

Bart
Title: Re: "unexpected behaviour" in loop
Post by: PascalDragon on December 02, 2020, 02:46:26 pm
Well, Jonas Maebe did suggest to enable range checking...

Try enabling range checking.
TinyPortal © 2005-2018