
Topic: Calculate statistics (mean, variance) on multidimensional arrays

matandked

  • New member
  • *
  • Posts: 14
Calculate statistics (mean, variance) on multidimensional arrays
« on: February 12, 2017, 10:11:43 am »
I want to calculate simple statistics (mean, variance) for a given array.
This works for one-dimensional arrays, but I run into problems with multidimensional arrays.

I guess this is because I need to define another function for arrays of arrays?

My code is as follows:

Code: Pascal
program ArrayStatistics;

{$mode objfpc}{$H+}

uses
  {$IFDEF UNIX}{$IFDEF UseCThreads}
  cthreads,
  {$ENDIF}{$ENDIF}
  Classes,
  sysutils, math;

const
  RowCount = 2;
  ColumnCount = 5;

var
  simpleArray: array [1..ColumnCount] of Real = (0.5, 2.3, 6.2, 7.2, 1.0);
  // I wish I could just place 'simpleArray' in the first row..
  multiDimensionalArray: array [1..RowCount, 1..ColumnCount] of Real =
    ( (0.5, 2.3, 6.2, 7.2, 1.0), (2.5, 4.3, 3.2, 1.2, 1.0));

function getArrayStatsText (anArray : array of Real) : string;
begin
  Result := Concat(
    'Array Length: ', IntToStr(Length(anArray)), ' --> ',
    'Mean: ', FloatToStr(mean(anArray)), ' ',
    'Variance: ', FloatToStr(variance(anArray)), ' ',
    'Sum: ', FloatToStr(sum(anArray))
    );
end;

begin
  // Using functions described at:
  // http://www.freepascal.org/docs-html/rtl/math/statisticalroutines.html

  // Following works and prints on my screen:
  // Array Length: 5 --> Mean: 3.44 Variance: 9.413 Sum: 17.2
  WriteLn(getArrayStatsText(simpleArray));

  WriteLn(LineEnding + 'ISSUE - when I try to use function getArrayStatsText');
  WriteLn('on multiDimensionalArray, I receive an error');
  // Incompatible type
  // WriteLn(getArrayStatsText(multiDimensionalArray));

  ReadLn;
end.

Thaddy

  • Hero Member
  • *****
  • Posts: 3675
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #1 on: February 12, 2017, 10:34:36 am »
Check out the math unit: it already has that.

matandked

  • New member
  • *
  • Posts: 14
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #2 on: February 12, 2017, 01:55:47 pm »
I'm already using the math unit in my code (it is listed in my uses clause).
Which routine in math do you mean?

Thaddy

  • Hero Member
  • *****
  • Posts: 3675
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #3 on: February 12, 2017, 02:32:35 pm »
Note that any array, independent of its dimensions, can always be represented* as a one-dimensional array.
You just have to be careful how you choose your planes (a.k.a. axes in the case of a 2D array).

Example:
Code: Pascal
uses math;
const
  RowCount = 2;
  ColumnCount = 5;
var
  a: array [0..Pred(RowCount), 0..Pred(ColumnCount)] of double =
     ( (0.5, 2.3, 6.2, 7.2, 1.0), (2.5, 4.3, 3.2, 1.2, 1.0));
  // flattened a: b overlays the same memory as a, no copy is made
  b: array [0..Pred(RowCount*ColumnCount)] of double absolute a;
begin
  // all ten values of a are visible through b
  writeln('Mean: ' :10,Mean(b):5:5);
  writeln('Variance: ':10, Variance(b):5:5);
  writeln('Sum: ':10,Sum(b):5:5);
end.

I changed Real to Double because in Free Pascal Real is not a type of its own: it is a platform-dependent alias (Double on most targets), so Double is the more explicit choice.

* within the limitations of the memory model
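
Besides flattening the whole array, a single row of a static two-dimensional array is itself a one-dimensional array, so it can be passed to an open-array parameter as it stands. The following is a minimal sketch of per-row statistics, assuming a Double-based variant of the getArrayStatsText function from the first post (the program name and the loop variable are made up for illustration):

```pascal
program RowStatistics;

{$mode objfpc}{$H+}

uses
  SysUtils, Math;

const
  RowCount = 2;
  ColumnCount = 5;

var
  a: array [1..RowCount, 1..ColumnCount] of Double =
    ((0.5, 2.3, 6.2, 7.2, 1.0), (2.5, 4.3, 3.2, 1.2, 1.0));
  row: Integer;  // illustrative loop variable

// Open-array parameter: accepts any one-dimensional array of Double.
function GetArrayStatsText(const anArray: array of Double): string;
begin
  Result := 'Mean: ' + FloatToStr(Mean(anArray)) +
            ' Variance: ' + FloatToStr(Variance(anArray)) +
            ' Sum: ' + FloatToStr(Sum(anArray));
end;

begin
  // a[row] has type array [1..ColumnCount] of Double,
  // so it is compatible with the open-array parameter.
  for row := 1 to RowCount do
    WriteLn('Row ', row, ': ', GetArrayStatsText(a[row]));
end.
```

This gives one set of statistics per row; for statistics over all elements at once, the absolute overlay shown above is the shorter route.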
« Last Edit: February 17, 2017, 03:50:45 am by Thaddy »

matandked

  • New member
  • *
  • Posts: 14
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #4 on: February 17, 2017, 09:41:59 am »
 :o I'm really impressed!

But will this (treating a two-dimensional array as a one-dimensional one) work on a dynamic array as well?

Suppose that I read an array from a file, so I don't know the exact dimensions of my array until I have read the file (if required, I can prepare a reproducible example).

Thaddy

  • Hero Member
  • *****
  • Posts: 3675
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #5 on: February 17, 2017, 10:59:19 am »
As I wrote... Any array... But please post some code and I'll show you how. Dynamic arrays are a bit more difficult, though, because of reallocation with SetLength.
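
One reason dynamic arrays are harder: a dynamic "array of array of Double" is an array of references to separately allocated rows, so the rows are not guaranteed to sit next to each other in memory and the absolute overlay does not apply. A cautious sketch that copies the rows into one flat dynamic array instead (the type names, the Flatten helper, and the Source constant are illustrative assumptions, not part of the math unit):

```pascal
program FlattenDynamic;

{$mode objfpc}{$H+}

uses
  Math;

type
  TDoubleArray = array of Double;         // hypothetical helper type
  TDoubleMatrix = array of TDoubleArray;  // hypothetical helper type

const
  // Stand-in for data that would normally come from a file.
  Source: array [0..1, 0..4] of Double =
    ((0.5, 2.3, 6.2, 7.2, 1.0), (2.5, 4.3, 3.2, 1.2, 1.0));

// Copies every row of a (possibly ragged) dynamic 2D array into one 1D array.
function Flatten(const m: TDoubleMatrix): TDoubleArray;
var
  r, c, n: Integer;
begin
  // First count the elements, then allocate once.
  n := 0;
  for r := 0 to High(m) do
    Inc(n, Length(m[r]));
  SetLength(Result, n);
  n := 0;
  for r := 0 to High(m) do
    for c := 0 to High(m[r]) do
    begin
      Result[n] := m[r][c];
      Inc(n);
    end;
end;

var
  m: TDoubleMatrix;
  flat: TDoubleArray;
  i, j: Integer;
begin
  // Dimensions are chosen at run time.
  SetLength(m, 2, 5);
  for i := 0 to 1 do
    for j := 0 to 4 do
      m[i][j] := Source[i, j];

  flat := Flatten(m);
  WriteLn('Mean: ':10, Mean(flat):5:5);
  WriteLn('Variance: ':10, Variance(flat):5:5);
  WriteLn('Sum: ':10, Sum(flat):5:5);
end.
```

The copy costs one extra pass and extra memory, but it keeps the reading code and the statistics code independent of each other.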
« Last Edit: February 17, 2017, 11:08:59 am by Thaddy »

matandked

  • New member
  • *
  • Posts: 14
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #6 on: February 22, 2017, 08:31:50 am »
Again, sorry for the late response.

I implemented a function 'ReadSpreadsheetRange' (which returns a spreadsheet range of cells as a dynamic array) in a different topic (my last response there contains the full code with a sample file): http://forum.lazarus.freepascal.org/index.php/topic,35711.0.html

(If needed, I can copy that code here.)

I wish to calculate statistics for the dynamic array returned by the function 'ReadSpreadsheetRange'.

Thank you in advance!!

wp

  • Hero Member
  • *****
  • Posts: 3534
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #7 on: February 22, 2017, 09:58:53 am »
Why so complicated? Why first iterate through all cells, store the values in an array and then iterate through the array again to calculate statistics? This is a waste of resources. The worksheet already is something like a general array.

You already have a procedure to iterate through the spreadsheet ("ReadSpreadsheetRange"). Instead of copying the numbers to an array, calculate the sum (s) and count the values (n) while running through the cells. Now you know the mean = s/n. Then iterate through the spreadsheet a second time and calculate the sum (s2) of the squares of (x - mean), where x is the value found in the current cell. Finally you get the variance as s2 / (n-1) and the standard deviation as sqrt(s2 / (n-1)).
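
Written out, the two passes described here amount to the following formulas, using the same names n, s and s2. The numbers plugged in afterwards are the five values of simpleArray from the first post, and they agree with the output reported there:

```latex
\begin{align*}
  s        &= \sum_{i=1}^{n} x_i, &
  \bar{x}  &= \frac{s}{n}, \\
  s_2      &= \sum_{i=1}^{n} (x_i - \bar{x})^2, &
  \text{variance} &= \frac{s_2}{n-1}, \qquad
  \text{std.\ deviation} = \sqrt{\frac{s_2}{n-1}}
\end{align*}

% Worked example: x = (0.5, 2.3, 6.2, 7.2, 1.0), n = 5
\begin{align*}
  s       &= 17.2, \qquad \bar{x} = 17.2 / 5 = 3.44, \\
  s_2     &= 8.6436 + 1.2996 + 7.6176 + 14.1376 + 5.9536 = 37.652, \\
  \text{variance} &= 37.652 / 4 = 9.413, \qquad
  \text{std.\ deviation} = \sqrt{9.413} \approx 3.068
\end{align*}
```

Dividing by n-1 rather than n gives the sample variance, which is also what Variance in the math unit returned for this array.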
Lazarus trunk / fpc 3.0.0 / Win32

matandked

  • New member
  • *
  • Posts: 14
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #8 on: February 22, 2017, 10:39:45 am »
Quote from: wp
Why so complicated? Why first iterate through all cells, store values in an array and then iterate through the array again to calculate statistics? This is a waste of resources. The worksheet already is something like a general array.

You're correct, it's not optimal in terms of computational efficiency. However, I wish to keep the functions for reading a worksheet range and for calculating statistics separate, so that I can reuse them in different parts of my code and the code stays more readable.
I'm afraid that if I do everything in one big function, it will be less readable. I can call one function from another, but I can imagine that sometimes I will only need to read an array, not calculate statistics.

wp

  • Hero Member
  • *****
  • Posts: 3534
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #9 on: February 22, 2017, 11:42:11 am »
Although I don't agree - anyway: Do you know how to iterate through a 2D array? If yes, then do what I wrote above while running through the array instead of running through the worksheet. If not, read some basic texts on "for" loops in Pascal.
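
Applied to a dynamic 2D array, the two passes could look roughly like this. It is a minimal sketch: the type TDoubleMatrix, the function MatrixStats, and the Source constant are hypothetical names chosen for illustration, and the function reports failure when fewer than two values are present so that the division by n-1 is always valid.

```pascal
program TwoPassStats;

{$mode objfpc}{$H+}

type
  TDoubleMatrix = array of array of Double;  // hypothetical helper type

const
  // Stand-in for values read from a worksheet range.
  Source: array [0..1, 0..4] of Double =
    ((0.5, 2.3, 6.2, 7.2, 1.0), (2.5, 4.3, 3.2, 1.2, 1.0));

// Pass 1 computes the sum and count, pass 2 the squared deviations.
// Returns False if the matrix holds fewer than two values.
function MatrixStats(const m: TDoubleMatrix;
  out mean, variance: Double): Boolean;
var
  r, c, n: Integer;
  s, s2: Double;
begin
  mean := 0;
  variance := 0;
  n := 0;
  s := 0;
  for r := 0 to High(m) do
    for c := 0 to High(m[r]) do
    begin
      s := s + m[r][c];
      Inc(n);
    end;
  Result := n > 1;
  if not Result then
    Exit;
  mean := s / n;
  s2 := 0;
  for r := 0 to High(m) do
    for c := 0 to High(m[r]) do
      s2 := s2 + Sqr(m[r][c] - mean);
  variance := s2 / (n - 1);  // sample variance
end;

var
  m: TDoubleMatrix;
  i, j: Integer;
  avg, v: Double;
begin
  SetLength(m, 2, 5);
  for i := 0 to 1 do
    for j := 0 to 4 do
      m[i][j] := Source[i, j];

  if MatrixStats(m, avg, v) then
  begin
    WriteLn('Mean: ':16, avg:5:5);
    WriteLn('Variance: ':16, v:5:5);
    WriteLn('Std. deviation: ':16, Sqrt(v):5:5);
  end
  else
    WriteLn('Not enough values');
end.
```

Because the statistics routine only takes the array as input, the function that reads the worksheet range can stay separate and be reused on its own.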

matandked

  • New member
  • *
  • Posts: 14
Re: Calculate statistics (mean, variance) on multidimensional arrays
« Reply #10 on: February 22, 2017, 12:38:04 pm »
Yes, I know.

 
