Recent

Author Topic: Importing a string matrix into Lazarus - problem only ONE column  (Read 14766 times)

rvk

  • Hero Member
  • *****
  • Posts: 7044
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #15 on: November 03, 2015, 01:24:36 pm »
There is definitely a bug in TKUStringMatrix. For example try the following file:
Code: Pascal  [Select][+][-]
  1. matrix1 := TKUStringMatrix.Create('Sample compustat4.csv', ',');
Code: [Select]
aa,ba,ca,da,ea,fa,ga,ha
ia,ja,ka,la,ma,na,oa,pa
Code: [Select]
results in:
      |     1 |     2 |     3 |     4 |     5 |     6 |     7 |     8
------+-------+-------+-------+-------+-------+-------+-------+------
    1 | aa | ba | ca | da | ea | fa | ga | ha
    2 | ia | ja | ka | la | ma | na | oa | pa

Import bestanden gelukt
So far so good (although the alignment isn't pretty).

Now take the file:
Code: [Select]
aaaa,baaa,caaa,daaa,eaaa,faaa,gaaa,haaa
iaaa,jaaa,kaaa,laaa,maaa,naaa,oaaa,paaa
results in:
Code: [Select]
      |     1 |     2 |     3 |     4 |     5 |     6 |     7 |     8
------+-------+-------+-------+-------+-------+-------+-------+------
    1 | aaaa | baaa | caaa | daaa | eaaa | faaa | gaaa | haaa
    2 | iaaa | jaaa | kaaa | laaa | maaa | naaa | Import bestanden gelukt

So it doesn't matter if you use , ; or #9. The string import has a serious bug.

(I'm not sure if the import itself has the bug or the Show() has it but there is definitely something not right)

I think it is the Show() that's at fault.
When doing this:
Code: Pascal  [Select][+][-]
  1.     for r := 1 to 2 do
  2.       for c := 1 to 8 do
  3.       writeln(matrix1.GetCell(r,c));
All data is correctly printed.

rvk

  • Hero Member
  • *****
  • Posts: 7044
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #16 on: November 03, 2015, 01:34:22 pm »
Ok, I have found the problem.

In matrix1.Show() the result is build up in a string. It seems that the KUMatrixGeneric is compiled with ShortString as default (with a maximum of 255 characters in a string).

If you add {$H+} after the {$MODE OBJFPC} line it all works  :D

Code: Pascal  [Select][+][-]
  1. unit KUMatrixGeneric;
  2. {$MODE OBJFPC}{$H+}
  3.  

Why this was not done in the first place is really weird %)

This would also be a problem with MatrixInteger and all others. (If the dataset was large enough)

The result is now this:
Code: [Select]
      |     1 |     2 |     3 |     4 |     5 |     6 |     7 |     8 |     9 |    10 |    11 |    12 |    13 |    14 |    15 |    16 |    17 |    18 |    19 |    20 |    21 |    22 |    23 |    24 |    25 |    26 |    27 |    28 |    29 |    30 |    31 |    32 |    33
------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+-------+------
    1 | 1004 | 31/05/1993 | mei/93 | 54594 | 54594|05/1993 | 1992 | INDL | C | D | STD | AIR | AAR CORP | USD | 265,036 | 365,151 |  | 11,8996 | 307,436 | 66,298 | 0,283 | 71,637 |  | 16,806 | 382,78 | A |  | 10911000 |  |  | 13,5 |  | 1 | 5080
    2 | 1004 | 31/05/1994 | mei/94 | 54594 | 54594|05/1994 | 1993 | INDL | C | D | STD | AIR | AAR CORP | USD | 307,723 | 417,626 |  | 11,913 | 328,562 | 115,729 | 9,484 | 67,714 |  | 6,697 | 407,754 | A |  | 6559600 |  |  | 14,375 |  | 4 | 5080
    3 | 1004 | 31/05/1995 | mei/95 | 54594 | 54594|05/1995 | 1994 | INDL | C | D | STD | AIR | AAR CORP | USD | 321,632 | 425,814 |  | 12,35 | 363,196 | 119,766 | 10,463 | 73,14 |  | 15,255 | 451,395 | A |  | 5820000 |  |  | 15,25 |  | 4 | 5080
    4 | 1004 | 31/05/1996 | mei/96 | 54594 | 54594|05/1996 | 1995 | INDL | C | D | STD | AIR | AAR CORP | USD | 338,012 | 437,846 |  | 12,7913 | 404,11 | 118,292 | 16,012 | 79,385 |  | 24,76 | 504,99 | A |  | 7651700 |  |  | 22,125 |  | 4 | 5080
    5 | 1004 | 31/05/1997 | mei/97 | 54594 | 54594|05/1997 | 1996 | INDL | C | D | STD | AIR | AAR CORP | USD | 414,1 | 529,584 |  | 14,7912 | 468,5 | 116,818 | 23,025 | 99,981 |  | 9,531 | 589,328 | A |  | 19734600 |  |  | 31 |  | 1 | 5080
    6 | 1004 | 31/05/1998 | mei/98 | 54594 | 54594|05/1998 | 1997 | INDL | C | D | STD | AIR | AAR CORP | USD | 468,4 | 670,559 |  | 10,8594 | 619,434 | 177,509 | 35,657 | 149,148 |  | 22,823 | 782,123 | A |  | 26812650 |  |  | 26,4375 |  | 1 | 5080
    7 | 1004 | 31/05/1999 | mei/99 | 54594 | 54594|05/1999 | 1998 | INDL | C | D | STD | AIR | AAR CORP | USD | 508,186 | 726,63 |  | 11,9073 | 727,714 | 180,939 | 41,671 | 173,586 |  | 28,525 | 918,036 | A |  | 25589900 | 540,7748 |  | 19,75 |  | 1 | 5080
    8 | 1004 | 31/05/2000 | mei/00 | 54594 | 54594|05/2000 | 1999 | INDL | C | D | STD | AIR | AAR CORP | USD | 511,267 | 740,998 |  | 12,6378 | 833,107 | 180,447 | 35,163 | 163,816 |  | 10,051 | 1024,333 | A |  | 24993300 | 372,7519 |  | 13,875 |  | 1 | 5080
    9 | 1004 | 31/05/2001 | mei/01 | 54594 | 54594|05/2001 | 2000 | INDL | C | D | STD | AIR | AAR CORP | USD | 485,856 | 701,854 |  | 12,6299 | 713,811 | 179,987 | 18,531 | 125,392 |  | 46,093 | 874,255 | A |  | 23351100 | 377,118 |  | 14 |  | 1 | 5080

Import bestanden gelukt
« Last Edit: November 03, 2015, 01:42:45 pm by rvk »

Mickel

  • New Member
  • *
  • Posts: 21
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #17 on: November 03, 2015, 02:48:10 pm »
Wow yes, the {$H+} does the job! Thank you very much!

Mickel

  • New Member
  • *
  • Posts: 21
Optimizing the search function
« Reply #18 on: November 03, 2015, 03:24:56 pm »
Hi,

I was wondering if someone could help me optimize my search function. Currently it goes through 1 million rows every j-iteration. What sort of coding should I add that it goes to the next j after it finds a match? As well is it possible to remove the row (not empty the cell) in matrix2 before going to the next j. Because the cells are unique anyway, this should reduce the time of the next iteration by one row every iteration.

Thank you in advance, hopefully someone knows the answer.

The code:


  for i:=2 to 119533 do
  begin
       for j:=2 to matrix2.NrRows do
        begin

             if matrix1[i,5] = matrix2[j,4] then    // PERMCO | date equal

             begin
        matrix1.setCell(i,31,matrix2[j,15]) ;

             end;
        end;

rvk

  • Hero Member
  • *****
  • Posts: 7044
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #19 on: November 03, 2015, 03:44:56 pm »
You can use the code-button (#) at the top of the editor to set your code in "pretty-print". It's more readable.

Second... instead of running i to 119533 couldn't you better use matrix1.NrRows ? Or are they not the same?

Third...
If it find a match you can use break to jump out of the loop and continue with the next loop of I.

Code: Pascal  [Select][+][-]
  1.   for i := 2 to matrix1.NrRows do
  2.   begin
  3.     for j := 2 to matrix2.NrRows do
  4.     begin
  5.       if matrix1[i, 5] = matrix2[j, 4] then // PERMCO | date equal
  6.       begin
  7.         matrix1.setCell(i, 31, matrix2[j, 15]);
  8.         break; // break out of the current loop; so break out of the j-loop
  9.       end;
  10.     end;
  11.   end;
  12.  

Quote
It is important to note that the Break statement only jumps out of the current loop - not out of any nested loops above it.

For your other question... is column 4 in matrix2 in alphabetical order? In that case you could even make it a lot quicker by not looping though the entire sheet but doing a quick lookup to that cell. (i.e. begin in the middle (NrRows*0.5). If you're above the value then goto a (NrRows*0.25). Otherwise goto (NrRows*0.75) etc.)

I don't see a removeRow in TKUMatrix. I do see a SwapRows so you could first order matrix2 (quicksort) and do your looping after that. In that case (of alphabetical order) you can go two ways... Like I said above... use your own algorithm to search the correct value by cutting your result in half each time or (if your matrix1 is also in order) you can do your looping but your inner loop doesn't have to start at 2 but can start at the number where you last broke out of your j loop because both matrixes are in alphabetical order.

(Not sure which is faster. It would also depend on how large matrix2 is and how fast sorting matrix1 with 119533 rows is.)

Mickel

  • New Member
  • *
  • Posts: 21
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #20 on: November 03, 2015, 04:02:25 pm »
Okay thank you very much! The sorting, break and start at k makes the calculation much quicker. The multiply 0.5 was also a very good idea, but the cells contain " 1647 | dd/mm/yyyy", but the code is now fast enough. Calculation should be done in 30 mins. Thank you!

var i, j, k : Integer;
  begin
  k:=2;
  for i:=2 to 250 do  //119533
  begin
       for j:=k to matrix2.NrRows do
        begin

             if matrix1[i,5] = matrix2[j,4] then    // PERMCO | date equal

             begin
        matrix1.setCell(i,31,matrix2[j,15]); //returns
             matrix1.setCell(i,26,matrix2[j,14]); //shares outstanding
             k:=j;
             break;
             //matrix2.remove

             end;
        end;

rvk

  • Hero Member
  • *****
  • Posts: 7044
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #21 on: November 03, 2015, 04:10:16 pm »
The sorting, break and start at k makes the calculation much quicker.
Are you sure matrix2 is sorted on column 4 AND your matrix1 is sorted on column 5?? Otherwise you shouldn't start at k.

Calculation should be done in 30 mins.
If your matrix2 isn't sorted at column 4 or matrix1 isn't sorted on 5, you could do a SwapRow with the row you find and the last row. Your inner loop probably never reaches the last row because it will always find a correct value before that. So after you find a value swap it to the last row in matrix2 and it should even work faster :)

But this is only necessary if matrix2 is unsorted.

Code: Pascal  [Select][+][-]
  1.   var
  2.     i, j: Integer;
  3.   begin
  4.     k := 2;
  5.     for i := 2 to matrix1.NrRows do // 119533
  6.     begin
  7.       for j := 2 to matrix2.NrRows do
  8.       begin
  9.         if matrix1[i, 5] = matrix2[j, 4] then // PERMCO | date equal
  10.         begin
  11.           matrix1.setCell(i, 31, matrix2[j, 15]); // returns
  12.           matrix1.setCell(i, 26, matrix2[j, 14]); // shares outstanding
  13.           matrix2.SwapRows(j, matrix2.NrRows); // swap current row to the end
  14.           break;
  15.         end;
  16.       end;
  17.     end;
  18.   end;
« Last Edit: November 03, 2015, 04:34:35 pm by rvk »

Mickel

  • New Member
  • *
  • Posts: 21
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #22 on: November 03, 2015, 06:25:05 pm »
Thank you for clarifying, everything is working now. Thank you for your help! :)

Mickel

  • New Member
  • *
  • Posts: 21
Importing RealMatrix not working
« Reply #23 on: November 05, 2015, 04:48:35 pm »
Hi,

I encountered a new problem when trying to import a RealMatrix (image added as an attachment). The issue doesn't occur when I'm using the TKUStringMatrix import. However I need the TKURealMatrix for one of my new function assets. To be specific, I'm using a formula in the second if-function, where I deduct 1 of the date, so I think I need the Real matrix. Hopefully someone knows why it doesn't work.

function assets(matrix1: TKURealMatrix): TKURealMatrix;
  var i, j, k : Integer;
  begin
  i:=0;
  k:=1;

  for i:=1 to 50 do  //119533
  begin
       for j:=k to matrix1.NrRows do
        begin

             if matrix1[i,1] = matrix1[j,1] then    // gck the same
                begin
                if matrix1[j,2] = matrix1[i,2] - 1 then
                   begin
                   writeln('Does it work?');
                   matrix1.SetCell(i,4,matrix1[j,3]);
                   k:=j;
                   end

                else break;
                end

             end;
  end;
         

rvk

  • Hero Member
  • *****
  • Posts: 7044
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #24 on: November 05, 2015, 05:08:45 pm »
First off all it would have been usefull if you included the non-working code.
Now you have an example of reading Compustat v008.csv as stringmatrix. (and you didn't even supply Compustat v008.csv.)
So always attach code relevant to your problem.

Then the Example.csv. I don't think you can read integers (or reals) with ; as delimiter.

Try it yourself.
Code: Pascal  [Select][+][-]
  1. var
  2.   a: integer;
  3. begin
  4.   read(a);
  5.   //
If you input something like 2;3;4;5 it will five you a Runtime error 106.

But even if you change the ; to spaces you end up with an error that 1116,9 isn't recognized.
So you would also need to change the , to .
Then is should work.

(Also note that if you change the ; to spaces you need to make sure there is a space after line 2,3,4 etc. (or add 0) because those lines only have 3 fields while line 1 has 4 fields.)

Another option would be to read the stringmatrix and convert the number yourself to a realmatrix.

Mickel

  • New Member
  • *
  • Posts: 21
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #25 on: November 05, 2015, 05:19:15 pm »
My apologies for being unclear about my problem and chaotic uploading of my code. The compustat v008 was too big to upload. Perhaps the final option is the best one. Because changing the , into . did not work with matrix1 := TKURealMatrix.Create('Compustat v008.csv');    .

How do i convert string to a real? Just make matrix2: TKURealMatrix and run a loop for example:

i:=1 to matrix1.NrRows do
 j:=1 to matrix1.NrCols do
  matrix1(i,j) := matrix2(i,j)

Or won't lazarus allow this procedures?
 

rvk

  • Hero Member
  • *****
  • Posts: 7044
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #26 on: November 05, 2015, 05:29:43 pm »
My apologies for being unclear about my problem and chaotic uploading of my code. The compustat v008 was too big to upload. Perhaps the final option is the best one. Because changing the , into . did not work with matrix1 := TKURealMatrix.Create('Compustat v008.csv');    .

How do i convert string to a real? Just make matrix2: TKURealMatrix and run a loop for example:

i:=1 to matrix1.NrRows do
 j:=1 to matrix1.NrCols do
  matrix1(i,j) := matrix2(i,j)

Or won't lazarus allow this procedures?
You'll need to create the realmatrix to the exact dimension as the stringmatrix (there isn't an emtry create).
So:
Code: Pascal  [Select][+][-]
  1. matrix1 := TKURealMatrix.Create(matrix2.NrRows, matrix2.NrCols);
and for the copying of the values you should use StrToFloat because you can't assign a string to a real.
So:
Code: Pascal  [Select][+][-]
  1.     for i := 1 to matrix1.NrRows do
  2.       for j := 1 to matrix1.NrCols do
  3.         if matrix2.Cell[i, j] <> '' then
  4.           matrix1.Cell[i, j] := StrToFloat(matrix2.Cell[i, j]);
You might want to add a check if you're really dealing with a real-number, because otherwise you could get an exception if it's not a real.

(ps. Don't forget the ';' parameter for the create of the stringmatrix. It wasn't in your example project.)

Mickel

  • New Member
  • *
  • Posts: 21
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #27 on: November 05, 2015, 05:48:48 pm »
Trying to get things to work, however my second function now gives an error. Do I need to make matrix1 a global variable to make the function assets work? As well as procedure seems to not work if i put a command line around the second function. Sorry to bother you with these basic questions :s thank you for helping.

program oplossing;

// Student naam: TYP JE NAAM HIER
// Student nummer: s0123456
// CheckMagicSquare

{$mode objfpc}{$H+}
{$APPTYPE CONSOLE}

uses
  Classes, SysUtils, KUMatrixGeneric;

function change(matrix2: TKUStringMatrix): TKUStringMatrix;
var matrix1: TKURealMatrix;
var i,j: Integer;

begin
  matrix1 := TKURealMatrix.Create(matrix2.NrRows, matrix2.NrCols);

   for i := 1 to matrix1.NrRows do
      for j := 1 to matrix1.NrCols do
        if matrix2.Cell[i, j] <> '' then
          matrix1.Cell[i, j] := StrToFloat(matrix2.Cell[i, j]);
end


function assets(matrix1: TKURealMatrix): TKURealMatrix;
  var i, j, k : Integer;
  begin
  i:=0;
  k:=1;

  for i:=1 to 50 do  //119533
  begin
       for j:=k to matrix1.NrRows do
        begin

             if matrix1[i,1] = matrix1[j,1] then    // gck the same
                begin
                if matrix1[j,2] = matrix1[i,2] - 1 then
                   begin
                   writeln('Does it work?');
                   matrix1.SetCell(i,4,matrix1[j,3]);
                   k:=j;
                   end

                else break;
                end

             end;
  end;

      if i= 40 then
        writeln('40');
     //if i=100 then
       // writeln('100');
     if i= 50000 then
        writeln('halfweg');
     if i= 85000 then
        writeln('bijna daar');


  matrix1.Save('Compustat v009.txt');
  writeln('file saved');


  end;

procedure LinkdProc();
  //import gegevens
  var  matrix2: TKUStringMatrix;
  var matrix1: TKURealMatrix;

  begin
       matrix2 := TKUStringMatrix.Create('Compustat v008.csv',';');

       matrix2.showshort();
       writeln();
       //matrix2.show();

       writeln('Import bestanden gelukt');
       change(matrix2);
       assets(matrix1);

       Readln;
  end;

  begin
  LinkdProc();
  end.                                       
« Last Edit: November 05, 2015, 05:51:22 pm by Mickel »

rvk

  • Hero Member
  • *****
  • Posts: 7044
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #28 on: November 05, 2015, 05:58:44 pm »
What are you doing here?
Code: Pascal  [Select][+][-]
  1. function change(matrix2: TKUStringMatrix): TKUStringMatrix;
  2. var matrix1: TKURealMatrix;
  3. var i,j: Integer;
  4.  
  5. begin
  6.   matrix1 := TKURealMatrix.Create(matrix2.NrRows, matrix2.NrCols);
  7.  
  8.    for i := 1 to matrix1.NrRows do
  9.       for j := 1 to matrix1.NrCols do
  10.         if matrix2.Cell[i, j] <> '' then
  11.           matrix1.Cell[i, j] := StrToFloat(matrix2.Cell[i, j]);
  12. end
  13.  

First... you forgot a ; after the end of the function.
Second... you create matrix1 in this function but don't do anything with it.
You also don't give the function a result.
So you need to add Result := matrix1; otherwise this whole function has no use.
And last... you have this:
Code: Pascal  [Select][+][-]
  1. function change(matrix2: TKUStringMatrix): TKUStringMatrix;
and this:
Code: Pascal  [Select][+][-]
  1. change(matrix2);
The result of change should be TKURealMatrix and in your main-procedure you need to have matrix1 := change(matrix2);

Of course you could also define matrix1 as global variable... but that's a choice. In that case change() wouldn't even need to be a function but can be a procedure. But when your program grows it's best not to use too many global variables.

Mickel

  • New Member
  • *
  • Posts: 21
Re: Importing a string matrix into Lazarus - problem only ONE column
« Reply #29 on: November 05, 2015, 06:32:56 pm »
Okay thank you a lot of helping me understand the coding and learn how to work with it. It now runs, but 4th column is always zero. I will try to find the mistake on myself first.

 

TinyPortal © 2005-2018