issues with fpc numlib

wp

Hero Member
Posts: 11916


« on: October 02, 2011, 06:40:26 pm »

While trying to write a curve fitting series for TAChart using the polynomial fitting routine in FPC's numlib (unit ipf, procedure ipfpol), I ran into several errors:

- There is a typo in the documenting comment above the ipfpol procedure. It says "Calculate n-degree polynomial b for dataset (x,y) with n elements using the least squares method."; this should read "... with m elements...".

- There seems to be a memory allocation issue somewhere within ipfpol. When fitting an n-degree polynomial, it should be sufficient to allocate n+1 elements for the parameter array, since there are n+1 fitting parameters. However, the program crashes in this situation. With n+2 elements allocated, the program runs well.

- The fit results are not correct when there is quite a large difference between the input data and the fitted data. This can be verified by comparison with, e.g., gnuplot.

The attachment contains a short project demonstrating the last issue. The project creates three test data sets and tries to fit a straight line (y = a + bx) to each of them. Only the straight-line input data set is fitted well; the other two (parabola and exponential) are fitted with an incorrect axis intersection (in comparison with gnuplot).
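For a straight line, the least-squares solution has a closed form, so any library result can be cross-checked independently of both numlib and gnuplot. The following is a minimal Python sketch of that check, not the code from the attached project: the function name `fit_line` and the test data are made up for illustration.

```python
def fit_line(xs, ys):
    """Least-squares straight line y = a + b*x; returns (a, b)."""
    m = len(xs)
    if m < 2 or m != len(ys):
        raise ValueError("need at least two points and equally long inputs")
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # axis intersection
    return a, b


# Hypothetical test data (not the data from the attachment).
xs = [float(i) for i in range(11)]
line = [1.0 + 2.0 * x for x in xs]
parabola = [x * x for x in xs]

a_line, b_line = fit_line(xs, line)      # approximately (1.0, 2.0)
a_par, b_par = fit_line(xs, parabola)    # approximately (-15.0, 10.0)
print(a_line, b_line, a_par, b_par)
```

If a fitting routine returns a clearly different axis intersection for the same data, either the routine or the way it is called deserves a closer look.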

Please note that the button "Call gnuplot" works only if gnuplot is installed. I am on 64-bit Windows and have gnuplot in "C:\Program Files (x86)\gnuplot\binary"; in other cases, modify the gnuplot path in the procedure "ExecFit" accordingly.

Is there anybody out there who knows more about the internals of numlib than I do and who could fix these issues? I will also be posting this report in the Lazarus Bug Tracker.

numlib fitting error.zip (5.8 kB - downloaded 165 times.)


marcov

Administrator
Hero Member
Posts: 11451
FPC developer.

Re: issues with fpc numlib

« Reply #1 on: October 02, 2011, 09:21:23 pm »

I think it is not correct to compare something like numlib's ipfpol to gnuplot.

Programs like gnuplot usually take more time to characterize and scan the input, and select the algorithm appropriately.

"ipfpol" is just one of those algorithms.


wp

Hero Member
Posts: 11916

Re: issues with fpc numlib

« Reply #2 on: October 02, 2011, 09:30:49 pm »

Sure, ipfpol is very limited compared to gnuplot. But the demo requires just a simple linear fit. If that did not work, ipfpol would be useless.

« Last Edit: October 02, 2011, 09:35:32 pm by wp »


marcov

Administrator
Hero Member
Posts: 11451
FPC developer.

Re: issues with fpc numlib

« Reply #3 on: October 02, 2011, 10:10:06 pm »

Quote from: wp on October 02, 2011, 09:30:49 pm

Sure, ipfpol is very limited compared to gnuplot. But the demo requires just a simple linear fit. If that did not work, ipfpol would be useless.

Least squares is typically quite sensitive to outliers. It is usually used as a primitive to build more complex stuff on.

E.g. in Delphi code (using tpmath) I iterate several times, removing outliers with extreme relative residuals in each iteration.
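The iteration described above could look roughly like the following Python sketch. It is not the Delphi/tpmath code referred to: the names `fit_line` and `robust_fit_line`, the threshold `max_rel`, the choice to drop only the single worst point per pass, and the definition of the relative residual as |y - fit| / |fit| are all assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Plain least-squares straight line y = a + b*x; returns (a, b)."""
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b


def robust_fit_line(xs, ys, max_rel=0.5, max_iter=10):
    """Refit repeatedly, dropping the worst point while it exceeds max_rel."""
    pts = list(zip(xs, ys))
    a, b = fit_line([p[0] for p in pts], [p[1] for p in pts])
    for _ in range(max_iter):
        if len(pts) <= 2:
            break  # nothing left to remove for a two-parameter fit

        def rel_residual(p):
            fitted = a + b * p[0]
            # Guard against division by zero when the fitted value vanishes.
            return abs(p[1] - fitted) / max(abs(fitted), 1e-12)

        worst = max(pts, key=rel_residual)
        if rel_residual(worst) <= max_rel:
            break  # no remaining point counts as an outlier
        pts.remove(worst)
        a, b = fit_line([p[0] for p in pts], [p[1] for p in pts])
    return a, b, pts


# Hypothetical data: y = 1 + 2x with one gross outlier at x = 5.
xs = [float(i) for i in range(1, 11)]
ys = [1.0 + 2.0 * x for x in xs]
ys[4] = 50.0

a, b, kept = robust_fit_line(xs, ys)
print(a, b, len(kept))
```

Removing one point per pass is the cautious variant; dropping every point above the threshold at once converges faster but can discard good data when a single outlier has tilted the first fit.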

Meanwhile I checked the out-of-bounds array access. I think I know where it happens, but it is not easily corrected without reverse engineering the code.

If you can read Dutch, here are some unprocessed scans of the docs:

http://www.stack.nl/~marcov/numlib/


wp

Hero Member
Posts: 11916

Re: issues with fpc numlib

« Reply #4 on: October 02, 2011, 10:35:05 pm »

Thank you for the docs. I can't read Dutch, but as a German I can at least get an impression of what's going on.

Quote

removing outliers with extreme relative residuals in each iteration

I didn't do that, and I think gnuplot didn't either. So I still don't see why the two programs get different results. Both are just trying to minimize the sum of squared residuals. But just by looking at the result of the ipfpol fit one can see that it is not at the minimum of the sum of squared residuals (see "ipfpol-result.png"): the fitted line is almost always on one side of the input data, in contrast to the gnuplot fit ("gnuplot-result.png"), where the straight line goes right through the data.
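The "one side of the data" argument can be made precise for a straight line with a free axis intersection. The short derivation below uses m for the number of data points and introduces r_i for the residuals; both symbols are notation chosen here, not taken from numlib.

```latex
S(a, b) = \sum_{i=1}^{m} \left( y_i - a - b x_i \right)^2 ,
\qquad
\frac{\partial S}{\partial a}
  = -2 \sum_{i=1}^{m} \left( y_i - a - b x_i \right) = 0
\;\Longrightarrow\;
\sum_{i=1}^{m} r_i = 0 ,
\quad r_i = y_i - a - b x_i .
```

At the minimum the residuals sum to zero, so they cannot all have the same sign unless they are all zero. A line with every point on one side is therefore certainly not the least-squares line; a line with almost every point on one side would need a few large residuals of the opposite sign to balance the rest.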

Quote

Meanwhile I checked the out-of-bounds array access. I think I know where it happens

Give me a hint, I'll try to debug it myself.

ipfpol-result.png (39.94 kB, 693x499 - viewed 333 times.)

gnuplot-result.png (24.83 kB, 570x390 - viewed 359 times.)

« Last Edit: October 03, 2011, 01:18:14 am by wp »


Ask

Global Moderator
Hero Member
Posts: 687

Re: issues with fpc numlib

« Reply #5 on: October 03, 2011, 05:24:54 am »

I agree that numlib has many problems and is in need of serious overhaul.
I have decided to use it in TAChart since it is already bundled with FPC,
so I thought that, even if imperfect, it is still better than importing yet another library.

I'd like to see numlib improved, but have no time to do that now.
As a rough plan, first I'd recommend getting rid of the archaic manual memory allocation
and using dynamic arrays everywhere. This will allow you to use range checks.
Second, I would work on the naming. I am not sure how much code is using numlib now, but I guess very little. So I'd recommend mass-renaming of units and procedures to get away from the 1960-era FORTRAN standard.
Third, the code should be brought to a consistent style.
After that, the real work may begin on improving and fixing the code.

If you modify numlib, note that TAChart includes a modified copy to cater for
Lazarus/FPC version mismatch. I plan to synchronize copies after each FPC release.

So you'll have to send your changes both to the FPC developers and to me; sorry for the inconvenience.


wp

Hero Member
Posts: 11916

Re: issues with fpc numlib

« Reply #6 on: October 03, 2011, 12:17:47 pm »

This looks like re-inventing the wheel. How about adding another numerical library to fpc, such as tpmath? This package (under LGPL) is very complete, much superior to numlib regarding curve fitting, and still actively developed.


marcov

Administrator
Hero Member
Posts: 11451
FPC developer.

Re: issues with fpc numlib

« Reply #7 on: October 03, 2011, 12:52:16 pm »

Quote from: wp on October 03, 2011, 12:17:47 pm

This looks like re-inventing the wheel. How about adding another numerical library to fpc, such as tpmath? This package (under LGPL) is very complete, much superior to numlib regarding curve fitting, and still actively developed.

numlib is a large body of routines, not just curve fitting. Btw, the memory allocation is not even manual enough for me.


Ask

Global Moderator
Hero Member
Posts: 687

Re: issues with fpc numlib

« Reply #8 on: October 03, 2011, 01:13:16 pm »

Quote from: wp on October 03, 2011, 12:17:47 pm

This looks like re-inventing the wheel. How about adding another numerical library to fpc, such as tpmath? This package (under LGPL) is very complete, much superior to numlib regarding curve fitting, and still actively developed.

I would certainly support getting a better math library,
and TPMath (actually, I think the DMath variant is more fitting) looks like a decent candidate.
However, there are some hurdles:
1) DMath should be a strict superset of numlib. For example, from a quick look, I do not see any spline-related functions in DMath -- maybe I just missed them?
2) DMath contains a lot of auxiliary code, such as string and BGI graphics routines, which IMO should not be added to FPC.
3) Finally, you have to convince FPC developers, not me, which is very hard.

So until the above happens, we are stuck with numlib.


wp

Hero Member
Posts: 11916

[Solved] issues with fpc numlib

« Reply #9 on: October 03, 2011, 11:27:37 pm »

I found the problem: no problem! I had called the ipfpol procedure with the understanding that the parameter n would be the number of fitting parameters, but as written in the unit, it is the degree of the polynomial.

After changing that, the array overrun error is gone, and the fit results are correct.
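How such a mix-up can produce both symptoms is sketched below in Python. This is a generic least-squares polynomial fit via the normal equations, not numlib's ipfpol (whose signature is not shown in this thread); the helper name `polyfit` and the test data are hypothetical.

```python
def polyfit(xs, ys, degree):
    """Least-squares polynomial fit.

    Returns degree + 1 coefficients, lowest power first.
    """
    k = degree + 1
    # Normal equations (A^T A) c = A^T y for the Vandermonde matrix A.
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k)] for i in range(k)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(ata[r][col]))
        if ata[pivot][col] == 0:
            raise ValueError("singular normal equations")
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, k):
            factor = ata[row][col] / ata[col][col]
            for j in range(col, k):
                ata[row][j] -= factor * ata[col][j]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * k
    for i in reversed(range(k)):
        tail = sum(ata[i][j] * coeffs[j] for j in range(i + 1, k))
        coeffs[i] = (aty[i] - tail) / ata[i][i]
    return coeffs


# Hypothetical parabola data, to be fitted with a straight line y = a + b*x.
xs = [float(i) for i in range(11)]
ys = [x * x for x in xs]

num_params = 2                             # a and b
right = polyfit(xs, ys, num_params - 1)    # degree 1: about [-15, 10]
wrong = polyfit(xs, ys, num_params)        # degree 2: about [0, 0, 1]
print(right, wrong)
```

Passing the parameter count instead of the degree returns one coefficient more than the caller expects, which would overrun an output array sized for a straight line. Reading only the first two of those coefficients as a and b also gives a different axis intersection than the true straight-line fit. Both effects are consistent with the symptoms reported at the start of the thread, although this sketch cannot confirm what happened inside ipfpol itself.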

« Last Edit: October 03, 2011, 11:33:12 pm by wp »
