
Author Topic: Measuring distances between data points  (Read 15584 times)

wp

• Hero Member
• Posts: 7534
Measuring distances between data points
« on: June 21, 2012, 12:11:12 pm »
Recently I had to measure distances between data points in a chart. I decided to create a more general solution and wrote a TDataPointDistanceTool which I am donating to the TAChart community.

This new tool has a lot in common with the TDataPointCrosshairTool and is applied in the same way: add a TChartToolset to the form and link it to the chart, add a TDataPointDistanceTool to the toolset and activate one of the options of the Shift property, e.g. ssLeft. When you left-click in the diagram and drag the mouse you will see a measurement bar from the click position to the current mouse position. By default, the bar remains visible after MouseUp. If you want it to disappear, call Hide in the MouseUp event handler.

These are some properties:
- LockedToData (default true): The measurement bar starts at a data point. When the mouse approaches another data point during dragging, the end point jumps to this point and the distance between the two points is displayed. If the distance to the nearest data point is larger than the GrabRadius, the distance to the current mouse position is displayed instead. If LockedToData is false, the starting point can be anywhere in the diagram. Note, however, that this may cause problems when the chart contains series on several axes that use different transformations, for example a series on a linear axis and another one on a logarithmic axis. In that case the tool has no way to know which axis is to be applied, and it uses those of the first series. Set LockedToData to true to prevent this ambiguity.
- ShowDistance = true displays the distance in axis units next to the measurement bar while dragging.
- ShowEndBar = true displays "stop" bars at the ends of the measurement line
- EndBarLength is the length of the end bar, -1 means that the end bar goes across the entire chart, nice for measuring distances of parallel slanted lines.
- Use the event OnMeasure for example to display the result of the measurement in the statusbar.
- Use the event OnDraw for example for a custom-drawn endbar.
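
To illustrate the snapping rule described for LockedToData and GrabRadius, here is a minimal, self-contained sketch. It is not the tool's code: the types TPt, FindSnapIndex and EndPoint are hypothetical names made up for this example, and plain screen coordinates are assumed.

```pascal
program SnapDemo;
{$mode objfpc}{$H+}

type
  TPt = record
    X, Y: Double;
  end;

// Returns the index of the data point nearest to AMouse if it lies within
// AGrabRadius, or -1 if no point is close enough. (Hypothetical helper.)
function FindSnapIndex(const APoints: array of TPt; const AMouse: TPt;
  AGrabRadius: Double): Integer;
var
  i: Integer;
  d, best: Double;
begin
  Result := -1;
  best := Sqr(AGrabRadius);  // compare squared distances, no Sqrt needed
  for i := 0 to High(APoints) do begin
    d := Sqr(APoints[i].X - AMouse.X) + Sqr(APoints[i].Y - AMouse.Y);
    if d <= best then begin
      best := d;
      Result := i;
    end;
  end;
end;

// End point of the measurement bar: the snapped data point if one is in
// range, otherwise the mouse position itself.
function EndPoint(const APoints: array of TPt; const AMouse: TPt;
  AGrabRadius: Double): TPt;
var
  idx: Integer;
begin
  idx := FindSnapIndex(APoints, AMouse, AGrabRadius);
  if idx >= 0 then
    Result := APoints[idx]
  else
    Result := AMouse;
end;

const
  Data: array[0..2] of TPt = (
    (X: 10; Y: 10), (X: 50; Y: 40), (X: 90; Y: 20));
var
  mouse, e: TPt;
begin
  mouse.X := 52; mouse.Y := 43;   // close to Data[1]: the end point snaps
  e := EndPoint(Data, mouse, 5);
  WriteLn(e.X:0:1, ' ', e.Y:0:1);

  mouse.X := 70; mouse.Y := 70;   // far from every point: stays at the mouse
  e := EndPoint(Data, mouse, 5);
  WriteLn(e.X:0:1, ' ', e.Y:0:1);
end.
```

With a grab radius of 5, the first query ends at the data point (50, 40), the second one at the mouse position (70, 70).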

See the attached demo project for a demonstration.

Since TDatapointDistanceTool is very similar to TDatapointCrosshairTool I separated common code off into a common ancestor TBasicCrosshairTool.

There are also some modifications with the TDatapointCrosshairTool: This tool has a property Position which is in graph units. Since the internal graph units are not so interesting for the user I added a function GetPositionAx which returns the current position of the crosshair in axis units. There is also a new event OnGetPosition which passes the current position in axis units as a parameter. I know this kind of duplicates the OnDraw event, but I always forget that I have to call OnDraw when I want to display the current position in the StatusBar.
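
The difference between graph units and axis units matters most on transformed axes. The sketch below assumes a logarithmic axis whose graph coordinate is the base-10 logarithm of the axis value; that transformation, and the function names AxisToGraph and GraphToAxis as written here, are assumptions for this example and not the TAChart implementation.

```pascal
program AxisUnitsDemo;
{$mode objfpc}{$H+}

uses
  Math;

// Assumed transformation of a logarithmic axis: graph = log10(axis).
function AxisToGraph(AAxisValue: Double): Double;
begin
  Result := Log10(AAxisValue);
end;

// Inverse transformation: axis = 10^graph.
function GraphToAxis(AGraphValue: Double): Double;
begin
  Result := Power(10.0, AGraphValue);
end;

var
  graphPos: Double;
begin
  // A crosshair sitting at the axis value 1000 has the internal position 3.
  graphPos := AxisToGraph(1000.0);
  WriteLn('graph units: ', graphPos:0:3);
  WriteLn('axis units:  ', GraphToAxis(graphPos):0:3);
end.
```

The user reads 1000 on the axis, while the internal position is 3, which is why a position reported in axis units is more convenient for a status bar.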

Finally, currently there is no way to hide the crosshair or measurement bar programmatically after the tool has been deactivated. Calling Hide for example as a reaction on a ButtonClick crashes the program.
Mainly Lazarus trunk / fpc 3.2.0 / all 32-bit on Win-10, but many more...

wp

• Hero Member
• Posts: 7534
Re: Measuring distances between data points
« Reply #1 on: June 28, 2012, 09:21:39 pm »
Well - looks like the DistanceTool does not make it into TAChart. Anyway, that's fine. I'll refactor the code for me such that the tool can be used outside the package...

• Global Moderator
• Hero Member
• Posts: 687
Re: Measuring distances between data points
« Reply #2 on: July 01, 2012, 09:05:26 am »
Do not lose hope just yet
Sorry for the slow reaction -- I was overwhelmingly busy the last two weeks, and did not have time to properly study your work.

So far, I have taken only a single line from it -- typo fix in the procedure name.

One reason for the delay is that your patch is quite big and contains many additions.
Most of them are good, but there are a few things I'd like to discuss or change.
It would be somewhat easier if you'd submit your work in smaller chunks,
perhaps as a patch series.

Nevertheless, the feature is good in principle, and I shall split your submission into 2-3 commits, but it will take some time.

Quote
I'll refactor the code for me such that the tool can be used outside the package

That might actually help -- if you have already done that, please submit refactored version --
it will be easier for me to extract changes from it.

wp

• Hero Member
• Posts: 7534
Re: Measuring distances between data points
« Reply #3 on: July 01, 2012, 05:39:07 pm »
Sorry for being impatient.

Here is the refactored version of component and demo. The DataPointDistanceTool now is in its own unit, it can be tested without installation into the TAChart package. Just compile the demo.

If you think the component is fine for TAChart, here is the first patch. It extracts the code useful for TDataPointDistanceTool from TDataPointCrosshairTool to a common ancestor TBasicCrosshairTool. In particular it makes DoDraw, DoHide and Hide virtual, renames FCrosshairPen to FPen and copies some useful code to the ancestor.

I don't know how to create the next patch without committing my version to the repository. Anyway, the next patch will add the code for TDatapointDistanceTool, and the third patch will add some - maybe unnecessary - refinements for TDataPointCrosshairTool like conversion of Position to axis coordinates, and the OnMeasure event.

wp

• Hero Member
• Posts: 7534
Re: Measuring distances between data points
« Reply #4 on: July 06, 2012, 12:50:56 am »
Thank you for committing the first patch to svn. So here's the second one. It just contains the new TDatapointDistanceTool code. Please note that it differs a bit from the previous posting -- I added an OnGetDistanceText event handler since I needed formatting of the distance text as a time. And there is some refactoring of the FStartPos, FEndPos, FStartSeries, FEndSeries etc. I decided to use the word "Position" for graph units (as it is done in TDatapointCrosshairTool) and "Point" for axis units.

I did not want to touch other units of the project, but I feel that the auxiliary procedures "DrawXORText" and "FindFirstCustomChartSeries" should better be in other units.

wp

• Hero Member
• Posts: 7534
Re: Measuring distances between data points
« Reply #5 on: August 03, 2012, 04:11:37 pm »
Sorry for bringing up this thread again: could it be that the patch in the previous posting has been overlooked and forgotten?

• Global Moderator
• Hero Member
• Posts: 687
Re: Measuring distances between data points
« Reply #6 on: August 22, 2012, 05:42:56 pm »
I have committed some parts of your patch with modifications in r38323 - r38335.
The reason it took so long is primarily because I experimented with various approaches, and did not want to commit early experimental interfaces, to avoid an unnecessary compatibility burden.
If you look at the commits, I hope you can see that the difference is rather substantial.
Mainly I tried to use existing TAChart mechanisms instead of re-implementing them:
IsActive instead of custom IsValid, calls to Handled, etc.
I did like the direction you took with "Anchors", but I think you did not go far enough -- so I have introduced new TPointRef mechanism which encapsulates information about a series point. I plan to extend this mechanism to other data point tools.
I have also implemented a single Distance function with measurement units controlled by an argument. I have already used TChartUnits in one other place, and plan to do this more.
Note that there are still some rough edges and some missing features compared to your patch -- notably "infinite" end bars and advanced label positioning.
I do plan to implement them -- but have not yet found an optimal API.
Anyway, the resulting tool can get quite fancy, especially in normal drawing mode

Sorry for the inconvenience for your projects due to incompatibility.
To reduce migration problems, you can rename (with Refactoring/Rename identifier) your version of the tool prior to upgrading Lazarus. I tried that myself and was able to use mine and your versions side-by-side.

wp

• Hero Member
• Posts: 7534
Re: Measuring distances between data points
« Reply #7 on: August 22, 2012, 09:35:23 pm »
Thank you. I am impressed by your ideas to improve the tool.

Sorry for being impatient, but there is quite some activity in the TAChart board now, and I was fearing that the posting might get buried underneath newer ones.

One thing came to my mind after submitting the original posting: the default value of MeasureMode should not be cdmXY, but cdmOnlyX or cdmOnlyY. cdmXY assumes that the quantities on x and y axes have the same units, and this is a very rare case.
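
The point about mixed units can be made concrete with a small calculation. This is a self-contained sketch only: TDemoMeasureMode is a local type declared for the example (its value names mirror the ones used in this thread), and Measure is a hypothetical helper, not the tool's method.

```pascal
program MeasureModeDemo;
{$mode objfpc}{$H+}

type
  // Local redeclaration for this demo; only the value names follow the thread.
  TDemoMeasureMode = (cdmXY, cdmOnlyX, cdmOnlyY);

// Distance between (X1, Y1) and (X2, Y2) according to the measure mode.
function Measure(AMode: TDemoMeasureMode; X1, Y1, X2, Y2: Double): Double;
begin
  Result := 0;
  case AMode of
    cdmXY:    Result := Sqrt(Sqr(X2 - X1) + Sqr(Y2 - Y1));
    cdmOnlyX: Result := Abs(X2 - X1);
    cdmOnlyY: Result := Abs(Y2 - Y1);
  end;
end;

begin
  // x in seconds, y in volts: from (0 s, 1 V) to (3 s, 5 V)
  WriteLn('cdmXY:    ', Measure(cdmXY,    0, 1, 3, 5):0:2, ' (seconds mixed with volts)');
  WriteLn('cdmOnlyX: ', Measure(cdmOnlyX, 0, 1, 3, 5):0:2, ' s');
  WriteLn('cdmOnlyY: ', Measure(cdmOnlyY, 0, 1, 3, 5):0:2, ' V');
end.
```

The cdmXY result of 5.00 has no physical unit here, whereas 3.00 s and 4.00 V are directly meaningful.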

When writing the original version of the component some other ideas for tools came to my mind:
• The position of the DataPointCrosshairtool and DistanceTool cannot be changed once the tool operation is finished. I often want to reposition the endpoints of the distance tool to get better accuracy. So far, the only option is to do it again.
• a labelling tool which allows to place some text at an arbitrary location in the chart area.
• a point labelling tool which allows to attach some text to a specific data point, in addition to the Marks. For example, to indicate the maximum of a curve and its coordinates. It would be nice to have a line, an arrow, etc from the label to the data point.

• Global Moderator
• Hero Member
• Posts: 687
Re: Measuring distances between data points
« Reply #8 on: August 30, 2012, 12:10:59 pm »
I have made some progress on measuring tool -- I think it is good enough to be used.
Please test and tell if something is missing, especially compared to your version.

The text below is quite long -- feel free to answer partially and/or create new topics for the parts which interest you.

Quote
The position of the DataPointCrosshairtool and DistanceTool cannot be changed once the tool operation is finished. I often want to reposition the endpoints of the distance tool to get better accuracy.
I do not think it is a problem for crosshair tool -- since there is only one point,
there is no difference between "editing" and "recreating" its position.
As for the distance tool, yes, I see that editing may be useful -- however this is another case of "easy to implement, but difficult to design".
I have one idea, which is good and generic, but perhaps an overkill:
1) Add "TDistanceSeries", which would display its data as a set of "distance measurements", similar to the way it is done in the current distance tool.
2) Use the existing TDragDropTool to edit the endpoints.
3) Add properties to the distance tool which would indicate the distance series and
The advantages of this plan are:
1) It will allow multiple simultaneous measurements, optionally with 'multi-hop' mode where the consecutive measurements are made along segments of polyline.
2) The current option of "permanent" distance is kind of a hack. The whole mechanism of tool drawing supposes that tools are transient -- visible only while active. Using the series for permanent data is architecturally better.
3) Editing, as said above.
4) Distance series may double as a "vector plot series" (see e.g. http://www.sharpplot.com/Vectors.htm), or at least share code with it.

The disadvantages are:
1) Xor mode will be quite tricky to implement -- but I think manageable.
2) The setup will be somewhat complicated for the user. I hope I can find a way to offer the current 'single-measure' functionality by default, with only 'multi-measure' and 'editing' requiring a series, but I have not yet thought it through.

Quote
a labelling tool which allows to place some text at an arbitrary location in the chart area.
This requires the definition of 'arbitrary location' first. For a long time, I have had the following design in mind:
1) Introduce TChartUnits type to select measurement units for coordinates and sizes -- already done.
2) Use TChartUnits in various places -- slowly started, but there are compatibility problems -- for example, "BarWidth" and "BarOffset" could probably use TChartUnits.
3) Add "TChartShape" class which is a basic class for an arbitrary shape with coordinates measured in TChartUnits. This will give user a large degree of control over shape positioning during the resizing/zooming/panning of the chart. See "Position" page of the axis demo for an example.
4) Add various descendants, the simplest being TRectChartShape, which would give you the label in a box.
However, there is a problem with the API: for each coordinate property, there must be corresponding "units" property. This may get quite cumbersome:
Top, TopUnits, Left, LeftUnits, Width, WidthUnits, Height, HeightUnits, etc.
Additionally, there may be a set of corresponding "UseXX" properties -- for example,
Width/UseWidth and Right/UseRight
I was planning to investigate the possibility to create a special type and/or a special OI editor to combine those 2 or 3 properties, but did not have the time yet.
Suggestions in this area are greatly appreciated.
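
One possible shape for such a combined type, purely as a discussion aid: a record that bundles the value, its units and the "use" flag. Everything here is hypothetical (TDemoUnits, TUnitCoord, MakeCoord, ToPixels), and only pixel and percent units are handled, since axis and graph units would need the chart's transformations.

```pascal
program UnitCoordDemo;
{$mode objfpc}{$H+}

type
  // Hypothetical stand-in for this sketch, not TAChart's TChartUnits.
  TDemoUnits = (duPixel, duPercent);

  // One coordinate that carries its own units and "use" flag, instead of
  // three separate properties such as Width, WidthUnits and UseWidth.
  TUnitCoord = record
    Value: Double;
    Units: TDemoUnits;
    Used: Boolean;
  end;

function MakeCoord(AValue: Double; AUnits: TDemoUnits): TUnitCoord;
begin
  Result.Value := AValue;
  Result.Units := AUnits;
  Result.Used := True;
end;

// Converts a coordinate to pixels, given the chart size along that direction.
function ToPixels(const ACoord: TUnitCoord; AChartSize: Integer): Integer;
begin
  case ACoord.Units of
    duPercent: Result := Round(ACoord.Value * AChartSize / 100);
    else       Result := Round(ACoord.Value);
  end;
end;

var
  boxLeft, boxWidth: TUnitCoord;
begin
  boxLeft := MakeCoord(10, duPercent);   // 10 % of the chart width
  boxWidth := MakeCoord(120, duPixel);   // fixed pixel width
  WriteLn('Left  = ', ToPixels(boxLeft, 800), ' px');
  WriteLn('Width = ', ToPixels(boxWidth, 800), ' px');
end.
```

Whether a record like this can be edited comfortably in the Object Inspector is exactly the open question raised above; the sketch only shows the data side.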

Quote
a point labelling tool which allows to attach some text to a specific data point, in addition to the Marks.
This already can be done -- add new empty point series, use TDataPointClickTool,
add new point with the desired label in the OnClick handler.
Alternatively, simply edit Text field of the pointed data item -- this is simpler,
but obviously will not work for functional series.

I can easily implement TDataPointEditTool which does the above, but there is a problem with the data item editor. In most applications, this editor should have
domain-specific UI. I am not sure that the generic UI would be useful frequently
enough to warrant inclusion in TAChart. And without that UI, TDataPointEditTool is almost equal to TDataPointClickTool.

Quote
a line, an arrow, etc from the label to the data point.
A line already exists -- controlled by Marks.LinePen property.
As for arrows, they are very easy to add. However, I had the idea of implementing callouts (see e.g. http://www.nevron.com/gallery/FullGalleries/chart/Annotations/images/chart-rounded-rectangular-callout.png), which could cause a conflict from the API point of view.
I am not sure about the relative value of having arrows now with a chance of compatibility problems later versus not having arrows for some time with a chance of more complete implementation in the indefinite future. My opinion on this matter may be easily influenced by you

wp

• Hero Member
• Posts: 7534
Re: Measuring distances between data points
« Reply #9 on: August 31, 2012, 12:53:03 am »
Quote
I have made some progress on measuring tool -- I think it is good enough to be used.
I totally agree - very nice and very useful!

Here are some suggestions that came to my mind when converting my project to the new tools version:
• When option dpdoLockToData is set usage of the tool is a bit rugged. After clicking at the start point there is no distance line until the first possible end point is hit. When dragging the mouse somewhere else the distance marker sticks to the previous position until a new point is found. In my version, the distance marker after having locked to the first data point is visible all the time, but jumps to the next anchor when the mouse gets close. I think this behavior would be much smoother and less disturbing to the user.
• Option dpdoRotated (excellent idea!) draws the distance label flipped over when you drag from the right to the left. I think the label should be rotated by 180 degrees when the x coordinate of the startpoint is greater than that of the end point.
• The marks along the distance line are displayed in graph units. Graph units are not interesting to the user of the chart, usually only axis units. I know that I could tweak the label text by means of the OnGetDistanceText event, but it would be more convenient to have at least an option like "dpdoAxisUnits", or - more general - a property for the ChartUnits. If axis units would be activated by default it would be even better.

Quote
Add "TDistanceSeries", which would display its data as a set of "distance measurements", similar to the way it is done in the current distance tool.
Ah - this is a very interesting solution. I clearly see the advantages, and as for the disadvantages, I could live with a missing XOR mode; I usually turn it off anyway because of the flicker, and now the marks with their background are much more beautiful!

Quote
a labelling tool which allows to place some text at an arbitrary location in the chart area.
I am not convinced any more that this is necessary. Your answer to the next item gives me the idea that this could be achieved by a dummy series which accommodates the locations where the texts are. Instead of a full-fledged army of PowerPoint-like shapes etc, it would be enough to give the Marks a shape property (rectangular by default, elliptic, rounded rect).

Quote
Quote
a line, an arrow, etc from the label to the data point.
My opinion on this matter may be easily influenced by you
Don't put too much work into that. The arrow would be more than enough.

Now a question on the chart tools in general: As I understand there are two mutually exclusive ways to use the tools in an application: the first one is to take advantage of the Enabled property, to set up a toolbar with a button for each ChartTool which activates the corresponding tool. This is very clear to the user, but it may soon get uncomfortable to move the mouse up to the toolbar again and again. The second option is to use the Shift property. This is very powerful, you can apply any tool immediately. But with a variety of tools on a form (I'd like to use all of them...) it is difficult to find useful shift combinations - and to remember them.

The ideal ChartTool would allow both ways: to have a toolbar, AND to use the magic shift keys. Do you have any idea how this could be achieved? I think, so far I cannot use both approaches simultaneously because in the toolbar solution all tools must be disabled except for the active one, but then the Shift keys for the disabled tools do not work any more.

[EDIT] Ah - maybe this should work: duplicate each tool. For the first set, use Enabled = false and Shift = [ssLeft]. Assign each tool to a button in the toolbar which sets Enabled to true. In the second set, Enabled = true, but Shift has an individual assignment, ssLeft alone is not used.
« Last Edit: August 31, 2012, 08:57:39 am by wp »

• Global Moderator
• Hero Member
• Posts: 687
Re: Measuring distances between data points
« Reply #10 on: August 31, 2012, 12:18:34 pm »
Quote
When option dpdoLockToData is set usage of the tool is a bit rugged.
Hm. This behavior was my interpretation of "locked" concept -- which IMHO
means "not allowed to point elsewhere".
I still think there is some value in that -- for example,
it is guaranteed that OnMeasure event receives two valid data points.
Also, since axis may be extracted only from the data point, the "axis" distance
may oscillate between actual axis and graph distance depending on the proximity
to the data point.
Nevertheless, I also see the reason for your variant, so I replaced an option
with a tri-state DataPointMode in r38447.

Quote
the label should be rotated by 180 degrees

Quote
Graph units are not interesting to the user of the chart, usually only axis units
Agreed, this was an oversight. Changed default in r38446.
Also added graph distance as a second Format argument for easier access.

I have converted and committed your demo in r38451.

Quote
The ideal ChartTool would allow both ways: to have a toolbar, AND to use the magic shift keys.
This is the intended usage. I do not quite understand what does not work for you.

You should separate your tools into "frequently used" and "seldom used".
First group should be always enabled and assigned different shift combinations.
Second group should be disabled except maybe one selected tool,
and assigned the same shift combination. Both groups can coexist in a single toolset.

For example, in GIS applications zooming and panning are usually always enabled,
while distance measuring and point editing must be selected from toolbar.
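
The arrangement of "always enabled" and "toolbar-selected" tools can be modelled in a few lines. The sketch is self-contained and deliberately independent of TAChart: TKey, TTool, MakeTool and PickTool are hypothetical, and the dispatch rule (the first enabled tool whose Shift matches exactly handles the event) is an assumption made for the illustration.

```pascal
program ToolDispatchDemo;
{$mode objfpc}{$H+}

type
  TKey = (kLeft, kShift, kCtrl);
  TKeys = set of TKey;

  TTool = record
    Name: String;
    Enabled: Boolean;
    Shift: TKeys;
  end;

function MakeTool(const AName: String; AEnabled: Boolean; AShift: TKeys): TTool;
begin
  Result.Name := AName;
  Result.Enabled := AEnabled;
  Result.Shift := AShift;
end;

// Assumed rule: the first enabled tool with a matching Shift gets the event.
function PickTool(const ATools: array of TTool; AShift: TKeys): String;
var
  i: Integer;
begin
  Result := '(none)';
  for i := 0 to High(ATools) do
    if ATools[i].Enabled and (ATools[i].Shift = AShift) then begin
      Result := ATools[i].Name;
      Exit;
    end;
end;

var
  Tools: array[0..3] of TTool;
begin
  // Frequently used: always enabled, each with its own shift combination.
  Tools[0] := MakeTool('Zoom', True, [kLeft, kShift]);
  Tools[1] := MakeTool('Pan', True, [kLeft, kCtrl]);
  // Seldom used: share plain left-click, enabled one at a time by the toolbar.
  Tools[2] := MakeTool('Distance', False, [kLeft]);
  Tools[3] := MakeTool('PointEdit', False, [kLeft]);

  WriteLn(PickTool(Tools, [kLeft, kShift]));  // Zoom
  WriteLn(PickTool(Tools, [kLeft]));          // (none): no toolbar tool selected
  Tools[2].Enabled := True;                   // user clicks the "Distance" button
  WriteLn(PickTool(Tools, [kLeft]));          // Distance
end.
```

Under that assumption both groups coexist in one list without conflicts, because the toolbar group never competes with the shift combinations of the always-enabled group.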

wp

• Hero Member
• Posts: 7534
Re: Measuring distances between data points
« Reply #11 on: August 31, 2012, 02:11:33 pm »
Thank you a lot. The demo runs fine (after removing the unit TADistanceTool from the project). In my application, however, there is no immediate reaction in spite of DataPointMode=dpmSnap. But maybe this is due to my data: there are 9 series with about 3800 data points each; the previous version was working smoothly, though - I'll dig into that...

I'll also investigate into the toolbar option that you mentioned. Maybe I'll write a demo with simplified data on a "perfect" arrangement of tools in cooperation with a toolbar.

• Global Moderator
• Hero Member
• Posts: 687
Re: Measuring distances between data points
« Reply #12 on: August 31, 2012, 03:08:41 pm »
Quote
after removing the unit TADistanceTool from the project).
Oops, fixed in r38458.

Quote
there is no immediate reaction in spite of DataPointMode=dpmSnap.
Simple test of setting RandomCharSource1.PointsNumber=10000 in the demo
did not reveal any problem.

Quote
Maybe I'll write a demo with simplified data on a "perfect" arrangement of tools in cooperation with a toolbar.
If you have time, a tutorial would be even better.

wp

• Hero Member
• Posts: 7534
Re: Measuring distances between data points
« Reply #13 on: September 01, 2012, 02:47:21 pm »
Quote
Simple test of setting RandomCharSource1.PointsNumber=10000 in the demo did not reveal any problem.
I should have been more specific: my data are displayed in a paned view, as discussed in http://www.lazarus.freepascal.org/index.php/topic,16195.15.html, and some of the series mentioned are hidden.

Therefore, the standard distance demo does not show the problem. If you use the attached modified distance demo you will see it, however. I added some more series and random sources with 5000 points each.

You will see the issue when you activate the "Snap" option and drag the mouse around - there is no immediate reaction.

When you hide all of the series except for one or two the ragged behavior of the distance tool does not change.

Therefore, I modified your TDataPointTool.FindNearestPoint by adding a visibility check for the series to reduce the number of GetNearestPoint calls, and the operation gets much smoother in the case of hidden series:

Code:
```pascal
procedure TDataPointTool.FindNearestPoint(APoint: TPoint);
//...
begin
//...
  for s in CustomSeries(FChart, FAffectedSeries.AsBooleans(FChart.SeriesCount)) do
    if s.Active and // wp added: "s.Active and"
      s.GetNearestPoint(p, cur) and PtInRect(FChart.ClipRect, cur.FImg) and
      (cur.FDist < best.FDist)
    then begin
//...
```

Other optimizations would be to do a bounds check whether the queried point is within the extent of the series, before going into the GetNearestPoint calculation. But maybe you are already doing that - I did not check the sources.
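
Both shortcuts (skipping hidden series and rejecting a series by its extent before scanning its points) can be shown in a standalone sketch. It does not use TAChart types: TSeries, AddPoint, OutsideExtent and FindNearest are hypothetical, and the extent is simply kept up to date while points are added.

```pascal
program NearestPointDemo;
{$mode objfpc}{$H+}

type
  TPt = record
    X, Y: Double;
  end;

  TSeries = record
    Active: Boolean;
    MinX, MaxX, MinY, MaxY: Double;  // cached extent of Points
    Points: array of TPt;
  end;

// Appends a point and keeps the cached extent up to date.
procedure AddPoint(var S: TSeries; AX, AY: Double);
var
  n: Integer;
begin
  n := Length(S.Points);
  SetLength(S.Points, n + 1);
  S.Points[n].X := AX;
  S.Points[n].Y := AY;
  if (n = 0) or (AX < S.MinX) then S.MinX := AX;
  if (n = 0) or (AX > S.MaxX) then S.MaxX := AX;
  if (n = 0) or (AY < S.MinY) then S.MinY := AY;
  if (n = 0) or (AY > S.MaxY) then S.MaxY := AY;
end;

// True if Q is farther than R from the extent box, so no point can be in range.
function OutsideExtent(const S: TSeries; const Q: TPt; R: Double): Boolean;
begin
  Result := (Q.X < S.MinX - R) or (Q.X > S.MaxX + R) or
            (Q.Y < S.MinY - R) or (Q.Y > S.MaxY + R);
end;

function FindNearest(const ASeries: array of TSeries; const Q: TPt; R: Double;
  out ASeriesIdx, APointIdx: Integer): Boolean;
var
  i, j: Integer;
  d, best: Double;
begin
  ASeriesIdx := -1;
  APointIdx := -1;
  best := Sqr(R);
  for i := 0 to High(ASeries) do begin
    if not ASeries[i].Active then Continue;            // skip hidden series
    if OutsideExtent(ASeries[i], Q, R) then Continue;  // cheap rejection
    for j := 0 to High(ASeries[i].Points) do begin     // linear scan
      d := Sqr(ASeries[i].Points[j].X - Q.X) + Sqr(ASeries[i].Points[j].Y - Q.Y);
      if d <= best then begin
        best := d;
        ASeriesIdx := i;
        APointIdx := j;
      end;
    end;
  end;
  Result := ASeriesIdx >= 0;
end;

var
  Series: array[0..1] of TSeries;
  q: TPt;
  sIdx, pIdx: Integer;
begin
  Series[0].Active := False;   // hidden, although it contains an exact hit
  AddPoint(Series[0], 5, 5);
  Series[1].Active := True;
  AddPoint(Series[1], 6, 6);
  AddPoint(Series[1], 20, 20);

  q.X := 5; q.Y := 5;
  if FindNearest(Series, q, 3, sIdx, pIdx) then
    WriteLn('series ', sIdx, ', point ', pIdx)
  else
    WriteLn('no point in range');
end.
```

The query reports series 1, point 0: the hidden series is ignored even though it holds a point exactly at the query position. The extent test only pays off if the extent is cached; recomputing it per query would itself be a linear scan.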

Of course these are just workarounds for special cases. The main reason for this speed issue seems to be that GetNearestPoint performs a linear scan. Performance should improve considerably with a binary search. This would require an ordered search index for the x and y values (the ylist as well?). Looks like a huge effort...
« Last Edit: September 01, 2012, 03:22:41 pm by wp »

• Global Moderator
• Hero Member
• Posts: 687
Re: Measuring distances between data points
« Reply #14 on: September 01, 2012, 05:05:44 pm »
Quote
visibility check for the series to reduce the number of GetNearestPoint calls
I see two conflicting arguments here:

On the one hand, finding invisible point as a nearest one is strange from the end-user's point of view, because, e.g. crosshair would jump to an empty space.
In this regard, your change may be seen as a bugfix.
On the other hand, programmer has explicitly specified AffectedSeries to include
this inactive one, so ignoring it might be surprising.
In this regard, your change may be seen as a regression

I think the decision should be based not on the potential optimization
(you can achieve the same effect by removing inactive series from AffectedSeries),
but on the points above. What is your opinion?

As for the general question of the performance improvement:
1) Boundary checking might actually help, but will be heavily dependent on extent caching (because bounds calculation also requires a linear scan), and I foresee some implementation corner cases. Still, I shall (not very soon) investigate in that direction, as it seems to offer the best benefit/cost ratio.
2) I have considered binary search, but it will only help for cdmOnlyX mode. Further, a sorted-by-Y list will only help for cdmOnlyY mode -- and will require additional sorted lists for multi-valued series. In all, too much complexity for little gain.
3) To help general case, a spatial tree is required. This is obviously even more complex, but will solve the problem at the fundamental level. Balance is closer here -- not sure if it is worth the effort.
4) Additionally, I have long considered caching axis and/or graph transformations. This will improve performance in various scenarios -- but, as with all caches, they must be maintained and invalidated correctly, which is a traditional source of bugs.
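
For reference, the binary search from item 2 looks like this in the one case where it applies directly: finding the nearest point by X only, in a series whose X values are sorted in ascending order. This is a self-contained sketch; NearestByX is a hypothetical helper and the sorted input is an assumption.

```pascal
program BinarySearchDemo;
{$mode objfpc}{$H+}

// Returns the index of the element of the ascending array AXs closest to AX,
// or -1 for an empty array. Runs in O(log n) instead of a linear scan.
function NearestByX(const AXs: array of Double; AX: Double): Integer;
var
  left, right, mid: Integer;
begin
  if Length(AXs) = 0 then
    Exit(-1);
  left := 0;
  right := High(AXs);
  // Lower bound: first index whose value is >= AX (or the last index).
  while left < right do begin
    mid := (left + right) div 2;
    if AXs[mid] < AX then
      left := mid + 1
    else
      right := mid;
  end;
  // The nearest value is either at left or directly before it.
  if (left > 0) and (Abs(AXs[left - 1] - AX) <= Abs(AXs[left] - AX)) then
    Result := left - 1
  else
    Result := left;
end;

const
  Xs: array[0..3] of Double = (1, 3, 7, 12);
begin
  WriteLn(NearestByX(Xs, 8));    // 7 is closest, index 2
  WriteLn(NearestByX(Xs, 0));    // before the first element, index 0
  WriteLn(NearestByX(Xs, 100));  // after the last element, index 3
end.
```

As noted in item 2, this does not carry over to the combined X/Y distance: the point nearest in X need not be the nearest point overall, which is what makes a spatial tree necessary for the general case.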