I'd love to see some code that shows whether, how, or in what cases, a binary search would speed it up.
Cheers,
VTwin
That's tempting.
I wish I could take you up on that at this time. Though I do know Pascal, I'm a complete newbie in FPC, that's a bit of a problem. Also, I have to finish a C program that is a real pain in the neck (thanks to Visual C++ 2017, which is a wonderful instrument to create complexity where there wasn't any.)
As usual, the more one thinks about a problem, the more potential solutions come to mind. Some may be possible, others have flaws which are discovered during their implementation. That's what "proof of concept" code is good for, to find out if an idea holds water (or better, hold sensible logic) or not.
Here is something that, if it holds water, might be the fastest (and may be even simplest) way of solving the OP's problem. It hinges on the fact that the data values seem to reside in a "reasonable" range, where reasonable means, not too far apart, in this case.
In the OP's example, the values' floor and ceiling are 16 and 22. The values have one decimal value which can be eliminated by a multiplication by 10.
This would allow a third "helper" (call it "match") array to be allocated with a range from 160 to 220 (that's only 60 elements). The array would have elements structured along the lines of:
element = record
{ keeping the Value below is not necessary but it makes the objective obvious and it is a nice aid during debugging }
Value : dword;
CountLeft : integer;
CountRight : integer;
end;
In this representation, the original arrays are thought of as the left and right arrays.
Have one loop walk the "left" array and increase the CountLeft whenever its value "falls" into that slot and do the same thing for the "right" array. It amounts to this, two loops where
Index = Left[i];
Match[Index].Value = Index; { obvious and unnecessary but may be useful during debugging }
Match[Index].CountLeft++;
do the same thing for the Right array but only updating the CountRight.
The Match array now has all the matches and mismatches. The Big O is simply 2n.
A walk of the Match array can now be used to calculate the values based on whether there was a match (CountLeft and CountRight not zero for an entry) or mismatches (one of the counts is zero and the other isn't) entries where both counts are zero are simply ignored.
If the assumptions are valid (a big if obviously) then this method would solve the problem in O(3n). It would probably be hard to find a faster algorithm. It has a few (obvious) downsides and possibly other non obvious ones:
1. Additional memory is needed for the Match array. This could become a real problem if there is a large number of elements in the original array and little to no overlap in their values.
2. When walking the Match array to perform the necessary calculations, it _may_ be necessary to locate the previous and next entries (maybe not but, something to think about during implementation. If the previous and next entries need to be determined then a sentinel at the bottom and top should be added to keep the code from having to consider these two cases as exceptional)
If this is possible, this would solve some inconvenient problems, such as duplicate entries (they would simply have a count greater than one) and identical arrays, as well as, no matching values in either array (all entries would have one of the counts at zero).
It would make processing these occurrences simple and part of the "normal" process (that is, no code to process exceptional cases).
Again, this is just food for thought. It seems like a viable solution that would be difficult to top as far as speed and simplicity.
Cheers!