Topic: Method call 300 times slower on MacOS than on Windows (GetTextExtent)

Manlio

  • Full Member
  • ***
  • Posts: 104
  • Pascal dev
I ran into this issue when loading long HTML pages into components such as TIpHtmlPanel and THtmlViewer, and I discovered that the same operation (loading a web page) on the Mac was hundreds of times slower than on Windows. (Same CPU)

After a lot of digging, I isolated the "culprit": the GetTextExtent method. It calculates the area that a text string will occupy on a TCanvas. This method is called many times when rendering HTML pages on the screen, to calculate where the lines should break, the height of various elements, etc.

On Windows, the method call looks like this:

    Windows.GetTextExtentPointW(DC, Str, Count, Size);

And on MacOS, it looks like this:

    LclIntf.GetTextExtentPoint32(DC, PChar(s), Length(s), Size);

I wrote a test program that can be compiled on both Windows and MacOS and that will show the difference in performance. The source code is here:

Source code: https://tinyurl.com/win-mac-test

And here is a screenshot with performance times on both Windows and Mac:

Screenshot: https://imgur.com/a/fGLXXDY

How to read the screenshot:

First of all, compare data points C. That's the time (in milliseconds) spent performing generic operations. Note that this time is about the same on both systems, which shows that the Windows and macOS machines run at a similar speed.

Now, data point A shows that the GetTextExtent call was performed 6600 times on both machines. On Windows, this took a total of 1.2+ milliseconds. On MacOS, this took a total of 435.8+ milliseconds. That's more than 300 times slower.

Additional data on the same line shows the fastest time, the average time, and the slowest time. For example, the fastest call time on Windows was 0.000099 msec while on MacOS it was 500 times slower, at 0.05 msec. And so on...
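
For anyone who wants to double-check the ratios, here is a small sketch that simply redoes the arithmetic for data point A. The constants are the rounded figures quoted above (the screenshot shows a few more digits), and the program and variable names are made up for illustration.

```pascal
program RatioCheck;
{$mode objfpc}{$H+}

const
  Calls      = 6600;   // number of GetTextExtent calls (data point A)
  WinTotalMs = 1.2;    // total time on Windows, rounded
  MacTotalMs = 435.8;  // total time on MacOS, rounded

var
  WinAvg, MacAvg, Factor: Double;
begin
  WinAvg := WinTotalMs / Calls;       // average cost of one call on Windows
  MacAvg := MacTotalMs / Calls;       // average cost of one call on MacOS
  Factor := MacTotalMs / WinTotalMs;  // how many times slower MacOS is

  WriteLn('Windows avg per call (ms): ', WinAvg:0:6);
  WriteLn('MacOS avg per call (ms):   ', MacAvg:0:6);
  WriteLn('Slowdown factor:           ', Factor:0:1);
end.
```

With these rounded inputs the factor comes out at roughly 363, which is where the "more than 300 times" figure comes from.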

Data point B shows a different test: The GetTextExtent method is called 1000 times in a row, with the same parameters, and the time is measured. For some mysterious reason, when all these calls are packed together, instead of being done in the middle of some other code, the difference between Windows and Mac is smaller, but even then MacOS is still at least 10 times slower than Windows.

Now, the obvious question: can anything be done to improve the situation?

Opening long HTML pages on the Mac is really slow: pages that open instantly on Windows take several seconds on the Mac.

I assume that GetTextExtent (and its variations) was originally a Windows API call, and that it had to be somehow emulated on other platforms. Some loss of performance may be inevitable. But hundreds of times slower?

Also note this: In data point B, when repeating the same call over and over, with the same parameters, the difference between the slowest and the fastest time is a factor of 10 on Windows, but it's a factor of 150 on MacOS.

I compiled both versions without debugging, etc. And again, data points C show that the performance on both machines is similar, so it's not about CPU speed at all.

So, why this huge difference, and most importantly, is it somehow possible to improve things on the Mac?

Can anyone shed some light, please?

Thank you!
manlio mazzon gmail

trev

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 1062
  • Former Delphi 1-7, 10.2 User
Re: Method call 300 times slower on MacOS than on Windows (GetTextExtent)
« Reply #1 on: September 05, 2020, 02:25:44 pm »
Possibly related... assigning a stringList (containing 4004 strings of between 10 and 15 characters) to a Memo like this:

    Memo.Lines := strList;

took 33 seconds in macOS and 3ms in Windows 10.

    Memo.Lines.Assign(strList);

took 28 seconds, but:

    Memo.Text := strList.Text;

took 1ms, so three times faster than Windows 10.

I don't know whether that gives you any ideas.
o Lazarus v2.1.0 r63871, FPC v3.3.1 r47164, macOS 10.14.6, Xcode 11.3.1
o Lazarus v2.1.0 r64160, FPC v3.3.1 Nov 27 21:16:31, macOS 11.0.1 (aarch64), Xcode 12.2
o Lazarus v2.1.0 r61574, FPC v3.3.1 r42318, FreeBSD 12.1 amd64 (VMware VM)
o Lazarus v2.1.0 r61574, FPC v3.0.4, Ubuntu 20.04 (PD VM)

Manlio

« Reply #2 on: September 06, 2020, 05:47:38 pm »
Quote from: trev
Possibly related... assigning a stringList (containing 4004 strings of between 10 and 15 characters) to a Memo like this:
...
I don't know whether that gives you any ideas.

Thanks for your suggestion. The code in your first example is probably converted into an Assign() call by the compiler, so it is equivalent to the second example. In other words, assigning one object to another with := is the same as calling Assign(). That would explain the similar times of your examples 1 and 2, and it would indicate that calls to Assign() are problematic on MacOS. I am however quite new to FPC and Mac, so I wouldn't know where to start to investigate something like that.

In your case, however, there's a way (third example) to get the work done at a comparable speed.

In my case (GetTextExtent) I don't know of any alternative methods to get the work done.

I also don't know how to look into how GetTextExtent (and its few variations) is implemented for the Mac. If that implementation involves calling Assign(), then there might indeed be a connection with your examples.

But I don't know where to look for it, I don't know how to dig deeper into the implementations of Assign() or GetTextExtent*() methods to figure out what's wrong... so if anyone with a better understanding were able to help, I'd appreciate it very much!

Alextp

  • Hero Member
  • *****
  • Posts: 1153
    • UVviewsoft
« Reply #3 on: September 06, 2020, 06:08:40 pm »
lazarus/lcl/interfaces/cocoa/cocoawinapi.inc
function TCocoaWidgetSet.GetTextExtentPoint(DC: HDC; Str: PChar; Count: Integer; var Size: TSize): Boolean;

Manlio

« Reply #4 on: September 07, 2020, 12:22:19 am »
Quote from: Alextp
lazarus/lcl/interfaces/cocoa/cocoawinapi.inc
function TCocoaWidgetSet.GetTextExtentPoint(DC: HDC; Str: PChar; Count: Integer; var Size: TSize): Boolean;

Thanks!!

I'm now slowly digging into it, I'll update the thread if I make progress.

skalogryz

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 2532
    • havefunsoft.com
« Reply #5 on: September 07, 2020, 06:05:27 am »
r63871 should slightly improve the performance
Patron Cocoa Widgetset development https://www.patreon.com/skalogryz

PascalDragon

  • Hero Member
  • *****
  • Posts: 2424
  • Compiler Developer
« Reply #6 on: September 07, 2020, 09:10:30 am »
Quote from: Manlio
Quote from: trev
Possibly related... assigning a stringList (containing 4004 strings of between 10 and 15 characters) to a Memo like this:
...
I don't know whether that gives you any ideas.

Thanks for your suggestion. The code in your first example is probably converted into an Assign() call by the compiler, so it is equivalent to the second example. In other words, assigning one object to another with := is the same as calling Assign(). That would explain the similar times of your examples 1 and 2, and it would indicate that calls to Assign() are problematic on MacOS. I am however quite new to FPC and Mac, so I wouldn't know where to start to investigate something like that.

It's not the compiler that converts it into an assign, it's simply how the Lines property is implemented for the memo ($lcldir/include/custommemo.inc):

    procedure TCustomMemo.SetLines(const Value: TStrings);
    begin
      if (Value <> nil) then
        FLines.Assign(Value);
    end;

Judging by your observations with GetTextExtent it's probably not the assignment that's the problem, but the resulting invalidation of the memo which might call GetTextExtent itself (or other routines that are “emulated” WinAPI functions).
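
To illustrate the point about the property setter, here is a self-contained console sketch. THolder is a hypothetical stand-in for the memo (it is not LCL code); only its SetLines mirrors the snippet above. It shows that `:=` on such a property copies the contents through Assign, while `:=` on a plain object variable only copies the reference.

```pascal
program SetterAssignDemo;
{$mode objfpc}{$H+}

uses
  Classes;

type
  // Hypothetical class that owns its string list, the way a memo owns FLines.
  THolder = class
  private
    FLines: TStrings;
    procedure SetLines(const Value: TStrings);
  public
    constructor Create;
    destructor Destroy; override;
    property Lines: TStrings read FLines write SetLines;
  end;

constructor THolder.Create;
begin
  inherited Create;
  FLines := TStringList.Create;
end;

destructor THolder.Destroy;
begin
  FLines.Free;
  inherited Destroy;
end;

procedure THolder.SetLines(const Value: TStrings);
begin
  // Same shape as TCustomMemo.SetLines: copy the contents, keep our own list.
  if Value <> nil then
    FLines.Assign(Value);
end;

var
  Holder: THolder;
  Src: TStringList;
  Alias: TStrings;
begin
  Holder := THolder.Create;
  Src := TStringList.Create;
  try
    Src.Add('one');
    Src.Add('two');

    Holder.Lines := Src;  // property write: runs SetLines, which calls Assign
    Alias := Src;         // plain variable: only the reference is copied

    Src.Add('three');     // visible through Alias, not through Holder.Lines

    WriteLn('Holder.Lines.Count = ', Holder.Lines.Count);
    WriteLn('Alias.Count        = ', Alias.Count);
    WriteLn('Holder.Lines is Src: ', Holder.Lines = Src);
  finally
    Src.Free;
    Holder.Free;
  end;
end.
```

So the Assign in the first example comes from how the class author wrote the setter, not from the compiler.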

Alextp

« Reply #7 on: September 07, 2020, 09:43:29 am »
Pls compare speed macOS vs Win32 on new trunk.

Manlio

« Reply #8 on: September 07, 2020, 01:17:18 pm »
Quote from: Alextp
Pls compare speed macOS vs Win32 on new trunk.

I will compare and let you know. (Currently I'm doing other things, but I'll get back to it and report back)

In the meantime, what I found so far:

GetTextExtent is implemented in two steps in CocoaGDIObjects.pas:

    function TCocoaContext.GetTextExtentPoint(AStr: PChar; ACount: Integer; var Size: TSize): Boolean;
    begin
      FText.SetText(AStr, ACount);
      Size := FText.GetSize;
      Result := True;
    end;

Both steps (SetText and GetSize) take time, about 1/3 and 2/3 of the total, respectively.

GetSize in particular spends most of its time in the call to glyphRangeForTextContainer:

    function TCocoaTextLayout.GetSize: TSize;
    var
      Range: NSRange;
      bnds: NSRect;
    begin
      Range := FLayout.glyphRangeForTextContainer(FTextContainer);
      //for text with soft-breaks (#13) the vertical bounds is too high!
      //(feels like it tryes to span it from top to bottom)
      //bnds := FLayout.boundingRectForGlyphRange_inTextContainer(Range, FTextContainer);
      bnds := FLayout.usedRectForTextContainer(FTextContainer);
      Result.cx := Round(bnds.size.width);
      Result.cy := Round(bnds.size.height);
    end;

Even though Range is not used, the call is necessary. Also, glyphRangeForTextContainer seems to be a call to an external library...

NSLayoutManager.inc contains:

    function glyphRangeForTextContainer (container: NSTextContainer): NSRange; message 'glyphRangeForTextContainer:';


... and I don't know how to dig in further than that.
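
Since the cost seems to sit inside the layout call itself, one application-level workaround that might be worth trying is to cache extents per (font, string) pair, so that repeated measurements of the same text skip the slow call. Below is a minimal sketch of that idea only. TExtentCache, TMeasureFunc and FakeMeasure are hypothetical names, the font key is just a string the caller makes up, and FakeMeasure stands in for the real call (for example a wrapper around GetTextExtentPoint32). It assumes an FPC version that ships Generics.Collections (3.2 or later).

```pascal
program ExtentCacheSketch;
{$mode objfpc}{$H+}

uses
  Types, Generics.Collections;

type
  // Hypothetical callback that performs the real (slow) measurement.
  TMeasureFunc = function(const AText: string): TSize;
  TSizeMap = specialize TDictionary<string, TSize>;

  // Remembers sizes already measured for a given font key and text.
  TExtentCache = class
  private
    FMap: TSizeMap;
    FMeasure: TMeasureFunc;
    FMisses: Integer;
  public
    constructor Create(AMeasure: TMeasureFunc);
    destructor Destroy; override;
    function GetExtent(const AFontKey, AText: string): TSize;
    property Misses: Integer read FMisses;  // number of slow calls made
  end;

constructor TExtentCache.Create(AMeasure: TMeasureFunc);
begin
  inherited Create;
  FMeasure := AMeasure;
  FMap := TSizeMap.Create;
end;

destructor TExtentCache.Destroy;
begin
  FMap.Free;
  inherited Destroy;
end;

function TExtentCache.GetExtent(const AFontKey, AText: string): TSize;
var
  Key: string;
begin
  // #1 separates the two parts so different pairs cannot build the same key.
  Key := AFontKey + #1 + AText;
  if not FMap.TryGetValue(Key, Result) then
  begin
    Result := FMeasure(AText);  // slow path, taken once per distinct key
    Inc(FMisses);
    FMap.Add(Key, Result);
  end;
end;

// Stand-in for the real measurement; the numbers are arbitrary.
function FakeMeasure(const AText: string): TSize;
begin
  Result.cx := Length(AText) * 8;
  Result.cy := 16;
end;

var
  Cache: TExtentCache;
  I: Integer;
  Sz: TSize;
begin
  Cache := TExtentCache.Create(@FakeMeasure);
  try
    // Same text measured 1000 times, as in data point B.
    for I := 1 to 1000 do
      Sz := Cache.GetExtent('Arial-10', 'Hello');
    WriteLn('cx=', Sz.cx, ' cy=', Sz.cy, ' slow calls=', Cache.Misses);
  finally
    Cache.Free;
  end;
end.
```

Whether this helps in practice depends on how often the HTML components measure the same string with the same font, and the cache would have to be cleared whenever the font or DPI changes. It also would not fix the underlying Cocoa call, only hide it.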

Still I'll try with the latest version as soon as I can, and I'll let you know.

Thanks !
« Last Edit: September 07, 2020, 02:27:03 pm by Manlio »