Author Topic: Linking question  (Read 835 times)

jollytall

  • Sr. Member
  • ****
  • Posts: 376
Linking question
« on: December 02, 2024, 08:47:07 am »
It is clear that if I have a unit u0 (or the main program) that refers (in the interface or implementation section) to two units u1 and u2, then u1 can only use the subroutines (and types, etc.) that are defined in u1. If u2 has a foo subroutine, it is not available to u1.
Now if u1 refers to two more units, u11 and u12, and u12 has a foo in it, then u1 can happily use it.

My expectation is that in the above set-up, where both u2 and u12 have a foo, the compiler tells the linker which foo to use (u2.foo in u2 and u12.foo in u1). However, there are some special units (e.g. cmem, cthreads, heaptrc) that are normally referred to only at the highest level (main program, top unit, or, in the case of heaptrc, only in the compiler options), and yet a lower-level unit behaves differently when such a unit is added, even though it is outside that lower-level unit's normal scope. So I assume that in these cases the lower-level unit's call to "a" foo changes at link time and suddenly refers to the top unit's foo rather than the foo the unit saw directly at compile time.

Is it so, or is there some compiler/linker magic behind these? How do they do this?

MarkMLl

  • Hero Member
  • *****
  • Posts: 8091
Re: Linking question
« Reply #1 on: December 02, 2024, 09:00:51 am »
An alternative explanation (I'm not necessarily saying this applies in all cases) is that cmem etc. patch themselves into heap management at program startup. Note that such things are generally accessed via a function-pointer variable.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Logitech, TopSpeed & FTL Modula-2 on bare metal (Z80, '286 protected mode).
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11984
  • FPC developer.
Re: Linking question
« Reply #2 on: December 02, 2024, 09:04:00 am »
No, there is no compiler/linker magic. Those kinds of units set a global variable (typically a record with callbacks) that other parts of the RTL use.

If you don't add the unit, some defaults (in the unix specific case often empty) are used.
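
To illustrate the pattern described above, here is a minimal single-file sketch. The record TGreeterManager, the variable GreeterManager and the functions DefaultGreet, CustomGreet and SayHello are all made up for illustration and are not RTL names; the real RTL manager records have more fields. The point is only that the "lower-level" code calls through a global record, so whoever overwrites the record changes the behaviour without any relinking.

```pascal
program HookSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical callback record, standing in for an RTL manager record.
  TGreeterManager = record
    Greet: function(const aName: string): string;
  end;

// Default implementation, used when no special unit has installed its own.
function DefaultGreet(const aName: string): string;
begin
  Result := 'default: ' + aName;
end;

// Replacement that a "special" unit would install in its initialization.
function CustomGreet(const aName: string): string;
begin
  Result := 'custom: ' + aName;
end;

var
  // The global variable that all other code goes through.
  GreeterManager: TGreeterManager;

// Stand-in for code in a lower-level unit: it only knows the global record.
function SayHello(const aName: string): string;
begin
  Result := GreeterManager.Greet(aName);
end;

begin
  GreeterManager.Greet := @DefaultGreet;
  WriteLn(SayHello('u1'));   // prints "default: u1"

  // What the special unit's initialization block would do.
  GreeterManager.Greet := @CustomGreet;
  WriteLn(SayHello('u1'));   // prints "custom: u1"
end.
```

SayHello is compiled once and never changes; only the pointer stored in the record does.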

Heaptrc needs compiler support so that it also initializes before implicit units like objpas. Objpas is implicitly included when you use mode Delphi or objfpc (set either by a mode directive or by a command-line parameter).

cdbc

  • Hero Member
  • *****
  • Posts: 1757
    • http://www.cdbc.dk
Re: Linking question
« Reply #3 on: December 02, 2024, 09:16:06 am »
Hi
What the others said, and then there's "aliasing". Here's an example:
Code: Pascal
{ here we get the compiler to /export/ this function as publicly available }
function PickLastDir(const aPath: string): string; public name 'BC_PICKLASTDIR';

implementation

function PickLastDir(const aPath: string): string;
var li: integer;
begin
  if aPath = '' then exit(aPath); /// well duh!
  { user entered just a directory name, no path -> return it without 'pathdelim' }
  if pos(PathDelim,aPath) = 0 then Result:= aPath
  else begin { find the last 'pathdelim' in string & copy last part to result }
    li:= Length(aPath);
    if aPath[li] = PathDelim then dec(li); /// could end in a 'pathdelim', skip that
    while ((li > 0) and (aPath[li] <> PathDelim)) do dec(li); /// scan backwards
    { only 2 params = from idx and rest of string + replace 'pathdelim' with '' }
    if li > 0 then Result:= copy(aPath,li).Replace(PathDelim,'',[rfReplaceAll])
    else Result:= aPath.Replace(PathDelim,'',[rfReplaceAll]); /// should get rid of the trailing one if present
  end;
end;
and in another unit we then alias the previous function by name, so that we can call it:
Code: Pascal
{ utility functions, here we publicly export these functions, to be made available
  in other units, that can't /see/ us, for import :o) gotta love FPC \o/\ö/\o/ }
function Pch2Str(aPch: pchar): string; public name 'BC_PCH2STR';
function Str2Pch(aStr: string): pchar; public name 'BC_STR2PCH';
{ AND HERE WE IMPORT THE PUBLIC ALIASED FUNCTION }
function PickLastDir(const aPath: string): string; external name 'BC_PICKLASTDIR';
The RTL also makes use of aliasing, e.g. 'InterlockedIncrement()' in the classes unit, among others.
Regards Benny
If it ain't broke, don't fix it ;)
PCLinuxOS(rolling release) 64bit -> KDE5 -> FPC 3.2.2 -> Lazarus 2.2.6 up until Jan 2024 from then on it's: KDE5/QT5 -> FPC 3.3.1 -> Lazarus 3.0

MarkMLl

  • Hero Member
  • *****
  • Posts: 8091
Re: Linking question
« Reply #4 on: December 02, 2024, 09:21:57 am »
Quote from: marcov
No, there is no compiler/linker magic.

Except, I believe, that the compiler renames e.g. the unit used to access debug line numbers.

Code: Pascal
(* Prior to around FPC 2.6.4 it was necessary to import LineInfo here. Later    *)
(* versions of the compiler implicitly import it, and are unhappy at an attempt *)
(* to explicitly import it.                                                     *)
(*                                                                              *)
(* This is complicated by the fact that at least some versions of the compiler  *)
(* apparently rewrite "LineInfo" here to "lnfodwrf" to track the format being   *)
(* used.                                                                        *)

uses
  StrUtils, Linux, Unix, BaseUnix, Errors {$ifdef USE_LINE_NUMBERS } , { LineInfo, } lnfodwrf {$endif }

That was derived from discussion on the mailing list.

MarkMLl

jollytall

  • Sr. Member
  • ****
  • Posts: 376
Re: Linking question
« Reply #5 on: December 02, 2024, 10:03:45 am »
Thanks for the clarification.

Khrys

  • Full Member
  • ***
  • Posts: 128
Re: Linking question
« Reply #6 on: December 02, 2024, 10:23:13 am »
The reason why  cmem  et al. need to be placed at the beginning of the main unit's  uses  section is a consequence of the way FPC handles unit initialization and involves no magic at all.
The unit tree is traversed left-to-right and depth-first, meaning that in your example...

Code: Text
u0
|
+--------------+
|              |
u1             u2
|
+-------+
|       |
u11     u12

...the order of initialization is as follows:  u11, u12, u1, u2, u0
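
That order is simply a post-order walk of the tree, which can be sketched in a few lines. This is only a simplified model of the traversal described here, not the compiler's actual code: the names UnitNames, UsesList, Visited and Visit are made up, and the uses clauses are hard-coded to match the diagram.

```pascal
program InitOrderSketch;
{$mode objfpc}{$H+}

const
  UnitCount = 5;
  // Index: 0 = u0, 1 = u1, 2 = u2, 3 = u11, 4 = u12
  UnitNames: array[0..UnitCount - 1] of string = ('u0', 'u1', 'u2', 'u11', 'u12');

var
  // UsesList[i] holds the indices from unit i's uses clause, left to right.
  UsesList: array[0..UnitCount - 1] of array of Integer;
  Visited: array[0..UnitCount - 1] of Boolean;
  Order: string = '';

// Depth-first, left-to-right: a unit is appended only after everything
// it uses has been appended (post-order).
procedure Visit(aUnit: Integer);
var
  i: Integer;
begin
  if Visited[aUnit] then Exit;   // each unit is initialized only once
  Visited[aUnit] := True;
  for i := 0 to High(UsesList[aUnit]) do
    Visit(UsesList[aUnit][i]);
  if Order <> '' then Order := Order + ', ';
  Order := Order + UnitNames[aUnit];
end;

begin
  // u0 uses u1, u2
  SetLength(UsesList[0], 2);
  UsesList[0][0] := 1;
  UsesList[0][1] := 2;
  // u1 uses u11, u12
  SetLength(UsesList[1], 2);
  UsesList[1][0] := 3;
  UsesList[1][1] := 4;

  Visit(0);
  WriteLn(Order);   // prints "u11, u12, u1, u2, u0"
end.
```

Putting a unit first in the root's uses clause makes it the first leaf reached by this walk, which is why it initializes before everything else.
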
More specifically, the units'  initialization  blocks are executed in that order. Looking at  cmem...

Code: Pascal
Initialization
  GetMemoryManager (OldMemoryManager);
  SetMemoryManager (CmemoryManager);

...you'll see that all it does is replace the RTL's global  TMemoryManager  record, which  GetMem, FreeMem  (and class constructors/destructors) all rely on. To make sure that all other units (including their  initialization  blocks) use the new memory manager, this code needs to run as early as possible, hence the placement at the very beginning of the  uses  clause of the root unit.
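
The same Get/Set pair can be tried out in a small program. The sketch below is not how cmem is implemented; it only wraps the existing manager's GetMem with a counter to show that plain GetMem calls are routed through the record. CountingGetMem, GetMemCalls, OldMM and NewMM are names invented for this example, and it assumes a single-threaded program.

```pascal
program MemHookSketch;
{$mode objfpc}{$H+}

var
  OldMM, NewMM: TMemoryManager;
  GetMemCalls: PtrUInt = 0;

// Replacement for the GetMem slot: count the call, then delegate
// to whatever manager was installed before us.
function CountingGetMem(Size: PtrUInt): Pointer;
begin
  Inc(GetMemCalls);
  Result := OldMM.GetMem(Size);
end;

var
  p: Pointer;
  before: PtrUInt;
begin
  // Same pattern as cmem's initialization: save the old record, install a new one.
  GetMemoryManager(OldMM);
  NewMM := OldMM;                    // keep every other callback unchanged
  NewMM.GetMem := @CountingGetMem;
  SetMemoryManager(NewMM);

  before := GetMemCalls;
  GetMem(p, 64);                     // an ordinary RTL call, now routed through the hook
  FreeMem(p);
  WriteLn('GetMem calls seen by the hook: ', GetMemCalls - before);

  SetMemoryManager(OldMM);           // restore the original manager
end.
```

Anything allocated before SetMemoryManager runs is not counted, which is the same reason cmem wants to be first: allocations made earlier would still come from the old manager.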



In the case of multiple units defining a function with the same name, it's important to know two things:

FPC allows extensive name shadowing, meaning that the same identifier may refer to different things in different scopes

For example, the  objpas  unit defines  TIntegerArray  as a fixed-size array with a total size of 960 MiB (for legacy reasons I assume). If you don't find this useful (I don't), you're free to redefine it as e.g.  type TIntegerArray = array of Integer;  and the compiler won't complain - it'll simply use the most recently given definition. This also applies to units in the  uses  clause:

Code: Pascal
uses
  u2, u12;

  foo(); // Refers to u12.foo

Code: Pascal
uses
  u12, u2;

  foo(); // Refers to u2.foo

It's of course also possible to explicitly refer to an implementation by prepending the unit name, e.g.  u12.foo();
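
The type redefinition mentioned above can be checked with a single file. This is a minimal sketch assuming mode objfpc (so that objpas is implicitly used); the program name and variable are arbitrary.

```pascal
program ShadowSketch;
{$mode objfpc}{$H+}

type
  // Shadows the TIntegerArray that objpas declares; within this
  // program the identifier now means a dynamic array.
  TIntegerArray = array of Integer;

var
  a: TIntegerArray;
begin
  SetLength(a, 3);   // only valid for a dynamic array, so the local type is in effect
  a[2] := 42;
  WriteLn(Length(a), ' ', a[2]);   // prints "3 42"
end.
```

SetLength would not compile against the fixed-size objpas definition, so a successful build shows that the most recent declaration is the one being used.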

FPC mangles function names in a way that includes the unit name (among other things)

This makes FPC-generated code more powerful in the sense that source-level identifiers don't have to be unique even across object files / translation units. For example, the  Format  function in  SysUtils  is referred to by the symbol  SYSUTILS_$$_FORMAT$ANSISTRING$array_of_const$$ANSISTRING  by the linker instead of its literal name.
