Well, obviously, the source code? Read the code, Luke!
Of course. But I guess I should begin with understanding the common FPC architecture. I guess FPC has a unique architecture, somewhere between a "simple one-pass compiler" (in Dragon Book terms) and an "advanced compiler" (in Steven S. Muchnick terms).
If you know Turbo Pascal or Delphi, that would help. For basic usage, the compiler finds its files recursively, starting from the main file it is invoked on. You just pass the main file and some search paths, and that's enough to generate a binary. The links between pieces of source are based on USES clauses in the source, pretty much like USING in some other languages. The most important part to grok from this is that a single invocation of the compiler does not equate to compiling a single source file. An invocation might compile dozens of files (compilation units, each having its own .o) or more.
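To make the "one invocation, many units" point concrete, here is a rough Python sketch of following USES clauses from a main file. It is only a toy model, not how FPC's parser works: the unit names and sources are made up, and it ignores search paths, the interface/implementation distinction and conditional compilation.

```python
import re

# Hypothetical in-memory "source tree": unit name -> Pascal-like source text.
SOURCES = {
    "main": "program main; uses sysutils, mathutil; begin end.",
    "mathutil": "unit mathutil; interface uses geometry; implementation end.",
    "geometry": "unit geometry; interface implementation uses sysutils; end.",
    "sysutils": "unit sysutils; interface implementation end.",
    "unused": "unit unused; interface implementation end.",
}

# Matches "uses a, b, c;" and captures the comma-separated list.
USES_RE = re.compile(r"\buses\b(.*?);", re.IGNORECASE | re.DOTALL)


def used_units(source):
    """Return the unit names listed in every USES clause of one source text."""
    names = []
    for clause in USES_RE.findall(source):
        names.extend(n.strip().lower() for n in clause.split(",") if n.strip())
    return names


def units_for_invocation(main_unit, sources):
    """Collect every unit reachable from main_unit, dependencies first."""
    order, seen = [], set()

    def visit(name):
        if name in seen:
            return
        seen.add(name)
        for dep in used_units(sources[name]):
            visit(dep)
        order.append(name)

    visit(main_unit)
    return order


print(units_for_invocation("main", SOURCES))
```

Note that "unused" never shows up in the result: nothing reachable from the main file names it in a USES clause, so a single invocation on "main" would never touch it.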
Bootstrapping the project, in principle, requires the last FPC release to be installed. Any tools required are in that install; in practice you also want Git to check out cutting-edge sources.
Building and bootstrapping the project is slightly more complicated and requires some more tools, which are included in the release: make and a few others (rm, gecho, ginstall, cmp and diff). Until recently, GDB was the only way to debug (with or without a GUI (Lazarus) or TUI (textmode IDE) frontend). GDB is sometimes upgraded from the release version (e.g. Lazarus might package other versions).
There are no library dependencies, not even msvcrt or ucrt. Simple binaries only link to kernel32 and user32.
Originally FPC also used binutils (assembler (g)AS, linker LD, archiver AR, as well as Windows-specific tools such as DLLTOOL and WINDRES). Over time these have been replaced by FPC's own versions, with the final replacement of windres still pending. The next major release might default to its own "FPCRES" instead of windres to compile resources.
The main driving force behind making our own versions of AS/AR/LD was smartlinking (aka dead code elimination at the linker level, much like GCC's gc-sections). The binutils didn't support it on a per-section level on Windows, and the only workaround was to make an object file per symbol and archive those into static libraries using AR. This was horribly slow and memory consuming, especially for the Windows unit that holds the Windows API headers with tens of thousands of symbols. (Before the internal linker, LD would eat 1.5GB to link Lazarus; I think it would nowadays need more memory than is available in 32-bit.)
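The idea behind smartlinking can be sketched as a reachability walk over symbol references: start from the entry point and keep only what is reached. The Python below is just an illustration under that assumption. The symbol names and the reference graph are made up, and a real linker works on sections and relocations, not on a dictionary.

```python
from collections import deque

# Hypothetical symbol reference graph: symbol -> symbols it references.
REFERENCES = {
    "main": ["WriteLn", "CreateWindow"],
    "WriteLn": ["WriteFile"],
    "CreateWindow": [],
    "WriteFile": [],
    "MessageBox": ["FormatMessage"],
    "FormatMessage": [],
}


def live_symbols(roots, references):
    """Return the set of symbols reachable from the given roots."""
    live = set(roots)
    queue = deque(roots)
    while queue:
        sym = queue.popleft()
        for ref in references.get(sym, []):
            if ref not in live:
                live.add(ref)
                queue.append(ref)
    return live


kept = live_symbols(["main"], REFERENCES)
dropped = sorted(set(REFERENCES) - kept)
print("kept:   ", sorted(kept))
print("dropped:", dropped)
```

With one object file per symbol, the linker gets the same effect only because unreferenced archive members are never pulled in, which is why that workaround produced such huge numbers of tiny object files.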
Note that the binutils that we use are from about 2005, the last standalone mingw versions (before the Unix compatibility layer MSYS was introduced). Such early LD versions don't support linking MSVC .lib static libraries.
I think the internal FPC linker is slowly starting to support .lib in trunk though, so it is not wholly impossible.
FPC itself does not support multithreaded compilation, but the FPC and Lazarus build processes do allow for some parallelism by compiling independent packages in parallel, especially in the packages/ directory. So the compiler-rtl bootstrap cycle (compile both 3 times) is single threaded, and then it goes into a multithreaded compilation of the packages/ directory. However, the dependencies between the various packages limit parallelism, giving an overall 2-3 times speedup for the packages/ directory.
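Why dependencies cap the speedup can be shown with a toy scheduling model: group packages into "waves" where everything in a wave only depends on earlier waves. This is a sketch only. The package names and dependencies are hypothetical, it assumes every package takes equally long, and it is not how the FPC makefiles actually schedule work.

```python
# Hypothetical package dependency graph: package -> packages it depends on.
DEPENDS = {
    "pkg_base": [],
    "pkg_xml": ["pkg_base"],
    "pkg_net": ["pkg_base"],
    "pkg_image": [],
    "pkg_web": ["pkg_xml", "pkg_net"],
}


def build_waves(depends):
    """Group packages into waves; packages within one wave can build in parallel."""
    remaining = dict(depends)
    done, waves = set(), []
    while remaining:
        # A package is ready once all of its dependencies have been built.
        ready = sorted(p for p, deps in remaining.items()
                       if all(d in done for d in deps))
        if not ready:
            raise ValueError("dependency cycle")
        waves.append(ready)
        done.update(ready)
        for p in ready:
            del remaining[p]
    return waves


waves = build_waves(DEPENDS)
for number, wave in enumerate(waves, start=1):
    print(number, wave)

# Best-case speedup with unlimited cores and equal build times per package.
print(len(DEPENDS) / len(waves))
```

However many cores you throw at it, the number of waves (the longest dependency chain) is a hard floor on the build time, so the speedup stays well below the package count.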
Still, despite all the bad news and the lack of infinite parallelism, FPC is fairly fast. A good build script bootstraps the entire compiler on Linux in about 1 min, and on Windows in about 1.5 min, even on already dated hardware like a Ryzen 5 5600G, generating about 1500 compilation units (.o files) and several binaries.
In principle the Windows bootstrap process is more integrated and should be faster, but Windows' file I/O and process creation are relatively slow and hold it back. Even just the clean step with all its "rm" invocations can take 5-15 seconds.
I mean I need to understand the frontend API, the backend API, and the common compilation pipeline.
I think compiler work is a bridge best crossed when you actually need it. Start with the libraries first, as that is where most of the work is for a Win9x port. Check out the rtl/win32 library and its Makefile.fpc/Makefile combo. The Makefile is generated from the Makefile.fpc file by "fpcmake".
Also note that the RTL build compiles the lowest-level compilation units (system, math, objpas) standalone (one invocation per unit), and then compiles "buildrtl", a unit that pulls in a bunch of other units. This is called a "build unit"; it saves on the compiler's start/build state/stop cycle by compiling multiple units in one run. The Makefile(.fpc) entries for those units are only there to trigger recompilation of "buildrtl" on change.
For a somewhat dated but still relevant treatise on building FPC, see
https://www.stack.nl/~marcov/buildfaq.pdf by yours truly. A version of my Windows build script is at
https://www.stack.nl/~marcov/stdbuild.cmd