Recent

Author Topic: Tutorial on creating a compiler  (Read 20720 times)

440bx

  • Hero Member
  • *****
  • Posts: 3946
Tutorial on creating a compiler
« on: December 16, 2019, 01:20:31 am »
There have recently been a number of posts related to "compiler improvements", that made me wonder what the level of interest is in the forum participants to see a tutorial on the development of a full compiler.

By "full", I mean a compiler that is reasonably full featured, does a "fair" number of optimizations and is self hosting.

By tutorial, I mean produced a step at a time and with lavishly commented code.

I know there are examples out there but, so far, what I've found are either "toy" compilers, that is, languages that lack too many features to be used in the creation of "real" programs or, as in the case of FPC, a full compiler but, not exactly written in a way to teach someone how to write a compiler. 

I first thought about asking the question by conducting a poll but, on second thought, I felt that something "free form" might be better.

All input welcome!
« Last Edit: December 16, 2019, 04:10:52 am by 440bx »
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

lainz

  • Hero Member
  • *****
  • Posts: 4460
    • https://lainz.github.io/
Re: Tutorial on creating a compiler
« Reply #1 on: December 16, 2019, 01:33:33 am »
I think the best way will be including the binaries of that compiler and having working sources that do something.

And yes I want that kind of tutorial. I found several without compiled binaries to test each step of the language.

If this is done instead of a single PDF I think is better a full repository with intermediate builds ready to test or the sources ready to compile and debug for each stage. Not just the final thing
 But as well the final thing must be shipped.

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: Tutorial on creating a compiler
« Reply #2 on: December 16, 2019, 02:11:25 am »
I think the best way will be including the binaries of that compiler and having working sources that do something.

And yes I want that kind of tutorial. I found several without compiled binaries to test each step of the language.

If this is done instead of a single PDF I think is better a full repository with intermediate builds ready to test or the sources ready to compile and debug for each stage. Not just the final thing
 But as well the final thing must be shipped.
I agree.  Thank you for the feedback.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

dtoffe

  • Jr. Member
  • **
  • Posts: 55
Re: Tutorial on creating a compiler
« Reply #3 on: December 16, 2019, 02:34:14 am »
There have recently been a number of posts related to "compiler improvements", that made me wonder what the level of interest is in the forum participants to see a tutorial on the development of a full compiler.

What I have seen is a number of posts related to how "someone else" can improve the compiler by adding syntax changes of the most diverse kind. I don't recall seen posts by people asking how to better implement type checking, code generation or whatever "in its own language and compiler". I don't think one translates well into the other.

I would love to see such a tutorial. But however complete the tutorial, there are many decisions to be made, and the value in the choices not picked will be lost. Will your tutorial use Plex and Pyacc ? If I wanted to see hand coded recursive descent then I'm out of luck.

I hope you see my point.

Best,

Daniel

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: Tutorial on creating a compiler
« Reply #4 on: December 16, 2019, 03:42:44 am »
What I have seen is a number of posts related to how "someone else" can improve the compiler by adding syntax changes of the most diverse kind. I don't recall seen posts by people asking how to better implement type checking, code generation or whatever "in its own language and compiler". I don't think one translates well into the other.
What you stated above is true.  To me, it translates into the fact that, if the people asking for the things they are asking had some knowledge of grammars, how they are designed and implemented then their requests for improvements would be a bit more "on target" (can't think of a better way to put it.)

I would love to see such a tutorial. But however complete the tutorial, there are many decisions to be made, and the value in the choices not picked will be lost. Will your tutorial use Plex and Pyacc ? If I wanted to see hand coded recursive descent then I'm out of luck.
I know what you're saying.  It's actually quite an understatement to state that many decisions need to be made.

I am not really fond of Lex and Yacc, particularly for a tutorial.  lex and yacc have their place but, they greatly subtract from _exposing_ how a grammar is scanned and parsed.  I would use a hand coded recursive descent parser LL(1), not only because I believe it is educationally better but, also because it ensures the grammar is "simple" and, I strongly believe in what Einstein stated "Everything should be made as simple as possible, but no simpler".

That said, in the language design step, I would definitely use some tool to ensure the language grammar isn't flawed.

I hope you see my point.
I tried and I hope I did. ;)

I appreciate the feedback.  Thank you.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

valdir.marcos

  • Hero Member
  • *****
  • Posts: 1106
Re: Tutorial on creating a compiler
« Reply #5 on: December 16, 2019, 05:36:11 am »
There have recently been a number of posts related to "compiler improvements", that made me wonder what the level of interest is in the forum participants to see a tutorial on the development of a full compiler.
By "full", I mean a compiler that is reasonably full featured, does a "fair" number of optimizations and is self hosting.
By tutorial, I mean produced a step at a time and with lavishly commented code.
I know there are examples out there but, so far, what I've found are either "toy" compilers, that is, languages that lack too many features to be used in the creation of "real" programs or, as in the case of FPC, a full compiler but, not exactly written in a way to teach someone how to write a compiler. 
I first thought about asking the question by conducting a poll but, on second thought, I felt that something "free form" might be better.
All input welcome!
Although a very big work, I would like to see such a tutorial.

First, you should indicate one or two modern affordable books as reference about the needed deep theory.

Then, if possible, your tutorial should show the whole process: Preprocessor, Compiler, Assembler, Linker, Loader, and Memory.

Those outrageously priced old books are always cited here because they showed the whole process from toy language source code (text) through useless working binary.

Again, if possible, your tutorial should include at least the types: Integer, Float, String and Date. And Procedures and Functions.

Beyond being imperative procedural, it would be amazing if a minimum usable imperative object-oriented paradigm (OOP) could be added.

The topics for the compiler part could be: Lexical Analyses, Syntax Analyses, Semantic Analyses, Intermediate Code Generation, Error Handling, Exception Handling, Code Optimization, and Target Code Generation.

valdir.marcos

  • Hero Member
  • *****
  • Posts: 1106
Re: Tutorial on creating a compiler
« Reply #6 on: December 16, 2019, 05:42:48 am »
I think the best way will be including the binaries of that compiler and having working sources that do something.
And yes I want that kind of tutorial. I found several without compiled binaries to test each step of the language.
If this is done instead of a single PDF I think is better a full repository with intermediate builds ready to test or the sources ready to compile and debug for each stage. Not just the final thing but as well the final thing must be shipped.
Distinguishing different phases in smaller source code and separated binaries would be a good idea.
Putting explanation in a PDF together with source code and binaries could also be another good idea.

ASBzone

  • Hero Member
  • *****
  • Posts: 678
  • Automation leads to relaxation...
    • Free Console Utilities for Windows (and a few for Linux) from BrainWaveCC
Re: Tutorial on creating a compiler
« Reply #7 on: December 16, 2019, 05:49:33 am »
I would be interested in seeing such a tutorial, and potentially using it, time permitting...
-ASB: https://www.BrainWaveCC.com/

Lazarus v2.2.7-ada7a90186 / FPC v3.2.3-706-gaadb53e72c
(Windows 64-bit install w/Win32 and Linux/Arm cross-compiles via FpcUpDeluxe on both instances)

My Systems: Windows 10/11 Pro x64 (Current)

valdir.marcos

  • Hero Member
  • *****
  • Posts: 1106
Re: Tutorial on creating a compiler
« Reply #8 on: December 16, 2019, 05:51:34 am »
There have recently been a number of posts related to "compiler improvements", that made me wonder what the level of interest is in the forum participants to see a tutorial on the development of a full compiler.
What I have seen is a number of posts related to how "someone else" can improve the compiler by adding syntax changes of the most diverse kind. I don't recall seen posts by people asking how to better implement type checking, code generation or whatever "in its own language and compiler". I don't think one translates well into the other.
I would love to see such a tutorial. But however complete the tutorial, there are many decisions to be made, and the value in the choices not picked will be lost. Will your tutorial use Plex and Pyacc ? If I wanted to see hand coded recursive descent then I'm out of luck.
I hope you see my point.
Best,
Daniel
First, I believe this tutorial could help beginners to understand how is a compiler internally and how it is build.
Then, it would improve beginners abilities to to better suggest new features or corrections, or even to provide source code to Free Pascal and Lazarus projects.
At last, but no less important, learning about algorithms, data structures and parsing is always useful.

valdir.marcos

  • Hero Member
  • *****
  • Posts: 1106
Re: Tutorial on creating a compiler
« Reply #9 on: December 16, 2019, 06:00:13 am »
What I have seen is a number of posts related to how "someone else" can improve the compiler by adding syntax changes of the most diverse kind. I don't recall seen posts by people asking how to better implement type checking, code generation or whatever "in its own language and compiler". I don't think one translates well into the other.
What you stated above is true.  To me, it translates into the fact that, if the people asking for the things they are asking had some knowledge of grammars, how they are designed and implemented then their requests for improvements would be a bit more "on target" (can't think of a better way to put it.)
I would love to see such a tutorial. But however complete the tutorial, there are many decisions to be made, and the value in the choices not picked will be lost. Will your tutorial use Plex and Pyacc ? If I wanted to see hand coded recursive descent then I'm out of luck.
I know what you're saying.  It's actually quite an understatement to state that many decisions need to be made.
I am not really fond of Lex and Yacc, particularly for a tutorial.  lex and yacc have their place but, they greatly subtract from _exposing_ how a grammar is scanned and parsed.
+1
Quote
I would use a hand coded recursive descent parser LL(1), not only because I believe it is educationally better but, also because it ensures the grammar is "simple" and, I strongly believe in what Einstein stated "Everything should be made as simple as possible, but no simpler".
Very interesting.
Quote
That said, in the language design step, I would definitely use some tool to ensure the language grammar isn't flawed.
Use or build?
Quote
I hope you see my point.
I tried and I hope I did. ;)
I appreciate the feedback.  Thank you.

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: Tutorial on creating a compiler
« Reply #10 on: December 16, 2019, 06:18:22 am »
Although a very big work
Yes, it is a lot of work.  IF (note the big IF) I decided to go ahead with it, I would definitely like and appreciate some help.

I would like to see such a tutorial.
Pleased to read that.  (normally, I'd say: pleased to hear it but, I read it not heard it ;) )

First, you should indicate one or two modern affordable books as reference about the needed deep theory.

Those outrageously priced old books are always cited here because they showed the whole process from toy language source code (text) through useless working binary.
Strangely, it doesn't really take much "deep theory".  A recursive descent parser is easy to understand and implement.  As far as book recommendations, the one book I really like, which I have recommended a number of times already is definitely _not_ affordable.

Then, if possible, your tutorial should show the whole process: Preprocessor, Compiler, Assembler, Linker, Loader, and Memory.
I don't have a design at this time.  At this time, I just want to find out if there is a reasonably significant amount of interest in such a tutorial.  That said, I don't see a preprocessor being there.  The reason being that, in my experience, preprocessors tend to make a program more difficult to understand because what the programmer reads isn't what the compiler sees.  IMO, while a preprocessor is desirable in some ways, it creates too many undesirable problems resulting in potentially very hard to understand and maintain code.

Again, if possible, your tutorial should include at least the types: Integer, Float, String and Date. And Procedures and Functions.
Integer, float, records, pointers, procedures, functions, strong typing and inline scopes: definitely. Exception handling: almost certainly (though maybe not in the first implementation.) String and Date: I see them more as features that belong in a library, not in the compiler.  Just the "definitely(s)" add up to a fair amount of work.

Beyond being imperative procedural, it would be amazing if a minimum usable imperative object-oriented paradigm (OOP) could be added.
I would definitely not include any OOP because that can really complicate matters.

The topics for the compiler part could be: Lexical Analyses, Syntax Analyses, Semantic Analyses, Intermediate Code Generation, Error Handling, Exception Handling, Code Optimization, and Target Code Generation.
While I don't have a design at this time, I lean towards error handling implemented the "Turbo Pascal" way, every error stops the compilation.  That implies, the compiler has to be _blazingly_ fast, which is doable.

Code optimization and generation: that's where the real "fun" starts.  I would focus on "hand written assembly like" size code that is reasonably fast.  I really like small executables, that is a _must_.

Thank you for your input. I appreciate it.

ETA:

Use or build?
Use not build.  Compiler-Compilers can be used to ensure the grammar doesn't contain ambiguities and/or conflicts.  To me, that's their primary appeal but, when it comes to build it, hand coded scanners and parsers are it.

ETA[2]

I would be interested in seeing such a tutorial, and potentially using it, time permitting...
Noted.  Thank you for the feedback.
« Last Edit: December 16, 2019, 06:45:56 am by 440bx »
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

trev

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 2020
  • Former Delphi 1-7, 10.2 user
Re: Tutorial on creating a compiler
« Reply #11 on: December 16, 2019, 06:23:47 am »
It's a lot of work, but don't let that stop you if it's an itch you want to scratch. Just don't expect it to alter the perception of those who always expect that someone else will do the work for them if they make enough noise.

af0815

  • Hero Member
  • *****
  • Posts: 1289
Re: Tutorial on creating a compiler
« Reply #12 on: December 16, 2019, 07:00:02 am »
The topics for the compiler part could be: Lexical Analyses, Syntax Analyses, Semantic Analyses, Intermediate Code Generation, Error Handling, Exception Handling, Code Optimization, and Target Code Generation.
It would be nice to known where this stages are in the actual compiler (units), for reference purposes. So it is more possible to have a look at the actual sources for study without understanding the whole compiler and to compare this with information from books, how it is working in the wild.   :D
regards
Andreas

440bx

  • Hero Member
  • *****
  • Posts: 3946
Re: Tutorial on creating a compiler
« Reply #13 on: December 16, 2019, 07:08:16 am »
It would be nice to known where this stages are in the actual compiler (units), for reference purposes. So it is more possible to have a look at the actual sources for study without understanding the whole compiler and to compare this with information from books, how it is working in the wild.   :D
I'm rather pleased to see someone asking for that kind of structure.  Every program (not just a compiler) should be written in such a way that it is very intuitive (if not obvious) to determine in which file/unit a logical "process" (groups of functions and procedures that accomplish a task) is found.

(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

JdeHaan

  • Full Member
  • ***
  • Posts: 115
Re: Tutorial on creating a compiler
« Reply #14 on: December 16, 2019, 08:57:29 am »
I'm interested and willing to learn and help.
Recently, I built an interpreter, but a complete compiler is is smth I did't dare to start.
About parsing techniques, it's worth checking out Pratt-parsing (which I used in my interpreter written in FPC and happy to share).

 

TinyPortal © 2005-2018