About parsing techniques, it's worth checking out Pratt parsing (which I used in my interpreter written in FPC, and which I'm happy to share).
Thank you for mentioning Pratt parsing. I was not aware of that method of parsing expressions. Definitely a parsing technique I'm going to look at closely.
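For anyone else reading who hasn't met it either, here is my rough understanding of the idea as a minimal sketch. It is in Python rather than FPC to keep it short, and it is not the code from the interpreter mentioned above. The `BINDING_POWER` table, its values, and the nested-tuple tree format are assumptions I made up for illustration.

```python
import re

# Hypothetical binding powers for this sketch; a higher number binds tighter.
BINDING_POWER = {"+": 10, "-": 10, "*": 20, "/": 20}
PREFIX_MINUS_POWER = 30  # unary minus binds tighter than any infix operator


def tokenize(text):
    # Integers, the four operators, and parentheses; whitespace is skipped.
    return re.findall(r"\d+|[-+*/()]", text)


def parse(tokens, pos=0, min_bp=0):
    """Return (tree, next_pos) for the expression starting at tokens[pos]."""
    tok = tokens[pos]
    pos += 1
    # Prefix position: a number, a parenthesised expression, or unary minus.
    if tok == "(":
        left, pos = parse(tokens, pos, 0)
        if pos >= len(tokens) or tokens[pos] != ")":
            raise SyntaxError("expected ')'")
        pos += 1
    elif tok == "-":
        operand, pos = parse(tokens, pos, PREFIX_MINUS_POWER)
        left = ("neg", operand)
    else:
        left = int(tok)
    # Infix position: keep absorbing operators that bind tighter than min_bp.
    while pos < len(tokens) and tokens[pos] in BINDING_POWER:
        op = tokens[pos]
        bp = BINDING_POWER[op]
        if bp <= min_bp:
            break  # the caller's operator binds at least as tightly
        right, pos = parse(tokens, pos + 1, bp)
        left = (op, left, right)
    return left, pos


tree, _ = parse(tokenize("1 + 2 * 3"))
print(tree)
```

What appeals to me is that precedence lives in a table instead of in one function per precedence level, so adding an operator should mostly mean adding a table entry.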
@MarkMLl
I agree that covering all the aspects you mentioned would result in quite a "tutorial" (if something that extensive can still be called a "tutorial").
You may not be familiar with "Per Brinch Hansen on Pascal Compilers" but in his book he narrows down the field quite a bit. Specifically, he identifies a problem and offers one of the large number of possible solutions. I find that to be a good approach and I would do the same. Of course, using that approach implies that the reader has a decent programming foundation. IOW, not for complete beginners.
A reasonably knowledgeable programmer can read/study and understand Brinch Hansen's book in about a month. Unfortunately, his book is rather pricey at this time.
This is not something that can be done by a single tutorial.
Agree. Some areas would require a progression from a very simple example to what will be the target implementation.
It would take roughly an academic year to teach the computer science underpinnings (data structures, a suitable implementation language, and the basics of parsing),
I believe that a reasonably knowledgeable programmer can get the necessary knowledge in about a week. Of course, that does _not_ include becoming familiar with all the common parsing methods and the constraints each one places on a grammar. In this case, there would be just one (1): top-down recursive descent, which is fairly easy.
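To give an idea of why I call it fairly easy, here is a minimal sketch of top-down recursive descent for a toy arithmetic grammar. It is in Python for brevity, and the grammar, the `Parser` class and its method names are my own illustrative choices, not taken from Brinch Hansen's book or from any real compiler. Each grammar rule becomes one method, and the sketch evaluates the expression as it parses.

```python
import re


class Parser:
    """One method per grammar rule (hypothetical toy grammar):

         expr   = term   { ("+" | "-") term }
         term   = factor { ("*" | "/") factor }
         factor = NUMBER | "(" expr ")"
    """

    def __init__(self, text):
        # Integers, the four operators, and parentheses; whitespace is skipped.
        self.tokens = re.findall(r"\d+|[-+*/()]", text)
        self.pos = 0

    def peek(self):
        # Current token, or None at end of input.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            if self.advance() == "+":
                value += self.term()
            else:
                value -= self.term()
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            if self.advance() == "*":
                value *= self.factor()
            else:
                value //= self.factor()  # integer division keeps the sketch simple
        return value

    def factor(self):
        tok = self.advance()
        if tok == "(":
            value = self.expr()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok is None or not tok.isdigit():
            raise SyntaxError(f"unexpected token: {tok!r}")
        return int(tok)


print(Parser("2 * (3 + 4) - 5").expr())
```

The code mirrors the grammar almost line for line, which is why a programmer with a decent foundation can pick the technique up quickly.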
another to review target hardware and implementation approaches (many of which have turned out to be dead ends), and a third to look at actual code generation and the fundamentals of optimisation.
Once the compiler deals with hardware and the O/S, real complexity starts. In this area, the tutorial would present just enough to enable the reader to read the code and understand how things are getting done. IOW, a fair amount of personal effort from the reader would be required.
Follow that by another year doing guided intermediate-level research, at the end of which the student might possibly have some appreciation of what he doesn't yet know, and hence might possibly be of some use for something.
You're right but, to go beyond what is presented in the tutorial, Google is the student's best friend.
I attach one PDF which I have found overwhelmingly useful over the years: it's old, but is one of the best introductions to parsing and naive code generation.
Thank you.
appreciating that there are some "nice features" that could be put in a syntax that would make it unparseable is more useful than knowing every last detail of post-Chomsky linguistic research.
Yes, there are a lot of "nice features" that someone who is not familiar with language grammars would wish for but that would make the grammar either incorrect or downright unparsable. That's another area that the "student" will have to investigate on his/her own if they are inclined to know more.
But finally, I do have to emphasise that the hard work of writing a compiler is understanding the low-level stuff: you have to understand the implications of apparently-simple language features (e.g. exceptions) on the implementation and you have to have an intimate understanding of the target architecture.
Very true. That's probably why a lot of compiler books sidestep the issue by creating a virtual CPU and compiling to that instead of a real CPU.
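As a rough illustration of that sidestep, here is a minimal sketch of "code generation" for a made-up stack machine, written in Python. The instruction names (`PUSH`, `ADD`, `SUB`, `MUL`) and the nested-tuple expression tree are assumptions for illustration only. A real target CPU would bring in registers, calling conventions and the O/S, none of which appear here.

```python
# Hypothetical mapping from source operators to virtual-CPU opcodes.
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}


def compile_expr(node, code=None):
    """Emit post-order stack-machine code for a nested-tuple expression tree."""
    if code is None:
        code = []
    if isinstance(node, int):
        code.append(("PUSH", node))
    else:
        op, left, right = node
        compile_expr(left, code)   # operands first ...
        compile_expr(right, code)
        code.append((OPCODES[op],))  # ... then the operator
    return code


def run(code):
    """Interpret the instruction list on a simple operand stack."""
    stack = []
    for instr in code:
        if instr[0] == "PUSH":
            stack.append(instr[1])
            continue
        right = stack.pop()
        left = stack.pop()
        if instr[0] == "ADD":
            stack.append(left + right)
        elif instr[0] == "SUB":
            stack.append(left - right)
        elif instr[0] == "MUL":
            stack.append(left * right)
        else:
            raise ValueError(f"unknown instruction: {instr[0]}")
    return stack.pop()


program = compile_expr(("+", 1, ("*", 2, 3)))  # the tree for 1 + 2 * 3
print(program)
print(run(program))
```

The whole "back end" fits in a few dozen lines, which shows how much of the hard low-level work a virtual CPU lets a book skip.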
I want to make sure this is clear: I am fully aware that such a tutorial is a lot of work. To keep it a reasonable size, many topics will not even be mentioned, and some will be explained only far enough that an interested reader can fully understand them after some personal effort. In some cases, such as for someone who isn't familiar with the CPU's instruction set and assembly language, that effort will be significant.
Thank you for the feedback.