Lazarus

Announcements => Third party => Topic started by: FlierMate on December 25, 2020, 05:00:59 pm

Title: My simple Win32 Compiler
Post by: FlierMate on December 25, 2020, 05:00:59 pm
Introducing my simple Win32 compiler:

https://dev.to/bookhanming1/i-ve-had-fun-learning-to-build-my-own-back-end-compiler-1o86

https://dev.to/bookhanming1/i-have-written-a-basic-compiler-what-next-2nam

Satay Compiler supports two commands:
Quote
ReadLine
WriteLine

You can use Satay IDE (SATAYIDE.exe) to write code and compile from within, or use any text editor and then run the command-line tool (SATAY.exe):
Code:
WriteLine Hello World!
WriteLine Press Enter to quit...
ReadLine

Package includes:
Code:
08/01/2021  07:57 PM           146,870 HEXDUMP.exe
21/01/2021  10:07 PM            96,768 SATAY.exe
25/01/2021  07:30 AM           349,696 SATAYIDE.exe
25/01/2021  07:30 AM             8,798 SATAYIDE.pas
25/01/2021  07:31 AM                65 TEST.txt
25/01/2021  07:36 AM               167 TEST1.txt
21/01/2021  10:07 PM                28 TEST2.txt
              8 File(s)        602,392 bytes

Title: Re: My Win32 Reassembler (prototype)
Post by: FlierMate on December 26, 2020, 07:54:34 am
I was 'anyone', but that user name can be confusing....
Title: Re: My Win32 Reassembler (prototype)
Post by: MarkMLl on December 26, 2020, 12:28:10 pm
Please could you explain a bit more what you are trying to do here.

MarkMLl
Title: Re: My Win32 Reassembler (prototype)
Post by: FlierMate on December 26, 2020, 12:58:55 pm
Quote from: MarkMLl
Please could you explain a bit more what you are trying to do here.

MarkMLl

Hi, it all started from my conversation in this thread (involving my old username and your goodself, too):
https://forum.lazarus.freepascal.org/index.php/topic,51683.0.html

There are quite a number of experts in PE on this forum (especially the FPC developers themselves) who might give me constructive feedback on my "prototype of reassembler", because I am still learning to reverse-engineer portable executables.

And since my project is written in Pascal, I would certainly like to post and share my research work here as well.

As someone who is curious about how operating systems work internally, I take great delight in improving this project, and I hope you can share your views on it.
Title: Re: My Win32 Reassembler (prototype)
Post by: MarkMLl on December 26, 2020, 01:24:51 pm
Right. So what is a "reassembler"? The name would suggest that it's something that patches a binary.

Your description of a simple language reminds me of pilot... and you /really/ don't want to read ESR's comments on that :-)

MarkMLl
Title: Re: My Win32 Reassembler (prototype)
Post by: 440bx on December 26, 2020, 01:43:30 pm
Quote from: FlierMate
Hi, it all started from my conversation in this thread

From what you posted in that thread, I guess that what you are calling a "reassembler" is actually a bare-bones, very simple compiler.  Is this correct?

Quote from: FlierMate
As someone who is curious about how operating systems work internally,

Just as a comment, there is a lot more (a whole lot more!) to O/S internals than the executable format (PE in the case of Windows).

Since you are starting out, I'd suggest you begin by looking at how others have done it.  A particularly interesting project is SuperPascal by Per Brinch Hansen.  It's a hand-coded Pascal compiler (it produces bytecode) that is fairly easy to understand overall, and there is a version on GitHub that compiles with FPC, which means you can run it under the debugger in Lazarus and see how it works.

Once you understand that, which will take some work but is not too hard, you'll be ready to get more ambitious and generate actual machine code (or assembly code first).

Another compiler you may want to look at is one from Vitaly Tereshkov (I hope I got his name right), which you can find at https://forum.lazarus.freepascal.org/index.php/topic,49082.msg354754.html#msg354754  That one generates machine code, is self-hosting, and can be compiled with FPC as well, which means you can see how it works by debugging it.

HTH.
Title: Re: My Win32 Reassembler (prototype)
Post by: MarkMLl on December 26, 2020, 03:15:46 pm
Considering what 440bx has said, I'd add that the Wikipedia page on Meta-2 is extremely useful, including its reference PDF.

Following on from that, http://bitsavers.informatik.uni-stuttgart.de/components/intel/_dataBooks/1983_iAPX_286_Operating_System_Writers_Guide.pdf is a description of how Intel expected an operating system for its chips to be written. Some of it is outdated since it refers to technology elements that Intel chose to deprecate, but I still consider it to be an extremely useful resource.

MarkMLl
Title: Re: My Win32 Reassembler (prototype)
Post by: FlierMate on December 26, 2020, 04:23:13 pm
Quote from: 440bx
From what you posted in that thread, I guess that what you are calling a "reassembler" is actually a bare-bones, very simple compiler.  Is this correct?

Yes, that is what I meant. I picked the name 'reassembler' because 'compiler' is a far more powerful word, and mine is just a simple one.

Quote from: 440bx
Just as a comment, there is a lot more (a whole lot more!) to O/S internals than the executable format (PE in the case of Windows).

I learn very slowly; it took me more than a year to build this simple 'compiler', which does basically nothing.

Quote from: 440bx
Since you are starting out, I'd suggest you begin by looking at how others have done it.  A particularly interesting project is SuperPascal by Per Brinch Hansen.  It's a hand-coded Pascal compiler (it produces bytecode) that is fairly easy to understand overall, and there is a version on GitHub that compiles with FPC, which means you can run it under the debugger in Lazarus and see how it works.

Once you understand that, which will take some work but is not too hard, you'll be ready to get more ambitious and generate actual machine code (or assembly code first).

Another compiler you may want to look at is one from Vitaly Tereshkov (I hope I got his name right), which you can find at https://forum.lazarus.freepascal.org/index.php/topic,49082.msg354754.html#msg354754  That one generates machine code, is self-hosting, and can be compiled with FPC as well, which means you can see how it works by debugging it.


The author's name is Vasiliy Tereshkov. Terrific skills! Where did he or she learn that?
(Although I see that he/she also hard-coded the DOS stub.)

That is a great achievement; his/her compiler is already mature.

Compared to his/hers, mine is like someone living in 2000 B.C. trying to learn science, while he/she is a time traveler from a few thousand years in the future.

I am speechless...


Quote from: 440bx
HTH.

It helps. Thank you.

Quote from: MarkMLl
Right. So what is a "reassembler"? The name would suggest that it's something that patches a binary.

Your description of a simple language reminds me of pilot... and you /really/ don't want to read ESR's comments on that :-)

MarkMLl

In a sense, yes, it is a 'reassembler', but instead of patching an arbitrary binary program, it patches the one binary program that is hard-coded in the 'reassembler' itself.... If I am allowed to use the word 'compiler', I would be more than happy.

Thank you MarkMLl for bringing up this topic of discussion.
Title: Re: My Win32 Reassembler (prototype)
Post by: 440bx on December 26, 2020, 06:15:35 pm
Quote from: FlierMate
I learn very slowly; it took me more than a year to build this simple 'compiler', which does basically nothing.

There are some tutorials online on how to build a compiler.  Jack Crenshaw's is one of the better known ones and it is easy to follow and understand.  Another example worth looking at is the Gear language by JdeHaan (an FPC forum member), implemented in FPC.  It's an interpreter, but its scanning and parsing parts apply to writing a compiler.  It's well documented and fairly easy to understand.  He did an excellent job.  You can find his implementation at https://github.com/jdehaan2014/GearLanguage

IF (note the big IF) you can find it at a reasonable price, the book Brinch Hansen on Pascal Compilers (by Per Brinch Hansen) might very well be the best introductory text on compilers ever written, but it's out of print and used copies are usually sold at exorbitant prices (don't pay more than $50.00 plus a reasonable shipping fee).

Quote from: FlierMate
The author's name is Vasiliy Tereshkov. Terrific skills! Where did he or she learn that?

Thank you for correcting the name.  FYI, the author is a "he".  As for where he learned it, I surmise he did it like almost everyone else: by reading books and looking at how other people went about implementing a compiler.

Quote from: FlierMate
It helps. Thank you.

Glad it was helpful and, you're welcome.

HTH.
Title: Re: My Win32 Reassembler (prototype)
Post by: MarkMLl on December 26, 2020, 07:05:40 pm
Quote from: FlierMate
In a sense, yes, it is a 'reassembler', but instead of patching an arbitrary binary program, it patches the one binary program that is hard-coded in the 'reassembler' itself.... If I am allowed to use the word 'compiler', I would be more than happy.

Thank you MarkMLl for bringing up this topic of discussion.

Please stick to the standard terminology: it's a compiler, and nobody will kick you for doing something crude as a learning exercise (unlike, dare I say it, Python :-)

"Reassembler" is a problematic name, since it really implies something like taking a binary program, patching it, and reassembling it with checksums and signatures as appropriate. Such a thing would be valuable, you're not doing it, so the name is best avoided.

And you're welcome :-)

It can be very difficult for "an outsider"- even an engineer like me- to get into this sort of thing, since so much blatant bullshit is written about it by the computer science priesthood.

The fact is that most of what has been written about compilers focusses on taking a difficult syntax and making sure that it can be compiled without requiring inordinate resources. I have very little time for that, and prefer the philosophy that (by analogy) if you're writing obscure and difficult English it's down to you to improve your presentation, rather than expecting every reader to "get educated".

Most "real" computer languages can be compiled efficiently using a technique called recursive descent, which you will find discussed in that Wp article I pointed you at earlier.
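
To make the idea of recursive descent concrete, here is a minimal sketch in Free Pascal. It is not taken from the Meta-2 article or from anyone's compiler in this thread: the tiny expression grammar, the program name and all identifiers are assumptions chosen for illustration. Each grammar rule becomes one function, and the functions call each other in the same shape as the grammar.

```pascal
program RecursiveDescentSketch;
{$mode objfpc}{$H+}

// Hypothetical grammar, chosen only for illustration:
//   expr   = term   { ("+" | "-") term }
//   term   = factor { ("*" | "/") factor }
//   factor = number | "(" expr ")"

uses
  SysUtils;

var
  Src: string;       // text being parsed
  Cursor: Integer;   // index of the next unread character

function Peek: Char;
begin
  if Cursor <= Length(Src) then
    Result := Src[Cursor]
  else
    Result := #0;    // sentinel for end of input
end;

procedure SkipBlanks;
begin
  while Peek = ' ' do
    Inc(Cursor);
end;

function ParseExpr: Integer; forward;

// factor = number | "(" expr ")"
function ParseFactor: Integer;
begin
  SkipBlanks;
  Result := 0;
  if Peek = '(' then
  begin
    Inc(Cursor);
    Result := ParseExpr;
    SkipBlanks;
    if Peek <> ')' then
      raise Exception.CreateFmt('")" expected at column %d', [Cursor]);
    Inc(Cursor);
  end
  else if Peek in ['0'..'9'] then
  begin
    while Peek in ['0'..'9'] do
    begin
      Result := Result * 10 + Ord(Peek) - Ord('0');
      Inc(Cursor);
    end;
  end
  else
    raise Exception.CreateFmt('number or "(" expected at column %d', [Cursor]);
end;

// term = factor { ("*" | "/") factor }
function ParseTerm: Integer;
var
  Op: Char;
  Rhs: Integer;
begin
  Result := ParseFactor;
  SkipBlanks;
  while Peek in ['*', '/'] do
  begin
    Op := Peek;
    Inc(Cursor);
    Rhs := ParseFactor;
    if Op = '*' then
      Result := Result * Rhs
    else
      Result := Result div Rhs;
    SkipBlanks;
  end;
end;

// expr = term { ("+" | "-") term }
function ParseExpr: Integer;
var
  Op: Char;
  Rhs: Integer;
begin
  Result := ParseTerm;
  SkipBlanks;
  while Peek in ['+', '-'] do
  begin
    Op := Peek;
    Inc(Cursor);
    Rhs := ParseTerm;
    if Op = '+' then
      Result := Result + Rhs
    else
      Result := Result - Rhs;
    SkipBlanks;
  end;
end;

begin
  Src := '2 + 3 * (4 - 1)';
  Cursor := 1;
  WriteLn(Src, ' = ', ParseExpr);   // prints: 2 + 3 * (4 - 1) = 11
end.
```

This sketch evaluates the expression directly instead of emitting code; a compiler would keep the same structure and replace the arithmetic with code generation.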

MarkMLl


Title: Re: My Win32 Reassembler (prototype)
Post by: mika on December 26, 2020, 07:35:03 pm
Quote from: FlierMate
I have included almost everything in MAKECON.zip, including CONSOLE.asm (assembly language), because its output PE is useful for my learning purposes (reverse-engineering).

You are at the very beginning. My suggestion:
Search the internet for TExeHeader. Replace your ConSec1 and ConSec3 with proper records. Read some documentation about the relevant topics. Play around.
Looking at others' code is nice and whatnot, but true learning happens when you put your own code together. Good luck.
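
As a rough illustration of what "proper records" could mean here, the sketch below declares the DOS ("MZ") header as a packed record. The field layout follows the documented IMAGE_DOS_HEADER structure; the type name TDosHeader, the program name and the e_lfanew value are assumptions made for this example, and it says nothing about what ConSec1 or ConSec3 actually contained.

```pascal
program DosHeaderSketch;
{$mode objfpc}{$H+}

type
  // Field layout of the documented IMAGE_DOS_HEADER structure.
  // "packed" stops the compiler from inserting padding between fields.
  TDosHeader = packed record
    e_magic:    Word;                  // 'MZ' signature, $5A4D
    e_cblp:     Word;
    e_cp:       Word;
    e_crlc:     Word;
    e_cparhdr:  Word;
    e_minalloc: Word;
    e_maxalloc: Word;
    e_ss:       Word;
    e_sp:       Word;
    e_csum:     Word;
    e_ip:       Word;
    e_cs:       Word;
    e_lfarlc:   Word;
    e_ovno:     Word;
    e_res:      array[0..3] of Word;
    e_oemid:    Word;
    e_oeminfo:  Word;
    e_res2:     array[0..9] of Word;
    e_lfanew:   LongInt;               // file offset of the 'PE'#0#0 signature
  end;

var
  Hdr: TDosHeader;
begin
  FillChar(Hdr, SizeOf(Hdr), 0);
  Hdr.e_magic  := $5A4D;
  Hdr.e_lfanew := $80;                 // hypothetical offset, for illustration only

  WriteLn('SizeOf(TDosHeader) = ', SizeOf(TDosHeader));   // 64 bytes
  WriteLn('e_lfanew           = ', Hdr.e_lfanew);
end.
```

With a record like this, the header can be written to a file in one block instead of as a hand-maintained byte array, and each field can be set by name.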

Title: Re: My Win32 Reassembler (prototype)
Post by: FlierMate on December 27, 2020, 12:12:53 pm
Quote from: mika
You are at the very beginning. My suggestion:
Search the internet for TExeHeader. Replace your ConSec1 and ConSec3 with proper records. Read some documentation about the relevant topics. Play around.
Looking at others' code is nice and whatnot, but true learning happens when you put your own code together. Good luck.

Thank you for your feedback. Yes, replacing the DOS stub and PE/COFF header with properly typed records will be the next stage. Your saying "you are at the very beginning" means a lot to me. I will learn more and write better code in the future.   :)
Title: Re: My very simple Win32 Compiler (prototype)
Post by: MarkMLl on December 27, 2020, 12:33:45 pm
A point about record types etc. in this context: they're all very well for output, but for input you will eventually be looking at things character by character.

I don't know how far you've got in your reading, but generally speaking the sequence runs something like this:

* The lexer reads character-by-character, and assembles individual lexemes (identifiers, numbers, quoted strings and so on). It keeps a record of the character sequence it's processed and what hasn't yet been processed and stops when it finds something unexpected, so if necessary an error message can identify the point in a line where things went wrong.

* The parser knows the type of lexeme it's expecting and what to do in response.

* Variable names etc. go into a symbol table when first parsed, and might accumulate properties which are referred to later.

* When the parser (or a separate code generation stage) outputs code, it might refer back to the symbol table (e.g. to decide whether a variable is to be manipulated as a byte or word).

In a simple compiler, the parser will output code (often assembler source) on the fly. In a more complex one it will build a tree representing an expression, and that tree will be optimised and output.

Even if that lot is beyond you for the moment, just remember that it is a mistake to try to apply high-level text processing commands to the input.
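
The lexer stage described above can be sketched as follows. This is only an illustration under stated assumptions: the token kinds, the names NextToken and TakeWhile, and the sample input line are all made up for this example, and the input is not real Satay syntax. The point it shows is reading one character at a time while remembering the column where each lexeme started, so an error message can point at it.

```pascal
program LexerSketch;
{$mode objfpc}{$H+}

uses
  SysUtils;

type
  TTokenKind = (tkIdent, tkNumber, tkString, tkEnd);

  TToken = record
    Kind: TTokenKind;
    Text: string;
    Column: Integer;   // where the lexeme started, for error messages
  end;

const
  KindName: array[TTokenKind] of string = ('ident', 'number', 'string', 'end');

var
  Line: string;
  Cursor: Integer = 1; // index of the next unread character

// Consumes characters for as long as they belong to the given set.
function TakeWhile(const Allowed: TSysCharSet): string;
begin
  Result := '';
  while (Cursor <= Length(Line)) and (Line[Cursor] in Allowed) do
  begin
    Result := Result + Line[Cursor];
    Inc(Cursor);
  end;
end;

function NextToken: TToken;
begin
  TakeWhile([' ']);                 // skip blanks
  Result.Text := '';
  Result.Column := Cursor;
  if Cursor > Length(Line) then
    Result.Kind := tkEnd
  else
    case Line[Cursor] of
      'A'..'Z', 'a'..'z', '_':
        begin
          Result.Kind := tkIdent;
          Result.Text := TakeWhile(['A'..'Z', 'a'..'z', '0'..'9', '_']);
        end;
      '0'..'9':
        begin
          Result.Kind := tkNumber;
          Result.Text := TakeWhile(['0'..'9']);
        end;
      '"':
        begin
          Result.Kind := tkString;
          Inc(Cursor);              // skip the opening quote
          Result.Text := TakeWhile([#0..#255] - ['"']);
          if Cursor > Length(Line) then
            raise Exception.CreateFmt(
              'unterminated string starting at column %d', [Result.Column]);
          Inc(Cursor);              // skip the closing quote
        end;
    else
      raise Exception.CreateFmt('unexpected character at column %d', [Cursor]);
    end;
end;

var
  Tok: TToken;
begin
  Line := 'WriteLine "Hello" 42';   // hypothetical input, not real Satay syntax
  repeat
    Tok := NextToken;
    WriteLn('column ', Tok.Column, ': ', KindName[Tok.Kind], ' ', Tok.Text);
  until Tok.Kind = tkEnd;
end.
```

A parser would call NextToken whenever it needs the next lexeme and decide what to do based on Kind, rather than searching the whole line with string functions.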

MarkMLl



Title: Re: My very simple Win32 Compiler (prototype)
Post by: avra on December 29, 2020, 11:48:01 am
This might interest you: https://wiki.freepascal.org/Make_your_own_compiler,_interpreter,_parser,_or_expression_analyzer
Title: Re: My very simple Win32 Compiler (prototype)
Post by: FlierMate on January 13, 2021, 02:54:40 pm
I'm sorry, all. I think the Pascal source code of my basic compiler was not professional, so I have disowned it and removed it from everywhere.

However, I still keep the EXE, and I have introduced it on the DEV community, which makes the "basic compiler" look more appealing.

I apologize again if I upset any of you.
Title: Re: My very simple Win32 Compiler (prototype)
Post by: FlierMate on January 13, 2021, 02:56:23 pm
Quote from: MarkMLl
......Even if that lot is beyond you for the moment, just remember that it is a mistake to try to apply high-level text processing commands to the input.

MarkMLl

Noted with thanks. It is beyond my level at the moment, but I appreciate your pointing out the mistake in the program.
Title: Re: My very simple Win32 Compiler (prototype)
Post by: MarkMLl on January 13, 2021, 03:11:23 pm
That was intended to be more of a general warning than anything else, but /is/ derived from having done this sort of thing multiple times without the questionable benefit of a computer science background.

MarkMLl
Title: Re: My very simple Win32 Compiler (prototype)
Post by: lucamar on January 13, 2021, 03:51:42 pm
Quote from: FlierMate
[...] I have disowned and removed the source code from everywhere.

Badly done, IMHO. You should own it even as a "failure" and keep it around somewhere, even if just as an example of how things should NOT be done.

Even (or especially) bad code can be a very useful didactic resource: most students will look at it and maybe laugh a little at its clumsiness in public ... but privately they will think "Wow! I was about to do exactly that! Thank the gods I saw it in time!", and it'll at least have served a purpose as an aid in learning.
Title: Re: My Win32 Reassembler (prototype)
Post by: hansotten on January 13, 2021, 04:02:19 pm


Quote from: 440bx
IF (note the big IF) you can find it at a reasonable price, the book Brinch Hansen on Pascal Compilers (by Per Brinch Hansen) might very well be the best introductory text on compilers ever written, but it's out of print and used copies are usually sold at exorbitant prices (don't pay more than $50.00 plus a reasonable shipping fee).

It is here: http://pascal.hansotten.com/per-brinch-hansen/
Title: Re: My very simple Win32 Compiler (prototype)
Post by: FlierMate on January 13, 2021, 04:29:00 pm
Quote from: lucamar
Badly done, IMHO. You should own it even as a "failure" and keep it around somewhere, even if just as an example of how things should NOT be done.

Even (or especially) bad code can be a very useful didactic resource: most students will look at it and maybe laugh a little at its clumsiness in public ... but privately they will think "Wow! I was about to do exactly that! Thank the gods I saw it in time!", and it'll at least have served a purpose as an aid in learning.

Never thought of that, thanks. I have learned the right attitude from you: own it as a failure. Nevertheless, I think I will be able to rewrite the code with minor improvements if I want to.
For example, I will use Cardinal for DWORD, and array [1..2] of Cardinal for a 4-byte address and a 4-byte size. However, I am not sure how the Cardinal will be stored in binary: with the MSB and LSB in reversed order?

Quote from: hansotten
It is here: http://pascal.hansotten.com/per-brinch-hansen/

OMG  :o , you're an expert. I am speechless, the book(s) are free!

Title: Re: My very simple Win32 Compiler (prototype)
Post by: MarkMLl on January 13, 2021, 04:42:05 pm
Quote from: FlierMate
For example, I will use Cardinal for DWORD, and array [1..2] of Cardinal for a 4-byte address and a 4-byte size. However, I am not sure how the Cardinal will be stored in binary: with the MSB and LSB in reversed order?

Stick to native numeric representation until you really need something different, i.e. only do the conversion at input/output time.
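
A small sketch of that advice, under the assumption that the output is a PE file (whose multi-byte fields are little-endian): the value stays an ordinary Cardinal in the program and is converted with the RTL function NtoLE only at the point where its bytes are emitted. The address value and variable names are hypothetical.

```pascal
program CardinalBytesSketch;
{$mode objfpc}{$H+}

var
  Value: Cardinal;
  OnDisk: Cardinal;
  Bytes: array[0..3] of Byte;
  I: Integer;
begin
  Value := $00401000;            // hypothetical address, kept in native form

  // Convert only at output time. NtoLE is a no-op on little-endian CPUs
  // such as x86 and swaps the bytes on big-endian ones.
  OnDisk := NtoLE(Value);
  Move(OnDisk, Bytes, SizeOf(OnDisk));

  // Show the bytes in the order they would appear in the file.
  for I := 0 to 3 do
    Write(HexStr(Bytes[I], 2), ' ');
  WriteLn;                       // prints: 00 10 40 00
end.
```

So on x86 the least significant byte does come first in memory and in the file, which is exactly what the PE format expects, and no manual reversal is needed.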

MarkMLl
Title: Re: My very simple Win32 Compiler (prototype)
Post by: FlierMate on January 13, 2021, 05:11:18 pm
Quote from: MarkMLl
Stick to native numeric representation until you really need something different, i.e. only do the conversion at input/output time.

MarkMLl

 ;D I am grateful for your guidance (tips & tricks) thus far.
Title: Re: My Win32 Reassembler (prototype)
Post by: 440bx on January 13, 2021, 06:57:04 pm
Quote from: hansotten
It is here: http://pascal.hansotten.com/per-brinch-hansen/

You've really done a great job with your website.

I am extremely pleased to see the book Brinch Hansen on Pascal Compilers available for download.  IMO, it's the best starting point.  Anyone interested in compilers should read that book at least twice. :)

Quote from: FlierMate
;D I am grateful for your guidance (tips & tricks) thus far.

Read the Brinch Hansen on Pascal Compilers book, you'll be glad you did (for multiple reasons).
Title: Re: My Win32 Reassembler (prototype)
Post by: MarkMLl on January 13, 2021, 07:49:47 pm
Quote from: hansotten
It is here: http://pascal.hansotten.com/per-brinch-hansen/

That's a particularly good collection. I've just checked and I note that the web page is archived at archive.org, but the PDFs might not be... I don't know whether that's fixable. Alternatively I hope you've got it archived/mirrored somewhere as your professional legacy.

MarkMLl
Title: Re: My very simple Win32 Compiler (prototype)
Post by: valdir.marcos on January 14, 2021, 04:24:17 pm
Quote from: 440bx
IF (note the big IF) you can find it at a reasonable price, the book Brinch Hansen on Pascal Compilers (by Per Brinch Hansen) might very well be the best introductory text on compilers ever written, but it's out of print and used copies are usually sold at exorbitant prices (don't pay more than $50.00 plus a reasonable shipping fee).

Quote from: hansotten
It is here: http://pascal.hansotten.com/per-brinch-hansen/

Quote from: FlierMate
OMG  :o , you're an expert. I am speechless, the book(s) are free!

Quote from: 440bx
You've really done a great job with your website.
I am extremely pleased to see the book Brinch Hansen on Pascal Compilers available for download.  IMO, it's the best starting point.  Anyone interested in compilers should read that book at least twice. :)

Now prices will fall...
Title: Re: My Win32 Reassembler (prototype)
Post by: valdir.marcos on January 14, 2021, 04:25:29 pm
Quote from: hansotten
It is here: http://pascal.hansotten.com/per-brinch-hansen/

Quote from: 440bx
You've really done a great job with your website.

+1

Quote from: 440bx
I am extremely pleased to see the book Brinch Hansen on Pascal Compilers available for download.  IMO, it's the best starting point.  Anyone interested in compilers should read that book at least twice. :)

+1
Title: Re: My simple Win32 Compiler
Post by: FlierMate on January 27, 2021, 09:33:28 pm
CLARIFICATION:


I wish to point out that Satay Compiler may generate EXEs that antivirus software falsely detects as Trojans.

If you are concerned, my original post about this issue is on the official FASM message board: https://board.flatassembler.net/topic.php?t=21786 (External Link)

Please refer to the attached screenshots showing that Text1.txt.EXE was first reported as malware by Windows Defender. After I submitted a report, Microsoft Security Intelligence removed the detection (for that particular EXE only).

The EXE generated by Satay Compiler makes calls to the following Win32 API functions only:
1. GetStdHandle
2. ReadConsoleA
3. WriteConsoleA
4. ExitProcess

It does nothing malicious. It is unclear to me why some AV software reports these canned EXEs as Trojans.
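
For readers who want to see what a program limited to those four calls looks like, here is a minimal Windows-only sketch in Free Pascal. It is an assumption about the general shape of such a program, not the code Satay Compiler actually emits (that is closed-source). The functions are declared directly against kernel32.dll so that nothing beyond the four imports named above is called explicitly; the program name, message text and buffer size are made up.

```pascal
program ConsoleCallsSketch;
{$mode objfpc}{$H+}
{$apptype console}

// Windows only: these are the four kernel32 functions listed above,
// declared by hand so the sketch does not depend on the Windows unit.

const
  STD_INPUT_HANDLE  = DWord(-10);
  STD_OUTPUT_HANDLE = DWord(-11);

function GetStdHandle(nStdHandle: DWord): THandle; stdcall;
  external 'kernel32.dll' name 'GetStdHandle';

function WriteConsoleA(hConsoleOutput: THandle; lpBuffer: Pointer;
  nNumberOfCharsToWrite: DWord; var lpNumberOfCharsWritten: DWord;
  lpReserved: Pointer): LongBool; stdcall;
  external 'kernel32.dll' name 'WriteConsoleA';

function ReadConsoleA(hConsoleInput: THandle; lpBuffer: Pointer;
  nNumberOfCharsToRead: DWord; var lpNumberOfCharsRead: DWord;
  lpReserved: Pointer): LongBool; stdcall;
  external 'kernel32.dll' name 'ReadConsoleA';

procedure ExitProcess(uExitCode: DWord); stdcall;
  external 'kernel32.dll' name 'ExitProcess';

const
  // Same text as the Satay sample at the top of the thread.
  Msg: AnsiString = 'Hello World!'#13#10'Press Enter to quit...'#13#10;

var
  Count: DWord;
  Buf: array[0..255] of AnsiChar;
begin
  // WriteLine ... : write the text to the console's output handle.
  WriteConsoleA(GetStdHandle(STD_OUTPUT_HANDLE), PAnsiChar(Msg),
    Length(Msg), Count, nil);

  // ReadLine: block until the user presses Enter.
  ReadConsoleA(GetStdHandle(STD_INPUT_HANDLE), @Buf, SizeOf(Buf), Count, nil);

  ExitProcess(0);
end.
```

Note that an FPC-built EXE also links the Pascal runtime, so its import table will be larger than that of a hand-assembled PE like the ones discussed here.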

Title: Re: My simple Win32 Compiler
Post by: marcov on January 27, 2021, 09:44:25 pm
Quote from: FlierMate
The EXE generated by Satay Compiler makes calls to the following Win32 API functions only:
1. GetStdHandle
2. ReadConsoleA
3. WriteConsoleA
4. ExitProcess

It does nothing malicious. It is unclear to me why some AV software reports these canned EXEs as Trojans.

It is very simple. Antivirus authors are lazy and reverse the burden of proof, basically denying everything unless proven otherwise.

So assume they simply block everything, and then only start adding exceptions for well known "EXE" signatures. If you think that through, you actually get pretty close to actual antivirus behaviour. It can't be coincidence :-)

So inventing a new kind of EXE signature only invites trouble.
Title: Re: My simple Win32 Compiler
Post by: FlierMate on January 27, 2021, 10:03:38 pm
Quote from: FlierMate
The EXE generated by Satay Compiler makes calls to the following Win32 API functions only:
1. GetStdHandle
2. ReadConsoleA
3. WriteConsoleA
4. ExitProcess

It does nothing malicious. It is unclear to me why some AV software reports these canned EXEs as Trojans.

Quote from: marcov
It is very simple. Antivirus authors are lazy and reverse the burden of proof, basically denying everything unless proven otherwise.

So assume they simply block everything, and then only start adding exceptions for well known "EXE" signatures. If you think that through, you actually get pretty close to actual antivirus behaviour. It can't be coincidence :-)

So inventing a new kind of EXE signature only invites trouble.

You are most probably right, since you are a developer of the Free Pascal Compiler.

 :D   Thank you for the insights.
Title: Re: My simple Win32 Compiler
Post by: FlierMate on March 05, 2021, 01:14:15 pm
Nonetheless, I have hosted this closed-source simple compiler publicly at:

https://sataycompiler.tech/

(Only the IDE itself has Pascal source code)