
Author Topic: Has anyone done a disassembler before?  (Read 2110 times)

anyone

  • Guest
Has anyone done a disassembler before?
« on: October 30, 2020, 03:35:00 am »
Sometimes being ambitious is bad. Even more so if I am to reinvent the wheel.

I was thinking of creating a simple i8086 disassembler in Pascal (i386 CPU instructions are too complex for me to handle). We all know that each opcode corresponds to a hexadecimal value, but the real challenge is decoding the operands.

Example:
Code:
Opcode          Instruction     Clocks          Description
10 /r           ADC eb,rb       2,mem=7         Add with carry byte register into EA byte
11 /r           ADC ew,rw       2,mem=7         Add with carry word register into EA word
12 /r           ADC rb,eb       2,mem=7         Add with carry EA byte into byte register
13 /r           ADC rw,ew       2,mem=7         Add with carry EA word into word register
14 db           ADC AL,db       3               Add with carry immediate byte into AL
15 dw           ADC AX,dw       3               Add with carry immediate word into AX
80 /2 db        ADC eb,db       3,mem=7         Add with carry immediate byte into EA byte
81 /2 dw        ADC ew,dw       3,mem=7         Add with carry immediate word into EA word
83 /2 db        ADC ew,db       3,mem=7         Add with carry immediate byte into EA word

...word or byte register, immediate word or byte, EA word or byte, etc.
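To make the operand problem concrete, here is a rough sketch of how the rows above might be stored as a lookup table in Free Pascal. The names TOperand, TEntry, AdcEntries and FindEntry are invented for illustration, only the nine ADC rows are covered, and a real table would need every instruction.

```pascal
program AdcTable;
{$mode objfpc}{$H+}

type
  { Operand kinds, named after the abbreviations used in the listing. }
  TOperand = (opEb, opEw, opRb, opRw, opAL, opAX, opDb, opDw);

  { One row of the listing (hypothetical layout). }
  TEntry = record
    Opcode: Byte;      { first instruction byte }
    Ext: ShortInt;     { required reg field for "/2" forms, -1 otherwise }
    Dst, Src: TOperand;
  end;

const
  OperandName: array[TOperand] of string =
    ('eb', 'ew', 'rb', 'rw', 'AL', 'AX', 'db', 'dw');

  AdcEntries: array[0..8] of TEntry = (
    (Opcode: $10; Ext: -1; Dst: opEb; Src: opRb),
    (Opcode: $11; Ext: -1; Dst: opEw; Src: opRw),
    (Opcode: $12; Ext: -1; Dst: opRb; Src: opEb),
    (Opcode: $13; Ext: -1; Dst: opRw; Src: opEw),
    (Opcode: $14; Ext: -1; Dst: opAL; Src: opDb),
    (Opcode: $15; Ext: -1; Dst: opAX; Src: opDw),
    (Opcode: $80; Ext:  2; Dst: opEb; Src: opDb),
    (Opcode: $81; Ext:  2; Dst: opEw; Src: opDw),
    (Opcode: $83; Ext:  2; Dst: opEw; Src: opDb)
  );

{ Returns the index of the matching row, or -1 if the bytes are not an ADC.
  RegField is bits 5..3 of the byte after the opcode; it is only compared
  for the "/2" rows and ignored for all others. }
function FindEntry(Opcode, RegField: Byte): Integer;
var
  I: Integer;
begin
  for I := Low(AdcEntries) to High(AdcEntries) do
    if (AdcEntries[I].Opcode = Opcode) and
       ((AdcEntries[I].Ext < 0) or (AdcEntries[I].Ext = RegField)) then
      Exit(I);
  Result := -1;
end;

var
  Idx: Integer;
begin
  { Opcode 80 with reg field 2 selects the "80 /2 db" row. }
  Idx := FindEntry($80, 2);
  if Idx >= 0 then
    WriteLn('ADC ', OperandName[AdcEntries[Idx].Dst], ',',
            OperandName[AdcEntries[Idx].Src]);
end.
```

This prints `ADC eb,db`. The table only identifies which operand kinds follow; turning eb, ew and so on into actual text is the part I am asking about.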

So back to my question: has anyone written a disassembler in Pascal before, or does FPC come with an open-source one?

Thank you everyone.

Awkward

  • Full Member
  • ***
  • Posts: 135
Re: Has anyone done a disassembler before?
« Reply #1 on: October 30, 2020, 05:40:24 am »
I haven't written or used a disassembler myself, but I came across one on the internet: https://github.com/MahdiSafsafi/UnivDisasm

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11452
  • FPC developer.
Re: Has anyone done a disassembler before?
« Reply #2 on: October 30, 2020, 06:09:35 am »
FPC comes with GNU's objdump.

The (Nasm) tables that FPC uses to assemble might also help with the reverse operation.

ccrause

  • Hero Member
  • *****
  • Posts: 856
Re: Has anyone done a disassembler before?
« Reply #3 on: October 30, 2020, 06:32:59 am »
FpDebug has disassemblers for x86 and AVR.

Edit: Fixed x86 link
« Last Edit: October 30, 2020, 10:21:43 am by ccrause »

lucamar

  • Hero Member
  • *****
  • Posts: 4219
Re: Has anyone done a disassembler before?
« Reply #4 on: October 30, 2020, 09:33:03 am »
I did one long, long ago in TP but lost the source in a disk crash and the (paper) notes in a move (lost a couple of boxes chock-full of books and notebooks). :'(

IIRC, though, a disassembler for the 8086 was not very difficult because the opcode/operands (don't rightly remember which) have some bits which tell you which mode was used for the operands: register, direct, indirect, register-indirect, etc.  Any good 8086 assembler reference (like, say, the Intel ones) will explain all that.
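As a sketch of what those bits look like in practice, and assuming the standard 8086 ModR/M layout (mode in bits 7..6, register in bits 5..3, r/m in bits 2..0), the decoding could go roughly like this. DecodeEA, SignedHex and the table names are made up for this example, and there is no bounds checking on the input buffer.

```pascal
program ModRMDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  Reg8: array[0..7] of string = ('AL', 'CL', 'DL', 'BL', 'AH', 'CH', 'DH', 'BH');
  Reg16: array[0..7] of string = ('AX', 'CX', 'DX', 'BX', 'SP', 'BP', 'SI', 'DI');
  { 16-bit addressing forms selected by the r/m field. }
  BaseExpr: array[0..7] of string =
    ('BX+SI', 'BX+DI', 'BP+SI', 'BP+DI', 'SI', 'DI', 'BP', 'BX');

{ Formats a signed displacement such as "-02h" or "+0010h". }
function SignedHex(Value: LongInt; Digits: Integer): string;
begin
  if Value < 0 then
    Result := '-' + IntToHex(-Value, Digits) + 'h'
  else
    Result := '+' + IntToHex(Value, Digits) + 'h';
end;

{ Decodes the ModR/M byte at Code[Offset] plus any displacement and
  advances Offset past them. Wide selects word instead of byte registers. }
function DecodeEA(const Code: array of Byte; var Offset: Integer;
  Wide: Boolean; out RegField: Byte): string;
var
  ModRM, Md, RM: Byte;
  Disp: LongInt;
begin
  ModRM := Code[Offset];
  Inc(Offset);
  Md := ModRM shr 6;
  RegField := (ModRM shr 3) and 7;
  RM := ModRM and 7;
  Result := '';
  if Md = 3 then
  begin
    { Mode 3: the operand is a register, not memory. }
    if Wide then Result := Reg16[RM] else Result := Reg8[RM];
  end
  else if (Md = 0) and (RM = 6) then
  begin
    { Special case: direct 16-bit address instead of [BP]. }
    Disp := Code[Offset] or (LongInt(Code[Offset + 1]) shl 8);
    Inc(Offset, 2);
    Result := '[' + IntToHex(Disp, 4) + 'h]';
  end
  else
  begin
    Result := '[' + BaseExpr[RM];
    case Md of
      1: begin  { 8-bit displacement, sign-extended }
           Disp := ShortInt(Code[Offset]);
           Inc(Offset);
           Result := Result + SignedHex(Disp, 2);
         end;
      2: begin  { 16-bit displacement, shown as signed here }
           Disp := SmallInt(Code[Offset] or (LongInt(Code[Offset + 1]) shl 8));
           Inc(Offset, 2);
           Result := Result + SignedHex(Disp, 4);
         end;
    end;
    Result := Result + ']';
  end;
end;

const
  { 11 46 FE: opcode 11 /r (ADC ew,rw) followed by ModR/M and disp8. }
  Sample: array[0..2] of Byte = ($11, $46, $FE);
var
  Offset: Integer;
  Reg: Byte;
  EA: string;
begin
  Offset := 1;  { skip the opcode byte }
  EA := DecodeEA(Sample, Offset, True, Reg);
  WriteLn('ADC ', EA, ',', Reg16[Reg]);
end.
```

This prints `ADC [BP-02h],AX`: the ModR/M byte 46h splits into mode 1, register 0 (AX) and r/m 6 (BP), and FEh is the displacement -2.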

There are some tricky parts, like dealing with segment prefixes and other such overrides, but it was not very difficult.
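For the prefixes, one possible approach (a sketch, not what my lost code did) is to consume prefix bytes in a loop before looking at the opcode. The byte values are the standard 8086 ones; TPrefixes and ScanPrefixes are hypothetical names.

```pascal
program PrefixDemo;
{$mode objfpc}{$H+}

type
  TPrefixes = record
    Segment: string;  { 'ES', 'CS', 'SS', 'DS', or '' if no override }
    Rep: string;      { 'REP', 'REPNE', or '' }
    Lock: Boolean;
  end;

{ Consumes prefix bytes starting at Code[Offset]; on return Offset
  points at the first byte that is not a prefix (the opcode). }
function ScanPrefixes(const Code: array of Byte; var Offset: Integer): TPrefixes;
begin
  Result.Segment := '';
  Result.Rep := '';
  Result.Lock := False;
  while Offset <= High(Code) do
  begin
    case Code[Offset] of
      $26: Result.Segment := 'ES';
      $2E: Result.Segment := 'CS';
      $36: Result.Segment := 'SS';
      $3E: Result.Segment := 'DS';
      $F0: Result.Lock := True;
      $F2: Result.Rep := 'REPNE';
      $F3: Result.Rep := 'REP';
    else
      Break;  { not a prefix: leave the loop }
    end;
    Inc(Offset);
  end;
end;

const
  { 2E 8B 07: CS override, then 8B /r (MOV rw,ew) with ModR/M 07 = AX,[BX]. }
  Sample: array[0..2] of Byte = ($2E, $8B, $07);
var
  Offset: Integer;
  P: TPrefixes;
begin
  Offset := 0;
  P := ScanPrefixes(Sample, Offset);
  WriteLn('Segment override: ', P.Segment, ', opcode at offset ', Offset);
end.
```

This prints `Segment override: CS, opcode at offset 1`. The recorded override then has to be applied when the memory operand is formatted, which is where it gets fiddly.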
Turbo Pascal 3 CP/M - Amstrad PCW 8256 (512 KB !!!) :P
Lazarus/FPC 2.0.8/3.0.4 & 2.0.12/3.2.0 - 32/64 bits on:
(K|L|X)Ubuntu 12..18, Windows XP, 7, 10 and various DOSes.

PascalDragon

  • Hero Member
  • *****
  • Posts: 5479
  • Compiler Developer
Re: Has anyone done a disassembler before?
« Reply #5 on: October 30, 2020, 09:33:22 am »
Quote from: ccrause on October 30, 2020, 06:32:59 am
FpDebug has disassemblers for x86 and AVR.

You linked the AVR disassembler twice. I take it you wanted to link to this. ;)

440bx

  • Hero Member
  • *****
  • Posts: 4029
Re: Has anyone done a disassembler before?
« Reply #6 on: October 30, 2020, 09:48:32 am »
Quote from: anyone on October 30, 2020, 03:35:00 am
I was thinking of creating a simple i8086 disassembler in Pascal (i386 CPU instructions are too complex for me to handle). We all know that each opcode corresponds to a hexadecimal value, but the real challenge is decoding the operands.
...word or byte register, immediate word or byte, EA word or byte, etc.

So back to my question: has anyone written a disassembler in Pascal before, or does FPC come with an open-source one?

This isn't quite what you asked for, but distorm3 is a very capable disassembler library that comes with source (written in C) that is reasonably easy to understand.  If you're interested in learning from it, you can find it at https://github.com/gdabah/distorm

As @lucamar mentioned above, that source code plus the documentation from Intel or AMD, along with some effort on your part, should be enough to point you in the right direction.

HTH.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

ccrause

  • Hero Member
  • *****
  • Posts: 856
Re: Has anyone done a disassembler before?
« Reply #7 on: October 30, 2020, 10:00:37 am »
Quote from: PascalDragon on October 30, 2020, 09:33:22 am
You linked the AVR disassembler twice. I take it you wanted to link to this. ;)

Yes indeed, thanks Sven.  Trying to write posts on my phone's touch screen is obviously stretching my talents :-[

MarkMLl

  • Hero Member
  • *****
  • Posts: 6686
Re: Has anyone done a disassembler before?
« Reply #8 on: October 30, 2020, 11:22:16 am »
There is also the generic GNU BFD stuff. Writing an opcode/operand decoder is fairly easy in the general case, but these days a disassembler such as IDA or Ghidra will also attempt to work through the control flow of a program and use that to decide which areas of the input are data and which are accessible code. Looking further back, something from the '90s like Sourcer had some of that control flow stuff, but also a lot of inbuilt intelligence which allowed it to decode and annotate e.g. accesses to known areas of the address or I/O space.
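To illustrate the control-flow idea only (this is not how IDA, Ghidra or Sourcer are implemented), here is a toy recursive traversal in Free Pascal. It assumes a tiny subset of 8086 opcodes (NOP, RET, Jcc/JMP short, MOV r16,imm16, CALL/JMP near), and the names Image, IsCode and Trace are invented for the example.

```pascal
program FlowDemo;
{$mode objfpc}{$H+}

const
  ImageSize = 12;
  { 0000: EB 02      JMP short 0004
    0002: DE AD      (data, jumped over)
    0004: B8 34 12   MOV AX,1234h
    0007: 74 01      JZ 000A
    0009: 90         NOP
    000A: C3         RET
    000B: FF         (data, never reached) }
  Image: array[0..ImageSize - 1] of Byte =
    ($EB, $02, $DE, $AD, $B8, $34, $12, $74, $01, $90, $C3, $FF);

var
  IsCode: array[0..ImageSize - 1] of Boolean;

{ Follows execution from Offset, marking every byte that belongs to a
  reachable instruction. Branch targets are traced recursively. }
procedure Trace(Offset: Integer);
var
  Op: Byte;
  Len, Target, I: Integer;
begin
  while (Offset >= 0) and (Offset < ImageSize) and not IsCode[Offset] do
  begin
    Op := Image[Offset];
    case Op of
      $90, $C3: Len := 1;            { NOP, RET }
      $70..$7F, $EB: Len := 2;       { Jcc rel8, JMP rel8 }
      $B8..$BF, $E8, $E9: Len := 3;  { MOV r16,imm16; CALL/JMP rel16 }
    else
      Exit;  { opcode outside the toy subset: give up on this path }
    end;
    if Offset + Len > ImageSize then
      Exit;  { instruction runs past the end of the image }
    for I := Offset to Offset + Len - 1 do
      IsCode[I] := True;
    { Relative targets are measured from the next instruction. }
    Target := -1;
    case Op of
      $70..$7F, $EB:
        Target := Offset + Len + ShortInt(Image[Offset + 1]);
      $E8, $E9:
        Target := Offset + Len +
          SmallInt(Image[Offset + 1] or (LongInt(Image[Offset + 2]) shl 8));
    end;
    if Target >= 0 then
      Trace(Target);
    if Op in [$C3, $EB, $E9] then
      Exit;  { RET and unconditional JMP do not fall through }
    Inc(Offset, Len);
  end;
end;

var
  I: Integer;
begin
  for I := 0 to ImageSize - 1 do
    IsCode[I] := False;
  Trace(0);  { assume execution starts at offset 0 }
  for I := 0 to ImageSize - 1 do
    if IsCode[I] then Write('C') else Write('D');
  WriteLn;
end.
```

This prints `CCDDCCCCCCCD`: the two bytes skipped by the first jump and the trailing byte after RET are classified as data, whereas a plain linear sweep would try to decode them as instructions.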

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

440bx

  • Hero Member
  • *****
  • Posts: 4029
Re: Has anyone done a disassembler before?
« Reply #9 on: October 30, 2020, 05:49:25 pm »
@MarkMLl

Quote from: MarkMLl on October 30, 2020, 11:22:16 am
... these days a disassembler such as IDA or Ghidra will also attempt to work through the control flow of a program and use that to decide which areas of the input are data and which are accessible code.

Indeed, without analyzing the code flow, a disassembler, even one that decodes every byte sequence correctly, will produce a result that leaves something to be desired.

The reason I mentioned distorm is because that library makes it easy to analyze the logic flow during the disassembly, which is absolutely necessary to obtain a genuinely usable result.

 
