Author Topic: PE Dump / PEDUMP utility  (Read 639 times)

440bx

  • Hero Member
  • *****
  • Posts: 1290
PE Dump / PEDUMP utility
« on: September 03, 2019, 12:09:55 am »
Hello,

I decided to share one of my personal utilities.  This utility is a PE dump program that outputs the contents of a - surprise! - PE file, exe and/or dll.

I'll disappoint you upfront: source is _not_ included, nor do I intend to make it available in the near future.

I originally wrote this utility in C because no existing PE file dump program provided the level of detail I needed. After getting somewhat familiar with FPC, I thought it would be a good and simple exercise to port it to Pascal; this is the result.

What follows is _not_ an explanation of the PE file structure, only a short explanation, for those interested, of how to use the program and interpret its output.

To execute the program, simply type:

PeBytesF <executable>

The program always dumps the entire PE file.  There are _no_ switches to control what is dumped; it always dumps _everything_.  Note that in some cases, though rare, the resulting dump can be as large as half a gigabyte.

The output is _very_ detailed.  The program inspects and identifies _every_ byte that is part of the PE specification.

In spite of that, it is reasonably quick, outputting about 110,000 lines per second on an average machine. In most cases it executes in a few seconds but, in some cases (Chrome_child.dll being one of them), it can take anywhere from 20 to 30 seconds depending on how fast the machine is.

What follows are a few examples of its output and what it means.

Quote

     40.0000                - DOS HEADER -

     40.0000         0  [  2]                        DOS signature : 5a4d     (MZ)

     40.0002         2  [  2]           bytes on last page of file :   90     (   144)
     40.0004         4  [  2]                           page count :    3
     40.0006         6  [  2]                    relocations count :    0
     40.0008         8  [  2]         size of header in paragraphs :    4
     40.000a         a  [  2]      minimum extra paragraphs needed :    0
     40.000c         c  [  2]         maximum memory in paragraphs : ffff     (65,535)

     40.000e         e  [  2]                  initial relative SS :    0
     40.0010        10  [  2]                           initial SP :   b8
     40.0012        12  [  2]                             checksum :    0
     40.0014        14  [  2]                           initial IP :    0
     40.0016        16  [  2]                  initial relative CS :    0
In the above,

1. The first column is the virtual address where the field will be located in memory if the exe/dll is loaded at its preferred load address.

2. The second column is the field's file offset.  This file offset can be used to locate and edit the value of the field in a hex editor (one of the main reasons I wrote the utility.)

3. The third column is the field size in bytes.  All fields that are "additional" information, that is, information _not present_ in the PE file are in brackets.

4. Columns 4 (label) and 5 (field value) are self explanatory.

5. Values in parentheses are alternative interpretations of the previous column's value, usually just the hex value shown in decimal. For the DOS signature, though, the alternative interpretation is a string.

Note that the "DOS HEADER" title has a virtual address but _no_ file offset nor size.  This is to make it even more obvious that it is just a header and the text shown on that line is _not_ present in the PE file. 
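For readers who want to reproduce this layout themselves, here is a minimal Python sketch. It is not the utility's source (which is not published); it only mimics the column conventions described above. The names `dotted_hex` and `dump_dos_header` are made up for illustration, the sample bytes are rebuilt from the values in the quote rather than read from a real file, and the load address of 0x400000 is an assumption taken from the first column.

```python
import struct

# Labels for the first twelve 2-byte fields of the DOS header, as shown in the dump.
DOS_FIELDS = [
    "DOS signature",
    "bytes on last page of file",
    "page count",
    "relocations count",
    "size of header in paragraphs",
    "minimum extra paragraphs needed",
    "maximum memory in paragraphs",
    "initial relative SS",
    "initial SP",
    "checksum",
    "initial IP",
    "initial relative CS",
]


def dotted_hex(value):
    """Format a number as hex digits in groups of four, e.g. 0x400000 -> '40.0000'."""
    text = format(value, "x")
    groups = []
    while len(text) > 4:
        groups.insert(0, text[-4:])
        text = text[:-4]
    groups.insert(0, text)
    return ".".join(groups)


def dump_dos_header(data, image_base=0x400000):
    """Return one line per field: virtual address, file offset, size, label, value."""
    values = struct.unpack_from("<12H", data, 0)  # twelve little-endian 16-bit words
    lines = []
    for index, (label, value) in enumerate(zip(DOS_FIELDS, values)):
        offset = index * 2
        address = dotted_hex(image_base + offset)
        lines.append(f"{address:>12} {offset:>9x}  [  2] {label:>36} : {value:>4x}")
    return lines


# Sample header words rebuilt from the values in the quote above (not a real file).
sample = struct.pack("<12H", 0x5A4D, 0x90, 3, 0, 4, 0, 0xFFFF, 0, 0xB8, 0, 0, 0)

for line in dump_dos_header(sample):
    print(line)
```

With a real executable, `data` would simply be the first bytes read from the file in binary mode.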


Other formatting conventions
Quote
    40.0084                 - IMAGE_FILE_HEADER -

    40.0084        84  [2]                              Machine :      8664
    40.0084 +                          IMAGE_FILE_MACHINE_AMD64

    40.0086        86  [2]                   Number of sections :        11    (    17)

    40.0088        88  [4]                      Time Date Stamp : 5ad8.467e    (2018/04/19 7:34:22)

    40.008c        8c  [4]              Pointer to symbol table :  744.cc00
    40.0090        90  [4]                    Number of symbols :      c19e    (49,566)

    40.0094        94  [2]              Size of optional header :        f0    (   240)

    40.0096        96  [2]                      Characteristics :        27
    40.0096 +                        IMAGE_FILE_RELOCS_STRIPPED           1
    40.0096 +                       IMAGE_FILE_EXECUTABLE_IMAGE           2
    40.0096 +                     IMAGE_FILE_LINE_NUMS_STRIPPED           4
    40.0096 +                    IMAGE_FILE_LARGE_ADDRESS_AWARE          20
In the above,

1. If a field's value has a mnemonic, as is the case for the value of the "Machine" field, the mnemonic text _always_ appears _below_ it.

2. The "Characteristics" field is broken into its constituent bit fields and the mnemonic corresponding to each bit is _always_ on a separate line.

Note also that the virtual address of each mnemonic is the same as that of the complete field.  To make it more evident that they are a "piece" of the field above them, each virtual address is followed by a "+" sign.  That indicates those lines are simply interpretations of the line above.
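The bit-field breakdown can be sketched in a few lines of Python. This is an illustration only, not the utility's code; `decode_flags` is a hypothetical helper, and the table lists just a subset of the IMAGE_FILE_* characteristics defined in the PE specification.

```python
# Subset of the IMAGE_FILE_* characteristic bits from the PE specification.
IMAGE_FILE_FLAGS = {
    0x0001: "IMAGE_FILE_RELOCS_STRIPPED",
    0x0002: "IMAGE_FILE_EXECUTABLE_IMAGE",
    0x0004: "IMAGE_FILE_LINE_NUMS_STRIPPED",
    0x0008: "IMAGE_FILE_LOCAL_SYMS_STRIPPED",
    0x0020: "IMAGE_FILE_LARGE_ADDRESS_AWARE",
    0x0100: "IMAGE_FILE_32BIT_MACHINE",
    0x0200: "IMAGE_FILE_DEBUG_STRIPPED",
    0x2000: "IMAGE_FILE_DLL",
}


def decode_flags(value, table):
    """Return (mnemonic, bit) pairs for every known bit set in value, lowest bit first."""
    return [(table[bit], bit) for bit in sorted(table) if value & bit]


# The Characteristics value 27 (hex) from the quote above.
for name, bit in decode_flags(0x27, IMAGE_FILE_FLAGS):
    print(f"+ {name:>34} {bit:>11x}")
```

Each set bit yields one line, which is why the four mnemonics in the quote all share the address of the field they belong to.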

More formatting conventions,
Quote
     40.0098                 - IMAGE_OPTIONAL_HEADER -

     40.0098        98  [2]                                Magic :      20b
     40.0098 +                     IMAGE_NT_OPTIONAL_HDR64_MAGIC
...
...
     40.00a8        a8  [4]                      Entry point rva :     14e0    [Va:     40.14e0] [FO:      ae0] [   .text]
     40.00ac        ac  [4]                   Base of code (rva) :     1000    [Va:     40.1000] [FO:      600] [   .text]
In the above, note the following:

1. Every RVA (relative virtual address) is followed by 3 bracketed fields: the first is its equivalent virtual address, the second is the file offset that corresponds to that virtual address and the third is the name of the section that contains that virtual address.

In the above example, the hex value 14e0 corresponds to a virtual address of 40.14e0 and a file offset of ae0 which is found in the .text section.   This information is shown for every RVA found in the PE file and it is one of the reasons this program's output can occasionally be very large.

Similarly, some PE fields are VAs (virtual addresses) instead of RVAs (as in the case of TLS callbacks). In those cases, the first field is the VA's equivalent RVA.
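The arithmetic behind the three bracketed fields can be sketched as follows. This is a hedged illustration, not the utility's implementation: `Section` and `resolve_rva` are hypothetical names, the sizes given for .text are assumed (the quotes do not show them), and RVAs that fall inside the headers are not handled.

```python
from collections import namedtuple

Section = namedtuple("Section", "name virtual_address virtual_size raw_pointer raw_size")

# .text: RVA 1000 and file offset 600 come from the "Base of code" line above;
#        both sizes are assumptions made for this example.
# /4:    all four values come from the section header quoted later in the post.
SECTIONS = [
    Section(".text", 0x1000, 0x2000, 0x600, 0x2000),
    Section("/4", 0x74A000, 0x85A0, 0x725000, 0x8600),
]


def resolve_rva(rva, image_base, sections):
    """Return (virtual address, file offset or None, section name or None) for an RVA."""
    for section in sections:
        span = max(section.virtual_size, section.raw_size)
        if section.virtual_address <= rva < section.virtual_address + span:
            delta = rva - section.virtual_address
            # Bytes past the raw data exist only in memory, so they have no file offset.
            file_offset = section.raw_pointer + delta if delta < section.raw_size else None
            return image_base + rva, file_offset, section.name
    return image_base + rva, None, None


va, fo, name = resolve_rva(0x14E0, 0x400000, SECTIONS)
print(f"[Va: {va:x}] [FO: {fo:x}] [{name}]")
```

For the entry point, 14e0 - 1000 + 600 gives the file offset ae0 shown in the quote, and 400000 + 14e0 gives the virtual address 40.14e0.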

and more....
Quote
    40.02f0                SECTION: 10 of 17

    40.02f0       2f0  [8]                                 Name : /4
             752.6920  [15]                                       .debug_aranges

    40.02f8       2f8  [4]                         Virtual size :      85a0    (34,208)

    40.02fc       2fc  [4]           (Relative) Virtual address :   74.a000

    40.0300       300  [4]                     Size of raw data :      8600    (34,304)

    40.0304       304  [4]      File offset/pointer to raw data :   72.5000
    40.0308       308  [4]   File offset/pointer to relocations :         0
    40.030c       30c  [4]   File offset/pointer to linenumbers :         0

    40.0310       310  [2]                Number of relocations :         0
    40.0312       312  [2]                Number of linenumbers :         0

    40.0314       314  [4]                      Characteristics : 4250.0040
    40.0314 +                          SCN_CNT_INITIALIZED_DATA          40
    40.0314 +                                 SCN_ALIGN_16BYTES     5
    40.0314 +                               SCN_MEM_DISCARDABLE    200.0000
    40.0314 +                                      SCN_MEM_READ   4000.0000
In the above,

Note that the DWARF debug section name is shown under the "Name" field _but_ no virtual address is shown, only a file offset.  This indicates a value (in this case the name of the DWARF debug section) that is _not_ mapped by the Windows loader.  IOW, if a program wants to read that value, it will have to load that part of the file _itself_, which means it will be loaded somewhere/anywhere in the virtual address space, thus making the value of the virtual address impossible to predict by just inspecting the PE file.

This occurs with debugging information and also with security certificates.
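For those wondering where the file offset 752.6920 comes from: a section name of the form "/n" is an offset into the COFF string table, which starts right after the symbol table (each symbol record is 18 bytes). The sketch below shows that lookup; it is an illustration under those assumptions, not the utility's code. `string_table_offset` and `resolve_section_name` are hypothetical names, and the tiny in-memory image is fabricated purely so the example runs without a real file.

```python
import struct

SYMBOL_SIZE = 18  # size in bytes of one COFF symbol table record


def string_table_offset(pointer_to_symbol_table, number_of_symbols):
    """File offset of the COFF string table, which follows the symbol table."""
    return pointer_to_symbol_table + number_of_symbols * SYMBOL_SIZE


def resolve_section_name(raw_name, data, pointer_to_symbol_table, number_of_symbols):
    """Return (name, file offset of the long name or None) for an 8-byte Name field."""
    name = raw_name.rstrip(b"\0").decode("ascii")
    if not name.startswith("/"):
        return name, None  # short name stored directly in the section header
    start = string_table_offset(pointer_to_symbol_table, number_of_symbols) + int(name[1:])
    end = data.index(b"\0", start)  # long names are NUL-terminated
    return data[start:end].decode("ascii"), start


# Values from the file header quoted earlier: symbol table at 744.cc00, c19e symbols.
print(hex(string_table_offset(0x744CC00, 0xC19E) + 4))

# Fabricated miniature "file": 16 filler bytes, one empty symbol record, then a
# string table (4-byte length followed by the NUL-terminated strings).
strings = b".debug_aranges\0"
data = bytes(16) + bytes(SYMBOL_SIZE) + struct.pack("<I", 4 + len(strings)) + strings
print(resolve_section_name(b"/4\0\0\0\0\0\0", data, 16, 1))
```

Plugging in the header values gives 744.cc00 + c19e * 12 (hex) + 4 = 752.6920, the offset shown under the "Name" field, and the [15] size is the 14 characters of the name plus its terminating NUL.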


other important considerations:

Most PE files have a hint name table and an import address table.  Those tables, in the PE file, are usually but not always identical.  When the tables are identical, the program outputs the Hint-Name fields twice: once for the Hint Name Table and a second time for the Import Address Table.

There are some exceptions, among them:

1. The Borland linker does not populate the hint name table.  As a result, for Borland executables, only an Import Address Table is shown.

2. If the executable is bound then the Import Address Table consists of virtual addresses instead of references to hint names.

The second case is very easy to recognize; it appears as follows:
Quote

  78ca.bc94             - IMPORTS DIRECTORY - number of import descriptors : 4

  78ca.bc94                 IMPORT descriptor :  1  -  Library : ntdll.dll

  78ca.bc94  8.a694  [ 4]            Hint name table :    8.bd10    [Va:   78ca.bd10] [FO: 8.a710] [  .rdata]

  78ca.bc98  8.a698  [ 4]            Time Date Stamp : ffff.ffff
  78ca.bc9c  8.a69c  [ 4]            Forwarder chain : ffff.ffff

  78ca.bca0  8.a6a0  [ 4]           Library name rva :    8.bd00    [Va:   78ca.bd00] [FO: 8.a700] [  .rdata]
  78ca.bd00  8.a700  [10]       Imports library name : ntdll.dll

  78ca.bca4  8.a6a4  [ 4]       Import address table :    8.2000    [Va:   78ca.2000] [FO: 8.0a00] [  .rdata]


  78ca.bd10                 HINT NAME TABLE      : 91 entries

  78ca.bd10  8.a710  [8]             8.c740    [Va:   78ca.c740] [FO: 8.b140] [  .rdata]
  78ca.c740                                                           8.b140  [ 2]        Hint: 34b
  78ca.c742                                                           8.b142  [12]        Name: RtlFreeHeap

  78ca.bd18  8.a718  [8]             8.c74e    [Va:   78ca.c74e] [FO: 8.b14e] [  .rdata]
  78ca.c74e                                                           8.b14e  [ 2]        Hint: 79f
  78ca.c750                                                           8.b150  [11]        Name: swprintf_s

  78ca.bd20  8.a720  [8]             8.c75c    [Va:   78ca.c75c] [FO: 8.b15c] [  .rdata]
  78ca.c75c                                                           8.b15c  [ 2]        Hint: 3a3
  78ca.c75e                                                           8.b15e  [21]        Name: RtlInitUnicodeString
...
...
more hint name table entries
...
...
   78ca.2000                 IMPORT ADDRESS TABLE : 91 entries

   78ca.2000  8.0a00  [8]          78ea.3200
   78ca.2008  8.0a08  [8]          78e9.7350
   78ca.2010  8.0a10  [8]          78ea.5280
   78ca.2018  8.0a18  [8]          78e9.15b0
   78ca.2020  8.0a20  [8]          78e7.7f70
   78ca.2028  8.0a28  [8]          78ea.1430
   78ca.2030  8.0a30  [8]          78ea.1400
   78ca.2038  8.0a38  [8]          78ea.1630
   78ca.2040  8.0a40  [8]          78ea.1480
   78ca.2048  8.0a48  [8]          78e7.6eac
   78ca.2050  8.0a50  [8]          78ed.f370
   78ca.2058  8.0a58  [8]          78ea.3000
   78ca.2060  8.0a60  [8]          78ea.2fc0
   78ca.2068  8.0a68  [8]          78ea.4d40
In the above, when both the Hint Name Table and the Import Address Table are present and the executable is not bound, the tables hold the same information.  When the executable is bound, the Hint Name Table points to Hint-Name pairs while the Import Address Table contains pointers to where the imported functions are expected to be found.
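The three cases described above can be told apart with a simple comparison. The sketch below is only a heuristic that mirrors this description; it is not how the utility or the Windows loader decides, and `classify_import_tables` is a hypothetical name. The table entries are the first three values of each table in the quote.

```python
def classify_import_tables(hint_name_table, import_address_table):
    """Classify an import descriptor from the raw entries of its two tables."""
    if not hint_name_table:
        # e.g. Borland-linked executables: only the Import Address Table is populated
        return "no hint name table"
    if hint_name_table == import_address_table:
        # both tables still hold the same references to Hint-Name pairs
        return "not bound"
    # the Import Address Table has been overwritten with function addresses
    return "bound"


# First three entries of each table from the ntdll.dll descriptor in the quote above.
hint_name_table = [0x8C740, 0x8C74E, 0x8C75C]
import_address_table = [0x78EA3200, 0x78E97350, 0x78EA5280]

print(classify_import_tables(hint_name_table, import_address_table))
```

In the quoted dump the Hint Name Table holds RVAs inside .rdata while the Import Address Table holds addresses in the 78e7.xxxx to 78ed.xxxx range, so the two lists differ and the descriptor is reported as bound.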

Since this post may be reaching the limit of what the forum software allows, additional information will continue in posts after this one.

Download and enjoy.

PS: I would like this program/utility to be available ONLY as an attachment to this post.  IOW, don't re-post it anywhere else.  Thank you.

« Last Edit: October 02, 2019, 08:28:12 am by 440bx »
using FPC v3.0.4 and Lazarus 1.8.2 on Windows 7 64bit.

skalogryz

  • Global Moderator
  • Hero Member
  • *****
  • Posts: 2291
    • havefunsoft.com
Re: PE Dump utility
« Reply #1 on: September 03, 2019, 04:02:50 am »
It's a bit odd:
Quote
I decided to share one of my personal utilities.
...
PS: I would like this program/utility to be available ONLY as an attachment to this post.  IOW, don't re-post it anywhere else.  Thank you.

Overall, currently supplied (with fpc/lazarus) objdump.exe provides all the pieces of information.
Likely it didn't do when you started the tool, but these days...

objdump.exe has at least one advantage over your tool - it can be distributed :)
« Last Edit: September 03, 2019, 04:10:35 am by skalogryz »
Patron Cocoa Widgetset development https://www.patreon.com/skalogryz

440bx

Re: PE Dump utility
« Reply #2 on: September 03, 2019, 05:07:01 am »
Quote
it's a bit odd:
I decided to share one of my personal utilities.
...
PS: I would like this program/utility to be available ONLY as an attachment to this post.  IOW, don't re-post it anywhere else.  Thank you.

I don't see it as odd. My primary intention is to share it with FPC users.

Quote
Overall, currently supplied (with fpc/lazarus) objdump.exe provides all the pieces of information.
Likely it didn't do when you started the tool, but these days...

objdump does not come _remotely_ close to the level of detail this program outputs.  That said, objdump does dump the DWARF/stabs debug sections, which this utility doesn't; this utility only outputs the COFF symbols, if present.  For Windows PE files on x86 (32 and 64 bit), that is the _only_ feature objdump has over this program.  I didn't include it because I don't need it (never had to manually edit debug symbols.)

In addition to that, objdump is _very_ slow and has a number of bugs this program does not have.  Not to mention that the output of objdump is so poorly formatted that it is close to incomprehensible, and it cannot be used as a roadmap to edit a PE file: way too much information is missing (no file offsets, no field sizes, tables are not output in raw format, only "cooked", making it unusable for hex editing.)

Quote
objdump.exe has at least one advantage over your tool - it can be distributed :)

That much is true but, anyone who wants this tool can come get it here.

That said, while I expected the number of downloads to be quite low since it is a somewhat specialized utility, I am a little bit surprised that, as I write this, not even one download so far.  It surprises me that so few people are interested in understanding the internal format of their programs.

It's there for whoever wants it and wants to use it to find out what's in their PE files or, to learn about the PE format (x86, 32 and 64 bit only.)



skalogryz

Re: PE Dump utility
« Reply #3 on: September 03, 2019, 05:44:32 am »
Quote
That said, while I expected the number of downloads to be quite low since it is a somewhat specialized utility, I am a little bit surprised that, as I write this, not even one download so far.  It surprises me that so few people are interested in understanding the internal format of their programs.

You have to wait 24 hours before drawing any conclusions. FPC users are in different time zones.

Besides that, can you name any FPC related utility that comes without sources? and/or cannot be freely distributed?

440bx

Re: PE Dump utility
« Reply #4 on: September 03, 2019, 06:12:23 am »
Quote
Besides that, can you name any FPC related utility that comes without sources? and/or cannot be freely distributed?

I readily concede those points.  My intention is to make the utility available to anyone who needs it, not to provide the source for it, which would open it to being "bastardized" and/or "cannibalized".

As far as its distribution, The main reason I don't want it distributed anywhere else is, because the copy attached to the forum post is the _only_ copy I can reasonably expect to be the original, unadulterated, copy.  IOW, it is a reasonable attempt to protect potential users and, for that protection to be effective, the potential user should get it from the first post's attachment.

skalogryz

Re: PE Dump utility
« Reply #5 on: September 03, 2019, 07:10:43 am »
Quote
As far as its distribution, The main reason I don't want it distributed anywhere else is, because the copy attached to the forum post is the _only_ copy I can reasonably expect to be the original, unadulterated, copy.  IOW, it is a reasonable attempt to protect potential users and, for that protection to be effective, the potential user should get it from the first post's attachment.

have you considered Code Signing?

440bx

Re: PE Dump utility
« Reply #6 on: September 03, 2019, 07:33:01 am »
Quote
have you considered Code Signing?

That's a reasonable suggestion.  In the case of this utility, I simply want to make it available; someone who downloads responsibly should simply get it from the first post in this thread. That's the "signing".