
Author Topic: [SOLVED] Reading file in 32 bytes chunks; then XOR compare 4 byte portions  (Read 11392 times)

Gizmo

  • Hero Member
  • *****
  • Posts: 831
I need to code something to search for keys in a raw binary data file.

I need to read 32 bytes of data in sequences throughout a whole file. Starting from the END of that 32 byte sequence, I set my start position as the last 4 bytes. I then have to XOR the 4 bytes BEFORE them with the 8th set of 4 bytes before it, so Wi = W-i XOR W-8.

If the XORd result equals the first 4 bytes examined, then slide to the next 4 bytes and repeat until there are 3 XOR matches in a row. When 3 successive matches are found (out of a potential eight), the entire 32 byte sequence is likely to be an AES key and needs to be copied out as a hex string.

What techniques and functions would people use if they wanted to read a file in sequences of 32 bytes, comparing 4 byte chunks in this way? Endianness also needs to be considered. It strikes me as one of those things that shouldn't be that hard, but I spent 4 hours on it last night and ended up no further forward than when I started!

Thanks
« Last Edit: June 14, 2013, 12:20:12 am by Gizmo »

User137

  • Hero Member
  • *****
  • Posts: 1791
    • Nxpascal home
You can use TFileStream and static byte arrays. For example:
Code: [Select]
var
  data: array[0..27] of byte;
  compare, sums: dword;
...
  fs.Read(data, 28);
  fs.Read(compare, 4);
  sums:=0;
  // Start adding them up, and lastly check "if compare=sums then"...
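
A minimal sketch of the TFileStream approach, reading the file one 32-byte block at a time. The file name `blocks.bin` and the test data written to it are made up for the demo; the only assumption about the real file is that it is read in fixed 32-byte steps, as described in the question.

```pascal
program ReadBlocksDemo;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;
type
  TBlock = array[0..7] of DWord;  // 8 x 4 bytes = 32 bytes
var
  fs: TFileStream;
  Block: TBlock;
  i, Count: Integer;
begin
  // Write a small hypothetical test file: 3 blocks of 32 bytes each
  fs := TFileStream.Create('blocks.bin', fmCreate);
  try
    for Count := 1 to 3 do
    begin
      for i := 0 to 7 do
        Block[i] := Count * 16 + i;
      fs.WriteBuffer(Block, SizeOf(Block));
    end;
  finally
    fs.Free;
  end;

  Count := 0;
  fs := TFileStream.Create('blocks.bin', fmOpenRead or fmShareDenyWrite);
  try
    // Read returns the number of bytes actually read, so a short
    // (partial) block at the end of the file stops the loop
    while fs.Read(Block, SizeOf(Block)) = SizeOf(Block) do
    begin
      Inc(Count);
      WriteLn('block ', Count, ' first dword: ', IntToHex(Block[0], 8));
    end;
  finally
    fs.Free;
  end;
end.
```

Using `Read` and checking its return value avoids the exception that `ReadBuffer` raises when the file size is not a multiple of the block size.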

Gizmo

  • Hero Member
  • *****
  • Posts: 831
I've got this, which is a useful way of getting all 4 byte values in a form that I can then say "XOR that with that" :

Code: [Select]
function TForm1.Finder(SourceFile:TFileStream):String;
var
  i, Value, BinaryBytesRead: Integer;
  TotalBytesRead, SizeOfSourceFile : Int64;
  SL : TStringList;
begin
TotalBytesRead := 0;
SL := TStringList.Create;
try
  SourceFile.Position := 0;
  SizeOfSourceFile := Sourcefile.Size;
  while TotalBytesRead < SourceFile.Size do
    begin
      SourceFile.ReadBuffer(Value, SizeOf(Value));//read a 4 byte integer
      inc(TotalBytesRead, SizeOf(Value));
      SL.Add(IntToHex(SwapEndian(Value), 4));
    end;
finally
  SourceFile.Free;
  for i := 0 to SL.Count -1 do
    begin
    Memo1.Lines.Add(SL[i]);
    end;
  SL.Free;
end;
end;     

but using a StringList is obviously very slow and memory intensive, and on large files (several GB) it will take hours!

What would be an equally convenient (i.e. not too confusing) way of storing these values like this (one 4 byte value at a time) but in a more memory efficient way?

Thanks

ludob

  • Hero Member
  • *****
  • Posts: 1173
Define a type that best matches your 32 bytes. For example:
Code: [Select]
TBlock=array [1..8] of qword;
Define your file as a file of TBlock:
Code: [Select]
MyFile:file of TBlock;
and just start reading your file:
Code: [Select]
type
  TBlock=array [1..8] of qword;
var
  Block: TBlock;
  MyFile:file of TBlock;
begin
  assignfile(MyFile,'myfiletotest');
  reset(MyFile);
  while not eof(MyFile) do
    begin
    read(MyFile,Block);
    //test your block
    end;
end.

When working with the block, just address your 4 byte parts as
Code: [Select]
Block[i].

Quote from: Gizmo
Endianness also needs to be considered.

Why? Neither XOR nor comparison cares about endianness. The algorithm as you described it (and as I understood it) is not affected by differences in endianness.
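
A small sketch of why byte order does not affect the match test: XOR works bit by bit, so swapping the bytes before or after the XOR gives the same result. The two values are the X-1 and X-8 dwords from the example given later in the thread; `SwapEndian` is the standard Free Pascal routine already used in the code above. Byte order only matters when the values are printed as hex.

```pascal
program XorEndianDemo;
{$mode objfpc}{$H+}
uses
  SysUtils;
var
  a, b, c: DWord;
begin
  a := $FE2EF993;
  b := $63E5C10B;
  c := a xor b;
  // Swapping bytes and XORing can be done in either order
  if SwapEndian(c) = (SwapEndian(a) xor SwapEndian(b)) then
    WriteLn('same result in either byte order')
  else
    WriteLn('mismatch');
  WriteLn(IntToHex(c, 8));  // prints 9DCB3898
end.
```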
« Last Edit: June 12, 2013, 07:50:54 am by ludob »

taazz

  • Hero Member
  • *****
  • Posts: 5368
I think you mean
Code: [Select]
type
  TBlock=array [1..8] of DWord;
instead of QWord otherwise the block is 64 bytes long not 32.
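
A quick way to check the block size, as a minimal sketch (the type names `TQBlock` and `TDBlock` are made up for the comparison):

```pascal
program BlockSizeDemo;
{$mode objfpc}{$H+}
type
  TQBlock = array[1..8] of QWord;  // 8 x 8 bytes
  TDBlock = array[1..8] of DWord;  // 8 x 4 bytes
begin
  WriteLn('8 x QWord: ', SizeOf(TQBlock), ' bytes');  // 64
  WriteLn('8 x DWord: ', SizeOf(TDBlock), ' bytes');  // 32
end.
```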
Good judgement is the result of experience … Experience is the result of bad judgement.

OS : Windows 7 64 bit
Laz: Lazarus 1.4.4 FPC 2.6.4 i386-win32-win32/win64

User137

  • Hero Member
  • *****
  • Posts: 1791
    • Nxpascal home
Code: [Select]
var
  i, Value, BinaryBytesRead: Integer;
...
      SourceFile.ReadBuffer(Value, SizeOf(Value));//read a 4 byte integer
You should be careful with the Integer type. Integer is not only signed, it is also only 2 bytes with some compilers and compiler options. If you need a 4-byte signed integer, use the LongInt type. I only mention it because your code specifically expects 4 bytes and only works with that size.
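
A minimal sketch for checking the sizes on your own setup. It assumes Free Pascal; the `{$mode objfpc}` directive matters, because Integer is 4 bytes in ObjFPC and Delphi modes but 2 bytes in FPC's default and TP modes.

```pascal
program IntSizeDemo;
{$mode objfpc}{$H+}
begin
  // Integer depends on the compiler mode; LongInt and DWord do not
  WriteLn('Integer : ', SizeOf(Integer), ' bytes');
  WriteLn('LongInt : ', SizeOf(LongInt), ' bytes');  // 4, signed
  WriteLn('DWord   : ', SizeOf(DWord), ' bytes');    // 4, unsigned
end.
```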
« Last Edit: June 12, 2013, 11:15:34 am by User137 »

poiuyt555

  • Jr. Member
  • **
  • Posts: 91
Hi.
I don't want to start a new topic for my question..
But how do I get the file contents as shown in hex editors (WinHex etc.) using TFileStream?
Like this:
Code: [Select]
var
  data: array of byte;
...
  SetLength(data,fs.size);
  fs.Read(data, SizeOf(data));
And then convert each data[i] to hex and display it?

And which type must we use: only array of byte (not integer or word)?
Will it work correctly in all cases?
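
A minimal sketch of reading a whole file into a dynamic byte array and printing it as hex. Note that with a dynamic array, `fs.Read(data, SizeOf(data))` passes the array reference and the size of a pointer, so the sketch passes `data[0]` and `Length(data)` instead. The file name `sample.bin` and its four bytes are made up for the demo.

```pascal
program HexDumpDemo;
{$mode objfpc}{$H+}
uses
  Classes, SysUtils;
const
  FileName = 'sample.bin';  // hypothetical file name used for this demo
  Sample: array[0..3] of Byte = ($DE, $AD, $BE, $EF);
var
  fs: TFileStream;
  data: array of Byte;
  i: Integer;
begin
  // Create a tiny test file so the example is self-contained
  fs := TFileStream.Create(FileName, fmCreate);
  try
    fs.WriteBuffer(Sample, SizeOf(Sample));
  finally
    fs.Free;
  end;

  fs := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(data, fs.Size);
    if Length(data) > 0 then
      fs.ReadBuffer(data[0], Length(data));  // first element and element count
  finally
    fs.Free;
  end;

  // Print each byte as two hex digits: DE AD BE EF
  for i := 0 to High(data) do
    Write(IntToHex(data[i], 2), ' ');
  WriteLn;
end.
```

A byte array shows the file exactly as a hex editor does. Loading the whole file only suits small files; for large ones, reading in fixed-size chunks is the safer choice.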

ludob

  • Hero Member
  • *****
  • Posts: 1173
Quote from: taazz
I think you mean
Code: [Select]
type
  TBlock=array [1..8] of DWord;
instead of QWord otherwise the block is 64 bytes long not 32.
You think right  ;)

Thanks.

taazz

  • Hero Member
  • *****
  • Posts: 5368
Quote from: poiuyt555
Hi.
I don't want to start a new topic for my question..

Sorry, you have to start a new topic. Although I understand that a new poster might not feel comfortable starting their own topics at first, what you did is called thread hijacking and is considered rude. In any case, search for tkweb and TKHexEditor or something along those lines; it's a hex viewer component similar to the ones you find in those hex editors. That should be a good start.

Gizmo

  • Hero Member
  • *****
  • Posts: 831
Thanks to Ludob, I have got this far. This compiles and runs, but the XOR comparison never returns true. I suspect this is because not all the dwords required for the XOR are in the buffer at the same time: the first dword of the key has to be compared against the dword 1 position before it XORed with the dword 8 positions before it, and those lie outside the block. But I don't really know. This is too advanced for my skill level, I feel!:

Code: [Select]
type
  TBlock=array [0..7] of dword;    // a 32 byte array
var
  Block: TBlock;
  SourceFile: file of TBlock;
  i : Integer;
  SL : TStringList;

begin
  i := 0;
  SL := TStringList.Create;

  assignfile(SourceFile, SourceFileName);
  reset(SourceFile);
  while not eof(SourceFile) do
    begin
      read(SourceFile,Block);

      for i := 0 to 7 do
        begin
         if Block[i] > 0 then                               // ignore zero sized blocks
           if Block[i] = Block[i - 1] XOR Block[i - 7] then // Xi = Xi-1 XOR Xi-8
             begin
               SL.Add(IntToHex(SwapEndian(Block[i]), 4));   // If the XOR result is true, keep it
             end;
        end;
    end;
  CloseFile(SourceFile);

  // Display the list in the stringlist
  for i := 0 to SL.Count -1 do
    begin
    Memo1.Lines.Add(SL[i]);
    end;
  SL.Free;
end;

I have attached a picture that shows my aim. The highlighted area is a 32 byte AES key. As you can see, some of the byte regions required are outside of this 32-byte region. e.g. 9DCB3898 (X) = FE2EF993 (X-1) XOR 63E5C10B (X-8)
« Last Edit: June 13, 2013, 12:36:02 am by Gizmo »

Jurassic Pork

  • Hero Member
  • *****
  • Posts: 1290
Hello,
you have an error in your code:
Code: [Select]
if Block[i] = Block[i - 1] XOR Block[i - 7] then // Xi = Xi-1 XOR Xi-8
It should be:
Code: [Select]
if Block[i] = Block[i - 1] XOR Block[i - 8] then // Xi = Xi-1 XOR Xi-8
Otherwise, try reading the source file dword by dword into a dynamic array, like this:
Code: [Select]
type
  TBlock= dword;    // 32 byte
var
  Block: TBlock;
  SourceFile: file of TBlock;
  i : Integer;
  SL : TStringList;
  ArrayBlock : Array of TBlock;
begin
  i := 0;
  SL := TStringList.Create;

  assignfile(SourceFile, 'f:\block.txt');
  reset(SourceFile);
  SetLength(ArrayBlock,Filesize(SourceFile) + 1);
  while not eof(SourceFile) do
    begin
      read(SourceFile,Block);
      ArrayBlock[i] := Block;
      inc(i);
    end;

      for i := 8 to Length(ArrayBlock) do
        begin
         if ArrayBlock[i] > 0 then                               // ignore zero sized blocks
           if ArrayBlock[i] = ArrayBlock[i - 1] XOR ArrayBlock[i - 8] then // Xi = Xi-1 XOR Xi-8
             begin
               SL.Add(IntToHex(SwapEndian(ArrayBlock[i]), 4));   // If the XOR result is true, keep it
             end;
        end;


  ArrayBlock := nil;
  CloseFile(SourceFile);

  // Display the list in the stringlist
  for i := 0 to SL.Count -1 do
    begin
    Memo1.Lines.Add(SL[i]);
    end;
  SL.Free;
end;                       

Friendly, J.P
Jurassic computer : Sinclair ZX81 - Zilog Z80A à 3,25 MHz - RAM 1 Ko - ROM 8 Ko

User137

  • Hero Member
  • *****
  • Posts: 1791
    • Nxpascal home
You have several errors there, Jurassic Pork. To name a few, the for-loop uses wrong array indices, and he specifically intended to read the file 32 bytes at a time. The original code is honestly closer to right.

Quote from: Gizmo
I need to read 32 bytes of data in sequences throughout a whole file. Starting from the END of that 32 byte sequence, I set my start position as the last 4 bytes. I then have to XOR the 4 bytes BEFORE them with the 8th set of 4 bytes before it, so Wi = W-i XOR W-8.

If the XORd result equals the first 4 bytes examined, then slide to the next 4 bytes and repeat until there are 3 XOR matches in a row. When 3 successive matches are found (out of a potential eight), the entire 32 byte sequence is likely to be an AES key and needs to be copied out as a hex string.

Did you make these up or do you have a source?

I have problems understanding this explanation. How are there 8 potential matches in a 32-byte data sequence? There are 8 blocks of 4 bytes, so you can only compare 7 of them to the last block. Even if the first 7 DWords match the 8th one, it would mean they are all the same number. That wouldn't make any sense. It also means that you probably can't stop the calculation in the middle and say it will probably match. If it is what I'd guess it is, a sort of checksum of the beginning data part, then you would need to calculate the sum of all of the XORs. But I don't want to go guessing  :)
Code: [Select]
W[i] = W[-i] XOR W[-8] ... what?
edit: I saw the screenshot. So you need to actually read 40 bytes? Then
Code: [Select]
type
  TBlock=array [0..9] of dword;    // a 32+8 = 40 byte array

X = block[9], and X-1 = block[8]

Further on, did you mean it like this?
Code: [Select]
       for i := 0 to 7 do
       begin
         if Block[i] > 0 then                               // ignore zero sized blocks
           if Block[9] = Block[8] XOR Block[i] then
           begin
             SL.Add(IntToHex(SwapEndian(Block[i]), 4));   // If the XOR result is true, keep it
           end;
       end;
« Last Edit: June 13, 2013, 04:15:01 am by User137 »

Jurassic Pork

  • Hero Member
  • *****
  • Posts: 1290
You are right, User137. My code was just a quick attempt (made too quickly) and my brain is in standby mode  :-[


Gizmo

  • Hero Member
  • *****
  • Posts: 831
User137....as usual, you're correct!

Don't ask me what I was thinking with
Code: [Select]
W[i] = W[-i] XOR W[-8]

Of course, it is

Code: [Select]
if Block[9] = Block[8] XOR Block[i] then

There's some tweaking to do to get it to cycle backwards. As it stands, it only computes Block[9]; I need it to check block 8, then block 7 and so on, each time checking whether the 4-byte sequence behind it XORed with the 8th 4-byte sequence behind it matches. I realise the code was as per the screenshot, but it needs to repeat that all the way through. Still, I think I have been helped enough to mark this as solved. Thanks again gents for your help... I'd never have worked this out on my own.
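
One possible way to repeat the check all the way through, as a rough sketch rather than a finished key finder: treat the data as one long array of dwords and test X[i] = X[i-1] XOR X[i-8] at every position, counting matches in a row. The function name `FindXorRun`, the threshold constant and the test data are all made up for illustration; the first and last seed values are the X-8 and X-1 dwords from the screenshot example, and the zero check mirrors the one in the code above.

```pascal
program KeyScheduleScan;
{$mode objfpc}{$H+}
uses
  SysUtils;

const
  RunNeeded = 3;  // matches in a row required, as described in the thread
  // Hypothetical test data: X-8 and X-1 from the screenshot, filler in between
  Seed: array[0..7] of DWord = ($63E5C10B, $11111111, $22222222, $33333333,
                                $44444444, $55555555, $66666666, $FE2EF993);

// Returns the index of the dword that completes the first run of RunNeeded
// consecutive matches of W[i] = W[i-1] xor W[i-8], or -1 if there is none.
function FindXorRun(const W: array of DWord): Integer;
var
  i, Run: Integer;
begin
  Result := -1;
  Run := 0;
  for i := 8 to High(W) do
  begin
    // Skip zero words: 0 = 0 xor 0 would otherwise match in empty regions
    if (W[i] <> 0) and (W[i] = (W[i - 1] xor W[i - 8])) then
      Inc(Run)
    else
      Run := 0;
    if Run = RunNeeded then
      Exit(i);
  end;
end;

var
  W: array[0..11] of DWord;
  i: Integer;
begin
  for i := 0 to 7 do
    W[i] := Seed[i];
  // Build three words that obey the rule; W[8] becomes $9DCB3898
  for i := 8 to 10 do
    W[i] := W[i - 1] xor W[i - 8];
  W[11] := 0;
  WriteLn('W[8] = ', IntToHex(W[8], 8));
  WriteLn('run of ', RunNeeded, ' completed at dword index ', FindXorRun(W));
end.
```

For a multi-gigabyte file the array would have to be filled chunk by chunk, keeping the last 8 dwords of each chunk so that the i-8 lookups at the start of the next chunk still work. Which 32 bytes around the reported index form the key is left open here, since the thread does not settle it.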

« Last Edit: June 14, 2013, 12:21:29 am by Gizmo »

 
