
Author Topic: Using 2 separate Bytes to Represent Word Data Type  (Read 2381 times)

MarkMLl

  • Hero Member
  • *****
  • Posts: 7516
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #30 on: July 27, 2024, 09:27:08 am »
You can't know everything and everyone has to start somewhere. The problem is for someone else to judge how advanced the knowledge on a certain subject of a person actually is. That is why it is advisable to show your code and indicate where and what your issue is with that presented code.

Note that mixing signed and unsigned types can be a can of worms.

I agree, and I'd also remark that Pascal's automatic type promotion- at least for numerics- can be a big problem as (I believe) it can also be in C++.

Wirth relatively quickly backtracked, and Modula-2 is much stricter: apart from anything else it has separate rules relating to what is compatible and what is assignment-compatible. I believe that many more recent languages (notably Ada and Rust) have also tried to tighten things up, but have fallen foul of legacy expectations.

The bottom line is that one has to be very careful of automatic casts in

* Assignments

* Parameter passing

and in particular

* Intermediate results inside an expression
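
To illustrate the last point, here is a minimal sketch (the program and variable names are made up for this example): two Byte operands are added, and the intermediate result is evaluated in a wider integer type rather than wrapping at 8 bits. An explicit cast is needed if truncation is what is intended.

```pascal
program PromotionDemo;
{$mode objfpc}
var
  A, B: Byte;
  Wide: Word;
  Narrow: Byte;
begin
  A := 200;
  B := 100;
  // The sum is evaluated in a wider integer type, so it does not wrap at 255.
  Wide := A + B;
  // The explicit cast states that truncation to 8 bits is intended.
  Narrow := Byte(A + B);
  WriteLn('A + B       = ', Wide);    // 300
  WriteLn('Byte(A + B) = ', Narrow);  // 44 (300 mod 256)
end.
```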

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Logitech, TopSpeed & FTL Modula-2 on bare metal (Z80, '286 protected mode).
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

jamie

  • Hero Member
  • *****
  • Posts: 6529
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #31 on: July 27, 2024, 03:28:55 pm »
I see some use of the Hi(..) function here, so here is a reminder for those who know and information for those who don't.

When using the Hi function on types smaller than Integer, you need to be aware of the result type of any arithmetic inside the function's argument.

For example:

 Hi(AWordType + 1) will call the Integer-sized (DWORD-sized) version instead of the Word-sized one, and thus give you the wrong result.

 The reason is that any arithmetic operation yields an Integer, so the compiler no longer knows that the original operand was a Word. It then returns the high word of the integer instead of the high byte of a Word.

 Delphi does not do this because it uses the left-side type as the final type for the function unless there is a typecast.

 For FPC to work correctly you need to typecast:

Hi(Word(MyWordType + 1)) will then return the high byte.

 It took me hours to find this bug when porting Delphi code over. You need to be typecast-conscious with types smaller than Integer.
 
The only true wisdom is knowing you know nothing

440bx

  • Hero Member
  • *****
  • Posts: 4486
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #32 on: July 27, 2024, 03:45:39 pm »
@Jamie,

I remember that bug.  Is it still present in v3.2.2 ?
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v3.2) on Windows 7 SP1 64bit.

jamie

  • Hero Member
  • *****
  • Posts: 6529
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #33 on: July 27, 2024, 03:48:08 pm »
I don't know, haven't checked. I simply cast to ensure perfect code on my side  :o

But it can be checked using a Word variable set to $FF.

Doing

Hi(WordType + 1) should return the value 1.

If the bug is still there, it will return the value 0.

jamie

  • Hero Member
  • *****
  • Posts: 6529
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #34 on: July 27, 2024, 03:53:36 pm »
Apparently it is still there:
Code: Pascal
procedure TForm1.Button1Click(Sender: TObject);
var
  W: Word = $FF;
begin
  Caption := Hi(W + 1).ToString;
end;

This results in 0.

The same happens when using Byte types: you don't get your upper nibble.

Note 2:
The same goes for the Lo function too.
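
As a console-only variant of the test above, the following sketch (program name is hypothetical) prints the uncast and cast forms side by side. The uncast results depend on which wider overload the compiler selects, so only the cast results are annotated; the thread reports 0 for the uncast Hi call.

```pascal
program HiLoCastDemo;
{$mode objfpc}
var
  W: Word = $FF;
begin
  // Uncast: W + 1 is evaluated as a wider integer, so a wider Hi/Lo version is used.
  WriteLn('Hi(W + 1)       = ', Hi(W + 1));
  WriteLn('Lo(W + 1)       = ', Lo(W + 1));
  // Cast back to Word: W + 1 = $0100, so the high byte is 1 and the low byte is 0.
  WriteLn('Hi(Word(W + 1)) = ', Hi(Word(W + 1)));  // 1
  WriteLn('Lo(Word(W + 1)) = ', Lo(Word(W + 1)));  // 0
end.
```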

« Last Edit: July 27, 2024, 04:03:52 pm by jamie »

MarkMLl

  • Hero Member
  • *****
  • Posts: 7516
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #35 on: July 27, 2024, 04:07:55 pm »
Jamie, I think you're giving a well-considered example of the casting problem I mentioned earlier.

Having to type in a manual cast ("type transfer" in Wirth's terminology) can be tedious in the extreme, but at least then there's no excuse for ignoring the perils of sign extension etc.

MarkMLl

jamie

  • Hero Member
  • *****
  • Posts: 6529
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #36 on: July 27, 2024, 04:10:09 pm »
I think code tools should auto insert a cast around that if it does not see one already.  :o


It's not just about the Lo and Hi functions.

This behaviour also causes issues with function overloading: do something like that and it could call the wrong function in your own code, or you get an error stating that it can't determine which function to call.
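
A rough sketch of the overloading effect, assuming two hypothetical overloads named Show. Which overload the uncast call selects is left to the compiler and is deliberately not annotated; the cast call always reaches the Word overload.

```pascal
program OverloadDemo;
{$mode objfpc}

// Hypothetical overloads, used only to reveal which one the compiler selects.
procedure Show(V: Word); overload;
begin
  WriteLn('Word overload:  ', V);
end;

procedure Show(V: Int64); overload;
begin
  WriteLn('Int64 overload: ', V);
end;

var
  W: Word = $FF;
begin
  Show(W);            // exact match: Word overload
  Show(W + 1);        // the expression is no longer a Word, so a wider overload may be chosen
  Show(Word(W + 1));  // explicit cast: Word overload again, prints 256
end.
```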
« Last Edit: July 27, 2024, 04:15:11 pm by jamie »

BrunoK

  • Hero Member
  • *****
  • Posts: 566
  • Retired programmer
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #37 on: July 27, 2024, 04:58:17 pm »
Attached is an example of when and what happens with range checking.
Code: Pascal
program Project1;

{$mode objfpc}
{$H+}

uses
  SysUtils,
  StrUtils;

var
  Rui16: UInt16;              // Rui16 and Ri16 are
  Ri16: Int16 absolute Rui16; // at the same address and the same size
  B1, B2, b: Byte;
  Unifier: array[0..1] of Byte;
begin
  Ri16 := -3;

  B1 := Hi(Ri16);
  B2 := Lo(Ri16);

  WriteLn(IntToStr(Ri16) + ' Is Split into B1:' + IntToStr(B1) + ' and B2:' + IntToStr(B2));
  WriteLn(IntToStr(Ri16) + ' Is Split into B1:' + HexStr(B1, 2) + ' and B2:' + HexStr(B2, 2));

  {$PUSH}
  {$RANGECHECKS OFF}
  Rui16 := B1 shl 8;      // OK, does not overflow the 16-bit range
  {$RANGECHECKS ON}
  Rui16  := B1 shl 8;     // <- Also OK, there is no sign to overflow
                          //    and the value stays within the 16-bit range
  {$RANGECHECKS OFF}
  Rui16 := B1 shl 9;      // Passes only because range checking is off
  {$RANGECHECKS ON}
  try
    Rui16  := B1 shl 9;   // <- Not OK, the shift pushes the high bit outside of 16 bits.
  except
    Writeln('Rui16  := B1 shl 9; -> Not ok, shift overflow');
  end;

  {$RANGECHECKS OFF}
  Ri16  := B1 shl 8;      // <- Not OK (unless done intentionally),
                          //    but passes because RANGECHECKS is OFF
  {$RANGECHECKS ON}
  try
    Ri16  := B1 shl 8;      // <- Not OK, a shift of a positive number should not
                            //    change the sign bit of the result.
                            //    (Without range checking this passes.)
  except
    Writeln('Ri16  := B1 shl 8; -> Not ok, shift changes sign overflow on bit 15');
  end;
  {$POP}

  Rui16 := (B1 shl 8) or B2;

  WriteLn('Ri16: ', Ri16, '     Rui16: ', Rui16);

  Write('Hit enter -> '); ReadLn; // So we can see the console

end.

When running the program in the debugger, keep pressing the Continue button until the program reaches the ReadLn.
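
One way to avoid the signed-assignment range errors shown above is to build the value as an unsigned 16-bit quantity first and only then reinterpret it as signed. This is a sketch with made-up names, not the only possible approach:

```pascal
program ComposeInt16;
{$mode objfpc}
var
  HiB, LoB: Byte;
  U: Word;
  S: Int16;
begin
  HiB := $FF;
  LoB := $FD;
  // $FFFD fits in 16 unsigned bits, so this assignment stays in range.
  U := (Word(HiB) shl 8) or LoB;
  // Same-size explicit cast: the 16 bits are reinterpreted as a signed value.
  S := Int16(U);
  WriteLn('U = ', U);  // 65533
  WriteLn('S = ', S);  // -3
end.
```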

MarkMLl

  • Hero Member
  • *****
  • Posts: 7516
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #38 on: July 27, 2024, 05:21:34 pm »
Quote from: jamie
This behaviour also causes issues with function overloading: do something like that and it could call the wrong function in your own code, or you get an error stating that it can't determine which function to call.

Yes. Sven has tried to educate me before now, and I think it applies only to numeric types. I can't remember though whether you can fudge it by doing something like

Code: Pascal
type
  reallyAByte = byte;

...if you could then Hi(i: integer): integer should refuse to accept a parameter of that type.

I've also never explored whether

Code: Pascal
const
  zeroByte: reallyAByte = 0;

could be used to tighten things up in an expression: the programmer would /have/ to make his intention clear before it would compile.

MarkMLl

local-vision

  • Jr. Member
  • **
  • Posts: 77
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #39 on: July 28, 2024, 12:41:13 pm »
Glad to see that my ignorance of the specifics of this topic is sparking some good conversation and points.

Thanks for the test, BrunoK, and for the important examples and notes, jamie and MarkMLl.

It seems that things do not work exactly as they should. Part of the message here is that, when typecasting, it is important to be explicit rather than implicit.

The Move() solution that I listed may not align with the formal way of doing what I want (see post #24 for the project).

But it seems robust and there is no confusion.

MarkMLl

  • Hero Member
  • *****
  • Posts: 7516
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #40 on: July 28, 2024, 10:44:04 pm »
Quote from: local-vision
The Move() solution that I listed may not align with the formal way of doing what I want (see post #24 for the project).

But it seems robust and there is no confusion.

The problem with Move() (and for that matter using absolute to overlay a variable of a different type) is that it completely bypasses all size checking etc.: get it wrong and you'll write all over the adjacent variable(s) or- worse- the frame pointer or return address on the stack. Some people like to pretend that Pascal is an implicitly safe language, but if it can do things like that...
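
If Move() is used anyway, one mitigation is to wrap it in a helper that checks the sizes before copying. The sketch below uses a hypothetical helper, BytesToWord; note that the resulting value still depends on the byte order of the machine.

```pascal
program GuardedMove;
{$mode objfpc}
uses
  SysUtils;

// Hypothetical helper: refuses to copy unless the source is exactly SizeOf(Word) bytes.
function BytesToWord(const Src: array of Byte): Word;
begin
  if Length(Src) <> SizeOf(Result) then
    raise Exception.Create('BytesToWord expects exactly 2 bytes');
  Result := 0;
  Move(Src[0], Result, SizeOf(Result));
end;

begin
  // Prints BEEF on a little-endian machine, EFBE on a big-endian one.
  WriteLn(HexStr(BytesToWord([$EF, $BE]), 4));
  try
    BytesToWord([$01, $02, $03]);  // wrong size: rejected before Move is reached
  except
    on E: Exception do
      WriteLn('Rejected: ', E.Message);
  end;
end.
```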

MarkMLl

Khrys

  • Jr. Member
  • **
  • Posts: 82
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #41 on: July 29, 2024, 09:03:48 am »
Very generally speaking, reinterpreting the raw memory occupied by a variable of one type as if it belonged to a variable of another type is called type-punning. This can easily be achieved using pointer casts, however...
Code: Pascal
var
  FooWord: UInt16;
  FooByteA, FooByteB: UInt8;
begin
  FooWord := $BEEF;
  FooByteA := PUInt8(@FooWord)[0]; // $EF on little endian, $BE on big endian machines
  FooByteB := PUInt8(@FooWord)[1];
end;
...this is not portable, mainly due to processor endianness. Worse still, in other languages such as C or C++ this is undefined behaviour (strict aliasing rule; dereferenced pointers' types must be compatible). In C (but not C++) this can be circumvented by using a  union, but the recommended standard method is indeed to just use  memcpy  and rely on the optimizer to elide the function call. I tested the  Move()  approach in FPC 3.2.2 with  -O3, but unfortunately the function call wasn't optimized out...
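
For comparison, splitting and rebuilding the word with shifts and masks works on values rather than memory, so it gives the same result regardless of endianness. A minimal sketch (variable names are illustrative):

```pascal
program PortableSplit;
{$mode objfpc}
var
  FooWord, Rebuilt: Word;
  HighB, LowB: Byte;
begin
  FooWord := $BEEF;
  HighB := Byte(FooWord shr 8);    // most significant byte
  LowB  := Byte(FooWord and $FF);  // least significant byte
  Rebuilt := (Word(HighB) shl 8) or LowB;
  WriteLn(HexStr(HighB, 2), ' ', HexStr(LowB, 2), ' ', HexStr(Rebuilt, 4));  // BE EF BEEF
end.
```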



Anyhow, if you simply want to write words/dwords/qwords to a file in a machine-independent format, I'd recommend the  NtoBE/BEtoN  family of functions (native-to-big-endian and vice-versa conversions):
Code: Pascal
Stream.WriteWord(NtoBE(FooWord));
FooWord := BEtoN(Stream.ReadWord());
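
A self-contained round trip through a TMemoryStream might look like the sketch below. The memory stream stands in for a file here, which is an assumption made for illustration.

```pascal
program BigEndianStream;
{$mode objfpc}
uses
  Classes;
var
  Stream: TMemoryStream;
  FooWord: Word;
begin
  FooWord := $BEEF;
  Stream := TMemoryStream.Create;
  try
    Stream.WriteWord(NtoBE(FooWord));
    // In big-endian order the most significant byte is stored first.
    Stream.Position := 0;
    WriteLn(HexStr(Stream.ReadByte, 2));  // BE
    // Reading back converts from big endian to the native byte order.
    Stream.Position := 0;
    FooWord := BEtoN(Stream.ReadWord);
    WriteLn(HexStr(FooWord, 4));          // BEEF
  finally
    Stream.Free;
  end;
end.
```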

hansotten

  • Full Member
  • ***
  • Posts: 100
    • The School of Wirth
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #42 on: July 29, 2024, 10:09:46 am »
ISO and Standard Pascal documents are here: http://pascal.hansotten.com/standard-pascal-and-validation/



 
http://pascal.hansotten.com/ Pascal for Small Machines. The School of Wirth, sources of old Pascal compilers,

BrunoK

  • Hero Member
  • *****
  • Posts: 566
  • Retired programmer
Re: Using 2 separate Bytes to Represent Word Data Type
« Reply #43 on: July 29, 2024, 10:41:40 am »
Quote from: jamie
It took me hours to find this bug when porting Delphi code over. You need to be typecast-conscious with types smaller than Integer.
And another nicety of converting DELPHI code :
Code: Pascal
program Project1;

var
  i: integer;
  vArInt: array [0..4] of integer;
  aArUsedCount : word = 0;
begin
  for i:=0 to aArUsedCount-1 do
    WriteLn('to aArUsedCount-1 ',i);
  WriteLn('After to aArUsedCount-1 ');

  for i:=0 to pred(aArUsedCount) do begin
    WriteLn('to pred(aArUsedCount) ', i);
    if i> 10 then
     break;
  end;
  WriteLn('After to pred(aArUsedCount)');
  ReadLn;
end.
Delphi dutifully considers Pred(Word) as Pred(integer(Word)).
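
When porting such loops, one defensive pattern is to widen the unsigned count to a signed type before subtracting, so that an empty count yields -1 and the loop body is skipped. A small sketch with illustrative names:

```pascal
program SafeCountLoop;
{$mode objfpc}
var
  i: LongInt;
  UsedCount: Word = 0;
  Iterations: LongInt = 0;
begin
  // Widen first, then subtract: LongInt(0) - 1 = -1, so the loop does not run.
  for i := 0 to LongInt(UsedCount) - 1 do
    Inc(Iterations);
  WriteLn('Iterations: ', Iterations);  // 0
end.
```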

I, for one, do not touch Pred() or Succ(). As for Hi and Lo, they are asking for trouble because their meaning is not self-evident.
If the need arises to hack a bit, it is much clearer to use typecasting or, where appropriate, absolute.
Some client routines (callbacks) are called by the caller with an 'anonymous' untyped pointer (for example OnSortCompare).
So declaring the function
Code: Pascal
// PMyRecord is assumed to be declared elsewhere as a pointer to a record with a Name field.
function MyCallBack(aPtr: Pointer): boolean;
var
  aMyRecordP: PMyRecord absolute aPtr;
begin
  Result := aMyRecordP^.Name = 'BrunoK';
end;

is in my POV easier to handle than typecasting the pointer. But that's my opinion, and surely someone else has a different one.
Quote from: Khrys
Very generally speaking, reinterpreting the raw memory occupied by a variable of one type as if it belonged to a variable of another type is called type-punning. This can easily be achieved using pointer casts, however...
Code: Pascal
var
  FooWord: UInt16;
  FooByteA, FooByteB: UInt8;
begin
  FooWord := $BEEF;
  FooByteA := PUInt8(@FooWord)[0]; // $EF on little endian, $BE on big endian machines
  FooByteB := PUInt8(@FooWord)[1];
end;
More understandable in my POV:
Code: Pascal
procedure Foo;
var
  FooWord: UInt16;
  FooBytes : packed array [0..1] of UInt8 absolute FooWord;
begin
  FooWord := $BEEF;        // FooBytes[0] is $EF on little endian, $BE on big endian machines
  if FooBytes[0] = $EF then
    WriteLn('CPU is little endian');
  if FooBytes[0] = $BE then
    WriteLn('CPU is big endian');
end;

In use, FooBytes[n] suffices.
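
A variant record is another way to express the same overlay without absolute. The sketch below uses a hypothetical type, TWordBytes, and prints the first byte in memory alongside the ENDIAN_LITTLE / ENDIAN_BIG symbols that FPC defines at compile time.

```pascal
program EndianCheck;
{$mode objfpc}
type
  // Hypothetical overlay type: both fields share the same two bytes.
  TWordBytes = packed record
    case Boolean of
      True:  (AsWord: Word);
      False: (AsBytes: array[0..1] of Byte);
  end;
var
  V: TWordBytes;
begin
  V.AsWord := $BEEF;
  {$IFDEF ENDIAN_LITTLE}
  WriteLn('Compiled for little endian, first byte in memory: ', HexStr(V.AsBytes[0], 2));
  {$ENDIF}
  {$IFDEF ENDIAN_BIG}
  WriteLn('Compiled for big endian, first byte in memory: ', HexStr(V.AsBytes[0], 2));
  {$ENDIF}
end.
```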

 
