Author Topic: [SOLVED] Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?

old_DOS_err

  • Newbie
  • Posts: 5
Hi,

[Edited, oops, missed initialising the var in the below code. There is a second post from me below that has the correct code, with the problem replicated.]

Currently doing a front end (DOS) for a little utility for making data files of a given size with repeating data (current main project is extending a duplicate finder, base already works and using that, to find duplicates based on content not name, duplicates get moved out to a designated folder). I have about 20 main library files, file work, string, stringlist, etc. Currently updating my string number unit, had a function in there whereby the user could enter a number in string form, optionally with commas, and the function removed the commas and returned the actual value (generally do Boolean functions and return the var in either Var or Out param). Like most of my number functions, this used DWord.

Decided to update this function; thought it would be useful to extend the range to include negative numbers and QWord (though that wouldn't be needed for the front end, I do like to be thorough when I can, and I would then have a general number converter). Removing the commas was extremely easy and took only 1 line using StringReplace. You know you have those moments as a programmer (and probably in life) where you find yourself thinking "Why oh why did I start this?" The first problem was getting around Pascal's very tight type checking (dear old $V is gone, though I'm not sure how useful it would have been anyway). Eventually I worked out a way with an untyped reference param and a pointer. Got weird results, realized it was a register issue (fixed that). The remaining snag is that I call the converter with just the var, and there is no way to know anything about the var inside the function (SizeOf on the untyped reference returns 0 for some reason; I can't use Array of Const, which I do use quite often, because there is a 'bug' and it doesn't take DWord. I normally get around that by either converting to a string or casting to QWord, but that is no use here).
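
As a rough illustration of the comma removal step, here is a minimal sketch. It is not the library routine from this post: the name StripCommasToInt64 is invented for the example, and it leans on TryStrToInt64 from SysUtils instead of a hand-written converter.

```pascal
program strip_commas_demo;
{$mode objfpc}{$H+}
uses
  SysUtils;

// Hypothetical helper (name invented for this sketch): removes thousands
// separators, then lets SysUtils validate and convert the remaining text.
function StripCommasToInt64(const s_in: string; out n_ret: Int64): Boolean;
var
  cleaned: string;
begin
  cleaned := StringReplace(s_in, ',', '', [rfReplaceAll]);
  Result := TryStrToInt64(cleaned, n_ret);
end;

var
  n: Int64;
begin
  if StripCommasToInt64('1,234,567', n) then
    WriteLn('converted: ', n)
  else
    WriteLn('invalid number');
  if not StripCommasToInt64('12a,345', n) then
    WriteLn('rejected: 12a,345');
end.
```

Note that TryStrToInt64 also accepts forms such as a leading $ for hex, so a stricter converter would still need its own digit check.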

Finally got the bugs out of code. Had to replace StrToInt, it had too many issues and too many variations (not tried old reliable Val yet, but if that has gone from the old Turbo Pascal style to the Free Pascal style, then suspect I will be out of luck). So wrote my own string to number converter, easy enough (though really should be unnecessary). Running auto testing (the one good aspect of getting old, is what I have lost in speed of coding I have gained in commenting and testing). The unsigned number types all worked fine (had to reduce the QWord test, it was taking way too long to test EVERY number, used large steps until near the end then went back to single step).

This is the function header, just to make it a bit clearer:

Function p_str_num__str_to_num( Const s_in : String; Var n_ret; size_of_var : Byte; signed_var : Boolean = False ) : Boolean;

p_str_num__ is my own prefix, all my units have their own prefix, so I know it's mine and which unit it is in at a glance.
s_in is the string to convert, this gets tested, so if any errors, exits with False.
n_ret is the var where the number is returned to, this can be any type.
size_of_var, because the function has no way of knowing how big the var is, whilst not ideal, it gets around that problem (e.g. for a Word type, pass 2, for LongInt, pass 4).
signed_var, normally with unsigned, so only need to add True for signed.
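
To make the untyped-parameter-plus-size idea concrete, here is a minimal sketch for the unsigned case only. StoreUnsigned is a hypothetical helper, not the actual p_str_num__str_to_num body; it assumes the caller passes the true size of the variable, typically via SizeOf at the call site where the type is still known.

```pascal
program untyped_store_demo;
{$mode objfpc}{$H+}

// Hypothetical helper, not the routine from the post: writes value into an
// untyped var parameter, using the caller-supplied size to pick the pointer type.
function StoreUnsigned(var dest; size_of_var: Byte; value: QWord): Boolean;
begin
  Result := False;
  case size_of_var of
    1: begin
         if value > High(Byte) then Exit;
         PByte(@dest)^ := Byte(value);
       end;
    2: begin
         if value > High(Word) then Exit;
         PWord(@dest)^ := Word(value);
       end;
    4: begin
         if value > High(LongWord) then Exit;
         PLongWord(@dest)^ := LongWord(value);
       end;
    8: PQWord(@dest)^ := value;
  else
    Exit;  // unsupported size
  end;
  Result := True;
end;

var
  w: Word = 0;
begin
  // SizeOf is evaluated on the typed variable at the call site
  if StoreUnsigned(w, SizeOf(w), 256) then
    WriteLn('w = ', w);
  if not StoreUnsigned(w, SizeOf(w), 70000) then
    WriteLn('70000 does not fit in a Word');
end.
```

The pointer type chosen in each branch has to match the size passed in; a mismatch there writes the wrong number of bytes without any compiler complaint.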

Then started testing the signed, positive first. Int8 fine, Int16, not fine. Got an error when it reached 256, i.e. called the convert function with "256" and byte size 2. After some tracking down, found the culprit and to say I was baffled is putting it mildly. I extracted the core essence of the problem, just to make sure I hadn't done something stupid (99% of the time I think there is a bug in pascal, find that I've done something stupid).

[Removed the invalid test extract, the code in the post below is correct and shows the problem.]

Just for completeness, here is the section of the actual test code:

Code: Pascal
Var
  i : QWord;
  i16 : Int16 = 0;  // signed Word - weird, this solved the 0 problem, it seems this value was not correct going in, but this is never used, would use Out but had issues
  s, s2 : ShortString;

// just the relevant vars

// i16
WriteLn( 'i16 +' );
// [note I wanted the string number to be independent of the index (in the large types had to change that due to huge step, sadly pascal doesn't include Step in For loop)]
s := '0';
For i := 0 To 32767 Do Begin  // [hard coded in the max values, normally prefer hex, but decimal looked better here]
  If Not p_str_num__str_to_num( s, i16, 2, p_str_num__has_sign ) Then Begin
    WriteLn; WriteLn( '|  ERROR CONVERTING:  s = ', s, ',  i = ', i );
    Exit;
  End;
  s2 := i16.ToString;  // [tested ToString over a range of values and types and found it to be perfect]
  If s <> s2 Then Begin
    WriteLn; WriteLn( '|  ERROR: converted number does not match' );
    WriteLn( '      s = ', s, ',  i = ', i, ', i16 = ', i16 );
    Exit;
  End;
  If Not p_str_num__add_1_to_str_num( s ) Then Begin  // this is my own function to add 1 to any string number, no size limit apart from ShortString
    WriteLn( '|  ERROR: problem with add 1' );
    Exit;
  End;
End;

Was reading about the whole RHS business with Free Pascal after running into QWord problems (nice solution by Thaddy to locally turn off range checking). But this has got me stumped.

So the core question is: does anyone have experience of doing mixed type arithmetic where they have found a simple way around Pascal jumping the gun and presuming the type or going to the lowest type?

Luckily it wasn't crucial; it just means either ditching any idea of working with signed numbers or going the whole way and writing the core in assembler. I was toying with the latter (I already have quite a bit of assembler code in my Pascal), but first it would reduce the hope of making the code portable and, more importantly, I'd either have to limit the program to 64 bit machines, or exclude 8 byte types, or try to emulate Pascal's great 32 bit workaround with the Hi Lo record.

[Added: Really sorry, totally forgot my manners. Thank you for any help offered. Phil,]

Program Details:

Lazarus: 3.4 (date: 2024-05-25)   FPC: 3.2.2   Version: Win64
Operating system: Windows Pro 10, 22H2
Hardware: Intel Core i5-3570K CPU, 3.40GHz, RAM: 8GB, 64 bit

Bio (give or take a year or 2):
1980s: commercial programmer in small companies, self taught, including assembler (main language dBASEIII)
1990s: Computing degree, introduced to Turbo Pascal, loved it, final year project: my own programming language
Took a long break from computing
2018: returned to hobby programming, found Free Pascal and Lazarus (hooray, didn't have to use C++)
Write mostly file management programs, done a few programs for friends (may one day put stuff on a website)
« Last Edit: November 08, 2024, 10:53:30 am by old_DOS_err »

Thaddy

  • Hero Member
  • *****
  • Posts: 16199
  • Censorship about opinions does not belong here.
Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #1 on: November 04, 2024, 03:14:29 pm »
That would be caught with {$rangechecks on} if the type is declared as a signed type like shortint or the compiler can see that the range -128..127 is covered.
(This was already the case back in the days of TP)
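
A minimal sketch of what that looks like in practice, assuming the SysUtils unit is used so that the runtime error surfaces as an ERangeError exception (the variable names are made up for the example):

```pascal
program rangecheck_demo;
{$mode objfpc}{$H+}{$rangechecks on}
uses
  SysUtils;

var
  wide: LongInt = 256;
  narrow: ShortInt;  // signed byte, range -128..127
begin
  try
    narrow := wide;  // 256 does not fit, so the range check fires here
    WriteLn('stored ', narrow);
  except
    on E: ERangeError do
      WriteLn('range check error caught');
  end;
end.
```

Without SysUtils the same assignment would end the program with a runtime error instead of a catchable exception.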
« Last Edit: November 04, 2024, 03:17:06 pm by Thaddy »
If I smell bad code it usually is bad code and that includes my own code.

old_DOS_err

Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #2 on: November 04, 2024, 06:41:39 pm »
Hi Thaddy,

Thank you for the fast reply and nice to hear from you, seen your posts many times whilst reading this forum. At least once you have provided a solution when someone has asked a similar question that I have been stuck on. Sorry taken a little while getting back, running more tests and checks.

BTW, forgot to mention this is my first post. Got close to posting a few times, but I'm old school, so do like to work it out myself. But this one beat me. Sorry, a few little glitches in the first code, and some grammar not corrected in the post, was rushing a bit at the end.

Oops, my extracted test code had a major boo boo, that was one of the 99%. That's what I get for rushing and not treble checking. Was extending the extracted tests and all were coming up 0, then noticed I had forgotten to initialize s to 1; it was zero, hence e * s = 0. I'm blaming senility (funny how senility has dogged much of my coding for over 40 years).

Anyway, went back and checked and the original error is still there, but you were right Thaddy. I used your method for turning off range checking to get around the QWord problem, rather than typecast everything, which was fine for the unsigned. Turns out this has quite a different outcome in the land of signed types.

The following is much closer to the original and has a correct view of the problem. Essentially, with range checking on I cannot do mixed type calcs, even though not out of range, but with the range checking off, I get the wrong result. Double checked this time, put break point on the original at "256", the string number converting, and with the range checking off, the 2 vars, e and s, are 256 and 1, yet when multiplied, get 0!


Code: Pascal
Program demo_256;
{$mode objfpc}{$H+}{$R+}{$Q+}
Uses
  Classes, SysUtils;

{ much better test of original problem, in which 256 * 1 = 0

used untyped reference to get around the type problem, but wonder now if this is bringing in new problems with signed vars

I've kept in the tries, just so you can see what I have already tried.
}
Procedure assign_test( Var n );  // take any type
  Var
    s : Int8 = 1;  // this is the sign - initialize positive
    pi16 : PShortInt;  // signed Word
    e : Int32;  // this is var where the value gets stored into as it builds, really want to keep it down to just 2 pairs, otherwise defeats general type

  Begin
    // the core code is copied straight out of the actual conversion routine
    e := 256;  // put a conditional test in the conversion code to stop at "256", could see from hovering on the vars e was 256 and s was 1
    pi16 := @n;  // now the pointer shares the address with the var passed in (this may be a problem for signed vars, fine for unsigned)
    WriteLn( '|  assign_test: start: e = ', e, ',  pi16^ = ', pi16^ );
    //pi16^ := Int16( e * s );  // now this hits a range error

    {$push}{$warn 4110 off}{$R-}  // excellent one from Thaddy
    pi16^ := Int16( e * s );  // now this doesn't hit range error, but guess what, yep, comes as a 0, which was the original error replicated
    {$pop}
    WriteLn( '|  assign_test: end: e = ', e, ',  pi16^ = ', pi16^ );
  End;

(*Procedure assign_test( Var n );  // this has all the extra tries in, wanted to keep the above cleaner
  Var
    s : Int8 = 1;  // this is the sign - initialize positive
    pi16 : PShortInt;  // signed Word
    e : Int32;  // this is var where the value gets stored into as it builds, really want to keep it down to just 2 pairs, otherwise defeats general type
    //i16 : Int16;  // new, but not the way I want to go, introducing vars for every type

  Begin
    // the core code is copied straight out of the actual conversion routine
    e := 256;  // put a conditional test in the conversion code to stop at "256", could see from hovering on the vars e was 256 and s was 1
    pi16 := @n;  // now the pointer shares the address with the var passed in (this may be a problem for signed vars, fine for unsigned)
    WriteLn( '|  assign_test: start: e = ', e, ',  pi16^ = ', pi16^ );
    //pi16^ := Int16( e * s );  // now this hits a range error
    //pi16^ := Int16( Int16( e ) * s );  // nope, still hits the range error

    // since know can assign to different types with casting, introducing a new Int16 (ShortInt) var
    //i16 := Int16( e );  // only need this if using the extra var method
    //pi16^ := i16 * s;  // still hits range check
    //pi16^ := i16 * Int16( s );  // still hits range check
    //pi16^ := Int16( i16 * s );  // still hits range check

    {$push}{$warn 4110 off}{$R-}  // excellent one from Thaddy
    //pi16^ := Int16( e * s );  // now this doesn't hit range error, but guess what, yep, comes as a 0, which was the original error replicated
    pi16^ := Int16( e ) * Int16( s );  // thought if I equalised them, no joy, still 0
    //pi16^ := ShortInt( e ) * ShortInt( s );  // just checking, it is pascal
    {$pop}
    WriteLn( '|  assign_test: end: e = ', e, ',  pi16^ = ', pi16^ );
  End;
*)

Procedure assign_test2( Var n );  // going to drop below 256 to see if that is the problem
  Var
    s : Int8 = 1;  // this is the sign - initialize positive
    pi16 : PShortInt;  // signed Word
    e : Int32;  // got a really weird value error on int word, passed in '0', seemed to be 0 int i16, the pointer, yet when got back to test, the value of the var was

  Begin
    // ok, this just got weirder, still hit the range check at 250, but in removing the range check, now pi16 showed -6, but now i16 on return correctly showed 250!
    e := 250;  // now below
    pi16 := @n;
    WriteLn( '|  assign_test: start: e = ', e, ',  pi16^ = ', pi16^ );  // correctly show 250 and 10
    //pi16^ := Int16( e * s );  // really weird now this hits a range error

    {$push}{$warn 4110 off}{$R-}  // excellent one from Thaddy
    pi16^ := Int16( e * s );  // now this doesn't hit range error, but guess what, yep, comes out as 0, which was the original error replicated
    {$pop}
    WriteLn( '|  assign_test: end: e = ', e, ',  pi16^ = ', pi16^ );
  End;

Procedure test_i16;
  Var
    i16 : Int16 = 0; // the return var
    s : Int8 = 1;  // this is the sign
    e : Int32 = 0;  // the converted value

  Begin
    e := 256;
    i16 := Int16( e );
    WriteLn( '|  Int16( e ):  i16 = ', i16 );  // 256

    e := 10;  // arbitrary, since passing as Var don't want this to equal 256
    assign_test( e );  // this is now the same method as the actual and should be 256
    //assign_test2( e );  // this is now the same method as the actual and should be 256
    i16 := Int16( e * s );
    WriteLn( '|  i16 = ', i16 );  // YEP 0, HOW???????
  End;

Begin
  test_i16;

  WriteLn; WriteLn('Finished...'); ReadLn;
End.

If anyone has done mixed type signed calculations, can you spot what I am missing, or please let me know if there is a way around this impasse, without either abandoning signed types altogether or going back to assembler (or as done a few times, create overload copies for every type, even then with what I've seen not sure even that would work).

Thank you for any help (forgetting my manners, sorry),

Phil

Thaddy

Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #3 on: November 05, 2024, 07:39:10 am »
Code: Pascal
pi16 : PShortInt;  // signed Word? No, shortint is a signed byte!
// should be
pi16 : PSmallInt;  // This is a signed Word
See:
https://www.freepascal.org/docs-html/current/prog/progsu154.html

A wrong pointer type can hide range errors; moreover, pointer types in general can, under some circumstances, hide range errors.
But using the correct type should fix it.

BTW, with range checks and/or overflow checks off, 0 (zero) is the expected result, since the low byte of 256 is... wait for it... 0.
If you got the range check error, it was correct, because of the wrong type.
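
The low-byte effect can be reproduced in isolation. The following is a minimal sketch, not the original conversion routine: the procedure names are invented, and the commented results assume a little-endian target such as x86-64, where the pointer addresses the lowest byte of the variable.

```pascal
program low_byte_demo;
{$mode objfpc}{$H+}

// Mirrors the shape of the demo: a wider variable is passed as an untyped
// var parameter and then written through a pointer to a narrower type.
procedure WriteViaShortInt(var n; value: LongInt);
var
  p8: PShortInt;
begin
  p8 := @n;
  p8^ := ShortInt(value);  // explicit cast keeps only the low 8 bits
  WriteLn('  through PShortInt: ', p8^);
end;

procedure WriteViaSmallInt(var n; value: LongInt);
var
  p16: PSmallInt;
begin
  p16 := @n;
  p16^ := SmallInt(value);  // the low 16 bits are kept
  WriteLn('  through PSmallInt: ', p16^);
end;

var
  target: LongInt;
begin
  target := 10;
  WriteViaShortInt(target, 256);   // low byte of 256 is $00
  WriteLn('target = ', target);    // 0

  target := 10;
  WriteViaShortInt(target, 250);   // $FA read back as a signed byte is -6
  WriteLn('target = ', target);    // 250

  target := 10;
  WriteViaSmallInt(target, 256);
  WriteLn('target = ', target);    // 256
end.
```

This also accounts for the 250 case reported in the demo: the pointer shows -6 while the caller's variable reads 250, because only one byte of the wider variable was overwritten.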
« Last Edit: November 05, 2024, 08:57:44 am by Thaddy »

old_DOS_err

Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #4 on: November 05, 2024, 11:50:00 am »
Thanks again Thaddy for the prompt reply.

The reason why it is PShortInt is that it is a general container; I build a number into that. I also have i64 : PInt64 for 8 byte values. I've never had a type issue, other than parameters of course, using DWord as a container, then stepping down to the correct size. This method worked fine for unsigned. I write file management programs, so never have to deal with signed numbers. And yes, as an assembler programmer I know about 256; if I was incrementing values in assembler, I would just check the OF flag, probably the sign flag as well (SF I think, don't think I've used it, says a lot). I had hoped that Pascal was a bit more flexible. But I think the upshot of this experience is that if using signed, then I cannot assume any leeway from Pascal.

Ok, thanks again. I'm going to look at the original code and I may have to introduce specific variables for all signed types (as I say, already have ShortInt and Int64, so just need to add Int16 and Int8, which I already have in pointer form). Never had such a short function with so many variables, but that is the price I suppose for trying to get around the type restrictions on parameters.

I hate giving up on code, bit of a terrier that way, a good and bad quality to have. But it would be nice to see this conversion routine through to its conclusion. This was my first post and after the hash I made of it, may well be my last :)

If I still can't get the signed types to work, but feel they should, I'd like to do another post, just rather better constructed than this. Where would you recommend I post that? Wasn't sure if this was the right category.

Phil

Thaddy

Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #5 on: November 05, 2024, 11:53:07 am »
A PShortInt points to a byte-sized value. In your case that really is the cause. It must be PSmallInt.

Do not get discouraged.

We are always available to help.
After reading your explanation there are more options to do it in Pascal in a proper way.
But in your example, note my remarks, because that fixes it.
« Last Edit: November 05, 2024, 12:06:22 pm by Thaddy »

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11947
  • FPC developer.
Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #6 on: November 05, 2024, 04:06:32 pm »
Note that a simple way of avoiding the question of what is short and what is small is to use Int8, Int16 and their corresponding pointer types. They have been in FPC for quite some time.
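
A small sketch of that naming point: it prints the sizes so the ShortInt/SmallInt pairing is visible, and declares its own pointer aliases (TPtrInt8 and TPtrInt16 are names made up here) so that it does not assume which predefined pointer names a given FPC version provides.

```pascal
program int_alias_demo;
{$mode objfpc}{$H+}

type
  // Local pointer aliases, hypothetical names used only in this sketch
  TPtrInt8  = ^Int8;
  TPtrInt16 = ^Int16;

var
  a16: Int16 = 0;
  p16: TPtrInt16;
begin
  // The width is part of the name, so there is nothing to remember
  WriteLn('ShortInt: ', SizeOf(ShortInt), '  Int8:  ', SizeOf(Int8));
  WriteLn('SmallInt: ', SizeOf(SmallInt), '  Int16: ', SizeOf(Int16));
  WriteLn('pointee of TPtrInt8:  ', SizeOf(TPtrInt8(nil)^));

  p16 := @a16;
  p16^ := 256;  // fits, because the pointer type matches the variable
  WriteLn('a16 = ', a16);
end.
```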

As to why the type is not promoted, I can vaguely remember this was a TP/early Delphi "feature" of using partial registers (AH/AL or AX vs EAX) to gain speed. FPC got it somewhere during the 2.0 or 2.2 cycle, afaik. It is one of those things that are not documented but only found when people start submitting bug reports with code that differs in results between Delphi and FPC.
« Last Edit: November 05, 2024, 04:09:20 pm by marcov »

Thaddy

Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #7 on: November 05, 2024, 04:47:49 pm »
It does not differ between Delphi and FPC: you must use the correct type.

marcov

Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #8 on: November 05, 2024, 05:03:15 pm »
Quote from Thaddy: "It does not differ between Delphi and FPC: you must use the correct type."

Afaik it did in older FPC versions.

old_DOS_err

Re: Confusion with signed types: is ( 256 * 1 = 0 ) a design issue or a bug?
« Reply #9 on: November 06, 2024, 01:36:51 pm »
Thanks to the dialogue with Thaddy, I had a moment of clarity, rather rare these days (the problem with getting old is sometimes you don't realise how heavily you've forgotten something until it walks back up to you holding a wet fish and slaps you around the face with it). I had totally forgotten the basics of how computers deal with negative numbers.

Have just run a full auto test on the unsigned and, more importantly, the signed types, both positive and negative, up to 4 byte types. Finally got all the bugs out. Still to test Int64; I have already run the QWord test (had to use steps for that, it was taking way too long to do EVERY number), but the mechanics are all the same as for the other signed types. The only interesting number is going to be the max negative. I build the number as a positive (basically the absolute value) and negate on the last digit, so when converting a max negative, say the string '-128' into an Int8, I have to add and negate at the same time, := ( 28 + 100 ) * -1, since 128 stored into an Int8 will give an error, or be the wrong value if range checking is turned off.
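
One alternative way to handle the max negative, shown here as a hedged sketch and not as the routine described in this post, is to accumulate downwards as a negative number and only flip the sign at the end for positive input. TryParseInt8 is a hypothetical name. For Int8 a wider accumulator would be enough on its own; the downward accumulation is shown because the same shape carries over to Int64, where no wider signed type is available.

```pascal
program neg_accumulate_demo;
{$mode objfpc}{$H+}{$R+}{$Q+}

// Hypothetical helper: parses an optionally '-' prefixed decimal string.
// The running total is kept negative, so -128 is reached without ever
// needing +128 to exist.
function TryParseInt8(const s: string; out value: Int8): Boolean;
var
  i, first, digit: LongInt;
  acc: LongInt;
  negative: Boolean;
begin
  Result := False;
  value := 0;
  if s = '' then Exit;
  negative := s[1] = '-';
  if negative then first := 2 else first := 1;
  if first > Length(s) then Exit;  // a lone '-' is not a number
  acc := 0;
  for i := first to Length(s) do
  begin
    if not (s[i] in ['0'..'9']) then Exit;
    digit := Ord(s[i]) - Ord('0');
    acc := acc * 10 - digit;
    if acc < Low(Int8) then Exit;  // below -128: out of range
  end;
  if not negative then
  begin
    if acc < -High(Int8) then Exit;  // +128 is not representable
    acc := -acc;
  end;
  value := Int8(acc);
  Result := True;
end;

var
  v: Int8;
begin
  if TryParseInt8('-128', v) then WriteLn('-128 -> ', v);
  if not TryParseInt8('128', v) then WriteLn('128 rejected');
  if TryParseInt8('127', v) then WriteLn('127 -> ', v);
end.
```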

The main thing is that I now have a working string number converter that replaces the whole range of string to int functions and good old Val, that will accept a number with commas and give a boolean return if an invalid number.

Thanks marcov for the contribution. I really only use assembler in Pascal when doing bit work, as I know this potentially makes the code way too hardware dependent (I haven't got into porting to other platforms; that is a world of pain I have yet to experience). My Turbo Pascal days are long behind me, but up until generics, my linked list class was using a sort of mock polymorphism using pointers, written in TP back in the 90s. I did end up using Int8 and ^Int8, etc. One of the things I also seemed to experience, which I have seen noted in the Free Pascal documentation, though not quite connected, is that even if 2 variables are effectively the same type, but the types have different names, it can cause issues.

For those interested, the reason why signed and unsigned cannot be treated the same is that the leftmost (most significant) bit is effectively the sign, rather similar to UTF8: once that bit is set to 1 the number is negative. That is what I forgot, oops. It is about the only downside of placing a function into a library, trying to cater for the range of types.

Thanks for all the responses, good to get clarity when getting into a pickle.

Phil

p.s. I couldn't find the signature part in the forum profile. I did search for it, but the entry I found, where someone had asked that question, gave an option that wasn't on my screen. So can someone please tell me the foolproof way to add the below bit to the signature section? Also couldn't see where I mark this post as solved.



old_DOS_err

Thanks again to Thaddy and marcov for their responses.

I have just run overnight the full auto test of all the types, unsigned and signed, positive and negative.

Learnt long ago in the good old days of global variables to never assume that a part that was working will still work after making changes.

One of the things I see all the time, but generally ignore, is the compiler warning about mixed types leading to Int64 calculations. In doing the QWord calcs, I initially had the range checking suppressed (see above), but after getting wrong results, put the range checking back on. The QWord calc, which was fine with range checking off, now ran into problems again. Initially, typecasting the whole calculation did not work. What I recognised in thinking about the odd compiler warning is that as soon as the calc involved different types, it was going to have problems. So, for example, where I extract the byte value from the string (the next digit to convert), this is now done first and assigned to a QWord var. I had to spread a single calc from 1 line to 3, but at least it works with range checking on.
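
The three-step split described above might look roughly like this. It is a sketch under assumptions, not the actual library code: the variable names are invented, the input is assumed to contain only digits, and the constant is cast to QWord so that every operand in each step has the same type.

```pascal
program qword_steps_demo;
{$mode objfpc}{$H+}{$R+}{$Q+}

var
  total, digit: QWord;
  s: string;
  i: LongInt;
begin
  s := '18446744073709551615';  // High(QWord)
  total := 0;
  for i := 1 to Length(s) do
  begin
    digit := Ord(s[i]) - Ord('0');  // step 1: widen the digit into a QWord first
    total := total * QWord(10);     // step 2: QWord times QWord
    total := total + digit;         // step 3: QWord plus QWord
  end;
  WriteLn(total);  // 18446744073709551615
end.
```

Keeping each line to a single operation on same-typed operands means the compiler never has to pick a common type for a mixed signed/unsigned expression.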

 
