Author Topic: Superscripts and subscripts  (Read 12015 times)

William Marshall

  • Jr. Member
  • **
  • Posts: 52
Re: Superscripts and subscripts
« Reply #15 on: June 27, 2017, 07:49:36 pm »
     This looks interesting, and I've played around with it a bit.  What I need is an interactive, self-formatting TEdit for formulas.  I wrote a formatting function called by the TEdit's OnChange event.
 
Code: Pascal
  1. function FormatFormula(Str:string):string;
  2. var i,L:integer;
  3. const SUB_0 = #$E2#$82#$80;     SUPER_0 = #$E2#$81#$B0;     SUPER_PLUS  =  #$E2#$81#$BA;
  4.       SUB_1 = #$E2#$82#$81;     SUPER_1 = #$C2#$B9;         SUPER_MINUS =  #$E2#$81#$BB;
  5.       SUB_2 = #$E2#$82#$82;     SUPER_2 = #$C2#$B2;
  6.       SUB_3 = #$E2#$82#$83;     SUPER_3 = #$C2#$B3;
  7.       SUB_4 = #$E2#$82#$84;     SUPER_4 = #$E2#$81#$B4;
  8.       SUB_5 = #$E2#$82#$85;     SUPER_5 = #$E2#$81#$B5;
  9.       SUB_6 = #$E2#$82#$86;     SUPER_6 = #$E2#$81#$B6;
  10.       SUB_7 = #$E2#$82#$87;     SUPER_7 = #$E2#$81#$B7;
  11.       SUB_8 = #$E2#$82#$88;     SUPER_8 = #$E2#$81#$B9;
  12.       SUB_9 = #$E2#$82#$89;     SUPER_9 = #$E2#$81#$B9;
  13.  
  14. begin
  15.   Result := '';
  16.   L:=Length(Str);
  17.   case L of
  18.     0:exit;
  19.     1:begin
  20.         Result:=Str;
  21.         exit;
  22.       end;
  23.   2:begin
  24.       case Str[2] of
  25.         '0':Result := Str[1]+SUB_0;
  26.         '1':Result := Str[1]+SUB_1;
  27.         '2':Result := Str[1]+SUB_2;
  28.         '3':Result := Str[1]+SUB_3;
  29.         '4':Result := Str[1]+SUB_4;
  30.         '5':Result := Str[1]+SUB_5;
  31.         '6':Result := Str[1]+SUB_6;
  32.         '7':Result := Str[1]+SUB_7;
  33.         '8':Result := Str[1]+SUB_8;
  34.         '9':Result := Str[1]+SUB_9;
  35.         '+':Result := Str[1]+SUPER_PLUS;
  36.         '-':Result := Str[1]+SUPER_MINUS;
  37.         else Result := Str;
  38.       end;
  39.       exit;
  40.     end;
  41. else begin
  42.   for i:=1 to L-2 do      // The last two characters might be a charge, and should be treated separately
  43.     begin
  44.       case Str[i] of
  45.         '0':Result := Result+SUB_0;
  46.         '1':Result := Result+SUB_1;
  47.         '2':Result := Result+SUB_2;
  48.         '3':Result := Result+SUB_3;
  49.         '4':Result := Result+SUB_4;
  50.         '5':Result := Result+SUB_5;
  51.         '6':Result := Result+SUB_6;
  52.         '7':Result := Result+SUB_7;
  53.         '8':Result := Result+SUB_8;
  54.         '9':Result := Result+SUB_9;
  55.         else Result := Result+Str[i];
  56.       end;
  57.     end;
  58.  
  59. if Str[L] in ['+','-'] then
  60.   if Str[L-1] in DigitChars then
  61.     begin
  62.       case Str[L-1] of
  63.         '0':Result := Result+SUPER_0;
  64.         '1':Result := Result+SUPER_1;
  65.         '2':Result := Result+SUPER_2;
  66.         '3':Result := Result+SUPER_3;
  67.         '4':Result := Result+SUPER_4;
  68.         '5':Result := Result+SUPER_5;
  69.         '6':Result := Result+SUPER_6;
  70.         '7':Result := Result+SUPER_7;
  71.         '8':Result := Result+SUPER_8;
  72.         '9':Result := Result+SUPER_9;
  73.       end;
  74.  
  75.       if Str[L]='+' then
  76.         Result := Result+SUPER_PLUS
  77.       else Result := Result+SUPER_MINUS;
  78.     end
  79. else
  80.   else for i:= L-1 to L do
  81.     begin
  82.          case Str[i] of
  83.            '0':Result := Result+SUB_0;
  84.            '1':Result := Result+SUB_1;
  85.            '2':Result := Result+SUB_2;
  86.            '3':Result := Result+SUB_3;
  87.            '4':Result := Result+SUB_4;
  88.            '5':Result := Result+SUB_5;
  89.            '6':Result := Result+SUB_6;
  90.            '7':Result := Result+SUB_7;
  91.            '8':Result := Result+SUB_8;
  92.            '9':Result := Result+SUB_9;
  93.            else Result := Result+Str[i];
  94.          end;
  95.        end;
  96. end;
  97. end;
  98. end;
  99.  
  100. procedure TfrmScratch.edFormulaChange(Sender : TObject);
  101. begin
  102. edFormula.Caption := FormatFormula(edFormula.Caption);
  103. end;
  104.  

     It works, but not very well.  When a digit is typed, the cursor jumps to the beginning of the string, and when a '+' or '-' is typed, it and the last character disappear.  I think I could make it work better if I could change the formatted string back to a "simple string" (i.e., no superscripts or subscripts) and then reformat that.  But I can't figure out how to do it.
     I've also been working on a RichMemo version with some success.  TEdits would be much easier to incorporate into my program, but as wp pointed out, the subscripts and superscripts are pretty small (though not much smaller than in a RichMemo).  I'm not sure which approach I'll eventually use, probably the first one I can perfect.  Any suggestions for getting the Unicode approach to work would be appreciated.
Lazarus 1.8.0; fpc 3.0.4; Windows 10

wp

  • Hero Member
  • *****
  • Posts: 11855
Re: Superscripts and subscripts
« Reply #16 on: June 27, 2017, 08:47:30 pm »
In the attached solution, based on your code, the sub/superscript UTF-8 characters are converted to normal characters whenever the edit receives the focus, i.e. you can enter numbers and charges as usual. When the edit loses focus, the numbers and signs are converted back to sub/superscripts. All you have to do is call UnformatFormula in the OnEnter event handler of the edit and FormatFormula in its OnExit handler.

Maybe that's what you want...

CCRDude

  • Hero Member
  • *****
  • Posts: 596
Re: Superscripts and subscripts
« Reply #17 on: June 27, 2017, 08:47:42 pm »
Regarding the cursor jumps, try this:
Code: Pascal
procedure TfrmScratch.edFormulaChange(Sender: TObject);
var
  p: TPoint;
begin
  p := edFormula.CaretPos;
  edFormula.Caption := FormatFormula(edFormula.Caption);
  edFormula.CaretPos := p;
end;

The problem with the vanishing trailing "+" is in line 42 of your code snippet.

Keep in mind that you go through Str bytewise, not charwise.

You could switch to UTF8String for the FormatFormula function, but then you need your constants as UTF-8 string literals rather than as sequences of individual chars, and you need to specify {$CODEPAGE UTF8} (or similar)...
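
The bytewise point can be made concrete with a small console sketch. It is not code from this thread: Utf8SeqLen and Utf8CodepointCount are hypothetical helper names, and the program assumes plain Free Pascal in objfpc mode with no codepage directive, so the #$xx constants are stored as raw bytes.

```pascal
program ByteVsCodepoint;
{$mode objfpc}{$H+}

// Number of bytes in the UTF-8 sequence that starts with lead byte b.
// Continuation bytes (10xxxxxx) and invalid lead bytes are reported as 1
// so that a scan always advances.
function Utf8SeqLen(b: Byte): Integer;
begin
  if b < $80 then
    Result := 1                      // 0xxxxxxx: ASCII
  else if (b and $E0) = $C0 then
    Result := 2                      // 110xxxxx
  else if (b and $F0) = $E0 then
    Result := 3                      // 1110xxxx
  else if (b and $F8) = $F0 then
    Result := 4                      // 11110xxx
  else
    Result := 1;
end;

// Counts code points by hopping from lead byte to lead byte.
function Utf8CodepointCount(const S: RawByteString): Integer;
var
  i: Integer;
begin
  Result := 0;
  i := 1;
  while i <= Length(S) do
  begin
    Inc(i, Utf8SeqLen(Ord(S[i])));
    Inc(Result);
  end;
end;

const
  // 'H', subscript two (U+2082 = E2 82 82), 'O' as raw UTF-8 bytes
  WATER = 'H'#$E2#$82#$82'O';

var
  S: RawByteString;
begin
  S := WATER;
  WriteLn('bytes:       ', Length(S));
  WriteLn('code points: ', Utf8CodepointCount(S));
  // The second-to-last byte is the tail of the subscript, not a character
  WriteLn('S[L-1] = $', HexStr(Ord(S[Length(S) - 1]), 2));
end.
```

Length reports 5 for what looks like three characters, and S[L-1] is the continuation byte $82. That is why a loop that treats "the last two characters" as Str[L-1] and Str[L] goes wrong once the text in the edit has already been formatted.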

William Marshall

Re: Superscripts and subscripts
« Reply #18 on: June 28, 2017, 05:03:02 pm »
     OK.  I got it working and modified it a little (see attached project).  When you type a formula into the top edit, it is formatted automatically - exactly what I wanted.  And pressing the button shows the unformatted formula, a function that will be useful in my program.  But there is still a problem.  Using wp's version of the Superscript function, the superscript before the sign appears as a non-numerical character, unless the digit is 1, 2, or 3.  Then it works as expected.  Unformatting shows the correct digit.  I changed that to a more "brute force" method (the commented lines), but get the same kind of problem.  But now the charge digit is the last subscripted digit in the formula, again unless it's a 1, 2, or 3.  For example, typing in P2O4T6RrH8+ (obviously not a real formula) gives the correct formatting, but the 8, or whatever digit, becomes a 6, and remains a 6 in Unformatting.
     The problem probably has something to do with 1, 2, and 3 being the 2-byte superscripts.  I've been staring at the code for hours and have no clue as to what's wrong.  %) (There ought to be a symbol for frustrated.)

fred

  • Full Member
  • ***
  • Posts: 201
Re: Superscripts and subscripts
« Reply #19 on: June 28, 2017, 05:45:17 pm »
A small typing error:
Code: Pascal
SUPER_8 = #$E2#$81#$B9;
SUPER_9 = #$E2#$81#$B9;
I assume the last byte of SUPER_8 must be B8.
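
One way to check the constants without reading hex by eye is to derive the bytes from the Unicode code points. The sketch below is an illustration, not part of the posted project: EncodeUtf8, SuperscriptCodepoint and HexBytes are hypothetical helper names, and the encoder only handles code points up to U+FFFF, which covers every character used in this thread.

```pascal
program SuperscriptBytes;
{$mode objfpc}{$H+}

// Encodes one code point (up to U+FFFF) as UTF-8, writing byte by byte.
function EncodeUtf8(cp: Cardinal): RawByteString;
begin
  Result := '';
  if cp < $80 then
  begin
    SetLength(Result, 1);
    Result[1] := AnsiChar(Byte(cp));
  end
  else if cp < $800 then
  begin
    SetLength(Result, 2);
    Result[1] := AnsiChar(Byte($C0 or (cp shr 6)));
    Result[2] := AnsiChar(Byte($80 or (cp and $3F)));
  end
  else
  begin
    SetLength(Result, 3);
    Result[1] := AnsiChar(Byte($E0 or (cp shr 12)));
    Result[2] := AnsiChar(Byte($80 or ((cp shr 6) and $3F)));
    Result[3] := AnsiChar(Byte($80 or (cp and $3F)));
  end;
end;

// Superscript digits are not contiguous in Unicode: 1, 2 and 3 come from
// Latin-1 (U+00B9, U+00B2, U+00B3); the others sit at U+2070 + digit.
function SuperscriptCodepoint(d: Integer): Cardinal;
begin
  case d of
    1: Result := $00B9;
    2: Result := $00B2;
    3: Result := $00B3;
  else
    Result := $2070 + Cardinal(d);
  end;
end;

// Renders the bytes of S as space-separated hex pairs.
function HexBytes(const S: RawByteString): string;
var
  i: Integer;
begin
  Result := '';
  for i := 1 to Length(S) do
  begin
    if i > 1 then
      Result := Result + ' ';
    Result := Result + HexStr(Ord(S[i]), 2);
  end;
end;

var
  d: Integer;
begin
  for d := 0 to 9 do
    WriteLn('SUPER_', d, ' = ', HexBytes(EncodeUtf8(SuperscriptCodepoint(d))));
end.
```

For digit 8 this prints E2 81 B8, and for digit 9 it prints E2 81 B9. It also shows why 1, 2 and 3 behave differently from the rest: their code points lie below U+0800, so they encode to two bytes (C2 B9, C2 B2, C2 B3) instead of three.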

Your code seems to work in unformatting if I try P2O4T6RrH8+
I can't see the sub/super right, I suppose I use the wrong font.
« Last Edit: June 28, 2017, 06:00:38 pm by fred »

wp

Re: Superscripts and subscripts
« Reply #20 on: June 28, 2017, 06:25:09 pm »
I had issues when typing something like H4-: the "4" was replaced by some other UTF-8 character. After searching for some time I found that creating a UTF-8 codepoint by "adding" its individual bytes as chars, as I did in my sample code, is nonsense because fpc's automatic codepage conversion does a lot of things which I do not understand...

Therefore, I decided to write a little parser which combines the codepoints from bytes, and now the bug is gone:

Code: Pascal
// SUB_0..SUB_9, SUPER_0..SUPER_9, SUPER_PLUS and SUPER_MINUS are the constants
// from reply #15, with the last byte of SUPER_8 corrected to #$B8.
// DIGIT_CHARS and SIGN_CHARS are not shown in this post; they are assumed to
// be ['0'..'9'] and ['+', '-'].

function Subscript(ch: char): String;
begin
  case ch of
    '0': Result := SUB_0;
    '1': Result := SUB_1;
    '2': Result := SUB_2;
    '3': Result := SUB_3;
    '4': Result := SUB_4;
    '5': Result := SUB_5;
    '6': Result := SUB_6;
    '7': Result := SUB_7;
    '8': Result := SUB_8;
    '9': Result := SUB_9;
    else Result := ch;
  end;
end;     { Subscript }

function Superscript(ch: char): String;
begin
  case ch of
    '0': Result := SUPER_0;
    '1': Result := SUPER_1;
    '2': Result := SUPER_2;
    '3': Result := SUPER_3;
    '4': Result := SUPER_4;
    '5': Result := SUPER_5;
    '6': Result := SUPER_6;
    '7': Result := SUPER_7;
    '8': Result := SUPER_8;
    '9': Result := SUPER_9;
    '+': Result := SUPER_PLUS;
    '-': Result := SUPER_MINUS;
    else Result := ch;
  end;
end;     { Superscript }

// Expects plain (unformatted) input.
function FormatFormula(Str:string):string;
var
  i,L:integer;
begin
  Result := '';
  L:=Length(Str);
  case L of
    0: exit;
    1: begin
         Result:=Str;
         exit;
       end;
    2: begin
         if Str[2] in DIGIT_CHARS then
           Result := Str[1] + Subscript(Str[2])
         else
         if Str[2] in SIGN_CHARS then
           Result := Str[1] + Superscript(Str[2])
         else
           Result := Str;
         exit;
       end;
    else begin
           for i:=1 to L-2 do      // The last two characters might be a charge, and should be treated separately
             Result := Result + Subscript(Str[i]);

           if Str[L] in SIGN_CHARS then
             for i := L-1 to L do
               Result := Result + Superscript(Str[i])
           else
             for i := L-1 to L do
               Result := Result + Subscript(Str[i]);
         end;
    end;
end;     { FormatFormula }

function UnformatFormula(AText: String): String;
var
  i: Integer;
  p, q: PChar;
  b0, b1, b2: Byte;
  found: boolean;
begin
  Result := '';

  if AText = '' then
    exit;

  // The plain text is never longer than the formatted text:
  // reserve that much now and trim at the end.
  SetLength(Result, Length(AText));

  p := @AText[1];
  i := 1;
  while (p^ <> #0) do begin
    b0 := byte(p^);
    case b0 of
      $E2:  // possible 3-byte sub/superscript
        begin
          found := true;
          q := p;
          inc(q);
          b1 := byte(q^);
          if b1 = $82 then begin // subscript 0..9
            inc(q);
            b2 := byte(q^);
            if (b2 in [$80..$89]) then
              Result[i] := char(b2 - $80 + ord('0'))
            else
              found := false;
          end else
          if b1 = $81 then begin  // superscript 0, 4..9, +, -
            inc(q);
            b2 := byte(q^);
            case b2 of
              $B0, $B4..$B9:
                Result[i] := char(b2 - $B0 + ord('0'));
              $BA:
                Result[i] := '+';
              $BB:
                Result[i] := '-';
              else
                found := false;
            end;
          end else
            found := false;
          if found then
            p := q
          else
            Result[i] := char(b0);
        end;

      $C2:  // possible 2-byte superscript 1, 2, 3
        begin
          q := p;
          inc(q);
          b1 := byte(q^);
          found := true;
          case b1 of
            $B9: Result[i] := '1';   // SUPER_1 is #$C2#$B9
            $B2: Result[i] := '2';
            $B3: Result[i] := '3';
            else found := false;
          end;
          if found then
            p := q
          else
            Result[i] := char(b0);
        end;
      else
        Result[i] := char(b0);
    end;
    inc(p);
    inc(i);
  end;
  SetLength(Result, i-1);
end;

William Marshall

Re: Superscripts and subscripts
« Reply #21 on: June 28, 2017, 07:02:24 pm »
     Hey!  It works!  Thanks so much to everyone, especially wp.  This little sub-project turned out to be much more involved than I ever imagined.  Quite literally, I could never have done this on my own.  I don't suppose I've gotten anyone interested in chemistry?  My project has a long way to go, and I'm sure I'll be back with more questions before long.  Thanks again.  :D
