
Author Topic: Syntax coloring  (Read 651 times)

440bx

  • Hero Member
  • *****
  • Posts: 5264
Syntax coloring
« on: May 10, 2025, 04:39:40 am »
Hello,

consider the following:
Code: Pascal
const
  ASTRING = 'some string';
  ANUMBER = 10;

begin
  writeln('some string');
  writeln(ASTRING);

  writeln(10);
  writeln(ANUMBER);
end.
In those statements, the color used to show 'some string' is different from the color used to show ASTRING.  Is it possible to configure Lazarus to use the same color for 'some string' and ASTRING, since both are strings?

I'd like to do the same thing for numeral constants too.
 
IOW, constant identifiers would be colored using the same color as their base type.

Thank you for your help.
(FPC v3.0.4 and Lazarus 1.8.2) or (FPC v3.2.2 and Lazarus v4.0rc3) on Windows 7 SP1 64bit.

TRon

  • Hero Member
  • *****
  • Posts: 4351
Re: Syntax coloring
« Reply #1 on: May 10, 2025, 06:29:30 am »
afaik, not possible. You might be able to set a different/distinct color for identifiers but not based on their actual content and/or type.
Today is tomorrow's yesterday.

440bx

  • Hero Member
  • *****
  • Posts: 5264
Re: Syntax coloring
« Reply #2 on: May 10, 2025, 06:35:29 am »
Quote from: TRon
afaik, not possible. You might be able to set a different/distinct color for identifiers but not based on their actual content and/or type.

That's what I thought too, but I figured I'd ask just in case someone knew of a trick to get it done.

I think it would be a nice thing to have.

munair

  • Hero Member
  • *****
  • Posts: 828
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Syntax coloring
« Reply #3 on: May 10, 2025, 08:45:57 am »
There is a difference between immediates and identifiers. I'm really curious why one would want to have the same color for both. I'm actually quite happy spotting immediates at first glance. It makes coding so much easier. I don't know of any IDE actually supporting what you ask.
It's only logical.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 11130
  • Debugger - SynEdit - and more
    • wiki
Re: Syntax coloring
« Reply #4 on: May 10, 2025, 09:01:12 am »
Unfortunately not possible.

The highlighter works based on the local syntax only. It does not know what was done to an identifier at some place else.

So when it encounters "ASTRING" then this is just an identifier.

Thaddy

  • Hero Member
  • *****
  • Posts: 16928
  • Ceterum censeo Trump esse delendam
Re: Syntax coloring
« Reply #5 on: May 10, 2025, 11:10:52 am »
Solvable by making the syntax table public and extensible.
Anyway, isn't that possible with SynAnySyn?
Due to censorship, I changed this to "Nelly the Elephant". Keeps the message clear.

440bx

  • Hero Member
  • *****
  • Posts: 5264
Re: Syntax coloring
« Reply #6 on: May 10, 2025, 03:49:19 pm »
Quote from: munair
There is a difference between immediates and identifiers. I'm really curious why one would want to have the same color for both. I'm actually quite happy spotting immediates at first glance. It makes coding so much easier. I don't know of any IDE actually supporting what you ask.

The reason why is simple.

Currently, when reading code, there is no clue what a statement such as "writeln(SOMETHING_HERE);" actually does.  When a "writeln('some string');" appears, the different color used to show "some string" makes it obvious it's a string; same thing with a numeral.  It would be great if that coloring "stuck" to constant identifiers.  That way, "writeln(MY_CONSTANT_STRING);" would be colored the same way as "writeln('a string');", making it obvious that "MY_CONSTANT_STRING" is a string.  Making it obvious through the name still requires _reading_ the statement, unlike coloring: when colored differently, you don't need to read the statement to know it's a string or a numeral, the coloring instantly gives that info.

I think it would be a small but very nice enhancement.  I have no idea how hard it might be to implement using the current syntax highlighters.


Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 11130
  • Debugger - SynEdit - and more
    • wiki
Re: Syntax coloring
« Reply #7 on: May 10, 2025, 04:02:04 pm »
Quote from: 440bx
I think it would be a small but very nice enhancement.  I have no idea how hard it might be to implement using the current syntax highlighters.

It would be a lot of work....

It's one thing if you only look at the current unit, and maybe even ignore scope. Then you "just" have to look up the previous instance of the identifier (and parse its context and value).

Of course even then, that would already be a "mini codetools" (and then why re-invent the wheel, there is a big codetools already).

No scope:
Code: Pascal
const foo=1;
procedure bar; const foo='a'; begin end;
procedure abc; begin write(foo); end; // to find the number it needs scope
- and in a method of a class, it could be a const defined in the class
- it could be in an include
- it could be in an ifdef
- it could be in a different unit
- ...

But even within one unit and without scope, it does not fit the idea of the current highlighter.

The highlighter must be able to perform minimum work to update on every keystroke. And searching for definitions like the above on every keystroke is not a good idea.

Something like this must update on idle, and apply on top of highlighting. And that is why it is a lot of work. It needs to be written from scratch (SynEdit has the Markup concept as a place to plug it in, but the actual lookup and applying needs to be done...)

And if it uses codetools, even on idle, it would be good (but not mandatory) if it could run in a thread. Otherwise, once it starts, the user's next keystroke could be processed with some noticeable lag.  Though probably ok without threading.... Just icing on the cake.

Also, the dependency on codetools (or some similar tool) means that it probably only works for (mostly) error-free code. Code below the current edit may not show correct colors (because if e.g. begin/end are not matching up then codetools will give up, and there are probably reasons for that which make it complex to change).
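
To make the "no scope" problem concrete, here is a minimal sketch of the naive single-unit lookup described above. It is not how SynEdit or codetools work; the names TConstKind, LastWord, KindOfLiteral and LookupConstKind are made up for this illustration, and the "parser" only looks for "name = literal" on a line.

```pascal
program ConstKindSketch;
{$mode objfpc}{$H+}

uses
  SysUtils, Classes;

type
  { Hypothetical classification used only in this sketch }
  TConstKind = (ckUnknown, ckString, ckNumber);

{ Return the trailing identifier of S, e.g. 'procedure bar; const foo' -> 'foo'. }
function LastWord(const S: string): string;
var
  i: Integer;
begin
  Result := Trim(S);
  i := Length(Result);
  while (i > 0) and (Result[i] in ['A'..'Z', 'a'..'z', '0'..'9', '_']) do
    Dec(i);
  Result := Copy(Result, i + 1, MaxInt);
end;

{ Classify the text that follows the '=' by its first character. }
function KindOfLiteral(const S: string): TConstKind;
var
  T: string;
begin
  T := Trim(S);
  if T = '' then
    Exit(ckUnknown);
  if T[1] = '''' then
    Exit(ckString);
  if T[1] in ['0'..'9', '$', '-', '+'] then
    Exit(ckNumber);
  Result := ckUnknown;
end;

{ Scan top to bottom and keep the LAST definition of Name. Scope is ignored. }
function LookupConstKind(Lines: TStrings; const Name: string): TConstKind;
var
  i, p: Integer;
  Line: string;
begin
  Result := ckUnknown;
  for i := 0 to Lines.Count - 1 do
  begin
    Line := Lines[i];
    p := Pos('=', Line);
    if p = 0 then
      Continue;
    if SameText(LastWord(Copy(Line, 1, p - 1)), Name) then
      Result := KindOfLiteral(Copy(Line, p + 1, MaxInt));
  end;
end;

var
  Src: TStringList;
begin
  Src := TStringList.Create;
  try
    Src.Add('const foo=1;');
    Src.Add('procedure bar; const foo=''a''; begin end;');
    Src.Add('procedure abc; begin write(foo); end;');
    { Without scope the local foo of bar shadows the global one everywhere, }
    { so the use inside abc would be colored as a string although it is 1.  }
    if LookupConstKind(Src, 'foo') = ckString then
      WriteLn('foo classified as string (wrong inside abc)')
    else
      WriteLn('foo classified as something else');
  finally
    Src.Free;
  end;
end.
```

Even this toy version has to rescan every line for every identifier, which is exactly the cost that does not fit a per-keystroke highlighter.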

440bx

  • Hero Member
  • *****
  • Posts: 5264
Re: Syntax coloring
« Reply #8 on: May 10, 2025, 04:22:03 pm »
I see what you're saying.

Far, very far, from easy.  Actually, quite involved.

All the things you mentioned reminded me of that presentation Anders did for the "on demand parsing" of C# in Visual Studio.  They ran into the problems you mentioned above.

The simple case, the one where only one scope is taken into account would not be too hard to manage.  Once full scoping is taken into account, it's a whole different ballgame.

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 11130
  • Debugger - SynEdit - and more
    • wiki
Re: Syntax coloring
« Reply #9 on: May 10, 2025, 05:06:44 pm »
Currently the highlighter, in 99% of all cases (when the user types into the editor), needs to scan between 1 and 10 lines.

Of course there are exceptions
- (*  with nested comments
- begin  // so that may find something to indicate the missing end (a new named procedure - after a code block - indicates top level, all begin must be closed)

Less drastic exceptions
- "class of"  if the "of" is 10 lines later
- "strict private" if the private is 10 lines later
- some cases of anonymous functions


And even though the HL is fast enough to scan 100k lines in such a case, it's on my todo list that it shouldn't have to.
Normally it only needs to immediately scan until the end of the visible screen. What happens below does not matter.

Of course the file can be open in a 2nd editor and scrolled to the bottom. Or the new minimap could show it all.
But even in those cases, it could do blocks of 5k or 10k lines in an async proc, giving the editor time to react to other events. And the user would hardly notice if the bottom of the minimap updates maybe 50 additional milliseconds later (because of the few breaks that the HL takes, immediately continuing if there is no further user input).



In a way, the code for the "lookup" of a constant would also seem to be fast at first, because the constant is defined at the top. But a change of the procedure name changes the scope, and that changes which of the definitions needs to be used.
So the entire stored state of what comes before changes.

And also, if code changes, the rescan of everything below (as the "new before" for code even further below) will take much longer, because a scan like codetools is more expensive than the current HL.

So in the scenario of a 2nd editor showing the bottom of the file, every edit must do a codetools scan for all of the file below that edit.
And that must be done without blocking.


Also, codetools currently stores info in a way that is fast enough for single lookups when you search the current identifier.

But in a HL, you would have to check every identifier, because every identifier could be such a constant. So when highlighting you would do hundreds of codetools lookups during a scan. (And that is for each paint event, as the HL has no storage other than the state at the start of line / a markup would need that.)

That would obviously need some different approach too.
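
As a rough illustration of what "a different approach" could mean, the sketch below caches results so that each distinct identifier costs one expensive lookup per scan instead of one per occurrence. This is only an assumption about one possible direction, not something proposed in the thread: ExpensiveLookup is a hypothetical stand-in for a codetools query, and the hard part (invalidating the cache when an edit changes scope) is not addressed at all.

```pascal
program LookupCacheSketch;
{$mode objfpc}{$H+}

uses
  SysUtils, Classes;

type
  { Hypothetical classification used only in this sketch }
  TConstKind = (ckUnknown, ckString, ckNumber);

var
  ExpensiveCalls: Integer = 0;

{ Hypothetical stand-in for a slow, codetools-style identifier lookup. }
function ExpensiveLookup(const Name: string): TConstKind;
begin
  Inc(ExpensiveCalls);
  if SameText(Name, 'ASTRING') then
    Result := ckString
  else if SameText(Name, 'ANUMBER') then
    Result := ckNumber
  else
    Result := ckUnknown;
end;

{ Look Name up in the cache first; fall back to the expensive lookup once. }
{ The kind is stored in the Objects slot of a sorted TStringList.          }
function CachedLookup(Cache: TStringList; const Name: string): TConstKind;
var
  Idx: Integer;
begin
  Idx := Cache.IndexOf(Name);
  if Idx < 0 then
    Idx := Cache.AddObject(Name, TObject(PtrUInt(Ord(ExpensiveLookup(Name)))));
  Result := TConstKind(Integer(PtrUInt(Cache.Objects[Idx])));
end;

const
  { Identifiers as a scan might meet them, including repeats }
  Idents: array[0..4] of string =
    ('ASTRING', 'writeln', 'ANUMBER', 'ASTRING', 'astring');

var
  Cache: TStringList;
  i: Integer;
begin
  Cache := TStringList.Create;
  try
    Cache.Sorted := True;          { binary search in IndexOf }
    Cache.CaseSensitive := False;  { Pascal identifiers are case-insensitive }
    for i := Low(Idents) to High(Idents) do
      CachedLookup(Cache, Idents[i]);
    WriteLn('identifiers seen: ', Length(Idents),
            ', expensive lookups: ', ExpensiveCalls);
  finally
    Cache.Free;
  end;
end.
```

A cache like this only helps while nothing changes. As explained above, a single edit that alters scope can make every cached answer below it stale.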

munair

  • Hero Member
  • *****
  • Posts: 828
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Syntax coloring
« Reply #10 on: May 10, 2025, 05:24:31 pm »
Quote from: 440bx
currently when reading code, there is no clue what a statement such as "writeln(SOMETHING_HERE);" actually does.

That's why it is important to use descriptive identifiers, such as FILE_EXT_CSV = 'csv'; (capitals for constants). It is up to the programmer to be systematic in naming variables, so that the code is understandable, even when reading back after several years. I also use the mouse A LOT to hover over identifiers to see their type. This is really a non-issue in modern IDEs IMO. And as Martin_fr explained, it is not easy to add support for the functionality you ask for.

440bx

  • Hero Member
  • *****
  • Posts: 5264
Re: Syntax coloring
« Reply #11 on: May 10, 2025, 05:54:15 pm »
Quote from: munair
That's why it is important to use descriptive identifiers, such as FILE_EXT_CSV = 'csv'; (capitals for constants). It is up to the programmer to be systematic in naming variables, so that the code is understandable, even when reading back after several years. I also use the mouse A LOT to hover over identifiers to see their type. This is really a non-issue in modern IDEs IMO. And as Martin_fr explained, it is not easy to add support for the functionality you ask for.

You can have the best naming convention imaginable, but it will still be a far cry from being as useful and indicative as different coloring.

Refer to the attachment.  Once all those strings are constant identifiers there is a whole lot that is lost due to all the identifiers being the same color.  The fact that all strings are green helps a lot more than any naming convention, no matter how good, ever could.  Not to mention that by having all constant identifiers be uppercase, the contrast between a field name and a field's type (in the example given) is lost.  Not even baptism could make up for that.

I am fully aware it is not easy to implement a general solution to have the color assigned to the identifier based on its type but, it would still be a very nice feature to have.


munair

  • Hero Member
  • *****
  • Posts: 828
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: Syntax coloring
« Reply #12 on: May 10, 2025, 07:26:41 pm »
I usually code like this (using the code from your screenshots):

Code: Pascal
const
        SUB_SYSTEM_DATA = 'SubSystemData';
        PROCESS_HEAP = 'ProcessHeap';
        FAST_PEB_LOCK = 'FastPebLock';
        PRTL_CRITICAL_SECTION = 'PRTL_CRITICAL_SECTION';
        AT1_THUNK_SLIST_PTR = 'At1ThunkSListPtr';
        PSLIST_HEADER = 'PSLIST_HEADER';
        IF_E0_KEY = 'IFE0Key';
        CROSS_PROCESS_UNION = 'CrossProcess (union)';
        CROSS_PROCESS_FLAGS = 'CrossProcessFlags';
        PROCESS_IN_JOB = 'ProcessInJob';
        PROCESS_INITIALIZING = 'ProcessInitializing';
        PROCESS_USING_VEH = 'ProcessUsingVEH';
        PROCESS_USING_VCH = 'ProcessUsingVCH';
        PROCESS_USING_FTH = 'ProcessUsingFTH';
        RESERVED_BITS_0 = 'ReservedBits0';

// ...

emit_LABELnVALUE( <SUB_SYSTEM_DATA, SubSystemData> );
emit_LABELnVALUE( <PROCESS_HEAP, ProcessHeap> );

emit(0);
emit_LABELnVALUE( <FAST_PEB_LOCK, FastPebLock> );
emit_DataType( <PRTL_CRITICAL_SECTION> );

begin
    LEVEL_inc(0);
    emit(0);
    emit_RTL_CRITICAL_SECTION_W61(FastPebLock);
    LEVEL_dec(0);
end; { FastPebLock }

emit(0);
emit_LABELnVALUE( <AT1_THUNK_SLIST_PTR, At1ThunkSListPtr> );
emit_DataType( <PSLIST_HEADER> );

begin
    if At1ThunkSListPtr <> nil then
    begin
        LEVEL_inc(0);
        emit(0);
        emit_SLIST_HEADER_x32(At1ThunkSListPtr);
        LEVEL_dec(0);
    end; { At1ThunkSListPtr }
end;

emit(0);
emit_LABELnVALUE( <IF_E0_KEY, IFE0Key> );

emit(0);
emit_UnionHeading( <CROSS_PROCESS_UNION> );

begin
    LEVEL_inc(0);

    { the width of the ReservedBits field is much greater than the HEX }
    { WIDTH we've been using (emit_.. does that) because }
    { we set the width to be used for the next few values to be correctly }
    { right justified. }
    { 6 groups of 4 bits + 1 group of 3 bits + 6 separating spaces }
    LEVEL_VALUEWIDTH( ((6 * 4) + 3) + 6);

    with CrossProcess do
    begin
        emit(0);
        emit_LABELnVALUE( <CROSS_PROCESS_FLAGS, CrossProcessFlags> );

        { groups of 4 bits }
        emit(0);
        emit(0);
        emit_LABELnVALUE( <PROCESS_IN_JOB, boolean(ProcessInJob)> );
        emit_LABELnVALUE( <PROCESS_INITIALIZING, boolean(ProcessInitializing)> );
        emit_LABELnVALUE( <PROCESS_USING_VEH, boolean(ProcessUsingVEH)> );
        emit_LABELnVALUE( <PROCESS_USING_VCH, boolean(ProcessUsingVCH)> );
        emit(0);
        emit_LABELnVALUE( <PROCESS_USING_FTH, boolean(ProcessUsingFTH)> );
        emit(0);
        emit_LABELnBITs( <RESERVED_BITS_0, ReservedBits0>, 27 { spare bits } );
    end; { with CrossProcess do }
end;

LEVEL_dec(0);

The advantage is that constants are defined once and can be used anywhere. Duplicating strings hardcoded in the source makes projects less manageable. Alternatively, you could try an editor that can set the color for different identifier types, but I doubt that any editor would support coloring different types of constants. It's probably impossible when using editing solutions such as SynEdit.

Unfortunately Pascal doesn't support case sensitive identifiers.
« Last Edit: May 10, 2025, 07:28:32 pm by munair »

440bx

  • Hero Member
  • *****
  • Posts: 5264
Re: Syntax coloring
« Reply #13 on: May 10, 2025, 08:14:54 pm »
Quote from: munair
The advantage is that constants are defined once and can be used anywhere.

That's why I'd like to use colored constants instead of the hard coded strings.

The problem is that in this case, the use of constants comes with several _severe_ downsides, among them:

1. The constant's name hides the _true_ label name.  For instance, there are no spaces in the label "FastPebLock"; this fact is hidden in the name "FAST_PEB_LOCK".

2. The CamelCase structure of the field name is no longer visible.  It is hidden in the all uppercase name.

3. There will be many cases where it will not be possible to tell if a constant name is that of a field/identifier or a type name.

4. The code will look very uniform, forcing the programmer to actually _read_ the code.  As it is now, the fact that 'FastPebLock' is camel case and its data type is all upper case makes it instantly obvious that one is a field name and the other is a data type without any reading (just pattern recognition does it.)

Normally, I do it the way you suggested but, in that program, the information loss is too great.

If I decide to use constants, which I am not sure I will, they will be camel case to reflect the original constant and prefixed to indicate whether it is a field name or a field type, something along the lines of:  fn_FastPebLock (fn = field name) and dt_PSLIST_HEADER (dt = data type) for its type but, it still will require reading instead of simply recognizing a pattern.  On that note, the presence of an underscore in fn_ and dt_ is also to make the fn and dt recognizable as patterns which would be harder without the underscore.
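
A small sketch of what that prefix convention could look like, using only the two names mentioned above. The program wrapper and the WriteLn calls are just there to make it compile on its own.

```pascal
program PrefixedConstSketch;
{$mode objfpc}{$H+}

const
  { fn_ = field name: keeps the CamelCase of the original label }
  fn_FastPebLock = 'FastPebLock';
  { dt_ = data type: keeps the all-uppercase type name }
  dt_PSLIST_HEADER = 'PSLIST_HEADER';

begin
  { The prefix, not the color, tells the reader which kind of string this is. }
  WriteLn(fn_FastPebLock);
  WriteLn(dt_PSLIST_HEADER);
end.
```

Both identifiers are still drawn in the plain identifier color, so the distinction has to be read from the prefix rather than recognized at a glance.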

