Author Topic: New ID attributes on SynEditHighLighter unit.  (Read 57822 times)

Edson

  • Hero Member
  • *****
  • Posts: 1296
Re: New ID attributes on SynEditHighLighter unit.
« Reply #30 on: November 27, 2013, 01:10:37 am »
 %)
This is material for a new topic, something like "Reflections on the Scope of the New Highlighters".

It's a very interesting topic. I have faced some of these issues when designing a highlighter, and I can summarize it by repeating what I pointed out before:

Code: [Select]
Numbers and strings are constants too, but at the syntactic or semantic level. At the lexical level they are different token categories. It depends on what level we want to take the HL to.
As I see it, most of the difficulties you find in the definition of the HL arise because they belong to the syntactic level.

At the lexical level, tokens are tokens, and they don't depend on blocks, ranges or context. At the next level, there are other complications. That's why I prefer to keep the HL as lexers with just a few syntactic features.

Returning to the topic, we agree on including:

* SYN_ATTR_NUMBER
* SYN_ATTR_VARIABLE
* SYN_ATTR_DIRECTIVE
* SYN_ATTR_ASM 
* SYN_ATTR_TEXT   

Now, I just need to know how to make a patch.  :-\
Lazarus 2.2.6 - FPC 3.2.2 - x86_64-win64 on Windows 10

Martin_fr

  • Administrator
  • Hero Member
  • *
  • Posts: 9754
  • Debugger - SynEdit - and more
    • wiki
Re: New ID attributes on SynEditHighLighter unit.
« Reply #31 on: November 27, 2013, 01:53:57 am »
Now, I just need to know how to make a patch.  :-\

And how to use [ quote ] instead of [ code ]....

WinMerge can generate patches. But you need the original, and the modified.

Since you should base a patch on the latest SVN, I recommend using TortoiseSVN, which also has an option to create a patch.

Martin_fr

Re: New ID attributes on SynEditHighLighter unit.
« Reply #32 on: November 27, 2013, 02:14:30 am »
About lexical and syntactical.

I am not sure what the point is (other than complexity and speed...)

But a generic HL does not need to handle any token specially.

It may have a hardcoded implementation for strings, but a string token is still nothing special. No different from a keyword, or anything else.

A pascal string (simplified) could be defined as:
'  go to state "string"

In state "String" the following tokens are recognized (and end the state)
' #10 #13 #0

And that's it.
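
The two-state idea above can be sketched in a few lines. This is only a minimal Python illustration, not SynEdit code: the function name `split_pascal_strings` and the state names are made up for the example, and, as in the simplified description, doubled quotes inside a string are not handled.

```python
def split_pascal_strings(line):
    """Split text into ("text" | "string", chunk) pairs using two states.

    Simplified as in the post: a quote enters state "string"; inside it,
    a quote, #10, #13 or #0 ends the state.
    """
    tokens = []
    state = "text"
    start = 0
    for i, ch in enumerate(line):
        if state == "text" and ch == "'":
            if i > start:
                tokens.append(("text", line[start:i]))
            state, start = "string", i
        elif state == "string" and ch in "'\n\r\0":
            # The closing quote belongs to the string; a line end does not.
            end = i + 1 if ch == "'" else i
            tokens.append(("string", line[start:end]))
            state, start = "text", end
    if start < len(line):
        # Whatever is left keeps the state it was scanned in.
        tokens.append((state, line[start:]))
    return tokens
```

Nothing here treats the string as special: it is just one more state with its own set of terminating tokens.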

Edson

Re: New ID attributes on SynEditHighLighter unit.
« Reply #33 on: November 27, 2013, 04:17:56 am »
Quote
I am not sure what the point is (other than complexity and speed...)

It's order. It is dividing the complexity into layers for better analysis and implementation.

If we define the HL to work just by identifying tokens (lexical), we simplify the process. We just need to define simple rules, with no worry about context, ranges or blocks.

Strings are tokens that we can define in different ways. Using the syntax of this highlighter http://forum.lazarus.freepascal.org/index.php/topic,22148.0.html
we can define Pascal-style strings:

  <Token Start="'" End="'" Attribute='STRING'></Token>
  <Token Start="#" Content = '0..9' Attribute='STRING'> </Token>

The first definition is what I call a "Delimited Token". The second is a "Token by content". With these two ways of defining tokens, we can cover most token categories. Numbers are usually "Token by content"; comments are usually "Delimited Token".
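
The two kinds of definition can be sketched as two small matchers. This is a hedged Python illustration, not the code of the highlighter linked above; the function names and the choice to let an unterminated delimited token run to the end of the text are assumptions.

```python
def match_delimited(text, pos, start, end):
    """Return the end index of a delimited token at pos, or None."""
    if not text.startswith(start, pos):
        return None
    close = text.find(end, pos + len(start))
    # Assumption: an unterminated token runs to the end of the text.
    return len(text) if close < 0 else close + len(end)


def match_by_content(text, pos, start_chars, content_chars):
    """Return the end index of a token-by-content at pos, or None."""
    if pos >= len(text) or text[pos] not in start_chars:
        return None
    i = pos + 1
    while i < len(text) and text[i] in content_chars:
        i += 1
    return i
```

For the Pascal fragment `s := 'hi'#13;`, the first matcher covers `'hi'` and the second covers `#13`.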

One-line comment:
  <Token  Start="//" Attribute='COMMENT'> </Token>

Directives:
  <Token Start="{$" End="}" Attribute='DIRECTIVE'></Token>
  <Token Start="{%" End="}" Attribute='DIRECTIVE'></Token>
 
Python-style strings:
  <String Start="&quot;&quot;&quot;" End="&quot;&quot;&quot;" multiline=true></String>


But they are just tokens, wherever they appear. Giving the HL the ability to change the category of a token is not part of the lexical level.

We can approach the problem of blocks in the same way we want to treat the ASM blocks. Each block or range (begin ... end, repeat ... until) should be processed strictly by a nested HL. Inside a block there are only tokens. Managing blocks is part of the syntactical work.
« Last Edit: November 27, 2013, 04:29:41 am by Edson »

Martin_fr

Re: New ID attributes on SynEditHighLighter unit.
« Reply #34 on: November 27, 2013, 01:53:57 pm »
Quote
It's order. It is dividing the complexity into layers for better analysis and implementation.

ok, so basically a way to deal with complexity, and many resulting properties.

You are right, about strings, if you match tokens with regular expressions (or similar). But then they are still nothing special.

Of course if you want to keep it simple, and avoid any kind of state or context, then it may be hard to deal with FPC nested comments. You do need to maintain the nesting level.
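
Why the nesting level is unavoidable can be seen in a small sketch. This is a simplified Python illustration under stated assumptions: only `{ }` comments are considered, string literals and `{$...}` directives are ignored, and the name `scan_line` is made up. The level returned for one line is the state that has to be carried into the next line.

```python
def scan_line(line, level=0):
    """Return (spans, level): the comment spans found in this line and the
    nesting level to carry into the next line."""
    spans = []
    # If we start inside a comment, it begins at column 0 of this line.
    start = 0 if level > 0 else None
    for i, ch in enumerate(line):
        if ch == "{":
            if level == 0:
                start = i
            level += 1
        elif ch == "}" and level > 0:
            level -= 1
            if level == 0:
                spans.append((start, i + 1))
                start = None
    if level > 0:
        # Comment still open at the end of the line.
        spans.append((start, len(line)))
    return spans, level
```

A stateless lexer would close the comment at the first `}`; with the counter, the comment ends only when the level drops back to zero.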

But back to the point, which IIRC was
Quote from: Edson
Constants, variables, directives and functions could be considered a sub-category of identifiers in most cases, but in some syntaxes they could have special lexical definitions, like the variables in PHP.
Quote from: Edson
Quote from: martin
This attempt to structure tokens, on a  global (across all Highlighter) level, is adding to my dislike of the initial idea.
It's necessary to have some kind of generalization for working with scriptable and multi-syntax HLs.

Of course, not every syntax can be covered, but the rest can be handled with a special HL.
I do not see at all, why making one token "subclass" to another is needed. You may do so for your own HL (though I do not even see, why that would be needed [1]), but in generic (cross HL) there seems no point in it.

The closest to that is the color config of IDE directives in pascal, which inherits the colors of fpc directives. But the tokens are still independent.

[1] Maybe if you want to offer dependent configs? But then there would (or could) be new base-classes of attributes. E.g. pascal could have 3 directive attributes instead of 2: directive-base, D-fpc, D-ide. The base would never be used alone. But with that you get back to the question, which one is the default?

Quote from: Edson
I think it's obvious that a scriptable highlighter must have a finite number of elements and rules in order to cover many syntaxes. But trying to cover every existing syntax would be impossible, and it would require a lot of code and processing.

finite, yes, but due to the limitations of any PC, and human beings.

But hard-coded maximum (other than high(Integer)) ? No.

A user can define as many attributes as he wants, so long as they can be identified by the matching engine.
- Environment var %foo
- global var ::Foo
- local var $foo
- object var :$foo
....
There is no limit, and neither should there be.
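
A rough sketch of what "as many attributes as the user wants" could look like, in Python. The attribute names and the dictionary standing in for a user configuration are hypothetical; the only point is that the number of attributes comes from the configuration and not from a hardcoded list.

```python
import re

# Hypothetical user configuration: attribute name -> pattern.
USER_ATTRIBUTES = {
    "ENV_VAR": r"%\w+",
    "GLOBAL_VAR": r"::\w+",
    "OBJECT_VAR": r":\$\w+",
    "LOCAL_VAR": r"\$\w+",
}


def build_scanner(attributes):
    """Combine any number of user-defined patterns into one regex."""
    parts = [f"(?P<{name}>{pattern})" for name, pattern in attributes.items()]
    return re.compile("|".join(parts))


def scan(text, scanner):
    """Return (attribute, lexeme) pairs for every match in text."""
    return [(m.lastgroup, m.group()) for m in scanner.finditer(text)]
```

Adding a fifth or a fiftieth kind of variable is one more entry in the configuration; the engine itself does not change.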

Edson

Re: New ID attributes on SynEditHighLighter unit.
« Reply #35 on: November 27, 2013, 06:48:30 pm »
A string is just a simple token, because usually it's a delimited token that uses the same single character as both delimiters. But that's not necessarily true.

  <Token Start="#" Content = '0..9' Attribute='STRING'> </Token>

This is an unusual definition of a string token, used in Pascal.

Quote
I do not see at all, why making one token "subclass" to another is needed. You may do so for your own HL (though I do not even see, why that would be needed [1]), but in generic (cross HL) there seems no point in it.

Keywords are the classical example. They don't have a special lexical definition. They have the same definition as identifiers:

  <Token Start= "A..Za..z_" Content = "A..Za..z0..9_" Attribute='IDENTIFIER'> </Token>

So they are not another category of token (of course we could create a definition for every keyword and make them common tokens, but it's not practical).

Since the keywords are a subset of the identifiers, we can consider them a sub-category of identifiers.

This is a fast way to identify a keyword. First we find an identifier and then we check whether it is a keyword. (We can see this in most HL implementations.)
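
The "identifier first, keyword lookup second" approach can be sketched like this in Python. It is an illustration only: the keyword list is just an example, and comparing in lower case is an assumption that fits Pascal's case-insensitivity.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")
KEYWORDS = {"begin", "end", "case", "try"}  # example list only


def classify_identifiers(text):
    """Yield (word, attribute) for every identifier-shaped token."""
    for m in IDENTIFIER.finditer(text):
        word = m.group()
        # One set lookup decides whether the identifier is a keyword.
        attr = "KEYWORD" if word.lower() in KEYWORDS else "IDENTIFIER"
        yield word, attr
```

The scanner only needs one token definition; the cost of telling keywords apart is a single set lookup per identifier.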

We can do the same with constants, variables and functions, if all of them have the same token definition. (In Pascal they do.)

But some languages can have a different token structure for a variable (like PHP), constant, function or even a keyword. In these cases we have to create a special definition for these kinds of token:

  <Token Start= "A..Za..z_" Content = "A..Za..z0..9_" Attribute='IDENTIFIER'> </Token>
  <Token Start= "@" Content = "A..Za..z0..9_" Attribute='KEYWORD'> </Token>

Here, we can not say that a KEYWORD is a subcategory of an IDENTIFIER.

Another example of a subcategory, not so easy to see, could be the "Operator". I can define an OPERATOR as a sub-category of a SYMBOL (I haven't implemented it yet). Again, we do this for speed and to avoid writing a token definition for each operator.

Martin_fr

Re: New ID attributes on SynEditHighLighter unit.
« Reply #36 on: November 27, 2013, 07:29:22 pm »
Strings exist in many more forms, such as containing escaped quotes, or special escapes allowing them to go past the end of the line to the next line.
In Perl there is q//, q{}, q!!, ... and many more.

And that is just by looking at a few given languages. Highlighters must be able to deal with any definition of a language.

And if we deal with yet unknown languages, then why should keywords be subset of identifiers?
A language could define, that keywords must be all upper, while identifiers must be all lower case. Then they are entirely different groups.

Reading on, you pointed that out yourself.

So then again, why apply those "scope" rules to a scriptable HL, when the power of a scriptable HL should be that it can deal with any language?

As for operator: "and" is also an operator. In some languages it may be both a keyword and an operator; then it would be in both groups (according to your rules): identifier and symbol. But that makes no sense.

--------------------
If you want to do such grouping in your own HL, no one stops you.

But limiting all highlighters by adding the concept of rules that cannot be fulfilled by all highlighters is imho not a good idea.


--
One more, you assume that identifiers are always separated by non-letters.

There is a programming language called "whitespace". Now if this exists, it is also possible to create one that has no spaces at all, where there are only a-z and nothing else. All tokens (keywords, operators, identifiers) are part of a single very long word.
One way would be camel case: they all start with an uppercase letter and consume the following lowercase letters.

Another would be that they start with a letter that indicates the length (A=1 ... Z=26). Then parsing can only start at the very beginning:
Code: [Select]
AXBXYCFOOAX
AX BXY CFOO AX
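
The length-prefixed scheme can be checked with a short Python sketch. The function name is made up; the rule implemented is the one read off the example above, where the first letter of each token gives the number of letters that follow it.

```python
def split_length_prefixed(text):
    """Split text where each token's first letter gives the count of
    letters that follow it (A=1 ... Z=26)."""
    tokens = []
    pos = 0
    while pos < len(text):
        count = ord(text[pos]) - ord("A") + 1
        if not 1 <= count <= 26:
            raise ValueError(f"bad length letter at {pos}: {text[pos]!r}")
        end = pos + 1 + count
        if end > len(text):
            raise ValueError(f"token at {pos} runs past the end")
        tokens.append(text[pos:end])
        pos = end
    return tokens
```

Because every token boundary depends on the previous one, scanning cannot begin in the middle of the text, which is exactly the point made above.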

Again what then is a subset of what? All tokens are made of a-z, and all are at the same level. Are operators a subset of identifiers, or identifiers a subset of operators?
There may not even be identifiers. "Whitespace" and "brainfuck" do not have identifiers...

Edson

Re: New ID attributes on SynEditHighLighter unit.
« Reply #37 on: November 27, 2013, 08:08:30 pm »
Quote
why should keywords be subset of identifiers?

No. That's not what I have expressed:

A keyword/variable/constant/function DOES NOT HAVE to be a subcategory of an identifier. I have shown one definition where it's not true. It's optional for whoever defines the syntax.

Quote
A language could define, that keywords must be all upper, while identifiers must be all lower case. Then they are entirely different groups.

Totally agree:

  <Token Start= "A..Z" Content = "A..Z_" Attribute='KEYWORD'> </Token>
  <Token Start= "a..z" Content = "a..z" Attribute='IDENTIFIER'> </Token>

Quote
So then again, why apply those "scope" rules to a scriptable HL, when the power of a scriptable HL should be that it can deal with any language?

For speed and complexity.

Quote
As for operator: "and" is also an operator. In some languages it may be both a keyword and an operator; then it would be in both groups (according to your rules): identifier and symbol. But that makes no sense.

It seems I haven't been so clear.  :o You are mixing the lexical and syntactical levels.

Quote
One more, you assume that identifiers are always separated by non-letters.

No. That's not true. Look at the definition of an identifier:

  <Token Start= "A..Z" Content = "A..Z_" Attribute='IDENTIFIER'> </Token>

The definition is by content. It can include spaces or symbols. And if I define it by delimiters, it can contain almost anything, like a HEREDOC.


Martin_fr

Re: New ID attributes on SynEditHighLighter unit.
« Reply #38 on: November 27, 2013, 08:34:10 pm »
OK, now we do have a huge misunderstanding somewhere.

You spoke about introducing scopes (as in one type of token being a subset for another):

1) This is for your HL only, and not generic?
There is no code/definition or other reference to this in any other HL?
None of the base classes for HL need to provide anything special for this?

2) This is.. or This is not ... related to the  IDs for GetDefault attribute?

3) (Assuming "yes, only your HL" in (1) ): Your HL will.. enforce this? ... offer it as an option?

-----
Code: [Select]
It seems I haven't been so clear.  :o You are mixing the lexical and syntactical levels.
How so? Or how so, any more than you?

How is that different from choosing some identifiers as a keyword? Or from defining some symbols as operators (which you mentioned, by declaring them as subset)

I did not say that "and" is an operator depending on context. In the example "and" would *always* be an operator, same as "+" is.

There is no syntactical analysis needed to say: "and" is an operator.

----
Code: [Select]
[quote]    One more, you assume that identifiers are always separated by non-letters.
[/quote]No. That's not true. Look at the definition of an identifier:

Sorry inaccurately worded by me.

The point is, a language could define +++-* as an identifier (allowing it as the name of a variable or function). (In brainfuck there are only symbols.)

--------------------
So then the point about the scoping is:
Quote
For speed and complexity.

I am not convinced. But since I have not looked at any code, I can not judge, if in your case it will speed up things, or reduce complexity.

Just at what point will it be established what is a subset of what? Since it will only be known after the user config has been read?


Edson

Re: New ID attributes on SynEditHighLighter unit.
« Reply #39 on: November 27, 2013, 10:32:50 pm »
You spoke about introducing scopes (as in one type of token being a subset for another):

Yes. I spoke about:
a) The scope of a nested HL (range or block).
b) The ability for defining subsets of a token with a different Attribute (like Keywords).

I'm not sure what exactly you are referring to.

1) This is for your HL only, and not generic?
There is no code/definition or other reference to this in any other HL?
None of the base classes for HL need to provide anything special for this?

For creating my HL, I have been studying some lexers and editors (Notepad++ and UltraEdit). A lexer can support nesting, but it's too much power (and processing) for a highlighter. I had to reduce the flexibility to gain speed.

As I see it, nested HLs are a necessity if we want to make a reasonably flexible scriptable HL. For now I haven't implemented the nesting (I'm taking a rest from HLs at the moment), but I expect to develop it in the future.

I have not yet worked out how a nested HL will be defined. I have some ideas but nothing clear for now. I don't even know whether this will require modifying the base class.

But what I can say for now (and this is material for another topic) is that it's related to the folding and Code-Completion features.

If we work with nested HLs, we should consider that even the whole HL has attributes, like the background color or the default font. We could see the whole background of a procedure in a different color, if we use nested HLs for blocks.

Quote
2) This is.. or This is not ... related to the  IDs for GetDefault attribute?

In some way. Because when working with a scriptable HL, it's necessary to make generalizations about the attributes. If the HL works at the lexical level, it can name its attributes in any way (it probably needs just a few constants: SYN_ATTR_COMMENT, SYN_ATTR_IDENTIFIER, SYN_ATTR_KEYWORD, etc.).

If we work at the syntactic level, we can manage a lot of attributes (SYNS_AttrASP, SYNS_AttrAssembler, SYNS_AttrAttributeName, SYNS_AttrAttributeValue, SYNS_AttrBlock, ...).


Quote
How is that different from choosing some identifiers as a keyword? Or from defining some symbols as operators (which you mentioned, by declaring them as subset).

For visual results, no difference.

The operator "and" could be defined as a subset of identifiers with the attribute OPERATOR. For visual effects, it will be similar to defining the operator "&&" as a subset of symbols.

There is no inconsistency.

But if we don't define "and" as an OPERATOR token, it will surely be considered an identifier (and coloured as such). And that's lexically correct. Syntactically we probably know that "and" is an operator, but the HL has no way to know it.

Again, there is no inconsistency.

Remember when I said that managing subsets of tokens is a kind of syntactical approach for an HL.

Quote
The point is, a language could define +++-* as an identifier (allowing it as the name of a variable or function). (In brainfuck there are only symbols.)

OK, we can define something like:

 <Token Start= "+" Content = "+-" Attribute='IDENTIFIER'> </Token>

Or

 <Token Start= "+" End = "-" Attribute='IDENTIFIER'> </Token>

The rule is that at the "lexical level", an attribute is just THE WAY A TOKEN IS SHOWN on the screen. This means that a VARIABLE attribute does not necessarily have to correspond to the definition of a variable in the syntax of a language.

In fact I could define my identifiers like:

<Token Start= "A..Za..z_" Content = "A..Za..z0..9_" Attribute='DIRECTIVE'> </Token>

And if the attribute DIRECTIVE has the same properties that I expect to see in the editor for an IDENTIFIER, the user won't know (and really won't care about) the name of the attribute, except for config.

About dynamic attributes:

We can dynamically define many attributes in a syntax. This definition creates a new attribute, with the label NUMBER:

  <Token Start="0..9" Content = '0..9' Attribute="NUMBER"> </Token>

We can say that the NUMBER attribute always exists, but in a scriptable HL, you have to create it.

Currently I cannot create attributes dynamically in my highlighter (but I have analysed that possibility), but I have defined two "empty attributes" to use when defining new token structures:

  <Token Start="{%" End="}" Attribute='EXTRA1'></Token>

Creating a new attribute dynamically in a scriptable HL should take into account the name of the attribute ("MYSTRING") and the label (STRING).


Martin_fr

Re: New ID attributes on SynEditHighLighter unit.
« Reply #40 on: November 27, 2013, 11:26:39 pm »
Quote
For creating my HL, I have been studying some lexers and editors
Yes, I noted you have done some research...  you forced me to do a bit of reading up too, somewhere back in this thread.

Going to reply out of order...
-----------------------------------------
about the "and"
You wrote:
Quote
Another example of a subcategory, not so easy to see, could be the "Operator". I can define an OPERATOR as a sub-category of a SYMBOL
I do not know: did you mean that in a lexical, or syntactical parser? IMHO that decision can be made on a lexical level. The token (e.g. "+") can be looked up in a lexicon, and there it says operator.

Then the same can be said for "and". "and" only exists as operator. It never can be anything else. So it can be determined on a lexical level.

Meaning, that if you introduce subclasses as you said "operator =subclass of symbol", then something is wrong.
Just goes to show, how very tricky it is to define those subclasses.

Nevertheless, you are free to define "and" as a keyword (the current Pascal HL does that / in fact "and" is both). Then it (kind of) works.

-----------------------------------------

Quote
Code-Completion features
Is not currently HL related at all. Codetools have their own scanner, which runs much less frequently.
-----------------------------------------
Quote
a) The scope of a nested HL (range or block).
b) The ability for defining subsets of a token with a different Attribute (like Keywords).

I was mainly about (b).

*** As for (a) *** (referred to as "assumption" in the rest of my reply)
"nested HL"
1) 2 HL, one nested in the other (e.g. asm in pascal)?
2) nested = context sensitive (nested comment, or keyword (nested) in a block)?

*** As for (b) ***
I am very sceptical of this. I can see that for some fixed HL, where such things can be predicted, but in a scriptable one, I see it as a limitation.

As I already said (and tried to show), only after the scripted definitions are read can you calculate what may be a subset of what.
But then it just means extra work, extra code, and I see no benefit.

If you define it upfront (hardcoded), it is a limitation, and only text can be highlighted where the subsets are indeed of the expected kind.

Anyway
Quote
For creating my HL
Your choice then.

-----------------------------------------

Quote
If we work with nested HLs, we should consider that even the whole HL has attributes, like the background color or the default font. We could see the whole background of a procedure in a different color, if we use nested HLs for blocks.

Not sure: is this assumption 1 or 2?
A HL can mix attributes, so yes, you can define it to return multiple attributes for some code.

-----------------------------------------
Quote
In some way. Because when working with a scriptable HL, it's necessary to make generalizations about the attributes.

This, I have a real problem with.

I do not see any need for such assumptions. On a lexer level, there is a direct mapping from match to token-kind.
The token kind must have no meaning to the lexer at all.

On a syntactical level, state can change depending on token-kind found. There may be groups of token sharing behaviour.
It is not necessary to group them, but then the behaviour must be repeatedly specified by the user in the config file.
Grouping is an option to make configuring easier for the user. But grouping itself must be a configuration. Hardcoded groups are always going to be a limitation.

-----------------------------------------
As for "how to create attributes", I never doubted the ability to create the rules for any of my examples.
My examples were given only to show how they could conflict with any assumption made about token subclassing (unless the subclassing comes from config, and is not mandatory).




Martin_fr

Re: New ID attributes on SynEditHighLighter unit.
« Reply #41 on: November 27, 2013, 11:36:09 pm »
Just one more note. If the problem is to decide which rule applies...

e.g
<Token List="Begin,end,case,try" Attribute='KEYWORD'> </Token>
<Token Start= "A..Za..z_" End = "A..Za..z0..9_" Attribute='IDENTIFIER'> </Token>

Now "begin" is matched by both rules.

Then rules should have a priority. That is similar, but it does not define a relation between the resulting types (subclassing does define a relation).

Priority can be the order of declaration in the list of rules.
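
Priority by declaration order can be sketched as a first-match-wins loop. This Python fragment is an illustration only; the rule list mirrors the two example rules above, and the regular expressions are an assumed translation of them.

```python
import re

# Rules in declaration order; earlier rules have higher priority.
RULES = [
    ("KEYWORD", re.compile(r"(?:begin|end|case|try)\b", re.IGNORECASE)),
    ("IDENTIFIER", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
]


def first_match(text, pos=0):
    """Return (attribute, lexeme) of the first rule matching at pos."""
    for attribute, pattern in RULES:
        m = pattern.match(text, pos)
        if m:
            return attribute, m.group()
    return None
```

Here "begin" is matched by both rules and the earlier one wins, yet nothing in the code says that KEYWORD is a kind of IDENTIFIER: the two attributes stay unrelated.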

Edson

Re: New ID attributes on SynEditHighLighter unit.
« Reply #42 on: November 28, 2013, 05:54:00 am »
Just one more note. If the problem is to decide which rule applies...

e.g
<Token List="Begin,end,case,try" Attribute='KEYWORD'> </Token>
<Token Start= "A..Za..z_" End = "A..Za..z0..9_" Attribute='IDENTIFIER'> </Token>

Now "begin" is matched by both rules.

Then rules should have a priority.

I have a fixed priority. The subcategory definition prevails. It's like defining numerical series/prefixes in a telephone numbering plan:

0099 -> Country A
00995 -> Country B         //subcategory prevails
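
The telephone analogy amounts to a longest-prefix match, which a few lines of Python can illustrate. The table holds just the two example entries above, and the function name `lookup` is made up.

```python
PREFIXES = {"0099": "Country A", "00995": "Country B"}


def lookup(number):
    """Return the entry with the longest prefix matching number."""
    best = None
    for prefix in PREFIXES:
        if number.startswith(prefix) and (best is None or len(prefix) > len(best)):
            best = prefix
    # dict.get(None) returns None when no prefix matched.
    return PREFIXES.get(best)
```

The more specific entry wins regardless of the order in which the entries were declared, which is the difference from a priority based on declaration order.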

Actually the syntax of the "syntax file" is something like:

<Token Start= "A..Za..z_" End = "A..Za..z0..9_" Attribute='IDENTIFIER'> </Token>
<Keywords> begin end case try </Keywords>

I adopted this structure for compatibility with Notepad++, so we can easily adapt their language definitions.

And the order of the declarations (Token, Keywords) is not important. I do a two-pass read. First I read the definitions of identifiers and numbers, because they are the base for building other token definitions.

Quote
I do not know: did you mean that in a lexical, or syntactical parser? IMHO that decision can be made on a lexical level. The token (e.g. "+") can be looked up in a lexicon, and there it says operator.

Yes, we can have some rules at the lexical level that define "and" and "+" with the attribute OPERATOR. (Note: I don't say they are operators, just that they are lexically defined with the attribute OPERATOR.) But we can do it at the syntactical level too.

There is confusion when we talk about the type, category or attribute of a token.

"A token is one or more characters, grouped by specific rules. Each token has one and only one attribute."

At the lexical level, we can only talk about tokens and their attributes.

At the syntactical level we can recognize types, procedures, functions, classes, objects, ...

Quote
Then the same can be said for "and". "and" only exists as operator. It never can be anything else. So it can be determined on a lexical level.

* OPERATOR (lexical) -> Has a forecolor, backcolor, font, ... . Just has rules for defining the token.

* operator (syntactical) -> Some token that can construct expressions, and has strict rules in the language.

Quote
As I already said (and tried to show), only after the scripted definitions are read can you calculate what may be a subset of what.
Quote
Meaning, that if you introduce subclasses as you said "operator =subclass of symbol", then something is wrong.

I don't know what is wrong.  In the HL we just define attributes for tokens. Lexically we can define an operator in different ways: as a "delimited token", as a "token by content", or as a "subcategory of a token". And we can use one or more definitions at the same time. Syntactically, it doesn't matter. We can even consider some other IDENTIFIER (not lexically identified) as an operator. Probably here is the confusion.

I don't see any conflict.

Martin_fr

Re: New ID attributes on SynEditHighLighter unit.
« Reply #43 on: November 28, 2013, 10:51:37 am »
Nothing you wrote does in any way change what I wrote.

Most of it extends what can be done, but does not deal with the actual issue described by me.

And as I wrote, as for implementation it does probably not bother you.

The issue is constructed purely from what you said. With the single goal to show, that your definition (in regards to subclassing) could lead to conflicts.

Those conflicts may not matter in your implementation. But that is not the point.

The point is to show that subclassing (by the need to avoid conflicts) can add limitations to the capabilities of  the HL.
This is why I think they are not a good idea.

This is not to say that a HL must be limitless. Limits will occur, and in more than one place. But why add them where they are (imho) not needed?

I described the problem twice. But somehow we look at it from too different an angle. So your response shows that you concentrated on other aspects than I did.

I saw this as a very abstract theoretical example. You seem to look at the practical side only (just guessing, sorry if wrong).

This part is not that important. But if interested, re-read my posting, and put it only in the context of those words I quoted from you. Ignoring all the possibilities, of what could be done.

-----------

Now to the more practical side. Let me try to understand how you deal with subclasses.

Quote
<Token Start= "A..Za..z_" End = "A..Za..z0..9_" Attribute='IDENTIFIER'> </Token>
<Keywords> begin end case try </Keywords>

I assume Keywords, are in this case meant to be a subclass of identifiers?

If so, and only if so:
At which time is that established?
1) Hardcoded?
2) calculated from the above config?

Edson

Re: New ID attributes on SynEditHighLighter unit.
« Reply #44 on: November 28, 2013, 05:53:31 pm »
Quote
Most of it extends what can be done, but does not deal with the actual issue described by me.

Probably I'm not understanding the issue. Can it be shown in a practical case? Would you please give some practical case where this issue occurs?

Quote
The point is to show that subclassing (by the need to avoid conflicts) can add limitations to the capabilities of  the HL.

I think the confusion is about the term "subclass". I have never used this word, because it can have some strong implications. I just say "subset" or "subcategory".

Quote
<Token Start= "A..Za..z_" End = "A..Za..z0..9_" Attribute='IDENTIFIER'> </Token>
<Keywords> begin end case try </Keywords>

I assume Keywords, are in this case meant to be a subclass of identifiers?

If so, and only if so:
At which time is that established?
1) Hardcoded?
2) calculated from the above config?


For now I have it hardcoded.
Formally, it should be something like this:

<Token Start= "A..Za..z_" End = "A..Za..z0..9_" Attribute='IDENTIFIER'> </Token>
<Subset Set='IDENTIFIER' Attribute='KEYWORD'>
   begin end case try
</Subset>
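
How such a `<Subset>` declaration might be read from the configuration (option 2 in the question above) can be sketched in Python. This is an assumption-laden illustration, not the actual loader: the `<Language>` root element is added only so the fragment parses, and `load_subsets` is a made-up name.

```python
import xml.etree.ElementTree as ET

# The <Language> root element is an assumption added so the fragment parses.
SYNTAX = """
<Language>
  <Token Start="A..Za..z_" End="A..Za..z0..9_" Attribute="IDENTIFIER"> </Token>
  <Subset Set="IDENTIFIER" Attribute="KEYWORD">
     begin end case try
  </Subset>
</Language>
"""


def load_subsets(xml_text):
    """Map each base attribute to {word: attribute} from <Subset> elements."""
    root = ET.fromstring(xml_text)
    # First pass: collect the base token attributes.
    base = {t.get("Attribute") for t in root.iter("Token")}
    # Second pass: attach each subset to its base set.
    subsets = {}
    for s in root.iter("Subset"):
        if s.get("Set") not in base:
            raise ValueError(f"unknown base set {s.get('Set')!r}")
        words = (s.text or "").split()
        subsets.setdefault(s.get("Set"), {}).update(
            {w: s.get("Attribute") for w in words})
    return subsets
```

Read this way, the subset relation is known only after the configuration has been loaded, and the order of the `<Token>` and `<Subset>` elements in the file does not matter.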

I keep the form <KEYWORDS> </KEYWORDS>:

1. For compatibility with Notepad++.
2. For simplicity. It is easier than grasping the concept of "Subset of tokens".
3. Because I haven't defined any other case of Subset. I'm studying the case of Symbols and Numbers.
