AFAIK each FoldBlock represents a logical area of the folding system.
This is at least ambiguous, if not misleading.
FoldBlock is part of the range (see http://wiki.lazarus.freepascal.org/SynEdit_Highlighter#Important_note_on_Ranges ).
Its purpose is to provide the absolute minimum information needed so that a line can be scanned.
Actual information about the folding area can be discovered during the scan. It is not stored, but can be returned when GetFoldNodeInfo is called.
Ranges (if they are objects) are re-used, and with that the same FoldBlock object is used to represent the state on many lines.
begin // range at the end of line has foldblock 0x0001
  if a then begin // range at the end of line has foldblock 0x0002
  end; // range at the end of line has foldblock 0x0001, since the inner fold is closed
  if longer_condition then begin // range at the end of line has foldblock 0x0002 again (re-used), even though the X pos differs
  end;
end;
So FoldBlock does not represent a logical area of the folding system.
It seems that a single FoldBlock is enough for storing the whole lines via its unlimited nested Children.
Not sure what is meant by this.
The HL has only one FoldBlock: HL.RootCodeFoldBlock. Once it is created, it is not replaced (immutable).
The top level is always "unfolded". But it can have many (sibling or nested) children, so there are many blocks.
As with the range, one needs to differentiate between the working copy for the currently scanned line (not immutable) and the range (+ foldblock) stored with already scanned lines (immutable).
HL has one Range (property CodeFoldRange) at any one time.
This property is immutable, meaning the value is replaced by another one when parsing switches from one line to the next (or from one line to any other line index).
A Range is created when the HL starts parsing the first line.
Another range is created when the HL starts parsing the next line.
But when the first line is re-parsed, a new range is created again.
Therefore a new range is also created when the second line is re-parsed.
Amazed? See the last comment.
A new range is not always created.
Start parsing a line: A mutable copy of the last known range (from the end of the previous line) is created (by assigning to the existing working range object).
End parsing: The list of existing stored ranges is searched for a range equal to the working range. If one is found, a reference to it is returned; otherwise a new range is created and added.
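The two steps above can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the actual SynEdit code: TSketchRange, TRangeStore, GetEqual and SameAs are hypothetical names, a range is reduced to a single fold depth, and "scanning" a line just applies a fixed change in depth taken from the begin/end example further up.

```pascal
program RangeReuseSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical stand-in for a highlighter range: only the minimum
  // needed to start scanning the next line.
  TSketchRange = class
  public
    FoldDepth: Integer;   // number of currently open fold blocks
    procedure Assign(Src: TSketchRange);
    function SameAs(Other: TSketchRange): Boolean;
  end;

  // Hypothetical store of immutable ranges shared between lines.
  TRangeStore = class
  private
    FItems: array of TSketchRange;
  public
    destructor Destroy; override;
    function GetEqual(Working: TSketchRange): TSketchRange;
    function Count: Integer;
  end;

procedure TSketchRange.Assign(Src: TSketchRange);
begin
  FoldDepth := Src.FoldDepth;
end;

function TSketchRange.SameAs(Other: TSketchRange): Boolean;
begin
  Result := FoldDepth = Other.FoldDepth;
end;

destructor TRangeStore.Destroy;
var
  i: Integer;
begin
  for i := 0 to High(FItems) do
    FItems[i].Free;
  inherited Destroy;
end;

function TRangeStore.GetEqual(Working: TSketchRange): TSketchRange;
var
  i: Integer;
begin
  // Search the stored ranges for one equal to the working range.
  for i := 0 to High(FItems) do
    if FItems[i].SameAs(Working) then
      Exit(FItems[i]);
  // None found: store a new immutable copy.
  Result := TSketchRange.Create;
  Result.Assign(Working);
  SetLength(FItems, Length(FItems) + 1);
  FItems[High(FItems)] := Result;
end;

function TRangeStore.Count: Integer;
begin
  Result := Length(FItems);
end;

const
  // Assumed net change in fold depth per line of the example
  // (begin / if..begin / end / if..begin / end / end).
  Delta: array[0..5] of Integer = (1, 1, -1, 1, -1, -1);
var
  Store: TRangeStore;
  Working, StartRange: TSketchRange;
  LineEnd: array[0..5] of TSketchRange;
  i: Integer;
begin
  Store := TRangeStore.Create;
  Working := TSketchRange.Create;
  StartRange := Store.GetEqual(Working); // range before the first line
  for i := 0 to 5 do
  begin
    // Start parsing: copy the range from the end of the previous line
    // into the one mutable working object.
    if i = 0 then
      Working.Assign(StartRange)
    else
      Working.Assign(LineEnd[i - 1]);
    Working.FoldDepth := Working.FoldDepth + Delta[i]; // "scan" the line
    // End parsing: re-use an equal stored range, or add a new one.
    LineEnd[i] := Store.GetEqual(Working);
  end;
  WriteLn('stored ranges: ', Store.Count);
  WriteLn('lines 2 and 4 share a range: ', LineEnd[1] = LineEnd[3]);
  Working.Free;
  Store.Free;
end.
```

With the six example lines only three ranges end up stored (depths 0, 1 and 2), and lines 2 and 4 hold a reference to the same object, which is why a stored range cannot carry per-line details such as the X position.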
Now the show:
SynEdit works on a per-line basis.
So parsing all lines from first to last would never happen.
It happens when the text is first loaded.
It also happens when (for example) in Pascal, with nested comments, a "(*" is added at the start of the file, turning everything into a comment. That is a todo; it needs to be optimized to scan only to the point needed (last visible line, or last line affecting visible lines).
A line can only be scanned if all lines above were scanned. Only then is the range at its start valid. (Though there may in future be HLs that can provide the range without this requirement.)
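To illustrate the dependency between lines, here is a small sketch. It is not SynEdit code: TSketchScanner and EnsureScannedTo are hypothetical names, the range is again reduced to a fold depth, and the "scan" is a naive substring test for begin/end. It only shows that the start range of a line is the end range of the line above, and that scanning can stop at the last line that is actually needed.

```pascal
program ScanDependencySketch;
{$mode objfpc}{$H+}

type
  // Hypothetical scanner; the range is reduced to a fold depth.
  TSketchScanner = class
  private
    FLines: array of string;
    FEndDepth: array of Integer; // range stored at the end of each line
    FFirstUnscanned: Integer;    // all lines above this index are scanned
  public
    constructor Create(const ALines: array of string);
    procedure EnsureScannedTo(Line: Integer);
    function EndDepth(Line: Integer): Integer;
    function ScannedLines: Integer;
  end;

constructor TSketchScanner.Create(const ALines: array of string);
var
  i: Integer;
begin
  inherited Create;
  SetLength(FLines, Length(ALines));
  SetLength(FEndDepth, Length(ALines));
  for i := 0 to High(ALines) do
    FLines[i] := ALines[i];
end;

procedure TSketchScanner.EnsureScannedTo(Line: Integer);
var
  Depth: Integer;
begin
  while FFirstUnscanned <= Line do
  begin
    // The start range of a line is the end range of the line above it.
    if FFirstUnscanned = 0 then
      Depth := 0
    else
      Depth := FEndDepth[FFirstUnscanned - 1];
    // Naive toy scan, an assumption of this sketch only.
    if Pos('begin', FLines[FFirstUnscanned]) > 0 then Inc(Depth);
    if Pos('end', FLines[FFirstUnscanned]) > 0 then Dec(Depth);
    FEndDepth[FFirstUnscanned] := Depth;
    Inc(FFirstUnscanned);
  end;
end;

function TSketchScanner.EndDepth(Line: Integer): Integer;
begin
  EnsureScannedTo(Line);  // scan everything above first, if still missing
  Result := FEndDepth[Line];
end;

function TSketchScanner.ScannedLines: Integer;
begin
  Result := FFirstUnscanned;
end;

var
  Scanner: TSketchScanner;
begin
  Scanner := TSketchScanner.Create(['begin', 'if a then begin', 'end;',
    'if longer_condition then begin', 'end;', 'end;']);
  // Pretend line index 3 is the last visible line.
  WriteLn('depth at end of line 3: ', Scanner.EndDepth(3));
  WriteLn('lines scanned: ', Scanner.ScannedLines, ' of 6');
  Scanner.Free;
end.
```

Asking for line index 3 scans lines 0 to 3 (depth 2 at the end) and leaves the last two lines untouched until something needs them.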
Q: Wow, I have now noticed that SynEdit has such stupid behaviour: reparsing a line more than once.
A: You are being just as stupid when you voice such a conclusion without knowing the big design behind it.
SynEdit (core) itself does not trigger or need "reparsing".
Some modules (like folding) need to parse a line a 2nd time, to get extra info.
BUT this is only for visible lines, not for the entire document (so maybe 50 out of 10000 lines).
Also those modules can/should then cache the data they got (until the line changes).
If several independent modules need to do this, it may be possible to optimize this further.
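As a rough idea of what such a cache could look like, here is a minimal sketch. All names (TFoldInfoCache, OpensFold, LineChanged, ScanCount) are made up for the example, and the "extra scan" is a toy substring test; a real module would ask the HL instead.

```pascal
program VisibleLineCacheSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical per-line result of the second, detailed scan.
  TLineFoldInfo = record
    Valid: Boolean;
    Opens: Boolean;   // does the line open a fold block?
  end;

  // Hypothetical cache owned by a module such as folding.
  TFoldInfoCache = class
  private
    FLines: array of string;
    FInfo: array of TLineFoldInfo;
  public
    ScanCount: Integer;   // how many detailed scans were actually done
    constructor Create(const ALines: array of string);
    function OpensFold(Line: Integer): Boolean;
    procedure LineChanged(Line: Integer; const NewText: string);
  end;

constructor TFoldInfoCache.Create(const ALines: array of string);
var
  i: Integer;
begin
  inherited Create;
  SetLength(FLines, Length(ALines));
  SetLength(FInfo, Length(ALines));  // new entries start with Valid = False
  for i := 0 to High(ALines) do
    FLines[i] := ALines[i];
end;

function TFoldInfoCache.OpensFold(Line: Integer): Boolean;
begin
  if not FInfo[Line].Valid then
  begin
    // Toy stand-in for the extra scan of this line.
    FInfo[Line].Opens := Pos('begin', FLines[Line]) > 0;
    FInfo[Line].Valid := True;
    Inc(ScanCount);
  end;
  Result := FInfo[Line].Opens;
end;

procedure TFoldInfoCache.LineChanged(Line: Integer; const NewText: string);
begin
  FLines[Line] := NewText;
  FInfo[Line].Valid := False;  // cached data is stale until rescanned
end;

var
  Cache: TFoldInfoCache;
  Paint, Line: Integer;
begin
  Cache := TFoldInfoCache.Create(['begin', '  if a then begin', '  end;', 'end;']);
  // Simulate painting the first two (visible) lines three times.
  for Paint := 1 to 3 do
    for Line := 0 to 1 do
      Cache.OpensFold(Line);
  WriteLn('scans after 3 paints: ', Cache.ScanCount);
  // Editing a line invalidates only that line.
  Cache.LineChanged(1, '  x := 1;');
  Cache.OpensFold(1);
  WriteLn('scans after edit: ', Cache.ScanCount);
  Cache.Free;
end.
```

Three paints of two visible lines cost two scans in total, and the edit adds one more. Lines that are never visible are never scanned by the module at all.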
Q: Let's not be pedantic. When a line is unchanged, why is this line parsed twice or thousands of times?
It is not (not thousands of times).
Unless modules are added to SynEdit that do not cache the info.
It may be scanned twice, because the 2 scans operate in different modes, with different purposes.
With many modules it may have ONE extra scan per module. This will eventually need to be optimized, but is not a problem at present.
Q: Can the parsed info be stored statically, so parsing will only be required once?
See above, there may in the future be a need to CACHE this for the VISIBLE lines (and visible lines only).
There is no point in storing this info for 100,000 lines.
Just a note:
A huge part of this design is to try and keep memory usage down (and yes, there is still a lot to be optimized even further than now).
Why? Why, if memory is cheap and available?
Well, because using more memory can slow down computation.
I experienced this when working on fpdebug: loading, storing, and then working with (searching) debug info. I tested by debugging the IDE itself, with all packages having full debug info. Initially the index (created in memory on the debug info) was several 100 MB. Not much. But finding a format that required just under 100 MB, even though it made the code itself more complex, improved speed quite a lot.