Hi
I remember the first time I saw hash codes and tables: it was in 1979, when I disassembled the Microsoft Level II BASIC ROM on my TRS-80. In those days it was clever because:
1) the keywords were converted to 1 byte each, which saved space in the small RAM
2) the original program listing was reconstructed by doing the reverse
3) as far as I remember, the Apple computers' BASIC interpreter did not have this kind of tokenization, and they lost out
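
To make points 1) and 2) concrete, here is a minimal sketch of keyword tokenization and its reverse. The keyword list and the token values are made up for illustration; they are not the real Level II ROM table, and the sketch assumes plain ASCII input and does not treat string literals specially.

```python
# Hypothetical keyword table: order and token values are illustrative only.
KEYWORDS = ["PRINT", "GOTO", "FOR", "NEXT", "IF", "THEN"]
TOKEN_BASE = 0x80  # tokens have the high bit set, so they never collide with ASCII


def tokenize(line: str) -> bytes:
    """Replace each keyword in an ASCII line with a single token byte."""
    out = bytearray()
    i = 0
    while i < len(line):
        for index, word in enumerate(KEYWORDS):
            if line.startswith(word, i):
                out.append(TOKEN_BASE + index)
                i += len(word)
                break
        else:
            # No keyword starts here: copy the character unchanged.
            out.append(ord(line[i]))
            i += 1
    return bytes(out)


def detokenize(data: bytes) -> str:
    """Rebuild the listing by expanding every token byte back to its keyword."""
    return "".join(
        KEYWORDS[b - TOKEN_BASE] if b >= TOKEN_BASE else chr(b) for b in data
    )
```

With this table, `detokenize(tokenize(line))` gives back the original line, and the packed form is shorter whenever the line contains a keyword.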
If you want to implement a compression tool, then a dictionary, a hash table, or even a binary or Huffman-like tree can certainly be helpful.
But in the case of a highlighter, compression is not the goal. I suppose that there could be faster ways, maybe a sorted list of all allowed keywords.
Since it is a sorted constant, a kind of 'prediction' (proceeding by elimination) can be made with a character scanner that abandons the search the instant a match fails.
This way of highlighting may break with "tradition", but it may also have a chance of being the fastest one, since nowadays the original text stays in memory (no need to shrink it as in 1980, when RAM was so damn expensive and programmers used or invented tokens and hash tables for that shrinking purpose).
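
Here is a rough sketch of what I mean by scanning with elimination. The keyword list and the function name `match_keyword` are only examples, and I have not measured it against a hash table, so take the speed claim as a guess. Each character read narrows the range of candidates still possible in the sorted list, and the scan stops as soon as that range is empty.

```python
# Example keyword list (an assumption); it must be kept sorted.
KEYWORDS = tuple(sorted([
    "break", "case", "char", "const", "continue", "do",
    "double", "else", "for", "if", "int", "while",
]))


def match_keyword(text, start=0):
    """Return the keyword formed by the word starting at text[start], or None."""
    lo, hi = 0, len(KEYWORDS)  # candidates still possible: KEYWORDS[lo:hi]
    depth = 0                  # number of characters of the word read so far
    pos = start
    while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
        ch = text[pos]
        # Drop candidates that are too short or sort before ch at this depth.
        while lo < hi and (len(KEYWORDS[lo]) <= depth or KEYWORDS[lo][depth] < ch):
            lo += 1
        # Keep only the run of candidates that have ch at this depth.
        new_hi = lo
        while new_hi < hi and KEYWORDS[new_hi][depth] == ch:
            new_hi += 1
        hi = new_hi
        if lo == hi:
            return None  # nothing left: abandon the search immediately
        depth += 1
        pos += 1
    # All survivors share the word as a prefix; an exact match sorts first.
    if lo < hi and len(KEYWORDS[lo]) == depth:
        return KEYWORDS[lo]
    return None
```

For example, `match_keyword("double x")` finds `double`, while `match_keyword("doubt")` gives up at the `t` without reading any further.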
If you still believe in hash codes, ask yourself a few questions:
- Do I plan to save memory, like around 1980 when I had only 16 KB of RAM? No.
- Will hash codes be used to keep the coordinates of font changes? No.
- Will hash codes help with a dichotomic (binary) search on an ordered constant list? No, it is already ordered!
... and so on
and the Light will come to you
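
P.S. About the last question: a dichotomic search on the ordered list needs no hash codes at all. A minimal sketch using Python's standard `bisect` module, where the keyword list and the name `is_keyword` are just examples:

```python
import bisect

# Example keyword list (an assumption); bisect requires it to be sorted.
KEYWORDS = tuple(sorted([
    "break", "case", "char", "const", "continue", "do",
    "double", "else", "for", "if", "int", "while",
]))


def is_keyword(word):
    """Binary search for word in the sorted keyword tuple."""
    i = bisect.bisect_left(KEYWORDS, word)
    return i < len(KEYWORDS) and KEYWORDS[i] == word
```

This takes about log2(n) string comparisons per word, so around 4 for the 12 keywords above.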