Lazarus

Programming => Packages and Libraries => SynEdit => Topic started by: Martin_fr on March 24, 2015, 05:10:17 pm

Title: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 05:10:17 pm
This is a follow up on http://bugs.freepascal.org/view.php?id=27712

http://bugs.freepascal.org/view.php?id=27712#c82262
Quote
This issue is related to your Q"Do you think the default highlight should be dotted underline instead of dotted box?".

Right, I broke the rule myself. It is tricky to decide whether something should go into a bug report or not, and sometimes it even changes when a part of the issue grows bigger.

The problem is that the issue on Mantis is about what should happen to selected text.

The code I committed refers to that issue, and sometimes, years after I wrote some code, I need to know why. Then I can use SVN to find related Mantis issues, and read up on them.
In this case I need to read up on the "overwriting" of selections.

As it stands now, a huge part of the notes (including many of mine) is not about this. This makes it much harder to keep track.

Also as a rule of thumb: Mantis is to describe the issue, and describe what solution was picked and why.
If there is a need to discuss many options, that is sometimes better moved to the forum.
But again, that is a rule of thumb; it does not always apply. So there was nothing wrong with bringing up added topics.

------------------------------
Well, I am obviously never using the IME in production myself. So I can only act on feedback, and I am happy for any feedback I can get. Of course feedback must be divided into the following categories (all are valid, but have different importance):
- technical
- official (design) guidelines / usability
- personal taste

In that respect (especially for issues that are either of the last category, or where the category is not clear), it would be good if more people could comment.
------------------------------
********** IME and highlighting  **********

The point of the below is **NOT** to impose my taste (that is not worth anything, since I do not use IME at all). It is for me to categorize the comments I received according to the above outline, and ideally get feedback from more people.


http://bugs.freepascal.org/view.php?id=27712#c82255
Quote
Can the highlighter be turned off during input with IME? (See img7.png)
The highlighter bothers input with IME.
(According to the image, this refers to the "same word highlight" during IME activity, i.e. within the IME.)


http://bugs.freepascal.org/view.php?id=27712#c82262
Quote
The IME has his own drawing attribute. (See https://www.coscom.co.jp/learnjapanese801/lesson11.html and https://developer.mozilla.org/en-US/docs/Mozilla/IME_handling_guide ).
So you must not draw your selfish attribute in during input with IME.
Or you should turn on "IME handled by system" by default.

1) According to the first link, in Japanese, words are not separated by spaces. So a "current word" highlight makes limited sense overall? (Does it act as a "current phrase"?) That leaves the question whether it should be switched off altogether for any Japanese text.

Btw, in such case, if you use SynEdit for Japanese, you can disable the feature entirely. But maybe in mixed text, some detection is needed?

2) The 2nd link mentions that there are 2 styles. One for the entire IME string, one for the clause (active clause). That exists in SynEdit.
The link explicitly states that (in case of full IME integration) those styles are not coming from the IME but are applied by the app. (It gives details on how they can be configured)
Quote
Style of each clause
... Therefore, it can be overridden by prefs
It does not mention if or how that style is affected by other styles that may apply in combination.


My tests with LibreOffice (my reference for full IME) show that the IME applies text color, background color, borders, over-lining, and strike-through to the active IME composition (those attributes are taken from the surrounding text).
LibreOffice does not apply underlining (as that would obviously interfere with the underlining of the IME style). (In OpenOffice the borders (frame) are several pixels away from the IME underline.)

I also think that in SynEdit it is correct to use the current font color (e.g. if strings and comments have different colors, depending on what you are editing).


That leaves the question of what to do in SynEdit. For the current word highlight, it first has to be decided whether it should apply to Japanese at all.
But there are also other highlights. You can set up user-defined markup (http://wiki.lazarus.freepascal.org/New_IDE_features_since#Multiple_user_defined_word_highlight.2Fmarkup)
to have some words or text parts highlighted. And there may be other highlights in the future that would apply.

A)
If they change font color, or bold/italic then they do not interfere with the IME. I would in such cases expect that it is a matter of taste, if they should be applied to an active IME.
Even if they change background, it would still be a matter of taste.

B)
Of course if they add an underline (or maybe a border, or a background color), that is, anything that obscures the IME dotted underline, then it becomes a usability case.


On (A) I would be keen to get opinions from more people who use an IME.

On (B) I would think it might be a good idea, if the active IME would suppress any frame and underline from other highlights.
(Background suppression can later be made an option, but would be simple to achieve by changing the code on your own PC.)

For the example of the "current word" highlight, that would mean that the background color would still change, but the border would no longer be drawn into the IME.

------------------------------
********** IME and the drop down **********

I noted that the IME dropdown is kind of glued right below the text.

Maybe between 1 and 3 extra pixels would improve the visibility of the underlines?

For testing:
components\synedit\lazsynimm.pas
line 518
In: procedure LazSynImeFull.WMImeRequest(var Msg: TMessage);
Code:
        cp^.cLineHeight := TCustomSynEdit(FriendEdit).LineHeight + 1;
The " + 1" does not currently exist. Also try " + 2" or " + 3"

------------------------------
Other comments?
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 05:23:45 pm
On a more serious note: There seems to be an actual bug.

The full IME does not support the changing of "clauses" by using shift-cursor-right/left.
That obviously will need fixing.


The system IME also draws indicative underlines for *all* clauses (the currently selected clause and the other clauses). But this is something that I do not see in LibreOffice (nor Firefox), the 2 full IME examples I know of. So I consider it an optional feature. (And SynEdit does not support any suitable underlining. That is, the underline needs to start 1 or 2 pixels into the word, to be distinct from the previous/next underline.) That is considerably more work, and therefore not a quick feature to add. But it is a welcome feature request for the long term.

Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 24, 2015, 06:29:20 pm
Martin, do you have access to Windows? malcome is using Notepad (Windows' IMM) for comparison.

There's a Mozilla link used as a reference.
I'm afraid that the Mozilla IME is used in Mozilla-based applications only (Firefox, Thunderbird) and cannot be used by SynEdit. But still, the article is pretty good for understanding how IMEs work.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 24, 2015, 06:59:49 pm
On the note about the colors of the IME: they should match the editor's selected colors. And styles (underlines) should follow the system IME defaults.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 07:35:19 pm
IME integration can be done at different levels.

Notepad is kind of minimum. App provides the coordinates on screen. Windows does all the rest. It works, but is not very nice, since it simply goes on top of existing text, and temporarily hides that.
In SynEdit, you get this if you choose "IME handled by system".

Libre Office and Firefox do a full integration. App draws the text as part of the entire document. SynEdit attempts that too. (Very few editors seem to do that)

Firefox docs help little, since it does not match Windows API. Good documentation on Windows API... well still looking for it. MSDN is of little help.

Sure, underlines should match defaults (as far as this is supported; SynEdit can not currently underline half a char).

The question was how much of the extra highlight should be applied. (That is only of interest for full IME).
IMHO, since the text is immediately part of the document, all extras should be applied. Exception: If such a highlight interferes with IME markup. E.g. underline is used by the IME, so it is not a good idea to apply other underlines.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 07:36:54 pm
Afaik "IME handled by system" now works like Notepad.

There was a bug about overwriting selection. Fixed, and also adapted the fix to be notepad like.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 07:39:17 pm
IME is Windows only so far.
That is part of the reason that there is little user configuration. When I did this, I wanted to first see how it would be on other platforms. But then again, that might be too long to wait for.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 24, 2015, 07:46:02 pm
Quote
Libre Office and Firefox do a full integration. App draws the text as part of the entire document. SynEdit attempts that too. (Very few editors seem to do that)

What about Windows WordPad?

Quote
The question was how much of the extra highlight should be applied. (That is only of interest for full IME).
IMHO, since the text is immediately part of the document, all extras should be applied. Exception: If such a highlight interferes with IME markup. E.g. underline is used by the IME, so it is not a good idea to apply other underlines.

Maybe you shouldn't apply extra highlight while composition is in progress? In that case IME users can distinguish between when composition is happening and when it's not happening.

Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 08:01:48 pm
Quote
What about Windows WordPad?

Behaves like LibreOffice. Except you can't test styles like frames. WordPad does not offer frames as a text format.

Quote
Maybe you shouldn't apply extra highlight while composition is in progress? In that case IME users can distinguish between when composition is happening and when it's not happening.

Well that was the initial question.

And as I said. Underlines, should be suppressed for exactly that reason.
Font color does not affect this. Neither does bold or italic.

Background color, may in some cases. That one is hard to decide.

The problem is: SynEdit is modular. The IME does not (and can not) know, which other modules apply highlights, or if a highlight is because it is a string (in pascal), or something else. (Well on that last note, SynEdit could be extended, but then we talk about bigger development, not just a change to IME)

IME can simply say cancel ALL other backgrounds (actually all below priority X, and if X is high enough...). But that will also cancel, if you configured pascal strings to have a yellow background.

Also (besides the question, what is a word, see orig post), if I type new text (and I have the "same word highlight" active), then personally, I would want to see the highlight as I type.

The same applies for user-defined-words highlight. I have a list of "wrong text fragments", that get an immediate red warning. I assume I would want to see them in the IME too?

Anyway, in cases where it does not interfere with the underlines of the IME, it is IMHO a personal choice.

@skalogryz Are you using an IME in daily work?
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 24, 2015, 08:29:36 pm
Quote
Behaves like LibreOffice. Except you can't test styles like frames. WordPad does not offer frames as a text format.

Hmm. I've tested IME in RichMemo - no surprises - it works, but looks a bit different than WordPad. (font size difference maybe?)

Quote
@skalogryz Are you using an IME in daily work?

Nope. Just had some experience working with IME in the past.
We (the company I worked for) ran into exactly the same problem as the Lazarus team right now - lack of knowledgeable users and testers :)
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 08:38:34 pm
Quote
Hmm. I've tested IME in RichMemo - no surprises - it works, but looks a bit different than WordPad. (font size difference maybe?)

If you start the IME in the middle of a line, then while you type, does the text to the right of the IME move? Or does the IME overlap that text?

Quote
Nope. Just had some experience working with IME in the past.
We (the company I worked for) ran into exactly the same problem as the Lazarus team right now - lack of knowledgeable users and testers

I am happy for any feedback I can get. After all I am flying blind.

But before I go and develop something, I want to
1) ensure I understand it deeply enough
2) differentiate between personal taste (even if it might be a very, very common taste, I could not judge that anyway) and guideline/technical issues.

If I change something because of taste, chances are that someone else will complain and want it back.

Also, understanding things helps me to find a reasonable solution I can add now, rather than a perfect one I can do at some time in the distant future.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 08:56:48 pm
Quote
On a more serious note: There seems to be an actual bug.

The full IME does not support the changing of "clauses" by using shift-cursor-right/left.
That obviously will need fixing.

Fixed in 1.5
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 24, 2015, 09:28:28 pm
Quote
If you start the IME in the middle of a line, then while you type, does the text to the right of the IME move? Or does the IME overlap that text?

It does move the text to the right (https://www.youtube.com/watch?v=yNnwjYUlv70). But that's no surprise, since even in Vista WordPad does the same. I'd think it has been doing that since IME support was added to the Windows RichEdit control.

Quote
But before I go and develop something, I want to
1) ensure I understand it deeply enough
2) differentiate between personal taste (even if it might be a very, very common taste, I could not judge that anyway) and guideline/technical issues.

If I change something because of taste, chances are that someone else will complain and want it back.

Also, understanding things helps me to find a reasonable solution I can add now, rather than a perfect one I can do at some time in the distant future.

The article on Mozilla explains the logic behind "underlines" pretty well.
0) (not explained in the article) If a text has no underlines, it is ready-to-read text (this text is subject to any additional underlines by SynEdit).
1) A curved underline is user entry (in the video, you can see that Windows is using a "dashed underline" instead). So this is raw input and might not match the final text.
2) Then the entered text is split into clauses. Each defined clause is underlined by a single line. The currently edited clause is underlined by a thick line (or double line). (I kind of skimmed through these steps in the video.) BUT the coscom article (www.coscom.co.jp) suggests that you need to press Shift-Left; in Windows you instead need to press Shift-Right to switch between clauses.
3) Once the adjustment of clauses is done, hit Enter.
The whole section becomes non-underlined. That's an indication that entry is finished and the text is in "ready-to-read" mode... and so on, back to 0).

I presume that CJK speakers/writers rarely use clause adjustments. And it's as hard for them as it is for us to hold Shift when typing capital letters.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 24, 2015, 09:52:53 pm
From SynEdit's perspective, all you want to know is the state of the composition (what the clauses are (if any are defined), what the raw input text is). Then you could "redraw" the right size according to the width of the composition text.
Knowledge about clauses is needed to draw correct underlines.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 10:07:46 pm
Quote
0) (not explained in the article) If a text has no underlines, it is ready-to-read text (this text is subject to any additional underlines by SynEdit).
1) A curved underline is user entry (in the video, you can see that Windows is using a "dashed underline" instead). So this is raw input and might not match the final text.
2) Then the entered text is split into clauses. Each defined clause is underlined by a single line. The currently edited clause is underlined by a thick line (or double line). (I kind of skimmed through these steps in the video.) BUT the coscom article (www.coscom.co.jp) suggests that you need to press Shift-Left; in Windows you instead need to press Shift-Right to switch between clauses.
3) Once the adjustment of clauses is done, hit Enter.
The whole section becomes non-underlined. That's an indication that entry is finished and the text is in "ready-to-read" mode... and so on, back to 0).

In SynEdit:
1) Dotted, easy to change to dashed. I've seen both styles.
2) The current SynEdit text drawer has no support to "interrupt" an underline. So it is currently not possible to underline each clause. (WordPad switches to a continuous, non-dotted underline, so you can not see where clauses start/end either.)
The key presses are handled by the system, so they are the same everywhere: Shift-Right/Left resizes the clause (shown in a selection-like style in most editors).
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 24, 2015, 10:13:20 pm
Quote
From SynEdit's perspective, all you want to know is the state of the composition (what the clauses are (if any are defined), what the raw input text is). Then you could "redraw" the right size according to the width of the composition text.
Knowledge about clauses is needed to draw correct underlines.

Try it in SynEdit (with both modes), and with today's latest 1.5.

I can read the clause info from the system. But I can not currently draw underlines, that leave a tiny gap. So that is for sometime in the future.
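
To make the "tiny gap" idea concrete, here is a minimal sketch of how per-clause underline segments could be computed. It is not SynEdit code: ClauseUnderlines, TSegment and all parameters are hypothetical names, the clause boundaries are assumed to be given as cell offsets (first entry 0, last entry the composition length), and every cell is assumed to have the same pixel width.

```pascal
program ClauseUnderlineSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical helper type: one underline, from X1 (inclusive) to X2 (exclusive), in pixels.
  TSegment = record
    X1, X2: Integer;
  end;
  TSegmentArray = array of TSegment;

// Bounds holds clause boundaries as cell offsets, e.g. [0, 2, 5, 6] for three clauses.
// Each underline is shortened by Gap pixels on both ends so neighbours do not touch.
function ClauseUnderlines(const Bounds: array of Integer;
  StartX, CharWidth, Gap: Integer): TSegmentArray;
var
  i: Integer;
begin
  Result := nil;
  if Length(Bounds) < 2 then
    Exit; // fewer than two boundaries means there is no clause to underline
  SetLength(Result, Length(Bounds) - 1);
  for i := 0 to High(Result) do
  begin
    Result[i].X1 := StartX + Bounds[i] * CharWidth + Gap;
    Result[i].X2 := StartX + Bounds[i + 1] * CharWidth - Gap;
  end;
end;

var
  Segs: TSegmentArray;
  i: Integer;
begin
  // Three clauses, composition starting at x = 100, 8 px per cell, 1 px gap.
  Segs := ClauseUnderlines([0, 2, 5, 6], 100, 8, 1);
  for i := 0 to High(Segs) do
    WriteLn('clause ', i, ': ', Segs[i].X1, ' .. ', Segs[i].X2);
end.
```

The hard part in SynEdit is not this arithmetic but that the text drawer would have to accept underlines that do not start and end on a cell boundary.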

I also have to see how well I can suppress other underlines.

The rest of the questions have not yet been answered.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 24, 2015, 11:47:02 pm
malcome approved the fix. That's far more reliable than me testing anything :)
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 25, 2015, 12:16:12 am
That was only "overwrite selection"

That is why I opened the topic here. We started discussing in mantis, and adding more and more IME related issues or features.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 25, 2015, 12:34:55 am
I added
Code:
    property OnIMEStart: TNotifyEvent read FOnIMEStart write FOnIMEStart;
    property OnIMEEnd: TNotifyEvent read FOnIMEEnd write FOnIMEEnd;

So outside the IDE, you can enable/disable additional highlights any way you want to.
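
A minimal, self-contained sketch of how such a pair of events could be used to switch a highlight off during composition. Only the two event names are taken from the code above; TMockImeHandler, TEditorOptions and the CurrentWordHighlight flag are hypothetical stand-ins, not SynEdit classes, so the sketch runs without the LCL.

```pascal
program ImeEventsSketch;
{$mode objfpc}{$H+}

uses
  Classes; // for TNotifyEvent

type
  // Hypothetical stand-in for the IME handler; only the event names mirror the post.
  TMockImeHandler = class
  private
    FOnIMEStart: TNotifyEvent;
    FOnIMEEnd: TNotifyEvent;
  public
    procedure BeginComposition; // simulates the IME opening
    procedure EndComposition;   // simulates the IME closing
    property OnIMEStart: TNotifyEvent read FOnIMEStart write FOnIMEStart;
    property OnIMEEnd: TNotifyEvent read FOnIMEEnd write FOnIMEEnd;
  end;

  // Hypothetical application object holding one flag for an optional highlight.
  TEditorOptions = class
  private
    FSaved: Boolean;
  public
    CurrentWordHighlight: Boolean;
    procedure ImeStarted(Sender: TObject);
    procedure ImeEnded(Sender: TObject);
  end;

procedure TMockImeHandler.BeginComposition;
begin
  if Assigned(FOnIMEStart) then
    FOnIMEStart(Self);
end;

procedure TMockImeHandler.EndComposition;
begin
  if Assigned(FOnIMEEnd) then
    FOnIMEEnd(Self);
end;

procedure TEditorOptions.ImeStarted(Sender: TObject);
begin
  FSaved := CurrentWordHighlight; // remember the user's setting
  CurrentWordHighlight := False;  // suppress the highlight while composing
end;

procedure TEditorOptions.ImeEnded(Sender: TObject);
begin
  CurrentWordHighlight := FSaved; // restore the setting afterwards
end;

var
  Ime: TMockImeHandler;
  Opts: TEditorOptions;
begin
  Ime := TMockImeHandler.Create;
  Opts := TEditorOptions.Create;
  try
    Opts.CurrentWordHighlight := True;
    Ime.OnIMEStart := @Opts.ImeStarted;
    Ime.OnIMEEnd := @Opts.ImeEnded;

    Ime.BeginComposition;
    WriteLn('during composition: ', Opts.CurrentWordHighlight);
    Ime.EndComposition;
    WriteLn('after composition:  ', Opts.CurrentWordHighlight);
  finally
    Opts.Free;
    Ime.Free;
  end;
end.
```

In a real application the two handlers would be assigned to the events of the editor's IME handler, and the flag would be replaced by whatever markup you want to enable or disable.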
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 25, 2015, 12:45:55 am
Quote
That is why I opened the topic here. We started discussing in mantis, and adding more and more IME related issues or features.

Well... somehow we need to bring more CJK community people here. They could test (patch) for sure.
I remember that one of the guys gave a link (http://forum.lazarus.freepascal.org/index.php?topic=6888.0) to Chinese Lazarus forum.
Links are dead now.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: malcome on March 25, 2015, 11:23:45 am
Hi Martin,
I am trying Microsoft Visual C++ 2010 Express for now.
It respects IME drawing. See attached image.

Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: malcome on March 25, 2015, 11:43:27 am
I am trying Geany. It respects the IME.
But it seems to have a cruel bug.
Your Editor is great.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 25, 2015, 04:22:12 pm
I do not have Geany. Visual Studio is a good idea.
I have the 2010 Express version. That uses the equivalent to "Handled by System".

"Handled by system" is less work for the IDE. And it means, that the IDE can not control the highlight. (actually, it could probably set font, and background color...)

Not "Handled by system" (full integration) is more work for the IDE, but allows all the extras.

So comparing the full integration to a "handled by system" is not possible, they are different things.

We do not know, if visual studio wants to set those colors, since it technically can not.

-----------------
Anyway, ignoring if it should or should not.
And also, that some of it might not be trivial to fix, so if it was agreed to be fixed, the fix would probably be someday later.

**IF** it was done, what should be drawn?

*Currently the font color does follow context. If you type in a string, the font is blue. Similar in a comment. (if you type outside a string it is red (should maybe be black, shortcoming of the highlighter))
Should the text in the IME be blue while typing in a string/comment? Or should it be black?

*If you use "Highlight current line", then the background of the current line is light-green.
If the IME is open on that line, should the IME show a white background, or follow the light green?

*If a breakpoint is set on that line, the background is red.
If the IME is open on that line, should the IME show a white background, or follow the red?

*Current word highlight:
You answered already => you do not want this highlight.
If the highlight used only the font color (and there was no frame and no background), should the IME allow it?

*User defined word - highlight (see wiki) http://wiki.lazarus.freepascal.org/New_IDE_features_since#Multiple_user_defined_word_highlight.2Fmarkup
Should IME show them? All? None? Some in some cases?

Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 25, 2015, 04:28:34 pm
As noted on mantis:
Code:
  FImeMarkupSelection.MarkupInfo.BackPriority := 99999; // or higher, overrides all other backgrounds
  FImeMarkupSelection.MarkupInfo.Background := clWhite; // whatever your back color is

  FImeMarkupSelection.MarkupInfo.ForePriority := 99999; // or higher, overrides all other foregrounds
  FImeMarkupSelection.MarkupInfo.Foreground := clBlack; // whatever your font color is
That will set background and foreground color fixed.

Code:
  FImeMarkupSelection.MarkupInfo.StylePriority := 99999; // or higher, overrides all other styles
  FImeMarkupSelection.MarkupInfo.Style := [];
  FImeMarkupSelection.MarkupInfo.StyleMask := [fsBold, fsItalic, fsUnderline];
This will REMOVE all styles.


Unfortunately, this does not work for frames. They can not currently be removed. That needs changes to parts of the highlight system.
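
The effect of such a priority can be illustrated with a small stand-alone sketch. This is not how SynEdit merges markup internally; TLayer, MakeLayer and WinningBackground are hypothetical names, and the only assumption taken from the above is that the background with the highest priority wins.

```pascal
program MarkupPrioritySketch;
{$mode objfpc}{$H+}

const
  NoColor = -1; // hypothetical marker for "this layer sets no background"

type
  // Hypothetical, simplified markup layer: a background color and its priority.
  TLayer = record
    Name: string;
    Background: LongInt;
    BackPriority: Integer;
  end;

function MakeLayer(const AName: string; ABackground: LongInt;
  APriority: Integer): TLayer;
begin
  Result.Name := AName;
  Result.Background := ABackground;
  Result.BackPriority := APriority;
end;

// Returns the index of the layer whose background wins, or -1 if no layer sets one.
function WinningBackground(const Layers: array of TLayer): Integer;
var
  i: Integer;
begin
  Result := -1;
  for i := 0 to High(Layers) do
    if (Layers[i].Background <> NoColor) and
       ((Result < 0) or (Layers[i].BackPriority > Layers[Result].BackPriority)) then
      Result := i;
end;

var
  Layers: array[0..2] of TLayer;
  Winner: Integer;
begin
  Layers[0] := MakeLayer('pascal string (yellow)', $00FFFF, 0);
  Layers[1] := MakeLayer('current line (light green)', $C0FFC0, 100);
  Layers[2] := MakeLayer('IME composition (white)', $FFFFFF, 99999);

  Winner := WinningBackground(Layers);
  if Winner >= 0 then
    WriteLn('background taken from: ', Layers[Winner].Name);
end.
```

Note that the yellow string background loses as well. That is the side effect mentioned earlier: a high enough priority cancels every other background, wanted or not.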
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 25, 2015, 04:38:04 pm
Which one is best? Compare the distance.

First is default / current

line 555
Code:
        cp^.cLineHeight := TCustomSynEdit(FriendEdit).LineHeight;     // default / current
        cp^.cLineHeight := TCustomSynEdit(FriendEdit).LineHeight + 1; // 1 extra pixel
        cp^.cLineHeight := TCustomSynEdit(FriendEdit).LineHeight + 2; // 2 extra pixels
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: malcome on March 26, 2015, 12:03:51 am
IMHO, the same space as in the case of "IME handled by system".

BTW
Actually, this is a very tiny problem.
Because we use the IME only in comments or string constants in the IDE Source Editor.
That means "IME handled by system" alone is enough.
I am a programmer too, so I can understand your feeling that you'd like to make the effort the leading role.
The above is only about the "IDE Source Editor", not about the "TSynEdit Component".
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 26, 2015, 02:19:36 am
Well, I don't mind adding an option for the color. But I need to understand the exact details.

In the IDE it is Pascal most of the time (so comment and string).
But even then, someone may have green comments and blue strings. Then what? Should that part of the color be preserved?

If a person writes their own highlighter, then what should be preserved?

See the list of questions I posted.

----------------------
The 2nd part is that suppressing the color is not that simple.

Well, not if done in SynEdit. Using OnIMEStart you can do what you want.....

Look at the code I posted. It works for background color, but not for frames. So this is a task I might do later.

But if so, then I will try to figure out the details now, then open an issue on Mantis with the details, and take care of it later.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: malcome on March 26, 2015, 10:55:22 am
Isn't the link below helpful, for now?
https://api-dev.bugzilla.mozilla.org/show_bug.cgi?id=159263
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: malcome on March 26, 2015, 11:26:06 am
I read the link a little. It seems to be a very hard task....
I think it's better to pour your wonderful ability into other big ones...

They worked for 4 years....
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 26, 2015, 05:13:16 pm
https://api-dev.bugzilla.mozilla.org/show_bug.cgi?id=159263#c5
Quote
I think this is not bug.
Because on Mozilla, IME composition string is drawn by Mozilla.
But on many other applications, that is drawn by IME(on composition window).
We cannot get the IME color settings with WinAPIs.
Because the APIs is not existing.

Please differentiate between
1) "drawn by IME" (== handled by system)
2) drawn by app / full integration

-----------------
(1) should work in SynEdit. However the IME uses proportional font. SynEdit has no influence on this.

(2) As in the quote: Those colors are NOT known.

-----------------

What can be done:

A) I can add configuration for IME colors to SynEdit/IDE

B) I can look at disabling only parts of other highlights, e.g. those highlights that make the IME underline hard to spot.


As I said NOT all of it will be now.

-----------------

I found another issue for all "drawn by app / full integration" editors (WordPad, OpenOffice, Firefox...).

If the IME text is partly scrolled off the screen, you can not see the current convert target (underlined/drop-box). It does not scroll into view when needed.

Do you think this should scroll to the convert-target, even if that means caret is scrolled off screen?

-----------------

You can always make changes to your SynEdit.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 26, 2015, 05:20:22 pm
imho, there shouldn't be an option.
"non-integrated" approach (the same as observed in Notepad) is just lousy in 2015 on the most modern IDE in the world.

Btw, you can make "non-integrated" look like "integrated" as long as you know the width of the composition text window.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 26, 2015, 05:39:30 pm
Quote
imho, there shouldn't be an option.
"non-integrated" approach (the same as observed in Notepad) is just lousy in 2015 on the most modern IDE in the world.

Hence it is not the default. As I said, MS Visual Studio Express 2010 has the non-integrated one.

In SynEdit it's a separate class. If you drop SynEdit on a form, only the full integration gets compiled into your app (smart linking). You need to add the non-integrated one yourself, if you want it.
In the IDE there is an option. No harm.

Quote
Btw, you can make "non-integrated" look like "integrated" as long as you know the width of the composition text window.

Partly. It still would not have the same char spacing, because SynEdit applies enforced monospacing.

Also the non-integrated window will wrap if needed; then the length no longer matters.

Can someone test with a current MS-Office/Word ?
I assume that will be integrated, but what if (hint, reduce window width):
- the IME becomes longer than the line, and is scrolled
- press space to convert the chars at the begin (scrolled out)

In this case the caret (correctly) remains at the end of the IME. But in all apps that I tested, that means that the part you actually work on (to which the drop down applies) is scrolled out.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 26, 2015, 07:10:00 pm
Scrolling fixed in r48512

In case it is not wanted, change
Code:
  FAdjustLeftCharForTargets := True;
to False.
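
For anyone who wants to play with the behaviour, the idea behind such a scroll adjustment can be sketched as follows. This is not the code from r48512; AdjustLeftChar and its parameters are hypothetical, columns are assumed to be 1-based, and TargetEnd is assumed to be the first column after the convert target.

```pascal
program ScrollToTargetSketch;
{$mode objfpc}{$H+}

// Returns the first visible column needed so that the convert target
// TargetStart .. TargetEnd-1 is inside the window. If the target is wider
// than the window, its start wins.
function AdjustLeftChar(LeftChar, CharsInWindow,
  TargetStart, TargetEnd: Integer): Integer;
begin
  Result := LeftChar;
  // Target ends right of the window: scroll right just far enough.
  if TargetEnd > Result + CharsInWindow then
    Result := TargetEnd - CharsInWindow;
  // Target starts left of the window: scroll left to its start.
  if TargetStart < Result then
    Result := TargetStart;
  if Result < 1 then
    Result := 1;
end;

begin
  // The window shows columns 40..119, the convert target is at columns 1..5.
  WriteLn('new LeftChar: ', AdjustLeftChar(40, 80, 1, 6));
  // The target at columns 50..59 is already visible, so nothing changes.
  WriteLn('new LeftChar: ', AdjustLeftChar(40, 80, 50, 60));
end.
```

The caret itself is deliberately ignored here, which matches the question above: the target is scrolled in even if that moves the caret off screen.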
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 26, 2015, 07:34:33 pm
I added that the IME prevents underlines from other styles.

Preventing frames is a TODO, because the current markup code does not allow it.

Preventing font and background color, italics or bold will not be added as default, but when/if colors become configurable, it can be specified by the user.

Only issue: if the surrounding code is underlined, the IME underline joins the surrounding underline. The markup currently does not allow leaving a gap in such cases. That might one day be added.....


That's it for now.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: malcome on March 26, 2015, 09:25:33 pm
I think, distinguish about "IDE Source Editor" and about "SynEdit Component"?

About "SynEdit Component",
We can customize freely for our user. change property, override method, ...
(Of course this is thanks to your preparations)
In a worst case, we can patch your code.
(Of course this is thanks to your decided Open Source)
So We aren't in trouble so much. I don't have an opinion, so far.

About "IDE Source Editor",
You have to make good one for Lazarus users.
I'm one of them. So I have an opinion. (All things I have told so far are about that)

-----------------

I think we should also distinguish between a "Text Editor" (≒ IDE Source Editor) and a "Word Processor".

The two are clearly distinguished, in Japan at least.
A Word Processor exists to make beautifully decorated documents.
A Text Editor, in contrast, is required to be lightweight.
Of course, we're talking about a Text Editor (I believe).
If the Lazarus Source Editor is heavy, a lot of people would quit using it.
Your responsibility is very important.

-----------------

I knew about the scroll problem. Why was I silent?
Because we do not input such long text with the IME,
and we use only about 80 chars per line. (The right margin indicator is very convenient.)
So I wasn't interested.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 26, 2015, 09:37:39 pm
Well thanks for all the feedback.

As you can see, I plan to add config for some of the still open issues. But that's a bigger task, so it will come later.


Until then, if you find any other problem, let me know.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: malcome on March 26, 2015, 10:14:29 pm
I have increased your workload. Sorry.
I'd like to say: your editor does not have big problems, and your editor is great.
Good luck!
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: malcome on March 26, 2015, 10:51:41 pm
....I lied to you.
http://bugs.freepascal.org/view.php?id=27707 is a big problem.

PS
I'm planning to play by Linux in the summer holidays.
If Japanese input is perfect on Lazarus in Linux then, I'd call you "God Martin".
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 27, 2015, 12:54:51 am
Quote
....I lied to you.
http://bugs.freepascal.org/view.php?id=27707 is a big problem.
Let me ask you this question: is there any instance when a character (specifically quotes) should not be shown as full width in a text-editor or word-processor?

Quote
PS
I'm planning to play by Linux in the summer holidays.
What about OSX?
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 27, 2015, 02:56:54 am
Quote
Let me ask you this question: is there any instance when a character (specifically quotes) should not be shown as full width in a text-editor or word-processor?

Well, if I understand the Unicode specs correctly, any char that is defined as either narrow or half-width, but not ambiguous, should be shown half-width (always).

Equally, all chars that are wide or full-width, but not ambiguous, are shown full width.

But for ambiguous chars the Unicode spec seems to give no hint at all. (Well, as far as I got reading it...)


And for ambiguous chars it does not even depend just on the settings of the OS. It depends on other factors.

For example, on my PC some ambiguous chars are displayed half-width with one font, full-width with another.
But I guess nothing even guarantees that it will always be the same with the same font. At least in theory it could depend on context (e.g. surrounding chars).

Anyway, SynEdit can override widths, and for ease of editing it may make sense to show all ambiguous chars as full width, so they fit into the same grid as the Japanese glyphs.
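
A minimal sketch of that override, in Free Pascal, just to illustrate the idea. The type TWidthClass, the function GridCells and the AmbiguousAsWide switch are hypothetical names made up for this example. They are not SynEdit's actual identifiers.

```pascal
program AmbiguousWidthSketch;
{$mode objfpc}{$H+}

type
  // Hypothetical names for the East Asian Width classes (not SynEdit's types)
  TWidthClass = (wcNeutral, wcNarrow, wcHalf, wcAmbiguous, wcWide, wcFull);

// Returns how many grid cells a char of the given class occupies.
// AmbiguousAsWide is an assumed user option that forces ambiguous chars
// into the double-width grid.
function GridCells(AClass: TWidthClass; AmbiguousAsWide: Boolean): Integer;
begin
  case AClass of
    wcWide, wcFull:
      Result := 2;
    wcAmbiguous:
      begin
        if AmbiguousAsWide then
          Result := 2
        else
          Result := 1;
      end;
  else
    Result := 1;  // neutral, narrow and half-width always take one cell
  end;
end;

begin
  WriteLn('ambiguous, option off: ', GridCells(wcAmbiguous, False));
  WriteLn('ambiguous, option on:  ', GridCells(wcAmbiguous, True));
  WriteLn('wide:                  ', GridCells(wcWide, False));
end.
```

Only the ambiguous class consults the option. The unambiguous classes keep their fixed widths, as described above.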

Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 27, 2015, 03:44:22 am
I'm thinking about two things.

First. The context that should be used to resolve ambiguity. Is the content of the text a context? It surely is. But just like in malcome's patch, the system language is also a context.

Second. SynEdit considers all text as full-width anyway (it doesn't have half-width!).
Remember this post (http://forum.lazarus.freepascal.org/index.php/topic,27385.msg169106.html#msg169106)? It's not related, but if every character would be "stretched" ... but that's not the point.

I presume SynEdit stops treating any Unicode sequences as monospace characters and tries to render them as a single line of text. (Cyrillic too?) Maybe an exception should be made for Unicode characters from the CJK group? In that case each such character would be rendered as a full-width monospace character. That should look as expected.
I'd think that treating CJK characters as a sort of exception should do the trick easily.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 27, 2015, 05:11:36 am
Quote
First. The context that should be used to resolve ambiguity. Is the content of text a context? It surely is. But just like in malcome patch - system language is also a context.

Yes it is, and I would be happy to use it. But someone needs to code a list of ambiguous chars. Malcome's patch affects many half-width chars.

Now I can see that someone editing Japanese in monospace might want to force half-width chars into the full-width grid too, and even that is OK. Only not as the default. Just because a system's codepage is Japanese does not mean that a Latin "A" becomes double-width. That would have to be an option that needs explicit enabling.


Furthermore: "system language is also a context." A context. One, but not the only one. The problem here is that if the system decides that a char has a certain width, then one must be careful about enforcing another. (See the link you posted.)

In most cases enforcing a bigger width is the lesser issue. So OK, there would be a risk that an ambiguous char on a Japanese PC gets some extra spacing. Not good, but tolerable.

Then there is the problem of what happens on a non-Japanese system. Many ambiguous chars are narrow. So that should be the choice, or should it not? In a Latin text you don't want a quote to introduce an extra space, do you?
Only, depending on the font, some (but not all) ambiguous chars may be double-width, and forced to narrow they will overlap....

So this issue affects non-Japanese setups too. (Actually maybe only if East Asian fonts are installed.)

Anyway, I am taking it further than the current issue needs: detecting the codepage is fine, *IF* there is a list of ambiguous chars. More fixes will be needed, but it is a good start.


Quote
Second. SynEdit considers all text as full-width anyway (it doesn't have half-width!).
SynEdit's name for "full-width" is "double-width". But that is just naming. SynEdit has 2 widths for monospaced chars. (And more for tabs.)

Quote
I presume SynEdit stops treating any unicode sequences as monospace characters and tries to render then as a single line of text. (cyrillic too?) Maybe an exception should be made for unicode characters from CJK group? In this each character would be rendered as full-width monospace characters. That should look as expected.
I'd think that having CJK characters as sort of exception should do the trick easily.

SynEdit treats everything as monospace, except that "mono" here means two widths (single and double).

Without that, Japanese would not fit SynEdit's grid. East Asian fonts on Windows use some sort of font fallback.
If not forced by SynEdit, Japanese chars would have approx. 1.7 times the width of a Latin char. SynEdit expands that to a factor of 2.

If that was not done, SynEdit (expecting everything to fit the grid) could not place the caret correctly.

If Japanese chars were allowed their desired width, SynEdit would need to support proportional font behaviour to place the caret correctly. (Maybe some day, but given the history SynEdit has, that is not possible now.)
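
To illustrate why the grid matters for the caret, here is a rough Free Pascal sketch. It assumes every char occupies either 1 or 2 cells of a fixed pixel width. CaretX and its parameters are hypothetical names, not SynEdit code.

```pascal
program CaretGridSketch;
{$mode objfpc}{$H+}

// Cells[i] holds the number of grid cells of the i-th char
// (1 = single width, 2 = double width).
// Returns the pixel x position of a caret placed before char CharIndex
// (0-based), assuming every cell is CellWidthPx pixels wide.
function CaretX(const Cells: array of Integer;
  CharIndex, CellWidthPx: Integer): Integer;
var
  i, Col: Integer;
begin
  Col := 0;
  for i := 0 to CharIndex - 1 do
    if i <= High(Cells) then
      Inc(Col, Cells[i]);
  Result := Col * CellWidthPx;
end;

begin
  // A Latin char, a Japanese glyph, a Latin char; hypothetical 8 pixel cell.
  // Caret placed right after the Japanese glyph:
  WriteLn(CaretX([1, 2, 1], 2, 8));
end.
```

With a glyph drawn at roughly 1.7 cells instead of 2, the multiplication would no longer match what is on screen, which is the proportional-font problem mentioned above.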
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on March 27, 2015, 05:49:11 am
Quote
Yes it is, and I would be happy to use it. But someone needs to code a list of ambiguous chars.
I think there should be a utility in FPC to convert these Unicode tables to Pascal arrays, no?
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on March 27, 2015, 06:23:33 am
Well currently I have a huge case statement.
So I would need a tool to generate this.

But I want to test storing this as a data structure, and reduce the loop code. Then I could also have different data structures for different locales.

So yes, ideally there will be a tool to generate the code for this. But right now, I do not have one.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on April 08, 2015, 05:27:29 am
Quote
Well currently I have a huge case statement.
So I would need a tool to generate this.

So yes, ideally there will be a tool to generate the code for this. But right now, I do not have one.
OK, here you go.
Both the product, cjkinfo.pas (to look up char widths), and the tool (if you want something different from the product).

The lookup is in cjkinfo.pas: the single function GetCJKWidth, which should return the width value for a Unicode character.

The rest of the files are the tool. data11.pas is a utility unit to parse/read the file, with additional utilities to store and process the parsed information.
The generation of Pascal code is in data11read.pas.

Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on April 08, 2015, 11:57:08 am
Please add it to the bug report, so I can add it, time permitting.

The problem here is that all codepoints have to be converted to UTF-16 first. This will add to the time consumption of this code.

The current implementation is already on my "must be changed" list, because it turned out to be too slow.
It did not matter with just one caret (except when you paste a huge text from the clipboard, or when you indent many thousands of lines).

But now that there is multi-caret, and the time consumption is multiplied for each caret, it matters more.

But I will add it, at least to be able to compare. I just need to find the time to do it.

Do you know what license this code has? I need something compatible with SynEdit.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on April 08, 2015, 02:23:42 pm
Quote
The problem here is to convert all codepoints to utf16 first. This will add to the time consumption of this code.
So are you looking for the ability to find the width directly from UTF-8?
Functions like these:
Code: [Select]
  // utf8 - pointer to the first byte of the UTF-8 sequence
  // bytesinchar - length of the UTF-8 character in bytes (it could also be read from the first byte itself)
  function GetCJKWidth(utf8: PChar; bytesinchar: Integer): TCharWidth; overload;
  function GetCJKWidth(utf8: PChar): TCharWidth; overload;

Quote
DO you know what license this code has? I need something compatible with SynEdit.
It's whatever license is needed, since I'm the author.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on April 08, 2015, 02:52:25 pm
Look at the current code: it has a nested case.

However, I suspect that is not really good. Also, it cannot be modified at run time (see below).

So what I want to do is a tree-like lookup table. However, it needs to be optimized to use little memory, so that it fits into as few cache lines as possible.

lookup: array  [byte] of ... // firstbyte

Each page then has a low and high byte, and further pages. I already have a tree like this in another place, for searching multiple search terms in a string within one search run.
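
A rough Free Pascal sketch of such a tree, to make the idea concrete. It is not the SynEdit implementation: TPage, ChildFor, AddChar and CellsOf are hypothetical names, the sketch stores single chars instead of ranges, it never frees its pages, and nothing here is tuned for cache lines.

```pascal
program LeadByteTreeSketch;
{$mode objfpc}{$H+}

type
  PPage = ^TPage;
  TPage = record
    Cells: Byte;           // 0 = no width stored for a sequence ending here
    Lo, Hi: Byte;          // range of next bytes covered by Next (Lo > Hi = empty)
    Next: array of PPage;  // Next[b - Lo]; nil means "no entry"
  end;

var
  Lookup: array[Byte] of PPage;  // first level, indexed by the first byte

function NewPage: PPage;
begin
  New(Result);
  Result^.Cells := 0;
  Result^.Lo := 1;
  Result^.Hi := 0;
end;

// Returns the child page for byte B, growing the Lo..Hi window as needed.
function ChildFor(P: PPage; B: Byte): PPage;
var
  i, Shift, OldLen: Integer;
begin
  if P^.Lo > P^.Hi then begin
    P^.Lo := B;
    P^.Hi := B;
    SetLength(P^.Next, 1);
  end
  else if B < P^.Lo then begin
    Shift := P^.Lo - B;
    OldLen := Length(P^.Next);
    SetLength(P^.Next, OldLen + Shift);
    for i := OldLen - 1 downto 0 do
      P^.Next[i + Shift] := P^.Next[i];
    for i := 0 to Shift - 1 do
      P^.Next[i] := nil;
    P^.Lo := B;
  end
  else if B > P^.Hi then begin
    SetLength(P^.Next, B - P^.Lo + 1);  // new slots are zero-filled (nil)
    P^.Hi := B;
  end;
  if P^.Next[B - P^.Lo] = nil then
    P^.Next[B - P^.Lo] := NewPage;
  Result := P^.Next[B - P^.Lo];
end;

// Stores the cell count for one UTF-8 encoded char.
procedure AddChar(const S: string; ACells: Byte);
var
  P: PPage;
  i: Integer;
begin
  if S = '' then Exit;
  if Lookup[Ord(S[1])] = nil then
    Lookup[Ord(S[1])] := NewPage;
  P := Lookup[Ord(S[1])];
  for i := 2 to Length(S) do
    P := ChildFor(P, Ord(S[i]));
  P^.Cells := ACells;
end;

// Walks the tree byte by byte; returns DefCells if the char is not stored.
function CellsOf(Utf8: PChar; Len: Integer; DefCells: Byte): Byte;
var
  P: PPage;
  i: Integer;
  B: Byte;
begin
  Result := DefCells;
  if Len < 1 then Exit;
  P := Lookup[Ord(Utf8[0])];
  i := 1;
  while (P <> nil) and (i < Len) do begin
    B := Ord(Utf8[i]);
    if (B < P^.Lo) or (B > P^.Hi) then Exit;
    P := P^.Next[B - P^.Lo];
    Inc(i);
  end;
  if (P <> nil) and (P^.Cells <> 0) then
    Result := P^.Cells;
end;

var
  S: string;
begin
  AddChar(#$E3#$81#$82, 2);  // U+3042 HIRAGANA LETTER A as double width
  S := #$E3#$81#$82;
  WriteLn(CellsOf(PChar(S), Length(S), 1));
  S := 'A';
  WriteLn(CellsOf(PChar(S), Length(S), 1));
end.
```

Because the table is plain data, entries could be added or changed at run time, which is the point of moving away from the nested case.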

But that's just one idea. I need to see if it really speeds things up.

----------
Making it configurable will also allow mixing in font info. You can ask which Unicode ranges are supported by a font. (Eastern chars are in a fallback font.)

Because on my system, some but not all ambiguous chars are full width. Maybe that can be detected. But that's extra.

-----------
Currently there are several runs
- find codepoint borders (in byte array)
- find ltr/rtl info
- find charwidth

maybe those can be combined. But again that needs testing.

----------
The code is also in the wrong place. It was a hack to start adding some support at all.

But then, I don't want the above to defer improving the current code.

So the next step would be to aim at a lookup similar to the current one (never mind if the structure is in data or code).

----------
Anyway I have one or 2 other items on my list, that I want to finish first.
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on April 12, 2015, 06:38:23 am
Here you go! utf8 based look up.
As promised
Code: [Select]
function GetCJKWidth(utf8: PChar; defWidth: TCharWidth = cwN): TCharWidth; overload;
function GetCJKWidth(utf8: PChar; charLen: Integer; defWidth: TCharWidth = cwN): TCharWidth; overload;
charLen is the size of the UTF-8 character in bytes.

Internally it stores UTF-8 character ranges as pairs of Int64. Whenever a character is looked up, it is also converted to Int64, and then a binary search is performed over an array. Similar to the original version.

A small improvement could still be made to the search:
for each character, only a smaller section of the array would be searched (based on the character length), rather than the full array.
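
For readers without the attachment, the approach can be sketched roughly like this in Free Pascal. This is not the attached code: TWidthClass, TRange, PackUtf8 and LookupWidth are hypothetical names, and the three table entries are only a hand-picked illustration where a real table would be generated from the Unicode data.

```pascal
program Utf8RangeSearchSketch;
{$mode objfpc}{$H+}

type
  TWidthClass = (wcN, wcNa, wcW);  // hypothetical subset of the width classes
  TRange = record
    First, Last: Int64;            // UTF-8 bytes packed big-endian
    Width: TWidthClass;
  end;

const
  // Illustrative entries only, sorted by First.
  Ranges: array[0..2] of TRange = (
    (First: $20;     Last: $7E;     Width: wcNa),  // U+0020..U+007E
    (First: $E38181; Last: $E38296; Width: wcW),   // U+3041..U+3096 (Hiragana)
    (First: $E4B880; Last: $E9BFBF; Width: wcW)    // U+4E00..U+9FFF (CJK ideographs)
  );

// Packs the bytes of one UTF-8 char into an Int64, first byte highest.
// Packed this way, the numeric order matches the codepoint order.
function PackUtf8(Utf8: PChar; CharLen: Integer): Int64;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to CharLen - 1 do
    Result := (Result shl 8) or Ord(Utf8[i]);
end;

// Binary search over the sorted ranges; returns DefWidth if nothing matches.
function LookupWidth(Utf8: PChar; CharLen: Integer;
  DefWidth: TWidthClass): TWidthClass;
var
  Key: Int64;
  L, H, M: Integer;
begin
  Result := DefWidth;
  Key := PackUtf8(Utf8, CharLen);
  L := Low(Ranges);
  H := High(Ranges);
  while L <= H do begin
    M := (L + H) div 2;
    if Key < Ranges[M].First then
      H := M - 1
    else if Key > Ranges[M].Last then
      L := M + 1
    else begin
      Result := Ranges[M].Width;
      Exit;
    end;
  end;
end;

var
  S: string;
begin
  S := #$E3#$81#$82;  // U+3042 HIRAGANA LETTER A
  WriteLn(LookupWidth(PChar(S), Length(S), wcN) = wcW);
  S := 'x';
  WriteLn(LookupWidth(PChar(S), Length(S), wcN) = wcNa);
end.
```

The suggested improvement would amount to starting L and H at the sub-range of the table that holds keys of the same byte length.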

upd: attached to the bug tracker
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on April 12, 2015, 07:40:09 pm
I added your code. But for now left it in the ifdef.

I did some measurements.
- add 250.000 lines to SynEdit (inside BeginUpdate)
- add 250.000 carets (column mode selection on 3rd column, top to bottom, zero width)
  ecEditorTop,  ecRight,  ecRight,  ecColSelEditorBottom,
- type an "X" (at all 250.000 carets)

I admit 250k carets is not normal, but inserting that many lines can happen. This will also affect copy and paste, or changing the indent of selected lines.

Times include the entire execution. Since the double-width code makes up only part of the time, the actual difference in that code alone is much bigger.

Also, for fairness: I did not optimize the embedding of your code, that is, I did not inline the innermost call into the loop. No idea how much that would affect the result.

Accuracy can be +/- 0.1 sec

Times in seconds when compiled with all kinds of debug options (-Criot -Sa -O1 and others)
Old:
 1.24
 0.85
 5.00

Yours:
 2.48
 2.05
 8.83


Times in seconds when compiled without any of those and with -O3
Old:
 0.70
 0.57
 3.21

Yours:
 1.34
 1.21
 5.21

Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: skalogryz on April 13, 2015, 05:06:50 am
Ok. here's an update.

The performance gain should be achieved by eliminating unnecessary searches.

For example: if a character is one byte long and its code is outside the range of "Na" widths, then don't search and just return the width as "N".
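
A minimal Free Pascal sketch of that kind of shortcut, not the actual update. TryQuickWidth and TWidthClass are hypothetical names. The $20..$7E range is taken from the Unicode East Asian Width data, where printable ASCII is "Na" and the remaining one-byte codes are "N".

```pascal
program OneByteShortcutSketch;
{$mode objfpc}{$H+}

type
  TWidthClass = (wcN, wcNa);  // hypothetical subset, enough for one-byte chars

// Returns True when the width is known without searching the table.
// One-byte chars never need the search: they are either "Na" or "N".
function TryQuickWidth(Utf8: PChar; CharLen: Integer;
  out Width: TWidthClass): Boolean;
begin
  Width := wcN;
  Result := CharLen = 1;
  if Result and (Utf8^ in [#$20..#$7E]) then
    Width := wcNa;
end;

var
  S: string;
  W: TWidthClass;
begin
  S := 'A';
  if TryQuickWidth(PChar(S), Length(S), W) then
    WriteLn('one byte, no search needed');
  S := #$E3#$81#$82;  // U+3042, three bytes
  if not TryQuickWidth(PChar(S), Length(S), W) then
    WriteLn('multi-byte, fall through to the table search');
end.
```

In source code most chars are ASCII, so a check like this would skip the search for the bulk of the text.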
Title: Re: Considerations on IME - Japanese in SynEdit (probably all other IME users too)
Post by: Martin_fr on April 13, 2015, 05:15:34 am
Can you supply a patch to the unit, please? (Your previous code is already committed to SVN, so it should be a small patch.)