Is this right?



I'm experimenting with Korean characters. I ran a test and got the following results.

```pascal
procedure TForm1.btnTestClick(Sender: TObject);
var
  s: string;
  ws: WideString;
begin
  s := '가나다라 abc';                       // each Korean character seems to take 3 bytes
  memo1.lines.add(IntToStr(Length(s)));      // so this gives 16, which is correct
  ws := s;                                   // convert the string to WideString
  memo1.lines.add(IntToStr(Length(ws)));     // 8, which is also correct
  memo1.lines.add(copy(ws, 1, 2));           // returns '가나', which is correct
  memo1.lines.add(copy(ws, 1, 6));           // returns '가나다라 a', which is correct

  ws := '가나다라 abc';                      // but if I assign the literal directly to the WideString,
  memo1.lines.add(IntToStr(Length(ws)));     // this is 16, which means it is treated as an AnsiString
  memo1.lines.add(copy(ws, 1, 3));           // wrong result: does not return '가', only broken characters
end;
```

So if I first assign the literal to a normal string variable and then assign that variable to a WideString variable, it works. But assigning the literal directly to a WideString variable treats it as a plain byte string.

Is this the correct behavior or some kind of bug?

You should convert explicitly with `UTF8Encode()` and `UTF8Decode()`.
Automatic codepage conversion often gives wrong results.
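
A minimal sketch of the explicit conversion, written as a console program instead of the form handler from the question. It assumes the source file is saved as UTF-8 and compiled without a `{$codepage utf8}` directive, so the literal is stored as raw UTF-8 bytes, which matches the length of 16 reported above. The program name and the variable `back` are made up for illustration.

```pascal
program Utf8DecodeSketch;   // hypothetical name, for illustration only
{$mode objfpc}{$H+}

var
  s, back: string;
  ws: WideString;
begin
  // Assumption: this file is UTF-8 and no {$codepage utf8} is set,
  // so s holds the raw UTF-8 bytes of the literal.
  s := '가나다라 abc';
  WriteLn(Length(s));     // length in bytes (the original post reports 16)

  // Decode the UTF-8 bytes explicitly instead of relying on
  // an automatic codepage conversion.
  ws := UTF8Decode(s);
  WriteLn(Length(ws));    // length in UTF-16 code units (8 in the original post)

  // Cut on the WideString, then encode back to UTF-8 for
  // anything that expects UTF-8 text, such as an LCL control.
  back := UTF8Encode(Copy(ws, 1, 2));
  WriteLn(Length(back));  // two Korean characters, 3 bytes each under the assumption above
end.
```

In the original handler this would mean writing `ws := UTF8Decode('가나다라 abc');` instead of assigning the literal directly, and wrapping the results of `copy(ws, ...)` in `UTF8Encode()` before adding them to the memo.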

