Korean text into utf8

BIT:
Please tell me how to recode Korean text into UTF-8?

Alextp:
Install CudaText. Open your text files with Korean. Choose the Korean codepage in the statusbar: "Encoding / Reload as / <encoding>".
Then choose "File / Encoding / Convert to / UTF8". Save the file.

BIT:

--- Quote from: Alextp on September 15, 2021, 05:43:36 pm ---Install CudaText. Open your text files with Korean. Choose the Korean codepage in the statusbar: "Encoding / Reload as / <encoding>".
Then choose "File / Encoding / Convert to / UTF8". Save the file.

--- End quote ---

I want to open a Korean file in SynEdit.

BIT:
The problem I have is this:
The file itself is encoded in EUC-KR.
When it is opened in SynEdit, I convert it with WinCPToUTF8 (for Russian language support).
When I save it with UTF8ToWinCP, the Korean text turns into ??????.

skalogryz:

--- Quote from: BIT on September 15, 2021, 06:41:34 pm ---When opened in SynEdit, I convert to WinCPToUTF8, (for Russian language support)

--- End quote ---
This is not necessarily Russian language support; it is support for your "Windows code page".

If you need a more reliable way of conversion, you may need to specify the code page explicitly.
Here's an example using the FPC charset conversion tools.

--- Code: Pascal ---
uses
  .. cp949, charset;
  // cp949 is the Korean code page.
  // charset is the unit that provides the routines to convert the characters, based on "cp949".

procedure TForm1.FormCreate(Sender: TObject);
var
  s: string;
  u: WideString;
  f: Text;
  r: integer;
begin
  AssignFile(f, 'input.txt');
  Reset(f); // this is just to load characters from the file; the solution may vary
  ReadLn(f, s); // reads only the first line of the file
  CloseFile(f);

  // reserve one wide character per input byte; the real length is set below
  SetLength(u, Length(s));

  r := getunicode(PChar(s), Length(s), getmap(949), tunicodestring(@u[1])); // from "Charset"
  SetLength(u, r); // r is the number of characters actually written

  SynEdit1.Text := UTF8Encode(u);
end;
--- End code ---

Instead of dealing with the GetUnicode() function directly, one might use the "fpwidestring" wide-string manager, but I'm not sure whether it is friendly with the LCL.
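
As another way to "specify the code page explicitly", here is a minimal sketch that swaps the Charset unit for the `LConvEncoding` unit from Lazarus's LazUtils package. It assumes that `CP949ToUTF8` is available in your Lazarus version (the Asian code pages can be compiled out) and that treating the EUC-KR file as CP949 is acceptable, since CP949 extends EUC-KR. The names `KoreanToUtf8` and `ConvertFile` are hypothetical helpers made up for this illustration, not part of any library.

```pascal
program KoreanToUtf8Sketch;

{$mode objfpc}{$H+}

uses
  Classes, SysUtils,
  LConvEncoding; // LazUtils package; assumed to be on the unit search path

// Hypothetical helper: converts CP949/EUC-KR bytes to UTF-8 by naming the
// code page explicitly instead of relying on the system code page.
function KoreanToUtf8(const Raw: string): string;
begin
  Result := CP949ToUTF8(Raw);
end;

// Hypothetical helper: reads InName as raw bytes and writes UTF-8 to OutName.
procedure ConvertFile(const InName, OutName: string);
var
  src, dst: TFileStream;
  raw, utf8: string;
begin
  raw := '';
  src := TFileStream.Create(InName, fmOpenRead or fmShareDenyWrite);
  try
    SetLength(raw, src.Size);
    if Length(raw) > 0 then
      src.ReadBuffer(raw[1], Length(raw)); // whole file, no conversion yet
  finally
    src.Free;
  end;

  utf8 := KoreanToUtf8(raw);

  dst := TFileStream.Create(OutName, fmCreate);
  try
    if Length(utf8) > 0 then
      dst.WriteBuffer(utf8[1], Length(utf8)); // written as UTF-8, no BOM
  finally
    dst.Free;
  end;
end;

begin
  if ParamCount = 2 then
    ConvertFile(ParamStr(1), ParamStr(2))
  else
    WriteLn('usage: koreantoutf8sketch <input> <output>');
end.
```

In the SynEdit case the converted string could be assigned directly, e.g. `SynEdit1.Text := KoreanToUtf8(raw)`. When saving, the text would then be written as UTF-8 as it is: converting it back with UTF8ToWinCP on a Russian system targets a code page that has no Korean characters, which is where the ?????? comes from.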
