Extended ASCII use - 2

JuhaManninen:
Yes. The sample code applies to any UTF-8 text. I didn't quite understand what 'European' characters meant.

engkin:
Me neither, but I looked at his other post.

raymond:
While I evaluate the replies, I would say:
ASCII means 'American Standard Code ..' and has 128 values.
An accented character only occurs in a European language and sits in the 'Extended' part of ASCII (in the UTF8 code set).
My interest is in using fpc without Lazarus, as was possible with fpc on FreeDOS and code set IBM 850, which covers all the 'European' characters.
UTF8 is a single-byte code set and should be usable as such.
In fpc can one use UTF8 as single bytes in arrays ?

tetrastes:
UTF8 is NOT a single-byte code set. It uses one byte only for the first 128 (ASCII) characters. All other characters take 2 to 4 bytes.
So if you want what you call "'extended' part of ASCII" to be one-byte, change your system codepage to IBM850, and fpc will work as you want. ::)
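To make the byte counts concrete, here is a minimal sketch in Python (not fpc code, chosen only because it is quick to run). The helper name byte_lengths and the sample characters are made up for illustration. It reports how many bytes one character needs in UTF-8 and in code page 850, with None meaning the character does not exist in code page 850.

```python
# Compare the encoded size of single characters in UTF-8 and IBM code page 850.
# byte_lengths is a hypothetical helper name used only for this illustration.

def byte_lengths(ch):
    """Return (bytes in UTF-8, bytes in cp850 or None if not representable)."""
    utf8_len = len(ch.encode("utf-8"))
    try:
        cp850_len = len(ch.encode("cp850"))
    except UnicodeEncodeError:
        cp850_len = None  # the character has no slot in code page 850
    return utf8_len, cp850_len

samples = [
    "A",           # plain ASCII letter
    "\u00e9",      # e with acute accent
    "\u00e4",      # a with diaeresis
    "\u20ac",      # euro sign
    "\U0001F600",  # an emoji outside the Basic Multilingual Plane
]

for ch in samples:
    # ascii() keeps the output printable on consoles that are not UTF-8
    print(ascii(ch), byte_lengths(ch))
```

The accented letters are one byte each in code page 850 but two bytes each in UTF-8, which is why a byte array that worked under IBM 850 cannot simply be reinterpreted as UTF-8.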

Thaddy:

--- Quote from: raymond on January 11, 2022, 02:52:51 pm ---UTF8 is a single-byte code set and should be usable as such.
In fpc can one use UTF8 as single bytes in arrays ?

--- End quote ---
No. UTF8 is NOT a single-byte code set!
Indeed, CP_UTF8 can use up to 4 bytes per character, so for e.g. arrays and database fields you have to reserve up to 4 bytes per char! This is often overlooked.
So the answer is no. You cannot treat UTF8 as single-byte.
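As a rough illustration of what goes wrong when a UTF-8 byte array is indexed as if one byte were one character, here is a small Python sketch (again not fpc code). The function names utf8_char_len and split_chars are invented for this example. It assumes the input is valid UTF-8 and uses the standard lead-byte bit patterns to find where each character starts.

```python
# Walk a UTF-8 byte array character by character instead of byte by byte.
# utf8_char_len and split_chars are hypothetical names for this sketch.

def utf8_char_len(lead):
    """Length of a UTF-8 sequence, derived from its first (lead) byte."""
    if lead < 0x80:            # 0xxxxxxx: ASCII, one byte
        return 1
    if lead >> 5 == 0b110:     # 110xxxxx: two-byte sequence
        return 2
    if lead >> 4 == 0b1110:    # 1110xxxx: three-byte sequence
        return 3
    if lead >> 3 == 0b11110:   # 11110xxx: four-byte sequence
        return 4
    raise ValueError("not a UTF-8 lead byte")

def split_chars(data):
    """Split valid UTF-8 bytes into one bytes object per character."""
    chars = []
    i = 0
    while i < len(data):
        n = utf8_char_len(data[i])
        chars.append(data[i:i + n])
        i += n
    return chars

data = "caf\u00e9".encode("utf-8")   # 4 characters
print(len(data))                     # 5 bytes, not 4
print(split_chars(data))             # the accented letter occupies 2 bytes

try:
    data[:4].decode("utf-8")         # cuts the last character in half
except UnicodeDecodeError:
    print("byte 4 is in the middle of a character")
```

Taking the first four bytes does not yield the first four characters, so any fixed-size, byte-indexed layout has to budget for the worst case of 4 bytes per character or work on character boundaries as above.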
