Yes, it is a complicated subject. My understanding is that a code unit is a particular sequence of bits. In ASCII a code unit is 7 bits, in UTF-8 a code unit is 8 bits and in UTF-16 a code unit is 16 bits. A codepoint may consist of one or more code units. In ASCII a code unit and a codepoint are the same. In UTF-8 a codepoint can be one, two or three code units.
... or even four codeunits.
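For anyone who wants to check the counts, here is a minimal Python sketch. The helper name code_units is my own, not part of any library, and the sample characters are written as escapes so that copy and paste cannot change them.

```python
def code_units(text: str, encoding: str) -> int:
    """Count the code units needed to encode text (hypothetical helper)."""
    # Size of one code unit in bytes; the "-le" variants avoid a byte order mark.
    unit_size = {"utf-8": 1, "utf-16-le": 2, "utf-32-le": 4}[encoding]
    return len(text.encode(encoding)) // unit_size

samples = ["\u0041", "\u00d3", "\u20ac", "\U0001F600"]  # A, Ó, €, and an emoji
for ch in samples:
    print(f"U+{ord(ch):04X}", code_units(ch, "utf-8"), code_units(ch, "utf-16-le"))
# U+0041 1 1
# U+00D3 2 1
# U+20AC 3 1
# U+1F600 4 2
```

So a single codepoint takes one to four codeunits in UTF-8 and one or two in UTF-16.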
Yes, except that the terms "codeunit" and "codepoint" are used only with Unicode, AFAIK. They were not needed with ASCII.
When speaking of characters, most people refer to the glyphs that we see, like 'A' or '€', each of which would be identical to a codepoint.
Those 2 happen to have only one codepoint, yes.
In the days of ASCII a code unit, codepoint and character referred to the same 7 bits, or simply a byte.
The words "codeunit" and "codepoint" were not used then. A "character" meant a byte. Yes, everything was simple in the old days.
Today a character or codepoint may occupy one or more bytes (code units) depending on the encoding. It would have been simpler if the world could agree on a single encoding covering all glyphs we use for communication.
Clearly you still don't understand the concepts of Unicode.
A codepoint occupies one or more codeunits depending on encoding, yes.
A "character" occupies one or more codepoints. I put the word "character" in quotes because its meaning is so ambiguous.
See these 2 characters: ÓÓ
They look the same, don't they? However:
Ó has 2 codepoints and 3 codeunits.
Ó has 1 codepoint and 2 codeunits.
Amazing, ha!
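A small sketch of why the two look identical, assuming the first is "O" followed by a combining acute accent (U+0301) and the second is the precomposed U+00D3, which is what the counts above imply. It uses only the standard unicodedata module.

```python
import unicodedata

decomposed = "O\u0301"  # LATIN CAPITAL LETTER O + COMBINING ACUTE ACCENT
composed = "\u00d3"     # LATIN CAPITAL LETTER O WITH ACUTE

# The strings render the same but do not compare equal as-is.
print(decomposed == composed)                                # False

# Normalization converts between the two forms.
print(unicodedata.normalize("NFC", decomposed) == composed)  # True
print(unicodedata.normalize("NFD", composed) == decomposed)  # True

# Codepoints, then UTF-8 codeunits.
print(len(decomposed), len(decomposed.encode("utf-8")))      # 2 3
print(len(composed), len(composed.encode("utf-8")))          # 1 2
```

If you need to compare such strings, normalizing both to the same form first is the usual approach.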
I think you mistake codepoints for code units. This example shows one codepoint but three code units.
No mistake. That example had one codepoint, but another example could have more. Note also that the choice of encoding makes no difference to the codepoint count: an encoding only determines how each codepoint is turned into codeunits.
More examples (codepoints are encoded with UTF-8):
ch=Ở has 2 codepoints and 4 codeunits.
ch=Ở has 1 codepoint and 3 codeunits.
ch=Ć̲ has 3 codepoints and 5 codeunits.
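The counts above can be reproduced with a short Python sketch. The function name describe is hypothetical, and the escape sequences are my assumption about how the three examples are composed; they are the only sequences I know of that match the stated counts.

```python
import unicodedata

def describe(text: str):
    """Return (codepoints, UTF-8 codeunits, codepoint names) for text."""
    names = [unicodedata.name(c) for c in text]
    return len(text), len(text.encode("utf-8")), names

examples = [
    "\u01a0\u0309",   # O with horn + combining hook above
    "\u1ede",         # precomposed O with horn and hook above
    "C\u0301\u0332",  # C + combining acute accent + combining low line
]
for ch in examples:
    codepoints, codeunits, names = describe(ch)
    print(f"ch={ch} has {codepoints} codepoint(s) and {codeunits} codeunit(s): {names}")
```

Python's len() on a str counts codepoints, not what a reader would call characters, which is why the first and third examples report more than one.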
See also:
http://www.alanwood.net/unicode/combining_diacritical_marks.html