Recent

Author Topic: [solved] Best practise - foreign key versus own field  (Read 8238 times)

rvk

  • Hero Member
  • *****
  • Posts: 6726
Re: Best practise - foreign key versus own field
« Reply #60 on: February 02, 2024, 12:49:37 pm »
No more children, no more complexity.
And yes, there is more information about W.
But none, which needs joins.
If you don't need the name "Wheat" somewhere in your reports and you don't have other tables pointing to "W" field somehow, then it's an entity on its own and you can leave it as VARCHAR(2).

alpine

  • Hero Member
  • *****
  • Posts: 1387
Re: Best practise - foreign key versus own field
« Reply #61 on: February 02, 2024, 01:40:32 pm »
How this makes difference at the file storage level (CHAR vs VARCHAR)? Explain please.
AFAIK they differ in returned values to the client, CHAR being padded to its field length and VARCHAR being not. But on a storage level they must accommodate the biggest possible length , which is max number of chars x4 for UTF-8. Is it somehow related to the MVCC?
No, the VARCHAR doesn't need to reserve all the space. It just saves the record with the length of the string +2 characters for length.
It's in the link I gave.

Quote
VARCHAR is the basic string type for storing texts of variable length, up to a maximum of 32,765 bytes. The stored structure is equal to the actual size of the data plus 2 bytes where the length of the data is recorded.
https://firebirdsql.org/file/documentation/chunk/en/refdocs/fblangref30/fblangref30-datatypes-chartypes.html
I have read this. Also I've read https://www.firebirdsql.org/file/documentation/html/en/firebirddocs/firebirdinternals/firebird-internals.html

While it is not very exhaustive, I can imagine that for each new version of the record, VARCHARs were written with their actual length, i.e. the row is 'packed' and it is only with the significant data with it. But why this doesn't apply for the CHARs too? The tail padding can be skipped also, thus way making them again 'packed'. But:
Quote
Keep in mind that one UTF8 character occupies up to 4 bytes, thus limiting the size of CHAR fields to 8,191 characters (32,767/4).
That is in para 3.5.1 of your link.

"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

rvk

  • Hero Member
  • *****
  • Posts: 6726
Re: Best practise - foreign key versus own field
« Reply #62 on: February 02, 2024, 02:01:26 pm »
Quote
Keep in mind that one UTF8 character occupies up to 4 bytes, thus limiting the size of CHAR fields to 8,191 characters (32,767/4).
That is in para 3.5.1 of your link.
Yes. That means you can't define a CHAR(8192) or larger when your database character set is UTF-8.
( if you try to create something larger you get a "implementation limit exceeded" )
Note... if your character set is NONE or some ISO then you can go up to 32767 and although you can store UTF-8 in NONE fields, you'll get an error if the size exceeds the 8191.

I have read this. Also I've read https://www.firebirdsql.org/file/documentation/html/en/firebirddocs/firebirdinternals/firebird-internals.html

While it is not very exhaustive, I can imagine that for each new version of the record, VARCHARs were written with their actual length, i.e. the row is 'packed' and it is only with the significant data with it.
A nice snippet is here for showing the size of all fields:
(source https://stackoverflow.com/a/69415593/1037511)

Code: SQL  [Select][+][-]
  1. SELECT T.RDB$RELATION_NAME, RF.RDB$FIELD_NAME, RF.RDB$NULL_FLAG, F.RDB$FIELD_TYPE,
  2.     F.RDB$FIELD_LENGTH,
  3.     F.RDB$CHARACTER_LENGTH,
  4.     F.RDB$FIELD_LENGTH / RCS.RDB$BYTES_PER_CHARACTER AS calculated
  5. FROM RDB$RELATIONS T
  6. INNER JOIN RDB$RELATION_FIELDS RF
  7.   ON RF.RDB$RELATION_NAME = T.RDB$RELATION_NAME
  8. INNER JOIN RDB$FIELDS F
  9.   ON F.RDB$FIELD_NAME = RF.RDB$FIELD_SOURCE
  10. LEFT JOIN RDB$CHARACTER_SETS RCS
  11.   ON RCS.RDB$CHARACTER_SET_ID = F.RDB$CHARACTER_SET_ID
  12. WHERE T.RDB$VIEW_BLR IS NULL
  13. AND (T.RDB$SYSTEM_FLAG IS NULL OR T.RDB$SYSTEM_FLAG = 0)  
  14. AND (RF.RDB$SYSTEM_FLAG IS NULL OR RF.RDB$SYSTEM_FLAG = 0)  
  15. ORDER BY T.RDB$RELATION_NAME, RF.RDB$FIELD_NAME

But why this doesn't apply for the CHARs too?
Well... that's just how CHAR works. Same as ansistring and string[10]. Why are they different?
They just are ;)

Besides... for short strings like 1 or 2 characters, CHAR(1) is more efficient. They still have their purpose.
BTW. Don't forget Firebird didn't have a boolean type until recently, so you could use CHAR(1) DEFAULT 'T' CHECK (VALUE IN ('F','T')) NOT NULL for that.

I still have those as domain in my databases:
Code: SQL  [Select][+][-]
  1. CREATE DOMAIN D_BOOLEAN_T  AS CHAR(1) DEFAULT 'T' CHECK (VALUE IN ('F','T')) NOT NULL^
  2. CREATE DOMAIN D_BOOLEAN_F  AS CHAR(1) DEFAULT 'F' CHECK (VALUE IN ('F','T')) NOT NULL^

Quote
In practical terms consider just this one rule: only use CHARs if strings of few characters are to be stored; the exception to the rule being when working with intermediate tables that are required to export data to fixed length prn files. Then the fixed length field will be a positive advantage.
https://www.ibexpert.net/ibe/pmwiki.php?n=Doc.CHAR

Nicole

  • Hero Member
  • *****
  • Posts: 1101
Re: [solved] Best practise - foreign key versus own field
« Reply #63 on: February 03, 2024, 05:51:38 pm »
Who is interested about the outcome in my case:

I checked all links you posted carefully and my result was, that my VarChar(2) shall take 6 bytes each.
Those I can substitute by a foreign key of SmallInt, which is 2 bytes.

Firebase got the job to count my tuples and one of my huuuge tables has about 250.000.
This makes a possible saving a little bit less than 1 MB a table.
The database is about 70 MB.
If I have 2 very huge tables, this means, that a lot of work and a lot of joins-writing later - would reduce my DB size - by about 2 percent.
I decided to keep 2 MB more and have 2 stand alone tables without joins and without cascade-needs.

Thank you for all comments of which I hope you enjoyed the exchange with me and each other as much as me.

 

TinyPortal © 2005-2018