Recent

Author Topic: [solved] Best practise - foreign key versus own field  (Read 9357 times)

rvk

  • Hero Member
  • *****
  • Posts: 6922
Re: Best practise - foreign key versus own field
« Reply #45 on: January 31, 2024, 11:15:05 am »
Then use TEXT DataType in SQLite, and store everything in ISO-Format as a String and be done with it. SQLite understands datetime-strings natively.
You don't even need to use TEXT. You can just use DATE.

When you use FLOAT or REAL or whatever you can even still just push in TEXT into that field.
Quote
SQLite is "typeless". This means that you can store any kind of data you want in any column of any table, regardless of the declared datatype of that column.
https://www.sqlite.org/datatypes.html

That's what makes SQLite a real odd duck  :D

Joanna

  • Hero Member
  • *****
  • Posts: 1414
Re: Best practise - foreign key versus own field
« Reply #46 on: January 31, 2024, 11:55:17 am »
You are right, tdatetime is a double and that’s how it’s stored. It’s true that viewing the database outside the program is rather confusing...

I don’t have any plans to convert it to strings however, I like keeping it as a tdatetime and using tdatetime functions. I format the dates unto something readable before displaying them to users of course..

To be honest I don’t know very much about sql, just enough to get by. I get too distracted with the pascal related stuff..
I discovered a few years back that a program that worked great with a small test database choked on a real database. So I had to start all over again..

Zvoni

  • Hero Member
  • *****
  • Posts: 3227
Re: Best practise - foreign key versus own field
« Reply #47 on: January 31, 2024, 11:55:57 am »
You don't even need to use TEXT. You can just use DATE.
There is no DATE-type in SQLite. Only storage-classes for Date/Times (TEXT, REAL and Integer), and its Affinities SQLite TRANSLATES to a Storage-Class during CREATION of a Table.
And i've seen enough threads where people complained getting Garbage, just because they think "DateTime" is a valid Datatype in SQLite.
You should read about how SQLite PARSES the CREATE TABLE-Statement, and then assigns Storage-Classes to the Datatypes found.

Even a Typo like "  MyDateColumn DATETME" would get a Storage-class because the Parser found "DATE" in it

Quote
When you use FLOAT or REAL or whatever you can even still just push in TEXT into that field.
Quote
SQLite is "typeless". This means that you can store any kind of data you want in any column of any table, regardless of the declared datatype of that column.
https://www.sqlite.org/datatypes.html

That's what makes SQLite a real odd duck  :D
I know that.
Still think it's better practice to use the "native" DataTypes/Storage-Classes, than whatever SQLite thinks is appropriate.
It's the Users/Developers Obligation to make sure, that any Data gets stored in the correct column

In that Sense, SQLite is very comparable to an Excel-Spreadsheet:
I can define the Format for a Column to be numeric, but noone prevents me to enter any text there
« Last Edit: January 31, 2024, 12:00:16 pm by Zvoni »
One System to rule them all, One Code to find them,
One IDE to bring them all, and to the Framework bind them,
in the Land of Redmond, where the Windows lie
---------------------------------------------------------------------
Code is like a joke: If you have to explain it, it's bad

Nicole

  • Hero Member
  • *****
  • Posts: 1303
Re: Best practise - foreign key versus own field
« Reply #48 on: January 31, 2024, 09:05:18 pm »
No, the DB should not be that huge.
I was really surprised as I saw this.

................... wait. I checked something:
This codepage thing: It was 4096 and now it is 8192 ?!
I never was asked for this and trusted to be 2048 or so as default.

About this codepage thing.
My ideas are very vage about it.
What is "good"?
IF the database is quicker by this, the huger sizes would be ok to me.
But to be honest, I am not sure, what codepages is good for.
Is this something like those sectors in the FAT and NTFIS?
If yes, how to chose the best size?

And how can I change it?
Must I rewrite the whole database again or can I do it by backup and play back?

rvk

  • Hero Member
  • *****
  • Posts: 6922
Re: Best practise - foreign key versus own field
« Reply #49 on: January 31, 2024, 09:14:02 pm »
Is this something like those sectors in the FAT and NTFIS?
Yes, it's something like that.
https://www.ibexpert.net/ibe/pmwiki.php?n=Doc.PageSize
https://ibdeveloper.blogspot.com/2008/09/what-page-size-i-should-use-in-my.html?m=1
Also see point 15 here
https://ib-aid.com/en/articles/23-more-ways-to-speed-up-firebird/

Default increased to 8192 in FB 3.

If yes, how to chose the best size?
It depends on how many space is wasted and how large the DB get (larger DBs are better with larger pagesizes).

And how can I  it?
Must I rewrite the whole database again or can I do it by backup and play back?
You can change it by backing up the database and restoring it with a specified pagesize.

But if you are certain that is the cause of the large FB (you can check with gstat or backup and look at the backup file) and the larger DB is not a problem, you can keep it at that size.

alpine

  • Hero Member
  • *****
  • Posts: 1412
Re: Best practise - foreign key versus own field
« Reply #50 on: January 31, 2024, 09:39:15 pm »
Wasn't it the new encoding UTF-8 that makes the difference?
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

rvk

  • Hero Member
  • *****
  • Posts: 6922
Re: Best practise - foreign key versus own field
« Reply #51 on: January 31, 2024, 09:56:58 pm »
Wasn't it the new encoding UTF-8 that makes the difference?
UTF-8 shouldn't make any difference. Most normal characters take just 1 byte (just as the ISO codings). UTF-16 on the other hand would have made a difference (minimum 2 bytes per character).

alpine

  • Hero Member
  • *****
  • Posts: 1412
Re: Best practise - foreign key versus own field
« Reply #52 on: February 01, 2024, 05:47:51 am »
UTF-8 shouldn't make any difference. Most normal characters take just 1 byte (just as the ISO codings).
Yours "normal". But how many bytes Firebird allocates for a CHAR fields? I'm not a Firebird expert, but I suspect char fields will be quadrupled in allocation size.
« Last Edit: February 01, 2024, 07:34:15 am by alpine »
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

rvk

  • Hero Member
  • *****
  • Posts: 6922
Re: Best practise - foreign key versus own field
« Reply #53 on: February 01, 2024, 08:28:44 am »
UTF-8 shouldn't make any difference. Most normal characters take just 1 byte (just as the ISO codings).
Yours "normal". But how many bytes Firebird allocates for a CHAR fields? I'm not a Firebird expert, but I suspect char fields will be quadrupled in allocation size.
For the CHAR fields you are correct. But I never use CHAR for strings. I use VARCHAR. It has a length part (2 bytes) and is only as long as the string in bytes (+2).

But it might be another thing with indexes on strings. Those can quadruple.
That on combination with that increased page will definitely make an impact.
So you shouldn't throw on indexes where they are not needed.
And prefer to use the generated ID above a string for foreign and primary key  ;)

https://firebirdsql.org/file/documentation/chunk/en/refdocs/fblangref30/fblangref30-datatypes-chartypes.html
« Last Edit: February 01, 2024, 08:31:06 am by rvk »

alpine

  • Hero Member
  • *****
  • Posts: 1412
Re: Best practise - foreign key versus own field
« Reply #54 on: February 02, 2024, 09:54:41 am »
For the CHAR fields you are correct. But I never use CHAR for strings. I use VARCHAR. It has a length part (2 bytes) and is only as long as the string in bytes (+2).

https://firebirdsql.org/file/documentation/chunk/en/refdocs/fblangref30/fblangref30-datatypes-chartypes.html
How this makes difference at the file storage level (CHAR vs VARCHAR)? Explain please.
AFAIK they differ in returned values to the client, CHAR being padded to its field length and VARCHAR being not. But on a storage level they must accommodate the biggest possible length , which is max number of chars x4 for UTF-8. Is it somehow related to the MVCC?
"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

Nicole

  • Hero Member
  • *****
  • Posts: 1303
Re: Best practise - foreign key versus own field
« Reply #55 on: February 02, 2024, 10:25:28 am »
Yes, I increased the number of digits of my character in some of my fields, as VarChar, not as Char.
The main idea was, that rare cases of longer entries should work as well.

Most of my fields are defined as VarChar(2).

This UTF8 - I thought, this 8 stands for 8 bits = 1 byte?
I thought, my VarChar(2) would take 2 bytes.
Is this correct?



rvk

  • Hero Member
  • *****
  • Posts: 6922
Re: Best practise - foreign key versus own field
« Reply #56 on: February 02, 2024, 10:40:10 am »
How this makes difference at the file storage level (CHAR vs VARCHAR)? Explain please.
AFAIK they differ in returned values to the client, CHAR being padded to its field length and VARCHAR being not. But on a storage level they must accommodate the biggest possible length , which is max number of chars x4 for UTF-8. Is it somehow related to the MVCC?
No, the VARCHAR doesn't need to reserve all the space. It just saves the record with the length of the string +2 characters for length.
It's in the link I gave.

Quote
VARCHAR is the basic string type for storing texts of variable length, up to a maximum of 32,765 bytes. The stored structure is equal to the actual size of the data plus 2 bytes where the length of the data is recorded.
https://firebirdsql.org/file/documentation/chunk/en/refdocs/fblangref30/fblangref30-datatypes-chartypes.html

Yes, I increased the number of digits of my character in some of my fields, as VarChar, not as Char.
The main idea was, that rare cases of longer entries should work as well.

Most of my fields are defined as VarChar(2).
I'm not sure if that's very efficient. I would define it as CHAR(2).
VARCHAR(2) is 2 bytes for length and 8 bytes for the characters UTF-8.
While CHAR(2) is just only the 8 bytes.
(BTW. I wouldn't even store that, what was it? region digit?, in a CHAR or VARCHAR, I would make a separate table for it. If it's a region, branche or something, you may want to add something else specific about that entity. And that's where the separate table comes in. Just linked with a BIGINT.

For CHAR(x) you do need to take into account that the string is padded with spaces to the length of 2.

This UTF8 - I thought, this 8 stands for 8 bits = 1 byte?
I thought, my VarChar(2) would take 2 bytes. Is this correct?
Yes and No. UTF-8 does stand for 8 bits encoding but characters can be encoded to multiple bytes.
So while a to z just take 1 byte, other characters (for example Chinese or some accent characters) are encoded in multiple characters.
See an example here: https://en.wikipedia.org/wiki/UTF-8
So £ is saved as $C2 + $A3

Quote
code points in the ASCII range (0-127) are represented by a single byte
code points in the range (128-2047) are represented by two bytes
code points in the range (2048-65535) are represented by three bytes
and code points in the range (65536-1114111) are represented by four bytes.
(This may seem like a lot of possible characters, but keep in mind that in Chinese alone, there are 100,000s of characters.)
https://www.freecodecamp.org/news/what-is-utf-8-character-encoding

So UTF-8 will be maximum 4 bytes long and Firebird needs to reserve those bytes for a (per) CHAR.

And just to make it more confusing... it's possible to reach even more characters by combining the code points.
Quote
A "character" can take more than 4 bytes because it is made of more than one code point. For instance a national flag character takes 8 bytes since it is "constructed from a pair of Unicode scalar values" both from outside the BMP.
So if you save 1 national flag character, both code points of your CHAR(2) will be occupied and you can't add anything more (you will get a string too long error when saving).
« Last Edit: February 02, 2024, 10:42:32 am by rvk »

Nicole

  • Hero Member
  • *****
  • Posts: 1303
Re: Best practise - foreign key versus own field
« Reply #57 on: February 02, 2024, 11:49:18 am »
I have only a handful of different VarChar(2).

My problem was, that there may be sometimes only one digit if using Char(2)
Then there was e.g. " W" oder "W " instead of "W".
This may seem easy to you and you may think "just use trim()".
However it was a huge problem and not that easy. You do not want to know details.

I thought to solve this by VarChar. Then there will not be one blank inside troubling me.

An alternative would be SmallInt als fk:
Then I give my VarChars into their own table and set a foreign key to their primary key.
And yes, each VarChar has some more fields.
The advantage of saving them in an own table was more tables were "stand alone".
But it seemed, that I erred in really many points about the storage (thank you so much for taking time to explain)

My structure is like this:
sum of these varchars: about 30 or 40 all in all
digits: 1 or 2
when they are needed as field  / possible fk: hundred thousands of times.

Would you recommend, that I put them into a small table and give it a smallInt as primary key?
Then I can change those 100.000s of fields to a foreign key.

Will it be the same, if I do not give those small table a 2 digit-ASCII as primary key?
Or will this increase the storage of the mass of foreign keys?
The advantage of the ASCII would be, that I know on looking at them, what they mean without checking a figure for it.

rvk

  • Hero Member
  • *****
  • Posts: 6922
Re: Best practise - foreign key versus own field
« Reply #58 on: February 02, 2024, 12:12:37 pm »
Then there was e.g. " W" oder "W " instead of "W".
What does W stand for ???

Would you recommend, that I put them into a small table and give it a smallInt as primary key?
Then I can change those 100.000s of fields to a foreign key.
It depends on what "W" exactly is. If it's something like F for false and T for true then yes, by all means leave them there.

If it means W for World, D for Germany, NL for Netherlands, then putting them in their own table could be advantagues in the long run because you can add more description or other fields to the table with W, D and NL (which you can't if they are in that other table). And if "W" is also a foreign key where you ALSO store the "W" then it's better to use ID. Because if you don't you would need to work with ON UPDATE CASCADE for when the code changes. That's why it's sometimes better to use ID as PK. But we still haven't seen much of your DB (DDL) design so it's hard to tell.

The advantage of the ASCII would be, that I know on looking at them, what they mean without checking a figure for it.
That's where the INNER JOINs come in (note: inner not left join).
If you know the "W" then you would always do
Code: SQL  [Select][+][-]
  1. SELECT A.* FROM TABLE1 A
  2. INNER JOIN TABLE2 A ON A.ID=B.ID_TB2
  3. WHERE B.NAME = 'W'
or even
Code: SQL  [Select][+][-]
  1. SELECT A.* FROM TABLE1 A
  2. INNER JOIN TABLE2 A ON A.ID=B.ID_TB2 AND B.NAME = 'W'

But again... it depends on the rest of your DDL (which we know nothing about).

Nicole

  • Hero Member
  • *****
  • Posts: 1303
Re: Best practise - foreign key versus own field
« Reply #59 on: February 02, 2024, 12:42:57 pm »
LOL, your fantasy of my work is far too huge. There is not much special about it.
You know it all in the meanwhile.

W stands for Wheat.

And there is really not much more mystery in it.
The Wheat price is - I think so - taken from the harbor in Minneapolis.
And contains the price back for a really long period of time.

The table says a little bit more than:
the new pk   ***    'W'   ***   date as TDate   ***   price a bushel as TCurrency

No more children, no more complexity.
And yes, there is more information about W.
But none, which needs joins.


The table which holds the wheat looks e.g.  like this

pk as TSmallInt   ***    "W"  ***   info1  ***   figure 1   ***   aTime1 as TTime *.... and not too much more

« Last Edit: February 02, 2024, 12:47:17 pm by Nicole »

 

TinyPortal © 2005-2018