Topic: Lazarus package to humanize numbers, time and more

Gustavo 'Gus' Carreno

  • Hero Member
  • *****
  • Posts: 1337
  • Professional amateur ;-P
Lazarus package to humanize numbers, time and more
« on: June 27, 2025, 11:52:46 am »
Hey Y'All,

While I was waiting for help debugging our favourite HTTP Client, I started putzing around with Go, just to make sure that the download drop issues are not present in other languages.
While doing so, I found this: Go Humanize.
Of course I had to spend some time to see if I could bring that to the Object Pascal community!!! :D

And here it is: Free Pascal Humanize.
It's very light on tests, but it has a GUI demo app that is provided in the download section of the latest release on GitHub. This demo is quite comprehensive in terms of the possibilities. Now to put all that in tests for full coverage  :-[

It's more of a play thing than anything else, so add a pinch of salt ;)
If I'm completely honest, I should give some thanks to the Go author, cuz this is blatant plagiarism :D

At the moment we can humanize/format:
  • Bytes: Both base 2 (B, KiB...) and base 10 (B, KB...), with different precision
  • Comma:
    • Thousands separator for integers and reals (with different precision)
    • Turn an array of strings into a comma-separated string, with sorting
  • CommaAnd: Turn an array of strings into a comma-separated string, with sorting, with an "and" between the last 2 items
  • Ordinals: 1st, 2nd, 3rd, Nth
  • Time: 1 second ago, 1 second from now, etc
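
To give a feel for what two of the items in this list involve, here is a minimal sketch of byte and ordinal formatting. It is not the fp-humanize API: `SketchBytes` and `SketchOrdinal` are hypothetical names made up for illustration, and the ordinal rule is English-only.

```pascal
program HumanizeSketch;

{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // Unit suffixes for base 2 and base 10 sizes.
  cUnitsBase2: array[0..4] of String = ('B', 'KiB', 'MiB', 'GiB', 'TiB');
  cUnitsBase10: array[0..4] of String = ('B', 'KB', 'MB', 'GB', 'TB');

// Hypothetical helper: scales ABytes by 1024 (base 2) or 1000 (base 10).
function SketchBytes(ABytes: QWord; ABase2: Boolean; APrecision: Integer): String;
var
  Value, Step: Double;
  Index: Integer;
begin
  if ABase2 then
    Step := 1024
  else
    Step := 1000;
  Value := ABytes;
  Index := 0;
  // Divide until the value fits the current unit or the units run out.
  while (Value >= Step) and (Index < High(cUnitsBase2)) do
  begin
    Value := Value / Step;
    Inc(Index);
  end;
  // FloatToStrF uses the decimal separator from DefaultFormatSettings.
  Result := FloatToStrF(Value, ffFixed, 15, APrecision) + ' ';
  if ABase2 then
    Result := Result + cUnitsBase2[Index]
  else
    Result := Result + cUnitsBase10[Index];
end;

// Hypothetical helper: English-only ordinal suffixes.
function SketchOrdinal(AValue: Integer): String;
var
  Suffix: String;
begin
  Suffix := 'th';
  // 11, 12 and 13 keep "th"; otherwise the last digit decides.
  if not ((Abs(AValue) mod 100) in [11..13]) then
    case Abs(AValue) mod 10 of
      1: Suffix := 'st';
      2: Suffix := 'nd';
      3: Suffix := 'rd';
    end;
  Result := IntToStr(AValue) + Suffix;
end;

begin
  WriteLn(SketchBytes(1536, True, 2));
  WriteLn(SketchBytes(1500, False, 2));
  WriteLn(SketchOrdinal(1), ' ', SketchOrdinal(12), ' ', SketchOrdinal(23));
end.
```

The decimal separator in the byte output depends on the active format settings, which is why no fixed output is shown here.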

Enjoy!!

Cheers,
Gus

CM630

  • Hero Member
  • *****
  • Posts: 1579
  • I am not sure that I understand you.
    • http://sourceforge.net/u/cm630/profile/
Re: Lazarus package to humanize numbers, time and more
« Reply #1 on: June 27, 2025, 01:01:53 pm »
Thanks for sharing, I took a glimpse.

1. Maybe you should take a look at this thread: https://forum.lazarus.freepascal.org/index.php?topic=69206.0
2. I seriously doubt that 1st, 2nd, 3rd, Nth is applicable for many languages except English.
3. These should be in the resourcestring section:
Code: Pascal
  cUnits: array of String = ('B', 'KiB', 'MiB', 'GiB', 'TiB');
  cUnitsBase10: array of String = ('B', 'KB', 'MB', 'GB', 'TB');
4. I do not see a way to set the decimal and thousands delimiters. I guess they are taken from the OS settings, which does not seem sufficient from the end user's point of view. Also, it won't hurt to mention this: https://forum.lazarus.freepascal.org/index.php/topic,71429.msg557360.html
Lazarus 4,4 32 bit (sometimes 64 bit); FPC3,2,2

Gustavo 'Gus' Carreno

Re: Lazarus package to humanize numbers, time and more
« Reply #2 on: June 28, 2025, 02:45:24 pm »
Hey CM630,

Quote from: CM630
Thanks for sharing, I took a glimpse.

More than welcome !! :D

Quote from: CM630
1. Maybe you should take a look at this thread: https://forum.lazarus.freepascal.org/index.php?topic=69206.0

I'm using FormatFloat for the numbers. This will pick up the correct formatting settings once I'm able to suss out how to trigger the translation on the package.

Quote from: CM630
2. I seriously doubt that 1st, 2nd, 3rd, Nth is applicable for many languages except English.

Yeah, this is a head scratcher. I'm Portuguese and ordinals are gender-sensitive... So yeah, I'll need to think about it.
I have some resource strings for this. Not a good solution, but it's a start.

Quote from: CM630
3. These should be in the resourcestring section:
Code: Pascal
  cUnits: array of String = ('B', 'KiB', 'MiB', 'GiB', 'TiB');
  cUnitsBase10: array of String = ('B', 'KB', 'MB', 'GB', 'TB');

I've added resource strings for those. You'll see them when I release v0.0.6.
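
For readers who have not used resource strings, a minimal sketch of how the unit suffixes could be declared that way follows. The identifiers (`rsUnitKiB`, `Base2Units`, and so on) are hypothetical and not taken from fp-humanize. The assumption here is that translations are applied at run time, so the array is built in a function rather than declared as a constant.

```pascal
program UnitsResourceStrings;

{$mode objfpc}{$H+}

resourcestring
  // Hypothetical identifiers; the names used in fp-humanize may differ.
  rsUnitByte = 'B';
  rsUnitKiB = 'KiB';
  rsUnitMiB = 'MiB';
  rsUnitGiB = 'GiB';
  rsUnitTiB = 'TiB';

type
  TUnitNames = array of String;

// Builds the list on each call so that it reflects whatever translation
// is active at that moment.
function Base2Units: TUnitNames;
begin
  SetLength(Result, 5);
  Result[0] := rsUnitByte;
  Result[1] := rsUnitKiB;
  Result[2] := rsUnitMiB;
  Result[3] := rsUnitGiB;
  Result[4] := rsUnitTiB;
end;

var
  UnitName: String;
begin
  for UnitName in Base2Units do
    WriteLn(UnitName);
end.
```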

Quote from: CM630
4. I do not see a way to set the decimal and thousands delimiters. I guess they are taken from the OS settings, which does not seem sufficient from the end user's point of view. Also, it won't hurt to mention this: https://forum.lazarus.freepascal.org/index.php/topic,71429.msg557360.html

The main problem is that the package is not picking up the language from the OS.
I need to set the language on the package. I'm already doing so on the demo (when v0.0.6 comes out) and the demo is translated into Portuguese (the only language I've added). When I find out how to translate the package, this will most definitely change!!

Cheers,
Gus

CM630

Re: Lazarus package to humanize numbers, time and more
« Reply #3 on: June 28, 2025, 11:43:19 pm »
Quote from: Gustavo 'Gus' Carreno
...The main problem is that the package is not picking up the language from the OS.
I need to set the language on the package. I'm already doing so on the demo (when v0.0.6 comes out) and the demo is translated into Portuguese (the only language I've added). When I find out how to translate the package, this will most definitely change!!
...

Maybe you misunderstood me. The decimal and thousands separators should be settable somehow. Taking them from the OS might not always be okay. For example, in some languages the thousands separator might be a punctuation sign (comma, apostrophe, etc.), but that is not acceptable under the SI system.

Gustavo 'Gus' Carreno

Re: Lazarus package to humanize numbers, time and more
« Reply #4 on: June 29, 2025, 12:29:59 am »
Hey CM630,

Quote from: CM630
Maybe you misunderstood me. The decimal and thousands separators should be settable somehow. Taking them from the OS might not always be okay. For example, in some languages the thousands separator might be a punctuation sign (comma, apostrophe, etc.), but that is not acceptable under the SI system.

From my experience, if your system is well configured, any Free Pascal application, be it CLI or GUI, will pick up the correct format settings.

What you have to understand is that Localisation via .po files will NOT change the formatting settings!!!
The .po file system and the formatting settings are 2 very separate things !!!

You may not agree with what was chosen by the people who set up the format settings for each OS.
But those are the official choices.

Now, when it comes to fp-humanize, it will pick up the OS formatting settings for the language that is installed.
So, if it's on the English spectrum, we have periods for decimals and commas for the thousands separator.
If it's Portuguese, or any other Latin-based language, and possibly most European languages, then you have a comma for decimals and periods for the thousands separator.

What you may not agree on is the way that dates are formatted. My system is en_GB and I HATE the default short date format!!!
But that's something I can only bicker about... It's been decided by someone else and it's been approved.

Hope I'm making sense here...

Cheers,
Gus

CM630

Re: Lazarus package to humanize numbers, time and more
« Reply #5 on: June 30, 2025, 08:44:57 am »
I agree that the default delimiters should be taken from the regional options, but the user should also be able to override them.

Quote from: Gustavo 'Gus' Carreno
...
If it's Portuguese, or any other Latin-based language, and possibly most European languages, then you have a comma for decimals and periods for the thousands separator.
...

If you mean period = dot, then this is not acceptable under SI. What if you need to export the data to an official report/protocol? Also, a comma on the line is perfectly fine in a regular English text, but it might not be okay in some other document. SI allows both comma and dot as decimal separators, as long as they are not mixed in a single document. IEC and ISO standards use only the comma on the line.
Also NBSP is a better thousands separator than the simple space, but the regionalisation might provide a simple space. I hope I did not bother you with redundant info.

Gustavo 'Gus' Carreno

Re: Lazarus package to humanize numbers, time and more
« Reply #6 on: June 30, 2025, 09:27:15 am »
Hey CM630,

Quote from: CM630
I agree that the default delimiters should be taken from the regional options, but the user should also be able to override them.

They can be overridden. The programmer only needs to change the appropriate fields of DefaultFormatSettings, a global variable of the record type TFormatSettings declared in SysUtils. If the programmer so chooses, they can provide an interface that lets the user change those values.
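
A minimal sketch of the two usual ways to do that with the RTL: changing the global `DefaultFormatSettings` record, or passing a modified copy to the `FormatFloat` overload that accepts a `TFormatSettings`. The format mask and the separator choices are only examples.

```pascal
program OverrideSeparators;

{$mode objfpc}{$H+}

uses
  SysUtils;

var
  Custom: TFormatSettings;
begin
  // Option 1: change the global record. This affects every later call
  // that does not pass its own settings.
  DefaultFormatSettings.DecimalSeparator := ',';
  DefaultFormatSettings.ThousandSeparator := ' ';
  WriteLn(FormatFloat('#,##0.00', 1234567.891));

  // Option 2: work on a copy and pass it explicitly, leaving the
  // global settings untouched.
  Custom := DefaultFormatSettings;
  Custom.DecimalSeparator := '.';
  Custom.ThousandSeparator := '''';
  WriteLn(FormatFloat('#,##0.00', 1234567.891, Custom));
end.
```

Option 2 is usually the safer choice in a library, since it does not change behaviour for the rest of the application.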

Quote from: CM630
If you mean period = dot, then this is not acceptable under SI. What if you need to export the data to an official report/protocol? Also, a comma on the line is perfectly fine in a regular English text, but it might not be okay in some other document. SI allows both comma and dot as decimal separators, as long as they are not mixed in a single document. IEC and ISO standards use only the comma on the line.
Also NBSP is a better thousands separator than the simple space, but the regionalisation might provide a simple space.

This is good to know. I thank you for it!!
I'll keep this in mind when I do the remaining SI functions that I haven't ported from the Go Humanize package.

Quote from: CM630
I hope I did not bother you with redundant info.

Not at all. You have given me very good information about some stuff I wasn't aware of. And for that I thank you!!

Cheers,
Gus

wp

  • Hero Member
  • *****
  • Posts: 13329
Re: Lazarus package to humanize numbers, time and more
« Reply #7 on: June 30, 2025, 09:34:56 am »
Quote from: Gustavo 'Gus' Carreno
They can be overridden. The programmer only needs to change the appropriate values on DefaultFormatSettings. This is a global variable that is a record. The programmer only needs to alter the appropriate fields.

I think the real problem is that the *Separator fields of TFormatSettings are of type Char, but they should be String to allow for UTF-8, for example the non-breaking space mentioned by CM630.
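
One possible workaround, sketched under the assumption that the output string is UTF-8: format with a single-byte placeholder as the thousands separator, then swap the placeholder for the two-byte NBSP sequence. `FormatWithNbsp` is a hypothetical helper, not part of the RTL or of fp-humanize.

```pascal
program NbspThousands;

{$mode objfpc}{$H+}

uses
  SysUtils;

const
  // UTF-8 encoding of U+00A0 NO-BREAK SPACE: two bytes, so it cannot be
  // stored in a single-byte Char field.
  cNbspUtf8 = #$C2#$A0;
  // Placeholder that does not occur in a formatted number.
  cPlaceholder = '_';

// Hypothetical helper: groups thousands with a UTF-8 NBSP.
function FormatWithNbsp(AValue: Double): String;
var
  Settings: TFormatSettings;
begin
  Settings := DefaultFormatSettings;
  Settings.ThousandSeparator := cPlaceholder;
  Settings.DecimalSeparator := ',';
  Result := FormatFloat('#,##0.00', AValue, Settings);
  // Replace every placeholder with the multi-byte separator.
  Result := StringReplace(Result, cPlaceholder, cNbspUtf8, [rfReplaceAll]);
end;

begin
  WriteLn(FormatWithNbsp(1234567.891));
end.
```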

Gustavo 'Gus' Carreno

Re: Lazarus package to humanize numbers, time and more
« Reply #8 on: June 30, 2025, 09:56:40 am »
Hey WP,

Quote from: wp
  Quote from: Gustavo 'Gus' Carreno
  They can be overridden. The programmer only needs to change the appropriate values on DefaultFormatSettings. This is a global variable that is a record. The programmer only needs to alter the appropriate fields.
I think the real problem is that the *Separator fields of TFormatSettings are of type Char, but they should be String to allow for UTF-8, for example the non-breaking space mentioned by CM630.

AH, Yes, I see now !! And yes, I fully agree !!
I'm guessing that in languages like Hebrew, Chinese and Japanese, just to name a few, DefaultFormatSettings doesn't work so well :(

Cheers,
Gus

CM630

Re: Lazarus package to humanize numbers, time and more
« Reply #9 on: June 30, 2025, 10:02:45 am »
I am referencing ISO 80000-1, items 7.1.4; 7.3.1 and 7.3.2. I am not quoting them, for I might infringe some copyrights.
Note that there are regional versions of the standards; even though they are harmonised, there might be some differences.

In short, writing it both ways is acceptable:
The voltage is 1 234,567 89 V or less.
The voltage is 1 234.567 89 V or less.

ISO and IEC standards use only the first way (ISO/IEC Directives, Part 2, 2004, Rules for the structure and drafting of International Standards).
But if one types this with normal spaces, these values might be wrapped automatically, so in my opinion, NBSP is safer.
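
The grouping shown in the two example sentences above (groups of three digits on both sides of the decimal separator) can be sketched as follows. `GroupDigits` and `FormatIso` are hypothetical helpers; they work on digit strings to sidestep floating-point rounding, and the group separator is a parameter so a plain space or a NBSP can be passed.

```pascal
program IsoGrouping;

{$mode objfpc}{$H+}

// Hypothetical helper: inserts ASep between groups of three digits.
// The integer part is grouped from the right, the fraction from the left.
function GroupDigits(const ADigits, ASep: String; AFromRight: Boolean): String;
var
  I, Offset: Integer;
begin
  Result := '';
  // Offset is the length of the leading short group when counting
  // from the right; it is zero when counting from the left.
  if AFromRight then
    Offset := Length(ADigits) mod 3
  else
    Offset := 0;
  for I := 1 to Length(ADigits) do
  begin
    if (I > 1) and ((I - 1 - Offset) mod 3 = 0) then
      Result := Result + ASep;
    Result := Result + ADigits[I];
  end;
end;

// Hypothetical helper: joins both grouped parts with the decimal separator.
function FormatIso(const AIntPart, AFracPart: String; ADecimalSep: Char;
  const AGroupSep: String): String;
begin
  Result := GroupDigits(AIntPart, AGroupSep, True);
  if AFracPart <> '' then
    Result := Result + ADecimalSep + GroupDigits(AFracPart, AGroupSep, False);
end;

begin
  // Prints: The voltage is 1 234,567 89 V or less.
  WriteLn('The voltage is ', FormatIso('1234', '56789', ',', ' '), ' V or less.');
  // Prints: The voltage is 1 234.567 89 V or less.
  WriteLn('The voltage is ', FormatIso('1234', '56789', '.', ' '), ' V or less.');
end.
```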
« Last Edit: June 30, 2025, 10:21:09 am by CM630 »

Gustavo 'Gus' Carreno

Re: Lazarus package to humanize numbers, time and more
« Reply #10 on: June 30, 2025, 10:19:23 am »
Hey CM630,

Quote from: CM630
I am referencing ISO 80000-1, items 7.1.4; 7.3.1 and 7.3.2. I am not quoting them, for I might infringe some copyrights.
Note that there are regional versions of the standards; even though they are harmonised, there might be some differences.

In short, writing it both ways is acceptable:
The voltage is 1 234,567 89 V or less.
The voltage is 1 234.567 89 V or less.

ISO and IEC standards use only the first way (ISO/IEC Directives, Part 2, 2004, Rules for the structure and drafting of International Standards).

Again, many thanks for these!!

Quote from: CM630
But if one types this with normal spaces, these values might be wrapped automatically, so in my opinion, NBSP is safer.

I think we can trick it by using #127? It is shown as a space, but it's not #32.

Cheers,
Gus

CM630

Re: Lazarus package to humanize numbers, time and more
« Reply #11 on: June 30, 2025, 10:21:58 am »
I checked CP1250 to CP1258; all of them have NBSP at #$A0. But recently, someone complained in the forum about a problem with using it as a char. I will try to find the thread (EDIT: here it is: https://forum.lazarus.freepascal.org/index.php/topic,71428.msg557355.htm)
#127 is DEL, not a space?
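
The character codes under discussion can be checked with a few lines. This is only a sketch; the constant names are made up for illustration.

```pascal
program SpaceCodes;

{$mode objfpc}{$H+}

const
  cSpace = #32;          // ordinary space
  cDel = #127;           // ASCII DEL, a control character
  cNbspAnsi = #$A0;      // NBSP in CP1250 to CP1258, a single byte
  cNbspUtf8 = #$C2#$A0;  // NBSP (U+00A0) encoded as UTF-8

begin
  WriteLn('Space code: ', Ord(cSpace));
  WriteLn('DEL code: ', Ord(cDel));
  WriteLn('NBSP code in CP125x: ', Ord(cNbspAnsi));
  WriteLn('NBSP length in UTF-8 bytes: ', Length(cNbspUtf8));
end.
```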
« Last Edit: June 30, 2025, 10:30:13 am by CM630 »

Gustavo 'Gus' Carreno

Re: Lazarus package to humanize numbers, time and more
« Reply #12 on: June 30, 2025, 12:18:46 pm »
Hey CM630,

Quote from: CM630
I checked CP1250 to CP1258; all of them have NBSP at #$A0. But recently, someone complained in the forum about a problem with using it as a char. I will try to find the thread (EDIT: here it is: https://forum.lazarus.freepascal.org/index.php/topic,71428.msg557355.htm)

Thanks for this!!
I see the problem, and I would need a ton of conditionals for non-Latin-based locales... I'll think about it...
Nonetheless, this gives me more info if I decide to opt for the NBSP route!!

Quote from: CM630
#127 is DEL, not a space?

Yes it is!! Good catch!! Totally mea culpa!!!

Cheers,
Gus

 
