
Author Topic: A (Debian) Quality Proposal  (Read 13856 times)

dbannon

  • Hero Member
  • *****
  • Posts: 2786
    • tomboy-ng, a rewrite of the classic Tomboy
A (Debian) Quality Proposal
« on: January 18, 2022, 10:23:29 am »
A very open ended question !

Debian, as part of its QA process, flags problems in source packages. The recently packaged FPC 3.2.2 has quite a few of them (so do many apps). Now, superficially you may see these as mostly quite theoretical or nit-picking but it is a quality question, so, do we do quality for its own sake ?

Or because we'd like to keep in Debian's good books or even to help the long suffering packaging team ?  And most of these things are a problem even if a very minor one.

What are they ?  https://lintian.debian.org/sources/fpc

Things like test files not being UTF8 (or ascii) seem to be about one third of the total -

Quote
A file is not valid UTF-8.
Debian has used UTF-8 for many years. Support for national encodings is being phased out. This file probably appears to users in mangled characters (also called mojibake).  Packaging control files must be encoded in valid UTF-8.
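As a minimal sketch of the kind of check that quote describes (this is not Lintian's actual implementation, and the function name and sample byte strings are made up for illustration), a file's bytes can be tested for UTF-8 validity simply by trying to decode them:

```python
from typing import Optional


def first_invalid_utf8_offset(data: bytes) -> Optional[int]:
    """Return the offset of the first byte that breaks UTF-8 decoding,
    or None when the whole buffer is valid UTF-8 (plain ASCII included)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# Hypothetical sample contents, not taken from the FPC sources.
samples = {
    "ascii": b"exception no",
    "utf-8": "exception n\u00f8".encode("utf-8"),
    "legacy": b"exception n\xf8",  # one byte from some single-byte code page
}

for name, data in samples.items():
    print(name, first_invalid_utf8_offset(data))
```

Only the last sample reports an offset; pure ASCII passes because ASCII is a subset of UTF-8.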

Spelling also figures highly, they have a list of common misspellings and scan against it.

There are also a host of things we would probably argue are the way they should be.

So, my question, would the FPC developers accept fixes to some of these issues ? It's unlikely it will improve the product significantly but any change carries a risk of breaking something. My approach would be, perhaps -

Identify the ones that can easily be fixed, examine each one in context (absolutely no automatic changes proposed here folks!) and fix it. Get maybe ten such fixes in a batch, build the source, and do some superficial tests, hopefully focused on the areas possibly affected. Submit that batch of fixes as a pull request (or patches, or whatever you prefer).

It would be a slow process, done as a spare time activity, not a key project.

So, is it worthwhile or not ?

Davo

« Last Edit: January 19, 2022, 02:50:25 am by dbannon »
Lazarus 3, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

Thaddy

  • Hero Member
  • *****
  • Posts: 14197
  • Probably until I exterminate Putin.
Re: A (Debain) Quality Proposal
« Reply #1 on: January 18, 2022, 10:34:39 am »
Most if not all have to do with the fact that FPC supports so many platforms.
It would be very hard to make an exception for Debian Linux in the sourcecode of the compiler/rtl.
The Debian team seems to recognize that, otherwise fpc would not be included in the standard distribution.
Specialize a type, not a var.

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: A (Debain) Quality Proposal
« Reply #2 on: January 18, 2022, 11:06:29 am »
I'd have thought that fixing the highlighted spelling errors would be in everybody's interest.

Codepage issues should take into account (a) the target platform and (b) the provenance of the file in which the offending characters appear: it would be reasonable to treat a target which predated UTF-8 and where an API definition file had a non-Unicode markup convention as a special case.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

dbannon

  • Hero Member
  • *****
  • Posts: 2786
    • tomboy-ng, a rewrite of the classic Tomboy
Re: A (Debain) Quality Proposal
« Reply #3 on: January 18, 2022, 11:25:58 am »
I don't think too many (and I have only looked at six or seven) are deliberate internationalization. More historical accidents, IMHO.

Quote
errln('error setting exception n�'+hexstr(i,2));.

That line is enough for that file to be declared an ISO-8859.  I found a couple that had a mystery symbol where I'd expect the copyright symbol to be used.  Would it be fair to say internationalization belongs in the PO system ?

Similarly, spelling, Debian's list of "frequently misspelt words" (sic) would have been built up over time based on its own very broad coverage.

There are heaps of other things too, some judgement would need to be applied, thus my comment about no automatic changes ....

Davo

MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: A (Debain) Quality Proposal
« Reply #4 on: January 18, 2022, 11:36:30 am »
Quote
errln('error setting exception n�'+hexstr(i,2));.

If that's intended to be something like nr, nm or no then it definitely needs fixing IMO since it can be done by converting to conventional ASCII.

MarkMLl
« Last Edit: January 18, 2022, 02:01:41 pm by MarkMLl »

munair

  • Hero Member
  • *****
  • Posts: 798
  • compiler developer @SharpBASIC
    • SharpBASIC
Re: A (Debain) Quality Proposal
« Reply #5 on: January 18, 2022, 12:17:53 pm »
Most if not all have to do with the fact that FPC supports so many platforms.
It would be very hard to make an exception for Debian Linux in the sourcecode of the compiler/rtl.

If Debian complains about files not being UTF8 then the problem would apply to all platforms natively using UTF8 (probably all OSs with a Linux kernel). It is just that Debian has high quality control checking, which would not be the right reason to make an exception for Debian Linux.
keep it simple

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: A (Debain) Quality Proposal
« Reply #6 on: January 18, 2022, 01:56:58 pm »
So, my question, would the FPC developers accept fixes to some of these issues ?

In principle, yes, though in the end it will be a case-by-case basis, so "batching" as you suggested is not necessarily the correct way (though it depends on the issue in question).

I don't think too many (and I have only looked at six or seven) are deliberate internationalization. More historical accidents, IMHO.

Quote
errln('error setting exception n�'+hexstr(i,2));.

That line is enough for that file to be declared an ISO-8859.  I found a couple that had a mystery symbol where I'd expect the copyright symbol to be used.

And this is one where the context is important (would have been nice if you mentioned the filename, I had to search it myself... ::) ): this is a file for go32v2. That platform has no knowledge about UTF-8 and uses legacy codepages. In this specific case it's a "ø" (and my Lazarus on Windows displays it correctly).

Would it be fair to say internationalization belongs in the PO system ?

The code that's provided by FPC itself does not care about po files. The only applications that provide internationalization support are the compiler and the textmode IDE and both use their own mechanisms.

By the way: please fix the "Debain" in the title :P

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: A (Debain) Quality Proposal
« Reply #7 on: January 18, 2022, 03:26:11 pm »
Some might also have been fixed in the meantime (most notably "ocurred" -> "occurred"). But of course there can also be more of the same.

Also check trunk to be sure.

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 11382
  • FPC developer.
Re: A (Debain) Quality Proposal
« Reply #8 on: January 18, 2022, 05:30:27 pm »
If Debian complains about files not being UTF8 then the problem would apply to all platforms natively using UTF8 (probably all OSs with a Linux kernel).

Not if they convert on startup. Keep in mind that DOS has no native codepage conversion routines. Mostly they are used for older parts of the system like the textmode IDE. On the modern UTF-8 targets, they mostly use Lazarus.

Quote
It is just that Debian has high quality control checking, which would not be the right reason to make an exception for Debian Linux.

It generates a lot, but quality can be debated.  Many things it lists that are not spelling mistakes are deliberate though. (like the static linking, the binary in a versioned dir in  lib etc)

« Last Edit: January 18, 2022, 05:58:25 pm by marcov »

dbannon

  • Hero Member
  • *****
  • Posts: 2786
    • tomboy-ng, a rewrite of the classic Tomboy
Re: A (Debain) Quality Proposal
« Reply #9 on: January 19, 2022, 02:44:01 am »
In principle, yes, though in the end it will be a case-by-case basis, so "batching" as you suggested is not necessarily the correct way (though it depends on the issue in question).
Could make a lot of work for the devs but I now understand it can be more subtle than I thought.


Quote
errln('error setting exception n�'+hexstr(i,2));.
.....this is a file for go32v2. That platform has no knowledge about UTF-8 and uses legacy codepages. In this specific case it's a "ø" (and my Lazarus on Windows displays it correctly).
Right, that does put a different slant on it. I did not realise we could have an OS that does not, itself, support UTF8.  But thinking about it, there must be lots.  :(



Quote
By the way: please fix the "Debain" in the title :P

Damn, that was a deliberate mistake I was going to refer to in the first post and forgot. Sigh ....

It does sound like far more than I would have expected are un-fixable. Maybe I need to take a bigger sample. It's possible within the Debian system to add exceptions that suppress particular Lintian warnings (but practicable?). Further research is indicated.

I would like to fix at least some of them, just to prove we are trying to be helpful.

Davo

dbannon

  • Hero Member
  • *****
  • Posts: 2786
    • tomboy-ng, a rewrite of the classic Tomboy
Re: A (Debian) Quality Proposal
« Reply #10 on: January 19, 2022, 02:49:28 am »
Further on the issue of OSes that don't support UTF8, it seems to me that it would be possible to keep such OS-specific files to just ascii in at least some cases. That might, for example, mean replacing an ISO-8859 copyright symbol with (c) ?

Would that be an acceptable solution perhaps ?

Again, obviously, on a case by case basis. Where a symbol appears in compilable code would be a different matter ....

Davo
« Last Edit: January 19, 2022, 03:04:56 am by dbannon »

PascalDragon

  • Hero Member
  • *****
  • Posts: 5446
  • Compiler Developer
Re: A (Debain) Quality Proposal
« Reply #11 on: January 19, 2022, 09:51:07 am »
Quote
errln('error setting exception n�'+hexstr(i,2));.
.....this is a file for go32v2. That platform has no knowledge about UTF-8 and uses legacy codepages. In this specific case it's a "ø" (and my Lazarus on Windows displays it correctly).
Right, that does put a different slant on it. I did not realise we could have an OS that does not, itself, support UTF8.  But thinking about it, there must be lots.  :(

Essentially all the legacy systems that FPC supports.

Quote
By the way: please fix the "Debain" in the title :P

Damn, that was a deliberate mistake I was going to refer to in the first post and forgot. Sigh ....

Even if it is a deliberate mistake it makes the topic harder to find by only title.

Further on the issue of OSes that don't support UTF8, it seems to me that it would be possible to keep such OS-specific files to just ascii in at least some cases. That might, for example, mean replacing an ISO-8859 copyright symbol with (c) ?

Would that be an acceptable solution perhaps ?

No. Changing encoding especially for files that are intended for legacy systems anyway just to keep Debian's linting happy is not a solution. Not to mention that e.g. box drawing characters used on DOS etc. are not ASCII either.

For example the case with the "nø" might not have a real use, cause it's simply error output, but changing the "ø" to e.g. "o" just for the linting is not a solution either (essentially it's a change for the sake of change).

If there are spelling mistakes or encoding errors in files that are supposed to be UTF-8 then that's a different topic.

dbannon

  • Hero Member
  • *****
  • Posts: 2786
    • tomboy-ng, a rewrite of the classic Tomboy
Re: A (Debian) Quality Proposal
« Reply #12 on: January 19, 2022, 11:21:36 am »

Thanks PascalDragon, one of the really cool things about FPC is the breadth of platforms covered, I would not suggest anything that compromises that.

This is not just to pacify Debian, we are talking about errors in a lot of cases. Sometimes unreadable text. On my Linux system most of the characters Lintian objects to are shown as the blackbox with question mark. The 'file' command identifies the file as one of the 16 possible variations of ISO-8859, we cannot know for sure what the character in question really looks like even on a system that tries to map ISO-8859.
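The ambiguity can be sketched in a few lines. The example assumes, purely for illustration, that the offending byte is 0xF8 (the thread does not state the actual value) and decodes it under a few single-byte encodings that Python ships; the helper name is hypothetical:

```python
def decode_under(raw: bytes, codecs):
    """Decode the same bytes under each named codec and return the results."""
    return {codec: raw.decode(codec) for codec in codecs}


# Assumption for illustration only: "n" followed by the single byte 0xF8.
raw = b"n\xf8"

results = decode_under(raw, ("latin-1", "cp1252", "cp437", "cp850"))
for codec, text in results.items():
    print(f"{codec:8} -> {text}")
```

Under that assumption, the ISO-8859-1 and Windows-1252 readings give "nø", while the DOS code pages 437 and 850 give "n°". The bytes alone do not say which reading the author intended, which is why the file's target platform matters.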

I fully understand your concern about converting a file from ISO-8859-xx to ascii but I would only do that if the affected characters were only in comments. I also found one case where accented characters in a comment were important to understanding the comment, again, don't touch.

Alternatively, converting to UTF8 will perhaps preserve some information, and there are already quite a lot (78) of UTF8 files in there, so apparently the non-UTF8 operating systems can cope, something I don't know for sure ??

I have now looked at about 30 items a lot closer. I reckon we can divide the Lintian warnings up into three categories.

1. The easy ones. Spelling mistakes, hardening (that's for Debian to do), using (c) for copyright symbol, and a small part of the National_Encoding ones, ones where the problem is in comments.

2. Ones that we can do nothing about so I/we need to provide a lintian override file. Easy.

3. The hard ones. Might require a policy decision from the developers. For example, where a contributor's name contains "accented characters" or a header is written in non-English. Two possible approaches -

3a.   A translation to UTF8 will probably make it readable by more of today's systems and not break compilability I THINK ??.

3b. Conversely, translating to ascii is safer but might just offend someone (quite reasonably). For example, ä becomes a. Wrong but safe.
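Options 3a and 3b can be sketched as follows. This is an illustration only, not a proposed tool: the function names and the fallback table are hypothetical, and both functions need to be told the source encoding, which is exactly the piece of information that is often uncertain.

```python
import unicodedata

# Hypothetical hand-made table for characters that have no ASCII base letter.
ASCII_FALLBACKS = {"\u00a9": "(c)", "\u00f8": "o"}


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """3a: re-encode as UTF-8; lossless if source_encoding is right."""
    return raw.decode(source_encoding).encode("utf-8")


def to_ascii(raw: bytes, source_encoding: str) -> bytes:
    """3b: transliterate to ASCII; lossy, e.g. 'ä' becomes 'a'."""
    out = []
    for ch in raw.decode(source_encoding):
        if ch in ASCII_FALLBACKS:
            out.append(ASCII_FALLBACKS[ch])
            continue
        # NFKD splits an accented letter into base letter + combining mark;
        # dropping the combining marks keeps only the base letter.
        decomposed = unicodedata.normalize("NFKD", ch)
        out.append("".join(c for c in decomposed if not unicodedata.combining(c)))
    # Anything still outside ASCII is replaced with "?" rather than raising.
    return "".join(out).encode("ascii", errors="replace")


print(to_utf8(b"\xe4", "latin-1"))        # UTF-8 bytes for "ä"
print(to_ascii(b"\xe4", "latin-1"))       # plain "a"
print(to_ascii(b"\xa9 2022", "latin-1"))  # "(c) 2022"
```

The UTF-8 route keeps the original character but changes the bytes on disk, so a legacy target that reads the file byte-for-byte would see two bytes where it used to see one. The ASCII route is stable everywhere but discards information.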

Let's be clear, I am absolutely NOT suggesting changing anything in a file that has its issue in compilable code or in a comment where, e.g., the accented character is critical to the sense of the comment. Looks ugly on some systems but play safe !

Sorry I am pushing so hard on this but I think 'we' can actually improve things here if we get it right.

Davo





MarkMLl

  • Hero Member
  • *****
  • Posts: 6676
Re: A (Debian) Quality Proposal
« Reply #13 on: January 19, 2022, 12:38:18 pm »
Sorry I am pushing so hard on this but I think 'we' can actually improve things here if we get it right.

I for one think your suggestions are reasonable. Comments- as I said earlier- can be a problem where particular character usage echoes the platform's conventions for library definitions etc.

Allowing that Pascal is an English-based programming language and that English has long been this project's working language, I see absolutely no reason why exotic characters should be tolerated in error messages and license texts: /except/ where they are required to render a contributor's name or affiliation correctly (which I consider to be a fundamental respect). I'd also refer to previous discussion where non-ASCII character sets such as EBCDIC were considered, where the core developers made it quite clear that even if this were supported by a cross-compiler FPC itself was- and would remain- ASCII based: there is quite simply no justification for the continued use of e.g. nø in error messages unless these are fully-localised.

MarkMLl

Kays

  • Hero Member
  • *****
  • Posts: 569
  • Whasup!?
    • KaiBurghardt.de
Re: A (Debian) quality proposal
« Reply #14 on: January 21, 2022, 12:30:58 am »
[…] So, my question, would the FPC developers accept fixes to some of these issues ? […]
No, there’s a certain moment of inertia: “If it ain’t broke don’t fix it.” I once suggested to replace German-language identifiers but there was no favorable majority. If the Lintian hints aren’t part of a bug report (i. e. something does not work), they don’t get fixed. Certainly developers can consider the hints, but ultimately the project’s goal is to write a working compiler. Resolving those linting hints is, in a manner of speaking, not within the project scope.
Yours Sincerely
Kai Burghardt

 
