
Author Topic: Pascal Security  (Read 20727 times)

MarkMLl

  • Hero Member
  • *****
  • Posts: 8572
Re: Pascal Security
« Reply #15 on: November 02, 2021, 09:46:33 am »
Official documentation: https://www.freepascal.org/docs-html/current/ref/refse4.html#x15-140001.4.

Which is the page I cited near the start of the thread, and says nothing about what characters are acceptable in positions after the first.

MarkMLl
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Logitech, TopSpeed & FTL Modula-2 on bare metal (Z80, '286 protected mode).
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

PascalDragon

  • Hero Member
  • *****
  • Posts: 6398
  • Compiler Developer
Re: Pascal Security
« Reply #16 on: November 02, 2021, 02:40:39 pm »
Quote
Official documentation: https://www.freepascal.org/docs-html/current/ref/refse4.html#x15-140001.4.

Which is the page I cited near the start of the thread, and says nothing about what characters are acceptable in positions after the first.

FPC does not yet support characters in identifiers other than digits (0-9), letters (a-z and A-Z) and the underscore. The wording of the documentation could be made clearer, however.
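
As a rough illustration of that rule (a sketch only, not the compiler's actual scanner), the accepted identifier shape can be written as a regular expression. The function name is hypothetical, and the first-character rule (a letter or underscore, never a digit) is an assumption taken from the reference page linked above rather than from this post.

```python
import re

# Assumed identifier shape: first character a letter or underscore,
# remaining characters letters, digits or underscores (ASCII only).
_IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")

def is_ascii_identifier(name: str) -> bool:
    """Return True if `name` uses only the characters described above."""
    return _IDENT.fullmatch(name) is not None

print(is_ascii_identifier("Total_2"))  # True
print(is_ascii_identifier("Größe"))    # False: ö and ß are outside a-z/A-Z
```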

MarkMLl

  • Hero Member
  • *****
  • Posts: 8572
Re: Pascal Security
« Reply #17 on: November 02, 2021, 02:47:18 pm »
Quote
FPC does not yet support characters in identifiers other than digits (0-9), letters (a-z and A-Z) and the underscore. The wording of the documentation could be made clearer, however.

Thanks for that. There's one place on the wiki page that refers to "English alphabet", but otherwise I think that people have seen "limits identifiers to 127 characters", which is about their length, and misinterpreted it to mean ANSI (or "strict ASCII", or whatever one's chosen term is), i.e. below 0x80.

MarkMLl

Warfley

  • Hero Member
  • *****
  • Posts: 2067
Re: Pascal Security
« Reply #18 on: November 02, 2021, 11:13:34 pm »
It's funny, but there are a lot of situations in which bad accessibility can increase security, and not only with respect to a programming language allowing only English characters.
For example, on Android, if you disallow screen and touch recording in your app, accessibility tools no longer work, but this can substantially increase the security of your app, e.g. if it requires you to enter a PIN in your banking app.

So at last we can claim that the bad accessibility is a security feature of the language :P

Grahame Grieve

  • Sr. Member
  • ****
  • Posts: 379
Re: Pascal Security
« Reply #19 on: November 03, 2021, 09:43:23 am »
Well, I wrote that code above into my ci-build script so that it checks for these Unicode characters. But it would be better for FPC to check constants and strings as well. I'll live in hope ;-)
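
For readers without the earlier post to hand, a check of this kind might look roughly like the following. This is an illustrative sketch, not the code referred to above: the function name is hypothetical, and the set of code points (the explicit bidirectional formatting characters U+202A to U+202E and U+2066 to U+2069) is an assumption about what such a script would look for. Because it scans the whole text, it also covers string constants and comments.

```python
# Explicit bidirectional formatting characters (assumed check list):
# LRE, RLE, PDF, LRO, RLO, LRI, RLI, FSI, PDI.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}

def find_bidi_controls(text):
    """Return (line, column, code point) for each bidi control in `text`."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, f"U+{ord(ch):04X}"))
    return hits

# A right-to-left override hidden inside a string constant on line 2.
sample = "x := 1;\nwriteln('a\u202eb');\n"
print(find_bidi_controls(sample))  # [(2, 11, 'U+202E')]
```

A build script could read each source file as UTF-8, call the function, and fail the build if the returned list is not empty.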

MarkMLl

  • Hero Member
  • *****
  • Posts: 8572
Re: Pascal Security
« Reply #20 on: November 03, 2021, 09:54:09 am »
Quote
Well, I wrote that code above into my ci-build script so that it checks for these Unicode characters. But it would be better for FPC to check constants and strings as well. I'll live in hope ;-)

Note PascalDragon's comment:

Quote
FPC does not yet support characters in identifiers other than digits (0-9), letters (a-z and A-Z) and the underscore. The wording of the documentation could be made clearer, however.

I think it's reasonable to assume that the compiler will explicitly reject things like the bidi overrides now that their abuse is a known issue.

My own personal feeling is that sticking to the basic English character set is entirely reasonable, but that needs to be explicitly documented.

MarkMLl

Grahame Grieve

  • Sr. Member
  • ****
  • Posts: 379
Re: Pascal Security
« Reply #21 on: November 04, 2021, 10:19:15 am »
I certainly hope that the FPC compiler will be updated. In the meantime, I won't get any new compiler features for a while unless patches are issued for old versions/branches. Hence I put it in my ci-build script.

dbannon

  • Hero Member
  • *****
  • Posts: 3826
    • tomboy-ng, a rewrite of the classic Tomboy
Re: Pascal Security
« Reply #22 on: November 10, 2021, 08:44:48 am »
Note that Debian's lintian scans for these scary characters, and finds lots of them, mostly in .po files and .xml files.

https://lintian.debian.org/tags/unicode-trojan

davo
Lazarus 4, Linux (and reluctantly Win10/11, OSX Monterey)
My Project - https://github.com/tomboy-notes/tomboy-ng and my github - https://github.com/davidbannon

Warfley

  • Hero Member
  • *****
  • Posts: 2067
Re: Pascal Security
« Reply #23 on: November 10, 2021, 07:46:29 pm »
The thing about these characters is that they are not bad or malicious characters; they are normal characters that are *required* for some languages.

So if an Arabic programmer wants to write programs for an Arabic audience, they will use Arabic text in strings, and possibly also in comments. And Arabic text simply requires these characters.
Filtering these characters would result in every program containing some amount of Arabic text or comments triggering the filter, which could pose usability issues when people get warnings that their perfectly normal programs all look suspect to the compiler.

I think the fact that Pascal does not allow non-English characters in identifiers is already an accessibility issue on its own, as programming beginners especially will usually use their own language for identifiers, which they simply can't with Pascal, sealing the language off for something like half the planet's population.
But then also having warnings (or even worse, errors) thrown every time you write a program in your language will just scare away potential Pascal developers.

It's easy to say that we should just use the Latin alphabet. For me as a German there are literally only 4 characters, ä, ö, ü and ß, that can easily be replaced by ae, oe, ue and ss, but for some other people, learning a programming language should not require learning a new natural language.

MarkMLl

  • Hero Member
  • *****
  • Posts: 8572
Re: Pascal Security
« Reply #24 on: November 10, 2021, 08:11:46 pm »
Quote
The thing about these characters is that they are not bad or malicious characters; they are normal characters that are *required* for some languages.

And- if I'm correct- the .po files contain language-specific ("internationalised") strings etc.

Quote
I think the fact that Pascal does not allow non-English characters in identifiers is already an accessibility issue on its own, as programming beginners especially will usually use their own language for identifiers, which they simply can't with Pascal, sealing the language off for something like half the planet's population.

I've given this some thought in the past in the context of some of my own R&D activities, and broadly agree with you. /However/, if Modern Pascal and Lazarus/FPC are to continue to hew to the principles established by Wirth, I think it's fair to point out that despite being Swiss he was entirely happy to design and document the language assuming the English alphabet and made no provision for either non-English identifiers or localised reserved words.

Quote
It's easy to say that we should just use the Latin alphabet. For me as a German there are literally only 4 characters, ä, ö, ü and ß, that can easily be replaced by ae, oe, ue and ss, but for some other people, learning a programming language should not require learning a new natural language.

Quite frankly I find ß in particular pleasantly quaint, even if it tends to look somewhat jarring in most typefaces.

MarkMLl

Warfley

  • Hero Member
  • *****
  • Posts: 2067
Re: Pascal Security
« Reply #25 on: November 10, 2021, 09:28:45 pm »
Quote
And- if I'm correct- the .po files contain language-specific ("internationalised") strings etc.

Sure, if you enable internationalization/i18n. But if you develop an application as a local, only for other locals, using i18n just adds work and debugging overhead. Personally I write most of my applications in English and don't care about i18n, because it is simply a lot of extra work which I don't think is justified for my one-man projects.
And even without strings there are still comments. I regularly see non-English comments; in one Korean project I've seen, all the comments were in Korean. Generally speaking, people just like to use their native language.

And the question isn't even whether that's a good or a bad idea. If people like using their native language and Pascal makes it harder or doesn't allow them to, then the result will simply be that in those countries people will not use Pascal, but rather languages like C++ or C# that are fully Unicode. And personally I like seeing people start using Pascal; I'd love to see it being used around the world.
Especially here in Europe it is generally considered a beginner's language, often taught at schools. This is of course completely off the table in a lot of other countries, because when you want to teach programming to children who possibly aren't that fluent in English, the teachers will probably choose a language in which the children can name identifiers in their own language.

Quote
I've given this some thought in the past in the context of some of my own R&D activities, and broadly agree with you. /However/, if Modern Pascal and Lazarus/FPC are to continue to hew to the principles established by Wirth, I think it's fair to point out that despite being Swiss he was entirely happy to design and document the language assuming the English alphabet and made no provision for either non-English identifiers or localised reserved words.

I think that's always easy to say as someone who comes from a general European background. Sure, Wirth was Swiss, and depending on where he was from he spoke either Italian, German or French (I think it was German-speaking Switzerland, but I am not sure right now). All of these languages just have a few different characters, not a whole different alphabet with even different syntactic rules (Arabic, for example, does not have vowels; the vowels come from the context in which the consonants are written).
It must also be noted that someone who is native in one European language (maybe except Hungarian) has a much easier time learning another, like English, than someone who comes from a completely different linguistic background.
« Last Edit: November 10, 2021, 09:31:18 pm by Warfley »

MarkMLl

  • Hero Member
  • *****
  • Posts: 8572
Re: Pascal Security
« Reply #26 on: November 10, 2021, 09:55:01 pm »
Quote
And- if I'm correct- the .po files contain language-specific ("internationalised") strings etc.
Sure, if you enable internationalization/i18n,

The reason I was pointing that out is that it very likely explains why checking software was finding "exotic" Unicode in certain types of file: basically, there was no nefarious intent and the results were false positives.

Quote
I think that's always easy to say as someone who comes from a general European background. Sure, Wirth was Swiss, and depending on where he was from he spoke either Italian, German or French (I think it was German-speaking Switzerland, but I am not sure right now).

Zurich born and bred. I checked, because I was prepared to make disparaging comments about the chauvinism of certain ethnicities who /demand/ that their orthographic foibles be accommodated whatever the cost.

MarkMLl

Warfley

  • Hero Member
  • *****
  • Posts: 2067
Re: Pascal Security
« Reply #27 on: November 10, 2021, 10:07:15 pm »
Quote
The reason I was pointing that out is that it very likely explains why checking software was finding "exotic" Unicode in certain types of file: basically, there was no nefarious intent and the results were false positives.

Ah OK, you were referencing the earlier post.

Quote
Zurich born and bred. I checked, because I was prepared to make disparaging comments about the chauvinism of certain ethnicities who /demand/ that their orthographic foibles be accommodated whatever the cost.

MarkMLl

Well, I don't think you will necessarily get that here, because the people here already use Pascal and are therefore probably fine with the decision. But there are a lot of people who will never learn Pascal simply because it is much more inconvenient for them. As I stated in my earlier post, here in Europe Pascal is often used in education to teach programming, and personally I think that if there is one thing where Pascal really shines, it is teaching (because it covers pretty much all areas of basic programming, from fully procedural console applications to OOP GUI applications, with high-level classes but also low-level pointer logic etc.). But as programming is taught in school at ages where the children will probably not yet be comfortable with going fully English, it will probably not be an option.

I remember that back at my school, when we learned Pascal, we were also using German words for the identifiers. This works because German is very closely related to English, and also because Germany is on good terms with the Anglo-Saxon countries. But in other countries with other languages, and maybe even political tensions such that using English would be considered unpatriotic or something similar, this rules out Pascal as a learning language.

MarkMLl

  • Hero Member
  • *****
  • Posts: 8572
Re: Pascal Security
« Reply #28 on: November 11, 2021, 09:18:12 am »
Quote
maybe even political tensions such that using English would be considered unpatriotic or something similar, this rules out Pascal as a learning language.

Yes, there was a BASIC interpreter at one point with all the keywords translated to Welsh. But I pity anybody condemned to writing a lexer (or for that matter a filesystem) using variable-length UTF-8 characters or an orthography where the direction can (and does) change... which gets us back on topic :-)
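
To make the variable-length point concrete, here is a small sketch (the helper name is hypothetical and the sample characters are arbitrary choices) showing that a single character can occupy anything from one to four bytes in UTF-8, so a byte-oriented lexer cannot assume a fixed width per character.

```python
def utf8_length(ch: str) -> int:
    """Number of bytes needed to encode the character `ch` in UTF-8."""
    return len(ch.encode("utf-8"))

# Samples: an ASCII letter, German sharp s, Arabic ain, euro sign, an emoji.
for ch in ["a", "ß", "ع", "€", "😀"]:
    print(f"U+{ord(ch):04X}: {utf8_length(ch)} byte(s)")
```

The five samples come out as 1, 2, 2, 3 and 4 bytes respectively.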

MarkMLl

PascalDragon

  • Hero Member
  • *****
  • Posts: 6398
  • Compiler Developer
Re: Pascal Security
« Reply #29 on: November 11, 2021, 09:26:04 am »
Quote
I think the fact that Pascal does not allow non-English characters in identifiers is already an accessibility issue on its own, as programming beginners especially will usually use their own language for identifiers, which they simply can't with Pascal, sealing the language off for something like half the planet's population.

Sooner or later we'll probably implement support for Unicode identifiers, because Delphi allows them as well.

 
