Author Topic: Permuted index for RTL and FCL  (Read 624 times)

MarkMLl

  • Hero Member
  • *****
  • Posts: 8213
Permuted index for RTL and FCL
« on: January 22, 2025, 10:48:36 am »
I've uploaded scripts that process the documentation files into a permuted index; they are at https://github.com/MarkMLl/fpc-ptx and the resulting index shows the name of each function, whether it comes from the RTL or FCL, a trimmed and permuted description, and the version of FPC at which it was introduced.

The key thing is that all string functions (including helpers) are indexed together, all socket functions together, all creation operations together and so on.
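For readers unfamiliar with the term, here is a minimal sketch of how a permuted (keyword-in-context) index groups entries in this way. It is written in Python for illustration only and is not the Perl code in the repository; the stop-word list, the sample descriptions and the helper names `permute` and `build_index` are all assumptions made up for this example.

```python
# Hypothetical stop words: words too common to be useful index keys.
STOP_WORDS = {"a", "an", "the", "of", "to", "in", "and", "or", "for", "is"}

# Illustrative (name, source, description) entries; descriptions are invented.
SAMPLE = [
    ("Pos", "RTL", "Search for substring in a string"),
    ("fpSocket", "RTL", "Create new socket"),
    ("TStringList.Create", "FCL", "Create a new string list"),
]

def permute(name, source, description):
    """Yield one (keyword, rotated description, name, source) row per significant word."""
    words = description.split()
    for i, word in enumerate(words):
        key = word.strip(".,;:()").lower()
        if not key or key in STOP_WORDS:
            continue
        # Rotate the description so the keyword comes first; "/" marks the wrap point.
        rotated = " ".join(words[i:] + ["/"] + words[:i]) if i else " ".join(words)
        yield (key, rotated, name, source)

def build_index(entries):
    """Build the full index, sorted so rows sharing a keyword sit together."""
    rows = []
    for name, source, description in entries:
        rows.extend(permute(name, source, description))
    return sorted(rows)

if __name__ == "__main__":
    for key, rotated, name, source in build_index(SAMPLE):
        print(f"{key:<10} {rotated:<40} {name} ({source})")
```

Because every significant word becomes a sort key, the two "string" rows and the two "create" rows end up adjacent regardless of which unit they come from.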

This has lots of problems which are not going to be fixed in the short term. In particular, the lack of archived CHM-format files for versions older than 2.6.0 means that I have been unable to give a precise introduction version for many functions. When I originally looked at this in the 2010s I worked from the PDF files, but that work is almost certainly irretrievable.

I can't attach the output here even as a .zip file, since it exceeds the 500k limit by about 15%. However, the tools are easily run (albeit not very fast), needing only Perl, shell and (FPC's) chmls; for the time being I've put a copy of the output at http://www.kdginstruments.co.uk/public/fpc-ptx.zip

MarkMLl
« Last Edit: January 22, 2025, 04:43:27 pm by MarkMLl »
MT+86 & Turbo Pascal v1 on CCP/M-86, multitasking with LAN & graphics in 128Kb.
Logitech, TopSpeed & FTL Modula-2 on bare metal (Z80, '286 protected mode).
Pet hate: people who boast about the size and sophistication of their computer.
GitHub repositories: https://github.com/MarkMLl?tab=repositories

marcov

  • Administrator
  • Hero Member
  • *
  • Posts: 12037
  • FPC developer.
Re: Permuted index for RTL and FCL
« Reply #1 on: January 22, 2025, 11:02:56 am »
Pity it is not written in FPC, that makes maintenance harder.

Maybe CHMs for older versions can be generated using a modern fpdoc with the old sources and doc sources.

There are possible breaking points:
  • rtl/fcl doc sources didn't change that much for CHM support, but better error handling was added, and that found some problems
  • fcl-passrc might be stricter now and unable to parse old sources (things like duplicate use of identifiers in objfpc mode) as well as older calling conventions or constructs

The only way of knowing is trying I guess.

P.S. Working via the CHM classes instead of the minor frontend chmls (or simply extracting everything and then reading the HTML files) might be a lot faster, because the CHM is then no longer initialized on every read.
« Last Edit: January 22, 2025, 11:14:03 am by marcov »

MarkMLl

  • Hero Member
  • *****
  • Posts: 8213
Re: Permuted index for RTL and FCL
« Reply #2 on: January 22, 2025, 11:17:37 am »
Quote
Pity it is not written in FPC, that makes maintenance harder.

Note that the bottleneck is parsing the HTML files embedded in the CHMs to plain text so that features can be extracted, which would be relatively slow in any language.
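To give an idea of what that step involves, here is a minimal sketch of reducing one HTML page to plain text. It uses Python's standard `html.parser` module purely as an illustration; it is not the Perl code in the repository, and the class name `TextExtractor` and function `html_to_text` are hypothetical names chosen for this example.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element.
        if not self._skip_depth:
            self._chunks.append(data)

    def text(self):
        # Join the fragments and collapse runs of whitespace to single spaces.
        return " ".join(" ".join(self._chunks).split())

def html_to_text(html):
    """Return the visible text of an HTML document as one normalised line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()

if __name__ == "__main__":
    page = "<html><body><h1>Pos</h1><p>Search for <b>substring</b> in a string.</p></body></html>"
    print(html_to_text(page))
```

Every page has to be tokenised in full before any feature can be pulled out of it, which is why this stage dominates the run time whatever language it is written in.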

Apart from that... it's open source Marco, feel free to rewrite it >:-)

If Pascal were better than Perl for this I'd have used it. It's not: Perl was specifically written for text manipulation.

Quote
Maybe CHMs for older versions can be generated using a modern fpdoc with the old sources and doc sources.

There are possible breaking points:
  • rtl/fcl doc sources didn't change that much for CHM support, but better error handling was added, and that found some problems
  • fcl-passrc might be stricter now and unable to parse old sources (things like duplicate use of identifiers in objfpc mode) as well as older calling conventions or constructs

The only way of knowing is trying I guess.

We've obviously discussed this elsewhere, but I suspect that I originally did it in the 2.6 (or possibly even 2.4) era, which is presumably why I started with PDFs. I ran into a problem when I started trying to look at the "when-introduced" issue: it looked as though I needed to bring up specific FPC versions with specific OS prerequisites, and while I was already using VMs and paravirtualisation, such things were less mature than they are today.

I only discovered that the CHMs were missing after I'd indexed 3.x and started working backwards. While I'd be very surprised if anybody was still using anything older than 2.6, I still find my inability to do this properly irritating.

Updated: I've uploaded a generated index to www.kdginstruments.co.uk/public/fpc-ptx.zip

MarkMLl
« Last Edit: Today at 12:39:29 pm by MarkMLl »

Thaddy

  • Hero Member
  • *****
  • Posts: 16523
  • Kallstadt seems a good place to evict Trump to.
Re: Permuted index for RTL and FCL
« Reply #3 on: January 22, 2025, 03:52:14 pm »
Claiming eggs with eggs again, Mark.
Name the species.
But I am sure they don't want the Trumps back...

 
