Author Topic: PasLLM - LLM Inference Engine in Pure Pascal  (Read 3681 times)

valdir.marcos

  • Hero Member
  • *****
  • Posts: 1285
Re: PasLLM - LLM Inference Engine in Pure Pascal
« Reply #15 on: March 08, 2026, 01:11:25 am »
I've just released PasLLM, an LLM inference engine written completely in Object Pascal. It allows you to run models such as Llama 3.x, Qwen 2.5, Qwen 3, Phi-3, Mixtral, Gemma 1, DeepSeek R1 and others locally, without Python or external dependencies at inference time.

It works with Delphi 11.2+ and FreePascal 3.3.1+ on all major modern operating system targets. I've implemented custom 4-bit quantization formats that get very close to full precision quality while keeping model sizes manageable. CLI and GUI versions are included (FMX, VCL, LCL). Pre-quantized models are available for download. PasLLM can also be integrated as a unit directly into your own Object Pascal projects.

Right now it's CPU-only. GPU acceleration via PasVulkan is planned but will take significant time. I mainly test 64-bit builds only; compiling for 32-bit might work, but it isn't officially supported and may run into memory limitations with larger models.

The repository is at https://github.com/BeRo1985/pasllm (synced from my private server where development takes place). It's licensed under AGPL 3.0 for open-source use, with commercial licenses available if needed.
Interesting.

myisjwj

  • Full Member
  • ***
  • Posts: 104
Re: PasLLM - LLM Inference Engine in Pure Pascal
« Reply #16 on: April 19, 2026, 04:54:41 am »
It is better to directly call Llama.dll

Thaddy

  • Hero Member
  • *****
  • Posts: 19165
  • Glad to be alive.
Re: PasLLM - LLM Inference Engine in Pure Pascal
« Reply #17 on: April 19, 2026, 06:57:20 am »
Quote from: myisjwj
It is better to directly call Llama.dll
No it is not.
This is not simply an API wrapper, which is what you seem to think.
objects are fine constructs. You can even initialize them with constructors.

 
