I've just released PasLLM, an LLM inference engine written completely in Object Pascal. It lets you run models such as Llama 3.x, Qwen 2.5, Qwen 3, Phi-3, Mixtral, Gemma 1, DeepSeek R1 and others locally, without Python or any external dependencies at inference time.

It works with Delphi 11.2+ and FreePascal 3.3.1+ on all major modern operating system targets. I've implemented custom 4-bit quantization formats that come very close to full-precision quality while keeping model sizes manageable. CLI and GUI versions are included (FMX, VCL, LCL), and pre-quantized models are available for download. PasLLM can also be integrated as a unit directly into your own Object Pascal projects.

Right now it's CPU-only. GPU acceleration via PasVulkan is planned but will take significant time. I mainly test 64-bit builds; compiling for 32-bit might work, but it isn't officially supported and may run into memory limitations with larger models.

The repository is at https://github.com/BeRo1985/pasllm (synced from my private server, where development takes place). It's licensed under the AGPL 3.0 for open-source use, with commercial licenses available if needed.
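
For anyone curious what 4-bit quantization looks like in Pascal, here is a minimal, purely illustrative sketch of a generic block-wise scheme (one scale per block of 32 values, two 4-bit codes packed per byte). This is not PasLLM's actual on-disk format; all type and routine names here are made up for the example.

```pascal
program Quant4Demo;

{$mode objfpc}{$H+}  // FreePascal mode directive; Delphi compiles without it

uses
  Math;

const
  BlockSize = 32;

type
  { Hypothetical block layout: a per-block scale plus packed 4-bit codes. }
  TBlock4 = record
    Scale: Single;
    Nibbles: array[0..(BlockSize div 2) - 1] of Byte;
  end;

procedure QuantizeBlock(const Values: array of Single; out Block: TBlock4);
var
  i, q: Integer;
  MaxAbs: Single;
begin
  // Derive the scale from the largest magnitude in the block.
  MaxAbs := 0;
  for i := 0 to BlockSize - 1 do
    if Abs(Values[i]) > MaxAbs then
      MaxAbs := Abs(Values[i]);
  if MaxAbs = 0 then
    Block.Scale := 1
  else
    Block.Scale := MaxAbs / 7; // signed 4-bit range is -8..7

  // Map each value to a biased 4-bit code and pack two codes per byte.
  for i := 0 to BlockSize - 1 do
  begin
    q := EnsureRange(Round(Values[i] / Block.Scale), -8, 7) + 8; // 0..15
    if (i and 1) = 0 then
      Block.Nibbles[i div 2] := q
    else
      Block.Nibbles[i div 2] := Block.Nibbles[i div 2] or (q shl 4);
  end;
end;

function DequantizeValue(const Block: TBlock4; Index: Integer): Single;
var
  q: Integer;
begin
  // Unpack the 4-bit code, then undo the bias and the scaling.
  if (Index and 1) = 0 then
    q := Block.Nibbles[Index div 2] and $0F
  else
    q := Block.Nibbles[Index div 2] shr 4;
  Result := (q - 8) * Block.Scale;
end;

var
  Input: array[0..BlockSize - 1] of Single;
  B: TBlock4;
  i: Integer;
begin
  for i := 0 to BlockSize - 1 do
    Input[i] := Sin(i * 0.2);
  QuantizeBlock(Input, B);
  WriteLn('original=', Input[5]:0:4, ' reconstructed=', DequantizeValue(B, 5):0:4);
end.
```

The trade-off in any scheme like this is block size versus accuracy: smaller blocks track local value ranges more closely at the cost of more scale overhead per weight.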