Recent

Author Topic: Call the Llama.cpp dynamic library to implement this AI call functionality.  (Read 667 times)

myisjwj

  • Full Member
  • ***
  • Posts: 104
Directly call the dynamic library without using the Html API and run it completely locally. Faster, safer and more reliable.
https://www.cnblogs.com/jwjss/p/19897010
If using a non-Windows platform, change the DLL_NAME of the LlamaCppApi unit to 'llama.dll'; Just use llama.so will do.

LeP

  • Sr. Member
  • ****
  • Posts: 306
There are lot of libs missing...
Take a look here: https://github.com/Embarcadero/llama-cpp-delphi

May be this help.
Un Sistema per domarli, un IDE per trovarli, un codice per ghermirli e nel framework incatenarli.
An operating system to tame them, an IDE to find them, a code to catch them and in the framework chain them.

myisjwj

  • Full Member
  • ***
  • Posts: 104
The attachment cannot handle such large files. Even the CPU's cache alone is over 2M in size. The CUDA library is 500 megabytes in size. You also need to compile Llama.cpp by yourself.
The library files for the CPU are located here: for win7_X64
http://bbs.2ccc.com/attachments/2026/jwj76_202641793547.rar
« Last Edit: April 22, 2026, 03:55:26 am by myisjwj »

Thausand

  • Hero Member
  • *****
  • Posts: 545
Is no use for tell myisjwj  :)

If not have know for delphi header is old > 1 year is no work then is no understand how is work llama (can read github llama API break). Delphi header is work when install old llama version only. 1 year AI/LLM frame time is same time as life for human.

It is give more control when also have tensor library header translate for Pascal but that no require.

If want make more better and more platform support then have llama library load dynamic (https://www.freepascal.org/docs-html/rtl/dynlibs/loadlibrary.html). llama also have load dynamic library when need example support special CPU/GPU/NPU.
A docile goblin always follow HERMES.md

LeP

  • Sr. Member
  • ****
  • Posts: 306
Is no use for tell myisjwj  :)

If not have know for delphi header is old > 1 year is no work then is no understand how is work llama (can read github llama API break). Delphi header is work when install old llama version only. 1 year AI/LLM frame time is same time as life for human.

It is give more control when also have tensor library header translate for Pascal but that no require.

If want make more better and more platform support then have llama library load dynamic (https://www.freepascal.org/docs-html/rtl/dynlibs/loadlibrary.html). llama also have load dynamic library when need example support special CPU/GPU/NPU.

Delphi 13.1 - Embarcadero llamacpp project with qwen3.5 for example ... (see attach screenshoot). It's only a test. I normally use Ollama, not llama directly.

Tale care that there is a support for all environments (see my first link) ... in the screenshoot you will se that I include support for CUDA.

[EDIT]: insert update information of model ...
« Last Edit: April 22, 2026, 04:30:46 pm by LeP »
Un Sistema per domarli, un IDE per trovarli, un codice per ghermirli e nel framework incatenarli.
An operating system to tame them, an IDE to find them, a code to catch them and in the framework chain them.

LeP

  • Sr. Member
  • ****
  • Posts: 306
But, my post is not about Delphi is better or not, is simple to help others to have some indicating of environments already done, not llamacpp but all others libraries that you can find there.
Llamacpp dll can be compiled from yourself and you can use any wrapper about that.

And find something usefull inside is not so far.

P.S.: take care of MIT license as indicating.
« Last Edit: April 22, 2026, 04:55:40 pm by LeP »
Un Sistema per domarli, un IDE per trovarli, un codice per ghermirli e nel framework incatenarli.
An operating system to tame them, an IDE to find them, a code to catch them and in the framework chain them.

Thausand

  • Hero Member
  • *****
  • Posts: 545
Yes is good LeP but that not why is use llama library. Library is use for direct infer. If run llama server then no need for library and can have use server protocol (any one can make in fpc or other program language  for communicate llama server. Is not only for llama but also other server/agent).

But I may be not have understand you write correct ?

PS: most header Delphi is for platform windows and then is no good use for FPC. Is not say for discredit but is practise and is make many problem because use wrong type when converse and have example c-code.
A docile goblin always follow HERMES.md

LeP

  • Sr. Member
  • ****
  • Posts: 306
Yes is good LeP but that not why is use llama library. Library is use for direct infer. If run llama server then no need for library and can have use server protocol (any one can make in fpc or other program language  for communicate llama server. Is not only for llama but also other server/agent).
Like I told I only answered to give more instruments for who needed them. I think you will see that can be used like chat or internally for inference or like you want ... but this is not the point.
PS: most header Delphi is for platform windows and then is no good use for FPC. Is not say for discredit but is practise and is make many problem because use wrong type when converse and have example c-code.
It's enough this for you or you want that I show you a linux machine with running software ?
It's not for discredit or others things ... my posts are there only to help and NOT TO DEMOSTRATE THAT SOMETHING IS BETTER OR WORSE.
« Last Edit: April 22, 2026, 07:30:32 pm by LeP »
Un Sistema per domarli, un IDE per trovarli, un codice per ghermirli e nel framework incatenarli.
An operating system to tame them, an IDE to find them, a code to catch them and in the framework chain them.

Thausand

  • Hero Member
  • *****
  • Posts: 545
@LeP:
You have miss point and not have address practical problem. Then there no use for discuss further. No problem, I stop reply.

I just say thank for myisjwj and make write for not tell all story for inform.
A docile goblin always follow HERMES.md

 

TinyPortal © 2005-2018