By the way, are you able to run the examples? Have you encountered any usage issues?
It took me some time to get familiar with your repository.
The resulting build & run log file is attached.
The log shows that it failed on the examples that make use of portaudio. That is because I forgot to install the libportaudio dev package before running the tests.

BTW, regarding the MLS model(s): I experimented a bit more with them, and when you provide longer sequences of text, the model seems to pick up the pronunciation after several words/sentences.
@VisualLab: I do not know for sure what the culprit might be there (I am also new to sherpa-onnx), but better voice training usually yields better results. How you can do that can, for example, be seen in
this video (that channel is interesting anyway if you are interested in this kind of software).
That speed and volume differ might be caused by training as well. In case you did not realize it already: some languages are spoken much faster/louder than you might be used to (or slower/softer, in case you are used to a fast-paced language). Some engines handle that better (e.g. automatically) than others (where you have to change things manually).
The API allows setting details of the voice model/output that cannot be set via the command-line programs.