I am trying to do large-scale vector- and matrix-based calculations with GPU acceleration: not only matrix operations, but also element-wise calculations such as square root, exponential, and trigonometric functions. I am looking for a good library with a Pascal binding.
"Good" would mean easy switching between CPU and GPU, a large operation set, active support, and speed. I have done a lot of work with TensorFlow (which can do non-AI work as well), but building graphs, running sessions, and so on is extremely complicated, and the C binding (and the Pascal binding based on it) is not actively supported. I have been recommended JAX (but as far as I can see it only has a Python binding), the CUDA libraries (very low level), and ArrayFire (which has C and Fortran bindings that could be turned into a Pascal binding, but I have not found anyone who has done it).
(Everyone I talk to recommends switching to Python, but one does not easily leave a long-term love :-) )
Any better ideas?