I'm playing with fphttp to create an API and comparing it to my usual stack of OpenBD (a CFML engine running on the JVM). I also have a simple Java Spring API to compare with.
I'm comparing them locally with wrk2 using:
wrk2 -t2 -c100 -d30s -R2000 --latency http://localhost:7080/status/
In the process I've tested both x86 and x64 builds, with and without threading, and had some odd results.
Going by the averages below, both architectures have about 5% less throughput when multithreading is set to true, and the x64 build is a tad slower than the x86 build.
Compared to my usual stack, both have about 26% less throughput.
I'm surprised that they're slower than the Java versions at all. That's a big disappointment, as I was hoping a native binary would outperform Java. I'm also surprised that threading caused a drop in throughput.
I ran each test a good handful of times, and the averages I got were:
OpenBD - 1995 requests/sec
Java Spring API - 1997 requests/sec
FPC fphttp, no threading
x86 - 1560 requests/sec
x64 - 1450 requests/sec
FPC fphttp, with threading
x86 - 1480 requests/sec
x64 - 1370 requests/sec
Tested on my MacBook Pro.
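
For anyone who wants to check the percentages, here is a minimal Python sketch that recomputes the relative differences from the averages listed above. The dictionary keys and function names are made up for illustration. The only outside fact it relies on is that wrk2's -R flag sets the target request rate, so 2000 requests/sec is the most any server could show in this test.

```python
# Average requests/sec copied from the results listed in the post.
TARGET_RATE = 2000  # the -R2000 value passed to wrk2 (offered load)

RESULTS = {
    "OpenBD": 1995,
    "Java Spring API": 1997,
    "fphttp x86, no threading": 1560,
    "fphttp x64, no threading": 1450,
    "fphttp x86, with threading": 1480,
    "fphttp x64, with threading": 1370,
}


def percent_drop(baseline, value):
    """Return how much lower `value` is than `baseline`, in percent."""
    return (baseline - value) / baseline * 100.0


def drops_versus(results, baseline_name):
    """Percent drop of every entry relative to one named baseline entry."""
    baseline = results[baseline_name]
    return {
        name: round(percent_drop(baseline, rps), 1)
        for name, rps in results.items()
    }


if __name__ == "__main__":
    # Throughput lost relative to the usual stack (OpenBD).
    for name, drop in drops_versus(RESULTS, "OpenBD").items():
        print(f"{name}: {drop}% below OpenBD")

    # Cost of enabling threading, per architecture.
    for arch in ("x86", "x64"):
        plain = RESULTS[f"fphttp {arch}, no threading"]
        threaded = RESULTS[f"fphttp {arch}, with threading"]
        print(f"{arch}: threading costs {percent_drop(plain, threaded):.1f}%")

    # How close each server got to the rate wrk2 was asked to send.
    for name, rps in RESULTS.items():
        print(f"{name}: {rps / TARGET_RATE * 100:.1f}% of target rate")
```

The last loop is there because the test fixes the offered load with -R2000. A result close to 2000 only shows that the server kept up with that rate, not how much headroom it has left.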