
Author Topic: RTL patch to speedup 350% server app (tested on I7 quad core)  (Read 1490 times)

guest59697

  • Guest
RTL patch to speedup 350% server app (tested on I7 quad core)
« on: February 16, 2019, 08:36:22 pm »
Hello,

I made patches for the Win64 memory manager (MM) and RTL that make Delphi Windows server apps fly.

You can check the thread for this cool library: https://github.com/winddriver/Delphi-Cross-Socket/issues/39#

Rio 10.3.1 default

Server Software: CrossHttpServer/2.0
Server Hostname: 192.168.1.166
Server Port: 8000
Document Path: /hello
Document Length: 11 bytes
Concurrency Level: 100
Time taken for tests: 3.703 seconds
Complete requests: 100000
Failed requests: 0
Keep-Alive requests: 100000
Total transferred: 14200000 bytes
HTML transferred: 1100000 bytes
Requests per second: **27002.22 [#/sec] (mean)**
Time per request: 3.703 [ms] (mean)
Time per request: 0.037 [ms] (mean, across all concurrent requests)
Transfer rate: 3744.45 [Kbytes/sec] received

 

Rio 10.3.1 with RDP patches

Server Software: CrossHttpServer/2.0
Server Hostname: 192.168.1.166
Server Port: 8000
Document Path: /hello
Document Length: 11 bytes
Concurrency Level: 100
Time taken for tests: 1.094 seconds
Complete requests: 100000
Failed requests: 0
Keep-Alive requests: 100000
Total transferred: 14200000 bytes
HTML transferred: 1100000 bytes
Requests per second: **91442.20 [#/sec] (mean)**
Time per request: 1.094 [ms] (mean)
Time per request: 0.011 [ms] (mean, across all concurrent requests)
Transfer rate: 12680.46 [Kbytes/sec] received

Please check my Pos() routines; they should be OK.

If you use my patches, please put a link to my website [www.dellapasqua.com](http://www.dellapasqua.com). If you like, please also forward me some internet-related jobs (full-stack, cloud, embedded, SQL); I'm glad to collaborate with smart people and Delphi companies.
Thank you

Roberto Della Pasqua

---
Btw, dear FPC users: you should adapt these sources to your architecture, or tell me if I should do it.
Btw2, I also have a SIMD zlib that is 5x faster than the system gzip library; I'll post it next time.

guest59697

  • Guest
Re: RTL patch to speedup 350% server app (tested on I7 quad core)
« Reply #1 on: February 17, 2019, 09:37:30 am »
UPDATED with Intel TBB: from 27k op/sec to 98k op/sec on a quad-core CPU

Server Software: CrossHttpServer/2.0
Server Hostname: 192.168.1.166
Server Port: 8000
Document Path: /hello
Document Length: 11 bytes
Concurrency Level: 100
Time taken for tests: 1.015 seconds
Complete requests: 100000
Failed requests: 0
Keep-Alive requests: 100000
Total transferred: 14200000 bytes
HTML transferred: 1100000 bytes
Requests per second: **98514.50** [#/sec] (mean)
Time per request: 1.015 [ms] (mean)
Time per request: 0.010 [ms] (mean, across all concurrent requests)
Transfer rate: 13661.19 [Kbytes/sec] received

So:
default: 27K/s
Win2016 heap: 91K/s
**Intel TBB: 98K/s**
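
For reference, the speedup implied by the three "Requests per second" figures above can be checked with a few lines of arithmetic. This is only a minimal sketch in Python and is not part of the patches; the function names are made up for illustration, and the numbers are copied from the ab reports in this thread.

```python
# Requests per second (mean) as reported by ab in this thread.
RESULTS = {
    "default": 27002.22,
    "win2016 heap": 91442.20,
    "intel TBB": 98514.50,
}


def speedup(baseline, patched):
    """Return how many times faster `patched` is than `baseline`."""
    return patched / baseline


def percent_increase(baseline, patched):
    """Return the throughput gain over `baseline`, in percent."""
    return (patched - baseline) / baseline * 100.0


if __name__ == "__main__":
    base = RESULTS["default"]
    for name, rps in RESULTS.items():
        # Print each configuration relative to the default build.
        print(f"{name}: {speedup(base, rps):.2f}x, "
              f"{percent_increase(base, rps):+.1f}%")
```

With these figures the patched builds come out at roughly 3.4x and 3.6x the default throughput, so the "350%" in the title reads best as "about 3.5 times as fast".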

But put RDPMM64 as the first unit in the project source.

guest59697

  • Guest
Re: RTL patch to speedup 350% server app (tested on I7 quad core)
« Reply #2 on: February 17, 2019, 11:28:06 am »
Please comment out (rem) the Pos() patches: I have found a bug and will correct it soon.

asdf121

  • New member
  • *
  • Posts: 35
Re: RTL patch to speedup 350% server app (tested on I7 quad core)
« Reply #3 on: February 17, 2019, 02:58:29 pm »
Looks like you used Intel Integrated Performance Primitives (Intel IPP), which uses SIMD and AVX instructions.
Sorry, but your first post sounds as if you created something special on your own, when you just used a proprietary library. And your examples are Windows only...  ::)

Next time, please write your own Pos() function in ASM code; it can then be integrated into FPC and used if the CPU supports SIMD and AVX ;)
And for MM, there is a FastMM version with AVX support.

guest59697

  • Guest
Re: RTL patch to speedup 350% server app (tested on I7 quad core)
« Reply #4 on: February 17, 2019, 05:51:52 pm »
You are right.
I have noted the IPP and TBB versions in the header.
It's a matter of a good configuration for exporting the libraries, a sound theoretical approach, and trying to write a state-of-the-art Delphi Pascal wrapper over them.
Those Intel libraries are a strong, industry-proven foundation, but I understand that open-source projects such as FPC want to be entirely free and self-dependent.
Btw, I'm not smart enough to write ASM SIMD routines! :D
Btw2, I have read that fastmm-avx has trouble under some circumstances, not yet solved (I have tested a dozen MMs, and TBB turned out to be the quickest).
Btw3, OFF-TOPIC: have you seen mseLang's excellent LLVM Pascal compiler? Why not try using it? We could build a group.
Btw4, I'm also finishing the Linux version (x64 only, of course).

guest59697

  • Guest