Author Topic: Tests results of several pascal based JSON parsers  (Read 2785 times)

sysrpl

  • Sr. Member
  • ****
  • Posts: 315
    • Get Lazarus
Tests results of several pascal based JSON parsers
« on: August 30, 2019, 10:20:04 am »
I've posted a new page that tests the speed and correctness of several pascal based JSON parsers.

https://www.getlazarus.org/json/tests/

In full disclosure, I am the author of the new open source JsonTools library, and even though my parser seems to be a big improvement over the other alternatives, my tests were not biased.

If anyone would like help in replicating the tests, let me know and I'll see what I can do.

Also, to be thorough, you should read through both the article I posted at the top of this message and my original page, which has been updated with more information. Both pages took some time to write, and I promise that if you read through them, some of your questions will be answered without having to ask others for help or insight.

howardpc

  • Hero Member
  • *****
  • Posts: 4144
Re: Tests results of several pascal based JSON parsers
« Reply #1 on: August 30, 2019, 12:59:56 pm »
Thank you for sharing this elegantly and compactly coded tool, which is most impressive.
However, I could not get the tests.lpr in the /tests folder to compile. It looks as though tests.lpr was designed for a different incarnation of jsontools.pas?

Thaddy

  • Hero Member
  • *****
  • Posts: 14197
  • Probably until I exterminate Putin.
Re: Tests results of several pascal based JSON parsers
« Reply #2 on: August 30, 2019, 01:24:21 pm »
Your parser seems not to handle 4 byte UTF16 codepoints... It seems limited to the 2 byte subset of UTF16 a.k.a. UCS2? I still have to test it, though, observation based on reading your code.
« Last Edit: August 30, 2019, 01:26:04 pm by Thaddy »
Specialize a type, not a var.

sysrpl

  • Sr. Member
  • ****
  • Posts: 315
    • Get Lazarus
Re: Tests results of several pascal based JSON parsers
« Reply #3 on: August 30, 2019, 03:29:40 pm »
Thaddy, it supports UTF8, but strictly adheres to the JSON spec, which says you can only use 4 character hex encodes with \u. The spec also says that it supports UTF8. What this means is that if you want a 4 byte unicode character, then don't try to encode it in a single hex escape; rather, just use the 4 byte character directly.

So to do this you would write:

{ "name": "𠜎" }

Instead of trying to write:

{ "name": "\u2070E" }

Does that make any sense?
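
To make the difference concrete, here is a minimal sketch using Python's standard json module (not JsonTools, whose handling of these cases is not shown in this thread). It assumes only the rule above, that \u takes exactly 4 hex digits. The helper name surrogate_pair is made up for illustration. It shows that "\u2070E" parses as U+2070 followed by a literal "E", while the raw character, or a pair of \u escapes forming a UTF-16 surrogate pair, yields U+2070E.

```python
import json

# U+2070E lies outside the Basic Multilingual Plane, so a single
# 4-digit \u escape cannot represent it.
CODE_POINT = 0x2070E


def surrogate_pair(code_point):
    """Split a code point above U+FFFF into (high, low) UTF-16 surrogates.

    Hypothetical helper for illustration only.
    """
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)


high, low = surrogate_pair(CODE_POINT)

# Three ways of writing the document: raw character, surrogate-pair
# escapes, and the mistaken 5-digit escape.
literal = '{"name": "%s"}' % chr(CODE_POINT)
escaped = '{"name": "\\u%04X\\u%04X"}' % (high, low)
wrong = '{"name": "\\u2070E"}'

print(escaped)  # {"name": "\uD841\uDF0E"}
print(json.loads(literal)["name"] == chr(CODE_POINT))   # True
print(json.loads(escaped)["name"] == chr(CODE_POINT))   # True
print(json.loads(wrong)["name"] == "\u2070" + "E")      # True: two characters
```

Whether a given Pascal parser accepts the surrogate-pair form is something to check against its own tests; the raw UTF8 character is the form recommended above.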

howardpc,

Sorry about that, I forgot to add tests.lpr back into the git repo and had only been adding the jsontools.pas unit. I've pushed the newer version of tests.lpr and it's fixed. Thanks for noticing.

Thaddy

  • Hero Member
  • *****
  • Posts: 14197
  • Probably until I exterminate Putin.
Re: Tests results of several pascal based JSON parsers
« Reply #4 on: August 30, 2019, 03:52:20 pm »
Thaddy, it supports UTF8, but strictly adheres to the JSON spec
That's a contradiction in terms: JSON specifies UTF16; see the final draft of ECMA-404, second revision, ad 4: JSON text.
It is also implied by the ECMAScript specification, which is also UTF16.

So even if UTF8 is supported, its format should be UTF16.
Specialize a type, not a var.

sysrpl

  • Sr. Member
  • ****
  • Posts: 315
    • Get Lazarus
Re: Tests results of several pascal based JSON parsers
« Reply #5 on: August 30, 2019, 04:11:53 pm »
From the current specification and also noted on Wikipedia on https://tools.ietf.org/html/rfc8259:

8.1.  Character Encoding

   JSON text exchanged between systems that are not part of a closed
   ecosystem MUST be encoded using UTF-8 [RFC3629].

   Previous specifications of JSON have not required the use of UTF-8
   when transmitting JSON text.  However, the vast majority of JSON-
   based software implementations have chosen to use the UTF-8 encoding,
   to the extent that it is the only encoding that achieves
   interoperability.

   Implementations MUST NOT add a byte order mark (U+FEFF) to the
   beginning of a networked-transmitted JSON text.  In the interests of
   interoperability, implementations that parse JSON texts MAY ignore
   the presence of a byte order mark rather than treating it as an
   error.

From https://www.json.org/ :

string -> \ -> u -> 4 hex digits

UTF-8 allows for decoding of 4 byte characters.
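
As a small illustration of both points (4 byte characters in UTF-8, and the optional byte order mark from section 8.1), here is a hedged sketch using only Python's standard library. The function name parse_json_bytes is hypothetical and is not part of any library discussed here.

```python
import json

# The character from the earlier example, U+2070E, takes 4 bytes in UTF-8.
char = chr(0x2070E)
encoded = char.encode("utf-8")
print(len(encoded), encoded.hex())  # 4 f0a09c8e


def parse_json_bytes(data):
    """Decode UTF-8 JSON bytes, ignoring a leading byte order mark if present.

    The "utf-8-sig" codec strips one leading U+FEFF and otherwise
    behaves like plain UTF-8, matching the MAY clause in RFC 8259 8.1.
    """
    return json.loads(data.decode("utf-8-sig"))


document = ('{"name": "%s"}' % char).encode("utf-8")
with_bom = "\ufeff".encode("utf-8") + document

print(parse_json_bytes(document) == {"name": char})  # True
print(parse_json_bytes(with_bom) == {"name": char})  # True
```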

ASBzone

  • Hero Member
  • *****
  • Posts: 678
  • Automation leads to relaxation...
    • Free Console Utilities for Windows (and a few for Linux) from BrainWaveCC
Re: Tests results of several pascal based JSON parsers
« Reply #6 on: August 30, 2019, 08:14:27 pm »
Thanks for the code and also the test page. I found it intriguing. There is one typo that I noticed on the second test:

"100,00 times."

I expect this to be: "100,000 times."
-ASB: https://www.BrainWaveCC.com/

Lazarus v2.2.7-ada7a90186 / FPC v3.2.3-706-gaadb53e72c
(Windows 64-bit install w/Win32 and Linux/Arm cross-compiles via FpcUpDeluxe on both instances)

My Systems: Windows 10/11 Pro x64 (Current)

sysrpl

  • Sr. Member
  • ****
  • Posts: 315
    • Get Lazarus
Re: Tests results of several pascal based JSON parsers
« Reply #7 on: August 30, 2019, 09:57:41 pm »
100,000 fixed. Thanks for noticing.

Thaddy

  • Hero Member
  • *****
  • Posts: 14197
  • Probably until I exterminate Putin.
Re: Tests results of several pascal based JSON parsers
« Reply #8 on: September 03, 2019, 10:06:49 am »
From the current specification and also noted on Wikipedia on https://tools.ietf.org/html/rfc8259:
The only reliable source is 8.2, not 8.1, and it is not accepted.

Furthermore, it will not be accepted because of what is described in 8.1. Usually one is a lot smarter than giving in to inappropriate use of, or deviation from, underlying standards.
It would be a first: it has no technical merit.

To paraphrase:
We are not all elderly British that try to destroy their children's future.., Some English that bury heads in the sand about Brexit and listening to a blond with dual nationality... - not Swedish   8-) -(temporarily on  topic...today...  8-)  I am still a political scientist too..  :-X )

Because he can't https://www.nytimes.com/2017/02/08/world/europe/britain-boris-johnson-renounces-american-citizenship.html according to a certain average golfer...
And so he proves himself an opportunist...which we can translate to certain prediction algorithms... (Yes, it IS on topic... just.... ;D ;D :D :'( )

Thank you to everyone that is actually following my reasoning... You do not have to agree..
« Last Edit: September 03, 2019, 11:13:34 am by Thaddy »
Specialize a type, not a var.

lainz

  • Hero Member
  • *****
  • Posts: 4460
    • https://lainz.github.io/
Re: Tests results of several pascal based JSON parsers
« Reply #9 on: December 10, 2019, 01:00:31 am »
Thanks, hope this can speed up our application.  :)

sysrpl

  • Sr. Member
  • ****
  • Posts: 315
    • Get Lazarus
Re: Tests results of several pascal based JSON parsers
« Reply #10 on: December 10, 2019, 06:46:00 am »
Thanks, hope this can speed up our application.  :)
Original author here. Let me know if my library works better for you.

avra

  • Hero Member
  • *****
  • Posts: 2514
    • Additional info
Re: Tests results of several pascal based JSON parsers
« Reply #11 on: December 10, 2019, 10:40:14 am »
It would be nice to add this lib as a package and make it available through OPM.

The library is dual-licensed under GPL3 and LGPL3. As I understand it, LGPL3 forces derivative work (including modifications or anything statically linked to the library) to be redistributed only under LGPL3. Simply including your unit results in static linking by default, so it seems that my application would then be forced to be LGPL3? You do not allow linking exceptions like FPC and LAZ, do you?

Anyway thanks for sharing the library.  ;)
ct2laz - Conversion between Lazarus and CodeTyphon
bithelpers - Bit manipulation for standard types
pasettimino - Siemens S7 PLC lib

lainz

  • Hero Member
  • *****
  • Posts: 4460
    • https://lainz.github.io/
Re: Tests results of several pascal based JSON parsers
« Reply #12 on: December 11, 2019, 12:54:47 am »
Thanks, hope this can speed up our application.  :)
Original author here. Let me know if my library works better for you.

Thanks, really impressive, now it is super fast!!! =)

 
