A new design for a JSON Parser

Gustavo 'Gus' Carreno

Hero Member
Posts: 1120
Professional amateur ;-P

Re: A new design for a JSON Parser

« Reply #45 on: July 26, 2021, 07:59:07 pm »

Hey A.Bouchez,

Quote from: abouchez on July 25, 2021, 06:49:39 pm

If you want a fast JSON parser for FPC, you may try what mORMot 2 offers.

Is there a link you can provide that gives a simple example on how to start with using mORMot 2's JSON parser only?
Something that will give you a simple set of instructions to only install the parser and not have to depend on the entirety of mORMot's code.

I would be eternally grateful for that!!

Cheers,
Gus

Lazarus 3.99(main) FPC 3.3.1(main) Ubuntu 23.10 64b Dark Theme
Lazarus 3.0.0(stable) FPC 3.2.2(stable) Ubuntu 23.10 64b Dark Theme
http://github.com/gcarreno

Okoba

Hero Member
Posts: 533

Re: A new design for a JSON Parser

« Reply #46 on: July 27, 2021, 10:33:37 am »

To get you started:
- Use mORMot2, and it has a package for Lazarus: https://github.com/synopse/mORMot2
- Remember that some methods were renamed in version 2; read the comments, which usually tell you what to use instead
- Always read the comments, they have instructions
- Start with the variant version, as it is quick, easy, and still very fast
- For more structured code, use the record or class way
- For the record and class ways, you will hit some issues when you use custom types: you will need to register them like I did, or register custom events and other stuff. Read the blog and docs, and search the forum if you need to. Almost all such questions have already been answered.
- There are more JSON methods for arrays (e.g. JsonArrayCount) and custom field reading. Read the code for more info, but you probably will not need them for daily stuff.

- Forum: https://synopse.info/forum/viewforum.php?id=2
- Docs: https://synopse.info/files/html/Synopse%20mORMot%20Framework%20SAD%201.18.html#TITLE_237
- Blog: https://blog.synopse.info/?tag/JSON/

Here is a sample:

Code: Pascal

program project1;
 
{$mode objfpc}{$H+}
 
uses
  mormot.core.base,
  mormot.core.text,
  mormot.core.json,
  mormot.core.variants;
 
type
  TTestClass = class(TSynAutoCreateFields)
  private
    FX: Integer;
    FY: String;
    FZ: TBooleanDynArray;
  published
    property X: Integer read FX write FX;
    property Y: String read FY write FY;
    property Z: TBooleanDynArray read FZ write FZ;
  end;
 
  TTestRecord = packed record //Need to be packed
    X: Integer;
    Y: String;
    Z: TBooleanDynArray;
  end;
const
  __TTestRecord = 'X: Integer;  Y: String; Z: TBooleanDynArray';
 
  procedure Decode;
  var
    S: RawUtf8;
    J: Variant;
    V: array[0..1] of TValuePUTF8Char;
    C: TTestClass;
    R: TTestRecord;
  begin
    S := '{"X":1,Y:"Test",Z:[false,true]}';
    J := TDocVariant.NewJson(S);
 
    //Variant way
    WriteLn(J.X);
    WriteLn(J.Y);
    WriteLn(J.Z._(0));
 
    //TDocVariantData way
    WriteLn(TDocVariantData(J).S['Y']);
 
    //ObjectLoadJson Way
    C := TTestClass.Create;
    WriteLn(ObjectLoadJson(C, S));
    WriteLn(C.X);
    WriteLn(C.Z[0]);
    C.Free;
 
    RecordLoadJson(R, S, TypeInfo(TTestRecord));
    WriteLn(R.X);
    WriteLn(R.Z[0]);
 
    //JsonDecode way (Warning: Inplace and changes S)
    JsonDecode(S, ['X', 'Y'], @V);
    WriteLn(V[0].ToCardinal);
  end;
 
  procedure Encode;
  var
    C: TTestClass;
    R: TTestRecord;
  begin
    //ObjectToJson way
    C := TTestClass.Create;
    C.X := 1;
    C.Y := 'Test';
    C.Z := [False, True];
    WriteLn(ObjectToJson(C));
    C.Free;
 
    //RecordSaveJson way
    R.X := 1;
    R.Y := 'Test';
    R.Z := [False, True];
    WriteLn(RecordSaveJson(R, TypeInfo(TTestRecord)));
 
    //JsonEncode way
    WriteLn(JsonEncode(['X', 1, 'Y', 'Test', 'Z', '[', False, True, ']']));
  end;
 
begin
  //Only needed once
  TRttiJson.RegisterFromText(TypeInfo(TTestRecord), __TTestRecord, [], []);
 
  Decode;
  Encode;
  ReadLn;
end.

Gustavo 'Gus' Carreno

Hero Member
Posts: 1120
Professional amateur ;-P

Re: A new design for a JSON Parser

« Reply #47 on: July 27, 2021, 04:42:26 pm »

Hey Okoba,

Quote from: Okoba on July 27, 2021, 10:33:37 am

To get you started:
[...]
- Forum: https://synopse.info/forum/viewforum.php?id=2
- Docs: https://synopse.info/files/html/Synopse%20mORMot%20Framework%20SAD%201.18.html#TITLE_237
- Blog: https://blog.synopse.info/?tag/JSON/

This is freakin AWESOME, thank you SOOOO much Okoba!!!

I'll pore over all the code and blog posts you provided to wrap my head around this different paradigm of doing JSON.

I have to admit that, from the code you provided, it is quite a paradigm shift from the approach that fpjson takes.

Again, thank you SOO much for all the detailed info!!

Cheers,
Gus

Okoba

Hero Member
Posts: 533

Re: A new design for a JSON Parser

« Reply #48 on: July 27, 2021, 04:47:11 pm »

Welcome!
If you like the fpjson approach, you may like the Variant way.

Gustavo 'Gus' Carreno

Hero Member
Posts: 1120
Professional amateur ;-P

Re: A new design for a JSON Parser

« Reply #49 on: July 27, 2021, 04:52:51 pm »

Hey Okoba,

Quote from: Okoba on July 27, 2021, 04:47:11 pm

If you like the fpjson approach, you may like the Variant way.

It's not that I like it per se. It's the fact that it's the only one I've been exposed to up until now. But I'll keep it in mind.

I don't mind change and I'm actually really curious to learn this new approach, so again, many thanks for giving me a guide on how to tackle this new challenge.

Cheers,
Gus

alpine

Hero Member
Posts: 1064

Re: A new design for a JSON Parser

« Reply #50 on: July 27, 2021, 09:32:32 pm »

Quote from: Okoba on July 27, 2021, 10:33:37 am

To get you started:
- Use mORMot2, and it has a package for Lazarus: https://github.com/synopse/mORMot2
- Remember that some methods were renamed in version 2; read the comments, which usually tell you what to use instead
- Always read the comments, they have instructions
- Start with the variant version, as it is quick, easy, and still very fast

May I politely ask what the advantages are of using a Variant instead of fpjson.TJSONData and its descendants?

Quote from: Okoba on July 27, 2021, 10:33:37 am

- For more structured code, use the record or class way
- For the record and class ways, you will hit some issues when you use custom types: you will need to register them like I did, or register custom events and other stuff.
*snip*

What is the point when we have the fine fpjsonrtti unit with TJSONStreamer and TJSONDeStreamer?

Sorry for being off topic, but I don't really see a big difference.
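
For readers who have not used fpjsonrtti, here is a minimal sketch of the TJSONStreamer/TJSONDeStreamer round trip being referred to. TTestItem and RoundTripMatches are made-up names for illustration only; the streamer calls are the stock FPC ones as I understand them, and the exact formatting of the JSON text it prints may differ between FPC versions.

```pascal
program StreamerSketch;

{$mode objfpc}{$H+}

uses
  Classes, fpjson, fpjsonrtti;

type
  // Hypothetical class: TPersistent descendants get RTTI for published properties
  TTestItem = class(TPersistent)
  private
    FX: Integer;
    FY: String;
  published
    property X: Integer read FX write FX;
    property Y: String read FY write FY;
  end;

// Serializes one object to JSON, loads the JSON into a second object,
// and reports whether both published properties survived the trip.
function RoundTripMatches(AX: Integer; const AY: String): Boolean;
var
  Source, Target: TTestItem;
  Streamer: TJSONStreamer;
  DeStreamer: TJSONDeStreamer;
  Json: TJSONStringType;
begin
  Source := TTestItem.Create;
  Target := TTestItem.Create;
  Streamer := TJSONStreamer.Create(nil);
  DeStreamer := TJSONDeStreamer.Create(nil);
  try
    Source.X := AX;
    Source.Y := AY;
    Json := Streamer.ObjectToJSONString(Source);
    WriteLn(Json);
    DeStreamer.JSONToObject(Json, Target);
    Result := (Target.X = AX) and (Target.Y = AY);
  finally
    DeStreamer.Free;
    Streamer.Free;
    Target.Free;
    Source.Free;
  end;
end;

begin
  WriteLn(RoundTripMatches(1, 'Test'));
end.
```

This only covers classes with published properties; it says nothing about records or speed, which are the points raised in favour of mORMot elsewhere in the thread.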

"I'm sorry Dave, I'm afraid I can't do that."
—HAL 9000

engkin

Hero Member
Posts: 3112

Re: A new design for a JSON Parser

« Reply #51 on: July 27, 2021, 09:58:19 pm »

Quote from: y.ivanov on July 27, 2021, 09:32:32 pm

I don't really see a big difference.

I am also interested. According to reply #43, it is 13 to 50 times faster. Would be nice to see the benchmark code.

Okoba

Hero Member
Posts: 533

Re: A new design for a JSON Parser

« Reply #52 on: July 28, 2021, 04:42:50 am »

The Variant version is faster, not so much because it is a variant, but because of the underlying JSON parsing of mORMot. Being a variant makes it simpler to use, for some tastes. If you need more structured code, you should use the record or class way.
I do not have much experience with TJSONStreamer, but the mORMot version has options like:
- Auto creating and destroying fields (if you inherit from TSynAutoCreateFields)
- Support for records
- Many more options for handling custom types, enums, comments, keyword names in JSON (type, class)

When choosing between them, the key point is this: if you need more speed, more options, or Delphi support, then mORMot seems the better option.

The benchmark code:
https://github.com/synopse/mORMot2/blob/087f740c577a0e38f83f8193874a343ed789fb46/test/test.core.data.pas#L2840

engkin

Hero Member
Posts: 3112

Re: A new design for a JSON Parser

« Reply #53 on: July 28, 2021, 04:51:02 am »

Quote from: Okoba on July 28, 2021, 04:42:50 am

The benchmark code:
https://github.com/synopse/mORMot2/blob/087f740c577a0e38f83f8193874a343ed789fb46/test/test.core.data.pas#L2840

Thank you.

abouchez

Full Member
Posts: 111

Re: A new design for a JSON Parser

« Reply #54 on: July 28, 2021, 09:44:01 am »

I tried to include jsontools in the benchmark.
I downloaded the current version from https://github.com/sysrpl/JsonTools

Sadly, this library doesn't seem very well tested.
TryParse('["XS\"\"\"."]') fails, whereas this is valid JSON.

After a quick fix, I ran the benchmark tests:

Code:

  Some numbers on FPC 3.2 + Linux x86_64:
  - JSON benchmark: 100,299 assertions passed  810.30ms
     StrLen() in 820us, 23.3 GB/s
     IsValidUtf8(RawUtf8) in 1.46ms, 13 GB/s
     IsValidUtf8(PUtf8Char) in 2.23ms, 8.5 GB/s
     IsValidJson(RawUtf8) in 27.23ms, 719.8 MB/s
     IsValidJson(PUtf8Char) in 25.87ms, 757.6 MB/s
     JsonArrayCount(P) in 25.26ms, 775.9 MB/s
     JsonArrayCount(P,PMax) in 25.04ms, 783 MB/s
     JsonObjectPropCount() in 8.40ms, 1.3 GB/s
     TDocVariant in 118.81ms, 165 MB/s
     TDocVariant dvoInternNames in 145.08ms, 135.1 MB/s
     TOrmTableJson GetJsonValues in 22.88ms, 376.8 MB/s (write)
     TOrmTableJson expanded in 41.26ms, 475.1 MB/s
     TOrmTableJson not expanded in 21.44ms, 402.2 MB/s
     DynArrayLoadJson in 62.02ms, 316 MB/s
     fpjson in 79.36ms, 24.7 MB/s
     jsontools in 51.41ms, 38.1 MB/s
     SuperObject in 187.79ms, 10.4 MB/s

So, going by the MB/s figures above, mORMot 2 DynArrayLoadJson() is about 8 times faster than jsontools, and TDocVariant is more than 4 times faster.
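
For anyone who wants to check the ratios against the table, here is the arithmetic as a small sketch. SpeedRatio is a made-up helper name; the MB/s figures are copied from the benchmark output above and nothing else is assumed.

```pascal
program ThroughputRatios;

{$mode objfpc}{$H+}

// Ratio of two throughput figures given in the same unit (MB/s here)
function SpeedRatio(FasterMBs, SlowerMBs: Double): Double;
begin
  Result := FasterMBs / SlowerMBs;
end;

const
  // Figures taken from the benchmark output, in MB/s
  DynArrayLoadJsonMBs = 316.0;
  TDocVariantMBs = 165.0;
  JsonToolsMBs = 38.1;
  FpJsonMBs = 24.7;

begin
  WriteLn('DynArrayLoadJson vs jsontools: ', SpeedRatio(DynArrayLoadJsonMBs, JsonToolsMBs):0:1); // 8.3
  WriteLn('TDocVariant vs jsontools:      ', SpeedRatio(TDocVariantMBs, JsonToolsMBs):0:1);      // 4.3
  WriteLn('DynArrayLoadJson vs fpjson:    ', SpeedRatio(DynArrayLoadJsonMBs, FpJsonMBs):0:1);    // 12.8
  WriteLn('TDocVariant vs fpjson:         ', SpeedRatio(TDocVariantMBs, FpJsonMBs):0:1);         // 6.7
end.
```

The comparison uses MB/s rather than the millisecond column, since the elapsed times in the table do not all cover the same amount of data.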

The fix is a dirty goto (the fastest to write):

Code: Pascal

  if C^ = '"'  then
  begin
    repeat
fix:  Inc(C);
      if C^ = '\' then
      begin
        Inc(C);
        if C^ = '"' then
          goto fix
        else if C^ = 'u' then
 

I would not use a library with such limited testing, anyway.
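
To make the failing case concrete, here is a minimal standalone sketch of how a scanner can step over a JSON string literal while honouring backslash escapes. This is not the JsonTools code and not the mORMot code; SkipJsonString is a made-up name, and the function only illustrates why ["XS\"\"\"."] is valid JSON: every \" pair stays inside the string, and only the unescaped quote closes it.

```pascal
program EscapedQuoteSketch;

{$mode objfpc}{$H+}

// Returns the index just past the closing quote of the JSON string literal
// that opens at S[Start], or 0 if the literal is malformed or unterminated.
function SkipJsonString(const S: String; Start: Integer): Integer;
var
  I, K: Integer;
begin
  Result := 0;
  if (Start < 1) or (Start > Length(S)) or (S[Start] <> '"') then
    Exit;
  I := Start + 1;
  while I <= Length(S) do
  begin
    case S[I] of
      '"':
        begin
          Result := I + 1; // unescaped quote: end of the literal
          Exit;
        end;
      '\':
        begin
          Inc(I); // look at the escaped character
          if I > Length(S) then
            Exit;
          case S[I] of
            '"', '\', '/', 'b', 'f', 'n', 'r', 't':
              ; // single-character escape, nothing more to consume
            'u':
              for K := 1 to 4 do
              begin
                // \u must be followed by exactly four hex digits
                Inc(I);
                if (I > Length(S)) or
                   not (S[I] in ['0'..'9', 'a'..'f', 'A'..'F']) then
                  Exit;
              end;
          else
            Exit; // unknown escape sequence
          end;
        end;
      #0..#31:
        Exit; // raw control characters are not allowed inside a string
    end;
    Inc(I);
  end;
end;

begin
  // The string opens at index 2 and closes at index 12, so this prints 13
  WriteLn(SkipJsonString('["XS\"\"\"."]', 2));
  // Unterminated literal: prints 0
  WriteLn(SkipJsonString('"abc', 1));
end.
```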

alpine

Hero Member
Posts: 1064

Re: A new design for a JSON Parser

« Reply #55 on: July 28, 2021, 09:53:51 am »

@Okoba,
Thank you for the info.

Quote from: Okoba on July 28, 2021, 04:42:50 am

The Variant version is faster, not so much because it is a variant, but because of the underlying JSON parsing of mORMot. Being a variant makes it simpler to use, for some tastes.

By "simpler" I guess you mean writing J.X instead of C.Integers['X']. Both of them require a lookup, but whereas the former depends on some compiler magic to skip the quotes, the latter at least has a run-time type check. Either way you will need a Find('X') to ensure the attribute is present, so that there won't be a "bang".

So the latter is more to my taste; it's just not so crafty.
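
A minimal sketch of the fpjson side of that comparison, for readers following along: look the member up with Find, check its type, and only then read it. ReadIntegerOr is a made-up helper name; GetJSON, Find, JSONType and AsInteger are the stock fpjson calls as far as I know, and to my knowledge Integers['X'] raises an exception when the member is absent, which is the "bang" mentioned above.

```pascal
program FpjsonLookupSketch;

{$mode objfpc}{$H+}

uses
  fpjson, jsonparser; // jsonparser registers the parser used by GetJSON

// Parses Json, then returns the integer stored under Name,
// or Default when the member is missing or is not a number.
function ReadIntegerOr(const Json, Name: String; Default: Integer): Integer;
var
  Data, Item: TJSONData;
begin
  Result := Default;
  Data := GetJSON(Json);
  try
    if (Data = nil) or (Data.JSONType <> jtObject) then
      Exit;
    // Find returns nil instead of raising when the member is absent
    Item := TJSONObject(Data).Find(Name);
    if (Item <> nil) and (Item.JSONType = jtNumber) then
      Result := Item.AsInteger;
  finally
    Data.Free;
  end;
end;

const
  Sample = '{"X":1,"Y":"Test","Z":[false,true]}';

begin
  WriteLn(ReadIntegerOr(Sample, 'X', -1));       // 1
  WriteLn(ReadIntegerOr(Sample, 'Missing', -1)); // -1: member absent
  WriteLn(ReadIntegerOr(Sample, 'Y', -1));       // -1: member is a string
end.
```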

Quote from: Okoba on July 28, 2021, 04:42:50 am

If you need more structured code, you should use the record or class way.
I do not have much experience with TJSONStreamer, but the mORMot version has options like:
- Auto creating and destroying fields (if you inherit from TSynAutoCreateFields)

The mere existence of TSynAutoCreateFields is something that worries me. Hacking with the RTTI is a bummer; how can it be justified? What if the RTTI layout changes? Is it portable?

Quote from: Okoba on July 28, 2021, 04:42:50 am

- Support for records
- Many more options for handling custom types, enums, comments, keyword names in JSON (type, class)
*snip*

IMHO that framework tends to shift the Pascal paradigm towards something dynamically typed like, e.g., Python, which is something I don't agree with. But that is my personal opinion.

« Last Edit: July 28, 2021, 09:55:32 am by y.ivanov »

abouchez

Full Member
Posts: 111

Re: A new design for a JSON Parser

« Reply #56 on: July 28, 2021, 10:11:17 am »

Some hints:
- late binding with the mORMot custom variant type is just one way of using it - you are not required to use late binding - and in fact, I prefer to use the TDocVariantData record directly and only typecast it into a variant when I want to transmit it as such;
- the mORMot custom variant type is just a convenient way to store some object/array document, with built-in JSON support and automatic memory management by the compiler, like any variant or record; the mORMot ORM also uses such document variants to store any JSON/BSON in a SQL/NoSQL database, or to handle dynamic content from client/server SOA using interfaces; on Delphi (and I hope with fpdebug soon) you can even see the JSON content when you inspect any such variant value in the debugger - much appreciated, and impossible to do with a class or an interface;
- the more "pascalish" way is to use records and arrays of records with mORMot JSON serialization: there will be no lookup, minimal memory consumption, and best performance (>300MB/s instead of 24MB/s for fpjson), with no compiler magic - just plain efficient pascal code;
- mORMot doesn't change the RTTI - TSynAutoCreateFields is just a way to automatically create the nested published class instances of a class, which is very handy in some cases; what mORMot does is cache the RTTI for efficiency, in a cross-platform way.

« Last Edit: July 28, 2021, 10:24:13 am by abouchez »

abouchez

Full Member
Posts: 111

Re: A new design for a JSON Parser

« Reply #57 on: July 28, 2021, 10:22:53 am »

The \" parsing issue I found has been known since October 2019.
https://github.com/sysrpl/JsonTools/issues/11

But the https://github.com/sysrpl/JsonTools/issues/12 decimal dot problem is even more concerning.

« Last Edit: July 28, 2021, 10:36:34 am by abouchez »

alpine

Hero Member
Posts: 1064

Re: A new design for a JSON Parser

« Reply #58 on: July 28, 2021, 10:40:17 am »

Quote from: abouchez on July 28, 2021, 10:11:17 am

*snip*
- mORMot doesn't change the RTTI - TSynAutoCreateFields is just a way to automatically create the nested published class instances of a class, which is very handy in some cases; what mORMot does is cache the RTTI for efficiency, in a cross-platform way.

I see.
You're building it, not changing it. Does it make a difference?

in mormot.core.json:

Code: Pascal

procedure AutoCreateFields(ObjectInstance: TObject);
var
  rtti: TRttiJson;
  n: integer;
  p: ^PRttiCustomProp;
begin
  // inlined ClassPropertiesGet
  rtti := PPointer(PPAnsiChar(ObjectInstance)^ + vmtAutoTable)^;
  if (rtti = nil) or
     not (rcfAutoCreateFields in rtti.Flags) then
    rtti := DoRegisterAutoCreateFields(ObjectInstance);
  p := pointer(rtti.fAutoCreateClasses);
  if p = nil then
    exit;
  // create all published class fields
  n := PDALen(PAnsiChar(p) - _DALEN)^ + _DAOFF; // length(AutoCreateClasses)
  repeat
    with p^^ do
      PPointer(PAnsiChar(ObjectInstance) + OffsetGet)^ :=
        TRttiJson(Value).fClassNewInstance(Value);
    inc(p);
    dec(n);
  until n = 0;
end;

and a lot of internal definitions in mormot.core.base.pas:

Code: Pascal

/// cross-compiler negative offset to TDynArrayRec.high/length field
  // - to be used inlined e.g. as
  // ! PDALen(PAnsiChar(Values) - _DALEN)^ + _DAOFF
  // - both FPC and Delphi uses PtrInt/NativeInt for dynamic array high/length
  _DALEN = SizeOf(TDALen);
 
  /// cross-compiler adjuster to get length from TDynArrayRec.high/length field
  _DAOFF = {$ifdef FPC} 1 {$else} 0 {$endif};
  
  /// cross-compiler negative offset to TDynArrayRec.refCnt field
  // - to be used inlined e.g. as PRefCnt(PAnsiChar(Values) - _DAREFCNT)^
  _DAREFCNT = Sizeof(TRefCnt) + _DALEN;
 
 // ... and a lot more FPC/Delphi internal layouts ... 
 

I believe those defs aren't for patching, right?

abouchez

Full Member
Posts: 111

Re: A new design for a JSON Parser

« Reply #59 on: July 28, 2021, 10:56:05 am »

> You're building it, not changing it. Does it make a difference?

I am not sure I understand what you mean.
We are not building it, we are using it.
In AutoCreateFields() we don't build anything; we just cache the RTTI and the classes of its published properties the first time we use this class.
Then fClassNewInstance() is a very efficient way of creating each needed class instance, with the proper virtual constructor if needed.

The FPC internal layouts are used to bypass the RTL when it makes a difference.
See mormot.core.rtti.pas about how we use the official typinfo unit as source, but encapsulate it into a Delphi/FPC compatible wrapper, and also introduce some RTTI cache as TRttiCustom/TRttiJson classes, with ready-to-use methods and settings.
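
As a rough illustration of what "using the internal layout" means for the PDALen/_DAOFF expression quoted earlier, here is a small FPC-only sketch that reads the "high" field stored just before a dynamic array's first element and compares it with Length(). StoredLength and HeaderMatchesLength are made-up names, and the layout (a PtrInt "high" value right before the data, hence the + 1) is an implementation detail taken from the definitions quoted above, not a documented guarantee. It is not mORMot code.

```pascal
program DynArrayHeaderSketch;

{$mode objfpc}{$H+}

// FPC-specific assumption: the PtrInt just before the first element holds
// High(array), so adding 1 gives the length. A nil pointer is an empty array.
function StoredLength(P: Pointer): PtrInt;
begin
  if P = nil then
    Result := 0
  else
    Result := PPtrInt(PAnsiChar(P) - SizeOf(PtrInt))^ + 1;
end;

// Allocates an array of Count integers and checks that the value read
// from the header agrees with what the RTL reports through Length().
function HeaderMatchesLength(Count: Integer): Boolean;
var
  A: array of Integer;
begin
  A := nil;
  SetLength(A, Count);
  Result := StoredLength(Pointer(A)) = Length(A);
end;

begin
  WriteLn(HeaderMatchesLength(5));
  WriteLn(HeaderMatchesLength(0));
end.
```

Reading the header directly skips a call into the RTL, which is the kind of saving being described; the price is that the code depends on the compiler's internal layout, which is the concern raised earlier in the thread.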

mORMot users don't need to deal with those details. They just use the high level methods like JSON, ORM or SOA, letting the low level framework do its work.
Most of the low level code is deeply optimized, with a lot of pointer arithmetic for sure, and sometimes with a huge amount of asm (up to AVX2/BMI SIMD), but it is transparent to the user, and cross-platform.

If you look at the AutoCreateFields() function generated, once inlined into the class constructor, you will see:

Code:

MORMOT.CORE.JSON$_$TSYNAUTOCREATEFIELDS_$__$$_CREATE$$TSYNAUTOCREATEFIELDS PROC
        push    rbx                                     ; 0000 _ 53
.....
        mov     rax, qword ptr [rsp+8H]                 ; 0072 _ 48: 8B. 44 24, 08
        mov     rax, qword ptr [rax]                    ; 0077 _ 48: 8B. 00
        mov     rbx, qword ptr [rax+48H]                ; 007A _ 48: 8B. 58, 48
        test    rbx, rbx                                ; 007E _ 48: 85. DB
        jz      ?_2462                                  ; 0081 _ 74, 09
        test    dword ptr [rbx+3CH], 4000H              ; 0083 _ F7. 43, 3C, 00004000
        jnz     ?_2463                                  ; 008A _ 75, 0D
?_2462: mov     rdi, qword ptr [rsp+8H]                 ; 008C _ 48: 8B. 7C 24, 08
        call    MORMOT.CORE.JSON_$$_DOREGISTERAUTOCREATEFIELDS$TOBJECT$$TRTTIJSON; 0091 _ E8, 00000000(PLT r)
        mov     rbx, rax                                ; 0096 _ 48: 89. C3
?_2463: mov     r12, qword ptr [rbx+0DCH]               ; 0099 _ 4C: 8B. A3, 000000DC
        test    r12, r12                                ; 00A0 _ 4D: 85. E4
        jz      ?_2465                                  ; 00A3 _ 74, 35
        mov     rax, qword ptr [r12-8H]                 ; 00A5 _ 49: 8B. 44 24, F8
        lea     rbx, ptr [rax+1H]                       ; 00AA _ 48: 8D. 58, 01
ALIGN   8
?_2464: mov     r13, qword ptr [r12]                    ; 00B0 _ 4D: 8B. 2C 24
        mov     rdi, qword ptr [r13]                    ; 00B4 _ 49: 8B. 7D, 00
        mov     rax, qword ptr [r13]                    ; 00B8 _ 49: 8B. 45, 00
        call    qword ptr [rax+0D4H]                    ; 00BC _ FF. 90, 000000D4
        mov     rcx, qword ptr [rsp+8H]                 ; 00C2 _ 48: 8B. 4C 24, 08
        mov     rdx, qword ptr [r13+8H]                 ; 00C7 _ 49: 8B. 55, 08
        add     rdx, rcx                                ; 00CB _ 48: 01. CA
        mov     qword ptr [rdx], rax                    ; 00CE _ 48: 89. 02
        add     r12, 8                                  ; 00D1 _ 49: 83. C4, 08
        sub     ebx, 1                                  ; 00D5 _ 83. EB, 01
        jnz     ?_2464                                  ; 00D8 _ 75, D6
?_2465: mov     qword ptr [rsp+10H], 1                  ; 00DA _ 48: C7. 44 24, 10, 00000001
.....

The resulting asm is really optimized, as fast as it could be with manually written asm, even if it was written in plain pascal.
It may be confusing to read, but it is how we achieve best performance.
But it is still real cross-platform pascal, and the very same code works on ARM32 or AARCH64 with no problem, and good performance.

In the mORMot core, we use the pascal language as a "portable assembler", as C is used in the Linux kernel or the SQLite3 library for instance.
It may be confusing, but it is similar to what is done in the lowest part of the FPC RTL.
This is how we made our JSON parsing an order of magnitude faster than FPC/Delphi alternatives, in plain pascal code: by looking deeply at the generated assembly and aggressively profiling the code, following the https://www.agner.org/optimize reference material.

« Last Edit: July 28, 2021, 10:59:45 am by abouchez »
