I am unsure how fast or slow it is, but I didn't design it for speed. That's not to say it's slow, but it's meant to be small, with powerful features, and just one class / unit.
I'm not sure how LkJSON works internally, but I assume it also creates objects for each node, as you refer to them via Field identifier.
With regard to speed, I am creating one Pascal object for every node. If I wanted to make it fast, I would GetMem for many objects at once and neither create nor destroy them individually. Instead, I would take object memory from that pool and return it there, rather than doing a heap allocation / deallocation for each node parsed.
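To make the pooling idea concrete, here is a minimal sketch of what such a pool could look like. It is not how JsonTools works today. The names TPoolNode and TNodePool are made up for illustration, and the sketch uses plain records without managed fields, because memory from GetMem is not initialized.

```pascal
program NodePoolSketch;
{$mode objfpc}{$H+}
{$pointermath on}

type
  PPoolNode = ^TPoolNode;
  // Hypothetical node layout: no strings or other managed types,
  // since GetMem hands back uninitialized memory.
  TPoolNode = record
    NextFree: PPoolNode; // link used only while the slot sits in the free list
    Kind: Integer;       // stand-in for real node data
  end;

  // Hypothetical pool: one allocation up front, slots recycled via a free list.
  TNodePool = class
  private
    FBlock: PPoolNode;   // single allocation that holds every slot
    FFree: PPoolNode;    // head of the free list
  public
    constructor Create(Capacity: Integer);
    destructor Destroy; override;
    function Acquire: PPoolNode;        // returns nil when the pool is exhausted
    procedure Release(Node: PPoolNode);
  end;

constructor TNodePool.Create(Capacity: Integer);
var
  I: Integer;
begin
  inherited Create;
  GetMem(FBlock, Capacity * SizeOf(TPoolNode));
  FFree := nil;
  // Thread every slot onto the free list.
  for I := Capacity - 1 downto 0 do
  begin
    FBlock[I].NextFree := FFree;
    FFree := @FBlock[I];
  end;
end;

destructor TNodePool.Destroy;
begin
  FreeMem(FBlock); // one deallocation for all nodes
  inherited Destroy;
end;

function TNodePool.Acquire: PPoolNode;
begin
  Result := FFree;
  if Result <> nil then
    FFree := Result^.NextFree;
end;

procedure TNodePool.Release(Node: PPoolNode);
begin
  Node^.NextFree := FFree;
  FFree := Node;
end;

var
  Pool: TNodePool;
  A, B: PPoolNode;
begin
  Pool := TNodePool.Create(1024);
  try
    A := Pool.Acquire;
    A^.Kind := 1;
    Pool.Release(A);
    B := Pool.Acquire; // takes the slot that was just released
    WriteLn('Slot reused: ', A = B);
  finally
    Pool.Free;
  end;
end.
```

A real parser would also need to grow the pool when it runs out and to deal with node names and values, which is where most of the extra complexity would come from.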
That said, would it really be worth it? Do your programs spend most of their time parsing JSON? Are you writing a heavy traffic outward facing service that parses JSON frequently?
No, personally I don't need much JSON parsing, but others may. E.g. if your library is 100 times slower than LkJSON (I don't know if there are any other Delphi+FPC JSON parsers), many people wouldn't use your version, as the only 'pro' would be the different syntax, but as you only need to write code once...
If this is the case and speed / scalability is a concern then you probably want to switch to nodejs which is optimized for heavy traffic and parallelization, and is based on JSON to boot. Many smart engineers have designed nodejs for exactly this use case.
I don't know if there are any other Delphi+FPC JSON Parsers
https://www.getlazarus.org/json/
I could not connect over https because of an invalid certificate. I could not connect over http because OpenDNS flagged the site as malware.
I think I missed the difference between the two. From your example, you go for the second after setting N to stuff node and in this case, stuff node is the root? Then what is current node here? If instead you go for the first, what will you get?
...
AnyNode.Find('search/for/name'); // returns a node 3 levels from the current node
AnyNode.Find('/search/for/name'); // returns a node 3 levels from the root node
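To make the difference between the two calls concrete, here is a small sketch, assuming the JsonTools unit is on the search path. The JSON document and the Lookup helper are made up for illustration. Both the root object and the "stuff" object have a "name" child, so the same name resolves to different nodes depending on the leading slash.

```pascal
program FindPaths;
{$mode delphi}
uses
  JsonTools;

// Illustrative helper: resolve Path starting at Start and return the
// string value, or a marker text if nothing was found.
function Lookup(Start: TJsonNode; const Path: string): string;
var
  N: TJsonNode;
begin
  N := Start.Find(Path);
  if N = nil then
    Result := '(not found)'
  else
    Result := N.AsString;
end;

var
  Root, Stuff: TJsonNode;
begin
  Root := TJsonNode.Create;
  try
    Root.Parse('{"name":"top","stuff":{"name":"nested"}}');
    Stuff := Root.Find('stuff');          // Stuff is now the "current" node
    WriteLn(Lookup(Stuff, 'name'));       // no leading slash: searched from Stuff
    WriteLn(Lookup(Stuff, '/name'));      // leading slash: searched from the root
  finally
    Root.Free; // the root owns all child nodes
  end;
end.
```

With the document above, the first lookup should reach the "name" inside "stuff" and the second the "name" at the top level, even though both calls are made on the same node.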
It's the same as if you were typing a file system path. If your path string begins with a forward slash, then the path identifies an item starting at the root of the file system. If it does not start with a forward slash, then the path is evaluated from the current directory.
If that's so, recalling your example:
Is it just me or is your page getlazarus always redirecting to youtube?
Maybe the site has been hacked?
getlazarus has never been an official source.
I know the FCL already has a capable JSON parser, but I am writing some Amazon web service interfacing projects and wanted a smaller, easier to use JSON parser to assist. I've created a new design for a JSON parser that is pretty small, yet powerful.
If you're interested, I've posted the code under GPLv3, along with a write-up of my thought process and the workflow of using a single small class to work with JSON:
https://www.getlazarus.org/json/
Any and all feedback is welcome.
In poking into the fpjson code I see it uses FloatToStr and TryStrToFloat which internationalize, assuming DefaultFormatSettings is initialized.
"SubData":{...}
And what I want is:
{...}
Floating point numbers are really broken
Never use FloatToStr. Besides depending on the format settings, it prints only 15 significant digits, which is not enough to encode a double precisely. Use Str directly.
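A quick way to check this on your own setup is a round-trip test. The sketch below is only an illustration, not code from fpjson or JsonTools. The two helper functions are made up, and the decimal separator is pinned to '.' so the FloatToStr case is not affected by the locale.

```pascal
program FloatRoundTrip;
{$mode objfpc}{$H+}
uses
  SysUtils;

// True if D survives FloatToStr -> StrToFloat with a fixed '.' separator.
function RoundTripsViaFloatToStr(D: Double): Boolean;
var
  FS: TFormatSettings;
  Back: Double;
begin
  FS := DefaultFormatSettings;
  FS.DecimalSeparator := '.';
  Back := StrToFloat(FloatToStr(D, FS), FS);
  Result := Back = D;
end;

// True if D survives Str -> Val, which do not consult the format settings.
function RoundTripsViaStr(D: Double): Boolean;
var
  S: string;
  Back: Double;
  Code: Integer;
begin
  Str(D, S);                // Str pads with a leading space, hence the Trim
  Val(Trim(S), Back, Code); // Code = 0 means the whole text was parsed
  Result := (Code = 0) and (Back = D);
end;

var
  Third: Double;
begin
  Third := 1;
  Third := Third / 3; // a value that needs more than 15 digits
  WriteLn('FloatToStr text:       ', FloatToStr(Third));
  WriteLn('FloatToStr round trip: ', RoundTripsViaFloatToStr(Third));
  WriteLn('Str/Val round trip:    ', RoundTripsViaStr(Third));
end.
```

If the claim above holds, the FloatToStr round trip should report FALSE for this value while the Str/Val one reports TRUE.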
Do you know of bug reports, or third party tools?
For JSONTools, if that is what you are referring to, I believe you can report bugs here: JSONTools Github Issues (https://github.com/sysrpl/JsonTools/issues).
here: https://bugs.freepascal.org/view.php?id=29531
- JSON benchmark: 100,267 assertions passed 1.31s
IsValidUtf8() in 16.63ms, 1.1 GB/s
IsValidJson(RawUtf8) in 24.78ms, 790.8 MB/s
IsValidJson(PUtf8Char) in 23.22ms, 843.9 MB/s
JsonArrayCount(P) in 23.26ms, 842.7 MB/s
JsonArrayCount(P,PMax) in 22.74ms, 862 MB/s
JsonObjectPropCount() in 9.28ms, 1.1 GB/s
TDocVariant in 140.43ms, 139.6 MB/s
TDocVariant dvoInternNames in 156.73ms, 125 MB/s
TOrmTableJson GetJsonValues in 24.98ms, 345.1 MB/s
TOrmTableJson expanded in 37.36ms, 524.7 MB/s
TOrmTableJson not expanded in 20.96ms, 411.2 MB/s
fpjson in 810.40ms, 10.6 MB/s
In short, mORMot 2 JSON parser is from 13 times to 50 times faster than fpjson - and I guess JSON tools.
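For anyone wondering where "13 to 50" comes from: it presumably compares the throughput lines in the listing above. That reading is my assumption, not something stated in the post. Dividing the TDocVariant and "TOrmTableJson expanded" figures by the fpjson figure gives about 13.2 and 49.5:

```pascal
program ThroughputRatios;
{$mode objfpc}{$H+}

const
  // MB/s figures copied from the benchmark listing above
  FpJsonMBs      = 10.6;
  DocVariantMBs  = 139.6;
  OrmExpandedMBs = 524.7;

// How many times faster A is than B, given two throughput figures.
function Ratio(A, B: Double): Double;
begin
  Result := A / B;
end;

begin
  WriteLn('TDocVariant vs fpjson:            ', Ratio(DocVariantMBs, FpJsonMBs):0:1);
  WriteLn('TOrmTableJson expanded vs fpjson: ', Ratio(OrmExpandedMBs, FpJsonMBs):0:1);
end.
```

Note that the two ends of the range come from different kinds of parsing (a generic document tree versus a table loader), so which ratio applies depends on what you are comparing.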
If you want a fast JSON parser for FPC, you may try what mORMot 2 offers.
To get you started:
[...]
- Forum: https://synopse.info/forum/viewforum.php?id=2
- Docs: https://synopse.info/files/html/Synopse%20mORMot%20Framework%20SAD%201.18.html#TITLE_237
- Blog: https://blog.synopse.info/?tag/JSON/
If you like fpjson approach, you may like to use the Variant way.
May I politely ask what the advantages are of using a Variant instead of fpjson.TJSONData and descendants?
To get you started:
- Use mORMot2, and it has a package for Lazarus: https://github.com/synopse/mORMot2
- Remember that some methods were renamed in version 2; read the comments, they usually tell you what to use instead
- Always read the comments, they have instructions
- Start with variant version as it is quick, easy and still very fast
- For a more structured code, use record or class way
- For record and class ways, you will hit some issues when you use custom types, you will need to register them like I did or register for custom events and other stuff.
*snip*
I don't really see a big difference.
The benchmark code:
https://github.com/synopse/mORMot2/blob/087f740c577a0e38f83f8193874a343ed789fb46/test/test.core.data.pas#L2840
Some numbers on FPC 3.2 + Linux x86_64:
- JSON benchmark: 100,299 assertions passed 810.30ms
StrLen() in 820us, 23.3 GB/s
IsValidUtf8(RawUtf8) in 1.46ms, 13 GB/s
IsValidUtf8(PUtf8Char) in 2.23ms, 8.5 GB/s
IsValidJson(RawUtf8) in 27.23ms, 719.8 MB/s
IsValidJson(PUtf8Char) in 25.87ms, 757.6 MB/s
JsonArrayCount(P) in 25.26ms, 775.9 MB/s
JsonArrayCount(P,PMax) in 25.04ms, 783 MB/s
JsonObjectPropCount() in 8.40ms, 1.3 GB/s
TDocVariant in 118.81ms, 165 MB/s
TDocVariant dvoInternNames in 145.08ms, 135.1 MB/s
TOrmTableJson GetJsonValues in 22.88ms, 376.8 MB/s (write)
TOrmTableJson expanded in 41.26ms, 475.1 MB/s
TOrmTableJson not expanded in 21.44ms, 402.2 MB/s
DynArrayLoadJson in 62.02ms, 316 MB/s
fpjson in 79.36ms, 24.7 MB/s
jsontools in 51.41ms, 38.1 MB/s
SuperObject in 187.79ms, 10.4 MB/s
The variant version is faster, not so much for being a variant, but because of the underlying JSON parsing of mORMot. Being a variant makes it simpler to use for some tastes.
By "simpler" I guess you mean writing J.X instead of C.Integers['X']. Both of them require a lookup, but while the former depends on some compiler magic to skip quotes, the latter has at least a run-time type check. Both ways will require a Find('X') to ensure the attribute is present and there won't be a "bang".
If you need a more structured code, you should use the record or class way.
The mere existence of TSynAutoCreateFields is something that worries me. Hacking with the RTTI is a bummer, and how can it be justified? What if the RTTI layout changes? Is it portable?
I am not much experienced with TJSONStreamer, but the mORMot version has options like:
- Auto creating and destroying fields (if you inherit from TSynAutoCreateFields)
- Supports records
- Much more options for handling custom types, enums, comments, keyword names in JSON (type, class)
IMHO that framework tends to shift the Pascal paradigm to something dynamically typed like, e.g., Python, something I don't agree with. But that is my personal opinion.
*snip*
*snip*
I see.
- mORMot doesn't change the RTTI - TSynAutoCreateFields is just a way to auto-initiate nested published classes instances in a class, which is very handy in some cases; what mORMot does, is to cache the RTTI for efficiency, and in a cross-platform way.
MORMOT.CORE.JSON$_$TSYNAUTOCREATEFIELDS_$__$$_CREATE$$TSYNAUTOCREATEFIELDS PROC
push rbx ; 0000 _ 53
.....
mov rax, qword ptr [rsp+8H] ; 0072 _ 48: 8B. 44 24, 08
mov rax, qword ptr [rax] ; 0077 _ 48: 8B. 00
mov rbx, qword ptr [rax+48H] ; 007A _ 48: 8B. 58, 48
test rbx, rbx ; 007E _ 48: 85. DB
jz ?_2462 ; 0081 _ 74, 09
test dword ptr [rbx+3CH], 4000H ; 0083 _ F7. 43, 3C, 00004000
jnz ?_2463 ; 008A _ 75, 0D
?_2462: mov rdi, qword ptr [rsp+8H] ; 008C _ 48: 8B. 7C 24, 08
call MORMOT.CORE.JSON_$$_DOREGISTERAUTOCREATEFIELDS$TOBJECT$$TRTTIJSON; 0091 _ E8, 00000000(PLT r)
mov rbx, rax ; 0096 _ 48: 89. C3
?_2463: mov r12, qword ptr [rbx+0DCH] ; 0099 _ 4C: 8B. A3, 000000DC
test r12, r12 ; 00A0 _ 4D: 85. E4
jz ?_2465 ; 00A3 _ 74, 35
mov rax, qword ptr [r12-8H] ; 00A5 _ 49: 8B. 44 24, F8
lea rbx, ptr [rax+1H] ; 00AA _ 48: 8D. 58, 01
ALIGN 8
?_2464: mov r13, qword ptr [r12] ; 00B0 _ 4D: 8B. 2C 24
mov rdi, qword ptr [r13] ; 00B4 _ 49: 8B. 7D, 00
mov rax, qword ptr [r13] ; 00B8 _ 49: 8B. 45, 00
call qword ptr [rax+0D4H] ; 00BC _ FF. 90, 000000D4
mov rcx, qword ptr [rsp+8H] ; 00C2 _ 48: 8B. 4C 24, 08
mov rdx, qword ptr [r13+8H] ; 00C7 _ 49: 8B. 55, 08
add rdx, rcx ; 00CB _ 48: 01. CA
mov qword ptr [rdx], rax ; 00CE _ 48: 89. 02
add r12, 8 ; 00D1 _ 49: 83. C4, 08
sub ebx, 1 ; 00D5 _ 83. EB, 01
jnz ?_2464 ; 00D8 _ 75, D6
?_2465: mov qword ptr [rsp+10H], 1 ; 00DA _ 48: C7. 44 24, 10, 00000001
.....
The resulting asm is really optimized, as fast as it could be with manually written asm, even if it was written in plain pascal.
*snip*
Patching variants, strings, dynarrays, bypassing RTL, caching RTTI (OK, you named it), alternate ways of creating instances - that is what I meant. In other words - hacks: https://en.wikipedia.org/wiki/Hack_(computer_science)
The FPC internal layouts are used to bypass the RTL when it makes a difference.
See mormot.core.rtti.pas about how we use the official typinfo unit as source, but encapsulate it into a Delphi/FPC compatible wrapper, and also introduce some RTTI cache as TRttiCustom/TRttiJson classes, with ready-to-use methods and settings.
mORMot users don't need to deal with those details. They just use the high level methods like JSON, ORM or SOA, letting the low level framework do its work.
Until they hit the curb! I've been there.
*snip*
If you have any bug examples with JsonTools, or feature requests, please post them here.
There is no compiler magic that "skips quotes", but there is compiler magic that will replace J.X by (pseudocode) TDocVariantDataInstance.DispInvoke(@J, 'X'). So it is a bit more indirect, but in the end both will do the same to determine whether X exists.
Also this is not a hack, but a well defined feature of the Object Pascal language.
I'm not calling that a hack.
Patching variants, strings, dynarrays, bypassing RTL, caching RTTI (OK, you named it), alternate ways of creating instances - that is what I meant. In other words - hacks
What I'm saying is that the mORMot source is full of hacks. All justified with one word: speed.
Hmm, maybe I don't understand something, but is this valid JSON?
[ -01001, ,- , , ,42.e]
Anyway, Mormot.Core.Json.IsValidJson() claims yes.
...
About [[[[[[[[[......[[[[[[[[[ it is a nice catch.
...
If you want a fast JSON parser for FPC, you may try what mORMot 2 offers.
I have certain doubts about the numbers written and their meanings. As far as I understand, this is the output from the tests\mormot2tests program. The specific tests cited don't do the same thing the last one does, i.e. full JSON parsing: fpjson := GetJSON(people, {utf8=}true).
Some numbers, parsing a JSON array of 8000 objects, for a bit more than 1MB:
- JSON benchmark: 100,267 assertions passed 1.31s
IsValidUtf8() in 16.63ms, 1.1 GB/s
IsValidJson(RawUtf8) in 24.78ms, 790.8 MB/s
IsValidJson(PUtf8Char) in 23.22ms, 843.9 MB/s
JsonArrayCount(P) in 23.26ms, 842.7 MB/s
JsonArrayCount(P,PMax) in 22.74ms, 862 MB/s
JsonObjectPropCount() in 9.28ms, 1.1 GB/s
TDocVariant in 140.43ms, 139.6 MB/s
TDocVariant dvoInternNames in 156.73ms, 125 MB/s
TOrmTableJson GetJsonValues in 24.98ms, 345.1 MB/s
TOrmTableJson expanded in 37.36ms, 524.7 MB/s
TOrmTableJson not expanded in 20.96ms, 411.2 MB/s
fpjson in 810.40ms, 10.6 MB/s
In short, mORMot 2 JSON parser is from 13 times to 50 times faster than fpjson - and I guess JSON tools.
Figures are: TDocVariant is 15.9 MB/s vs 4 MB/s for fpjson. That is 4:1. Not 50:1! Not 13:1!
- JSON benchmark: 100,307 assertions passed 843.40ms
StrLen() in 826us, 23.1 GB/s
IsValidUtf8(RawUtf8) in 1.46ms, 13 GB/s
IsValidUtf8(PUtf8Char) in 2.29ms, 8.3 GB/s
IsValidJson(RawUtf8) in 20.74ms, 0.9 GB/s
IsValidJson(PUtf8Char) in 20.95ms, 0.9 GB/s
JsonArrayCount(P) in 20.12ms, 0.9 GB/s
JsonArrayCount(P,PMax) in 19.97ms, 0.9 GB/s
JsonObjectPropCount() in 10.98ms, 1 GB/s
TDocVariant in 123.71ms, 158.4 MB/s
TDocVariant dvoInternNames in 146.39ms, 133.9 MB/s
TOrmTableJson GetJsonValues in 24.31ms, 354.5 MB/s
TOrmTableJson expanded in 39.12ms, 501 MB/s
TOrmTableJson not expanded in 20.89ms, 412.6 MB/s
DynArrayLoadJson in 61.68ms, 317.8 MB/s
fpjson in 79.39ms, 24.6 MB/s
jsontools in 50.50ms, 38.8 MB/s
SuperObject in 184.59ms, 10.6 MB/s
The numbers slightly change during each call on my Core i5 laptop, but the order of magnitude remains.
- Encode decode JSON: 430,145 assertions passed 97.31ms
- JSON benchmark: 100,307 assertions passed 1.03s
StrLen() in 813us, 23.5 GB/s
IsValidUtf8(RawUtf8) in 11.08ms, 1.7 GB/s
IsValidUtf8(PUtf8Char) in 11.91ms, 1.6 GB/s
IsValidJson(RawUtf8) in 23.44ms, 836.2 MB/s
IsValidJson(PUtf8Char) in 21.99ms, 891.6 MB/s
JsonArrayCount(P) in 21.29ms, 920.6 MB/s
JsonArrayCount(P,PMax) in 21.38ms, 917 MB/s
JsonObjectPropCount() in 10.54ms, 1 GB/s
TDocVariant in 200.88ms, 97.6 MB/s
TDocVariant dvoInternNames in 196.29ms, 99.8 MB/s
TOrmTableJson GetJsonValues in 24.37ms, 353.9 MB/s
TOrmTableJson expanded in 44.57ms, 439.8 MB/s
TOrmTableJson not expanded in 30.68ms, 281 MB/s
DynArrayLoadJson in 88.07ms, 222.6 MB/s
fpjson in 74.31ms, 26.3 MB/s
jsontools in 61.22ms, 32 MB/s
SuperObject in 178.94ms, 10.9 MB/s
On Win32, DynArrayLoadJson is still 8 times faster than fpjson and 7 times faster than jsontools.
TDocVariant in 122.94ms, 159.4 MB/s
TDocVariant no guess in 127.56ms, 153.7 MB/s
TDocVariant dvoInternNames in 146.02ms, 134.2 MB/s
fpjson in 82.91ms, 23.6 MB/s
TDocVariant sample.json in 38.94ms, 16.8 MB/s
TDocVariant sample.json no guess in 31.93ms, 410.6 MB/s
fpjson sample.json in 11.20ms, 116.9 MB/s
So with this option, TDocVariant is faster than fpjson.
TDocVariant sample.json in 1.70ms, 384.3 MB/s
TDocVariant sample.json no guess in 30.77ms, 426 MB/s
fpjson sample.json in 11.18ms, 117.2 MB/s
@y.ivanov
Both "patterns", as you call them, are included in RFC8259. So, your routines are fast, but only on half of the specification.
To read {} you can use RecordLoadJson of course: it is another pattern.
Did you run the tests on x86_64 with -O3? I don't cheat the numbers, just copy&paste from my terminal.
I wouldn't try it. On the contrary, I intend to disable as many of your optimizations, inline assembly and other 'hacks' as I can, and to evaluate what impact they have overall. My initial guess is that they speed things up by no more than 20-30%. Not by x13-x50.
Post #72 was on Win32.
The best numbers, and the ones which matter most because it is for a server process, are on x86_64 with our memory manager (JSON parsing is always fast enough on client side). Our framework is specifically optimized for CPUs with a lot of registers (like x86_64 or ARM/AARCH64 - i386 lags behind).
You didn't mention those requirements (64-bit, your own memory manager) with your initial claims of x13-50 times supremacy over fpjson.
On x86_64 the ratio is more than 6 times faster (159.4 / 23.6 = 6.754237288 for the last numbers I took):
Good. Now we're arguing about 4-6 times against fpjson. What is the reduction over your initial claim? Tenfold?
*snip*
(people.json = array of 8227 ORM objects, for 1MB file)
StrLen() in 828us, 23.1 GB/s
IsValidUtf8(RawUtf8) in 1.45ms, 13.1 GB/s
IsValidUtf8(PUtf8Char) in 2.21ms, 8.6 GB/s
IsValidJson(RawUtf8) in 21.36ms, 917.7 MB/s
IsValidJson(PUtf8Char) in 20.63ms, 0.9 GB/s
JsonArrayCount(P) in 19.70ms, 0.9 GB/s
JsonArrayCount(P,PMax) in 20.14ms, 0.9 GB/s
JsonObjectPropCount() in 10.54ms, 1 GB/s
TDocVariant in 121.68ms, 161.1 MB/s
TDocVariant no guess in 127.85ms, 153.3 MB/s
TDocVariant dvoInternNames in 147.27ms, 133.1 MB/s
TOrmTableJson expanded in 37.57ms, 521.8 MB/s
TOrmTableJson not expanded in 20.36ms, 423.5 MB/s
(here the time is relevant because the JSON size is smaller: 20.36 ms instead of 777.7 ms)
DynArrayLoadJson in 62.38ms, 314.3 MB/s
fpjson in 77.78ms, 25.2 MB/s
(run 10 times less because it is slower - and yes, the length is also div 10 and correct I hope)
(sample.json with a lot of nested documents)
TDocVariant sample.json in 32.32ms, 405.7 MB/s
TDocVariant sample.json no guess in 31.93ms, 410.6 MB/s
fpjson sample.json in 11.25ms, 116.4 MB/s
@sysrpl, did I understand correctly: if any key in JSON contains a slash, then it will be impossible to find this key using TJsonNode.Find()?
As it currently stands, yes. The forward slash is used as a name separator, much like with XPath. If you wanted a free text search using any keys, then you'd need to provide a list of string keys.
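One way to read "provide a list of string keys" is to walk the tree one level at a time with Child instead of Find. The sketch below is only that: FindByKeys is a hypothetical helper, not part of JsonTools, and it assumes that Child(Name) compares the name literally, without treating the slash as a separator.

```pascal
program KeyListLookup;
{$mode delphi}
uses
  JsonTools;

// Hypothetical helper (not part of JsonTools): follow Keys one level at a
// time. Assumes Child(Name) matches names literally, so a key that
// contains '/' is treated like any other key.
function FindByKeys(Node: TJsonNode; const Keys: array of string): TJsonNode;
var
  I: Integer;
begin
  Result := Node;
  for I := Low(Keys) to High(Keys) do
  begin
    if Result = nil then
      Exit; // an earlier key was missing
    Result := Result.Child(Keys[I]);
  end;
end;

var
  Root, Hit: TJsonNode;
begin
  Root := TJsonNode.Create;
  try
    Root.Parse('{"a/b":{"c":"found"}}');
    Hit := FindByKeys(Root, ['a/b', 'c']);
    if Hit <> nil then
      WriteLn(Hit.AsString)
    else
      WriteLn('not found');
  finally
    Root.Free;
  end;
end.
```

Find('a/b/c') would split the same text into three names, which is why it cannot reach this node.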
{
"TestArray": [
"Test1",
"Test2",
"Test3",
"Test4",
"Test5",
"Test6",
"Test7",
"Test8",
"Test9",
"Test10"
]
}
Now, simply use add, ignoring the first param, to add the values one by one:
Arr.Add('',s);
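Put together, building the array shown above could look roughly like this. It is a sketch assuming the JsonTools unit is available. The BuildTestArray function and the loop are mine, while Arr.Add('', s) is the call from the line above.

```pascal
program ArrayAddDemo;
{$mode delphi}
uses
  SysUtils, JsonTools;

// Illustrative helper: returns a root object holding "TestArray" with
// the values Test1..Test10. The caller frees the returned root.
function BuildTestArray: TJsonNode;
var
  Arr: TJsonNode;
  I: Integer;
begin
  Result := TJsonNode.Create;
  Arr := Result.Add('TestArray', nkArray);
  for I := 1 to 10 do
    Arr.Add('', 'Test' + IntToStr(I)); // the name is ignored for array items
end;

var
  Root: TJsonNode;
begin
  Root := BuildTestArray;
  try
    WriteLn(Root.ToString);
  finally
    Root.Free;
  end;
end.
```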