Implement JSON string escaping using SIMD (ARM + X86) #769

Draft
wants to merge 1 commit into master

radiospiel commented Mar 16, 2025

This PR integrates a `simd.h` shim extracted from PostgreSQL ([src](https://github.com/postgres/postgres/blob/REL_17_4/src/include/port/simd.h)). PostgreSQL is licensed under an MIT/BSD-style license ([link](https://github.com/postgres/postgres/blob/REL_17_4/COPYRIGHT)). The shim is available for ARM and x86, and also comes with a pure C implementation.

Portions of this code are extracted from a Postgres patch by dgrowleyml(at)gmail(dot)com, posted at pgsql-hackers.

As a result of these changes I see a 55% speedup on Apple Silicon M1 on a set of string-encoding benchmarks (662 i/s vs. 425 i/s).

```
== Encoding strings (2524333 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
        json (local)    62.000 i/100ms
Calculating -------------------------------------
        json (local)    662.248 (± 4.5%) i/s    (1.51 ms/i) -      3.348k in   5.065760s

Normalize to 393501 byte
== Encoding strings (2524333 bytes)
ruby 3.4.1 (2024-12-25 revision 48d4efcb85) +YJIT +PRISM [arm64-darwin24]
Warming up --------------------------------------
       json (2.10.2)    43.000 i/100ms
Calculating -------------------------------------
       json (2.10.2)    425.189 (± 6.4%) i/s    (2.35 ms/i) -      2.150k in   5.079077s
```

These benchmarks are run via a script ([link](https://gist.github.com/radiospiel/04019402726a28b31616df3d0c17bd1c)) which is based on the gem's `benchmark/encoder.rb` file. There are probably better ways to run benchmarks :) My version allows combining multiple test cases into a single one.

The `dumps` benchmark, which covers the JSON files in `benchmark/data/*.json` (with the exception of `canada.json`), reported a speedup of ~13% on Apple M1.

radiospiel marked this pull request as draft March 16, 2025 21:04
radiospiel (Author) commented Mar 16, 2025

Unfortunately I only saw #743 once the work here was finished. Also, I initially worked against 2.9.1; I think the changes made in that area since then were meant to prepare for #743, but they cannot easily be reused for my code. Instead, I had to forward-port `convert_UTF8_to_JSON_wo_simd` and one of the escape_tables from 2.9.1. `convert_UTF8_to_JSON` and some related entities are no longer used; if this PR were merged they could be removed.

I don't understand enough of the code in the other PR to tell why the gains here seem to be a bit larger (13% vs 7%, as per comment), but there are a lot of numbers in that conversation, so maybe the performance is similar after all. It would be interesting to understand the differences between the two.

One difference with this PR, in any case, is that it is also available on x86. I'll post some x86/Linux numbers in the next few days.

radiospiel (Author) commented Mar 16, 2025

@byroot here is yet another license to take a look at.

  • The `simd.h` shim is part of the PostgreSQL source code, released under this license
  • The search algo is pretty straightforward, but it is still a modified version of some code by dgrowleyml(at)gmail(dot)com, posted at pgsql-hackers. I am happy to reach out to David for explicit approval, should that become necessary.
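
For readers unfamiliar with this kind of search, the idea can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from this PR or from the Postgres patch: the names `needs_escape` and `first_escape` are hypothetical, the sketch calls SSE2 intrinsics directly instead of going through the `simd.h` shim, and it only covers the basic JSON rule that `"`, `\` and bytes below 0x20 must be escaped.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Hypothetical helper: does this byte need escaping inside a JSON string? */
static int needs_escape(unsigned char c) {
    return c < 0x20 || c == '"' || c == '\\';
}

/* Hypothetical helper: index of the first byte that needs escaping,
 * or len if the whole buffer can be copied verbatim. */
static size_t first_escape(const unsigned char *s, size_t len) {
    size_t i = 0;
    assert(s != NULL || len == 0);
#if defined(__SSE2__)
    const __m128i quote   = _mm_set1_epi8('"');
    const __m128i bslash  = _mm_set1_epi8('\\');
    const __m128i ctl_max = _mm_set1_epi8(0x1f);
    /* Check 16 bytes per iteration; stop at the first chunk with a hit. */
    for (; i + 16 <= len; i += 16) {
        __m128i v = _mm_loadu_si128((const __m128i *)(s + i));
        /* Unsigned "v <= 0x1f": min(v, 0x1f) == v only for control bytes. */
        __m128i ctl = _mm_cmpeq_epi8(_mm_min_epu8(v, ctl_max), v);
        __m128i hit = _mm_or_si128(ctl,
                        _mm_or_si128(_mm_cmpeq_epi8(v, quote),
                                     _mm_cmpeq_epi8(v, bslash)));
        if (_mm_movemask_epi8(hit) != 0)
            break; /* the scalar loop below pins down the exact byte */
    }
#endif
    /* Scalar tail; also the complete implementation when SSE2 is absent. */
    for (; i < len; i++)
        if (needs_escape(s[i]))
            return i;
    return len;
}
```

A caller would copy `s[0 .. first_escape(s, len))` unchanged, emit the escape sequence for the byte at that index, and continue from the following byte.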

radiospiel (Author)

Note: see this for discussing licensing concerns.

byroot (Member) commented Mar 17, 2025

Aside from the licensing concern, SIMD use requires runtime checking of the CPU capabilities; we shouldn't assume the CPU has NEON or SSE2 (I know most do, but still).
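
One way such a runtime check could look is sketched below. This is a minimal illustration, not code from this PR: `cpu_has_simd`, `pick_scan`, `scan_scalar` and `scan_simd` are hypothetical names, the x86 branch assumes a GCC- or Clang-compatible compiler (it relies on the `__builtin_cpu_supports` builtin), and the AArch64 branch simply assumes NEON is present, which other ARM targets would have to verify through an OS-specific query.

```c
#include <assert.h>
#include <stddef.h>

typedef size_t (*scan_fn)(const unsigned char *s, size_t len);

/* Portable fallback: index of the first byte that needs JSON escaping. */
static size_t scan_scalar(const unsigned char *s, size_t len) {
    for (size_t i = 0; i < len; i++)
        if (s[i] < 0x20 || s[i] == '"' || s[i] == '\\')
            return i;
    return len;
}

/* Placeholder for a vectorized scanner. To keep the sketch short it
 * forwards to the scalar version; a real one would use SSE2 or NEON. */
static size_t scan_simd(const unsigned char *s, size_t len) {
    return scan_scalar(s, len);
}

/* Hypothetical helper: ask the CPU at runtime instead of trusting
 * compile-time flags. */
static int cpu_has_simd(void) {
#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
    __builtin_cpu_init();
    return __builtin_cpu_supports("sse2") != 0;
#elif defined(__aarch64__)
    return 1; /* assumption: NEON is available on AArch64 targets */
#else
    return 0; /* unknown platform: stay on the portable path */
#endif
}

/* Choose the implementation once and reuse it afterwards.
 * Not thread-safe as written; assumed to run once during setup. */
static scan_fn pick_scan(void) {
    static scan_fn impl = NULL;
    if (impl == NULL)
        impl = cpu_has_simd() ? scan_simd : scan_scalar;
    assert(impl != NULL);
    return impl;
}
```

In a C extension, the selection could plausibly happen once when the extension is loaded, so the hot path only pays for an indirect call.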

radiospiel (Author)

I can pick this up if we can address the licensing concerns.

byroot (Member) commented Mar 17, 2025

I think the licensing is OK. I need to read the PG license a second time, but it does indeed seem very BSD-like, and hence should allow sub-licensing.

In other words, we can vendor it, with the license header, and keep distributing the json gem without changing the license.

byroot (Member) commented Mar 17, 2025

That said, out of your 3 PRs, I'd suggest prioritizing the float one. While floats aren't super common, that's where ruby/json is the most behind.
