feat: implement JsonSerializable and __serialize/__unserialize#41
Conversation
Add modern serialization support to Judy arrays: - JsonSerializable interface: json_encode() returns appropriate format per type (flat array for BITSET, associative for others) - __serialize/__unserialize: native PHP serialize()/unserialize() support with type-preserving roundtrip for all 5 Judy array types - Shared judy_build_data_array() helper to avoid code duplication - json extension dependency added to module deps and build configs - 12 new test files covering all types, empty arrays, and error handling
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly enhances the Judy extension by integrating standard PHP serialization and JSON serialization capabilities. It allows Judy arrays to be natively serialized using Highlights
🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console. Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
Test Results
Benchmark Summary (Judy vs PHP Array)Ratio = Judy / Array. Bold = Judy wins (≤0.95x). Plain = Array is faster/smaller. Time (Write / Read) — Linux
Memory — Linux
Time (Write / Read) — Windows
Memory — Windows
Raw benchmark dataWrite Time — Linux
Read Time — Linux
Memory — Linux
Write Time — Windows
Read Time — Windows
Memory — Windows
|
There was a problem hiding this comment.
Code Review
This pull request implements JsonSerializable and __serialize/__unserialize for the Judy class, enabling native JSON and PHP serialization. A critical memory leak was identified in the __unserialize method when called on an already-initialized object, which could lead to Denial of Service via memory exhaustion. Additionally, other memory leaks related to incorrect reference counting when serializing mixed-value types were found. Please see the detailed comments for suggestions on how to fix these issues.
| JLF(PValue, intern->array, index); | ||
| while (PValue != NULL && PValue != PJERR) { | ||
| zval *value = JUDY_MVAL_READ(PValue); | ||
| Z_TRY_ADDREF_P(value); |
There was a problem hiding this comment.
The add_index_zval function on the next line already handles incrementing the reference count of the zval being added to the array. This explicit Z_TRY_ADDREF_P(value) call is therefore redundant and causes a double increment, leading to a memory leak. When the array returned by jsonSerialize or __serialize is destroyed, the values within it will not be freed due to the incorrect refcount.
| JSLF(PValue, intern->array, key); | ||
| while (PValue != NULL && PValue != PJERR) { | ||
| zval *value = JUDY_MVAL_READ(PValue); | ||
| Z_TRY_ADDREF_P(value); |
| intern->type = jtype; | ||
| intern->array = (Pvoid_t) NULL; |
There was a problem hiding this comment.
The __unserialize method overwrites the internal Judy array pointer (intern->array) and type (intern->type) without freeing the existing array if the object was already initialized. This can be triggered by calling __unserialize manually on an existing Judy object, leading to a memory leak that could be exploited for Denial of Service via memory exhaustion. Simply setting intern->array = NULL on line 1976 leaks the memory of the previous Judy array. To fix this, call judy_free_array_internal(intern) before re-initializing the object's state. This helper function will also reset intern->array and intern->counter, making the subsequent assignments on lines 1976 and 1977 redundant.
judy_free_array_internal(intern);
intern->type = jtype;
Summary
JsonSerializableinterface sojson_encode($judy)works natively, returning appropriate JSON format per type (flat array for BITSET, associative object for INT_TO_INT, INT_TO_MIXED, STRING_TO_INT, STRING_TO_MIXED)__serialize()/__unserialize()for native PHPserialize()/unserialize()support with type-preserving roundtrip across all 5 Judy array typesjudy_build_data_array()helper avoids code duplication betweenjsonSerializeand__serializejsonextension dependency to module deps (judy_deps[]),config.m4, andconfig.w32.phpttest files covering all types, empty arrays, roundtrip integrity, and error handlingImplementation details
JsonSerializable output format
[1, 5, 10](flat array of set indices){"0": 100, "5": 200}(index => value){"0": "hello", "5": [1,2]}(index => mixed){"foo": 100}(key => value){"foo": "bar"}(key => mixed)Serialization format
__serialize()returns['type' => int, 'data' => array]wheredatamatches thejsonSerialize()format.__unserialize()validates type, reconstructs the Judy array viajudy_object_write_dimension_helper().Test plan
pecl package package.xmlproduces valid.tgz