An attempt to create flatc support, and using the Arrow.jl FlatBuffer submodule as the 'official' implementation #61
Comments
@jonalm I've used your code (both flatc and FlatBuffer.jl) to update Flatgeobuf.jl. Most things are smooth, but I hit issues on definitions that are
ERROR: ArgumentError: unsafe_wrap: pointer 0x7fb952ec2d0c is not properly aligned to 8 bytes
Stacktrace:
 [1] #unsafe_wrap#89
   @ ./pointer.jl:89 [inlined]
 [2] unsafe_wrap
   @ ./pointer.jl:89 [inlined]
 [3] (FlatBuffers.Array{Float64})(t::FlatGeobuf.Gen.Geometry, off::UInt16)
   @ FlatBuffers ~/.julia/packages/FlatBuffers/3YdKu/src/table.jl:107
 [4] getproperty(x::FlatGeobuf.Gen.Geometry, field::Symbol)
   @ FlatGeobuf.Gen ~/code/FlatGeobuf.jl/src/schema/generated.jl:35
 [5] top-level scope
   @ REPL[72]:1
The new code seems a huge improvement in terms of performance, as I don't need to allocate thousands of (empty) structs anymore. 👍🏻 |
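The address in the error message can be checked by hand. The sketch below only does the arithmetic on the reported pointer value; `is_aligned` is a hypothetical helper written for this illustration and is not part of FlatBuffers.jl or Arrow.jl.

```julia
# Hypothetical helper (an assumption for this example): true when `addr`
# is a multiple of the size of element type `T`.
is_aligned(::Type{T}, addr::Integer) where {T} = addr % sizeof(T) == 0

addr = 0x7fb952ec2d0c   # pointer value copied from the error message above

println("remainder mod 8: ", Int(addr % 8))
println("aligned for Float64: ", is_aligned(Float64, addr))
println("aligned for Float32: ", is_aligned(Float32, addr))
```

The address is a multiple of 4 but not of 8, which is consistent with `unsafe_wrap` rejecting it for `Float64` elements.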
Thanks for the bug report @evetion! I added a test to the branch, using an array of doubles, which currently fails with the same error message that you describe. May have a chance to look a bit more into it later this week. |
I've been digging a bit. It seems to be an issue with all arrays of native types. I've added tests for reading and writing arrays of int32, floats, and doubles; see here https://github.com/jonalm/FlatBuffers.jl/blob/af5a137609912f3ca46ca7fc2773239835b4fa17/test/runtests.jl#L114. The next step would be to investigate whether the issue stems from writing or reading the buffer. We could perhaps store the modified 'monster' buffer used in the tests and parse it in Python? I suspect that it is related to parsing, and that the issue is somewhere here https://github.com/jonalm/FlatBuffers.jl/blob/af5a137609912f3ca46ca7fc2773239835b4fa17/src/table.jl#L103. Note that this code has been copied from the Arrow.jl repo, and that particular function was last modified in apache/arrow-julia#234. I got that to work almost by accident, so I would not be surprised if it contains some buggy behaviour. @quinnj do you have any input here? I believe that this issue would be relevant for Arrow.jl. |
FWIW I'm parsing "official" flat(geo)buffer files, so the issue will at least be in reading/parsing. |
Well, a complete hack like this for L106 makes it work.
# ptr = convert(Ptr{S}, pointer(bytes(t), a + 1))
# data = unsafe_wrap(Base.Array, ptr, vectorlen(t, off))
data = reinterpret(S, bytes(t)[a+1:a+sizeof(S)*vectorlen(t, off)])
In this case, one avoids the |
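As a rough standalone illustration of the workaround above, the sketch below applies the same slice-then-`reinterpret` pattern to a hand-built byte vector. The buffer, the offset `a`, and the length `n` are made up for the example; `n` stands in for `vectorlen(t, off)` and `buf` for `bytes(t)`.

```julia
vals = [1.5, -2.25, 3.0]

# Four filler bytes put the payload at a byte offset that is not a multiple of 8.
buf = vcat(zeros(UInt8, 4), collect(reinterpret(UInt8, vals)))

a = 4               # zero-based start of the payload (assumed for this example)
n = length(vals)    # stands in for vectorlen(t, off)

# Indexing with a range copies the bytes into a fresh Vector{UInt8};
# reinterpret then views that copy as Float64 values.
r = (a + 1):(a + sizeof(Float64) * n)
data = reinterpret(Float64, buf[r])

println(collect(data))
```

The slice allocates a copy of the payload, so this trades the zero-copy access of `unsafe_wrap` for not having to care where the payload sits in the original buffer.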
@evetion does that give correct numerical values for your official buffer files? I've updated the branch with your temporary solution, with tests. The tests now run, but they fail when writing and reading back the same data. Do you have access to some official buffers that are public? If so, I can set up some more tests. |
They do. Tomorrow I'll check the original file and scan for the correct data to find the actual offset, to see how this compares to the internal buffer. |
@jonalm In your tests, you should replace |
Thanks, @evetion, I'll update this when I get to it. Also, I moved this fork to its own repo, https://github.com/jonalm/FlatBuffers2.jl. This will allow us to track issues independently (I expect more), and to compare it with the current FlatBuffer.jl implementation. |
@jonalm, I'm happy to do a complete overhaul of the package code here and make you an admin if that would end up cleaner. We can just hold off on tagging the new breaking major release until you think things are ready. |
I'm opening this issue to discuss the prospect of transitioning to the Arrow.jl FlatBuffer submodule as the 'official' Julia FlatBuffer implementation. The main motivation for doing so is twofold. First, the current implementation isn't really 'flat', in the sense that you need to parse the buffer by mapping the data into Julia structs rather than querying the buffer directly, which is a major motivation for using FlatBuffers in the first place. Second, the Arrow.jl implementation is based on the official Go implementation of FlatBuffers, which makes it easy to implement flatc support by modifying existing code. Having flatc support means that buffer access functionality is code generated. This is needed to be granted "official supported language" status, and to avoid having to manually implement Julia code to match any given FlatBuffer schema.

I have a WIP fork of the flatc code here, which generates Arrow.jl FlatBuffer compatible Julia code for fbs schemas. Some major current limitations are that it doesn't handle union types, and that all the generated code is put in a single module scope (I haven't figured out how to deal with the schema namespace yet).

I also have a WIP fork of FlatBuffer.jl here, containing the Arrow.jl code and some updated tests from the official Monster.fbs example schema. In particular, the code generation is made here, and the generated code example is here.
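To make "querying the buffer directly" concrete, here is a minimal sketch that reads one scalar field of a root table straight from the bytes, without building any struct. It follows the general FlatBuffers table and vtable layout, but the buffer is assembled by hand, and `le`, `read_le`, and `root_field` are hypothetical names for this illustration. This is not the Arrow.jl or FlatBuffers.jl API, nor what flatc generates.

```julia
# Hypothetical helpers for illustration only.

# Little-endian bytes of a primitive value.
le(x) = collect(reinterpret(UInt8, [htol(x)]))

# Read a little-endian value of type T at zero-based byte position `pos`.
read_le(::Type{T}, buf::Vector{UInt8}, pos::Integer) where {T} =
    ltoh(reinterpret(T, buf[pos+1:pos+sizeof(T)])[1])

# Look up field number `slot` of the root table directly in the bytes.
function root_field(::Type{T}, buf::Vector{UInt8}, slot::Integer, default::T) where {T}
    table  = Int(read_le(UInt32, buf, 0))             # root offset gives the table start
    vtable = table - Int(read_le(Int32, buf, table))  # table stores a signed offset back to its vtable
    vsize  = Int(read_le(UInt16, buf, vtable))        # vtable length in bytes
    entry  = 4 + 2 * slot                             # skip the two UInt16 size entries
    entry >= vsize && return default                  # slot not present in this vtable
    off = Int(read_le(UInt16, buf, vtable + entry))   # field position relative to the table
    return off == 0 ? default : read_le(T, buf, table + off)
end

# Hand-built buffer holding one table with a single Int32 field.
buf = vcat(
    le(UInt32(12)),  # bytes 0-3:   offset of the root table
    le(UInt16(6)),   # bytes 4-5:   vtable size in bytes
    le(UInt16(8)),   # bytes 6-7:   table size in bytes
    le(UInt16(4)),   # bytes 8-9:   field 0 sits 4 bytes into the table
    UInt8[0, 0],     # bytes 10-11: padding
    le(Int32(8)),    # bytes 12-15: table start; the vtable is 8 bytes earlier
    le(Int32(42)),   # bytes 16-19: value of field 0
)

println(root_field(Int32, buf, 0, Int32(0)))
println(root_field(Int32, buf, 1, Int32(-1)))
```

Only the bytes for the requested field are touched, and a slot missing from the vtable falls back to the supplied default. Generated accessor code would wrap this kind of lookup per schema field.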