Releases: apache/arrow-rs

arrow 54.3.1

26 Mar 16:02
e62b212

Changelog

54.3.1 (2025-03-26)

Full Changelog

Fixed bugs:

  • Round trip encoding of list of fixed list fails when offset is not zero #7315

* This Changelog was automatically generated by github_changelog_generator

arrow 54.3.0

17 Mar 20:57
57942c4

Changelog

54.3.0 (2025-03-17)

Full Changelog

Implemented enhancements:

  • Using column chunk offset index in InMemoryRowGroup::fetch #7300
  • Support reading parquet with modular encryption #7296 [parquet]
  • Add example for how to read/write encrypted parquet files #7281 [parquet]
  • Have writer return parsed ParquetMetadata #7254 [parquet]
  • feat: Support Utf8View in JSON reader #7244 [arrow]
  • StructBuilder should provide a way to get a &dyn ArrayBuilder of a field builder #7193 [arrow]
  • Support div_wrapping/rem_wrapping for numeric arithmetic kernels #7158 [arrow]
  • Improve RleDecoder performance #7195 [parquet] (Dandandan)
  • Improve arrow-json deserialization performance by 30% #7157 [arrow] (mwylde)
  • Add with_skip_validation flag to IPC StreamReader, FileReader and FileDecoder #7120 [arrow] (alamb)
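
As a rough illustration of what "wrapping" means in the div_wrapping/rem_wrapping item (#7158), the sketch below uses only Rust standard-library integer methods. It does not call the arrow kernels, whose signatures are not given in this changelog; the function names `div_wrapping_sketch` and `rem_wrapping_sketch` and the choice to return `None` for a zero divisor are assumptions made for this example.

```rust
// Sketch only: std integer methods, not the arrow arithmetic kernels.

/// Wrapping division for i32. Returns None on a zero divisor
/// (an illustrative choice; the real kernels may report this differently).
fn div_wrapping_sketch(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        None
    } else {
        // i32::MIN / -1 does not fit in i32; wrapping_div returns i32::MIN
        // instead of panicking.
        Some(a.wrapping_div(b))
    }
}

/// Wrapping remainder for i32, with the same zero-divisor convention.
fn rem_wrapping_sketch(a: i32, b: i32) -> Option<i32> {
    if b == 0 {
        None
    } else {
        // i32::MIN % -1 overflows in plain arithmetic; wrapping_rem returns 0.
        Some(a.wrapping_rem(b))
    }
}

fn main() {
    println!("{:?}", div_wrapping_sketch(i32::MIN, -1));
    println!("{:?}", rem_wrapping_sketch(i32::MIN, -1));
    println!("{:?}", div_wrapping_sketch(7, 2));
}
```

The only inputs where wrapping and plain signed division differ are the overflow cases such as `i32::MIN / -1`; every other pair of operands gives the ordinary truncated result.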

Fixed bugs:

  • Archery integration CI test is failing on main: error: package half v2.5.0 cannot be built because it requires rustc 1.81 or newer, while the currently active rustc version is 1.77.2 #7291
  • MSRV CI check is failing on main #7289
  • Incorrect IPC schema encoding for multiple dictionaries #7058 [arrow] [arrow-flight]

arrow 54.2.1

27 Feb 12:07
3f56468

Changelog

54.2.1 (2025-02-27)

Full Changelog

Fixed bugs:

  • Use chrono >= 0.4.34, < 0.4.40 to avoid breaking #7210

arrow 54.2.0

12 Feb 15:34
d4b9482

Changelog

54.2.0 (2025-02-12)

Full Changelog

Implemented enhancements:

  • Casting from Utf8View to Dict(k, Utf8View) #7114
  • Support creating map arrays with key metadata #7100 [arrow]
  • [parquet] Print Parquet BasicTypeInfo id when present #7081 [parquet]
  • Add arrow-ipc benchmarks for the IPC reader and writer #6968 [arrow]

Fixed bugs:

  • NullBufferBuilder::allocated_size Returns Size in Bits #7121 [arrow]
  • [Regression in 54.0.0]. Decimal cast to smaller precision gives invalid (off-by-one) result in some cases #7069 [arrow]
  • Minor: Fix deprecated note to point to the correct const #7067 [arrow]
  • incorrect error message for reading definition levels #7056 [parquet]
  • First None in ListArray panics in cast_with_options #7043 [arrow]
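
The allocated_size item (#7121) concerns a size reported in bits where bytes were expected. The sketch below shows the general bits-to-bytes conversion for a validity bitmap. It is not the arrow-rs fix itself: `bitmap_bytes` is a hypothetical helper written for this example, and it uses only the standard library.

```rust
// Sketch only: a hypothetical helper, not part of the arrow crates.

/// Bytes needed to store `len_bits` one-bit flags, rounded up to a whole byte.
fn bitmap_bytes(len_bits: usize) -> usize {
    (len_bits + 7) / 8
}

fn main() {
    for n in [0usize, 1, 8, 9, 64] {
        println!("{} bits -> {} bytes", n, bitmap_bytes(n));
    }
}
```

Reporting the bit count directly would overstate memory use by roughly a factor of eight, which is why the unit matters for memory accounting.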

arrow 54.1.0

29 Jan 13:41
3bf29a2

Changelog

54.1.0 (2025-01-29)

Full Changelog

Implemented enhancements:

  • Create GitHub releases automatically on tagging #7041
  • Add required methods to access inner builder for NullBufferBuilder #7002 [arrow]
  • Re-export NullBufferBuilder in the arrow crate #6975 [arrow]
  • arrow-string function should support binary input as well #6923 [arrow]
  • MMap support for IPC files #6709 [arrow]
  • fix: mark (Large)ListView as nested and support in equal data type #6995 [arrow] (rluvaton)
  • Expose min/max values for Decimal128/256 and improve docs #6992 [arrow] (alamb)
  • [Parquet] Improve speed of dictionary encoding NaN float values #6953 [parquet] (adamreeve)
  • Optimize BooleanBufferBuilder for non nullable columns #6973 [arrow]
  • arrow::compute::concat should merge dictionary type when concatenating list of dictionaries #6888 [arrow]
  • Improve error message for unsupported cast between struct and other types #6724 [arrow]
  • implement regexp_match, regexp_scalar_match and regexp_array_match for StringViewArray #6717 [arrow]
  • Speed up Parquet utf8 validation #6667 [parquet]

Fixed bugs:

  • Regression: Concatenating sliced ListArrays is broken #7034
  • PrimitiveDictionaryBuilder with specific value data type and capacity #7011 [arrow]
  • Arrow IPC Writer Panics for sliced nested arrays #6997 [arrow]
  • RecordBatch with no columns cannot be roundtripped through Parquet #6988 [parquet]
  • StringView: Using the Interleave kernel (and potentially others) results in many repeated buffers in variadic_buffers #6780 [arrow]
  • fix prefetch of page index #6999 [parquet] (adriangb)
  • fix: Parquet column writer Dictionary(_, Decimal128) and Dictionary(_, Decimal256) #6987 [parquet] (korowa)
  • Writing floating point values containing NaN to Parquet is slow when using dictionary encoding #6952 [parquet] [arrow]
  • Public API using private types: Buffer::from_bytes takes unexported Bytes #6754 [parquet] [arrow] [arrow-flight]
  • Some MSRVs are inaccurate #6741 [parquet] [arrow] [arrow-flight]

53.4.0

27 Jan 12:08
d3fcb4b

Changelog

53.4.0 (2025-01-14)

Full Changelog

Merged pull requests:

  • fix clippy (#6791) (#6940)
  • fix: decimal conversion loses value on lower precision (#6836) (#6936)
  • perf: Use Cow in get_format_string in FFI_ArrowSchema (#6853) (#6937)
  • fix: Encoding of List offsets was incorrect when slice offsets begin …
  • [arrow-cast] Support cast numeric to string view (alternate) (#6816) (#…
  • Enable matching temporal as from_type to Utf8View (#6872) (#6956)
  • [arrow-cast] Support cast boolean from/to string view (#6822) (#6957)
  • [53.0.0_maintenance] Fix CI (#6964)
  • Add Array::shrink_to_fit(&mut self) to 53.4.0 (#6790) (#6817) (#6962)

Update version to 54.0.0, add CHANGELOG (#6894)

27 Jan 12:10
2887cc1

Changelog

54.0.0 (2024-12-18)

Full Changelog

Breaking changes:

Implemented enhancements:

  • Parquet schema hint doesn't support integer types upcasting #6891 [parquet]
  • Parquet UTF-8 max statistics are overly pessimistic #6867 [parquet]
  • Add builder support for Int8 keys #6844 [arrow]
  • Formalize the name of the nested Field in a list #6784 [parquet] [arrow] [arrow-flight]
  • Allow disabling the writing of Parquet Offset Index #6778 [parquet]
  • parquet::record::make_row is not exposed to users, leaving no option to users to manually create Row objects #6761 [parquet]
  • Avoid from_num_days_from_ce_opt calls in timestamp_s_to_datetime if we don't need #6746 [arrow]
  • Support Temporal -> Utf8View casting #6734 [arrow]
  • Add Option To Coerce List Type on Parquet Write #6733 [parquet] [arrow]
  • Support Numeric -> Utf8View casting #6714 [arrow]
  • Support Utf8View <=> boolean casting #6713 [arrow]

Fixed bugs:

  • Buffer::bit_slice loses length with byte-aligned offsets #6895 [arrow]
  • parquet arrow writer doesn't track memory size correctly for fixed sized lists #6839 [parquet]
  • Casting Decimal128 to Decimal128 with smaller precision produces incorrect results in some cases #6833 [arrow]
  • Should empty nullable dictionary be parsed as null from arrow-csv? #6821 [arrow]
  • Array take doesn't make fields nullable #6809
  • Arrow Flight Encodes a Slice's List Offsets If the slice offset starts with zero #6803 [arrow]
  • Parquet readers incorrectly interpret legacy nested lists #6756 [parquet]
  • filter_bits under-allocates resulting boolean buffer #6750 [arrow]
  • Multi-language support issues with Arrow FlightSQL client's execute_update and execute_ingest methods #6545 [arrow] [arrow-flight]
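
The Decimal128 precision item above (#6833) is about values that no longer fit after a cast to a smaller precision. As a minimal sketch of the underlying check, assuming the usual rule that an unscaled decimal value fits precision p when its magnitude is below 10^p, the code below tests that bound with plain `i128` arithmetic. `fits_precision` is a hypothetical helper for this example and is not the arrow-rs implementation.

```rust
// Sketch only: a hypothetical helper, not the arrow-cast implementation.

/// Returns true if the unscaled `value` has at most `precision` decimal digits.
/// `precision` must be at most 38, the Decimal128 maximum, so that
/// 10^precision fits in an i128.
fn fits_precision(value: i128, precision: u32) -> bool {
    assert!(precision <= 38, "precision out of range for this sketch");
    let bound = 10_i128.pow(precision);
    value > -bound && value < bound
}

fn main() {
    println!("{}", fits_precision(999, 3));
    println!("{}", fits_precision(1000, 3));
    println!("{}", fits_precision(-1000, 3));
}
```

An off-by-one in a bound like this (for example `<=` instead of `<`) is enough to accept a value with one digit too many, which is the kind of boundary case the bug titles describe.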

Prepare for 53.3.0 release (#6739)

27 Jan 12:09
f5b51ff

Changelog

53.3.0 (2024-11-17)

Full Changelog

Implemented enhancements:

  • PartialEq of GenericByteViewArray (StringViewArray / ByteViewArray) that compares on equality rather than logical value #6679 [arrow]
  • Need a mechanism to handle schema changes due to dictionary hydration in FlightSQL server implementations #6672 [arrow] [arrow-flight]
  • Support encoding Utf8View columns to JSON #6642 [arrow]
  • Implement append_n for BooleanBuilder #6634 [arrow]
  • Some take optimizations #6621 [arrow]
  • Error Instead of Panic On Attempting to Write More Than 32769 Row Groups #6591 [parquet]
  • Make casting from a timestamp without timezone to a timestamp with timezone configurable #6555
  • Add record_batch! macro for easy record batch creation #6553 [arrow]
  • Support Binary --> Utf8View casting #6531 [arrow]
  • downcast_primitive_array and downcast_dictionary_array are not hygienic wrt imports #6400 [arrow]
  • Implement interleave_record_batch #6731 [arrow] (waynexia)
  • feat: record_batch! macro #6588 [arrow] (ByteBaker)

Fixed bugs:

  • Signed decimal e-notation parsing bug #6728 [arrow]
  • Add support for Utf8View -> numeric in can_cast_types #6715
  • IPC file writer produces incorrect footer when not preserving dict ID #6710 [arrow]
  • parquet from_thrift_helper incorrectly checks index #6693 [parquet]
  • Primitive REPEATED fields not contained in LIST annotated groups aren't read as lists by record reader #6648 [parquet]
  • DictionaryHandling does not recurse into Map fields #6644 [arrow] [arrow-flight]
  • Array writer output empty when no record is written #6613 [arrow]
  • Archery Integration Test with c# failing on main #6577 [arrow]
  • Potential unsoundness in filter_run_end_array #6569 [arrow]
  • Parquet reader can generate incorrect validity buffer information for nested structures #6510 [parquet]
  • arrow-array ffi: FFI_ArrowArray.null_count is always interpreted as unsigned and initialized during conversion from C to Rust. #6497 [arrow]

Closed issues:

  • Incorrect like results for pattern starting/ending with % percent and containing escape characters #6702 [arrow]

Prepare for 53.2.0 release (#6603)

27 Jan 12:09
10c4059

Changelog

53.2.0 (2024-10-21)

Full Changelog

Implemented enhancements:

  • Implement arrow_json encoder for Decimal128 & Decimal256 DataTypes #6605 [arrow]
  • Support DataType::FixedSizeList in make_builder within struct_builder.rs #6594 [arrow]
  • Support DataType::Dictionary in make_builder within struct_builder.rs #6589 [arrow]
  • Interval parsing from string - accept "mon" and "mons" token #6548 [arrow]
  • AsyncArrowWriter API to get the total size of a written parquet file #6530 [parquet]
  • append_many for Dictionary builders #6529 [arrow]
  • Missing tonic GRPC_STATUS with tonic 0.12.1 #6515 [arrow] [arrow-flight]
  • Add example of how to use parquet metadata reader APIs for a local cache #6504 [parquet]
  • Remove reliance on raw-entry feature of Hashbrown #6498 [parquet] [arrow] [arrow-flight]
  • Improve page index metadata loading in SerializedFileReader::new_with_options #6491 [parquet]
  • Release arrow-rs / parquet minor version 53.1.0 (October 2024) #6340 [arrow]

Prepare for 53.1.0 release (CHANGELOG and version) (#6501)

27 Jan 12:09
065c7b8

Changelog

53.1.0 (2024-10-02)

Full Changelog

Implemented enhancements:

  • Write null counts in Parquet statistics when they are known to be zero #6502 [parquet]
  • Make it easier to find / work with ByteView #6478 [arrow]
  • Update lexical-core version due to soundness issues with current version #6468
  • Add builder style API for manipulating ParquetMetaData #6465 [parquet]
  • ArrayData.align_buffers should support Struct data type / child data #6461 [arrow]
  • Add a method to return the number of skipped rows in a RowSelection #6428 [parquet]
  • Bump lexical-core to 1.0 #6397 [arrow]
  • Add union_extract kernel #6386 [arrow]
  • implement regexp_is_match_utf8 and regexp_is_match_utf8_scalar for StringViewArray #6370 [arrow]
  • Add support for BinaryView in arrow_string::length #6358 [arrow]
  • Add as_union to AsArray #6351
  • Ability to append non contiguous strings to StringBuilder #6347 [arrow]
  • Add Catalog DB Schema subcommands to flight_sql_client #6331 [arrow] [arrow-flight]
  • Add support for Utf8View in arrow_string::length #6305 [arrow]
  • Reading FIXED_LEN_BYTE_ARRAY columns with nulls is inefficient #6296 [parquet]
  • Optionally verify 32-bit CRC checksum when decoding parquet pages #6289 [parquet]
  • Speed up pad_nulls for FixedLenByteArrayBuffer #6297 [parquet] (etseidl)
  • Improve performance of set_bits by avoiding to set individual bits #6288 [arrow] (kazuyukitanimura)

Fixed bugs:

  • BitIterator panics when retrieving length #6480 [arrow]
  • Flight data retrieved via Python client (wrapping C++) cannot be used by Rust Arrow #6471 [arrow]
  • CI integration test failing: Archery test With other arrows #6448 [parquet] [arrow] [arrow-flight]
  • IPC not respecting not preserving dict ID #6443 [parquet] [arrow] [arrow-flight]
  • Failing CI: Prost requires Rust 1.71.1 #6436 [arrow] [arrow-flight]
  • Invalid struct arrays in IPC data causes panic during read #6416 [arrow]
  • REE Dicts cannot be encoded/decoded with streaming IPC #6398 [arrow]
  • Reading json map with non-nullable value schema doesn't error if values are actually null #6391
  • StringViewBuilder with deduplication does not clear observed values #6384 [arrow]
  • Cast from Decimal(p, s) to dictionary-encoded Decimal(p, s) loses precision and scale #6381 [arrow]
  • LocalFileSystem list operation returns objects in wrong order #6375
  • compute::binary_mut returns Err(PrimitiveArray<T>) only with certain arrays #6374 [arrow]
  • Exporting Binary/Utf8View from arrow-rs to pyarrow fails #6366 [arrow]
  • warning: methods as_any and next_batch are never used in parquet crate #6143 [parquet]

Closed issues:

  • Columnar json writer for arrow-json #6411
  • Primitive binary/unary are not as fast as they could be #6364 [arrow]
  • Different numeric type may be able to compare #6357

Merged pull requests:

  • fix: override size_hint for BitIterator to return the exact remaining size #6495 [arrow] (Beihao-Zhou)
  • Minor: Fix path in format command in CONTRIBUTING.md #6494 (etseidl)
  • Write null counts in Parquet statistics when they are known [#6490](htt...