v0.10.0

@percevalw percevalw released this 04 Dec 17:10
· 300 commits to master since this release

Changelog

Added

  • New unified edsnlp.data API (json, brat, spark, pandas) and LazyCollection object
    to efficiently read / write data from / to different formats & sources.
  • New unified processing API to select the execution backend via data.set_processing(...)
  • The training scripts can now use data from multiple concatenated adapters
  • Support for quantized transformers (also compatible with multiprocessing)

Changed

  • edsnlp.pipelines has been renamed to edsnlp.pipes, but the old name is still available for backward compatibility
  • Pipes (in edsnlp/pipes) are now lazily loaded, which should improve the loading time of the library.
  • to_disk methods can now return a config to override the initial config of the pipeline (e.g., to load a transformer directly from the path storing its fine-tuned weights)
  • The eds.tokenizer tokenizer has been added to entry points, making it accessible from the outside
  • Deprecate old connectors (e.g. BratDataConnector) in favor of the new edsnlp.data API
  • Deprecate old pipe wrapper in favor of the new processing API
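
The rename with a backward-compatible alias and the lazy loading of pipes can be illustrated with a small standalone sketch. This is not edsnlp's actual implementation: it only shows one common way to get this behaviour in Python, using a module-level __getattr__ (PEP 562) and a second entry in sys.modules. The names demo_pipes, demo_pipelines, _LAZY and make_lazy_module are hypothetical, and the registry points at standard-library functions purely for illustration.

```python
import importlib
import sys
import types

# Hypothetical registry: public name -> (module path, attribute name).
# Standard-library targets are used here only so the sketch is runnable.
_LAZY = {
    "dumps": ("json", "dumps"),
    "sqrt": ("math", "sqrt"),
}


def make_lazy_module(name, registry):
    """Build a module whose attributes are imported on first access."""
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Only called when normal attribute lookup on the module fails.
        try:
            target_module, target_attr = registry[attr]
        except KeyError:
            raise AttributeError(
                f"module {name!r} has no attribute {attr!r}"
            ) from None
        value = getattr(importlib.import_module(target_module), target_attr)
        # Cache the resolved object so later lookups skip __getattr__.
        setattr(module, attr, value)
        return value

    module.__getattr__ = __getattr__
    return module


# New name, plus the old name registered as an alias of the same module object.
pipes = make_lazy_module("demo_pipes", _LAZY)
sys.modules["demo_pipes"] = pipes
sys.modules["demo_pipelines"] = pipes
```

With this in place, `import demo_pipelines` returns the same module object as `import demo_pipes`, and nothing listed in the registry is resolved until it is first accessed.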

Fixed

  • Support for pydantic v2
  • Support for Python 3.11 (not CI-tested yet)

Full Changelog: v0.9.1...v0.10.0