
v0.16.0

@LucasDedieu released this on 27 Mar 10:34

Changelog

Added

  • Hyperparameter Tuning for EDS-NLP: introduced a new edsnlp.tune script for hyperparameter tuning with Optuna. It lets users optimize model parameters efficiently with single-phase or two-phase tuning strategies, and includes support for parameter importance analysis, visualization, pruning, and automatic handling of GPU time budgets (a minimal illustrative sketch follows this list).
  • Provided a detailed tutorial on hyperparameter tuning, covering usage scenarios and configuration options.
  • ScheduledOptimizer (e.g., @core: "optimizer") now supports importing optimizers by their qualified name (e.g., optim: "torch.optim.Adam"); see the second sketch after this list.
  • eds.ner_crf now computes a confidence score on predicted spans.
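
For readers new to the two-phase strategy, here is a minimal sketch using plain Optuna rather than the edsnlp.tune CLI. The objective function, parameter names, and bounds below are placeholders, not the script's actual internals:

```python
import optuna


def objective(trial: optuna.Trial) -> float:
    # Hypothetical hyperparameters; edsnlp.tune reads these from a config file.
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    dropout = trial.suggest_float("dropout", 0.0, 0.5)
    score = 0.0
    for step in range(5):  # stand-in for training epochs
        score = (lr * 100 - 0.5) ** 2 + dropout / (step + 1)
        trial.report(score, step)  # lets the pruner cut bad trials early
        if trial.should_prune():
            raise optuna.TrialPruned()
    return score


# Phase 1: explore every parameter, with pruning enabled.
study = optuna.create_study(
    direction="minimize", pruner=optuna.pruners.MedianPruner()
)
study.optimize(objective, n_trials=20)

# Parameter importance analysis guides what to freeze for phase 2.
print(optuna.importance.get_param_importances(study))

# Phase 2: freeze the less influential parameters at their phase-1 best
# values and re-tune only the rest.
frozen_dropout = study.best_params["dropout"]


def objective_phase2(trial: optuna.Trial) -> float:
    lr = trial.suggest_float("lr", 1e-5, 1e-2, log=True)
    return (lr * 100 - 0.5) ** 2 + frozen_dropout


study2 = optuna.create_study(direction="minimize")
study2.optimize(objective_phase2, n_trials=20)
print(study2.best_params)
```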
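And a hedged sketch of the new qualified-name option for ScheduledOptimizer. Only the optim string comes from the note above; the import path and the module/groups arguments are assumptions based on typical usage:

```python
import torch
from edsnlp.training import ScheduledOptimizer  # import path assumed

model = torch.nn.Linear(8, 2)  # stand-in for a trainable pipeline component

optim = ScheduledOptimizer(
    optim="torch.optim.Adam",     # qualified name, resolved by import
    module=model,                 # `module` / `groups` shapes are assumptions
    groups={".*": {"lr": 1e-3}},
)
```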

Changed

  • The loss of eds.ner_crf is now computed as the mean over the words instead of the sum. This change is compatible with multi-GPU training (see the toy example after this list).
  • Having multiple stats keys matching a batching pattern now warns instead of raising an error.
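
As a toy illustration of why the mean matters (this is not the actual eds.ner_crf internals): with a sum, the loss scale grows with the amount of text each GPU happens to receive, so gradient magnitudes diverge across replicas; a mean keeps them comparable.

```python
import torch

token_losses = torch.tensor([0.2, 0.7, 0.1, 0.4])  # one loss value per word

loss_sum = token_losses.sum()    # old behavior: scales with words per batch
loss_mean = token_losses.mean()  # new behavior: stable across batch sizes
print(loss_sum.item(), loss_mean.item())  # 1.4 vs 0.35
```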

Fixed

  • Support packaging with Poetry 2.0.
  • Solved pickling issues with multiprocessing when PyTorch is installed.
  • Allow deep attributes like a.b.c for span_attributes in the Standoff and OMOP doc2dict converters (see the first sketch after this list).
  • Fixed various aspects of stream shuffling (see the second sketch after this list):
    • Ensure the Parquet reader shuffles the data when shuffle=True
    • Ensure we don't overwrite the RNG of the data reader when calling stream.shuffle() with no seed
    • Raise an error if the batch size in stream.shuffle(batch_size=...) is not compatible with the stream
  • eds.split now keeps doc and span attributes in the sub-documents.
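
A hedged sketch of the deep-attribute fix, assuming the usual converter keyword arguments; the a.b.c path mirrors the changelog entry and stands in for a real nested extension:

```python
import edsnlp

nlp = edsnlp.blank("eds")
doc = nlp("Patient admitted with fever.")

# Nested paths like "a.b.c" are now accepted by the doc2dict converters;
# "a.b.c" is the changelog's placeholder, not a real extension.
rows = edsnlp.data.to_pandas(
    [doc],
    converter="omop",
    span_attributes=["a.b.c"],
)
```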
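And a sketch of the shuffling fixes in use; the file path and converter are placeholders, and the batch-size spelling is an assumption about the batching syntax:

```python
import edsnlp

# The Parquet reader now actually shuffles when asked to.
stream = edsnlp.data.read_parquet(
    "data/notes.parquet",  # placeholder path
    converter="omop",
    shuffle=True,
)

# Calling shuffle() without a seed no longer overwrites the reader's RNG,
# and an incompatible batch size raises instead of failing silently.
stream = stream.shuffle(batch_size="32 docs")  # batch-size syntax assumed
```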

Pull Requests

  • fix: support packaging with poetry 2.0 by @percevalw in #362
  • Solve pickling issues with multiprocessing when pytorch is installed by @percevalw in #367
  • Feat: add hyperparameters tuning by @LucasDedieu in #361
  • Fix issue 368: Add metric parameter and write optimal config.yml at the end of tuning. by @LucasDedieu in #369
  • Fix issue 370: two-phase tuning now writes phase 1 frozen best values into phase 2 results_summary.txt by @LucasDedieu in #371
  • fix: allow deep attributes in Standoff and OMOP doc2dict converters by @percevalw in #381
  • fix: improve various aspects of stream shuffling by @percevalw in #380
  • fix: eds.split now keeps doc and span attributes in the sub-documents by @percevalw in #363
  • feat: allow importing optims using qualified names in ScheduledOptimizer by @percevalw in #383
  • feat: compute eds.ner_crf loss as mean over words by @percevalw in #384
  • Fix issue 372: resulting tuning config file now preserves comments by @LucasDedieu in #373
  • Feat: add checkpoint management for tuning by @LucasDedieu in #385
  • feat: add ner confidence score by @LucasDedieu in #387
  • chore: bump version to 0.16.0 by @LucasDedieu in #393

Full Changelog: v0.15.0...v0.16.0