bug fix #2008 unsloth issue - load_in_4bit = True + fast_inference = True #79

Merged: 3 commits into unslothai:nightly on Mar 16, 2025

Conversation

@void-mckenzie (Contributor) commented Mar 16, 2025

This PR addresses the unsloth issue: https://github.com/unslothai/unsloth/issues/2008

While the previous release fixes the vLLM component of unslothai/unsloth#2008, the process still errors out for custom models because the on-the-fly bnb_config is not passed to the convert_vllm_to_huggingface method in unsloth_zoo's vllm_utils.py.

This PR modifies vllm_utils.py so that the on-the-fly generated bnb_config is accepted and passed on to the convert_vllm_to_huggingface method, where it is parsed for quantization configs.

I have chosen not to bundle it with the model config, since custom models might also ship their own bnb configs if they are already 4-bit quantized. Hence the if and elif for parsing the quantization_config and the generated bnb_config.
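
For illustration, a rough sketch of that precedence is below. This is a hypothetical sketch, not the merged unsloth_zoo code: resolve_quant_source is an invented helper, and the bnb_4bit_* key names are assumed to match the BitsAndBytes kwargs printed in the vLLM logs further down.

import torch

# Hypothetical sketch only -- not the actual unsloth_zoo implementation.
def resolve_quant_source(config, bnb_config, dtype):
    """Pick which quantization config (if any) to parse and derive compute_dtype."""
    quantization_config = getattr(config, "quantization_config", {})
    if quantization_config != {}:
        # The custom model is already 4-bit quantized and ships its own bnb config.
        source = quantization_config
    elif bnb_config is not None:
        # Otherwise fall back to the on-the-fly config generated for
        # load_in_4bit = True + fast_inference = True.
        source = bnb_config
    else:
        # Non-quantized path: nothing to parse, keep the requested dtype.
        return dtype, {}
    if hasattr(source, "to_dict"):
        source = source.to_dict()  # e.g. a transformers BitsAndBytesConfig
    compute_dtype = source.get("bnb_4bit_compute_dtype", dtype)
    if isinstance(compute_dtype, str):
        compute_dtype = getattr(torch, compute_dtype)  # "bfloat16" -> torch.bfloat16
    # Extra kwargs that would be forwarded when building bitsandbytes Linear4bit layers.
    kwargs = {
        "quant_type": source.get("bnb_4bit_quant_type", "fp4"),
        "compress_statistics": source.get("bnb_4bit_use_double_quant", False),
    }
    return compute_dtype, kwargs

On the non-quantized path this always returns the requested dtype, which is the same concern raised in the review thread below.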

NOTE: This PR needs to be merged along with unslothai/unsloth#2039 in unsloth, where llama.py is edited to handle this additional configuration.
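
For context, the call site visible in the traceback below would change roughly as follows. This is only a sketch based on the bnb_config parameter already present in convert_vllm_to_huggingface's signature, not the exact diff from unslothai/unsloth#2039:

# Before (the call in unsloth's llama.py that fails in the traceback below):
model = convert_vllm_to_huggingface(quant_state_dict, model_config, dtype)

# After (sketch): also forward the on-the-fly generated bnb_config
model = convert_vllm_to_huggingface(quant_state_dict, model_config, dtype, bnb_config)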

Code Snippet:

from unsloth import FastLanguageModel
import torch
from dotenv import load_dotenv
import os

load_dotenv()
max_seq_length = 1024 # Can increase for longer reasoning traces
lora_rank = 32 # Larger rank = smarter, but slower

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "void-mckenzie/krikri-sft_compound_instruct",
    max_seq_length = max_seq_length,
    load_in_4bit = True, # False for LoRA 16bit
    fast_inference = True, # Enable vLLM fast inference
    max_lora_rank = lora_rank,
    gpu_memory_utilization = 0.8, # Reduce if out of memory
)

Output before fix:

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
INFO 03-16 00:00:28 __init__.py:207] Automatically detected platform cuda.
==((====))==  Unsloth 2025.3.14: Fast Llama patching. Transformers: 4.49.0. vLLM: 0.7.3.
   \\   /|    NVIDIA GeForce RTX 4090. Num GPUs = 1. Max memory: 23.621 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 8.9. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Unsloth: vLLM loading void-mckenzie/krikri-sft_compound_instruct with actual GPU utilization = 74.61%
Unsloth: Your GPU has CUDA compute capability 8.9 with VRAM = 23.62 GB.
Unsloth: Using conservativeness = 1.0. Chunked prefill tokens = 1024. Num Sequences = 160.
Unsloth: vLLM's KV Cache can use up to 2.19 GB. Also swap space = 4 GB.
INFO 03-16 00:00:34 config.py:549] This model supports multiple tasks: {'generate', 'reward', 'score', 'classify', 'embed'}. Defaulting to 'generate'.
Unsloth: vLLM Bitsandbytes config using kwargs = {'load_in_8bit': False, 'load_in_4bit': True, 'bnb_4bit_compute_dtype': 'bfloat16', 'bnb_4bit_quant_storage': 'uint8', 'bnb_4bit_quant_type': 'fp4', 'bnb_4bit_use_double_quant': False, 'llm_int8_enable_fp32_cpu_offload': False, 'llm_int8_has_fp16_weight': False, 'llm_int8_skip_modules': [], 'llm_int8_threshold': 6.0}
INFO 03-16 00:00:34 llm_engine.py:234] Initializing a V0 LLM engine (v0.7.3) with config: model='void-mckenzie/krikri-sft_compound_instruct', speculative_config=None, tokenizer='void-mckenzie/krikri-sft_compound_instruct', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=1024, download_dir=None, load_format=LoadFormat.BITSANDBYTES, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=bitsandbytes, enforce_eager=False, kv_cache_dtype=auto,  device_config=cuda:0, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=void-mckenzie/krikri-sft_compound_instruct, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=True, chunked_prefill_enabled=False, use_async_output_proc=True, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={"level":0,"splitting_ops":[],"compile_sizes":[],"cudagraph_capture_sizes":[160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"max_capture_size":160}, use_cached_outputs=False, 
INFO 03-16 00:00:38 cuda.py:229] Using Flash Attention backend.
[W316 00:00:38.381537842 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
INFO 03-16 00:00:38 model_runner.py:1110] Starting to load model void-mckenzie/krikri-sft_compound_instruct...
INFO 03-16 00:00:38 loader.py:1089] Loading weights with BitsAndBytes quantization.  May take a while ...
INFO 03-16 00:00:38 weight_utils.py:254] Using model weights format ['*.safetensors']
Loading safetensors checkpoint shards:   0% Completed | 0/4 [00:00<?, ?it/s]

INFO 03-16 00:00:42 model_runner.py:1115] Loading model weights took 5.6977 GB
INFO 03-16 00:00:42 punica_selector.py:18] Using PunicaWrapperGPU.
INFO 03-16 00:00:43 worker.py:267] Memory profiling takes 0.77 seconds
INFO 03-16 00:00:43 worker.py:267] the current vLLM instance can use total_gpu_memory (23.62GiB) x gpu_memory_utilization (0.75) = 17.62GiB
INFO 03-16 00:00:43 worker.py:267] model weights take 5.70GiB; non_torch_memory takes 0.09GiB; PyTorch activation peak memory takes 0.87GiB; the rest of the memory reserved for KV Cache is 10.97GiB.
INFO 03-16 00:00:43 executor_base.py:111] # cuda blocks: 5617, # CPU blocks: 2048
INFO 03-16 00:00:43 executor_base.py:116] Maximum concurrency for 1024 tokens per request: 87.77x
INFO 03-16 00:00:45 model_runner.py:1434] Capturing cudagraphs for decoding. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI. If out-of-memory error occurs during cudagraph capture, consider decreasing `gpu_memory_utilization` or switching to eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage.
Capturing CUDA graph shapes: 100%|██████████| 23/23 [00:07<00:00,  2.98it/s]INFO 03-16 00:00:52 model_runner.py:1562] Graph capturing finished in 8 secs, took 0.58 GiB
INFO 03-16 00:00:52 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 10.52 seconds

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[1], line 10
      7 max_seq_length = 1024 # Can increase for longer reasoning traces
      8 lora_rank = 32 # Larger rank = smarter, but slower
---> 10 model, tokenizer = FastLanguageModel.from_pretrained(
     11     model_name = "void-mckenzie/krikri-sft_compound_instruct",
     12     max_seq_length = max_seq_length,
     13     load_in_4bit = True, # False for LoRA 16bit
     14     fast_inference = True, # Enable vLLM fast inference
     15     max_lora_rank = lora_rank,
     16     gpu_memory_utilization = 0.8, # Reduce if out of memory
     17     token='hf_',
     18 )

File ~/anaconda3/envs/llm/lib/python3.12/site-packages/unsloth/models/loader.py:351, in FastLanguageModel.from_pretrained(model_name, max_seq_length, dtype, load_in_4bit, load_in_8bit, full_finetuning, token, device_map, rope_scaling, fix_tokenizer, trust_remote_code, use_gradient_checkpointing, resize_model_vocab, revision, use_exact_model_name, fast_inference, gpu_memory_utilization, float8_kv_cache, random_state, max_lora_rank, disable_log_stats, *args, **kwargs)
    348     pass
    349 pass
--> 351 model, tokenizer = dispatch_model.from_pretrained(
    352     model_name        = model_name,
    353     max_seq_length    = max_seq_length,
    354     dtype             = _get_dtype(dtype),
    355     load_in_4bit      = load_in_4bit,
    356     token             = token,
    357     device_map        = device_map,
    358     rope_scaling      = rope_scaling,
    359     fix_tokenizer     = fix_tokenizer,
    360     model_patcher     = dispatch_model,
    361     tokenizer_name    = tokenizer_name,
    362     trust_remote_code = trust_remote_code,
    363     revision          = revision if not is_peft else None,
    364 
    365     fast_inference    = fast_inference,
    366     gpu_memory_utilization = gpu_memory_utilization,
    367     float8_kv_cache   = float8_kv_cache,
    368     random_state      = random_state,
    369     max_lora_rank     = max_lora_rank,
    370     disable_log_stats = disable_log_stats,
    371     *args, **kwargs,
    372 )
    374 if resize_model_vocab is not None:
    375     model.resize_token_embeddings(resize_model_vocab)

File ~/anaconda3/envs/llm/lib/python3.12/site-packages/unsloth/models/llama.py:1825, in FastLlamaModel.from_pretrained(model_name, max_seq_length, dtype, load_in_4bit, token, device_map, rope_scaling, fix_tokenizer, model_patcher, tokenizer_name, trust_remote_code, fast_inference, gpu_memory_utilization, float8_kv_cache, random_state, max_lora_rank, disable_log_stats, **kwargs)
   1823 # Convert to HF format
   1824 _, quant_state_dict = get_vllm_state_dict(llm, config = model_config)
-> 1825 model = convert_vllm_to_huggingface(quant_state_dict, model_config, dtype)
   1826 model.vllm_engine = llm
   1827 model.fast_generate = model.vllm_engine.generate

File ~/anaconda3/envs/llm/lib/python3.12/site-packages/torch/utils/_contextlib.py:116, in context_decorator.<locals>.decorate_context(*args, **kwargs)
    113 @functools.wraps(func)
    114 def decorate_context(*args, **kwargs):
    115     with ctx_factory():
--> 116         return func(*args, **kwargs)

File ~/anaconda3/envs/llm/lib/python3.12/site-packages/unsloth_zoo/vllm_utils.py:614, in convert_vllm_to_huggingface(quant_state_dict, config, dtype, bnb_config)
    612 quant_state = quant_state_dict[f"{layer_name}.weight.quant_state"]
    613 n_layers = config.num_hidden_layers
--> 614 layer = Linear4bit(0, 0, device = "cuda:0", bias = has_bias, compute_dtype = compute_dtype, **kwargs)
    615 layer.in_features  = quant_state.shape[1]
    616 layer.out_features = quant_state.shape[0]

UnboundLocalError: cannot access local variable 'compute_dtype' where it is not associated with a value

Output after fix:

🦥 Unsloth: Will patch your computer to enable 2x faster free finetuning.
🦥 Unsloth Zoo will now patch everything to make training faster!
INFO 03-16 00:08:17 __init__.py:207] Automatically detected platform cuda.
==((====))==  Unsloth 2025.3.14: Fast Llama patching. Transformers: 4.49.0. vLLM: 0.7.3.
   \\   /|    NVIDIA GeForce RTX 4090. Num GPUs = 1. Max memory: 23.621 GB. Platform: Linux.
O^O/ \_/ \    Torch: 2.5.1+cu124. CUDA: 8.9. CUDA Toolkit: 12.4. Triton: 3.1.0
\        /    Bfloat16 = TRUE. FA [Xformers = 0.0.28.post3. FA2 = False]
 "-____-"     Free license: http://github.com/unslothai/unsloth
Unsloth: Fast downloading is enabled - ignore downloading bars which are red colored!
Unsloth: vLLM loading void-mckenzie/krikri-sft_compound_instruct with actual GPU utilization = 75.45%
Unsloth: Your GPU has CUDA compute capability 8.9 with VRAM = 23.62 GB.
Unsloth: Using conservativeness = 1.0. Chunked prefill tokens = 1024. Num Sequences = 160.
Unsloth: vLLM's KV Cache can use up to 2.39 GB. Also swap space = 4 GB.
INFO 03-16 00:08:22 config.py:549] This model supports multiple tasks: {'score', 'classify', 'reward', 'generate', 'embed'}. Defaulting to 'generate'.
Unsloth: vLLM Bitsandbytes config using kwargs = {'load_in_8bit': False, 'load_in_4bit': True, 'bnb_4bit_compute_dtype': 'bfloat16', 'bnb_4bit_quant_storage': 'uint8', 'bnb_4bit_quant_type': 'fp4', 'bnb_4bit_use_double_quant': False, 'llm_int8_enable_fp32_cpu_offload': False, 'llm_int8_has_fp16_weight': False, 'llm_int8_skip_modules': [], 'llm_int8_threshold': 6.0}
INFO 03-16 00:08:22 llm_engine.py:234] Initializing a V0 LLM engine (v0.7.3) with config: model='void-mckenzie/krikri-sft_compound_instruct', speculative_config=None, tokenizer='void-mckenzie/krikri-sft_compound_instruct', skip_tokenizer_init=False, tokenizer_mode=auto, revision=None, override_neuron_config=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.bfloat16, max_seq_len=1024, download_dir=None, load_format=LoadFormat.BITSANDBYTES, tensor_parallel_size=1, pipeline_parallel_size=1, disable_custom_all_reduce=False, quantization=bitsandbytes, enforce_eager=False, kv_cache_dtype=auto,  device_config=cuda:0, decoding_config=DecodingConfig(guided_decoding_backend='xgrammar'), observability_config=ObservabilityConfig(otlp_traces_endpoint=None, collect_model_forward_time=False, collect_model_execute_time=False), seed=0, served_model_name=void-mckenzie/krikri-sft_compound_instruct, num_scheduler_steps=1, multi_step_stream_outputs=True, enable_prefix_caching=True, chunked_prefill_enabled=False, use_async_output_proc=True, disable_mm_preprocessor_cache=False, mm_processor_kwargs=None, pooler_config=None, compilation_config={"level":0,"splitting_ops":[],"compile_sizes":[],"cudagraph_capture_sizes":[160,152,144,136,128,120,112,104,96,88,80,72,64,56,48,40,32,24,16,8,4,2,1],"max_capture_size":160}, use_cached_outputs=False, 
INFO 03-16 00:08:23 cuda.py:229] Using Flash Attention backend.
[W316 00:08:23.516938937 CUDAAllocatorConfig.h:28] Warning: expandable_segments not supported on this platform (function operator())
INFO 03-16 00:08:23 model_runner.py:1110] Starting to load model void-mckenzie/krikri-sft_compound_instruct...
INFO 03-16 00:08:23 loader.py:1089] Loading weights with BitsAndBytes quantization.  May take a while ...
INFO 03-16 00:08:24 weight_utils.py:254] Using model weights format ['*.safetensors']
Loading safetensors checkpoint shards:   0% Completed | 0/4 [00:00<?, ?it/s]

INFO 03-16 00:08:26 model_runner.py:1115] Loading model weights took 5.6977 GB
INFO 03-16 00:08:26 punica_selector.py:18] Using PunicaWrapperGPU.
INFO 03-16 00:08:27 worker.py:267] Memory profiling takes 0.77 seconds
INFO 03-16 00:08:27 worker.py:267] the current vLLM instance can use total_gpu_memory (23.62GiB) x gpu_memory_utilization (0.75) = 17.82GiB
INFO 03-16 00:08:27 worker.py:267] model weights take 5.70GiB; non_torch_memory takes 0.08GiB; PyTorch activation peak memory takes 0.87GiB; the rest of the memory reserved for KV Cache is 11.18GiB.
INFO 03-16 00:08:27 executor_base.py:111] # cuda blocks: 5723, # CPU blocks: 2048
INFO 03-16 00:08:27 executor_base.py:116] Maximum concurrency for 1024 tokens per request: 89.42x
INFO 03-16 00:08:29 model_runner.py:1434] Capturing cudagraphs for decoding. This may lead to unexpected consequences if the model is not static. To run the model in eager mode, set 'enforce_eager=True' or use '--enforce-eager' in the CLI. If out-of-memory error occurs during cudagraph capture, consider decreasing `gpu_memory_utilization` or switching to eager mode. You can also reduce the `max_num_seqs` as needed to decrease memory usage.
Capturing CUDA graph shapes: 100%|██████████| 23/23 [00:08<00:00,  2.68it/s]INFO 03-16 00:08:38 model_runner.py:1562] Graph capturing finished in 9 secs, took 0.59 GiB
INFO 03-16 00:08:38 llm_engine.py:436] init engine (profile, create kv cache, warmup model) took 11.35 seconds

Review thread on unsloth_zoo/vllm_utils.py:

    # All Unsloth Zoo code licensed under LGPLv3
    # Unmerges vLLM modules to create HF compatible model
    config.update({"torch_dtype" : dtype}) # Do not use config file's dtype!
    new_model = create_empty_causal_lm(config, dtype)
    quantization_config = getattr(config, "quantization_config", {})
    kwargs = dict()
-   if quantization_config != {}:
+   if quantization_config != {} or bnb_config:
@Datta0:

What if there's no quant involved and we're doing BF16 LoRA or something? compute_dtype will still not be declared... might want to check that?

@void-mckenzie (Contributor, Author):

@Datta0 This doesn't change the flow for non-quantized cases. It only comes into effect when loading custom models that are BF16 on Hugging Face but the user sets load_in_4bit = True and fast_inference = True.

@Datta0:

This doesn't change the flow for non-quant cases, but that is where the problem lies: compute_dtype would be an undeclared variable, only to be referenced a few lines later. I'm saying maybe we should set compute_dtype outside the if quantization ... so that it is always available.

@void-mckenzie (Contributor, Author):

I see what you mean, yes. Regardless of the type of quantization config, compute_dtype is always set to the same dtype value passed to this method. I'll update this accordingly. Thanks for the catch!

@void-mckenzie (Contributor, Author):

Done @Datta0!

@Datta0:

LGTM! Thanks for the changes :)

@shimmyshimmer (Contributor):

Thank you for the PR, we will review it.


@enochlev:

Confirmed working for me.

@enochlev:

Maybe replace

    pass

with

    else:
        compute_dtype = dtype
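
A tiny self-contained illustration of why that else branch matters is below. This is a hypothetical stand-in, not unsloth code: if compute_dtype is only assigned inside the quantization branches, the non-quantized path reproduces the UnboundLocalError shown in the traceback above.

import torch

# Hypothetical stand-in for the branch structure discussed above; not unsloth code.
def pick_compute_dtype(quantization_config, bnb_config, dtype):
    if quantization_config:
        # Model ships its own quantization config (already 4-bit on the Hub).
        compute_dtype = quantization_config.get("bnb_4bit_compute_dtype", dtype)
    elif bnb_config:
        # On-the-fly config from load_in_4bit = True + fast_inference = True.
        compute_dtype = bnb_config.get("bnb_4bit_compute_dtype", dtype)
    else:
        # The suggested else: always bind compute_dtype, even for BF16 LoRA.
        compute_dtype = dtype
    return compute_dtype

# Non-quantized BF16 case: no quant config at all, yet no UnboundLocalError.
print(pick_compute_dtype({}, None, torch.bfloat16))  # -> torch.bfloat16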

@danielhanchen changed the base branch from main to nightly on March 16, 2025, 22:13
@danielhanchen (Contributor):

Thanks and appreciate it! I'll add this to nightly and push out a mini release later today!

@danielhanchen merged commit 4c72e79 into unslothai:nightly on Mar 16, 2025
danielhanchen added a commit that referenced this pull request Mar 18, 2025
danielhanchen added a commit that referenced this pull request Mar 19, 2025
danielhanchen added a commit that referenced this pull request Mar 22, 2025
danielhanchen added a commit that referenced this pull request Mar 26, 2025