[BugFix][Refactor] Modular Transformer Pipeline and Fix Gemini/Anthropic Empty Content Handling #6063
Conversation
While preparing this fix, I noticed we currently don't have explicit tests validating empty-string handling for the Gemini API. Would it be helpful if I added a dedicated test case directly in this PR, or should we handle the test separately in a follow-up PR/issue? Please let me know what you prefer—happy to help either way!
Yes, it would be good to include the test as part of this PR. Also, we can use the model family to check whether we need to make the whitespace adjustment, rather than doing this for all model APIs.
Another thing: instead of modifying the returned content, we should be modifying the messages that get sent to the model API. So we need to modify the `to_oai_type` function and add an option to use whitespace instead of an empty string for empty content when the model family is Gemini.
@ekzhu Based on your suggestion, I'll implement a closure-based approach.
However, I still see some merit in the original approach (modifying responses), as it keeps the changes minimal and simple. If you prefer simplicity and minimal impact on existing code over the extensibility of the closure approach, I'd be happy to revert to the original implementation. I'll proceed with the closure-based solution for now, but please let me know if you'd prefer the simpler response-handling option instead. Happy to accommodate your preference!

Here is sample code for the closure case:

```python
# Define the closure once at the top level
def create_content_transformer(model_family: ModelFamily):
    if model_family == ModelFamily.GEMINI:
        return lambda c: " " if not c else c
    else:
        return lambda c: c

# Usage at the API request call-site
content_transformer = create_content_transformer(model_family)
oai_messages = [
    to_oai_type(msg, content_transformer=content_transformer)
    for msg in messages
]

# Usage at token counting
token_count_messages = [
    to_oai_type(msg, content_transformer=content_transformer)
    for msg in token_count_messages
]

# In to_oai_type
def to_oai_type(
    message: LLMMessage,
    prepend_name: bool = False,
    content_transformer: Callable[[str], str] = lambda x: x,
) -> Sequence[ChatCompletionMessageParam]:
    if isinstance(message, SystemMessage):
        return [system_message_to_oai(message, content_transformer)]
    elif isinstance(message, UserMessage):
        return [user_message_to_oai(message, prepend_name, content_transformer)]
    elif isinstance(message, AssistantMessage):
        return [assistant_message_to_oai(message, content_transformer)]
    else:
        return tool_message_to_oai(message, content_transformer)
```
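If helpful, a minimal test along these lines could cover the empty-content case. This is only a sketch: it reuses `create_content_transformer` and `ModelFamily.GEMINI` from the sample above, which may not match the final API.

```python
def test_gemini_transformer_replaces_empty_content():
    # Gemini-family transformer should map falsy content to a single space
    transform = create_content_transformer(ModelFamily.GEMINI)
    assert transform("") == " "
    assert transform(None) == " "
    assert transform("hello") == "hello"


def test_non_gemini_transformer_is_identity():
    # Other families should pass content through unchanged
    transform = create_content_transformer("some-other-family")
    assert transform("") == ""
    assert transform("hello") == "hello"
```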
@SongChiYoung I think what you are doing is a useful step toward a more modular way to manage different message transformation logic for each model family, and your new approach is in the right direction. I would consider making it more modular and easier to maintain by having a global dictionary of separate transformation functions that go from a list of `LLMMessage` to the SDK-specific message format. We can then refactor the model client to use this dictionary, which will make it easier to solve other problems like #6034. As a next step in a different PR, we can address the parsing issue.
@ekzhu Thanks for your valuable feedback! Motivation:
Additionally, I'll place this global registry within a dedicated namespace. Based on your previous comment, it seems like we're on the same page regarding this structural direction, but I just wanted to confirm once more before proceeding. Let me know if you agree or have additional suggestions—once confirmed, I'll start the implementation!
I would keep this within the `openai` namespace for now.
@ekzhu Thanks again for your thoughtful feedback! Please check my new code changes.
Details
Before finalizing the change, I just wanted to briefly share why I initially designed the global transformer registry at the top-level namespace.
I fully respect your initial suggestion and I'm happy to move forward by placing it under the `openai` namespace. Could you please let me know your final thoughts on this? Thanks again for your time and patience!
Updated the PR title and description to better reflect the scope of changes. Let me know if anything else needs clarification!
@ekzhu As I mentioned in the recent issue I filed (#6083), Anthropic models also raise errors when given empty content — similar to the Gemini case. This applies to both the OpenAI-compatible and native Anthropic SDKs. To address this properly, I realized that the transformer logic needs to apply more broadly — not just under the OpenAI namespace. I'm truly sorry to repeat this suggestion again, as I know I previously proposed a similar idea. However, this additional context (supporting Anthropic and potentially other models) makes it clearer that placing the transformer registry under a top-level namespace would be beneficial. Please let me know what you think — and I really appreciate your patience reviewing this again.
Hey @SongChiYoung, thanks for the update. I understand the new approach is much more modular, with potential added benefits, but the PR is getting quite large, and it introduces a new architecture that we as maintainers may not be extremely familiar with. Consider this: you may stop contributing to the project after a few months, with very good reason, as you start working on other things that interest you. But we will still be left with code we may not understand very well. So, my suggestion is to make the changes incrementally and think about the minimal design needed to solve the problems related to empty content handling. Right now, I'd say beefing up the integration test cases for OpenAIChatCompletionClient is more important than creating a new architecture for message transformation. Currently, the tests only use OpenAI and Gemini models, but we hope to increase the modularity and expand to more providers.
Thank you @ekzhu — I really appreciate your thoughtful feedback, and your team’s dedication to maintaining the quality and long-term sustainability of AutoGen. Your suggestion makes perfect sense, and I’ll proceed as you advised:
In addition, since the Anthropic SDK is specific to Anthropic models only, I'll go ahead and apply the simple fix there as well — without needing any model check logic — and include that in this PR. Thanks again, and I'll update the PR shortly!
Thank you very much for your work!
@ekzhu
Let me know what you think — I appreciate your time and thoughtful guidance throughout this process!
I believe I've addressed all of your requested changes.
@ekzhu Done.
One potential issue identified during review: in the transformer registration loop, Claude model families appear to be assigned `__BASE_TRANSFORMER_MAP` rather than the Claude-specific `__CLAUDE_TRANSFORMER_MAP`. It might be on purpose, just pointing it out if it isn't.
@a-holm You're right: `__CLAUDE_TRANSFORMER_MAP` was defined separately with the `_set_empty_to_whitespace` fix (just like Gemini), but the registration loop mistakenly assigns `__BASE_TRANSFORMER_MAP` to Claude models. That wasn't intentional — it's a clear oversight on my part during the transformer registration cleanup. I'll fix the registration logic to correctly apply the Claude-specific map. Also planning to add a minimal test case to confirm the whitespace fallback behavior works properly when accessed via the OpenAI-compatible interface. Really appreciate you pointing this out before it shipped. Saved me from an embarrassing bug down the line!
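For clarity, the intended registration would look roughly like this (a self-contained sketch with placeholder names and simplified maps, not the exact code from the PR):

```python
# Placeholder transformer maps; the real ones map LLMMessage types to pipelines.
__BASE_TRANSFORMER_MAP = {"assistant": lambda content: content}
__CLAUDE_TRANSFORMER_MAP = {"assistant": lambda content: content or " "}  # _set_empty_to_whitespace-style fix

CLAUDE_FAMILIES = ["claude-3-5-sonnet", "claude-3-5-haiku"]  # placeholder family keys
MESSAGE_TRANSFORMERS: dict[str, dict] = {}

for family in CLAUDE_FAMILIES:
    # The bug was here: the loop assigned __BASE_TRANSFORMER_MAP to Claude families.
    MESSAGE_TRANSFORMERS[family] = __CLAUDE_TRANSFORMER_MAP
```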
I found two other issues while fixing it:
Codecov Report

Additional details and impacted files:
@@ Coverage Diff @@
## main #6063 +/- ##
==========================================
+ Coverage 76.98% 77.08% +0.10%
==========================================
Files 192 197 +5
Lines 13493 13636 +143
==========================================
+ Hits 10387 10511 +124
- Misses 3106 3125 +19
Great work @SongChiYoung. As next steps (separate PR for each):
@ekzhu
this issue (and also #6147 and #6145) seems addressable via the modular transformer design proposed in #6063.
Cool, I’ll get that in and open another PR shortly.
Haha, just to note, the model pointer for `claude-3-5-haiku` was actually already registered in this PR. The error was due to a different underlying issue in AutoGen, not the missing pointer. I had tracked this separately as part of an internal analysis (including a follow-up Issue/PR), but decided to hold off to avoid adding too much surface area all at once — mainly to keep the community's cognitive load low. I'll share the full context soon, once I package it cleanly.
We don't have to work on every issue. :) I created a separate issue to address this for Mistral: #6147
Oh nice. Thanks for investigating.
This PR adds a module-level docstring to `_message_transform.py`, as requested in the review for [PR #6063](#6063). The documentation includes:
- Background and motivation behind the modular transformer design
- Key concepts such as transformer functions, pipelines, and maps
- Examples of how to define, register, and use transformers
- Design principles to guide future contributions and extensions

By embedding this explanation directly into the module, contributors and maintainers can more easily understand the structure, purpose, and usage of the transformer pipeline without needing to refer to external documents.

## Related issue number

Follow-up to [PR #6063](#6063)
Why are these changes needed?
This change addresses a compatibility issue when using Google Gemini models with AutoGen. Specifically, Gemini returns a 400 INVALID_ARGUMENT error when a message in the request contains an empty "text" parameter.
The root cause is that Gemini does not accept empty string values (e.g., "") as valid inputs in the history of the conversation.
To fix this, if the content field is falsy (e.g., None, "", etc.), it is explicitly replaced with a single whitespace (" "), which prevents the Gemini model from rejecting the request.
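As a rough illustration of that rule (a sketch only; the actual `_set_empty_to_whitespace` transformer in this PR may operate on the message-building context rather than a bare string):

```python
def _set_empty_to_whitespace(content: str | None) -> str:
    # Gemini rejects empty content, so fall back to a single space
    return content if content else " "


assert _set_empty_to_whitespace(None) == " "
assert _set_empty_to_whitespace("") == " "
assert _set_empty_to_whitespace("hello") == "hello"
```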
""
), causing runtime errors. This PR ensures such messages are safely replaced with whitespace where appropriate.Summary
This PR introduces a modular transformer pipeline for message conversion and fixes a Gemini-specific bug related to empty assistant message content.
Key Changes
[Refactor] Extracted message transformation logic into a unified pipeline to:
[BugFix] Gemini models do not accept empty assistant message content; added a `_set_empty_to_whitespace` transformer to replace empty strings with `" "` only where needed. Applied only to `"text"` and `"thought"` message types, not to `"tools"`, to avoid serialization errors.
Improved structure for model-specific handling
Test coverage added
Motivation
Originally, Gemini-compatible endpoints would fail when receiving assistant messages with empty content (`""`). This issue required special handling without introducing brittle, ad-hoc patches.
In addressing this, I also saw an opportunity to modularize the message transformation logic across models.
This improves clarity, avoids duplication, and simplifies future adaptations (e.g., different constraints across model families).
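For instance, the core idea can be sketched as a small registry keyed by model family; the names below are illustrative only, and the guide that follows describes the actual concepts used in this PR:

```python
from typing import Callable, Dict

# A content transformer adapts raw content to what a given model API accepts.
ContentTransformer = Callable[[str], str]


def passthrough(content: str) -> str:
    return content


def empty_to_whitespace(content: str) -> str:
    # Families that reject empty content get a single-space fallback
    return content if content else " "


# Illustrative registry; the PR uses per-LLMMessage-type transformer maps instead.
CONTENT_TRANSFORMERS: Dict[str, ContentTransformer] = {
    "gemini": empty_to_whitespace,
    "claude": empty_to_whitespace,
}


def prepare_content(model_family: str, content: str) -> str:
    return CONTENT_TRANSFORMERS.get(model_family, passthrough)(content)
```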
📘 AutoGen Modular Message Transformer: Design & Usage Guide
This document introduces the new modular transformer system used in AutoGen for converting `LLMMessage` instances to SDK-specific message formats (e.g., OpenAI-style `ChatCompletionMessageParam`). The design improves reusability, extensibility, and maintainability across different model families.
🚀 Overview
Instead of scattering model-specific message conversion logic across the codebase, the new design introduces:
- Standalone transformer functions for each part of a message (e.g., building a `ChatCompletionUserMessageParam`)
- Transformer pipelines composed from those functions
- A per-model-family transformer map that selects the right pipeline
- Optional conditional transformers for runtime branching

🧱 1. Define Transform Functions
Each transformer function takes:
- `LLMMessage`: a structured AutoGen message
- `context: dict`: metadata passed through the builder pipeline

And returns:
- a partial message fragment (e.g., `{"content": ..., "name": ..., "role": ...}`)

🪢 2. Compose Transformer Pipelines
Multiple transformer functions are composed into a pipeline using `build_transformer_func()`, where `message_param_func` is the actual constructor for the target message class (usually from the SDK).

🗂️ 3. Register Transformer Map
Each model family maintains a `TransformerMap`, which maps `LLMMessage` types to transformers. `"openai"` is currently required (as only the OpenAI-compatible format is supported now).

🔁 4. Conditional Transformers (Optional)
When message construction depends on runtime conditions (e.g., `"text"` vs. `"multimodal"`), use a conditional transformer, where:
- `funcs_map`: maps condition label → list of transformer functions
- `message_param_func_map`: maps condition label → message builder
- `condition_func`: determines which transformer to apply at runtime

🧪 Example Flow

🎯 Design Benefits
Shared logic (e.g., `_set_name`) is DRY.

🔮 Future Direction
Extend beyond the current `"openai"`-scoped registry to other SDK formats.

Related issue number
Closes #5762
Checks