When using MultimodalWebSurfer in MagenticOneGroupChat, the task prematurely terminates after the first WebSurfer action. This is caused by improper handling of multimodal content (images + text) in the MagenticOneOrchestrator's progress assessment logic.
To Reproduce
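from autogen_ext.teams.magentic_one import MagenticOne

async def main():
    client = xxx  # placeholder for a model client
    autoctf = MagenticOne(client=client)
    task = "open bing.com search weather then click third result link"
    # run_stream returns an async generator, so iterate over it instead of awaiting it
    async for message in autoctf.run_stream(task=task):
        print(message)
    # The task terminates after WebSurfer's first action without completing click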
The issue occurs because:
WebSurfer returns a MultiModalMessage containing both text and an image
MagenticOneOrchestrator fails to process this multimodal content properly when assessing task progress
This leads to an incorrect termination assessment in _orchestrate_step
Expected behavior
The WebSurfer should be able to perform multiple actions as needed
The task should continue until the actual goal is achieved
The orchestrator should properly handle multimodal content in progress assessment
Technical Details
The issue occurs in two key components:
MagenticOneOrchestrator._thread_to_context:
# Original problematic code
if isinstance(m, (TextMessage, MultiModalMessage, ToolCallSummaryMessage)):
    context.append(UserMessage(content=m.content, source=m.source))
Progress assessment logic incorrectly processes multimodal content, leading to premature task completion judgment.
Fix Implementation
The fix involves properly handling MultiModalMessage content in the orchestrator:
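The sketch below is not the patch from this issue. It only illustrates one way a `_thread_to_context`-style conversion could treat multimodal content explicitly. The classes are minimal stand-ins for the AutoGen message types, and `thread_to_context`, `model_supports_vision`, and the `<image>` placeholder are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


# Minimal stand-ins for the AutoGen types (assumptions, not the real classes).
@dataclass
class Image:
    data: bytes


@dataclass
class TextMessage:
    content: str
    source: str


@dataclass
class MultiModalMessage:
    content: List[Union[str, Image]]
    source: str


@dataclass
class UserMessage:
    content: Union[str, List[Union[str, Image]]]
    source: str


def thread_to_context(thread, model_supports_vision: bool) -> List[UserMessage]:
    """Convert a message thread into model context (hypothetical helper)."""
    context: List[UserMessage] = []
    for m in thread:
        if isinstance(m, MultiModalMessage):
            if model_supports_vision:
                # Pass text and image parts through unchanged.
                context.append(UserMessage(content=list(m.content), source=m.source))
            else:
                # Flatten to text, marking each image with a placeholder.
                text = "\n".join(
                    part if isinstance(part, str) else "<image>" for part in m.content
                )
                context.append(UserMessage(content=text, source=m.source))
        elif isinstance(m, TextMessage):
            context.append(UserMessage(content=m.content, source=m.source))
    return context


thread = [
    TextMessage(content="open bing.com", source="user"),
    MultiModalMessage(content=["Page loaded.", Image(b"png-bytes")], source="WebSurfer"),
]
text_only = thread_to_context(thread, model_supports_vision=False)
with_images = thread_to_context(thread, model_supports_vision=True)
```

Branching on the message type keeps the text-only path from receiving a list where a string is expected, which is the kind of mismatch the progress assessment step is sensitive to.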
Additional context
Environment
Which packages was the bug in?
Python AgentChat (autogen-agentchat>=0.4.0)
AutoGen library version.
Python dev (main branch)
Other library version.
No response
Model used
qwen-vl-max-latest
Model provider
Other (please specify below)
Other model provider
Qwen
Python version
3.10
.NET version
None
Operating system
None