
Suggested fixes for the JanusFlow example code #77

Open
songafu opened this issue Jan 28, 2025 · 3 comments
Comments

@songafu

songafu commented Jan 28, 2025

While using JanusFlow, I found that the text-to-image part of the example code fails to run on the latest transformers versions (>= 4.48.0), and the example code also appears to contain a defect. Suggested fixes:

(1) In the JanusFlow text-to-image example, the data-flow handling has a variable-reference bug:
```python
if step == 0:
    outputs = vl_gpt.language_model.model(inputs_embeds=llm_emb,
                                          use_cache=True,
                                          attention_mask=attention_mask,
                                          past_key_values=None)
    past_key_values = []
    for kv_cache in outputs.past_key_values:  # should be outputs.past_key_values
        k, v = kv_cache[0], kv_cache[1]
        past_key_values.append((k[:, :, :inputs_embeds.shape[1], :],
                                v[:, :, :inputs_embeds.shape[1], :]))
    past_key_values = tuple(past_key_values)
```
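The trimming step above can be sketched with plain nested lists standing in for torch tensors of shape (batch, heads, seq_len, head_dim). This is purely illustrative (the helper name and dummy values are hypothetical), to show what slicing each layer's cache back to the prompt length does:

```python
# Illustrative sketch: trim each layer's (k, v) cache to the prompt length.
# Plain nested lists stand in for torch tensors; not the actual JanusFlow code.

def trim_kv_cache(layer_caches, prompt_len):
    """Keep only the first `prompt_len` positions along the seq axis."""
    trimmed = []
    for k, v in layer_caches:
        # equivalent of k[:, :, :prompt_len, :] on a real tensor
        k_trim = [[head[:prompt_len] for head in batch] for batch in k]
        v_trim = [[head[:prompt_len] for head in batch] for batch in v]
        trimmed.append((k_trim, v_trim))
    return tuple(trimmed)

# one layer, batch=1, heads=1, seq_len=4, head_dim=2
k = [[[[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]]]
v = [[[[9.0, 9.1], [8.0, 8.1], [7.0, 7.1], [6.0, 6.1]]]]
cache = trim_kv_cache([(k, v)], prompt_len=2)
print(len(cache[0][0][0][0]))  # seq axis trimmed from 4 to 2
```

The key point is that the loop must read from `outputs.past_key_values` (the cache the model just produced), not from the freshly emptied `past_key_values` list.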

(2) On the latest transformers versions (>= 4.48.0), the JanusFlow service fails to run with the error below. I suggest either advising users in the Quick Start to use an older transformers version (e.g. 4.38.2), or fixing the code to be compatible with the latest transformers.

```
llama/modeling_llama.py", line 551, in forward
    past_seen_tokens = past_key_values.get_seq_length() if past_key_values is not None else 0
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AttributeError: 'tuple' object has no attribute 'get_seq_length'
```
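The root cause of this traceback is an interface mismatch: newer transformers expects `past_key_values` to be a `Cache` object exposing `get_seq_length()`, while the example passes a plain tuple in the legacy per-layer `(k, v)` format. A minimal pure-Python sketch of the mismatch (dummy values, no transformers dependency):

```python
# Legacy KV-cache format: a tuple of per-layer (key, value) pairs.
legacy_cache = (("k_layer0", "v_layer0"), ("k_layer1", "v_layer1"))

# transformers >= 4.48 calls past_key_values.get_seq_length() in
# modeling_llama.py, which a plain tuple does not provide:
print(hasattr(legacy_cache, "get_seq_length"))  # -> False
```

One possible compatibility route, assuming a transformers version that ships `transformers.DynamicCache` (it provides `from_legacy_cache()` and `get_seq_length()`), is to wrap the tuple via `DynamicCache.from_legacy_cache(past_key_values)` before passing it back into the model; otherwise, pinning transformers to 4.38.2 as suggested above sidesteps the issue.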

@SimonYS001

Good job!

@scifisatan

scifisatan commented Feb 4, 2025

Just created a fix for this: #137

Replace the code on lines 108-122:

```python
if step == 0:
    outputs = vl_gpt.language_model.model(inputs_embeds=llm_emb,
                                          use_cache=True,
                                          attention_mask=attention_mask,
                                          past_key_values=None)
    past_key_values = []
    for kv_cache in past_key_values:
        k, v = kv_cache[0], kv_cache[1]
        past_key_values.append((k[:, :, :inputs_embeds.shape[1], :],
                                v[:, :, :inputs_embeds.shape[1], :]))
    past_key_values = tuple(past_key_values)
else:
    outputs = vl_gpt.language_model.model(inputs_embeds=llm_emb,
                                          use_cache=True,
                                          attention_mask=attention_mask,
                                          past_key_values=past_key_values)
```

with this:

```python
if step == 0:
    past_key_values = None  # Ensure it starts as None
else:
    past_key_values = tuple(past_key_values) if past_key_values else None  # Convert only if it's valid

outputs = vl_gpt.language_model.model(
    inputs_embeds=llm_emb,
    use_cache=True,
    attention_mask=attention_mask,
    past_key_values=past_key_values  # Now correctly assigned
)
```

Hope it helps :)
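The replacement above collapses the two branches into a single model call, carrying the cache forward between steps. A stand-alone sketch of that control flow, with a dummy stand-in for `vl_gpt.language_model.model` (no torch; the fake model just appends one entry per call):

```python
# Dummy stand-in for the model call: accepts a cache (or None) and returns a
# dict whose "past_key_values" has grown by one layer-entry. Illustrative only.
def fake_model(inputs_embeds, use_cache, attention_mask, past_key_values):
    new_cache = (past_key_values or ()) + (("kv",),)
    return {"past_key_values": new_cache}

past_key_values = None
for step in range(3):
    if step == 0:
        past_key_values = None  # start fresh on the first step
    else:
        past_key_values = tuple(past_key_values) if past_key_values else None
    outputs = fake_model(inputs_embeds=None, use_cache=True,
                         attention_mask=None, past_key_values=past_key_values)
    past_key_values = outputs["past_key_values"]

print(len(past_key_values))  # -> 3, cache grew once per step
```

Note this variant drops the original trimming of the cache to the prompt length, so it changes behavior rather than only fixing the loop variable; whether that matters for generation quality is worth checking.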

@nv-samcheng

Just want to confirm: is the KV cache used in JanusFlow?
