
Attach Deepseek <think></think> Text as a .txt File #91

Open
saphtea opened this issue Feb 1, 2025 · 7 comments

@saphtea

saphtea commented Feb 1, 2025

Hey! Seriously grateful for this tool you've been providing and maintaining; llmcord has been wonderful for my Ollama setup.

I've been playing around with implementing it myself, but there aren't many comments and I am TERRIBLE when it comes to API async stuff lmfao.

Hoping to have an option to either filter out the <think> tags or attach their contents as a .txt file for people who want to see how the model's thinking.

Appreciate your time, and again, thank you for a wonderful lil program!!

@jakobdylanc
Owner

jakobdylanc commented Feb 1, 2025

I'm glad you're enjoying llmcord! & yeah, sorry about the lack of comments... maybe I should add some more haha.

I like your ideas. The biggest issue is that different providers currently handle reasoning content differently.

With Ollama it looks like the reasoning content is always returned in the main `content` field with `<think>` tags, as you mentioned. But with OpenRouter, for example, the reasoning content is NOT included unless you set `include_reasoning: true` in your API parameters. And even then it's returned as a separate `reasoning` field instead of being included in `content`. I like this behavior better.

I'm hoping that all providers eventually agree on a single standard, otherwise it'll be much more of a pain to implement while maintaining universal compatibility.

Also with the .txt file attachment idea, I question whether the .txt file should include the ENTIRE response or JUST the reasoning content. I feel like including the entire response can be useful sometimes, like when the response gets split into multiple Discord messages.

@saphtea
Author

saphtea commented Feb 1, 2025

Yeah it's pretty inconsistent at the moment xD.

I was thinking just the reasoning; Discord has built-in embeds you can expand for .txt files as well. Though, yeah, I'm sure including the response in it as well could be good for some use cases.
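(For reference, attaching a string as a .txt file is nearly a one-liner in discord.py. A minimal sketch; `send_reasoning_file`, `reasoning`, and `channel` are placeholder names, not llmcord's actual code:)

```python
# Minimal sketch: attach reasoning text as a .txt file with discord.py.
# "reasoning" and "channel" are placeholders, not llmcord's real variables.
import io

import discord

async def send_reasoning_file(channel: discord.abc.Messageable, reasoning: str) -> None:
    # discord.File accepts any file-like object, so BytesIO avoids touching disk.
    buffer = io.BytesIO(reasoning.encode("utf-8"))
    await channel.send(file=discord.File(buffer, filename="reasoning.txt"))
```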

Might end up being better for me to just keep toying around and try to put it together myself, since my use case is purely Ollama lolol.

@nikdavis

nikdavis commented Feb 11, 2025

@jakobdylanc that sounds pretty reasonable.

I would ask that you maybe consider an intermediate config hack to filter/extract thought tokens, since that has emerged as a simpler standard, but no worries if not.

Do you know, are any providers standardizing around streaming thought? I would imagine OpenAI's is not streaming (judging by how you described it). I would expect a streaming standard might win, but who knows.

Anywho, I'm going to work around it myself for now; not sure how to visualize the thought though.

Very cool project btw, appreciate the simplicity 🙏🏻

@jakobdylanc
Owner

> Do you know, are any providers standardizing around streaming thought?

Reasoning content is fundamentally delivered the same way as regular content; it's just surrounded by `<think>` tags. So it naturally supports both streaming and non-streaming.

The emerging standard for providers (see DeepSeek's API, OpenRouter API, LM Studio, etc.) is to simply separate this reasoning content into a `reasoning_content` field so you don't have to separate it yourself. Hopefully Ollama supports this soon.
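For illustration, consuming that separated field from an OpenAI-compatible streaming response might look roughly like this. A sketch only: DeepSeek calls the field `reasoning_content` while OpenRouter uses `reasoning`, so the attribute name is an assumption to check against your provider's docs:

```python
# Sketch: collecting separated reasoning from an OpenAI-compatible stream.
# The exact field name is provider-dependent (assumed here: reasoning_content).
from openai import AsyncOpenAI

async def stream_reply(client: AsyncOpenAI, model: str, messages: list[dict]) -> tuple[str, str]:
    reasoning, content = [], []
    stream = await client.chat.completions.create(model=model, messages=messages, stream=True)
    async for chunk in stream:
        if not chunk.choices:
            continue
        delta = chunk.choices[0].delta
        # Reasoning arrives in its own field instead of being mixed into content.
        reasoning.append(getattr(delta, "reasoning_content", None) or "")
        content.append(delta.content or "")
    return "".join(reasoning), "".join(content)
```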

Separating it yourself is trickier/hackier ESPECIALLY when doing streamed responses (which is what llmcord does), since you have to check for the `</think>` closing tag on every new chunk. Which I see you already attempted in your fork, nice hacking :)
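For anyone attempting the manual route anyway, the awkward part is that `</think>` can straddle a chunk boundary, so you have to hold back a tag-length tail before emitting text. A rough sketch of one way to do it (not llmcord's actual code; it assumes the model opens with `<think>` near the start of the stream):

```python
# Sketch: split a streamed response into ("reasoning", ...) and ("content", ...)
# pieces, buffering a tail so a </think> split across chunks isn't missed.
CLOSE = "</think>"

def split_reasoning(chunks):
    buffer, in_reasoning, first = "", True, True
    for chunk in chunks:
        if not in_reasoning:
            yield ("content", chunk)
            continue
        buffer += chunk
        # Strip the opening tag once it has fully arrived (assumed to be early).
        if first and buffer.startswith("<think>"):
            buffer, first = buffer[len("<think>"):], False
        i = buffer.find(CLOSE)
        if i != -1:
            yield ("reasoning", buffer[:i])
            yield ("content", buffer[i + len(CLOSE):])
            buffer, in_reasoning = "", False
        else:
            # Hold back len(CLOSE) - 1 chars in case the tag straddles two chunks.
            safe = len(buffer) - (len(CLOSE) - 1)
            if safe > 0:
                yield ("reasoning", buffer[:safe])
                buffer = buffer[safe:]
    if buffer:
        yield ("reasoning", buffer)
```

For example, `list(split_reasoning(["<thi", "nk>hmm...</th", "ink>Hello!"]))` recovers the reasoning "hmm..." and the content "Hello!" even though both tags arrive split across chunks.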

> I would imagine OpenAI's is not streaming

OpenAI's reasoning models (e.g. o3-mini) don't return ANY reasoning content in the API. They hide it from you!

@nikdavis

Okay yeah, that makes sense. Agreed, it does genuinely sound like an upstream issue. Shame OpenAI hasn't modeled it, since most folks emulate them as the standard.

> Separating it yourself is trickier/hackier ESPECIALLY when doing streamed responses (which is what llmcord does), since you have to check for the `</think>` closing tag on every new chunk. Which I see you already attempted in your fork, nice hacking :)

I wouldn't know anything about how painful that would be to implement 😅. Yeah, my fork is working; it wraps the thought in spoiler tags, but calling it an attempt may still be generous. :)

Not receiving the thought at all is a bummer... that explains the poor o3-mini experience in apps like Cursor.

FWIW I am using tabbyAPI.

@Daxiongmao87

I wrote an application that simply searches for the tags and does what it will with them (in my case, show/hide). So if you want flexibility, making the 'reasoning begin' and 'reasoning end' tags configurable (with `<think>` and `</think>` as defaults) might be the way to go, e.g. something like the sketch below.
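That could be as small as two config keys plus a regex. A sketch under the assumption of hypothetical `reasoning_start_tag` / `reasoning_end_tag` entries in llmcord's config.yaml (these are not real llmcord options):

```python
# Sketch: configurable reasoning tags, per the suggestion above.
# reasoning_start_tag / reasoning_end_tag are hypothetical config keys.
import re

import yaml

with open("config.yaml") as f:
    config = yaml.safe_load(f) or {}

start_tag = config.get("reasoning_start_tag", "<think>")
end_tag = config.get("reasoning_end_tag", "</think>")
reasoning_pattern = re.compile(re.escape(start_tag) + r"(.*?)" + re.escape(end_tag), re.DOTALL)

def extract_reasoning(text: str) -> tuple[str, str]:
    """Return (reasoning, remaining_text) for a complete, non-streamed response."""
    match = reasoning_pattern.search(text)
    if match is None:
        return "", text
    return match.group(1).strip(), (text[:match.start()] + text[match.end():]).strip()
```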

@jakobdylanc
Owner

jakobdylanc commented Mar 9, 2025

I've put more thought into it, and I don't really like the .txt file attachment solution, for a few reasons:

1. You can only view .txt files on desktop, not mobile, which is a sucky and inconsistent experience.

2. The reasoning content should be able to stream in just like regular content, which isn't possible with a .txt file.

3. Adding this functionality (file attachments on the bot's messages) adds more complexity than you'd think, which I'm always hesitant to do.

4. File attachments on the bot's messages just look ugly!

My idea: reasoning content should be delivered in the same message format as regular content, with special reply behavior to keep it separate from the main conversation context.

Here's a graphical explanation of what I mean:

```mermaid
graph TD;
    B["***Bot reply - reasoning content***"]-->A["User message"];
    C["Bot reply - regular content"]-->A["User message"];
    D["User reply"]-->C["Bot reply - regular content"];
    E["..."]-->D["User reply"];
```

Of course this still has issues: when the reasoning content is ridiculously long (which it often is), it would be too much message spam.

I'm still on the fence about whether I want to add support for this at all.

IMO reasoning content is more of a niche thing for power users that maybe doesn't belong in llmcord. Feedback welcome.
