> IMPORTANT: DO NOT ADD **ANY** COMMENTS unless asked
Interesting. This was in the old 1.x prompt and removed for 2.0. But CC would pretty much always add comments in 1.x, something I never requested and often had to tell it to stop doing (and it would sometimes still do it even after being told to stop).
I'm using Anthropic's pay-as-you-go API, since it was easier to set up on the server than CC's CLI/web login method. Running the bot costs me about $1.80 per month.
The bot is based on Mario Zechner's excellent work[1] - so all credit goes to him!
I really like these tools. Yesterday I gave it the filename of a video of my infant daughter eating, which I took while my phone was on the charger. The top of the charger slightly obscured the frame.
I told it to crop the video to just her and remove the obscured portion, and mentioned that I had ffmpeg and imagemagick installed. It looked at the video, worked out the crop dimensions, then ran ffmpeg, and I had a cleaned-up video of her. Marvelous experience.
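For reference, what it ran boiled down to a single ffmpeg crop, roughly like this sketch (the filenames and crop geometry here are made up, not the actual values it picked):

    import subprocess

    # ffmpeg's crop filter takes crop=width:height:x:y, where (x, y) is the
    # top-left corner of the region to keep. The numbers below are placeholders.
    subprocess.run(
        [
            "ffmpeg",
            "-i", "daughter_eating.mov",      # made-up input filename
            "-vf", "crop=1080:1650:0:0",      # keep the unobscured region, drop the rest
            "-c:a", "copy",                   # leave the audio stream untouched
            "daughter_eating_cropped.mp4",
        ],
        check=True,
    )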
My only complaint is that sometimes I want high speed. Unfortunately Cerebras and Groq don't seem to have APIs that are compatible enough for someone to have put them into Charm Crush or anything. But I can't wait for that.
If Groq talks the OpenAI API, you enable the Anthropic protocol and an OpenAI provider with a base URL pointing at Groq, set ANTHROPIC_BASE_URL to that endpoint, and start claude.
I haven't tested Groq yet, but this could be an interesting use case...
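If it does work, I'd expect the moving parts to look roughly like this; to be clear, this is a guess at the setup, the proxy URL is made up, and Claude Code still expects the Anthropic protocol on its end, so something in between has to translate to Groq's OpenAI-compatible API:

    import os
    import subprocess

    # Hypothetical: a local router/proxy exposes an Anthropic-compatible endpoint
    # and forwards requests to Groq's OpenAI-compatible API behind the scenes.
    env = dict(os.environ, ANTHROPIC_BASE_URL="http://localhost:8080/anthropic")

    # Start Claude Code pointed at that endpoint instead of api.anthropic.com.
    subprocess.run(["claude"], env=env)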
Something I realized about this category of tool (I call them "terminal agents", but that name already doesn't work now that there's an official VS Code extension for this - maybe just "coding agents" instead) is that they're actually an interesting form of general agent.
Claude Code, Codex CLI etc can effectively do anything that a human could do by typing commands into a computer.
They're incredibly dangerous to use if you don't know how to isolate them in a safe container but wow the stuff you can do with them is fascinating.
They're only as dangerous as the capabilities you give them. I just created dedicated `codex` and `claude` users on my Linux box and practically always run in yolo mode. I've not had a problem so far.
One thing I really like using them for is refactoring/reorganizing. The tedious work of renaming things, renaming all implementations, moving files around, creating/deleting folders, and updating imports/exports all melts away when you task an agent with it. Of course this assumes they are good enough to do it with quality, which is true maybe 75% of the time for me so far.
I've found that it can be hard or expensive for the agent to do "big but simple" refactors in some cases. For example, I recently tasked one with updating all our old APIs to accept a more strongly typed user ID instead of a generic UUID type. No business logic changes, just change the type of the parameter, and in some cases be wary of misleading argument names left by lazy devs copy-pasting code. This ended up burning through the entire context window of GPT-5-Codex and cost the most money of anything I've ever tasked an agent with.
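For concreteness, the whole refactor was variations on this pattern, repeated across a lot of endpoints (a sketch with invented names, not our real API):

    from typing import NewType
    from uuid import UUID

    # A distinct type for user IDs, so an arbitrary UUID (say, an order ID)
    # can no longer be passed where a user ID is expected.
    UserId = NewType("UserId", UUID)

    # Before: any UUID is accepted, and a misleading argument name hides the intent.
    def get_orders_old(account_uuid: UUID): ...

    # After: the signature states the intent, and a type checker flags misuse.
    def get_orders(user_id: UserId): ...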
I was under the impression that Docker container escapes are actually very rare. How high do you rate the chance of a prompt injection attack against Claude running in a docker container on macOS managing to break out of that container?
(Actually hang on, you called me out for suggesting containers like Docker are safe but that's not what I said - I said "a safe container" - which is a perfectly responsible statement to make: if you know how to run them in a "safe container" you should do so. Firecracker or any container not running on your own hardware would count there.)
That's the secret, Cap... you can't. And it's due to in-band signalling, something I've mentioned on numerous occasions. People should entertain the idea that we're going to have to re-educate people about what is and isn't possible, because the AI world has been playing make-believe so much that they can't see the fundamental problems to which there is no solution.
> They're incredibly dangerous to use if you don't know how to isolate them in a safe container but wow the stuff you can do with them is fascinating.
True but all it will take is one report of something bad/dangerous actually happening and everyone will suddenly get extremely paranoid and start using correct security practices. Most of the "evidence" of AI misalignment seems more like bad prompt design or misunderstanding of how to use tools correctly.
This seems unlikely. We've had decades of horrible security issues, and most people have not gotten paranoid. In fact, after countless data leaks, crypto miner schemes, ransomware, and massive global outages, now people are running LLM bots with the full permission of their user and no guardrails and bragging about it on social media.
"When you use Claude Code, we collect feedback, which includes usage data (such as code acceptance or rejections), associated conversation data, and user feedback submitted via the /bug command."
So I can opt out of training, but they still save the conversation? Why can't they just not use my data when I pay for things? I am tired of paying and then having my information stolen anyway. Tell you what: create a free tier that harvests data as the cost of the service. If you pay, no data harvesting.
Even that is debatable. There are a lot of weasel words in their text. At most they're saying "we're not training foundation models on your data", which is not to say "we're not training reward models" or "we're not testing our other-data models on your data" and so on.
I guess the safest way to view this is to consider anything you send them as potentially in the next LLMs, for better or worse.
When they ask "How is Claude doing this session?", that appears to be a sneaky way for them to harvest the current conversation based on the terms-of-service clause you pointed out.
That's not just them saving it locally to like `~/.claude/conversations`? Feels weird if all conversations are uploaded to the cloud + retained forever.
They seem to be looking to let the agent access the source code for review. But in that case, the agent should only see the codebase and nothing else. For a code review agent, all it really needs is:
- Access to the files in the repository (or repositories)
- Access to the patch/diff being reviewed
- Ability to perform text/semantic search across the codebase
That doesn’t require running the agent inside a container on a system with sensitive data. Exposing an API that gives the agent access to specifically the above data avoids the risk altogether; a sketch of what that could look like is below.
If it's really important that the agent is able to use a shell, why not use something like codespaces and run it in there?
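A rough sketch of that read-only surface; the function names and paths are hypothetical, the point is just how little the agent actually needs:

    import subprocess
    from pathlib import Path

    REPO = Path("/srv/review/repo")  # read-only checkout, nothing else exposed

    def read_file(rel_path: str) -> str:
        """Return the contents of one file inside the repository."""
        target = (REPO / rel_path).resolve()
        if not target.is_relative_to(REPO):   # block path traversal (Python 3.9+)
            raise ValueError("path escapes the repository")
        return target.read_text()

    def get_diff(base: str, head: str) -> str:
        """Return the patch/diff under review as plain text."""
        out = subprocess.run(
            ["git", "-C", str(REPO), "diff", f"{base}...{head}"],
            capture_output=True, text=True, check=True,
        )
        return out.stdout

    def search(pattern: str) -> str:
        """Plain-text search across the codebase (semantic search could sit here too)."""
        out = subprocess.run(
            ["git", "-C", str(REPO), "grep", "-n", pattern],
            capture_output=True, text=True,
        )
        return out.stdout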
I'm currently using Goose[1]. My brother-in-law uses Claude Code and he likes it. It makes me wonder if I'm missing anything. Can anyone tell me whether there's any reason I should switch to Claude Code, or point me to comparisons between the two?
I tried goose and it seems like there's a lot of nice defaults that Claude Code provides that Goose does not. How did you do your initial configuration?
What I've been trying to use it for is to solve a number of long-standing bugs that I've frankly given up on in various Linux tools.
I think I lack the social skills to drive a fix through the community, probably due to some undiagnosed disorder or something, so I've been trying to soldier on alone with some issues I've had for years.
The issues are things like focus jacking in a window manager I'm using on Xorg, where the keyboard and the mouse get separate focuses.
Goose has been somewhat promising, but still not great.
I mean, overall I don't think any of these coding agents have given me useful insight into my long-vexing problems.
I think there has to be some type of perception gap or knowledge asymmetry to be really useful - for instance, with foreign languages.
I've studied a few but just in the "taking classes at the local JC" way. These LLMs are absolutely fantastic aids there because I know enough to frame the question but not enough to get the answer.
There's some model for dealing with this I don't have yet.
Essentially I can ask the right question about a variety of things but arguably I'm not doing it right with the software.
I've been writing software for decades; is it really that I'm not competent enough to ask the right question? That's certainly the simplest model, but it doesn't check out.
Maybe in some fields I've surpassed a point where llms are useful?
It all circles back to an existential fear of delusional competency.
> Maybe in some fields I've surpassed a point where llms are useful?
I've hit this point while designing developer UX for a library I'm working on. LLMs can nail boilerplate, but when it comes to dev UX they seem to not be very good. Maybe that's because I have a specific vision and some pretty tight requirements? Dunno. I'm in the same spot as you for some stuff.
Never used Goose, but I looked at it way back when. Claude Code feels more native IMO. Especially if you're already using the Anthropic API/plans anyway, I'd say give it a try.
I've always been curious. Are tags like that one: "<system-reminder>" useful at all? Is the LLM training altered to give a special meaning to specific tags when they are found?
Can a user just write those magic tags (if they knew what they are) and alter the behavior of the LLM in a similar manner?
Claude tends to work well with such semi-xml tags in practice (probably trained for it?).
You can just make them up, and ask it to respond with specific tags, too.
Like “Please respond with the name in <name>…</name> tags and the <surname>.”
It’s one of the approaches to forcing structured responses, or making it role-play multiple actors in one response (having each role in its tags), or asking it to do a round of self-critique in <critique> tags before the final response, etc.
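A minimal sketch of that pattern with the Anthropic Python SDK; the prompt, tag names, and model string here are placeholders I made up:

    import re
    import anthropic

    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

    text_to_check = "Some draft paragraph to critique..."  # placeholder input

    resp = client.messages.create(
        model="claude-sonnet-4-5",  # placeholder; use whatever model you have access to
        max_tokens=512,
        messages=[{
            "role": "user",
            "content": (
                "Improve the paragraph below. Put the rewrite in <rewrite>...</rewrite> "
                "and a short self-critique in <critique>...</critique>.\n\n" + text_to_check
            ),
        }],
    )

    reply = resp.content[0].text
    rewrite = re.search(r"<rewrite>(.*?)</rewrite>", reply, re.DOTALL)
    critique = re.search(r"<critique>(.*?)</critique>", reply, re.DOTALL)
    print(rewrite.group(1) if rewrite else reply)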
A user can append similar system reminders in their own prompt. It’s one of the things the Claude Code team discovered worked, and it's now included in other CLIs like Factory, which was talked about today by a cofounder of Factory: https://www.youtube.com/live/o4FuKJ_7Ds4?si=py2QC_UWcuDe7vPN
As a burnt-out, laid-off, aging developer, I want to thank Anthropic for helping me fall in love with programming again. Claude Code in the terminal with all my beloved *nix tools and vim rocks.
100%. As a burnt-out manager who doesn't get a lot of spare time to actually code, it's nice to have a tool like CC where I can make actual incremental changes in the spare 15 minutes I get here and there.
I spend most of my time making version files with the prompt, but I'm pretty impressed by how far I've gotten on an idea that would never have seen the light of day otherwise...
The thought of having to write input validation, database persistence, and all the other boring things I've had to write a dozen times in the past...
As an Architect, I feel like a large part of my job is to help my team be their best, but I'm also focused on the delivery of a few key solutions. I'm used to writing tasks and helping assign them to members of the team, while occasionally picking up the odd piece of work myself, focusing more on architecture and helping individual members when they get stuck or when problems come up. But with the latest coding agents, I'm always thinking in the back of my head: I could get the AI to finish this task 3x quicker, and probably with better quality, if I just did it myself with the AI. We sit in SCRUM meetings sizing tasks, and I'm thinking "bro, you're just going to paste my task description into the AI and be done in half an hour", but we size it at a day or two.
Agreed, it's actually fun again. The evening hours I used to burn with video games and weed are now spent with claude code, rewriting and finishing up all my custom desktop tools and utilities I started years ago.
I had a lot of fun making 'tools' like this, but once I settled into a complicated problem (networking in a multiplayer game), it became frustrating to watch Claude give control back to me without accomplishing anything, over and over again. I think I need to start using the SDK in order to force it to do its job.
This kind of stuff is where my anxiety rises a bit. Another example like this is audio code - it compiles and “works” but there could be subtle timing bugs or things that cause pops and clicks that are very hard to track down without tracing through the code and building that whole mental model yourself.
There’s a great sweet spot though around stuff like “make me this CRUD endpoint and a migration and a model with these fields and an admin dashboard”.
I've found that in those cases I'm likely better off doing it myself. The LLMs I've used will frequently overfit the code once it gets complicated. I'm working on a language-learning app, and it will so often add special-casing for the words occurring in the tests. In general, as soon as you leave boilerplate territory, I've found it starts writing dirtier and dirtier code.
It’s still better letting Claude slog through all that boilerplate and skeletal code for you so that you can take the wheel when things start getting interesting. I’ve avoided working on stuff in the past just because I knew I wouldn’t be motivated enough to write the foundation and all the uninteresting stuff that has to come first.
Try giving codex IDE a go, now included with ChatGPT.
I had equal frustrations with Claude making bad decisions; in contrast, GPT-5 Codex on high is extremely good!
I've got it using dbus, doing funky stuff with Xlib, working with pipewire using the pulseaudio protocol which it implemented itself (after I told it to quit using libraries for it.) You can't one-shot complicated problems, at least not without extensive careful prompting, but at this point I believe I can walk it through pretty much anything I can imagine wanting done.
Depends on the game tbh, having claude ping me for attention every few minutes disrupts most games too much, but with turn-based or casual games it works out well. OpenTTD and Endless Sky are fun to play while claude churns.
Thanks for caring. At the moment I am in a good place and luckily I don't have financial problems. My mental health is getting better thanks to a fixed schedule, sleep, diet, exercise, socializing, and walks in nature. I hope you get better soon, too.
Careful with trying to generalize personalized insights from therapy! Everyone is different. I believe advice giving (& receiving!) is very difficult to do well, even with people you know personally. For strangers on an internet forum, it is impossible.
Wait, still no support for the new MCP features? How come Claude Code, from the creators of MCP, is still lacking elicitation, server logging, and progress reporting?!
To those lamenting that the Plan with Opus/Code with Sonnet feature is not available, check the charts.
Sonnet 4.5 is beating Opus 4.1 on many benchmarks. Feels like it's a change they made not to 'remove options', but because it's currently universally better to just let Sonnet rip.
I notice that thinking triggers like "Think harder" are not highlighted in the prompt anymore. Could that mean that thinking is now only a single toggle with tab (no gradation)?
Has anyone figured out how to do Claude-style subagents without using Claude? Some sort of open-source CLI with OpenRouter or something? I want to use subagents on different LLMs (Copilot, self-hosted).
I wish there were an option to cancel a currently running prompt midway. Right now, pressing Ctrl+C twice ends up terminating the entire session instead.
I'm always watching Claude Code as it runs, ready to hit the Escape key as soon as it goes off the rails. Sometimes it gets stuck in a cul de sac, or misunderstands something basic about the project or its structure and gets off on a bad tangent. But overall I love CC.
I ended up not using this option anyway. I am using B-MAD agents for planning, and it gets into a long-running planning stream where it needs permission to execute steps. So you end up running the planning in "accept edits" mode.
I use Opus to write the planning docs for 30 min, then use Sonnet to execute them for another 30 min.
They removed the /model option where you can select Opus to plan and Sonnet to execute. But you can still Shift + Tab to cycle between auto-accept and plan mode.
Is Plan mode any different from telling Claude "this is what I'd like to do, please describe an implementation plan"?
That's generally my workflow, and I have the results saved into a CLAUDE-X-plan.md. Then I review the plan and incrementally change it if the initial plan isn't right.
There's a bit of UI around it where you can accept the plan. I personally stopped using it and instead moved to a workflow where I simply ask it to write the plan in a file. It's much easier to edit and improve this way.
Yeah, I just have it generate PRDs/high-level plans, then break them down into cards in "Kanban.md" (a bunch of headers like "Backlog", "In-Progress", etc).
To be honest, Claude is not great about moving cards when it's done with a task, but this workflow is very helpful for getting it back on track if I need to exit a session for any reason.
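The file itself is nothing fancy, roughly this shape (the card text here is made up):

    # Kanban.md

    ## Backlog
    - Add pagination to the activity feed
    - Write migration for the new settings table

    ## In-Progress
    - Break the notifications PRD into concrete tasks

    ## Done
    - Initial project scaffolding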
I've experienced the same thing. Usually I try to set up, or have it set up, a milestone/phase approach to an implementation with checklists (markdown style), but it's 50/50 whether it marks them automatically upon completion.
I think they meant the 'Plan with Opus' model. Shift+Tab still works for me, and the VS Code extension still lets you plan too, but the UI is _so_ slow with updates.
I was using aider quite a lot from ~7 months ago to ~3 months ago.
I had to stop because they refuse to implement MCP, and the Claude/Codex-style agentic workflow just yields better results.
How do I revert to the previous version? I find that the "claude" command in the terminal still works great, but the new native VS Code extension is missing all these things (before, it would launch a terminal and run "claude").
I feel like there are so many bugs. The / commands for add-dir and others I used often are gone.
I really hate the fact that every single model has its own CLI tool. The UX for Claude Code is really great, but being stuck using only Anthropic models makes me not want to use it, no matter how good it is.
Pardon my ignorance, but what does this mean? It's a terminal app that has always expanded to the full terminal, no? I've not noticed any difference in how it renders in the terminal.
A TUI does not have to start full screen. v1 of Claude Code did not take over the entire terminal; it would only use a bit at the bottom and scroll up until it filled the screen.
Pretty sure your old behavior was the broken one though - I vaguely remember futzing with this to "fullscreen correctly" for a claude-in-Docker-in-Cygwin-via-MSYS2 setup a while ago.
I'm consistently hitting weird bugs with opencode, like escape codes not being handled correctly so the TUI output looks awful, or it hanging on first startup. Maybe after they migrate to opentui it'll be better.
I do like the model selection with opencode though
* New native VS Code extension
* Fresh coat of paint throughout the whole app
* /rewind a conversation to undo code changes
* /usage command to see plan limits
* Tab to toggle thinking (sticky across sessions)
* Ctrl-R to search history
* Unshipped claude config command
* Hooks: Reduced PostToolUse "tool_use ids were found without tool_result blocks" errors
* SDK: The Claude Code SDK is now the Claude Agent SDK
* Add subagents dynamically with --agents flag
[1] https://github.com/anthropics/claude-code/blob/main/CHANGELO...
[1] https://github.com/marckrenn/cc-mvp-prompts/compare/v1.0.128...
[2] https://x.com/CCpromptChanges/status/1972709093874757976
[1] https://mariozechner.at/posts/2025-08-03-cchistory/
This is pretty funny, given that Cursor shipped their own CLI.
Looks great, but it's kind of buggy:
- I can't figure out how to toggle thinking
- Have to click in the text box to write, not just anywhere in the Claude panel
- Have to click to reject edits
https://www.cerebras.ai/blog/introducing-cerebras-code
https://github.com/grafbase/nexus/
Also, I think shellagent sounds cooler.
I expect the portion of Claude Code users who have a dedicated user setup like this is pretty tiny!
Not the exact setup, but also pretty solid.
https://news.ycombinator.com/newsguidelines.html
Edit: We've had to ask you this more than once before, and you've continued to do it repeatedly (e.g. https://news.ycombinator.com/item?id=45389115, https://news.ycombinator.com/item?id=45282435). If you don't fix this, we're going to end up banning you, so it would be good if you'd please review the site guidelines and stick to them from now on.
https://en.m.wikipedia.org/wiki/In-band_signaling
Storing the data is not the same as stealing. It's helpful for many use cases.
I suppose they should have a way to delete conversations though.
1: https://block.github.io/goose/
For throwaway code they're pretty great.
Okay, I know I shouldn't anthropomorphize, but I couldn't help thinking that this was a bit of a harsh way of saying things :(
The thing is, a lot of software work boils down to tasks that aren't difficult, just time-consuming.
If you want to chat with somebody, let me know.
    cl --version
    1.0.44 (Claude Code)
as expected … liar! ;)
cl update
Wasn't that hard. Sorry for bothering.
https://www.reddit.com/r/ClaudeAI/comments/1mlhx2j/comment/n...
Though I will see how this pans out.
This isn't true; you just need to use the usual shortcut twice: Shift+Tab.
If I hit shift-Tab twice I can still get to plan mode
WTF. Terrible decision if true. I don't see that in the changelog though
They just changed it so you can't set it to use Opus in planning mode... it uses Sonnet 4.5 for both.
Which makes sense if it really is a stronger and cheaper model.
I hope this is the case.
I logged in, it still says "Login"
What am I misunderstanding in your comment?
I just downgraded to v1 to confirm this.
Wonder what changed that I'm not seeing? Do you think it's a regression or intentional?