This approach is so limiting that it seems better to change the constraints. For example, in the case of a software agent you could run everything in a container, only allow calls you trust not to exfiltrate private data, and make the end result a PR you can review.
Is there any work on using some kind of soft tokens for reasoning? It seems so inefficient to squeeze so much information into a single token for the next pass of the model, when you could output a large vector on each forward pass, giving the model a drastically larger working memory/scratchpad and much higher bandwidth for passing information forward to the next call. A single token drawn from a ~128k vocabulary carries at most about 17 bits of information, while a vector of 1024 32-bit floats could carry up to 1024 × 32 = 32,768 bits.
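Roughly what I mean, as a minimal sketch: feed the model's final hidden state back in as the next input embedding instead of collapsing it to a sampled token. Assumes a Hugging Face-style causal LM that accepts inputs_embeds; gpt2 is just a stand-in, and since it was never trained to consume its own hidden states the output will be gibberish, so this only shows the mechanics.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in; any HF causal LM that accepts inputs_embeds
tok = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name).eval()

prompt = "A rock and a feather are dropped. Which lands first?"
ids = tok(prompt, return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(ids)  # (1, seq, d_model)

with torch.no_grad():
    for _ in range(4):  # a few latent "thought" steps
        out = model(inputs_embeds=embeds, output_hidden_states=True)
        # Keep the full final-layer vector at the last position (768 floats
        # for gpt2) instead of collapsing it to a single sampled token id.
        latent = out.hidden_states[-1][:, -1:, :]
        embeds = torch.cat([embeds, latent], dim=1)

    # After the latent steps, decode an ordinary token from the augmented sequence.
    logits = model(inputs_embeds=embeds).logits[:, -1, :]
    print(tok.decode(logits.argmax(dim=-1)))
```

The obvious catch is that an off-the-shelf model has never seen its own hidden states as inputs, so presumably you'd have to train for this, which is what the latent-reasoning papers do.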
I just found a recent paper about this: https://arxiv.org/abs/2505.15778. It's really thoughtful and well written. They mix the different token outputs together.
If you care that much about having correct data, you could just do a SHA-256 of the whole thing, or an HMAC if you also need to authenticate where it came from. It would probably be really fast. If you don't care as much, you can just do a murmur hash of the serialized data. You don't really need to verify data-structure properties if you know the serialized bytes are correct.
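Something like this stdlib-only sketch, for the serialized-bytes case; the payload and key are made up, and since murmur needs the third-party mmh3 package, zlib.crc32 stands in here as the fast non-cryptographic option:

```python
import hashlib
import hmac
import zlib

payload = b'{"user": "alice", "items": [1, 2, 3]}'  # example serialized data

# Integrity only: a plain SHA-256 digest of the serialized bytes.
digest = hashlib.sha256(payload).hexdigest()

# Integrity plus authentication: an HMAC with a shared secret key.
secret = b"shared-secret-key"  # placeholder key for illustration
tag = hmac.new(secret, payload, hashlib.sha256).hexdigest()

# Verify on the receiving side with a constant-time comparison.
assert hmac.compare_digest(tag, hmac.new(secret, payload, hashlib.sha256).hexdigest())

# If you only need to catch accidental corruption, a fast non-crypto checksum is fine.
checksum = zlib.crc32(payload)
print(digest, tag, checksum)
```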
I've been so impressed with Cursor's tab completion. I feel like I just update my TypeScript types, point my cursor at the right spot in the file, and it reads my mind and writes the correct code. It's so good that I feel like I lose time if I write instructions to an AI in the chat window instead of just letting Cursor guess what I want. None of the other models feel anything like this.
Ah, that's a bummer. You can still add threads as context, but you can't use slash commands there, so the only way to add them or any other context is to click buttons with the mouse. It would be nice if slash commands at least worked there.
edit: actually it is still possible to include text threads in there
It actually seems to work for me. I have an active text thread and it was added automatically to my inline prompt in the file. There was a box at the bottom of the inline text box; I think I had to click it the first time to include the context, but on subsequent uses it was included by default.
Yeah, it was great because you were in control of where and when the edits happened.
So you could manage the context with great care, then go over to the editor and select specific regions and then "pull in" the changes that were discussed.
I guess it was silly that I was always typing "use the new code" in every inline assist message.
A hotkey to "pull new code" into a selected region would have been sweet.
I don't really want to "set it and forget it" and then come back to some mega diff that's like 30% wrong. Especially right now, when it keeps getting stuck and doing nothing for 30 minutes.
One of the weirdest and most interesting things about LLMs is that they grow more effective the more languages and disciplines they are trained on. It turns out that training LLMs on code instead of just prose substantially boosted their reasoning capabilities.
A lot of papers stumble upon this while working with LLMs, and it's mentioned offhand in the conclusions. If you want a paper that explicitly tested this, here you go: https://aclanthology.org/2022.coling-1.437.pdf
> features an amorphous silicon dioxide coating that's infused into the inner-wall of the bottle. Essentially, this forms a glass-like finish that provides a totally natural, and completely inert, solution to the problem
"glass-like" != glass
Dollars to doughnuts that coating is spray-lined, which means some sort of solvent, at a minimum. And I'm unaware of any way to fuse silica at temperatures the plastic would survive, so it's a coating with a binder (dissolved in the solvent) that can still be damaged or scratched, and can leach or flake off into the water.
I wouldn't touch it with a 10-foot pole. Glass, stainless steel (316, please), or ceramics for me.