Adventures coding with AI

w00key

Ars Tribunus Angusticlavius
7,353
Subscriptor
You know that saying that software requirements always expand to require / target the latest hardware? I suddenly get a feeling that it happens for software development too. Things that would be waaaaay too expensive (in time -> $) to afford are suddenly doable, so work expands to include a ton of nice-to-haves.

Who could have ever afforded this time, for a SME and not a huge chain?

[Attached image: 1754141100050.png]


Okay I'm cheating hard as a stakeholder on both the dev and user side, but still, I wouldn't even dream of suggesting / implementing this for a digital menu a month ago, without Claude to help me rapidly iterate on requirements and poop out a full .md that can be machine translated into models.py, json API, and even fill most of its content.

This is going to tie into another Claude/Gemini generated web app that has a ton of allergen info, like this is Pearl River Bridge brand Soy Sauce, here's the photo of the front and ingredients list, and this contains Gluten and Soy. It will have every single packaging in our dry storage, fridge and freezer, for 100% coverage.

With a bit of LLM magic, it's going from Soy Sauce Chicken -> long description [manual approval / edit step here] -> list of ingredients [manual approval / edit step here] -> Soy Sauce, etc etc -> union of all allergens found. Contains Soy, Gluten, contains natural Sulfite from mushroom flavored dark soy sauce. That last one is very often / consistently missed when you manually type up a menu-allergens.xlsx.
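The last step of that chain (ingredients -> union of all allergens) is plain set arithmetic, so here is a rough sketch of it. Everything named here is made up for illustration: `PRODUCT_ALLERGENS`, `dish_allergens`, and the allergen entries themselves (only the soy sauce and sulfite examples come from the post above). The LLM and manual approval steps are assumed to have already produced the ingredient list.

```python
# Hypothetical lookup table: product name -> allergens read off the packaging.
# In the real app this would come from the allergen database, not a dict.
PRODUCT_ALLERGENS = {
    "soy sauce": {"soy", "gluten"},
    "mushroom flavored dark soy sauce": {"soy", "gluten", "sulfite"},
    "chicken": set(),
}


def dish_allergens(ingredients, product_allergens):
    """Return (union of allergens, ingredients that have no record yet)."""
    found = set()
    unknown = []
    for name in ingredients:
        key = name.strip().lower()
        if key in product_allergens:
            found |= product_allergens[key]
        else:
            # Do not guess: surface missing products for manual review.
            unknown.append(name)
    return found, unknown


allergens, unknown = dish_allergens(
    ["Soy Sauce", "Mushroom Flavored Dark Soy Sauce", "Chicken", "Star Anise"],
    PRODUCT_ALLERGENS,
)
print(sorted(allergens))
print(unknown)
```

Returning the unknown ingredients separately is a deliberate choice in this sketch: an ingredient with no record is not the same thing as an ingredient with no allergens.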


[edit] Random thought, it wasn't that long ago that "Star Trek" style "computer, do whatever I need" commands were totally science fiction and IRL "assistant" apps were near useless. But now it's totally doable? People are sharing "what is the most lazy / ridiculous prompt that worked?" Things that shouldn't work, just did. "Fix it [paste stack trace or logs]" isn't that crazy, but "Make doge the favicon" is wat.
 

hanser

Ars Legatus Legionis
42,256
Subscriptor++
^^ Man, I've done a bunch of "low on the priority list" work over the last two weeks. Just because I can, and I'm using it as a way to familiarize myself with Claude Code as a programming tool. (I actually think it was you, w00key, that made me think "I should try this outside my comfort zone on smaller tasks that'll never see the light of day otherwise.")

I'm alternating between "things I have deep expertise in" and "things I know nothing about", and it's equally effective in both places.

--

Unrelated, but I've found Sonnet to be just fine when using extended thinking in a chatbox. I sometimes have to correct it in CC, though. My colleagues have said CC goes downhill when they have to switch back to Sonnet. Hmm. I wish there was a "use extended thinking" option in CC for Sonnet. Might work better?
 
After poking around with a couple this afternoon, I'd say "no." I don't have a huge machine, but the simple query "write C code to reverse the columns of a CSV file" generates nonsense that crashes, even though I'm sure there are plenty of examples on the web. Even GPT 4o-mini, which I think is too big to run locally, forgets to strip the trailing newline from the final field, so e.g.

ab,cd,ef
12,34,56

becomes

ef
,cd,ab
56
,34,12

I gather the big models aren't this stupid, but it sadly looks like local ones have a long ways to go.
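For anyone who wants to see the failure mode without a C toolchain, here is a small Python sketch of it (Python rather than C purely for brevity; the function names are made up, and it assumes simple CSV with no quoted fields or embedded commas). The naive version leaves the line's trailing newline glued to the last field, which then ends up first after the reversal.

```python
def reverse_columns_naive(line):
    # Bug: the trailing "\n" is still attached to the last field,
    # so after reversing it sits in the middle of the output row.
    return ",".join(reversed(line.split(","))) + "\n"


def reverse_columns(line):
    # Strip the line ending first, reverse the fields, then add it back.
    body = line.rstrip("\r\n")
    return ",".join(reversed(body.split(","))) + "\n"


rows = ["ab,cd,ef\n", "12,34,56\n"]
naive = "".join(reverse_columns_naive(r) for r in rows)
fixed = "".join(reverse_columns(r) for r in rows)

print(naive, end="")
print(fixed, end="")
```

The naive output matches the broken result quoted above; the fixed version prints `ef,cd,ab` and `56,34,12`. For real-world files with quoting, Python's standard `csv` module is the safer route.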
I think the local models are going to need RAG to match performance on "do something that there are plenty of examples of". They're intrinsically too small to memorize all that stuff in the weights.

It would be rather advantageous for many coders if this can be done with RAG rather than memorization in weights, because with RAG you can also cite the original sources or (if it's coming from some library) not duplicate the functionality.
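To make the "cite the original sources" point concrete, here is a toy sketch of the retrieval half of RAG. It is an illustration under heavy assumptions: the corpus, the source paths, and the function names are all hypothetical, and it scores by keyword overlap where a real setup would more likely use an embedding index. The point is only that each retrieved snippet carries its source label into the prompt.

```python
import re

# Hypothetical corpus of (source label, snippet) pairs.
CORPUS = [
    ("docs/csv-notes.txt", "strip the trailing newline before splitting a csv line on commas"),
    ("docs/sorting.txt", "use a stable sort when the order of equal keys matters"),
    ("docs/strings.txt", "reverse a list of fields then join them with commas"),
]

STOPWORDS = {"a", "the", "of", "on", "with", "then", "them", "when", "use"}


def tokens(text):
    """Lowercase word set with common filler words removed."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOPWORDS


def retrieve(query, corpus, k=2):
    """Return up to k (score, source, text) hits ranked by keyword overlap."""
    q = tokens(query)
    scored = [(len(q & tokens(text)), source, text) for source, text in corpus]
    scored = [hit for hit in scored if hit[0] > 0]
    scored.sort(key=lambda hit: -hit[0])  # stable sort keeps corpus order on ties
    return scored[:k]


def build_prompt(query, hits):
    """Put the retrieved snippets, tagged with their sources, ahead of the task."""
    context = "\n".join(f"[{source}] {text}" for _, source, text in hits)
    return f"Context:\n{context}\n\nTask: {query}\nCite the [source] you used."


query = "reverse the columns of a csv file"
hits = retrieve(query, CORPUS)
prompt = build_prompt(query, hits)
print(prompt)
```

Because the source label travels with each snippet, the model can be asked to cite it, and a human can check the citation against the original.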
 

w00key

Ars Tribunus Angusticlavius
7,353
Subscriptor
Unrelated, but I've found Sonnet to be just fine when using extended thinking in a chatbox. I sometimes have to correct it in CC, though. My colleagues have said CC goes downhill when they have to switch back to Sonnet. Hmm. I wish there was a "use extended thinking" option in CC for Sonnet. Might work better?
I think it's very much the user. I was very careful in checking and double checking plans before and it worked great.


Yesterday I was really lazy and just let it build the whole UI for Store => Menu => Section => Item, plus POS linking etc., as a one-shot prompt from a hastily defined requirements file. Bunch of CRUD pages and a dashboard-style overview. Go to work, see you in 30 minutes. Holy shit, that code was a huge disaster. No tests (note to self: ALWAYS tell it to generate tests), runtime errors everywhere (yay Python), forgot to instantiate Form in view classes and pass it into the context, random mix of {{ form|crispy }} and manual input type=text. I considered nuking it and starting over again from the plan.

But with like 20 additional prompts it was sort of usable? I don't doubt that Opus would do better with more freedom (read: laziness) and not make these stupid mistakes, but Sonnet clearly can't handle this.