HPSimulator 30 minutes ago [-]
One thing that feels different with AI-generated code is that the "design discussion" often happened inside the prompt instead of the PR.
In traditional workflows, a lot of the reasoning is visible through commit history, comments, or intermediate refactors. With LLMs, the reasoning step can be hidden because the model collapses that exploration into a single output.
What we've started doing internally is asking for two artifacts instead of just the code:
1. the prompt or task description that produced the code
2. the generated code itself
Reviewing both together gives you much better context about the intent, constraints, and tradeoffs that led to the implementation.
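One lightweight way to make that two-artifact rule stick is a CI check on the PR description. This is only a sketch: the "## Prompt / Task" heading is a hypothetical team convention, not anything the thread prescribes.

```python
import re

# Hypothetical convention: every PR description must carry the prompt/task
# artifact alongside the diff, under a "## Prompt / Task" heading.
REQUIRED_SECTIONS = ["## Prompt / Task"]

def missing_sections(pr_body: str) -> list[str]:
    """Return the required sections absent from the PR description (case-insensitive)."""
    return [s for s in REQUIRED_SECTIONS
            if not re.search(re.escape(s), pr_body, re.IGNORECASE)]

# A CI job could call this against the PR body and fail the build when
# anything comes back non-empty.
print(missing_sections("Adds retry logic.\n\n## Prompt / Task\nAdd retries to the client."))
```

The same idea extends to any artifact the team decides to require (design notes, test evidence) by adding entries to the list.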
ativzzz 2 hours ago [-]
We as engineers are still paid to create working software. As such, we are responsible for the genAI code we ship to production. That is, our customers are paying us for working software, so we should all understand what the AIs are writing. This makes us slower and turns us into the bottleneck, but it's part of what our business offers.
If I were working at a startup or on a personal project, I wouldn't read the code; instead I'd build a tighter verification loop to ensure the code functions as expected. That's much harder to do in an existing system that was built pre-AI.
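A "tighter verification loop" can be as simple as hammering the generated function with randomized inputs and checking invariants rather than reading the implementation. A minimal sketch, where `generated_slugify` is a hypothetical stand-in for whatever the model produced:

```python
import random
import string

# Stand-in for AI-generated code under test (hypothetical example function).
def generated_slugify(title: str) -> str:
    return "-".join(title.lower().split())

def verify(n_cases: int = 1000) -> None:
    """Cheap randomized verification: check invariants instead of reading the code."""
    for _ in range(n_cases):
        words = ["".join(random.choices(string.ascii_letters, k=random.randint(1, 8)))
                 for _ in range(random.randint(1, 5))]
        title = " ".join(words)
        slug = generated_slugify(title)
        assert slug == slug.lower(), "slugs must be lowercase"
        assert " " not in slug, "slugs must not contain spaces"

verify()
```

A property-based testing library would do this more rigorously, but even a loop like this catches regressions without line-by-line review.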
raw_anon_1111 6 hours ago [-]
My controversial opinion is that, for the most part, I don’t, except for the unit tests and, usually, the integration tests and load testing I have it generate.
Other than that, I review the known “works in testing but breaks in production” areas: concurrency code, scalability issues, correct data-load patterns, database indexes, etc.
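The data-load pattern failure mode is easy to illustrate: an N+1 query looks fine against a tiny test database and falls over at production scale. A sketch with an in-memory SQLite table (table and column names invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER)")
conn.executemany("INSERT INTO orders (user_id) VALUES (?)",
                 [(i % 10,) for i in range(100)])

user_ids = list(range(10))

# N+1 pattern: one query per user. Fine with 10 users in a test DB,
# painful with millions in production.
per_user = {u: conn.execute("SELECT id FROM orders WHERE user_id = ?", (u,)).fetchall()
            for u in user_ids}

# Batched alternative: a single query, then group rows in memory.
placeholders = ",".join("?" * len(user_ids))
rows = conn.execute(f"SELECT user_id, id FROM orders WHERE user_id IN ({placeholders})",
                    user_ids).fetchall()
batched: dict[int, list[tuple[int]]] = {}
for user_id, order_id in rows:
    batched.setdefault(user_id, []).append((order_id,))

assert per_user == batched  # same result, very different query count
```

This is exactly the class of issue that passes every functional test and only shows up under real data volumes, which is why it deserves human review even when the rest of the diff doesn't.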
I also validate non-functional requirements around security, logging, costs, etc.
This is the same thing I did when working with other team leads as the “architect” who was mostly concerned with whether it would fall over in production, cause security issues, cause compliance issues, and whether people were following the standards we agreed on.
On the other hand, I haven’t done any serious web development since 2002. Now I vibe code internal web admin apps for customers, using AWS Cognito for authentication. I don’t look at a line of that code. I verify that it works and that the UX is right (not the UI; it’s ugly AF).
The chance of any human ever looking at the “AI-first” code I write is slim. By AI-first I mean detailed markdown files, with references to other markdown files, that start with the contract and requirement-gathering transcripts and are modular by design.
allinonetools_ 4 hours ago [-]
In our team we treat AI-generated code the same way we treat junior-written code — the important part is whether the author actually understands it. During review we usually ask for a short explanation of the approach and the edge cases they considered. If they cannot explain it clearly, the code probably needs another pass.
lazypl82 15 hours ago [-]
The artifact review approach makes sense to me. How the code was produced doesn't change what the reviewer needs to answer: does this do what it's supposed to, and does it do it cleanly? If anything, I'd rather have a short design note in the PR – intent, constraints, alternatives considered – than a full prompt history. The prompt history is noise; the intent is signal.
That said, one thing review can't fully cover is runtime behavior under real traffic. Not saying that's a review problem – it's just a separate layer that still needs attention after the merge.
christophilus 19 hours ago [-]
> lacking the context around how the change was produced, the plans, the prompting, to understand how an engineer came to this specific code change as a result. Did they one-shot this? did they still spend hours prompting/iterating/etc.? something in-between?
In my opinion, you have to review it the way you always review code. Does the code do what it's supposed to do? Does it do it in a clean, modular way? Does it have a lot of boilerplate that should be reduced via some helper functions, etc.
It doesn't matter how it was produced. A code review is supposed to be: "Here's this feature {description}" and then, you look at the code and see if it does the thing and does it well.
al_borland 19 hours ago [-]
At the end of the day, the code is the only thing that matters. That is what defines, in concrete terms, what the program is doing. I don't view an LLM as a high-level language, because we can run a prompt 10 different times and get 10 different results. That isn't valuable when it comes to programming.
Even without LLMs, there was a thought process that led to the engineer coming to a specific outcome for the code, maybe some conversations with other team members, discussions and thoughts about trade offs, alternatives, etc... all of this existed before.
Was all of that included in the PR in the past? If so, the engineer had to add it then, so they should still do so now. If not, why do you all of a sudden need it just because an AI was involved?
I don't see why things would fundamentally change.
JamesLangford 19 hours ago [-]
I mostly ignore how the code was made and review the artifact the same way I always have: correctness, tests, readability, and whether it matches the system design. The prompt history is fine, but it’s not something I want to rely on during review because it’s not stable or reproducible.
What does help is requiring a short design note in the PR explaining the intent, constraints, and alternatives considered. That gives the context reviewers actually need without turning the review into reading a chat transcript.