I am against vibe coding on principle. HOWEVER...
I had the goal to add a simple gallery to this blog. I wasn't sure what I wanted it to look like and did not have strong feelings about the implementation. I knew that I did want it to match the existing vibes of the blog, rather than looking like I grabbed the first thing I could find from a coding tutorial. (Which is typically what I would start with.)
I booted up my GitHub Codespace to edit the site from mobile. A Codespace lets me edit the site and even run development commands in the terminal, as if I had a dedicated dev box that I'm accessing from my iPad.
When I got into Codespaces, I noticed that Copilot was now prominently available. "Pshaw," I thought, and then went about brainstorming how to add my photo gallery.
Then I thought: I've seen reviews of the recent AI models showing that they are more than capable of cobbling together a first draft. Copilot is also set up to be agentic in Codespaces, meaning that it has full access to read the files and make edits on the fly. It can even run commands in the terminal and check the logs for runtime errors. This allows it to iterate in a loop, making changes and checking output, all without human intervention.
I gave it a go.
My previous experiences with AI coding have taught me some key lessons.
- The more you know about the project, the better.
- The more popular the framework is, the better.
- The more specific you are, the better.
The more you know about the project, the better
Project Language Refactor
In one project, I was porting some boutique Python code, and I wanted to rebuild it in Swift. I had already done all the Python coding by hand, including custom GUIs. I knew the project goals, requirements, and architecture in depth. I threw ChatGPT at the code refactor (using Xcode's agentic tools at the time). It nearly one-shotted a correct and viable app. There were some small changes to make to the GUI, but the app was functional from the start.
Project from Scratch
Following on the great success of the project above, I sought to test my newfound superpowers on a new project. "If it's so easy to build an app in Swift using these tools, even though I knew nothing about Swift at all, I should be able to build an app from scratch just as easily," I thought, in my hubris. I set out to build a new app, with clear goals, but zero thoughts on implementation or architecture. I then spent the next 5 hours chasing debug errors, and watching the AI delete and recreate massive sections of the project from one prompt to the next. I spent 3 days over the course of the project trying to get at least a semi-functional proof of concept, to no avail. I gave up on getting a chatbot to produce anything usable here, at least until I had written my own code to set a foundation.
The more popular the framework, the better
Radicale
I had attempted to set up a Radicale server to host shared, synced calendars and reminders for our team. Radicale is an open source WebDAV and CalDAV server. It has plenty of good documentation...as long as you are building something simple for yourself. As soon as you start trying to add shared, synced calendars, backed by LDAP and group permissions, the documentation starts looking sparse.
In this project, I thought it would be another good opportunity to test what the newest chatbots could do.
I tested gpt-oss and qwen3-coder-next through Ollama. In my prompts, I included instructions to check their work against the published documentation for Radicale, which they have almost certainly ingested, to make sure that their answers were viable.
In the output from both models, they "thought" about their answers, and then "checked" their work. Then they proceeded to spew nonsense about non-existent settings, missing config files, and version mismatches.
Both models presented their answers with utmost confidence, even when I attempted to correct them. After correction, they would change their answer to some other hallucination, and toggle between hallucinated answers the more I corrected them.
This was not a waste. I did find it helpful to "brainstorm" with the models, even though they did not output a single correct answer. Correcting their "thought process" helped me to solidify my own, and I was able to work out a solution.
My takeaway here is that because the documentation was sparse for what I was attempting, and because there were not even community discussions with guides for a similar configuration, the models did not have enough training data to be helpful for this particular case.
The more specific you are, the better
This is the most recent lesson I have learned from using AI models. For this blog, and my attempt to add a gallery page, I could prompt the model with something like this:

> Create a drop down menu, and give it a low elevation shadow. Make the border for the menu match the color of the background for the page.

Copilot, in this case, would then proceed to create the drop down as requested, but would make mistakes, such as adding a white border instead of a green one, or failing to add a box shadow at all.
If I scrapped those changes and prompted it again with something like this instead:

> Create a drop down menu, and apply the `--shadow-elevation-low` style from `style.css`. Make the border for the menu clear, with no color, so that the background for the page shows through.

It could then provide the correct answer in a single attempt.
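To make the difference concrete, the second prompt is effectively spelling out the CSS for the model. A rough sketch of what it should produce (`--shadow-elevation-low` is the real variable from the post; the selector and the `.open` class are hypothetical names for illustration):

```css
/* Hypothetical dropdown styling; only the shadow variable is from style.css. */
.gallery-dropdown {
  box-shadow: var(--shadow-elevation-low); /* reuse the site's existing shadow token */
  border: 1px solid transparent;           /* "clear" border: page background shows through */
}
```

The vague prompt leaves the shadow values and border color up to the model's imagination; the specific prompt pins both to things that already exist in the project, so there is far less room to hallucinate.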
Vibe Coding an Image Gallery
This project in particular has some traits that make it an excellent candidate for Agentic AI coding.
- It's in a well known framework (Gatsby)
- It's a well known application of the technology (Web, image gallery)
- I know the project well (I started with a template, but built out a lot of customization myself)
- I know the framework, and its limitations (I am well accustomed to working in CSS, HTML, and Gatsby)
- Importantly, everything is in Git, so reverting AI hallucinations to a known-good commit is easy.
I know enough that I could have worked through setting up the gallery all by myself, and would have been able to finish it in a day or two of spare time.
But working with Copilot, running Claude Haiku 4.5, I was able to complete the feature additions in 2 hours, start to finish.
This included debugging, adding in accessibility features, accounting for small device screens, matching existing CSS colors and styles, etc.
Only once did I find that Copilot was out in the weeds on an attempted solution.
In that case, ye olde `git stash --all && git pull` wiped the broken code from my working tree, and I was able to start a new prompt with more precise instructions.
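That reset can be sketched in a throwaway repo like this (a minimal demo, not my exact session: I use `git stash drop` here instead of `git pull` so it runs without a remote, and the filenames are made up):

```shell
# Demo: discard an agent's broken edits, including files it invented.
set -e
cd "$(mktemp -d)"
git init -q
git config user.email demo@example.com
git config user.name demo
echo "good code" > gallery.js
git add gallery.js && git commit -qm "known-good state"

echo "hallucinated code" > gallery.js   # the agent mangles a tracked file...
echo "mystery styles" > stray.css       # ...and invents a new one

git stash --include-untracked --quiet   # shelve tracked + untracked changes
git stash drop --quiet                  # throw the shelf away for good
cat gallery.js                          # back to "good code"
```

After the stash-and-drop, the working tree matches the last commit again, which is exactly the clean slate you want before writing a more precise prompt.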