Agent Skills in Cartagena, Colombia

Talking about AI architecture is one thing. Doing it entirely in Spanish is another.

While working on a project in El Salvador, I took a detour to Colombia to speak at GDG Cartagena’s Build with AI event. It was my first talk as a Google Developer Expert (Cloud & AI), and my first time presenting deep technical architecture in a foreign language.

(Duolingo, if you are reading this—please sponsor me already 🦉).

It was such an honor to be invited to speak in Cartagena. Instead of high-level AI theories, we focused on a pain point every developer knows too well: the constant tax of brittle end-to-end testing.

The Universal Developer Tax: Brittle E2E Testing

For the live demo I used my side project, thebookingsapp, a high-volume restaurant reservation system that doubles as my playground for agentic experimentation. We looked at what happens when frontend builds regenerate dynamic HTML IDs, as Vue or Nuxt builds can.

Hardcoded Playwright or Selenium scripts break. You end up spending more time fixing your test suite than shipping features. Traditional scripts fail because they lack context: they know which element to click, but they don't understand what they are trying to achieve.
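To make the failure mode concrete, here is a minimal sketch using only Python's standard library. The two HTML snippets and the generated IDs are hypothetical stand-ins for two builds of the same button, not markup from thebookingsapp. It shows a locator pinned to a generated ID failing on the next build, while a lookup by the element's stated purpose still works.

```python
from html.parser import HTMLParser

# Hypothetical markup: the same button in two builds, differing only in the generated id.
BUILD_A = '<form><button id="btn-3fa9c1" aria-label="Confirm reservation">Confirm</button></form>'
BUILD_B = '<form><button id="btn-b7712e" aria-label="Confirm reservation">Confirm</button></form>'


class AttrFinder(HTMLParser):
    """Records whether any start tag carries the given attribute/value pair."""

    def __init__(self, attr: str, value: str) -> None:
        super().__init__()
        self.attr = attr
        self.value = value
        self.found = False

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples
        if (self.attr, self.value) in attrs:
            self.found = True


def has_element(html: str, attr: str, value: str) -> bool:
    """Return True if the document contains an element with attr == value."""
    finder = AttrFinder(attr, value)
    finder.feed(html)
    return finder.found


if __name__ == "__main__":
    # Locator hardcoded against build A's generated id
    print(has_element(BUILD_A, "id", "btn-3fa9c1"))
    # Same locator against build B: the id was regenerated
    print(has_element(BUILD_B, "id", "btn-3fa9c1"))
    # Lookup by purpose survives the rebuild
    print(has_element(BUILD_B, "aria-label", "Confirm reservation"))
```

An agent goes further than matching an accessible label, since it reasons about the goal of the step, but the sketch shows why anything tied to a generated identifier is the first thing to break.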

The Solution: Building Autonomous QA Agents

The fix isn't better scripts; it's autonomy. I demonstrated how to wire up Gemini CLI and Chrome DevTools MCP with "Agentic Skills".

An Agentic Skill is an open-standard, Markdown-first specification that packages human procedural knowledge, rules, and edge cases into a format that a modern AI agent can consume. By structuring instructions within a simple .agents/skills/ directory inside your repository, you supply the agent with a dedicated playbook:

your-project/
├── .agents/
│   └── skills/
│       └── web-qa-agent/
│           ├── SKILL.md          <-- Metadata, rules, entry point
│           ├── references/       <-- Deep documentation & snippets
│           └── scripts/          <-- Helper automation tools
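The layout above can be scaffolded with a few lines of Python. This is a sketch under assumptions: the `name` and `description` frontmatter fields follow the Agent Skills convention as I understand it, and the body of the playbook is placeholder text, not the SKILL.md from the talk. Check the fields against the spec your agent actually reads.

```python
import tempfile
from pathlib import Path

# Placeholder playbook; the frontmatter fields (name, description) are an assumption
# about the skill format, and the steps are illustrative only.
SKILL_MD = """\
---
name: {name}
description: Stress-tests the reservation widget end to end. Use when asked to QA the booking flow.
---

# Web QA Agent

1. Open the reservation widget.
2. Run each scenario documented in references/.
3. Report failures with the step, the expected result, and the observed result.
"""


def scaffold_skill(repo_root: Path, skill_name: str = "web-qa-agent") -> Path:
    """Create .agents/skills/<skill_name>/ with SKILL.md, references/ and scripts/."""
    skill_dir = repo_root / ".agents" / "skills" / skill_name
    for sub in ("references", "scripts"):
        (skill_dir / sub).mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    if not skill_file.exists():  # never overwrite an existing playbook
        skill_file.write_text(SKILL_MD.format(name=skill_name), encoding="utf-8")
    return skill_file


if __name__ == "__main__":
    # Demo in a throwaway directory; pass your repository root in real use.
    with tempfile.TemporaryDirectory() as tmp:
        created = scaffold_skill(Path(tmp))
        print(created.relative_to(tmp))
```

Keeping SKILL.md short and pushing detail into references/ lets the agent load the entry point cheaply and pull in the deeper documents only when a step needs them.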

To make this autonomous, we combine the open-source Gemini CLI tool with the Chrome DevTools Model Context Protocol (MCP) server. This allows the model to spin up a headless browser instance, inspect layout rendering, and dynamically interact with DOM elements using semantic understanding rather than brittle CSS selectors.
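As a rough sketch of the wiring, the snippet below builds the kind of `mcpServers` entry that Gemini CLI reads from its settings file. The details are assumptions to verify against each project's README: that the server is launched with `npx chrome-devtools-mcp@latest`, that it accepts a `--headless` flag, and that the settings live in `.gemini/settings.json`. The server key `chrome-devtools` is just a label chosen here.

```python
import json


def build_settings(headless: bool = True) -> dict:
    """Build an MCP server entry for the Chrome DevTools MCP server.

    Assumption: the server is started via npx and supports --headless.
    """
    args = ["-y", "chrome-devtools-mcp@latest"]
    if headless:
        args.append("--headless")
    return {
        "mcpServers": {
            "chrome-devtools": {  # arbitrary label for this server
                "command": "npx",
                "args": args,
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON so it can be merged by hand into the settings file
    # (assumed location: .gemini/settings.json in the project).
    print(json.dumps(build_settings(), indent=2))
```

Printing the JSON instead of writing the file keeps the sketch safe to run in a repository that already has settings you do not want overwritten.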

Instead of writing locators by hand, you write a Markdown playbook (SKILL.md) that teaches an LLM how to systematically stress-test your specific reservation flow, validate fields, and visually check mobile responsiveness across six critical scenarios:

The Happy Path: Open the widget, dynamically select an open date, fill customer details, and confirm success.
Dynamic Party Sizes: Validate that the available times automatically adapt when switching between 1, 4, and 8 guests.
Field Validation Hardening: Submit incomplete data to confirm the UI returns clear validation errors.
Responsive Mobile Testing: Force the browser viewport to a mobile profile (375x812) to verify layout integrity.
Localization: Load the system with and without localization parameters (e.g., ?lang=es) to confirm semantic content shifts accurately.
Boundary Dates: Attempt reservations for the current minute, or 30 days out, to verify graceful system degradation.
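The concrete inputs named in these scenarios can be generated up front and dropped into the skill's references/ folder so the agent does not have to guess them. The sketch below is illustrative: `BASE_URL` is a hypothetical placeholder, and the helper names are mine, not part of the talk's playbook. The party sizes, viewport, `lang=es` parameter, and 30-day boundary come from the list above.

```python
from datetime import datetime, timedelta
from typing import List, Optional
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

BASE_URL = "https://example.com/widget"  # hypothetical widget URL


def with_lang(url: str, lang: Optional[str]) -> str:
    """Return url with ?lang=<lang> appended, or unchanged when lang is None."""
    if lang is None:
        return url
    parts = urlsplit(url)
    query = parse_qsl(parts.query) + [("lang", lang)]
    return urlunsplit(parts._replace(query=urlencode(query)))


def boundary_slots(now: datetime) -> List[datetime]:
    """The current minute and the same minute 30 days out."""
    current_minute = now.replace(second=0, microsecond=0)
    return [current_minute, current_minute + timedelta(days=30)]


def build_matrix(now: datetime) -> dict:
    """Collect the test inputs named in the six scenarios."""
    return {
        "party_sizes": [1, 4, 8],
        "mobile_viewport": {"width": 375, "height": 812},
        "urls": [with_lang(BASE_URL, None), with_lang(BASE_URL, "es")],
        "boundary_slots": [slot.isoformat() for slot in boundary_slots(now)],
    }


if __name__ == "__main__":
    for key, value in build_matrix(datetime.now()).items():
        print(f"{key}: {value}")
```

Passing `now` in as a parameter, instead of calling `datetime.now()` inside the helpers, keeps the boundary dates reproducible when you want to replay a failing run.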

Because Gemini is multimodal, it doesn't just read the HTML source—it evaluates visual rendering to ensure elements don't overlap or break.

Skills are the API Docs for the Agent Era

The takeaway: AI agents already know how to use a browser, fill a form, or generate a report. What they need is your operational context.

A decade ago, we focused intensely on writing clean API documentation so that software applications could talk to other software applications. Agentic Skills are rapidly becoming the equivalent for the agent era: they bridge the gap between the business logic that lives in people's heads and autonomous execution.

Huge thanks to Giselle Ulloa and the entire GDG Cartagena team for a flawless event.

Are you still maintaining hardcoded E2E scripts, or are you experimenting with agentic QA? Let me know! 👇

Alexander Amin