We use Diffie to test Diffie. Every pull request opened against our repo runs a small suite of AI browser tests against our dev environment before it can be merged. The suite finished a recent run in 48 seconds with three passes, and the green check on the PR is what gives us confidence to ship.
This post walks through the actual setup, copy-paste ready. One workflow file, two GitHub secrets, three Diffie tests. If you already have a Diffie account, you can replicate the same gate on your repo in about ten minutes.
Why we dogfood
Two reasons. The obvious one is product quality: if Diffie cannot reliably run on Diffie, it cannot reliably run on anyone else's app. The less obvious one is that it makes us live with the same workflow our customers live with. Token rotation, flaky environments, slow previews, broken auth, surprising rate limits. Anything that hurts a customer's morning is going to hurt our morning first, which is exactly the feedback loop we want.
The CI workflow itself is intentionally boring. No custom action, no plugin, no third-party runner. Just curl and jq against two REST endpoints. We want anyone who ever has to debug it at 11pm to be able to read the whole thing in one screen.
The suite that gates every PR
We picked three tests, not thirty. The bar for inclusion was simple: if this flow breaks, no one can use the product, and our oncall has to roll the deploy back. Three flows met that bar.
- Diffie AI Login Test: The full Google OAuth round trip into the app. If this breaks, no one can sign in.
- Diffie Test Creation and Cleanup: Create a new test through the product UI, then delete it. Exercises the API, the database, the recording flow, and the dashboard.
- Diffie Run Test: Open an existing test, run it, and confirm the result page renders. Exercises the runner, the realtime stream, and the result UI.
The slowest of the three takes about 35 seconds of wall-clock time, and the whole suite finishes in under a minute. That budget matters. If gating the PR costs five minutes, people start asking to skip the gate. If it costs less than a minute, no one notices it is there until it fails.

The workflow file
Here is the file we ship at .github/workflows/ci.yml. Token and suite ID are pulled from GitHub repository secrets, never committed.
name: CI
on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
    branches: ['**']
  push:
    branches:
      - main
  merge_group:
jobs:
  e2e-tests:
    name: diffie-e2e-tests
    runs-on: ubuntu-latest
    # Only gate pull requests; push and merge_group events skip this job.
    if: github.event_name == 'pull_request'
    steps:
      - name: Run Diffie Test Suite
        env:
          DIFFIE_TOKEN: ${{ secrets.DIFFIE_TOKEN }}
          DIFFIE_SUITE_ID: ${{ secrets.DIFFIE_SUITE_ID }}
          PREVIEW_URL: https://dev.diffie.ai
        run: |
          echo "Running Diffie tests against: $PREVIEW_URL"
          # Call 1: start the suite run against the target environment.
          RESPONSE=$(curl -s -X POST \
            "https://api.diffie.ai/ci/suites/$DIFFIE_SUITE_ID/execute" \
            -H "Authorization: Bearer $DIFFIE_TOKEN" \
            -H "Content-Type: application/json" \
            -d "{\"baseUrl\": \"$PREVIEW_URL\"}")
          RUN_ID=$(echo "$RESPONSE" | jq -r '.suiteRunId')
          RUN_URL=$(echo "$RESPONSE" | jq -r '.url')
          # jq prints the string "null" when the field is missing.
          if [ "$RUN_ID" = "null" ] || [ -z "$RUN_ID" ]; then
            echo "Failed to start suite run"
            echo "$RESPONSE"
            exit 1
          fi
          echo "Suite run started: $RUN_ID"
          # Call 2: poll the run until it reaches a terminal status.
          while true; do
            STATUS_RESPONSE=$(curl -s \
              "https://api.diffie.ai/ci/suite-runs/$RUN_ID" \
              -H "Authorization: Bearer $DIFFIE_TOKEN")
            STATUS=$(echo "$STATUS_RESPONSE" | jq -r '.status')
            PASSED=$(echo "$STATUS_RESPONSE" | jq -r '.passed_tests')
            TOTAL=$(echo "$STATUS_RESPONSE" | jq -r '.total_tests')
            echo "Status: $STATUS ($PASSED/$TOTAL passed)"
            if [ "$STATUS" = "passed" ]; then
              echo "All tests passed."
              exit 0
            elif [ "$STATUS" = "failed" ]; then
              echo "Tests failed. View details: $RUN_URL"
              exit 1
            elif [ "$STATUS" = "cancelled" ]; then
              echo "Suite run was cancelled."
              exit 1
            fi
            # Any other status means the run is still in progress.
            sleep 10
          done
That is the entire integration. Two API calls (start and poll), three terminal statuses (passed, failed, cancelled), and a sleep. The same script works on GitLab, CircleCI, Jenkins, or anywhere else that can run a shell with curl and jq.
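One thing the loop above does not do is give up. If the status endpoint ever returns something other than passed, failed, or cancelled (an error body, for example, makes jq print null), it keeps polling until the CI provider's own job timeout ends it. Below is a minimal sketch of a bounded variant. The function names poll_suite_run and fetch_status, the MAX_ATTEMPTS and POLL_INTERVAL variables, and the return code 2 for a timeout are assumptions for illustration, not part of the file above; the endpoint and the .status field are the ones the workflow already uses.

```shell
# Sketch only: bounded polling around the same status endpoint used above.
# MAX_ATTEMPTS, POLL_INTERVAL, and return code 2 (timeout) are illustrative choices.

# Print the current status string for a suite run (needs curl and jq).
fetch_status() {
  curl -s "https://api.diffie.ai/ci/suite-runs/$1" \
    -H "Authorization: Bearer $DIFFIE_TOKEN" | jq -r '.status'
}

# poll_suite_run RUN_ID
# Returns 0 on passed, 1 on failed or cancelled, 2 when the attempt budget runs out.
poll_suite_run() {
  local run_id=$1
  local max_attempts=${MAX_ATTEMPTS:-30}   # 30 polls x 10 s sleep = about 5 minutes
  local interval=${POLL_INTERVAL:-10}
  local attempt=1 status
  while [ "$attempt" -le "$max_attempts" ]; do
    # Treat a failed request or unparsable body as "unknown" and keep polling.
    status=$(fetch_status "$run_id") || status="unknown"
    echo "Poll $attempt/$max_attempts: $status"
    case "$status" in
      passed) return 0 ;;
      failed|cancelled) return 1 ;;
    esac
    sleep "$interval"
    attempt=$((attempt + 1))
  done
  echo "Gave up after $max_attempts polls"
  return 2
}
```

In the workflow step, the while loop would shrink to a single call such as poll_suite_run "$RUN_ID", and any non-zero return fails the job. Keeping the HTTP call in its own function also makes the loop easy to exercise locally with a stubbed fetch_status.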
Two GitHub secrets and you are done
Both values come from the Diffie dashboard. Add them under Settings → Secrets and variables → Actions:
- DIFFIE_TOKEN: generate a token under Diffie Settings → API Tokens.
- DIFFIE_SUITE_ID: open the suite you want to gate the PR on and copy the ID from the URL.
We never check tokens into the workflow file. The placeholders above use the standard GitHub Actions secrets syntax so the values stay encrypted in your repo settings.
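One wrinkle worth planning for: GitHub does not pass repository secrets to workflows triggered by pull requests from forks, so both variables arrive as empty strings there. Below is a minimal sketch of a guard that could sit at the top of the run step, so the log names what is missing instead of reporting a failed suite start. The helper name require_env is hypothetical and not part of the workflow above.

```shell
# Sketch only: require_env is a hypothetical helper, not part of the shipped workflow.
# Names every listed variable that is unset or empty; returns 1 if any are missing.
require_env() {
  local missing=0 name value
  for name in "$@"; do
    # Indirect lookup of the variable called $name, portable across POSIX shells.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "Missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Possible usage at the top of the run step:
#   require_env DIFFIE_TOKEN DIFFIE_SUITE_ID || exit 1
```

The check only confirms that the values are present. It cannot tell whether the token is still valid, so the "Failed to start suite run" branch remains useful.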
Every PR runs through this gate
We test every pull request like this. The suite's job is not to re-derive the bug the author already fixed. It is to confirm that login, test creation, and the run path still work on the way out. That is the job of a CI gate: stop the obvious cliff, do not pretend to find every cliff.

Things that surprised us once we shipped this
A few practical notes from running the gate ourselves over the last few weeks.
- Pin your preview URL early. We test against a fixed dev.diffie.ai rather than per-PR previews. It keeps the suite stable and makes it obvious when a regression is the PR's fault versus an environment hiccup. If you use Vercel or Cloudflare per-PR previews, pass that URL into PREVIEW_URL instead.
- Three tests is the right number to start. We almost added a fourth (billing flow). We are glad we did not. Watching three tests for two weeks taught us where the suite was actually flaky before we doubled the surface area.
- Sleep ten seconds, not one. The first version of the polling loop slept one second. It worked, but it spammed the API, and the logs were unreadable. Ten seconds is fine, the suite still finishes in under a minute, and the GitHub log fits on one screen.
- The link to the failed run is the most valuable line in the script. When a test fails, you click through from the GitHub log to the Diffie dashboard, see screenshots, the trace, and the agent transcript. That click takes triage from minutes to seconds.
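The run link can also be lifted out of the raw log. Below is a minimal sketch, assuming the step runs on GitHub Actions: the ::error:: workflow command turns a line into an annotation on the run, and anything appended to the file named by GITHUB_STEP_SUMMARY is rendered as Markdown on the run's summary page. The helper name report_failure is hypothetical, and the message wording is illustrative.

```shell
# Sketch only: report_failure is a hypothetical helper, not part of the shipped workflow.
# Emits an error annotation, and appends a Markdown link to the job summary
# when GITHUB_STEP_SUMMARY is set (GitHub Actions sets it for every step).
report_failure() {
  local run_url=$1
  echo "::error::Diffie suite failed. View details: $run_url"
  if [ -n "${GITHUB_STEP_SUMMARY:-}" ]; then
    echo "Diffie suite failed: [view run]($run_url)" >> "$GITHUB_STEP_SUMMARY"
  fi
}

# Possible usage in the "failed" branch of the polling loop:
#   report_failure "$RUN_URL"; exit 1
```

Outside GitHub Actions the summary variable is unset, so the function degrades to a plain log line and stays usable on other CI systems.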
What we would add next
The honest answer: not much, and not soon. The temptation is to grow the suite. The failure mode of growing it is a slow gate that everyone learns to bypass. We will add a fourth test the day a fourth flow becomes load-bearing for our customers. Until then, three tests, 48 seconds, green check. That is the gate.
If you want to set up the same thing on your repo, the linked guide below walks through the workflow file in more detail, including dynamic preview URL setups for Vercel and Netlify.
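As a rough illustration of the dynamic preview URL idea, here is a minimal sketch of choosing the base URL inside the step: use a per-PR URL when one is available, otherwise fall back to the fixed dev environment. The helper name resolve_base_url and the PR_PREVIEW_URL variable are hypothetical, and how the preview URL reaches the job (a deployment event, a provider CLI, an earlier step's output) depends on the hosting provider and is left out.

```shell
# Sketch only: resolve_base_url and PR_PREVIEW_URL are hypothetical names.
# Prints the candidate URL if it looks like an http(s) URL, otherwise the fallback.
resolve_base_url() {
  local candidate=${1:-}
  local fallback=${2:-https://dev.diffie.ai}
  case "$candidate" in
    http://*|https://*) printf '%s\n' "$candidate" ;;
    *) printf '%s\n' "$fallback" ;;
  esac
}

# Possible usage before the execute call:
#   PREVIEW_URL=$(resolve_base_url "${PR_PREVIEW_URL:-}")
# Building the request body with jq keeps the URL JSON-escaped:
#   jq -n --arg baseUrl "$PREVIEW_URL" '{baseUrl: $baseUrl}'
```

A silent fallback can hide the fact that a preview was never deployed. Failing the job when the candidate is empty is a reasonable alternative if per-PR coverage matters more than a green check.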
Written by Anand Narayan, Founder of Diffie. First engineer at HackerRank, CEO at Codebrahma.
Published May 5, 2026