Today, partner QA is a manual process. An Ops team member must navigate a partner's checkout flow, capture the Rokt experience data from browser DevTools, verify the data integration, record screenshots along the way, and upload everything into a QA document by hand.
This workflow is time-intensive and prone to inconsistency, and it becomes a bottleneck when launch volume is high. It also takes time away from higher-leverage work, such as analyzing the overall customer experience and identifying where the team can improve the Rokt–client setup.
The team initially set out to automate the QA process entirely. The first approach used a fully autonomous browser agent to drive the checkout end-to-end — navigating a client's site, completing a test purchase, capturing the Rokt integration data, and generating the QA document with zero human involvement. In controlled scenarios it worked well.
But in practice, the long tail of edge cases proved difficult to solve reliably: CAPTCHA gates, mandatory account creation, bot detection measures, and atypical checkout flows all required progressively more engineering investment for diminishing returns. Full automation was solving the wrong problem first.
The team then reframed the product around a human–LLM collaboration model, which they named Prism. An Ops team member drives the test purchase themselves, while a browser extension records everything along the way: relevant network requests, the Rokt integration data, and UI screenshots at each key step. At the end of the session, the system takes over: it validates all integration attributes against the checklist, checks the UI/UX results against the customer policy handbook, and generates a standardized QA document with a structured analysis of the findings.
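The attribute check at the end of a session can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the attribute names, the checklist shape (attribute name mapped to whether it is required), and the status labels are hypothetical, not Prism's actual schema.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    attribute: str
    status: str  # one of "ok", "empty", "missing", "skipped"


def validate_attributes(captured: dict[str, str],
                        checklist: dict[str, bool]) -> list[Finding]:
    """Compare captured integration attributes against a checklist.

    `checklist` maps an attribute name to True if it is required.
    Both structures are hypothetical stand-ins for the real data.
    """
    findings = []
    for name, required in checklist.items():
        value = captured.get(name)
        if value is None:
            # Attribute never appeared in the recorded session.
            findings.append(Finding(name, "missing" if required else "skipped"))
        elif not value.strip():
            # Attribute was sent, but with a blank value.
            findings.append(Finding(name, "empty"))
        else:
            findings.append(Finding(name, "ok"))
    return findings
```

For example, calling `validate_attributes({"email": "a@b.co"}, {"email": True, "postcode": True})` would flag `postcode` as missing while marking `email` as ok. The policy-handbook check on UI/UX results is LLM-driven in the description above and is not modeled here.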
Starting a Prism session: the operator selects Partner Audit or Prod QA mode, searches for the account, and launches the recording session.
Mid-session: the operator navigates the partner's checkout flow while the Prism extension captures integration data, attributes, screenshots, and performance metrics in real time.
The output: a structured QA document automatically posted to the ticket with required attributes, performance metrics, integration issues, and linked screenshots — all generated from the session data.
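As a rough illustration of how session data might map onto the sections of that document, the sketch below renders a plain-text report. The `SessionData` fields, the section titles, and the layout are assumptions for illustration only; the real document format and the step that posts it to the ticket are not described in this article and are left out.

```python
from dataclasses import dataclass, field


@dataclass
class SessionData:
    """Hypothetical container for what one recording session captured."""
    account: str
    mode: str  # "Partner Audit" or "Prod QA"
    attributes: dict[str, str] = field(default_factory=dict)
    metrics_ms: dict[str, float] = field(default_factory=dict)
    issues: list[str] = field(default_factory=list)
    screenshots: list[str] = field(default_factory=list)


def _section(title: str, rows: list[str]) -> list[str]:
    """Return a titled block, writing '- none' when there are no rows."""
    return [title] + ([f"- {row}" for row in rows] or ["- none"]) + [""]


def render_qa_document(session: SessionData) -> str:
    """Render the captured session as a plain-text QA report."""
    lines = [f"QA report: {session.account} ({session.mode})", ""]
    lines += _section("Required attributes",
                      [f"{k}: {v}" for k, v in session.attributes.items()])
    lines += _section("Performance metrics",
                      [f"{k}: {v:.0f} ms" for k, v in session.metrics_ms.items()])
    lines += _section("Integration issues", session.issues)
    lines += _section("Screenshots", session.screenshots)
    return "\n".join(lines).rstrip() + "\n"
```

Keeping every section present, even when empty, means two reports can be compared line by line, which is one plausible way to get the consistency that manual write-ups lacked.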
The result matches the output quality of a fully automated run with far less engineering investment, and because a human handles the navigation, it works on every partner site without exception. The approach plays to the strengths of each side: humans handle the unpredictable checkout navigation, and AI handles the structured validation and documentation that previously consumed most of the operator's time.
The pivot from full autonomy to human–LLM collaboration is itself a key lesson. Rather than engineering around every edge case, the team identified where AI adds the most value (structured analysis and document generation) and where humans are irreplaceable (navigating the unpredictable real world). Prism is built by Julian Mullins and Pep Pattamasaevi.