The short version: AI performance creative is the discipline of producing high-volume, photorealistic ad creative with generative AI under human creative direction. We rebuilt our pipeline around three tools (GPT Image 2 through OpenAI’s Codex CLI, Seedance 2.0, and Lovart) plus a Claude Code creative director agent. The first TikTok campaign we shipped with this setup dropped CPA roughly 50% versus the creator-shot baseline.
I want to be honest about something. Six months ago I would not have run a campaign that leaned this hard on AI-generated people. The avatars looked off. Hands were a mess. Eyes did not track. You could feel the synthetic in the first half second, and TikTok’s audience is brutal about that.
That changed faster than I expected. The current generation of image models produces faces I genuinely cannot tell apart from a real creator’s selfie. Same for the animation. So we rebuilt the workflow around it, ran a campaign, and the numbers came back better than the version we shot with real people.
This post is the actual workflow we now run as part of our AI performance creative service. Three tools, three steps, one human still in the loop where it matters.
What is AI performance creative?
AI performance creative is the discipline of producing high-volume ad creative using generative AI tools (image, video, copy) under the direction of a human creative lead, then testing those variants against performance metrics like CPA, ROAS, and CTR. It pairs the speed and volume of generative AI with the strategic point of view of a human director.
The category is sometimes confused with “AI ad creative” or “AI-generated ads,” but the distinction matters. AI ad creative describes the output. AI performance creative describes the discipline: the pipeline, the testing cadence, and the feedback loop between media performance and the next round of generation. Without that loop, you are just generating images. With it, you are running a creative system that compounds.
The industry term has emerged over the past 12 to 18 months as image and video models crossed the threshold of “indistinguishable from a real shoot” for paid social formats. The 9x16 vertical ad on TikTok and Reels is the dominant proving ground because the formats are short, the production volume is high, and the audience tolerates handheld-feeling footage.
What does the workflow look like end to end?
The pipeline has three steps and one principle: a human creative director still owns the brief, but everything downstream of the brief is now AI. Direction happens in Claude Code with a creative director agent. Image generation happens in Codex with GPT Image 2. Animation happens in Seedance 2.0, batched through Lovart.
Here is how the steps stack up against the old pipeline:
| Step | Old Pipeline | New Pipeline |
|---|---|---|
| Creative direction | Human director, briefed in Slack, drafts in a doc | Human director paired with a Claude Code creative agent in the same workspace |
| On-camera talent | Cast a creator, ship product, wait on raw footage | Generate the avatar in Codex with GPT Image 2 |
| Animation and delivery | Editor cuts, captions, exports per spec | Lovart batches the stills into Seedance 2.0 and returns finished 9x16 video |
| Time per variant | 5 to 10 days | Under an hour of human time |
Step 1: Creative direction in Claude Code, with a human still in the chair
The first step is not the model. It is the brief. The brief is what stops the rest of the pipeline from making generic, soulless content, which is exactly what AI-only creative looks like when nobody owns the point of view.
We pair our human creative director with a Claude Code agent we call the creative director agent. It lives in the same repo as the rest of the project. The human writes the angle and the emotional beat. The agent drafts variants of the script, the on-screen caption, and the avatar’s wardrobe and setting in one pass. The director keeps what works and rewrites what does not.
The reason this matters is that the avatar is downstream of the brief. If the brief is “woman, 28, talks about loneliness,” the output will be flat. If the brief is “Korean woman lying in bed at 3am, hands over her mouth, embarrassed about how much she narrates her life inside her own head,” the output is a specific human moment. Same model, completely different ceiling.
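To make that concrete, here is what the difference looks like if you write the brief as structured data the rest of the pipeline can read. The schema and field names below are a hypothetical sketch, not our actual format; the point is that the specific brief carries the detail the image and animation steps will need, and the flat one does not.

```python
# Hypothetical brief schema, for illustration only — not our production format.
from dataclasses import dataclass

@dataclass
class Brief:
    character: str       # who the avatar is
    setting: str         # where and when the moment happens
    emotional_beat: str  # the feeling the hook has to land
    wardrobe: str        # what the avatar wears and how they hold themselves

# Flat brief: technically complete, creatively empty.
flat = Brief(
    character="woman, 28",
    setting="unspecified",
    emotional_beat="talks about loneliness",
    wardrobe="casual",
)

# Specific brief: same fields, but a real human moment the model can render.
specific = Brief(
    character="Korean woman, late 20s",
    setting="lying in bed at 3am, lit only by her phone",
    emotional_beat="embarrassed about how much she narrates her life inside her own head",
    wardrobe="oversized sleep shirt, hands over her mouth",
)
```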
Step 2: Photoreal avatars in GPT Image 2 through Codex
Step two is generating the avatar. We use OpenAI’s GPT Image 2, the new image model in the GPT 5.5 family, accessed through the Codex CLI rather than the chat app.
Codex matters here for one reason: we are almost never generating one image. We are generating 4 to 12 variants of the same character in different framings, lighting conditions, and micro-expressions. Codex lets us script that whole batch from one prompt file, version-control the prompts, and re-run a single variant without re-rolling the entire set. That workflow is much harder in the chat interface.
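For teams without a Codex setup, the same batching idea can be sketched directly against the OpenAI API. Treat this as a rough illustration, not our production script: the `gpt-image-2` model id and the file paths are placeholders, and we assume the model returns base64 image data the way OpenAI's current image models do. What matters is the shape of it: prompts in a version-controlled file, one render per line, and the ability to re-run a single variant without touching the rest of the set.

```python
import base64
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT_FILE = Path("prompts/salaryman_train.txt")  # hypothetical: one prompt variant per line
OUT_DIR = Path("renders/salaryman_train")
OUT_DIR.mkdir(parents=True, exist_ok=True)

for i, prompt in enumerate(PROMPT_FILE.read_text().splitlines()):
    if not prompt.strip():
        continue
    out_path = OUT_DIR / f"variant_{i:02d}.png"
    if out_path.exists():
        continue  # skip finished renders, so deleting one file re-rolls one variant
    result = client.images.generate(
        model="gpt-image-2",   # placeholder model id, assumption only
        prompt=prompt,
        size="1024x1536",      # portrait source we later crop to 9x16
    )
    # Assumes base64 output in the response, as OpenAI's current image models return.
    out_path.write_bytes(base64.b64decode(result.data[0].b64_json))
    print(f"wrote {out_path}")
```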
Here is one of the avatars from the campaign. This is a still from Seedance, but the underlying face was generated entirely in GPT Image 2:
Look at the skin texture under the train window light. Look at the slight asymmetry in the eyes. Look at the way the tie sits against the shirt. None of that exists. The character does not exist. The train does not exist. Six months ago this would have been a casting call and a half day of B-roll on the JR line.
A few more from the same run, different settings, different talent:
Four characters, four lighting setups, four cities we never flew to. Human time from brief to delivered video was under an hour for each one.
Step 3: Lovart batches the animation through Seedance 2.0
Step three is making the still talk. We use Seedance 2.0 for animation. It is the best model we have tested at preserving the face across frames, which is the whole game. If the face drifts even slightly between the first and last second of a 9x16 ad, the algorithm and the audience both notice.
The catch with Seedance is that it processes one image at a time. That is fine for one ad. It is not fine when we are shipping 20 variants in a sprint. Lovart solves that. Lovart is an agent layer that queues the whole batch, watches the renders, and returns finished video files. We hand it the stills in the morning and pick up the videos at lunch. No human watches a progress bar.
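If you do not have Lovart in the stack, the pattern it automates is a plain submit-and-poll loop. The sketch below is written against a hypothetical render API, not Seedance's or Lovart's actual interface, and the endpoint, field names, and job states are all made up. It only shows the shape of the work that gets taken off a human's plate: queue every still, poll until each render finishes, download the video.

```python
import time
from pathlib import Path
import requests

API = "https://render.example.com/v1"      # hypothetical endpoint, not a real service
HEADERS = {"Authorization": "Bearer ..."}  # your own credentials

stills = sorted(Path("renders/salaryman_train").glob("*.png"))
Path("videos").mkdir(exist_ok=True)

# 1. Queue every still as an animation job.
jobs = []
for still in stills:
    with still.open("rb") as f:
        resp = requests.post(f"{API}/jobs", headers=HEADERS, files={"image": f})
    resp.raise_for_status()
    jobs.append((still.stem, resp.json()["job_id"]))  # hypothetical response field

# 2. Poll until each job finishes, then download the finished 9x16 video.
for name, job_id in jobs:
    while True:
        status = requests.get(f"{API}/jobs/{job_id}", headers=HEADERS).json()
        if status["state"] == "done":      # hypothetical job states
            video = requests.get(status["video_url"], headers=HEADERS)
            Path(f"videos/{name}.mp4").write_bytes(video.content)
            break
        if status["state"] == "failed":
            print(f"{name} failed: {status.get('error')}")
            break
        time.sleep(30)  # renders take minutes; no need to hammer the queue
```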
This is the step that turns the pipeline from a clever demo into something that actually scales for paid media.
Why this dropped CPA 50% on TikTok
The honest answer is that we are still pulling the threads apart. But three things look load-bearing.
Volume of variants. The old pipeline produced 2 to 4 ads per concept because shooting was the bottleneck. The new pipeline produces 12 to 20 because rendering is cheap. TikTok’s own creative best-practices guidance has consistently pointed to creative volume and rotation as the strongest drivers of sustained performance, and that lines up with what we see in the account.
Specificity of casting. With real creators we cast from who is available and who fits the budget. With avatars we cast from imagination. The salaryman on the train is exactly the salaryman the brief described, in exactly the lighting the brief described, in exactly the wardrobe. That specificity is hard to buy on a creator marketplace at any price.
Speed of iteration. When a hook tests well, we can ship a new variant of it the same day. When it dies, we kill it and replace it the same day. The old pipeline measured iteration in weeks. The new one measures it in hours.
What AI performance creative does not replace
A few things to be straight about.
It does not replace the creative director. The work is more leveraged when there is a strong human point of view at the top of the funnel and weaker when there is not. We tried running a sprint with just the agent driving the brief and the variants felt generic. Adding the human director back in fixed it.
It does not replace performance instinct. Knowing which hook to push, which avatar to retire, and when to rotate the whole concept set is still a paid media skill. The tools make the production cheap. They do not make the strategy.
It does not replace product authenticity. The avatars work because the script underneath them is true to a real product and a real audience. If you wrap synthetic talent around a synthetic insight, the audience can feel it. The compounding only happens when one of the layers is genuinely real.
Where AI performance creative is going
The reason I wrote this up is that the pipeline keeps getting cheaper and faster, and the gap between teams using it well and teams not using it at all is widening fast. A 50% drop in CPA on one campaign is not a victory lap. It is a baseline. The next campaign will move the number again because the models will be better and the workflow will be tighter.
If you want to talk through what an AI performance creative pipeline would look like for your account, that is exactly what our AI performance creative service is built around. The setup cost of moving into this workflow is not large. The cost of skipping it is.