Production-ready AI cart abandonment email sequence for Shopify

By
GenHup

Most teams trying to build an AI cart abandonment email sequence for Shopify do not have a creativity problem. They have a production safety problem.

The failure pattern is predictable: someone glues a model call into a webhook, forwards full customer data including email, and starts issuing discounts with no idempotency. It works for a demo. It is unmaintainable in production.

This guide takes the opposite approach. Treat the model as a small, constrained decision engine behind a hard shell of schemas, rules, and monitoring. You get:

  • a minimal, runnable cart abandonment flow you can ship in hours or days
  • copy-paste webhook verification in Node and Python, with replay protection
  • a normalization layer that strips PII before any model call
  • idempotent discount orchestration with a clear database schema
  • observability queries, alert ideas, and a rollback playbook
  • privacy language for DPAs and DSAR handling scripts

Prerequisites before you start:

  • Shopify admin access and API credentials
  • an ESP account such as Klaviyo or Shopify Email
  • access to at least one model provider (for example OpenAI or Anthropic)
  • a place to run a small service (serverless or container) and a CI pipeline

Minimal runnable quickstart: ship a 1% canary in under a day

The fastest credible way to ship is a thin vertical slice. One play. One email. One experiment flag at 1 percent of eligible traffic.

Step 1: clone the reference repo

Use a small, focused reference project. For illustration, imagine a repository with this layout:

shopify-ai-cart-abandon/
  README.md
  package.json
  src/
    index.ts
    webhook/
      shopifyCheckout.ts
    decision/
      normalize.ts
      modelClient.ts
      schema.ts
      guardrails.ts
    discounts/
      orchestrator.ts
      db.ts
    esp/
      klaviyo.ts
      shopifyEmail.ts
  python/
    webhook_fastapi.py
    verify_shopify.py
  migrations/
    001_create_idempotency_table.sql
  test/
    normalize.fixtures.json
    normalize.test.ts
    webhookVerify.test.ts
    contractModelSchema.test.ts
  .github/
    workflows/ci.yml

You can adapt this structure directly. The rest of the guide matches these filenames so you can copy and paste with minimal edits.

Step 2: configure environment

Define configuration using environment variables, not hardcoded secrets:

SHOPIFY_API_KEY=<server side key>
SHOPIFY_API_SECRET=<shared secret for webhook HMAC>
OPENAI_API_KEY=<or ANTHROPIC_API_KEY etc>
ESP_PROVIDER=klaviyo
KLAVIYO_API_KEY=<optional>
DB_URL=postgres://...
ABANDONMENT_ENABLE_CANARY=true
ABANDONMENT_CANARY_PERCENT=1

Expose a single public HTTP endpoint for Shopify to call, for example /webhooks/shopify/checkout.

Step 3: wire the Shopify webhook

In Shopify admin:

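  1. Create a webhook on the topic that best matches your flow. Common options:
    • checkouts/update
    • carts/update
    • checkouts/create
  2. Point it at your endpoint URL over HTTPS
  3. Use JSON as the format, and store the secret Shopify signs that webhook with as SHOPIFY_API_SECRET. For webhooks created in the admin, Shopify shows the signing secret on the notifications settings page; for webhooks registered by an app, it is the app's client secret.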

During the canary phase, keep the flow simple: trigger only when there is an email on the checkout and the cart is abandoned for at least a threshold, for example 30 minutes. The service will enforce these rules; the webhook just delivers signals.
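
The eligibility rule above can be sketched as a pure function. This is a minimal sketch, not the reference implementation: the field names (email, updated_at, completed_at) are assumptions about the checkout payload, and isEligibleForAbandonmentFlow is a hypothetical helper name. Check both against the payload your webhook actually delivers.

```typescript
// Hypothetical subset of the checkout payload; field names are assumptions.
type CheckoutSignal = {
  email?: string | null;
  updated_at: string; // ISO 8601 timestamp of the last checkout update
  completed_at?: string | null; // set once the checkout converts to an order
};

const ABANDONMENT_THRESHOLD_MINUTES = 30;

function isEligibleForAbandonmentFlow(
  checkout: CheckoutSignal,
  now: Date = new Date(),
  thresholdMinutes: number = ABANDONMENT_THRESHOLD_MINUTES
): boolean {
  // No email means there is nobody to send the sequence to.
  if (!checkout.email) return false;
  // A completed checkout is an order, not an abandonment.
  if (checkout.completed_at) return false;
  const lastUpdate = Date.parse(checkout.updated_at);
  if (Number.isNaN(lastUpdate)) return false;
  const idleMinutes = (now.getTime() - lastUpdate) / 60_000;
  return idleMinutes >= thresholdMinutes;
}
```

Because webhooks fire when the checkout changes, not 30 minutes later, a check like this would typically run from a delayed job that re-reads the latest checkout state rather than inside the webhook handler itself.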

Step 4: run tests, then deploy a small slice

Before you send real emails:

  • run the unit tests locally: npm test or pnpm test
  • use ngrok or similar to expose your local endpoint and test Shopify webhook delivery
  • confirm that invalid HMACs are rejected with 401 and valid ones are accepted
  • verify that model request payloads and application logs never contain raw email addresses

Deploy to your target environment only after these pass. Start with ABANDONMENT_CANARY_PERCENT=1 for live traffic. The rollout section later explains how to graduate this to 10, 50, then 100 percent.
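
One way to apply the canary percentage is deterministic bucketing, so that a given checkout always lands on the same side of the line across retries. The sketch below assumes the checkout token is the stable identifier; canaryBucket, isInCanary, and canaryEnabledFor are hypothetical helper names, not part of any SDK.

```typescript
import { createHash } from "node:crypto";

// Maps a stable identifier to a bucket in [0, 100) so the same checkout
// always gets the same canary decision.
function canaryBucket(stableId: string): number {
  const digest = createHash("sha256").update(stableId).digest();
  // First 4 bytes as an unsigned 32-bit integer, reduced modulo 100.
  return digest.readUInt32BE(0) % 100;
}

function isInCanary(stableId: string, percent: number): boolean {
  if (!Number.isFinite(percent) || percent <= 0) return false;
  if (percent >= 100) return true;
  return canaryBucket(stableId) < percent;
}

// Reads the two flags defined in the Step 2 environment configuration.
function canaryEnabledFor(
  checkoutToken: string,
  env: Record<string, string | undefined> = process.env
): boolean {
  if (env.ABANDONMENT_ENABLE_CANARY !== "true") return false;
  const percent = Number(env.ABANDONMENT_CANARY_PERCENT ?? "0");
  return isInCanary(checkoutToken, percent);
}
```

Raising ABANDONMENT_CANARY_PERCENT from 1 to 10 keeps every checkout that was already in the canary inside it, because buckets do not move when the percentage changes.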

Webhook handlers: safe Shopify verification in Node and Python

Incorrect HMAC handling is where many cart abandonment hooks go wrong. Signature verification must operate on the raw request body bytes, not a parsed object. It also needs timing safe comparison and some basic replay protection.

Node example with Express and raw body capture

Set up Express to store the raw body before any JSON parsing.

// src/index.ts
import express from "express";
import crypto from "crypto";
import { handleShopifyCheckout } from "./webhook/shopifyCheckout";

const app = express();

// raw body saver
app.use(
  express.json({
    verify: (req: any, res, buf) => {
      req.rawBody = buf;
    },
  })
);

function timingSafeEqual(a: Buffer, b: Buffer): boolean {
  if (a.length !== b.length) return false;
  return crypto.timingSafeEqual(a, b);
}

function verifyShopifyHmac(
  rawBody: Buffer,
  headerHmac: string | undefined,
  secret: string
): boolean {
  // rawBody is unset when the request body was not parsed as JSON
  if (!headerHmac || !rawBody) return false;
  const digest = crypto
    .createHmac("sha256", secret)
    .update(rawBody)
    .digest("base64");

  const expected = Buffer.from(digest, "utf8");
  const provided = Buffer.from(headerHmac, "utf8");
  return timingSafeEqual(expected, provided);
}

app.post("/webhooks/shopify/checkout", async (req: any, res) => {
  const hmacHeader = req.get("X-Shopify-Hmac-Sha256");
  const secret = process.env.SHOPIFY_API_SECRET || "";

  if (!verifyShopifyHmac(req.rawBody, hmacHeader, secret)) {
    return res.status(401).send("Invalid signature");
  }

  // Optional simple replay protection: Shopify sends a unique id per webhook
  // in X-Shopify-Webhook-Id. Persist these ids and reject duplicates.
  const webhookId = req.get("X-Shopify-Webhook-Id");

  try {
    await handleShopifyCheckout(req.body);
    return res.status(200).send("ok");
  } catch (err) {
    console.error("Webhook error", err);
    return res.status(500).send("Internal error");
  }
});

app.listen(3000, () => {
  console.log("Server on :3000");
});

export { verifyShopifyHmac };

Key points:

  • req.rawBody is the exact byte sequence Shopify signed
  • use base64 digest with HMAC SHA256, as Shopify expects
  • use constant time comparison to avoid timing side channels
  • log only high level webhook results, never the full payload

Python example with FastAPI

# python/verify_shopify.py
import base64
import hashlib
import hmac
from typing import Optional

def verify_shopify_hmac(raw_body: bytes,
                        header_hmac: Optional[str],
                        secret: str) -> bool:
    if not header_hmac:
        return False
    digest = hmac.new(
        key=secret.encode("utf-8"),
        msg=raw_body,
        digestmod=hashlib.sha256,
    ).digest()
    expected_b64 = base64.b64encode(digest)
    provided_b64 = header_hmac.encode("utf-8")
    if len(expected_b64) != len(provided_b64):
        return False
    return hmac.compare_digest(expected_b64, provided_b64)

# python/webhook_fastapi.py
from fastapi import FastAPI, Request, HTTPException
from verify_shopify import verify_shopify_hmac
import os
import json

app = FastAPI()

@app.post("/webhooks/shopify/checkout")
async def shopify_checkout(request: Request):
    raw_body = await request.body()
    header_hmac = request.headers.get("X-Shopify-Hmac-Sha256")
    secret = os.environ.get("SHOPIFY_API_SECRET", "")

    if not verify_shopify_hmac(raw_body, header_hmac, secret):
        raise HTTPException(status_code=401, detail="Invalid signature")

    payload = json.loads(raw_body)

    # Call into shared business logic (could reuse Node service patterns)
    # handle_shopify_checkout(payload)

    return {"status": "ok"}

For replay protection, you can:

  • store webhook ids from X-Shopify-Webhook-Id in a small table
  • reject repeats within a retention window, for example 24 hours
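
The two bullets above can be sketched with an in-memory guard. This is an illustration only: WebhookReplayGuard is a hypothetical class, and a real deployment would back it with a shared store such as a database table with a unique index on the webhook id, since an in-process map does not survive restarts or span multiple instances.

```typescript
// In-memory stand-in for the small table of webhook ids.
class WebhookReplayGuard {
  private seen = new Map<string, number>(); // webhook id -> first-seen epoch ms
  private retentionMs: number;

  constructor(retentionMs: number = 24 * 60 * 60 * 1000) {
    this.retentionMs = retentionMs;
  }

  // Returns true the first time an id is seen inside the retention window,
  // false for a repeat that should not be processed again.
  firstSeen(webhookId: string, nowMs: number = Date.now()): boolean {
    this.evictExpired(nowMs);
    if (this.seen.has(webhookId)) return false;
    this.seen.set(webhookId, nowMs);
    return true;
  }

  private evictExpired(nowMs: number): void {
    this.seen.forEach((seenAt, id) => {
      if (nowMs - seenAt >= this.retentionMs) this.seen.delete(id);
    });
  }
}
```

For a repeat, it is usually safest to skip processing but still answer with a 2xx status, since Shopify retries deliveries that do not get a success response.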

Normalization and PII redaction: keep the model blind to identities

The safest pattern is a three layer envelope:

  1. minimize inputs and strip PII before inference
  2. enforce JSON schema at prompt time and at runtime
  3. override the model with business rules on output

The normalization layer sits between the webhook and the model. It creates derived features like cart value buckets and product category summaries, and it redacts or hashes any identifiers.

Normalization design

A practical normalized input shape to feed to the model might look like this. The later snippets assume it is exported from src/decision/normalize.ts:

export type NormalizedCheckout = {
  hashed_customer_id: string | null;
  is_returning_customer: boolean;
  cart_item_count: number;
  cart_value_bucket: "low" | "medium" | "high";
  cart_currency: string;
  product_category_counts: Record<string, number>;
  time_since_last_order_hours: number | null;
  historical_discount_usage_rate_bucket: "none" | "low" | "medium" | "high";
  prior_complaint_flag: boolean;
  in_sale_segment: boolean;
  is_high_risk_segment: boolean;
};

normalize implementation with PII stripping

// src/decision/normalize.ts
import crypto from "crypto";

type ShopifyCheckoutPayload = {
  id: number;
  customer?: {
    id?: number;
    email?: string;
    tags?: string[];
    orders_count?: number;
  };
  line_items: {
    product_id: number;
    title: string;
    product_type: string;
    quantity: number;
    price: string;
    total_discount: string;
  }[];
  currency: string;
  subtotal_price: string;
  customer_locale?: string;
  // ... other fields not needed for the model
};

export function hashIdentifier(value: string | number | undefined | null) {
  if (value === undefined || value === null) return null;
  const str = String(value);
  return crypto.createHash("sha256").update(str).digest("hex");
}

export function normalizeCheckout(
  payload: ShopifyCheckoutPayload
): NormalizedCheckout {
  const subtotal = parseFloat(payload.subtotal_price || "0");
  const cart_value_bucket =
    subtotal < 50 ? "low" : subtotal < 200 ? "medium" : "high";

  const product_category_counts: Record<string, number> = {};
  let itemCount = 0;
  for (const item of payload.line_items) {
    const category = item.product_type || "unknown";
    product_category_counts[category] =
      (product_category_counts[category] || 0) + item.quantity;
    itemCount += item.quantity;
  }

  const customer = payload.customer;
  const is_returning_customer = (customer?.orders_count ?? 0) > 0;

  // In a real system these would come from your own data store
  const historical_discount_usage_rate_bucket = "low";
  const prior_complaint_flag = false;
  const in_sale_segment = false;
  const is_high_risk_segment = false;

  return {
    hashed_customer_id: hashIdentifier(customer?.id || customer?.email),
    is_returning_customer,
    cart_item_count: itemCount,
    cart_value_bucket,
    cart_currency: payload.currency,
    product_category_counts,
    time_since_last_order_hours: null,
    historical_discount_usage_rate_bucket,
    prior_complaint_flag,
    in_sale_segment,
    is_high_risk_segment,
  };
}

The model never sees any of:

  • email
  • name
  • address
  • full checkout id

The service that writes to your ESP keeps that mapping. That service does not use the model provider as a processor for PII. This separation is important for risk and for vendor contracts.
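
One caveat on hashIdentifier: an unkeyed SHA-256 over a small identifier space, such as sequential numeric ids or known email addresses, can be reversed by hashing candidates and comparing. A common mitigation is a keyed hash (HMAC) with a secret that never leaves your service. The sketch below is one possible variant; pseudonymize is a hypothetical helper name, and where the key comes from (for example an environment variable) is an assumption left to your deployment.

```typescript
import { createHmac } from "node:crypto";

// Keyed variant of hashIdentifier. Keep the key out of logs and never send
// it to the model provider; without it the output cannot be recomputed.
function pseudonymize(
  value: string | number | undefined | null,
  key: string
): string | null {
  if (value === undefined || value === null) return null;
  if (key.length === 0) throw new Error("pseudonymization key must be set");
  return createHmac("sha256", key).update(String(value)).digest("hex");
}
```

The output stays deterministic for a fixed key, so joins against decision logs and DSAR lookups keep working. Rotating the key changes every pseudonym, which is worth planning for before the first production write.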

PII redaction tests and fixtures

Create fixtures that include clear PII, then assert the normalized outputs do not carry it.

// test/normalize.fixtures.json
{
  "simple_checkout": {
    "id": 123,
    "customer": {
      "id": 999,
      "email": "alice@example.com",
      "orders_count": 3
    },
    "line_items": [
      {
        "product_id": 1,
        "title": "T shirt red",
        "product_type": "apparel",
        "quantity": 2,
        "price": "20.00",
        "total_discount": "0.00"
      }
    ],
    "currency": "USD",
    "subtotal_price": "40.00"
  }
}

// test/normalize.test.ts
import { normalizeCheckout, hashIdentifier } from "../src/decision/normalize";
import fixtures from "./normalize.fixtures.json";

describe("normalizeCheckout", () => {
  it("hashes customer identifiers and strips PII", () => {
    const payload: any = (fixtures as any).simple_checkout;
    const normalized = normalizeCheckout(payload);

    expect(normalized.hashed_customer_id).toBeDefined();
    expect(typeof normalized.hashed_customer_id).toBe("string");
    // hash must not match raw email or id
    expect(normalized.hashed_customer_id).not.toContain("alice");
    expect(normalized.hashed_customer_id).not.toBe("999");
  });

  it("creates stable buckets", () => {
    const payload: any = (fixtures as any).simple_checkout;
    const normalized = normalizeCheckout(payload);
    expect(normalized.cart_value_bucket).toBe("low");
    expect(normalized.cart_item_count).toBe(2);
    expect(normalized.product_category_counts["apparel"]).toBe(2);
  });
});

describe("hashIdentifier", () => {
  it("is deterministic", () => {
    const a = hashIdentifier("alice@example.com");
    const b = hashIdentifier("alice@example.com");
    expect(a).toBe(b);
  });
});

Decision service: provider specific model calls behind a thin abstraction

Do not spread provider SDK calls all over your codebase. Keep a small abstraction that:

  • accepts a NormalizedCheckout
  • returns a strongly typed decision object
  • hides provider specific request shapes and error handling

Decision schema

// src/decision/schema.ts
export type PlayType = "remind_only" | "small_discount" | "large_discount";

export type Decision = {
  version: string;
  play: PlayType;
  discount_percentage: number | null;
  reason_code: string;
};

You enforce that shape through JSON schema validation and guardrails.

OpenAI example

// src/decision/modelClient.ts
import OpenAI from "openai";
import { Decision } from "./schema";
import { NormalizedCheckout } from "./normalize";
import { z } from "zod";

const decisionSchema = z.object({
  version: z.string(),
  play: z.enum(["remind_only", "small_discount", "large_discount"]),
  discount_percentage: z.number().nullable(),
  reason_code: z.string(),
});

const openaiClient = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});

export type Provider = "openai" | "anthropic";

export async function getDecision(
  normalized: NormalizedCheckout,
  provider: Provider = "openai"
): Promise<Decision> {
  if (provider === "openai") {
    return await getDecisionOpenAI(normalized);
  }
  if (provider === "anthropic") {
    return await getDecisionAnthropic(normalized);
  }
  throw new Error("Unsupported provider");
}

async function getDecisionOpenAI(
  normalized: NormalizedCheckout
): Promise<Decision> {
  const prompt = buildPrompt(normalized);

  const response = await openaiClient.responses.create({
    model: "gpt-4.1-mini",
    input: prompt,
    // The Responses API takes structured output settings under text.format;
    // response_format is the Chat Completions parameter.
    text: {
      format: {
        type: "json_schema",
        name: "cart_abandon_decision",
        strict: true,
        schema: {
          type: "object",
          additionalProperties: false,
          properties: {
            version: { type: "string" },
            play: {
              type: "string",
              enum: ["remind_only", "small_discount", "large_discount"],
            },
            discount_percentage: {
              anyOf: [{ type: "number" }, { type: "null" }],
            },
            reason_code: { type: "string" },
          },
          required: ["version", "play", "discount_percentage", "reason_code"],
        },
      },
    },
    temperature: 0.1,
  });

  // output_text is the SDK helper that concatenates the text output items
  const outputText = response.output_text;

  let parsed: unknown;
  try {
    parsed = JSON.parse(outputText);
  } catch (err) {
    throw new Error("Model did not return valid JSON");
  }

  const result = decisionSchema.safeParse(parsed);
  if (!result.success) {
    throw new Error("Decision schema validation failed");
  }
  return result.data;
}

function buildPrompt(normalized: NormalizedCheckout): string {
  return `
You are a decision engine for cart abandonment emails.

Input is a JSON object with derived, non PII signals.

Decide:
- play: one of "remind_only", "small_discount", "large_discount"
- discount_percentage: null or a number between 5 and 25
- version: a static string "v1"
- reason_code: short code like "new_low_value", "loyal_high_value", "high_risk"

Business constraints:
- if in_sale_segment is true, play must be "remind_only" and discount_percentage must be null
- if is_high_risk_segment is true, play must be "remind_only"
- if historical_discount_usage_rate_bucket is "high", discount_percentage must be <= 10
- prefer "remind_only" for "low" cart_value_bucket and new customers

Return only JSON.

Input:
${JSON.stringify(normalized)}
`;
}

You keep temperature low to reduce run-to-run variation, although a low temperature does not guarantee identical outputs. You also constrain the shape via the structured output schema and zod validation before doing anything with the result.

Anthropic example

A similar wrapper for Anthropic keeps the rest of your service unchanged.

import Anthropic from "@anthropic-ai/sdk";

const anthropicClient = new Anthropic({
  apiKey: process.env.ANTHROPIC_API_KEY,
});

async function getDecisionAnthropic(
  normalized: NormalizedCheckout
): Promise<Decision> {
  const prompt = buildPrompt(normalized);

  // The Anthropic SDK exposes the Messages API, not a responses endpoint
  const msg = await anthropicClient.messages.create({
    model: "claude-3-5-sonnet-latest",
    max_tokens: 512,
    messages: [{ role: "user", content: prompt }],
  });

  // The response is a list of content blocks; read the first text block
  const firstBlock = msg.content[0];
  const outputText =
    firstBlock && firstBlock.type === "text" ? firstBlock.text : "";

  let parsed: unknown;
  try {
    parsed = JSON.parse(outputText);
  } catch {
    throw new Error("Anthropic response is not valid JSON");
  }

  const result = decisionSchema.safeParse(parsed);
  if (!result.success) {
    throw new Error("Decision schema validation failed");
  }
  return result.data;
}

You can switch providers at deployment time without touching business logic.
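
When a provider returns free text rather than schema-constrained output, the JSON may arrive wrapped in a Markdown code fence or a sentence of prose, and a bare JSON.parse will reject it. A tolerant extraction step before schema validation could look like the sketch below. parseModelJson is a hypothetical helper, and the extraction heuristics are assumptions rather than documented provider behavior.

```typescript
// Pulls a JSON object out of model text that may include a Markdown code
// fence or surrounding prose. Returns null when no candidate parses.
function parseModelJson(text: string): Record<string, unknown> | null {
  const candidates: string[] = [text.trim()];

  // \x60 is the backtick character; this matches a fenced block.
  const fenced = text.match(/\x60{3}(?:json)?\s*([\s\S]*?)\x60{3}/);
  if (fenced) candidates.push(fenced[1].trim());

  // Fall back to the outermost pair of braces.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start !== -1 && end > start) candidates.push(text.slice(start, end + 1));

  for (const candidate of candidates) {
    try {
      const value: unknown = JSON.parse(candidate);
      if (value !== null && typeof value === "object" && !Array.isArray(value)) {
        return value as Record<string, unknown>;
      }
    } catch {
      // Not valid JSON; try the next candidate.
    }
  }
  return null;
}
```

The result still has to pass decisionSchema.safeParse. Extraction only recovers the text; it does not make the content trustworthy.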

Schema enforcement and guardrails: overwrite the model when needed

Even with strict schemas, you should treat the model as a suggestion engine. The guardrail layer is where policy lives.

Guardrail implementation

// src/decision/guardrails.ts
import { Decision } from "./schema";
import { NormalizedCheckout } from "./normalize";

type GuardrailContext = {
  normalized: NormalizedCheckout;
};

export function applyGuardrails(
  decision: Decision,
  ctx: GuardrailContext
): Decision {
  let result = { ...decision };

  // Hard business rules
  if (ctx.normalized.in_sale_segment) {
    result.play = "remind_only";
    result.discount_percentage = null;
    result.reason_code = "override_sale_segment";
  }

  if (ctx.normalized.is_high_risk_segment) {
    result.play = "remind_only";
    result.discount_percentage = null;
    result.reason_code = "override_high_risk";
  }

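  // Range checks: keep in step with the 5 to 25 range stated in the prompt
  if (result.discount_percentage !== null) {
    if (result.discount_percentage < 5 || result.discount_percentage > 25) {
      result.play = "remind_only";
      result.discount_percentage = null;
      result.reason_code = "override_invalid_percentage";
    }
  }

  // Heavy discount users are capped at 10 percent, as the prompt requires
  if (
    ctx.normalized.historical_discount_usage_rate_bucket === "high" &&
    result.discount_percentage !== null &&
    result.discount_percentage > 10
  ) {
    result.discount_percentage = 10;
    result.reason_code = "override_discount_cap";
  }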

  // Complaint protection
  if (ctx.normalized.prior_complaint_flag) {
    result.play = "remind_only";
    result.discount_percentage = null;
    result.reason_code = "override_prior_complaint";
  }

  return result;
}

Your decision pipeline now looks like:

  1. normalize checkout
  2. call model
  3. validate JSON schema
  4. apply guardrails and overrides
  5. record final decision
  6. orchestrate discount and send email

ESP adapters: Klaviyo and Shopify Email

Once you have a safe decision, you emit a single instruction to your ESP. A simple strategy is to map plays to templates or flows.

Klaviyo server side event example

// src/esp/klaviyo.ts
import fetch from "node-fetch";

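type KlaviyoConfig = {
  apiKey: string;
  // Klaviyo API revision date (YYYY-MM-DD) that this request body targets
  revision: string;
};

type EspEvent = {
  profileEmail: string;
  event: string;
  properties: Record<string, any>;
};

export async function sendKlaviyoEvent(
  cfg: KlaviyoConfig,
  event: EspEvent
): Promise<void> {
  const res = await fetch("https://a.klaviyo.com/api/events", {
    method: "POST",
    headers: {
      Authorization: `Klaviyo-API-Key ${cfg.apiKey}`,
      // Klaviyo's versioned API selects behavior by revision, and the event
      // body shape differs between revisions; check this payload against
      // the revision you pin.
      revision: cfg.revision,
      "Content-Type": "application/json",
      Accept: "application/json",
    },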
    body: JSON.stringify({
      data: {
        type: "event",
        attributes: {
          metric: {
            name: event.event,
          },
          properties: event.properties,
          profile: {
            email: event.profileEmail,
          },
        },
      },
    }),
  });

  if (!res.ok) {
    const body = await res.text();
    throw new Error(`Klaviyo error ${res.status}: ${body}`);
  }
}

Call this only from trusted server side code, which is the only place PII should appear. The decision service gives you the play and discount; the ESP adapter uses your own customer id to look up the email in your database rather than forwarding email into model calls.

Shopify Email pattern

For Shopify Email, a practical pattern is to write metafields or tags on the customer or checkout, such as:

  • ai_abandonment_play: remind_only | small_discount | large_discount
  • ai_abandonment_decision_id: <uuid>

Your email flow can use those metafields as triggers or segmentation criteria.

Idempotency and discount orchestration

Duplicate discounts are one of the highest risk failure modes. Retries from Shopify, from your queue, or from manual replays can all issue multiple codes if you do not store decisions and discount mappings.

Database schema

Use a small idempotency table keyed by a deterministic hash.

-- migrations/001_create_idempotency_table.sql
CREATE TABLE IF NOT EXISTS discount_idempotency (
  idempotency_key VARCHAR(128) PRIMARY KEY,
  shopify_shop_domain VARCHAR(255) NOT NULL,
  checkout_token VARCHAR(255) NOT NULL,
  decision_hash VARCHAR(64) NOT NULL,
  price_rule_id VARCHAR(64) NOT NULL,
  discount_code VARCHAR(64) NOT NULL,
  created_at TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX IF NOT EXISTS idx_discount_idempotency_checkout
  ON discount_idempotency (shopify_shop_domain, checkout_token);

Key generation pattern

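A simple key idea: sha256(shop_domain + ":" + checkout_token + ":" + decision_hash). In the code below, the decision hash is a digest of the final decision. Because the key includes the decision, a retry that produces a different decision also produces a different key, so persist the first final decision for a checkout and reuse it on retries instead of calling the model again.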

// src/discounts/db.ts
import crypto from "crypto";
import { Pool } from "pg";
import { Decision } from "../decision/schema";

const pool = new Pool({ connectionString: process.env.DB_URL });

export function buildDecisionHash(decision: Decision): string {
  const json = JSON.stringify(decision);
  return crypto.createHash("sha256").update(json).digest("hex");
}

export function buildIdempotencyKey(
  shopDomain: string,
  checkoutToken: string,
  decision: Decision
): string {
  const decisionHash = buildDecisionHash(decision);
  return crypto
    .createHash("sha256")
    .update(`${shopDomain}:${checkoutToken}:${decisionHash}`)
    .digest("hex");
}

export async function getExistingDiscount(
  idempotencyKey: string
): Promise<{ price_rule_id: string; discount_code: string } | null> {
  const res = await pool.query(
    "SELECT price_rule_id, discount_code FROM discount_idempotency WHERE idempotency_key = $1",
    [idempotencyKey]
  );
  if (res.rowCount === 0) return null;
  return res.rows[0];
}

export async function saveDiscount(
  idempotencyKey: string,
  shopDomain: string,
  checkoutToken: string,
  decisionHash: string,
  priceRuleId: string,
  discountCode: string
): Promise<void> {
  await pool.query(
    `INSERT INTO discount_idempotency
     (idempotency_key, shopify_shop_domain, checkout_token,
      decision_hash, price_rule_id, discount_code)
     VALUES ($1, $2, $3, $4, $5, $6)
     ON CONFLICT (idempotency_key) DO NOTHING`,
    [
      idempotencyKey,
      shopDomain,
      checkoutToken,
      decisionHash,
      priceRuleId,
      discountCode,
    ]
  );
}

Orchestration with 409 handling

If your Shopify API call to create a price rule or discount hits a conflict (for example you used a duplicate code), you fetch the mapping instead of retrying blindly.

// src/discounts/orchestrator.ts
import { Decision } from "../decision/schema";
import {
  buildDecisionHash,
  buildIdempotencyKey,
  getExistingDiscount,
  saveDiscount,
} from "./db";

type DiscountResult = {
  priceRuleId: string | null;
  discountCode: string | null;
};

export async function ensureDiscountForDecision(
  shopDomain: string,
  checkoutToken: string,
  decision: Decision
): Promise<DiscountResult> {
  if (decision.play === "remind_only" || decision.discount_percentage === null) {
    return { priceRuleId: null, discountCode: null };
  }

  const decisionHash = buildDecisionHash(decision);
  const key = buildIdempotencyKey(shopDomain, checkoutToken, decision);

  const existing = await getExistingDiscount(key);
  if (existing) {
    return {
      priceRuleId: existing.price_rule_id,
      discountCode: existing.discount_code,
    };
  }

  // Create price rule + discount code via Shopify Admin API
  // This is a simplified placeholder; in real code use the official SDK.
  const { priceRuleId, discountCode } =
    await createShopifyDiscount(shopDomain, decision.discount_percentage);

  // Persist idempotent mapping
  await saveDiscount(
    key,
    shopDomain,
    checkoutToken,
    decisionHash,
    priceRuleId,
    discountCode
  );

  return { priceRuleId, discountCode };
}

async function createShopifyDiscount(
  shopDomain: string,
  percentage: number
): Promise<{ priceRuleId: string; discountCode: string }> {
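  // Pseudo code: use fetch or Shopify SDK
  // Handle 409 conflicts by reading existing rule if applicable.
  // Placeholder values only: real code should return the id Shopify assigns
  // and generate a unique code per checkout, not a shared one.
  const priceRuleId = "pr_123";
  const discountCode = "SAVE10";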
  return { priceRuleId, discountCode };
}

This pattern makes retries safe as long as the decision is stable across retries. If your worker processes the same checkout five times with the same final decision, the customer sees only one code.
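
A check-then-create flow can still race: two workers can both miss the lookup and both create a discount before either saves. One way to close that gap is to claim the key first and let only the winner call Shopify. The sketch below uses an in-memory ClaimStore as a stand-in for a table with a unique key, and it keys the claim on shop domain and checkout token alone. That keying is an assumption that differs from the key above, chosen so a retry with a different model decision still maps to the same claim. ClaimStore and ensureDiscountOnce are hypothetical names.

```typescript
type DiscountRow = { status: "pending" | "ready"; discountCode: string | null };

// In-memory stand-in for a table with a unique idempotency_key column.
class ClaimStore {
  private rows = new Map<string, DiscountRow>();

  // Mirrors INSERT ... ON CONFLICT DO NOTHING: true only for the first caller.
  tryClaim(key: string): boolean {
    if (this.rows.has(key)) return false;
    this.rows.set(key, { status: "pending", discountCode: null });
    return true;
  }

  complete(key: string, discountCode: string): void {
    this.rows.set(key, { status: "ready", discountCode });
  }

  get(key: string): DiscountRow | undefined {
    return this.rows.get(key);
  }
}

async function ensureDiscountOnce(
  store: ClaimStore,
  key: string,
  createDiscount: () => Promise<string>
): Promise<string | null> {
  if (store.tryClaim(key)) {
    // Only the claim winner talks to Shopify.
    const code = await createDiscount();
    store.complete(key, code);
    return code;
  }
  // Another worker holds the claim: reuse its code, or null while pending.
  return store.get(key)?.discountCode ?? null;
}
```

A caller that gets null can retry later. A production version would also need to release or expire a pending claim when createDiscount throws, which this sketch leaves out.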

Observability and alerts: what to watch and how to react

If you cannot see it, you cannot safely run it. You need metrics for errors, discount spend, play distribution, and deliverability.

Core metrics

  • webhook_verification_failure_rate
    ratio of 401 responses on the webhook endpoint
  • model_call_error_rate
    parse failures, schema rejects, provider errors
  • decision_play_distribution
    percentage of plays per type per hour
  • discount_issue_rate
    discounts created per minute
  • discount_spend_velocity
    approximate gross discount amount per hour (using historical averages if you cannot compute exact numbers at first)
  • email_complaint_rate
    complaints per thousand sends
  • unsubscribe_rate
    unsubscribes per thousand sends

Example Prometheus style metrics

# increments per webhook request
counter shopify_webhook_requests_total{status="ok"}
counter shopify_webhook_requests_total{status="invalid_signature"}

# model
counter cart_ai_model_calls_total{provider="openai", outcome="success"}
counter cart_ai_model_calls_total{provider="openai", outcome="error"}
counter cart_ai_schema_failures_total

# decisions
counter cart_ai_decisions_total{play="remind_only"}
counter cart_ai_decisions_total{play="small_discount"}
counter cart_ai_decisions_total{play="large_discount"}

# discount orchestration
counter cart_ai_discounts_created_total
counter cart_ai_discounts_reused_total
counter cart_ai_discount_errors_total{type="shopify_409"}

# email outcomes (fed from ESP webhooks)
counter cart_ai_email_sent_total
counter cart_ai_email_complaint_total
counter cart_ai_email_unsub_total

Sample Prometheus alert ideas

  • High schema failure
    Query example:
    sum(rate(cart_ai_schema_failures_total[15m])) / sum(rate(cart_ai_model_calls_total[15m])) > 0.005
    Effect: page or alert operators, automatically flip traffic to a static fallback if you can.
  • Discount spike
    sum(rate(cart_ai_discounts_created_total[5m])) > some_threshold
    Set threshold based on historical manual discount volumes. If triggered, freeze discount issuance and fall back to remind only.
  • Play distribution drift
    You can compare each play type's share of decisions over a recent window, for example the last hour, to a baseline. A simple heuristic is to alert if any play type's share more than doubles relative to that baseline. This catches model drift or unintended prompt changes.
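
The play distribution heuristic can be sketched as a comparison of shares between two count snapshots, for example counters read from your metrics store for a baseline period and for the most recent window. playShares and driftedPlays are hypothetical helper names, and the doubling threshold is the example value from the text, not a recommendation.

```typescript
type PlayCounts = Record<string, number>;

// Converts raw counts into each play's share of the total (0 when empty).
function playShares(counts: PlayCounts): Record<string, number> {
  const plays = Object.keys(counts);
  const total = plays.reduce((sum, play) => sum + counts[play], 0);
  const shares: Record<string, number> = {};
  for (const play of plays) {
    shares[play] = total === 0 ? 0 : counts[play] / total;
  }
  return shares;
}

// Returns the plays whose current share exceeds ratioThreshold times baseline.
function driftedPlays(
  baseline: PlayCounts,
  current: PlayCounts,
  ratioThreshold: number = 2
): string[] {
  const base = playShares(baseline);
  const now = playShares(current);
  return Object.keys(now).filter((play) => {
    const baseShare = base[play] ?? 0;
    // A play absent from the baseline counts as drift once it appears at all.
    if (baseShare === 0) return now[play] > 0;
    return now[play] / baseShare > ratioThreshold;
  });
}
```

Comparing shares rather than raw counts keeps the check stable when overall traffic rises or falls, for example during a sale.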

Datadog query examples

  • Webhook verification failure rate
    sum:shopify_webhook_requests_total{status:invalid_signature}.rollup(sum, 300) / sum:shopify_webhook_requests_total{*}.rollup(sum, 300)
  • Complaint rate
    1000 * sum:cart_ai_email_complaint_total.rollup(sum, 3600) / sum:cart_ai_email_sent_total.rollup(sum, 3600)

Attach a simple runbook to each alert that answers: what might cause this and what is the safe immediate action. The safe action is almost always to reduce traffic, freeze discount issuance, or route to a known good policy.

Privacy, DPA language, and DSAR operational scripts

You can do a lot to reduce risk with the design above, but you still need language in your vendor contracts and a plan for data subject requests.

Vendor and model provider DPA clauses

The goal is to keep model providers firmly as processors for non PII signals, not full customer records. You can adapt language such as:

Purpose limitation

Customer will send to Provider only pseudonymous identifiers and derived behavioral signals that do not directly identify an individual data subject. Provider will process such data solely to generate decision outputs for Customer’s cart abandonment workflows and for no other purpose.

Training and retention

Provider will not use Customer data to train or fine tune general purpose models. Provider will retain Customer data for no longer than is necessary to perform the contracted services and in any case no longer than [N] days, after which data will be deleted or irreversibly anonymized.

Subprocessing

Provider will not engage additional subprocessors that have access to Customer data without prior written notice and an opportunity for Customer to object where reasonable.

Internal data retention choices

For your own logs, a defensible pattern is:

  • do not log raw Shopify payloads in normal operation
  • log only hashed ids, decision summaries, and error codes
  • keep detailed decision logs for a short window, for example 30 days, to debug issues
  • aggregate metrics for longer, for example 12 months

DSAR handling script

Support teams need a clear script for data subject access and deletion requests that touch this system.

Access request

  1. Confirm identity using your normal account verification process
  2. Look up internal customer id from email
  3. Query decision logs for records with the hashed identifier that matches this customer
  4. Export:
    • dates of cart abandonment decisions
    • decision type (play)
    • whether a discount was issued
  5. Provide this summary to the customer in clear language

Deletion request

  1. Confirm identity
  2. Delete or anonymize any records in discount_idempotency and decision logs that reference this customer hash
  3. Ensure forwarding of deletion to ESP, for example remove profile from cart abandonment list in Klaviyo
  4. Do not attempt to modify aggregate metrics that do not identify the person

A/B test design for revenue per recipient (RPR)

You should not roll AI driven sequences to everyone without measuring whether they help. A simple primary metric is revenue per recipient for the cart abandonment series.

Sample size example

Imagine:

  • baseline RPR from your existing non AI sequence is 1.20 in your currency
  • you hope for a lift of 0.15 (about 12.5 percent)
  • your historical standard deviation in RPR is about 3.50
  • you want a significance level of 0.05 and power of 0.8

A rough normal approximation for the sample size per arm in a two sample comparison of means is:

n_per_arm ≈ 2 * (Z_0.975 + Z_0.8)^2 * sigma^2 / delta^2

Using Z values of about 1.96 and 0.84, sigma of 3.50, and delta of 0.15, this works out to roughly 8,500 recipients per arm. Recompute it with your own baseline and variance numbers, but the point stands: you likely need thousands, not hundreds, of abandoned checkouts to get a clean read.
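To recompute the estimate with your own numbers, the formula above translates directly into a few lines. The function name `sampleSizePerArm` is illustrative, and the default Z values are the rounded ones from the example (two sided significance of 0.05, power of 0.8).

```typescript
// n_per_arm ≈ 2 * (zAlpha + zBeta)^2 * sigma^2 / delta^2, rounded up.
function sampleSizePerArm(
  sigma: number,
  delta: number,
  zAlpha: number = 1.96, // Z_0.975
  zBeta: number = 0.84 // Z_0.8
): number {
  if (sigma <= 0 || delta <= 0) {
    throw new RangeError("sigma and delta must be positive");
  }
  const zSum = zAlpha + zBeta;
  return Math.ceil((2 * zSum * zSum * sigma * sigma) / (delta * delta));
}

// The example from the text: sigma 3.50, hoped-for lift 0.15.
const exampleSampleSize = sampleSizePerArm(3.5, 0.15);
console.log(exampleSampleSize); // 8537
```

Because the requirement scales with 1 / delta^2, halving the lift you want to detect roughly quadruples the traffic you need.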

Guardrail metrics for the test

Alongside RPR, track:

  • complaint rate compared to baseline
  • unsubscribe rate
  • discount spend per recipient

A possible decision rule:

  • promote the AI sequence if RPR lift is positive and significant, and guardrails are stable
  • reject if complaints or unsubscribes increase beyond a tolerable relative band, even if RPR improves
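Writing the decision rule down as code before the test starts makes it harder to rationalize a result afterwards. The sketch below is one possible encoding. The `TestReadout` fields, the 0.10 relative band, and the `keep_testing` outcome for inconclusive results are assumptions, not part of the rule as stated.

```typescript
type TestVerdict = "promote" | "reject" | "keep_testing";

// Hypothetical readout from your analytics owner.
interface TestReadout {
  rprLift: number; // treatment RPR minus control RPR
  pValue: number; // from the significance test on RPR
  complaintRelChange: number; // 0.10 means complaints up 10 percent vs control
  unsubRelChange: number; // same convention for unsubscribes
}

function decideRollout(
  readout: TestReadout,
  alpha: number = 0.05,
  guardrailBand: number = 0.1 // assumed tolerable relative increase
): TestVerdict {
  const guardrailBreached =
    readout.complaintRelChange > guardrailBand ||
    readout.unsubRelChange > guardrailBand;
  // Guardrails win even when RPR improves.
  if (guardrailBreached) {
    return "reject";
  }
  if (readout.rprLift > 0 && readout.pValue < alpha) {
    return "promote";
  }
  return "keep_testing";
}
```

Discount spend per recipient is left out here because the rule above does not gate on it. If your margins are tight, add it as a third guardrail check.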

Operational rollout and rollback checklist

A clean rollout saves you from chasing ghosts later. Treat this like any other production change.

Pre launch

  • CI green:
    • normalization tests pass
    • webhook verification tests pass
    • schema contract tests pass for each provider
  • secrets set in your deployment environment and not in code
  • Shopify webhook delivers to a staging or test endpoint without errors
  • a test profile in your ESP receives the expected template when you trigger the flow manually

Canary rollout steps

  1. 1 percent traffic
    • limit to remind only play or a very small discount
    • monitor errors, discount volume, and complaints daily
  2. 10 percent traffic
    • enable full set of plays
    • start A/B test against your current sequence
  3. 50 percent traffic
    • only after you see stable metrics across at least one full weekday and weekend
  4. 100 percent traffic
    • only after your A/B test shows a clear benefit or is neutral with no meaningful downside on guardrails
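The percentage steps above only work if assignment is deterministic, so a customer who is in the canary at 1 percent stays in it at 10 percent. One way to sketch that is to bucket on the hashed customer identifier. The function names are illustrative, and the FNV-1a hash is a stand-in chosen because it needs no imports. It assumes ASCII input such as a hex digest.

```typescript
// 32-bit FNV-1a over the characters of an ASCII string.
function fnv1a32(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    // Multiply by the FNV prime 16777619 using shifts, keeping 32 bits.
    hash =
      (hash +
        (hash << 1) +
        (hash << 4) +
        (hash << 7) +
        (hash << 8) +
        (hash << 24)) >>>
      0;
  }
  return hash;
}

// Maps a customer hash to one of 10,000 buckets and admits the lowest ones.
// percent is 0 to 100; two decimal places of precision.
function isInRollout(customerHash: string, percent: number): boolean {
  const bucket = fnv1a32(customerHash) % 10000;
  return bucket < percent * 100;
}
```

Raising the percentage only ever adds customers, which keeps the A/B arms stable as you move from 1 to 10 to 50 percent.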

Rollback criteria

Define explicit triggers for rollback before launch, for example:

  • schema failure rate exceeds a threshold for 15 minutes
  • discount creation rate spikes to more than a defined multiplier of historical manual rates
  • complaint or unsubscribe rate increases by more than a defined percentage for two consecutive days
  • critical webhook or discount errors stay elevated for more than a time window, for example 30 minutes

When any of these hit, your runbook steps might be:

  1. flip feature flag to route all traffic to remind only with no discount
  2. if issues persist, disable the AI flow entirely and revert to baseline campaign
  3. capture logs and metrics around the incident window for later analysis
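The triggers and runbook steps can be encoded so that an alert handler, not a tired human, decides which mode to drop to. The sketch below is an assumption-heavy illustration: the metric names, the `FlowMode` values, and the multiplier of 3 are placeholders for whatever your monitoring actually exposes.

```typescript
// Hypothetical snapshot assembled from your monitoring queries.
interface IncidentMetrics {
  schemaFailureMinutes: number; // minutes the schema failure rate has been over threshold
  discountRateMultiplier: number; // current discount creation rate / historical manual rate
  guardrailBreachDays: number; // consecutive days complaints or unsubscribes were over the band
  criticalErrorMinutes: number; // minutes webhook or discount errors have stayed elevated
}

type FlowMode = "ai_full" | "remind_only" | "baseline";

// Example thresholds; only the 15 and 30 minute windows and the two days come from the text.
const ROLLBACK_THRESHOLDS: IncidentMetrics = {
  schemaFailureMinutes: 15,
  discountRateMultiplier: 3, // assumed multiplier
  guardrailBreachDays: 2,
  criticalErrorMinutes: 30,
};

function trippedTriggers(
  m: IncidentMetrics,
  t: IncidentMetrics = ROLLBACK_THRESHOLDS
): string[] {
  const tripped: string[] = [];
  if (m.schemaFailureMinutes >= t.schemaFailureMinutes) tripped.push("schema_failures");
  if (m.discountRateMultiplier > t.discountRateMultiplier) tripped.push("discount_spike");
  if (m.guardrailBreachDays >= t.guardrailBreachDays) tripped.push("complaints_or_unsubscribes");
  if (m.criticalErrorMinutes > t.criticalErrorMinutes) tripped.push("critical_errors");
  return tripped;
}

// Runbook steps 1 and 2: step down one level each time triggers are still firing.
function nextMode(current: FlowMode, tripped: string[]): FlowMode {
  if (tripped.length === 0) return current;
  return current === "ai_full" ? "remind_only" : "baseline";
}
```

Step 3 of the runbook, capturing logs and metrics for the incident window, stays a manual or scripted follow-up outside this function.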

CI contract tests for model schema and normalization invariants

Every deployment should re verify that your model outputs and normalization still match the contract you expect.

Schema contract tests

Create synthetic normalized fixtures and feed them through a mocked model that returns sample responses for each provider. Assert that schema validation and the guardrails accept or override each response as expected.

// test/contractModelSchema.test.ts
import { applyGuardrails } from "../src/decision/guardrails";
import { decisionSchema } from "../src/decision/modelClient";

describe("model decision contract", () => {
  it("accepts a valid decision", () => {
    const raw = {
      version: "v1",
      play: "small_discount",
      discount_percentage: 10,
      reason_code: "new_low_value",
    };
    const parsed = decisionSchema.parse(raw);
    expect(parsed.play).toBe("small_discount");
  });

  it("rejects extra fields", () => {
    const raw: any = {
      version: "v1",
      play: "small_discount",
      discount_percentage: 10,
      reason_code: "new_low_value",
      extra: "not_allowed",
    };
    expect(() => decisionSchema.parse(raw)).toThrow();
  });

  it("guardrails override invalid percentage", () => {
    const decision = {
      version: "v1",
      play: "small_discount" as const,
      discount_percentage: 90,
      reason_code: "test",
    };
    const normalized: any = {
      in_sale_segment: false,
      is_high_risk_segment: false,
      prior_complaint_flag: false,
    };
    const result = applyGuardrails(decision, { normalized });
    expect(result.play).toBe("remind_only");
    expect(result.discount_percentage).toBeNull();
  });
});

Normalization invariants

Add tests that fail if:

  • new PII fields leak into the normalized structure
  • bucket boundaries are changed without an explicit test update
  • hashIdentifier behavior changes unexpectedly
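The first invariant, no new PII in the normalized structure, is easiest to enforce with an allowlist check that a test can call on every fixture. This is a sketch under assumptions: the three segment and complaint flags appear in the contract test above, while `customer_hash`, `cart_value_bucket`, and the function name are hypothetical.

```typescript
// Fields the normalized structure is allowed to carry.
// customer_hash and cart_value_bucket are hypothetical examples.
const ALLOWED_NORMALIZED_KEYS: string[] = [
  "customer_hash",
  "cart_value_bucket",
  "in_sale_segment",
  "is_high_risk_segment",
  "prior_complaint_flag",
];

// Loose pattern for anything that looks like an email address.
const EMAIL_LIKE = /[^\s@]+@[^\s@]+\.[^\s@]+/;

// Returns a list of human readable violations; empty means the object is clean.
function findNormalizationViolations(
  normalized: { [key: string]: unknown }
): string[] {
  const violations: string[] = [];
  const keys = Object.keys(normalized);
  for (let i = 0; i < keys.length; i++) {
    const key = keys[i];
    const value = normalized[key];
    if (ALLOWED_NORMALIZED_KEYS.indexOf(key) === -1) {
      violations.push("unexpected field: " + key);
    }
    if (typeof value === "string" && EMAIL_LIKE.test(value)) {
      violations.push("email-like value in: " + key);
    }
  }
  return violations;
}
```

An allowlist fails closed: a developer who adds a field has to touch this list, which makes the change visible in review.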

Suggested internal testbench and ownership

Before this system receives real traffic, your team should agree on who owns it and how you will test it on your own data.

Ownership

  • service owner
    responsible for code, on call, and incident response
  • data or analytics owner
    interprets A/B tests and drift metrics
  • privacy or legal owner
    approves vendor DPAs and retention choices

Testbench ideas

A useful internal testbench can:

  • replay a sample of historical abandoned checkouts (with PII removed) through the decision service
  • inspect the distribution of plays
  • flag any decisions that would have violated current discount policies
  • let operators simulate guardrail changes and recompute outcomes

You can implement this as a separate script that reads from a file or staging database and calls the same decision code with logging enabled.
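A minimal sketch of such a script's core loop follows. It assumes a synchronous `decide` function for brevity (the real decision code may well be async), a `Decision` shape that mirrors the fields used in the contract test, and a single maximum discount percentage as the policy. All names are illustrative.

```typescript
// Subset of the decision fields seen in the contract test above.
interface ReplayDecision {
  play: string;
  discount_percentage: number | null;
}

interface ReplayReport {
  playCounts: { [play: string]: number };
  violationIndexes: number[]; // positions of fixtures whose decision broke policy
}

// Replays PII-free fixtures through the decision function and summarizes results.
function replayFixtures<T>(
  fixtures: T[],
  decide: (fixture: T) => ReplayDecision,
  maxDiscountPct: number
): ReplayReport {
  const playCounts: { [play: string]: number } = {};
  const violationIndexes: number[] = [];
  fixtures.forEach((fixture, index) => {
    const decision = decide(fixture);
    playCounts[decision.play] = (playCounts[decision.play] || 0) + 1;
    const pct = decision.discount_percentage;
    if (pct !== null && pct > maxDiscountPct) {
      violationIndexes.push(index);
    }
  });
  return { playCounts, violationIndexes };
}
```

Running the same fixtures with a different `maxDiscountPct` is a cheap way to let operators see what a guardrail change would have done before it ships.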

Practical decision: how to ship this safely

Choose a simple path:

  • If you need to move quickly and do not yet have deep ML experience, keep the AI decision set small, run synchronous calls with tight timeouts, and constrain plays to reminder vs small discount only. Get the plumbing right before you expand complexity.
  • If you expect high volume or tighter control requirements, invest in the asynchronous pattern with a queue and worker, add formal drift monitoring, and give your operators a control panel to adjust guardrails without changing code.
  • If you cannot keep someone on the hook to watch metrics and handle alerts, prefer a static rules based cart abandonment flow and revisit AI decisions later. An unmaintained AI sequence is worse than a well run manual one.

The winning pattern is not the fanciest model. It is the one that treats AI as a small component within a hardened system you can test, observe, and change on purpose.
