SOP-MON-03: Allocation Health Monitor v1.0¶
Detects PAID orders stuck without inventory allocation
Document ID: SOP-MON-03-v1.0
Version: 1.0
Status: 🔵 Ready to Deploy
Created: 2026-02-17
Owner: Protocol Raw Operations
Review Date: 2026-05-17
Purpose¶
Automatically detects PAID orders where one or more order_items have not been allocated to a RELEASED batch, and alerts the operations team via Slack.
The Gap This Closes: The Order Ingestion SLO monitor (SOP-MON-02) catches orders that fail to arrive in Supabase from Shopify. But an order can ingest successfully and then sit unallocated indefinitely if no RELEASED inventory is available. Without this monitor, the customer has paid but nothing ships — and nobody is alerted.
Trigger: pg_cron scheduled job (every 15 minutes)
Processing Time: <100ms
Human Intervention: Only when unallocated orders detected
Automation Coverage: 100%
Scope¶
Applies to: All PAID orders in raw_ops.orders with incomplete allocations
Monitoring Architecture: Follows SOP-MON-01 pattern exactly:
pg_cron (*/15 * * * *)
↓
pg_net (async HTTP)
↓
Edge Function: run-monitor
↓
PostgreSQL: raw_ops.fn_check_allocation_health_v2()
↓
monitoring_runs (logging)
↓
Edge Function: ops-alerter [if unallocated orders found]
↓
Slack: #ops-alerts or #ops-urgent
Related SOPs: - SOP-MON-01: Monitoring & Alerting Architecture (pattern) - SOP-MON-02: Order Ingestion SLO Monitor (upstream — order must exist first) - SOP-01: Batch Creation & Lab-to-Release (allocation fires on RELEASED) - SOP-INV-01: Inventory Control (allocation logic, stock levels) - SOP-0X: 3PL Order Export (downstream — only exports allocated orders)
Alert Logic¶
What Triggers an Alert¶
An order is flagged when ALL of the following are true:
1. orders.status = 'PAID' (payment confirmed)
2. At least one order_item has no matching record in allocations
3. The order has been PAID for longer than the grace period (30 minutes)
The 30-minute grace period accounts for: - Normal allocation trigger timing (fires on batch RELEASED) - Edge cases where a batch releases moments after an order arrives - Avoids false alarms during normal batch release → allocation cascade
Severity Levels¶
| Age Since Payment | Severity | Channel | Meaning |
|---|---|---|---|
| 30 min – 4 hours | WARN | #ops-alerts | Likely waiting for next batch release |
| 4 – 24 hours | HIGH | #ops-urgent | Stock gap — may need to expedite batch |
| > 24 hours | PAGE | #ops-urgent | Customer paid >1 day ago, no allocation |
Alert Fields¶
| Field | Description |
|---|---|
| Orders Affected | Count of unallocated orders |
| Oldest | Age of oldest unallocated order (hours) |
| Total kg Needed | Sum of kg required across all unallocated items |
| Available Stock (kg) | Total RELEASED inventory available |
| Severity | WARN / HIGH / PAGE |
| Order IDs | Shopify order IDs (first 10) |
Database Components¶
1. Core Check Function¶
CREATE OR REPLACE FUNCTION raw_ops.fn_check_allocation_health_v2()
RETURNS jsonb
LANGUAGE plpgsql
SECURITY DEFINER
AS $$
DECLARE
v_run_id uuid;
v_start_time timestamptz;
v_duration_ms integer;
v_unallocated_count integer;
v_oldest_hours numeric;
v_total_kg_needed numeric;
v_available_stock_kg numeric;
v_max_severity text;
v_order_ids text[];
v_alert_channel text;
v_alert_severity text;
BEGIN
v_start_time := clock_timestamp();
-- Create monitoring run record
INSERT INTO raw_ops.monitoring_runs (check_name, status)
VALUES ('allocation_health', 'running')
RETURNING id INTO v_run_id;
-- Find PAID orders with incomplete allocations older than 30 min grace period
WITH unallocated_orders AS (
SELECT
o.id AS order_id,
o.shopify_order_id,
o.created_at AS paid_at,
EXTRACT(EPOCH FROM (NOW() - o.created_at)) / 3600 AS age_hours,
SUM(oi.qty * p.kg) AS kg_needed,
CASE
WHEN EXTRACT(EPOCH FROM (NOW() - o.created_at)) / 3600 > 24 THEN 'PAGE'
WHEN EXTRACT(EPOCH FROM (NOW() - o.created_at)) / 3600 > 4 THEN 'HIGH'
ELSE 'WARN'
END AS severity
FROM raw_ops.orders o
JOIN raw_ops.order_items oi ON o.id = oi.order_id
JOIN raw_ops.products p ON oi.product_id = p.id
LEFT JOIN raw_ops.allocations a ON oi.id = a.order_item_id
WHERE o.status = 'PAID'
AND a.id IS NULL
AND o.created_at < NOW() - INTERVAL '30 minutes'
GROUP BY o.id, o.shopify_order_id, o.created_at
),
available_stock AS (
SELECT COALESCE(SUM(b.qty_on_hand - b.qty_reserved), 0) AS total_available_kg
FROM raw_ops.batches b
WHERE b.status = 'RELEASED'
AND (b.qty_on_hand - b.qty_reserved) > 0
)
SELECT
COUNT(*)::integer,
ROUND(MAX(uo.age_hours)::numeric, 1),
ROUND(COALESCE(SUM(uo.kg_needed), 0)::numeric, 1),
ast.total_available_kg,
MAX(uo.severity),
ARRAY_AGG(uo.shopify_order_id::text ORDER BY uo.age_hours DESC)
INTO
v_unallocated_count,
v_oldest_hours,
v_total_kg_needed,
v_available_stock_kg,
v_max_severity,
v_order_ids
FROM unallocated_orders uo
CROSS JOIN available_stock ast;
-- Handle NULL (no unallocated orders)
IF v_unallocated_count IS NULL OR v_unallocated_count = 0 THEN
v_unallocated_count := 0;
v_oldest_hours := 0;
v_total_kg_needed := 0;
v_max_severity := NULL;
v_order_ids := ARRAY[]::text[];
SELECT COALESCE(SUM(b.qty_on_hand - b.qty_reserved), 0)
INTO v_available_stock_kg
FROM raw_ops.batches b
WHERE b.status = 'RELEASED'
AND (b.qty_on_hand - b.qty_reserved) > 0;
END IF;
v_duration_ms := EXTRACT(EPOCH FROM (clock_timestamp() - v_start_time)) * 1000;
-- Determine alert channel
IF v_max_severity IN ('PAGE', 'HIGH') THEN
v_alert_channel := 'ops-urgent';
v_alert_severity := 'critical';
ELSIF v_max_severity = 'WARN' THEN
v_alert_channel := 'ops-alerts';
v_alert_severity := 'warning';
END IF;
-- Update monitoring run
UPDATE raw_ops.monitoring_runs
SET
status = 'success',
duration_ms = v_duration_ms,
result_summary = CASE
WHEN v_unallocated_count = 0 THEN 'All PAID orders allocated'
ELSE v_unallocated_count || ' orders unallocated (oldest: ' || v_oldest_hours || 'h)'
END,
result_json = jsonb_build_object(
'unallocated_count', v_unallocated_count,
'oldest_hours', v_oldest_hours,
'kg_needed', v_total_kg_needed,
'available_stock_kg', v_available_stock_kg,
'max_severity', v_max_severity,
'order_ids', v_order_ids
)
WHERE id = v_run_id;
-- Return result
RETURN jsonb_build_object(
'success', true,
'run_id', v_run_id,
'check_name', 'allocation_health',
'unallocated_count', v_unallocated_count,
'duration_ms', v_duration_ms,
'alert_config', CASE
WHEN v_unallocated_count = 0 THEN NULL
ELSE jsonb_build_object(
'channel', v_alert_channel,
'severity', v_alert_severity,
'title', '📦 Unallocated Orders Detected',
'message', v_unallocated_count || ' PAID order(s) have no inventory allocation',
'fields', jsonb_build_array(
jsonb_build_object('label', 'Orders Affected', 'value', v_unallocated_count::text, 'inline', true),
jsonb_build_object('label', 'Oldest', 'value', v_oldest_hours || ' hours', 'inline', true),
jsonb_build_object('label', 'Severity', 'value', v_max_severity, 'inline', true),
jsonb_build_object('label', 'kg Needed', 'value', v_total_kg_needed || ' kg', 'inline', true),
jsonb_build_object('label', 'Available Stock', 'value', v_available_stock_kg || ' kg', 'inline', true),
jsonb_build_object('label', 'Order IDs', 'value', array_to_string(v_order_ids[1:10], ', '), 'inline', false)
),
'source_check', 'allocation_health'
)
END
);
EXCEPTION WHEN OTHERS THEN
v_duration_ms := EXTRACT(EPOCH FROM (clock_timestamp() - v_start_time)) * 1000;
UPDATE raw_ops.monitoring_runs
SET status = 'error', duration_ms = v_duration_ms, error_message = SQLERRM
WHERE id = v_run_id;
RETURN jsonb_build_object(
'success', false,
'run_id', v_run_id,
'check_name', 'allocation_health',
'error', SQLERRM,
'duration_ms', v_duration_ms
);
END;
$$;
2. Required Indexes¶
-- Fast lookup: PAID orders without allocations
-- This is the critical query path for the monitor
CREATE INDEX IF NOT EXISTS idx_orders_paid_unallocated
ON raw_ops.orders(created_at DESC)
WHERE status = 'PAID';
-- Fast join: order_items to allocations (LEFT JOIN performance)
CREATE INDEX IF NOT EXISTS idx_allocations_order_item_id
ON raw_ops.allocations(order_item_id);
-- Fast lookup: available RELEASED inventory
CREATE INDEX IF NOT EXISTS idx_batches_released_available
ON raw_ops.batches(status)
WHERE status = 'RELEASED';
3. Verification View (Optional — for Metabase/Manual Checks)¶
CREATE OR REPLACE VIEW raw_ops.v_unallocated_orders AS
SELECT
o.id AS order_id,
o.shopify_order_id,
o.order_number,
c.email AS customer_email,
c.first_name,
o.created_at AS paid_at,
ROUND(EXTRACT(EPOCH FROM (NOW() - o.created_at)) / 3600, 1) AS hours_waiting,
SUM(oi.qty * p.kg) AS kg_needed,
p.sku,
CASE
WHEN EXTRACT(EPOCH FROM (NOW() - o.created_at)) / 3600 > 24 THEN 'PAGE'
WHEN EXTRACT(EPOCH FROM (NOW() - o.created_at)) / 3600 > 4 THEN 'HIGH'
ELSE 'WARN'
END AS severity
FROM raw_ops.orders o
JOIN raw_ops.order_items oi ON o.id = oi.order_id
JOIN raw_ops.products p ON oi.product_id = p.id
JOIN raw_ops.customers c ON o.customer_id = c.id
LEFT JOIN raw_ops.allocations a ON oi.id = a.order_item_id
WHERE o.status = 'PAID'
AND a.id IS NULL
AND o.created_at < NOW() - INTERVAL '30 minutes'
GROUP BY o.id, o.shopify_order_id, o.order_number, c.email, c.first_name, o.created_at, p.sku
ORDER BY o.created_at ASC;
GRANT SELECT ON raw_ops.v_unallocated_orders TO analytics_user;
Edge Function Update¶
run-monitor Dispatch Map¶
Add allocation_health to the MONITOR_FUNCTIONS map in run-monitor/index.ts:
const MONITOR_FUNCTIONS: Record<string, string> = {
// Existing monitors
order_ingestion_slo: "fn_check_order_ingestion_slo_v2",
outbox_health: "fn_check_outbox_health_v2",
stock_levels: "fn_check_stock_levels_v2",
lab_sla: "fn_check_lab_sla_v2",
daily_snapshot: "fn_daily_snapshot_v2",
working_capital: "fn_check_working_capital_v2",
refund_health: "fn_check_refund_health_v2",
subscription_health: "fn_check_subscription_health_v2",
subscription_daily: "fn_check_subscription_daily_v2",
// NEW: Allocation health (SOP-MON-03)
allocation_health: "fn_check_allocation_health_v2",
};
Deploy:
pg_cron Configuration¶
Schedule Job¶
SELECT cron.schedule(
'monitor-allocation-health',
'*/15 * * * *',
$$
SELECT net.http_post(
url := 'https://znfjpllsiuyezqlneqzr.supabase.co/functions/v1/run-monitor',
headers := '{"Content-Type": "application/json", "apikey": "YOUR_ANON_KEY"}'::jsonb,
body := '{"check": "allocation_health"}'::jsonb
);
$$
);
Verify Job¶
Testing¶
Test 1: No Unallocated Orders (Healthy State)¶
Expected: alert_config: null (no alert fired)
Test 2: Simulate Unallocated Order¶
-- Create a test order older than 30 min grace period
INSERT INTO raw_ops.orders (customer_id, total_inc_vat, status, created_at)
VALUES (
(SELECT id FROM raw_ops.customers LIMIT 1),
89.00,
'PAID',
NOW() - INTERVAL '2 hours'
)
RETURNING id;
-- Add an order_item with NO allocation
INSERT INTO raw_ops.order_items (order_id, product_id, qty)
VALUES (
'<order_id_from_above>',
(SELECT id FROM raw_ops.products WHERE sku = 'STARTER-8KG' LIMIT 1),
1
);
-- Run monitor — should detect the unallocated order
SELECT raw_ops.fn_check_allocation_health_v2();
-- Clean up test data
DELETE FROM raw_ops.order_items WHERE order_id = '<order_id_from_above>';
DELETE FROM raw_ops.orders WHERE id = '<order_id_from_above>';
Expected: alert_config populated with channel: ops-alerts, severity: warning
Test 3: Verify Monitoring Run Logged¶
SELECT * FROM raw_ops.monitoring_runs
WHERE check_name = 'allocation_health'
ORDER BY run_at DESC LIMIT 5;
Test 4: Edge Function End-to-End¶
curl -X POST 'https://znfjpllsiuyezqlneqzr.supabase.co/functions/v1/run-monitor' \
-H 'Content-Type: application/json' \
-H 'apikey: YOUR_ANON_KEY' \
-d '{"check": "allocation_health"}'
Alert Response Runbook¶
WARN (30 min – 4 hours)¶
Likely Cause: No RELEASED inventory available. Normal if a batch is in QA_HOLD.
Actions:
1. Check if batch is pending lab results: SELECT * FROM raw_ops.batches WHERE status = 'QA_HOLD' ORDER BY created_at DESC;
2. Check lab SLA: Is the batch within expected turnaround?
3. If lab results are overdue → chase lab (see SOP-01)
4. If no batch in pipeline → contact co-packer for expedited production
HIGH (4 – 24 hours)¶
Likely Cause: Stock gap — no batch available or lab results significantly delayed.
Actions:
1. All WARN actions above
2. Check demand forecast: SELECT * FROM raw_ops.v_unallocated_orders;
3. Quantify the gap: How much kg is needed vs what's in QA_HOLD?
4. Contact co-packer for production ETA
5. Consider communicating delay to affected customers if >12 hours
PAGE (> 24 hours)¶
Likely Cause: Serious supply chain issue — customer has been waiting >1 day.
Actions:
1. All HIGH actions above
2. Immediate co-packer escalation (phone, not email)
3. Prepare customer communication: proactive email explaining delay
4. If batch in QA_HOLD with results overdue: escalate lab directly
5. Consider manual allocation if partial stock available (ship what you can)
6. Log incident in ops_events with ALLOCATION_ESCALATION type
Deployment Checklist¶
- [ ] Run SQL: Create
fn_check_allocation_health_v2()function - [ ] Run SQL: Create indexes (3 indexes)
- [ ] Run SQL: Create
v_unallocated_ordersview - [ ] Update
run-monitorEdge Function withallocation_healthmapping - [ ] Deploy:
supabase functions deploy run-monitor - [ ] Run SQL: Schedule pg_cron job
- [ ] Test: Manual function call (Test 1)
- [ ] Test: Simulated unallocated order (Test 2)
- [ ] Test: Edge Function end-to-end (Test 4)
- [ ] Verify:
monitoring_runslogging (Test 3) - [ ] Verify: Slack alert appears in correct channel
Monitor Summary¶
| Property | Value |
|---|---|
| Check Name | allocation_health |
| Function | raw_ops.fn_check_allocation_health_v2() |
| Schedule | Every 15 minutes (*/15 * * * *) |
| Grace Period | 30 minutes (avoids false alarms during batch release) |
| WARN | 30 min – 4 hours → #ops-alerts |
| HIGH | 4 – 24 hours → #ops-urgent |
| PAGE | > 24 hours → #ops-urgent |
| Logging | raw_ops.monitoring_runs (check_name = 'allocation_health') |
| View | raw_ops.v_unallocated_orders (for Metabase / manual checks) |
Version History¶
| Version | Date | Author | Changes |
|---|---|---|---|
| 1.0 | 2026-02-17 | Anton + Claude | Initial release — closes Gap 2 from system review |
Document Owner: Protocol Raw Operations
Classification: Monitoring Infrastructure
Review Cycle: Quarterly
Next Review: 2026-05-17
Protocol Raw — Verified safe, batch by batch.