
How I Scaled AI Lead Scoring with Generic XAI Prompts in Lockline AI

The Problem with Opaque Lead Scores

When I first shipped AI-driven lead scoring in Lockline AI, the model worked—mostly. It ranked incoming locksmith leads by conversion likelihood, routing hot prospects to the right providers. But when a lead scored high (or low) for unclear reasons, my team was flying blind.

Debugging was a nightmare. I’d see a score of 0.92 and ask: Why? Was it the user’s location? The time of day? The phrasing of their request? Without transparent reasoning, my customers (and internal teams) couldn’t trust the system. Worse, every AI provider I integrated—whether it was a fine-tuned LLM or a third-party API—had its own quirks, making consistency impossible.

I needed more than accuracy. I needed explainability.

From Hardcoded Prompts to Reusable XAI Templates

My first attempt at explanations was naive: I tacked on hardcoded prompts like "Explain why this lead is high-priority in one sentence" directly in the scoring logic. It worked—until it didn’t.

As I added more providers and scoring models, I found myself copy-pasting and tweaking prompts across services. A change in tone or structure in one prompt meant updating five different files. Worse, the explanations varied wildly in format: some were verbose, others cryptic. Auditability? Forget it.

The real turning point came when I reviewed a batch of low-scoring leads and realized the explanations contradicted each other. One said "User didn't specify urgency," another said "Urgency implied by 'immediately'"—same signal, opposite interpretations. That's when I knew: my XAI logic had to be as rigorous as my scoring logic.

So I refactored. I replaced hardcoded strings with a parameterized XAI prompt system. Instead of embedding prompts in service code, I defined a generic template:

Given a lead with attributes {attrs}, and a predicted score {score},
provide a concise, consistent explanation that:
- Uses neutral, professional tone
- References exactly one decisive factor
- Avoids speculation beyond input data
- Outputs in plain English (1 sentence)

This template wasn’t tied to locksmithing—it was designed to work across verticals. I injected context (like service type or regional demand) at runtime, keeping the core logic stable.
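To make this concrete, here is a minimal sketch of what a parameterized template system like this can look like. The names (`XAI_TEMPLATE`, `build_xai_prompt`) and the sample attribute values are illustrative assumptions, not Lockline AI's actual code:

```python
from string import Template

# Generic XAI prompt, defined once and shared by every provider and vertical.
# $attrs and $score are filled in at runtime; the rules stay fixed.
XAI_TEMPLATE = Template(
    "Given a lead with attributes $attrs, and a predicted score $score,\n"
    "provide a concise, consistent explanation that:\n"
    "- Uses neutral, professional tone\n"
    "- References exactly one decisive factor\n"
    "- Avoids speculation beyond input data\n"
    "- Outputs in plain English (1 sentence)"
)

def build_xai_prompt(attrs: dict, score: float) -> str:
    """Render the shared template with runtime context for any vertical."""
    return XAI_TEMPLATE.substitute(attrs=attrs, score=f"{score:.2f}")

# Same template, two verticals -- only the attribute schema changes.
locksmith_prompt = build_xai_prompt(
    {"service": "lockout", "region": "Austin", "hour": 23}, 0.92
)
plumbing_prompt = build_xai_prompt(
    {"service": "burst_pipe", "region": "Denver", "hour": 9}, 0.71
)
```

Because the rules live in one place, tightening a constraint (say, requiring two decisive factors instead of one) is a single edit rather than a hunt across service code.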

The change wasn’t just cosmetic. By decoupling what I explained from how I prompted, I made the system more maintainable and predictable. A single prompt update now propagates across all providers, not just one.

Reuse, Auditability, and the Hidden Win: Debugging at Scale

The real payoff came when I onboarded a second service vertical—emergency plumbing. I expected to rewrite most of the XAI logic. Instead, I plugged in the same generic prompt, swapped the attribute schema, and got coherent explanations out of the box.

That reusability wasn’t just convenient—it exposed a deeper benefit: auditability. With every explanation following the same structure, I could log, compare, and validate them programmatically. I built a simple dashboard that sampled explanations over time, flagging outliers or inconsistencies. When a model started citing "unknown factors" too often, I caught it before it reached customers.
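A programmatic audit of structured explanations can be sketched roughly like this. The function name, the "one sentence" check, and the 10% threshold for "unknown factors" are assumptions for illustration, not Lockline AI's real configuration:

```python
import re

def audit_explanations(explanations: list[str],
                       unknown_threshold: float = 0.1) -> dict:
    """Flag explanations that break the shared template's structure,
    plus samples where 'unknown factors' is cited too often."""
    flags = []
    unknown_count = 0
    for i, text in enumerate(explanations):
        # The template mandates exactly one plain-English sentence.
        if len(re.findall(r"[.!?](?:\s|$)", text.strip())) != 1:
            flags.append((i, "not a single sentence"))
        if "unknown factor" in text.lower():
            unknown_count += 1
    rate = unknown_count / len(explanations) if explanations else 0.0
    if rate > unknown_threshold:
        flags.append((-1, f"'unknown factors' cited in {rate:.0%} of sample"))
    return {"flags": flags, "unknown_rate": rate}
```

Running a check like this on a periodic sample is what lets a dashboard surface drift (a model suddenly leaning on "unknown factors") before customers see it.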

But the biggest win? Debugging got faster. Instead of reverse-engineering why a lead scored poorly, I could read the explanation and trace it back to input features. One recent fix—correcting a timezone parsing bug that misclassified "after-hours" requests—was identified in minutes thanks to a cluster of explanations all citing "non-urgent timing" incorrectly.

Looking back, it’s clear: in B2B AI systems, the prompt isn’t just an interface—it’s part of the architecture. Treating it as a first-class, versioned component made Lockline AI more transparent, scalable, and trustworthy.

If you’re building AI into your pipeline, don’t treat explanations as an afterthought. Design them like code: reusable, testable, and central to your system’s integrity.
