Duplicate Detection and Record Matching in CRM Automation
Overview
Duplicates fragment history and break routing. Automation amplifies the problem unless matching is systematic.
Quick definition
Duplicate detection uses blocking keys (email domain, phone area code) to narrow the candidate set, then similarity scoring on names; merges are transactional and rewire foreign keys to the surviving record.
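A transactional merge with foreign-key rewiring can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a CRM's actual merge API: the `contacts` and `activities` tables and the `merge_contacts` helper are hypothetical names chosen for the example.

```python
import sqlite3


def merge_contacts(conn: sqlite3.Connection, survivor_id: int, duplicate_id: int) -> None:
    """Repoint child rows to the survivor, then delete the duplicate, atomically."""
    if survivor_id == duplicate_id:
        raise ValueError("survivor and duplicate must differ")
    # "with conn" commits on success and rolls back if any statement raises.
    with conn:
        conn.execute(
            "UPDATE activities SET contact_id = ? WHERE contact_id = ?",
            (survivor_id, duplicate_id),
        )
        conn.execute("DELETE FROM contacts WHERE id = ?", (duplicate_id,))


def demo() -> list:
    """Build a tiny hypothetical schema, merge contact 2 into 1, return activities."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE contacts (id INTEGER PRIMARY KEY, email TEXT)")
    conn.execute(
        "CREATE TABLE activities ("
        "id INTEGER PRIMARY KEY, "
        "contact_id INTEGER NOT NULL REFERENCES contacts(id))"
    )
    conn.executemany("INSERT INTO contacts VALUES (?, ?)", [(1, "a@x.com"), (2, "A@x.com")])
    conn.executemany("INSERT INTO activities VALUES (?, ?)", [(10, 1), (11, 2)])
    conn.commit()
    merge_contacts(conn, survivor_id=1, duplicate_id=2)
    return conn.execute("SELECT id, contact_id FROM activities ORDER BY id").fetchall()
```

Because the update and the delete share one transaction, a failure in either leaves both records and all their child rows as they were.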
Definition
Matching identifies records representing the same real-world entity using deterministic keys first, then fuzzy similarity with thresholds.
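The "deterministic first, then fuzzy" order can be sketched as follows. This assumes records are plain dicts with optional `email` and `name` fields, uses the standard library's `difflib.SequenceMatcher` as a stand-in similarity measure, and picks 0.85 as a hypothetical threshold that would need tuning against labeled pairs.

```python
from difflib import SequenceMatcher
from typing import Optional

FUZZY_THRESHOLD = 0.85  # hypothetical value; tune against labeled pairs


def _norm_name(value: Optional[str]) -> str:
    # Lowercase and collapse runs of whitespace.
    return " ".join((value or "").lower().split())


def match_kind(a: dict, b: dict, threshold: float = FUZZY_THRESHOLD) -> Optional[str]:
    """Return "deterministic", "fuzzy", or None for a candidate pair."""
    email_a = (a.get("email") or "").strip().lower()
    email_b = (b.get("email") or "").strip().lower()
    # Deterministic key first: an exact normalized email match settles it.
    if email_a and email_a == email_b:
        return "deterministic"
    name_a, name_b = _norm_name(a.get("name")), _norm_name(b.get("name"))
    if not name_a or not name_b:
        return None
    # Fuzzy fallback: only accept similarity at or above the threshold.
    ratio = SequenceMatcher(None, name_a, name_b).ratio()
    return "fuzzy" if ratio >= threshold else None
```

Returning the kind of match, rather than a bare boolean, lets downstream steps treat fuzzy matches more cautiously than deterministic ones.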
Why it matters
Split records cause double outreach; bad merges cause compliance and customer-experience failures.
Core framework
Step-by-step model as TypeScript interfaces (machine-readable checkpoints).
Golden keys
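Golden keys only work if both sides are normalized the same way. A minimal Python sketch, assuming email is the contact key and (domain, normalized name) is the account key; the suffix list and the function names are hypothetical and would need extending for real data.

```python
import re
from typing import Optional, Tuple

# Hypothetical list of legal suffixes to drop from account names.
_LEGAL_SUFFIXES = {"inc", "llc", "ltd", "corp", "co", "gmbh"}


def contact_key(email: str) -> Optional[str]:
    """Golden key for a contact: trimmed, lowercased email, or None if unusable."""
    cleaned = email.strip().lower()
    local, sep, domain = cleaned.partition("@")
    if not sep or not local or "." not in domain:
        return None
    return cleaned


def account_key(domain: str, name: str) -> Tuple[str, str]:
    """Golden key for an account: (domain, normalized name)."""
    host = domain.strip().lower()
    if host.startswith("www."):
        host = host[4:]
    # Keep alphanumeric tokens only, then drop legal suffixes.
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    core = [t for t in tokens if t not in _LEGAL_SUFFIXES]
    # Fall back to all tokens if the name consisted only of suffixes.
    return host, " ".join(core or tokens)
```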
/**
 * Golden keys
 * Email for contacts; domain + normalized name for accounts where appropriate.
 */
export interface CoreFrameworkStep1GoldenKeys {
  /** Order in the core framework (0-based) */
  readonly stepIndex: 0;
  /** Display title for this step */
  readonly title: "Golden keys";
  /** Narrative checkpoints as published in the guide */
  readonly narrative: readonly string[];
}
export const CoreFrameworkStep1GoldenKeys_NARRATIVE: readonly string[] = [
  "Email for contacts; domain + normalized name for accounts where appropriate."
] as const;
Human merge queue
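A human merge queue can be sketched as a small routing structure in Python: high scores auto-merge, borderline scores wait for an explicit reviewer decision, and nothing in the middle band merges silently. The 0.92 auto-merge cutoff mirrors the similarity gate in this guide; the 0.80 review floor and the class names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ReviewItem:
    left_id: str
    right_id: str
    score: float
    decision: Optional[str] = None  # "merge" or "keep_separate" once reviewed


@dataclass
class MergeQueue:
    auto_threshold: float = 0.92    # matches the guide's similarity gate
    review_threshold: float = 0.80  # hypothetical lower bound for review
    pending: List[ReviewItem] = field(default_factory=list)

    def route(self, left_id: str, right_id: str, score: float) -> str:
        """Classify a scored pair; borderline pairs are queued, never merged."""
        if score >= self.auto_threshold:
            return "auto_merge"
        if score >= self.review_threshold:
            self.pending.append(ReviewItem(left_id, right_id, score))
            return "review"
        return "no_match"

    def decide(self, index: int, decision: str) -> ReviewItem:
        """Record a reviewer's decision and remove the item from the queue."""
        if decision not in ("merge", "keep_separate"):
            raise ValueError(f"unknown decision: {decision!r}")
        item = self.pending.pop(index)
        item.decision = decision
        return item
```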
/**
 * Human merge queue
 * Borderline matches need review—not silent merges.
 */
export interface CoreFrameworkStep2HumanMergeQueue {
  /** Order in the core framework (0-based) */
  readonly stepIndex: 1;
  /** Display title for this step */
  readonly title: "Human merge queue";
  /** Narrative checkpoints as published in the guide */
  readonly narrative: readonly string[];
}
export const CoreFrameworkStep2HumanMergeQueue_NARRATIVE: readonly string[] = [
  "Borderline matches need review—not silent merges."
] as const;
Detailed breakdown
Logic sections encoded as Python functions with structured narrative payloads.
Automation hooks
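The two hooks in this section (match before create, re-evaluate links on update) can be sketched against an in-memory store. `ContactStore` is a hypothetical stand-in for a CRM object store, and matching here is reduced to a normalized-email lookup to keep the example short.

```python
from typing import Dict, Optional


class ContactStore:
    """In-memory stand-in for a CRM object store, keyed by record id."""

    def __init__(self) -> None:
        self.records: Dict[str, dict] = {}
        self.by_email: Dict[str, str] = {}  # normalized email -> record id

    @staticmethod
    def _norm(email: Optional[str]) -> str:
        return (email or "").strip().lower()

    def before_create(self, record_id: str, record: dict) -> str:
        """Run the match before insert; return the id the caller should use."""
        key = self._norm(record.get("email"))
        existing = self.by_email.get(key) if key else None
        if existing is not None:
            return existing  # link to the existing record instead of creating
        self.records[record_id] = dict(record)
        if key:
            self.by_email[key] = record_id
        return record_id

    def on_update(self, record_id: str, changes: dict) -> Optional[str]:
        """Apply changes, re-evaluate the match, return a conflicting id if any."""
        old_key = self._norm(self.records[record_id].get("email"))
        self.records[record_id].update(changes)
        new_key = self._norm(self.records[record_id].get("email"))
        if new_key == old_key:
            return None
        # The key changed: drop the stale index entry before re-checking.
        if old_key and self.by_email.get(old_key) == record_id:
            del self.by_email[old_key]
        owner = self.by_email.get(new_key) if new_key else None
        if owner is not None and owner != record_id:
            return owner  # collides with another record: hand off to matching
        if new_key:
            self.by_email[new_key] = record_id
        return None
```

A conflict returned by `on_update` is a candidate pair, not a merge; it would still go through scoring and, where borderline, review.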
def logic_block_1_automation_hooks(context: dict) -> dict:
    """Operational logic: Automation hooks"""
    # Narrative steps from the guide (logic section)
    paragraphs = ["Before create, run match; on update, re-evaluate links."]
    return {
        "heading": "Automation hooks",
        "paragraphs": paragraphs,
        "context_keys": tuple(sorted(context.keys())),
    }
Technical patterns
Blocking + scoring
- Block: first 3 chars of last name + zip.
- Score: Jaro-Winkler on name + exact email boost.
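The block-then-score pattern above can be sketched in plain Python. The Jaro-Winkler implementation below follows the standard definition (prefix weight 0.1, prefix capped at 4 characters); the record field names and the fixed 0.10 email bonus are assumptions made for the example, since the guide does not specify how large the boost should be.

```python
from collections import defaultdict

EMAIL_BOOST = 0.10  # hypothetical bonus for an exact email match


def jaro(s1: str, s2: str) -> float:
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1, matched2 = [False] * len1, [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    t = out_of_order // 2
    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


def block_key(rec: dict) -> str:
    # Block: first 3 chars of last name + zip.
    return f"{rec['last_name'].strip().lower()[:3]}|{rec['zip'].strip()}"


def build_index(records: list) -> dict:
    index = defaultdict(list)
    for rec in records:
        index[block_key(rec)].append(rec)
    return index


def score(a: dict, b: dict) -> float:
    name_a = f"{a['first_name']} {a['last_name']}".lower()
    name_b = f"{b['first_name']} {b['last_name']}".lower()
    s = jaro_winkler(name_a, name_b)
    email_a = (a.get("email") or "").strip().lower()
    email_b = (b.get("email") or "").strip().lower()
    if email_a and email_a == email_b:
        s = min(1.0, s + EMAIL_BOOST)
    return s
```

Only records sharing a block key are ever scored against each other, which is what keeps matching from comparing every pair in the database.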
Code examples
Simple similarity gate
Candidate pair goes to auto-merge or review.
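// An exact email match or a high similarity score auto-merges; anything else goes to review.
export function shouldAutoMerge(score: number, emailMatch: boolean): boolean {
  return emailMatch || score >= 0.92;
}
System architecture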
[New/updated record]
→ [Blocking index lookup: candidates]
→ [Scorer]
→ [Auto-merge | human queue]
→ [Audit: survivor id]
Real-world example
A services firm deduped inbound leads against existing accounts, so expansion opportunities were routed to the existing account team instead of being handed to an arbitrary new rep.
Common mistakes
- Aggressive auto-merge on similar names: different people in the same city can share a name.
- No audit of merge actions, so mistakes cannot be unwound.
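An audit entry that makes a merge reversible can be sketched in Python as below. It assumes records are JSON-serializable dicts held in memory and that the survivor wins field conflicts; the function names and the entry layout are hypothetical, and a real system would also need to record which child rows were repointed.

```python
import json
from datetime import datetime, timezone


def merge_with_audit(records: dict, audit_log: list, survivor_id: str,
                     merged_id: str, actor: str) -> dict:
    """Merge merged_id into survivor_id and append an entry that can undo it."""
    entry = {
        "at": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "survivor_id": survivor_id,
        "merged_id": merged_id,
        # Snapshots are JSON strings so later edits cannot mutate the log.
        "survivor_before": json.dumps(records[survivor_id]),
        "merged_before": json.dumps(records[merged_id]),
    }
    merged = records.pop(merged_id)
    # Assumption: survivor wins conflicts; the merged record fills empty fields.
    for name, value in merged.items():
        if records[survivor_id].get(name) in (None, ""):
            records[survivor_id][name] = value
    audit_log.append(entry)
    return entry


def unmerge(records: dict, entry: dict) -> None:
    """Restore both records to their pre-merge state.

    Note: this discards any edits made to the survivor after the merge.
    """
    records[entry["survivor_id"]] = json.loads(entry["survivor_before"])
    records[entry["merged_id"]] = json.loads(entry["merged_before"])
```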
Related topics
PrimeAxiom implements dedupe with CRM-native tools and custom matchers—book a data audit.