CDP Schema Best Practices

Cédrick Lunven
Apr 11

Designing the Contact-Company Relationship at Scale

When enterprises invest in a Customer Data Platform (CDP), much of the conversation centers on integrations, identity resolution, and real-time activation. But there is a foundational question that often gets overlooked until it becomes a costly problem: How should contacts and companies be related to each other in your data model?

Get this wrong, and you will find yourself rebuilding your schema 12 months in — just as your pipeline is heating up and your marketing team finally wants to personalise at scale.

At Good Bards Marketing OS, we have seen this play out repeatedly across B2B organisations. This post breaks down the two dominant schema patterns, when to use each, and what best-in-class looks like for enterprises building for scale.

Why the Contact-Company Schema Matters More Than You Think

The contact-company relationship is the backbone of your CDP. Every downstream use case — segmentation, lead scoring, account-based marketing, customer journey mapping — depends on how cleanly this relationship is modelled.

Yet most organisations inherit a schema that was designed for convenience, not scalability. The result is data that works fine for 10,000 contacts, but breaks under the weight of 500,000 — especially when contacts start appearing across multiple accounts.

The Two Schema Patterns

Schema One: Many-to-One (Contact belongs to one Company)

In this model, each contact is assigned to exactly one company. It is simple to query, easy to understand, and fast to implement. Most legacy CRMs were built on this assumption.
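As a rough illustration of this pattern, here is a minimal sketch using Python's built-in sqlite3 module. The table names, column names, and sample rows are our own assumptions for the example, not a prescribed layout. The point to notice is that the relationship lives in a single foreign key column on the contact, so recording a second affiliation means overwriting the first.

```python
import sqlite3

# Hypothetical tables and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company (
        company_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL
    );
    CREATE TABLE contact (
        contact_id INTEGER PRIMARY KEY,
        email      TEXT NOT NULL UNIQUE,
        -- A single foreign key column: one company per contact.
        company_id INTEGER NOT NULL REFERENCES company(company_id)
    );
""")
conn.execute("INSERT INTO company VALUES (1, 'Acme'), (2, 'Globex')")
conn.execute("INSERT INTO contact VALUES (10, 'dana@example.com', 1)")


def company_of(contact_id):
    """Return the name of the one company a contact belongs to."""
    row = conn.execute(
        """SELECT c.name
           FROM contact ct
           JOIN company c ON c.company_id = ct.company_id
           WHERE ct.contact_id = ?""",
        (contact_id,),
    ).fetchone()
    return row[0] if row else None


print(company_of(10))  # Acme

# Adding a second affiliation is only possible by replacing the first one.
conn.execute("UPDATE contact SET company_id = 2 WHERE contact_id = 10")
print(company_of(10))  # Globex (the Acme link is no longer recorded)
```

The query side is as simple as it gets, which is the appeal of this model. The limitation shows up in the last two lines.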

Where it works well:

• SMB or mid-market B2B with clean, stable account structures

• Internal employee directories

• Early-stage CDPs with limited contact volume

Where it breaks down:

• Consultants, freelancers, and advisors who work across multiple organisations

• Board members and investors associated with several companies

• Enterprise accounts with complex subsidiary and holding structures

• Data enrichment pipelines that surface multi-affiliation signals

Schema Two: Many-to-Many (Contact belongs to many Companies)

In this model, a contact can be associated with multiple companies through a junction table — often called contact_company or contact_account_roles. The junction table holds the relationship metadata, not just the foreign keys.

A well-designed junction table typically includes:

• contact_id and company_id (foreign keys)

• role / title at that company

• is_primary flag to denote the main affiliation

• start_date and end_date for historical accuracy

• data_source to track enrichment provenance
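The fields listed above could be sketched as follows, again with Python's built-in sqlite3 module. This is one possible layout rather than a definitive one: the column types, the composite primary key, the use of ISO 8601 text for dates, and the sample rows are all assumptions made for the example.

```python
import sqlite3

# Hypothetical layout and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE contact (contact_id INTEGER PRIMARY KEY, email TEXT NOT NULL UNIQUE);

    CREATE TABLE contact_company (
        contact_id  INTEGER NOT NULL REFERENCES contact(contact_id),
        company_id  INTEGER NOT NULL REFERENCES company(company_id),
        role        TEXT,                    -- role / title at that company
        is_primary  INTEGER NOT NULL DEFAULT 0 CHECK (is_primary IN (0, 1)),
        start_date  TEXT NOT NULL,           -- ISO 8601 date, e.g. 2023-06-01
        end_date    TEXT,                    -- NULL means still current
        data_source TEXT NOT NULL,           -- where this relationship came from
        PRIMARY KEY (contact_id, company_id, start_date)
    );
""")
conn.execute("INSERT INTO company VALUES (1, 'Acme'), (2, 'Globex')")
conn.execute("INSERT INTO contact VALUES (10, 'dana@example.com')")
conn.executemany(
    "INSERT INTO contact_company VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (10, 1, "VP Marketing", 1, "2022-01-01", None, "crm"),
        (10, 2, "Advisor", 0, "2023-06-01", None, "enrichment_vendor"),
    ],
)

# One contact, two affiliations, each with its own role.
affiliations = conn.execute(
    """SELECT c.name, cc.role
       FROM contact_company cc
       JOIN company c ON c.company_id = cc.company_id
       WHERE cc.contact_id = ?
       ORDER BY cc.start_date""",
    (10,),
).fetchall()
print(affiliations)  # [('Acme', 'VP Marketing'), ('Globex', 'Advisor')]
```

Including start_date in the primary key is a design choice in this sketch: it lets the same contact return to the same company later without overwriting the earlier record.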

This is the industry-standard schema for enterprise CDPs, and for good reason. It reflects the reality of how people actually move through B2B markets — across roles, companies, and relationships simultaneously.

The main_recipient Flag: Your Secret Weapon

One of the most practical additions to any many-to-many schema is the main_recipient boolean on the junction table, which plays the same role as the is_primary flag listed above. This single field answers the question every salesperson and marketer eventually asks: "But which company do I associate this contact with?"

With main_recipient in place, your CDP can support both precision and flexibility — defaulting to a contact's primary company for segmentation and reporting, while preserving all other affiliations for enrichment, de-duplication, and outreach strategy.

At Good Bards Marketing OS, this is one of the first schema checks we run when onboarding a new enterprise client. The presence or absence of this field tells us a great deal about how mature the organisation's data thinking is.
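To make the idea concrete, here is a hedged sketch of how a main_recipient flag might be queried and protected, using Python's built-in sqlite3 module. The partial unique index that allows at most one main_recipient row per contact is our own suggestion, not something every CDP enforces, and the helper function names and sample rows are hypothetical.

```python
import sqlite3

# Hypothetical junction table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact_company (
        contact_id     INTEGER NOT NULL,
        company_id     INTEGER NOT NULL,
        main_recipient INTEGER NOT NULL DEFAULT 0
                       CHECK (main_recipient IN (0, 1)),
        PRIMARY KEY (contact_id, company_id)
    );
    -- Partial unique index: at most one main_recipient row per contact.
    CREATE UNIQUE INDEX one_main_per_contact
        ON contact_company (contact_id)
        WHERE main_recipient = 1;
""")
conn.executemany(
    "INSERT INTO contact_company VALUES (?, ?, ?)",
    [(10, 1, 1), (10, 2, 0), (11, 2, 1)],
)


def primary_company(contact_id):
    """Default company for segmentation and reporting."""
    row = conn.execute(
        """SELECT company_id FROM contact_company
           WHERE contact_id = ? AND main_recipient = 1""",
        (contact_id,),
    ).fetchone()
    return row[0] if row else None


def all_companies(contact_id):
    """Every affiliation, kept for enrichment and outreach."""
    rows = conn.execute(
        """SELECT company_id FROM contact_company
           WHERE contact_id = ? ORDER BY company_id""",
        (contact_id,),
    ).fetchall()
    return [r[0] for r in rows]


print(primary_company(10))  # 1
print(all_companies(10))    # [1, 2]

# A second main_recipient row for the same contact is rejected by the index.
try:
    conn.execute("INSERT INTO contact_company VALUES (10, 3, 1)")
    second_main_rejected = False
except sqlite3.IntegrityError:
    second_main_rejected = True
print(second_main_rejected)  # True
```

Reporting can call primary_company and get one unambiguous answer, while enrichment and de-duplication jobs can still see every affiliation through all_companies.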

Schema Design Principles for Enterprise CDPs

Beyond the contact-company relationship, here are the schema design principles we recommend for enterprises building or scaling a CDP:

1. Design for reality, not convenience

Real-world data is messy. Contacts change jobs, hold multiple roles, and get enriched from disparate sources. Your schema should accommodate that from day one, even if your current data does not yet reflect that complexity.

2. Track data provenance

Every field populated from outside the CDP, whether by a data vendor, a web form, or a product event, should carry a source tag. This is critical for data governance, consent management, and GDPR compliance.
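One lightweight way to picture a source tag is to store each value together with where and when it was collected. The sketch below is an assumption-laden example: the SourcedValue class, the source labels, and the fields_from helper are hypothetical names invented for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SourcedValue:
    """A field value that carries its own provenance (hypothetical type)."""
    value: str
    source: str          # e.g. "web_form", "data_vendor", "product_event"
    collected_on: date


# Illustrative profile: every populated field knows where it came from.
profile = {
    "email": SourcedValue("dana@example.com", "web_form", date(2024, 3, 1)),
    "job_title": SourcedValue("VP Marketing", "data_vendor", date(2024, 5, 12)),
    "plan": SourcedValue("enterprise", "product_event", date(2024, 6, 2)),
}


def fields_from(profile, source):
    """List the field names that were populated by a given source."""
    return sorted(name for name, sv in profile.items() if sv.source == source)


print(fields_from(profile, "data_vendor"))  # ['job_title']
```

With tags like these in place, a governance request such as "show us everything we hold from this vendor" becomes a simple filter rather than an investigation.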

3. Plan for temporal data

Start and end dates on relationship records are often an afterthought. But for enterprises running account-based marketing or churn prediction models, knowing that a contact left a company 18 months ago is a highly valuable signal.
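With start and end dates on the junction table, "where was this contact on a given date?" becomes a straightforward query. The sketch below uses Python's built-in sqlite3 module. It assumes dates are stored as ISO 8601 text (which sorts correctly as plain strings) and that a NULL end_date means the affiliation is still current. The table layout, sample rows, and the companies_as_of helper are hypothetical.

```python
import sqlite3

# Hypothetical junction table and sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE contact_company (
        contact_id INTEGER NOT NULL,
        company_id INTEGER NOT NULL,
        start_date TEXT NOT NULL,   -- ISO 8601 date
        end_date   TEXT             -- NULL means still current
    );
""")
conn.executemany(
    "INSERT INTO contact_company VALUES (?, ?, ?, ?)",
    [
        (10, 1, "2020-01-15", "2023-02-28"),  # left company 1
        (10, 2, "2023-03-01", None),          # currently at company 2
    ],
)


def companies_as_of(contact_id, as_of):
    """Return the company ids a contact was affiliated with on a date."""
    rows = conn.execute(
        """SELECT company_id FROM contact_company
           WHERE contact_id = ?
             AND start_date <= ?
             AND (end_date IS NULL OR end_date >= ?)
           ORDER BY company_id""",
        (contact_id, as_of, as_of),
    ).fetchall()
    return [r[0] for r in rows]


print(companies_as_of(10, "2022-06-01"))  # [1]
print(companies_as_of(10, "2024-01-01"))  # [2]
```

Because the old row is closed off with an end_date rather than deleted, the history stays available for account-based marketing and churn models.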

4. Avoid schema lock-in

Composable CDP architectures, which connect directly to your data warehouse (Snowflake, BigQuery, Databricks), give you significantly more flexibility to evolve your schema without being constrained by a vendor's opinionated data model. This is the direction in which enterprise data teams are increasingly moving.

A Quick Decision Guide

Not sure which approach fits your business? Use this table to find out. We have replaced the technical labels with plain language so anyone on your team — not just engineers — can weigh in on the decision.

Your Situation	The Simple Approach (One home per contact)	The Flexible Approach (Contacts can belong anywhere)
Your contacts each work at one company, full stop	✅ Works perfectly	Works, but more than you need
Some contacts are consultants, freelancers, or advisors	❌ Breaks down	✅ Handles this naturally
You do account-based marketing (ABM)	❌ Too rigid	✅ Built for this
You enrich contacts with data from external tools (e.g. LinkedIn, Clearbit)	⚠️ Limited	✅ Aligns well
You need to track contacts across subsidiaries or holding companies	❌ Not supported	✅ Standard use case
You are just starting out and want to keep things simple	✅ Good starting point	✅ Still a smart choice long-term
You are unsure and want to future-proof your data model	⚠️ May need to migrate later	✅ Recommended default

The bottom line: if three or more of the situations above describe you and the right-hand column is the one showing a ✅, build for flexibility from the start. The cost of re-architecting later (in engineering time, data migration risk, and lost momentum) is almost always higher than doing it right the first time.

Final Thoughts: Schema Is Strategy

A CDP is only as good as the data model underneath it. The schema decisions you make today will determine what your marketing, sales, and customer success teams can do with data 12, 24, and 36 months from now.

At Good Bards Marketing OS, we help enterprise teams audit their existing CDP data models, identify structural gaps, and architect schemas that support scale — without locking you into rigid vendor frameworks. Whether you are starting fresh or untangling a legacy data model, the principles above are a solid foundation.

If you are reassessing your customer data foundation this year, we would love to talk.