Consolidating client data for a UK wealth management group

Client

A UK wealth management group managing assets across private client, intermediary, and institutional channels

Sector

Financial Services

Engagement

Data architecture, engineering delivery, and regulatory assurance, delivered as a multi-quarter programme.

The challenge

What the client needed

The group had grown through a series of acquisitions over seven years, bringing its assets under management to approximately thirty billion pounds. Each acquisition had added its own adviser platform, and four separate systems were still running in production, each maintaining its own client record in a different format. The consequences were felt across the business. Client data could not be reconciled in real time, creating duplicate records, inconsistent valuations, and a manual data quality process that consumed significant resource every month. Regulatory reporting under MiFID II and the FCA Consumer Duty framework required aggregating data across all four platforms, a process that was error-prone and typically ran days behind the required cadence. Advisers moving between channels had no single view of a client's full relationship with the group. Senior leadership had been told by their auditors that the data governance position carried material risk.

Our approach

How we worked

Ran a structured discovery phase mapping every client data entity, field-level lineage, and quality defect class across all four platforms, working with compliance, operations, and the adviser community to understand the business impact of each gap.
Designed a canonical client data model that could represent clients, accounts, and relationships consistently regardless of their originating platform, and validated it against the regulatory reporting requirements before a line of code was written.
Built an event-driven ingestion layer that consumed real-time change events from each source platform, applied the canonical model, and published a unified client record to a purpose-built serving layer.
Delivered an adviser-facing client view as the first consumer of the unified record, giving advisers a complete picture of each client's holdings, interactions, and history across all channels from a single interface.
Replaced the manual regulatory reporting process with an automated pipeline drawing from the canonical layer, reducing MiFID II reporting lag from four days to same-day.
Ran each source platform in parallel with the new canonical layer throughout the transition, reconciling records automatically and surfacing exceptions for operational review rather than relying on manual spot-checks.

Outcomes

Measured results

All figures verified with the client. Specific identifiers withheld in line with our standard confidentiality terms.

Duplicate client records eliminated: the canonical layer resolved more than forty thousand duplicate records identified during the reconciliation phase.
Regulatory reporting lag reduced from four working days to same-day across MiFID II transaction and position reporting.
Monthly data quality remediation effort reduced by over sixty per cent, freeing operations staff for higher-value work.
Adviser satisfaction with client data quality, measured by internal pulse survey, improved from thirty-one per cent to seventy-eight per cent within six months of the unified view going live.
External audit findings relating to data governance closed within the programme period, removing the material risk flag from the group's risk register.
In-house data engineering team transitioned to full ownership of the canonical layer within nine months, with Halfteck stepping back from embedded delivery to a support and review role.

"We had known the data fragmentation was a problem for years, but it had always been outpaced by growth. This programme gave us a foundation we could actually build on. The regulatory burden dropped, the advisers have a better experience, and we are no longer managing risk through heroics in the operations team."
- Chief Data Officer, UK Wealth Management Group

Working on something similar?

If this engagement looks like the kind of problem you are facing, we would be glad to compare notes by email.

sales@halfteck.com

Context and constraints

Wealth management businesses that grow through acquisition inherit data fragmentation as a structural condition. Each acquired firm arrives with its own client administration system, its own data conventions, and its own understanding of what a client record contains. The group we worked with had made four acquisitions over seven years, and while the front-office brands had been rationalised, the underlying data estates had not. Four platforms were still in operation at the start of the programme, each serving a subset of the adviser population, and each maintaining its own version of who a client was.

The business consequences were well understood but had been managed rather than resolved. The data quality team ran a monthly reconciliation cycle that identified discrepancies, routed them to operations for manual review, and closed them in a process that consumed several person-weeks each month. Regulatory reporting was produced by extracting data from each platform separately, aggregating it in a series of spreadsheet-based models maintained by the finance and compliance teams, and submitting the results days after the position date. Advisers who managed clients across multiple channels found that their front-office system showed only the holdings administered on that channel, not the client's full picture with the group.

The external audit finding was the forcing function. Auditors had flagged the data governance position in two consecutive years, and the second finding carried a management letter recommendation that the group address the root cause rather than continue managing consequences. The Chief Data Officer commissioned the programme with a clear mandate: establish a single canonical client record, reduce regulatory reporting lag to same-day, and close the audit findings within twelve months.

The approach in depth

We began with a structured discovery phase that covered all four platforms in parallel. The objective was not simply to document what data each platform held, but to understand how the same real-world entities (clients, accounts, holdings, and relationships) were represented differently across them. That analysis surfaced something important: the platforms did not just use different field names for the same concepts; they held different levels of completeness and different interpretations of what a field meant. Date of birth, for example, was populated in different formats and at different rates across the four systems. Account classification codes had been mapped differently when each acquired firm was onboarded, meaning that what counted as a discretionary managed account in platform A was represented as something else in platform B.
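To illustrate the kind of field-level difference involved, the sketch below normalises date of birth and account classification from per-platform conventions into one canonical form. It is a minimal example rather than the programme's code: the platform keys, date formats, source codes, and canonical class names are all hypothetical, chosen only to show the shape of the mapping.

```python
from datetime import date, datetime
from typing import Optional

# Hypothetical date-of-birth formats per platform (not taken from the engagement).
DOB_FORMATS = {
    "platform_a": "%d/%m/%Y",
    "platform_b": "%Y-%m-%d",
    "platform_c": "%d.%m.%Y",
    "platform_d": "%Y%m%d",
}

# Hypothetical source codes mapped to hypothetical canonical account classes.
ACCOUNT_CLASS_MAP = {
    ("platform_a", "DISC"): "DISCRETIONARY_MANAGED",
    ("platform_b", "07"): "DISCRETIONARY_MANAGED",
    ("platform_a", "ADV"): "ADVISORY",
    ("platform_b", "03"): "ADVISORY",
}


def normalise_dob(platform: str, raw: Optional[str]) -> Optional[date]:
    """Parse a source date of birth; unpopulated values stay unpopulated."""
    if raw is None or not raw.strip():
        return None
    return datetime.strptime(raw.strip(), DOB_FORMATS[platform]).date()


def normalise_account_class(platform: str, code: str) -> str:
    """Translate a platform-specific code, failing loudly on unmapped codes."""
    try:
        return ACCOUNT_CLASS_MAP[(platform, code)]
    except KeyError:
        raise ValueError(f"unmapped account class {code!r} on {platform}") from None
```

Raising on an unmapped code, rather than passing the raw value through, keeps an unknown classification from silently entering the canonical record.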

Understanding those differences at field level was necessary before designing the canonical model, because the canonical model had to make real decisions about which platform's version of a field was authoritative under which conditions, rather than simply aggregating all values and deferring the conflict to the consumer. That kind of deliberate design decision, documented in the data model and the ingestion logic, is what separates a canonical layer that actually resolves conflicts from one that moves them downstream in a different format.
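One way to express "which platform is authoritative under which conditions" is a per-field precedence rule evaluated at ingestion time. The sketch below is an assumption about how such a rule could look, not a description of the delivered model: the `SourceValue` type, the `verified` flag, and the precedence orders are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class SourceValue:
    platform: str
    value: Any
    verified: bool = False  # hypothetical flag, e.g. checked against an ID document


# Hypothetical precedence per canonical field, most trusted platform first.
FIELD_PRECEDENCE = {
    "date_of_birth": ["platform_b", "platform_a", "platform_d", "platform_c"],
    "postcode": ["platform_a", "platform_c", "platform_b", "platform_d"],
}


def resolve_field(field: str, candidates: list[SourceValue]) -> Optional[SourceValue]:
    """Pick one authoritative value instead of passing all values downstream.

    Rule in this sketch: ignore unpopulated values, prefer verified values,
    then fall back to the documented platform precedence for the field.
    Raises ValueError if a candidate's platform is missing from the precedence list.
    """
    populated = [c for c in candidates if c.value not in (None, "")]
    if not populated:
        return None
    order = FIELD_PRECEDENCE[field]
    return min(populated, key=lambda c: (not c.verified, order.index(c.platform)))
```

Because the function returns the winning `SourceValue` rather than the bare value, the canonical record can retain which platform supplied each field, which is useful for lineage and audit.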

We validated the canonical model against the regulatory reporting requirements before building anything. The compliance team provided the full set of MiFID II fields required for transaction and position reporting, and we walked through each field with the data team to confirm that the canonical model could produce it unambiguously from the unified record. That exercise identified two field classes that none of the existing platforms held in a fully compliant format, and those gaps were addressed in the source platforms before the ingestion layer was built, rather than after it was live.
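The field-by-field walkthrough can be thought of as a coverage check run against the model before any pipeline exists. The sketch below shows that idea under stated assumptions: the report field names, the canonical field names, and the schema layout are placeholders, and are not the MiFID II field list or the client's model.

```python
# Placeholder report fields; NOT the regulatory field list.
REQUIRED_REPORT_FIELDS = ["client_identifier", "instrument_identifier", "trade_datetime"]

# Hypothetical mapping from report field to canonical model field.
REPORT_TO_CANONICAL = {
    "client_identifier": "client.national_id",
    "instrument_identifier": "holding.instrument_id",
}

# Hypothetical canonical schema: field name -> whether it may be empty.
CANONICAL_SCHEMA = {
    "client.national_id": {"nullable": True},
    "holding.instrument_id": {"nullable": False},
}


def find_coverage_gaps(required, mapping, schema) -> dict[str, str]:
    """Return each report field the model cannot yet produce unambiguously."""
    gaps = {}
    for report_field in required:
        canonical = mapping.get(report_field)
        if canonical is None:
            gaps[report_field] = "no canonical mapping"
        elif canonical not in schema:
            gaps[report_field] = f"canonical field {canonical!r} not in model"
        elif schema[canonical]["nullable"]:
            gaps[report_field] = f"canonical field {canonical!r} may be empty"
    return gaps


gaps = find_coverage_gaps(REQUIRED_REPORT_FIELDS, REPORT_TO_CANONICAL, CANONICAL_SCHEMA)
```

A check of this kind produces a list of gaps to resolve at design time; an empty result is the condition for starting the build.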

Delivery phases and sequencing

We sequenced the programme around regulatory risk first and adviser experience second. The first delivery phase built the canonical layer and the regulatory reporting pipeline, running both in parallel with the existing manual process for a validation period. This gave the compliance team confidence in the automated output before any reliance was placed on it, and gave the programme the evidence needed to demonstrate to auditors that the new process was operating correctly. The second phase built the adviser-facing client view as the primary consumer of the unified record, and the third phase addressed the data migration and source platform rationalisation.

The parallel running model was non-negotiable. The group could not afford a compliance gap during the transition, and the adviser community would not accept a degraded client view during migration. Running the canonical layer alongside the existing processes for each phase meant there was always a validated fallback, and it gave the operations team a tool for identifying discrepancies before they became regulatory or client-facing issues.

The data migration itself presented the most significant risk management challenge. More than forty thousand duplicate records had been identified during the discovery phase, each requiring a merge decision that balanced the risk of incorrectly collapsing two genuinely separate clients against the risk of carrying forward a duplicate that would corrupt the canonical record. We built a merge decision engine that scored each candidate pair against a set of evidence criteria and routed high-confidence pairs to automated merge, borderline cases to operational review, and confirmed distinct records to a deduplication suppression list. That approach processed the majority of the duplicate population automatically while maintaining human review for the cases where the evidence was genuinely ambiguous.
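The three-way routing described above can be sketched as a weighted evidence score with two thresholds. The evidence fields, weights, and thresholds below are hypothetical and chosen only to make the example concrete; the criteria used on the programme are not reproduced here. Integer points are used so that threshold comparisons are exact.

```python
from enum import Enum


class Route(Enum):
    AUTO_MERGE = "auto_merge"
    REVIEW = "operational_review"
    SUPPRESS = "suppression_list"


# Hypothetical evidence weights, in points out of 100.
WEIGHTS = {"national_id": 50, "date_of_birth": 20, "surname": 15, "postcode": 15}
AUTO_MERGE_AT = 85    # hypothetical: at or above this, merge automatically
DISTINCT_BELOW = 30   # hypothetical: below this, treat the pair as distinct


def score_pair(a: dict, b: dict) -> int:
    """Sum the weights of fields that are populated on both sides and agree."""
    score = 0
    for field, weight in WEIGHTS.items():
        va, vb = a.get(field), b.get(field)
        if va and vb and str(va).strip().lower() == str(vb).strip().lower():
            score += weight
    return score


def route_pair(a: dict, b: dict) -> Route:
    """Route a candidate duplicate pair by the strength of the evidence."""
    score = score_pair(a, b)
    if score >= AUTO_MERGE_AT:
        return Route.AUTO_MERGE
    if score < DISTINCT_BELOW:
        return Route.SUPPRESS
    return Route.REVIEW
```

In this sketch a field that is missing on either side contributes nothing, so sparse records tend to fall into the review band rather than being merged or suppressed on thin evidence.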

Architecture and technology decisions

The canonical layer was designed as an event-driven system from the outset. Each source platform was instrumented to emit change events when client, account, or holding records were created or modified, rather than relying on periodic batch extracts. That design decision had two consequences. First, it meant the canonical record could be updated in near real time, removing the overnight batch lag that had been a persistent complaint from the adviser community. Second, it made the parallel running and reconciliation process tractable: the reconciliation engine compared canonical records to source platform records on each event, surfacing discrepancies as they occurred rather than accumulating them for a periodic review cycle.
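Per-event reconciliation can be illustrated with a small comparison function that runs whenever a change event arrives. This is a simplified sketch: the `ChangeEvent` and `Discrepancy` types are hypothetical, the canonical store is an in-memory dictionary, and the event is assumed to carry field values already translated into canonical field names.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ChangeEvent:
    platform: str
    client_id: str
    fields: dict[str, Any]  # values held by the source platform after the change


@dataclass(frozen=True)
class Discrepancy:
    client_id: str
    platform: str
    field: str
    source_value: Any
    canonical_value: Any


def reconcile_event(event: ChangeEvent, canonical: dict[str, dict[str, Any]]) -> list[Discrepancy]:
    """Compare one event against the canonical record and list any mismatches.

    A client missing from the canonical store yields one discrepancy per field,
    so the gap is surfaced rather than silently ignored.
    """
    record = canonical.get(event.client_id, {})
    return [
        Discrepancy(event.client_id, event.platform, name, value, record.get(name))
        for name, value in event.fields.items()
        if record.get(name) != value
    ]
```

Each returned `Discrepancy` is an exception for operational review, raised at the moment of the change instead of waiting for a periodic cycle.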

The serving layer was built with separation of concerns between the operational and regulatory consumers. The adviser-facing client view was served through a low-latency API designed for interactive use, with a response profile that supported the sub-second page loads the front-office system required. The regulatory reporting pipeline was served through a separate interface optimised for bulk extraction and audit trail generation, with a different caching and consistency model. Keeping those two consumers on separate interfaces meant that regulatory reporting runs did not affect adviser experience and vice versa, and that the two surfaces could evolve at different cadences as requirements changed.

One significant decision was to retain the four source platforms as authoritative systems of record for their respective domains rather than attempting to migrate all data into a single new platform. The group had considered a full platform consolidation but concluded that the programme risk and cost of migrating each acquired firm's operational history was not justified by the data governance benefit, since the canonical layer could provide the unified view needed for regulatory and adviser purposes without requiring a single underlying platform. That decision constrained the architecture but significantly reduced programme risk and allowed the data governance outcomes to be delivered on a shorter timeline.

Measurable outcomes

The regulatory reporting improvement was the most immediately visible result. Moving from a four-day lag to same-day production on MiFID II reporting removed a persistent source of compliance risk and reduced the volume of ad hoc regulatory queries the group was fielding from the FCA. The compliance team described the shift as moving from reactive to proactive: they were no longer spending the first week of each month reconstructing position data for the previous period.

The adviser experience improvement was felt more gradually but ultimately had a larger commercial impact. Advisers could see a complete client picture before meetings rather than having to consult multiple systems or ask operations to pull a summary. Client review preparation time fell materially. In the adviser satisfaction survey conducted six months after go-live, seventy-eight per cent of respondents rated client data quality as good or very good, compared to thirty-one per cent before the programme. Retention conversations with the adviser community in the following review cycle identified the improved tooling as a meaningful factor in the retention decisions of several senior advisers.

The audit finding closure was achieved within the programme period. The external auditors reviewed the canonical layer's architecture, the data quality monitoring controls, and the regulatory reporting pipeline in the year-end audit and closed both prior findings without new recommendations. The Chief Data Officer described that as the first clean data governance audit result in five years.

Lessons learned

The field-level discovery work, which felt slow at the time, was what made everything else possible. Understanding not just that the platforms held client data differently, but precisely how they differed and which version was more reliable under which conditions, was the prerequisite for building a canonical model that genuinely resolved conflicts rather than passing them downstream. Teams that skip this step and go straight to design tend to build canonical layers that surface the same data quality problems in a new location rather than addressing them.

The decision to validate the canonical model against regulatory requirements before building was equally important. The two field classes identified as non-compliant in the source platforms would otherwise not have been caught until the regulatory reporting pipeline was in testing, at which point the fix would have required changes to four live systems under time pressure. Finding them in the design phase meant the source platform changes could be sequenced into the programme without disrupting the delivery timeline.

Finally, the programme reinforced that ownership handover should be planned from the first sprint. The in-house data engineering team was embedded with the delivery squad from the outset, reviewing design decisions, understanding the architecture rationale, and taking progressive responsibility for components as they were delivered. The nine-month handover to full in-house ownership was achievable because it started from day one rather than being treated as a post-delivery phase.

If you are managing a fragmented client data estate and need to address regulatory reporting quality or adviser experience, we would be glad to discuss what a programme like this might look like for your organisation. Email sales@halfteck.com.