
ERP Data Migration: 7-Step Methodology from Cleansing to Validation

Complete ERP data migration guide. 7-step ETL methodology: mapping, cleansing, validation, dry runs, cutover and post-migration controls.

Data migration is the most underestimated workstream in any ERP project. While configuration workshops mobilize teams, training captures management attention, and change management occupies consultants, data quietly sits in a corner, managed by a skeleton crew that discovers the true scope of the problem just three weeks before go-live.

Field experience consistently shows the same pattern: according to a frequently cited Gartner report, 83% of data migration projects exceed their budget or timeline (Oracle, citing Gartner). Bloor Research measured average cost overruns of 30% and schedule delays of 41% in their industry surveys (Bloor Research, via ResearchGate). These figures aren’t inevitable—they reflect a lack of systematic methodology.

This article details a seven-step approach to migrating data from a legacy ERP to a new one, from initial inventory through post-cutover validation.


Why data migration is the most underestimated ERP workstream

The three most common failure patterns

Dirty data. A ten-year-old ERP contains duplicates (the same supplier under three different legal names), orphaned records (order lines pointing to deleted items), inconsistent formats (dates in DD/MM/YYYY in one module, MM/DD/YYYY in another), and null values where the new ERP expects mandatory data. The problem isn’t technical—it’s organizational: no one was ever appointed data quality owner.

Incomplete mapping. The source system has a data model. The target system has another. Between them lies a translation effort that integrators call “mapping.” When this mapping is done in workshops without business involvement (accounting, purchasing, logistics), it misses edge cases: historical item codes, cascading pricing conditions, third parties with multiple delivery addresses. These gaps surface during dry runs, sometimes in production.

Insufficient testing. One dry run before go-live is gambling. Two dry runs are the minimum. Three is the norm in successful projects. Each iteration reveals errors that the previous one did not exercise, because the scenario scope expands as business teams understand what they need to validate.

The true cost of failed migration

A failed migration isn’t measured only in person-days of correction. It’s measured in delayed month-end closes, blocked orders, invoices rejected by customers, and phantom inventory that distorts replenishment. A CFO unable to close for two months after go-live has the entire finance team tied up on data cleansing instead of business steering. A logistics director who discovers 30% duplicate items loses the confidence of the teams on the ground.

Preparation cost is predictable and controllable. Post go-live correction cost never is.


Step 1: Map source and target data landscapes

Business object inventory

The starting point is an exhaustive functional inventory of business objects requiring migration. Not a technical inventory (tables, fields, views), but a functional one: what objects do users manipulate daily that must exist in the new ERP for operations to run on day one?

Typical list for a mid-market industrial or services company:

  • Third parties: customers, prospects, suppliers, subcontractors (with contacts, addresses, payment terms, bank details)
  • Items: product or service references, bills of materials, routings, units of measure, classifications
  • Accounting entries: opening balances, customer and supplier outstanding, fixed assets
  • Inventory: stock quantities by location, lots, serial numbers
  • Open orders: purchase and sales orders, partial deliveries
  • Documents: attachments, scanned contracts, archived delivery notes

Each object must be documented with volume (record count), criticality (day-one blocker or can migrate later), and business owner (who validates quality).

Source-to-target mapping matrix

Mapping is a working document, not a consultant deliverable filed away on a shared drive. It describes field-by-field how source system data transforms to enter the target system.

Quality mapping contains at minimum:

| Element | Content |
| --- | --- |
| Source field | Technical name + functional name + sample value |
| Target field | Technical name in new ERP + expected format |
| Transformation rule | Concatenation, code conversion, default value, lookup in correspondence table |
| Edge cases | What happens if source field is empty? If value doesn’t exist in target table? |
| Validation owner | Business reference who signs off on the rule |

Skipping this upfront work means doing it under pressure during dry runs, with ad hoc decisions that get documented nowhere.
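
To make the matrix concrete, one mapping row can be expressed as data plus a small transformation function. The sketch below is only an illustration: the field names (`CUST_NM`, `PAY_TERM`, `name`, `payment_terms`), the correspondence table, and the default value are hypothetical and do not come from any particular ERP.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical correspondence table: legacy payment-term codes -> target codes.
PAYMENT_TERMS = {"30J": "NET30", "45J": "NET45", "60J": "NET60"}


@dataclass(frozen=True)
class MappingRule:
    source_field: str                # technical name in the legacy system
    target_field: str                # technical name in the new ERP
    transform: Callable[[str], str]  # transformation rule
    default: Optional[str] = None    # edge case: value used when the source is empty
    owner: str = ""                  # business reference who signs off on the rule


def lookup_payment_term(value: str) -> str:
    """Code conversion through the correspondence table; unknown codes are an error."""
    if value not in PAYMENT_TERMS:
        raise ValueError(f"no target payment term for source code {value!r}")
    return PAYMENT_TERMS[value]


# Two illustrative rules: a format normalization and a lookup with a default.
RULES = [
    MappingRule("CUST_NM", "name", lambda v: v.upper(), owner="sales"),
    MappingRule("PAY_TERM", "payment_terms", lookup_payment_term,
                default="NET30", owner="finance"),
]


def apply_mapping(source_row: dict, rules: list) -> dict:
    """Build one target record from one source record, rule by rule."""
    target = {}
    for rule in rules:
        raw = (source_row.get(rule.source_field) or "").strip()
        if raw:
            target[rule.target_field] = rule.transform(raw)
        elif rule.default is not None:
            target[rule.target_field] = rule.default
        else:
            raise ValueError(f"{rule.source_field} is empty and has no default")
    return target
```

Keeping the edge cases (empty source, unknown code) explicit in the rule itself means the decision is recorded with the mapping rather than made ad hoc during a dry run.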


Step 2: Audit and cleanse data before migration

Five problem types to track

A pre-migration data audit looks for five families of problems:

  1. Duplicates. Same customer under three different codes. Same item with an extra space in the description. Two suppliers that are actually the same legal entity after a merger. Deduplication is the most time-consuming but also most valuable workstream: every duplicate eliminated before migration is one less reconciliation after.

  2. Orphaned records. Order line pointing to deleted item. Contact linked to non-existent customer. Stock movement on closed location. Orphans break referential integrity constraints in the new ERP, which is often stricter than the legacy system.

  3. Inconsistent formats. Phone numbers with or without country code. Postal codes with 4 digits in a field expecting 5. Dates in three different formats depending on source module. Amounts with comma or decimal point depending on workstation locale.

  4. Obsolete data. Items not ordered in five years. Suppliers in bankruptcy. Prospects never converted since 2018. Migrating this data pollutes the new ERP from day one. The “migrate or archive” decision must be made by business teams, not IT.

  5. Empty mandatory fields. The new ERP requires a VAT code on every supplier; the legacy system did not. Result: 40% of supplier records have no VAT code. These records must be enriched before migration, not after.
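
Three of these checks (duplicates, orphans, empty mandatory fields) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: records are plain dictionaries, the keys (`code`, `name`, `item_code`) are hypothetical, and the name normalization only handles ASCII names and a short, arbitrary list of legal suffixes. Real deduplication usually needs fuzzy matching and a human decision on ambiguous cases.

```python
import re
from collections import defaultdict

# Assumption: a short, illustrative list of legal-form suffixes to ignore.
LEGAL_SUFFIXES = {"sa", "sas", "sarl", "ltd", "inc", "gmbh"}


def normalize_name(name: str) -> str:
    """Lowercase, drop dots and punctuation, remove legal suffixes, collapse spaces."""
    cleaned = name.lower().replace(".", "")
    cleaned = re.sub(r"[^a-z0-9 ]", " ", cleaned)
    words = [w for w in cleaned.split() if w not in LEGAL_SUFFIXES]
    return " ".join(words)


def find_duplicates(suppliers: list) -> dict:
    """Group supplier codes that share the same normalized name."""
    groups = defaultdict(list)
    for supplier in suppliers:
        groups[normalize_name(supplier["name"])].append(supplier["code"])
    return {key: codes for key, codes in groups.items() if len(codes) > 1}


def find_orphans(order_lines: list, items: list) -> list:
    """Return order lines whose item code no longer exists in the item master."""
    known_codes = {item["code"] for item in items}
    return [line for line in order_lines if line["item_code"] not in known_codes]


def find_missing(records: list, field: str) -> list:
    """Return the codes of records whose mandatory field is empty or absent."""
    return [r["code"] for r in records if not (r.get(field) or "").strip()]
```

The output of each function is a worklist for the business owner, not an automatic correction: deciding which duplicate survives remains a business call.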

Data profiling and quality tools

Tooling choice depends on volume and complexity:

  • Custom SQL scripts: sufficient for SMEs with limited scope (under 100,000 records, 3-4 business objects). The IT lead writes counting queries, duplicate detection (GROUP BY + HAVING COUNT(*) > 1), and format checks (REGEXP). Advantage: zero license cost. Disadvantage: no traceability and no integrated correction workflow.

  • ETL tools with integrated profiling: Talend Data Quality, Informatica Data Quality, Microsoft SSIS + DQS. These tools offer automatic profiling (value distribution, fill rates, anomaly detection) and standardization rules (address normalization, fuzzy matching for deduplication). Suitable for mid-market companies with large volumes or heterogeneous sources.

  • Native ERP modules: SAP Information Steward (for S/4HANA migrations), Dynamics 365 data import tools, Odoo import assistant. Less powerful than dedicated tools but integrated into loading workflow.

Regardless of approach, profiling must produce a quantified report: fill rate by field, detected duplicate count, aberrant value distribution. This report is the data steward’s dashboard throughout the cleansing phase.
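
As a rough idea of what such a report computes, the sketch below derives a fill rate per field and counts values that fail a format check. It assumes records are dictionaries of string values; the field names and the 5-digit postal code pattern are illustrative assumptions, not a rule from any specific ERP.

```python
import re
from collections import Counter

# Assumption: the target field expects exactly five digits.
POSTAL_CODE = re.compile(r"\d{5}")


def fill_rates(records: list, fields: list) -> dict:
    """Return the fill rate of each field as a fraction between 0 and 1."""
    total = len(records)
    report = {}
    for field in fields:
        filled = sum(1 for r in records if str(r.get(field) or "").strip())
        report[field] = filled / total if total else 0.0
    return report


def invalid_values(records: list, field: str, pattern: re.Pattern) -> Counter:
    """Count the non-empty values of a field that do not match the expected format."""
    values = (str(r.get(field) or "").strip() for r in records)
    return Counter(v for v in values if v and not pattern.fullmatch(v))
```

Running the same functions after each cleansing pass gives the data steward a trend rather than a one-off snapshot.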


Step 3: Design ETL pipeline (Extract, Transform, Load)

ETL tool vs manual scripts vs native connectors

The ETL acronym covers the three phases of the transfer: extraction from the source system, transformation according to the mapping rules, and loading into the target system. The choice of approach depends on several factors.

Manual scripts (Python, SQL, PowerShell). For small migrations (single source, tens of thousands of records, simple mapping). Advantage: total flexibility, no licensing. Risk: no standardized error handling, no automatic replay, and documentation that is often missing.

Professional ETL tools (Talend, Informatica PowerCenter, Microsoft SSIS, Apache NiFi). For complex migrations with multiple sources, cascading transformations, and traceability requirements. These tools offer a visual flow designer, reject handling (error records routed to a quarantine table), and usable logs. As a rule of thumb, license cost is justified once volume exceeds 200,000 records or there are more than three source systems.

Native ERP connectors. SAP Migration Cockpit, Dynamics 365 Data Entities, Odoo CSV import files. These connectors cover standard objects (customers, items, opening entries) but show limitations on custom objects, large histories, or complex transformations. They’re good complements, rarely sufficient alone.
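
Teams that go the manual-script route can still borrow the reject-handling pattern of ETL tools. The sketch below is a minimal version, assuming rows arrive as dictionaries (for example from `csv.DictReader`) and that `transform` and `load` are functions supplied by the project; both names are placeholders, not part of any library.

```python
def load_with_quarantine(rows, transform, load):
    """Transform and load each row; route failures to a quarantine list with a reason."""
    loaded = 0
    quarantine = []
    for line_no, row in enumerate(rows, start=1):
        try:
            load(transform(row))
            loaded += 1
        except (KeyError, ValueError) as exc:
            # Keep the original row and the reason so the reject can be fixed and replayed.
            quarantine.append({
                "line": line_no,
                "row": dict(row),
                "reason": f"{type(exc).__name__}: {exc}",
            })
    return loaded, quarantine
```

For instance, with a hypothetical two-column extract:

```python
import csv
import io

source = io.StringIO("code,qty\nA1,10\nA2,abc\nA3,5\n")
target = []  # stands in for the target system


def to_target(row):
    return {"code": row["code"], "qty": int(row["qty"])}


loaded, rejected = load_with_quarantine(csv.DictReader(source), to_target, target.append)
# The second data row fails int() and lands in the quarantine list; the other two load.
```

One bad record no longer stops the run, and every reject carries enough context to be corrected and replayed.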

Managing complex transformations

Certain transformations deserve special attention:

Item codes. The legacy ERP uses free alphanumeric coding (“CABLE-COP-3x2.5-100M”), while the new one enforces a structured 12-digit code. The correspondence table must be built with purchasing and production, not by IT alone.

Multi-currency. Foreign currency entries must migrate with their historical exchange rate, not the current rate. The mapping must specify which rate table to use and how to handle conversion differences.

Historical data. Migrate five years of accounting entries in detail, or only opening balances? Migrate order history, or only open orders? Each choice has a cost (data volume, loading time, test complexity) and a value (year-over-year comparability, audit traceability). The trade-off is a business decision, not a technical one.
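
The multi-currency rule lends itself to a short illustration. The sketch below converts an entry with the rate in force on its own date and fails loudly when that rate is missing, instead of silently falling back to today's rate. The rate table, its structure, and the rate values are invented for the example; conversion differences are not handled here.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP

# Hypothetical historical rate table: (currency, date) -> local currency per unit.
# The values are illustrative, not real market rates.
RATES = {
    ("USD", date(2023, 3, 15)): Decimal("0.9432"),
    ("USD", date(2024, 3, 15)): Decimal("0.9187"),
}


def to_local(amount: Decimal, currency: str, entry_date: date, rates: dict) -> Decimal:
    """Convert with the rate of the entry date; raise if no historical rate exists."""
    key = (currency, entry_date)
    if key not in rates:
        raise LookupError(f"no {currency} rate for {entry_date.isoformat()}")
    # Decimal avoids binary floating-point drift on monetary amounts.
    return (amount * rates[key]).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

Using `Decimal` rather than `float` matters here because the financial reconciliation later in the process is expected to match to the penny.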


Step 4: Execute dry runs and validate

Rehearsal planning

A dry run is a dress rehearsal of the migration, executed in a test environment with real data (or a representative sample). Its purpose isn’t to “see if it works” but to measure what doesn’t work and correct it before the actual cutover.

Dry run 1: Technical validation. Does the ETL pipeline run end-to-end without blocking errors? Are volumes consistent (records loaded vs extracted)? Are loading times compatible with the planned cutover window? This first pass reveals gross mapping errors, format issues, and unanticipated integrity constraints.

Dry run 2: Functional validation. Business references verify that migrated data is usable. Can sales find customers with correct payment terms? Can accounting find opening balances? Can logistics find inventory by location? This second pass mobilizes business teams for two to five days.

Dry run 3: Dress rehearsal. Executed in conditions as close as possible to the actual cutover: same time window, same loading sequence, same post-migration checks. This tests the procedure, not just the data. If dry run 3 passes without a major incident, the final migration go/no-go can be pronounced.

Validation strategy

Validation relies on three complementary mechanisms:

  • Count reconciliation. Records extracted vs loaded, by business object. Any variance must be explained (rejects, filters, voluntary deduplication).

  • Financial reconciliation. Opening balance totals, customer and supplier outstanding, valued inventory. Financial controlling must validate that the new ERP totals match the legacy system, to the penny.

  • User testing on sample. Business references open 20-50 random records and visually verify information is complete and correct. This sampling control detects systematic errors (field shifted one column in mapping, transformation rule that swaps two codes).
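
The first two mechanisms are easy to script. The sketch below assumes the counts and totals have already been collected into dictionaries keyed by business object or ledger; the key names are placeholders, and `explained` holds the variances already justified (rejects, filters, voluntary deduplication).

```python
from decimal import Decimal


def reconcile_counts(extracted: dict, loaded: dict, explained: dict) -> dict:
    """Return the unexplained count variance per business object (empty dict = clean)."""
    variances = {}
    for obj, n_extracted in extracted.items():
        gap = n_extracted - loaded.get(obj, 0) - explained.get(obj, 0)
        if gap != 0:
            variances[obj] = gap
    return variances


def reconcile_totals(legacy: dict, target: dict) -> dict:
    """Return target-minus-legacy differences for every total that does not match exactly."""
    differences = {}
    for key, legacy_amount in legacy.items():
        target_amount = target.get(key, Decimal("0"))
        if target_amount != legacy_amount:
            differences[key] = target_amount - legacy_amount
    return differences
```

An empty result from both functions is the condition for sign-off; anything else is a list of variances to explain before the next dry run.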


Step 5: Final migration and cutover

Cutover window: Transaction freeze and sequencing

Cutover is when data permanently leaves the legacy system to feed the new one. It’s an irreversible operation in practice (even if rollback is theoretically possible), and must be planned with surgical precision.

Transaction freeze. During the cutover window, no transactions may be entered in the legacy system. This means: no orders, no receipts, no accounting entries. Freeze duration depends on data volume and ETL pipeline speed. For a typical mid-market company, plan for 24 to 72 hours. This freeze must be planned with business teams (extended weekend, slow period, end of a closed accounting month).

Loading sequence. The loading order respects object dependencies: first reference data (chart of accounts, units of measure, currencies), then third parties (customers, suppliers), then items, then open transactional data (open orders, inventory), and finally opening accounting entries. Loading orders before items creates orphans from the very first load.
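
One way to keep the loading sequence honest is to declare the dependencies and let the standard library derive an order. The sketch below uses `graphlib.TopologicalSorter` (Python 3.9+); the dependency graph is an illustrative reading of the sequence described above, not a definitive model.

```python
from graphlib import TopologicalSorter

# Each object maps to the set of objects that must be loaded before it.
# Illustrative graph; a real project would derive it from the mapping matrix.
DEPENDENCIES = {
    "reference_data": set(),
    "third_parties": {"reference_data"},
    "items": {"reference_data"},
    "open_orders": {"third_parties", "items"},
    "inventory": {"items"},
    "opening_entries": {"reference_data", "third_parties"},
}


def loading_sequence(dependencies: dict) -> list:
    """Return a load order in which every object comes after its prerequisites."""
    # static_order() raises graphlib.CycleError if the dependencies are circular.
    return list(TopologicalSorter(dependencies).static_order())
```

A topological sort only guarantees that prerequisites come first; several valid orders may exist, and the sequence given in the text is one of them. Declaring the graph explicitly also makes it harder to add a new object to the cutover plan without stating what it depends on.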

Rollback plan in case of failure

A rollback plan is not a luxury; it is mandatory. It must answer three questions:

  1. When do we decide to roll back? Define post-load go/no-go criteria: accounting variance above threshold, reject rate above 5%, inability to validate a critical process (invoicing, closing).

  2. How do we roll back? Is the legacy system still operational? Can entries made in the new system during the first hours of use be recovered? A snapshot of the legacy system taken just before cutover is the minimal safety net.

  3. How long does rollback take? If the answer is “48 hours” and the business cannot afford 48 hours without an ERP, a plan B is needed (degraded operation, manual entry, partial cutover).
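
The go/no-go criteria from the first question can be written down as an explicit function before cutover weekend, so the decision does not get renegotiated at 3 a.m. The sketch below is an assumption-laden illustration: the 5% reject rate comes from the text, while the zero-variance default, the class, and the field names are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class CutoverResult:
    accounting_variance: Decimal  # new ERP total minus legacy total
    records_extracted: int
    records_rejected: int
    critical_processes: dict      # process name -> validated (True/False)


def rollback_reasons(result: CutoverResult,
                     max_variance: Decimal = Decimal("0.00"),
                     max_reject_rate: float = 0.05) -> list:
    """Return the no-go criteria that were hit; an empty list means go."""
    reasons = []
    if abs(result.accounting_variance) > max_variance:
        reasons.append(
            f"accounting variance {result.accounting_variance} exceeds {max_variance}")
    if result.records_extracted == 0:
        reasons.append("no records were extracted")
    else:
        rate = result.records_rejected / result.records_extracted
        if rate > max_reject_rate:
            reasons.append(f"reject rate {rate:.1%} exceeds {max_reject_rate:.0%}")
    for process, validated in result.critical_processes.items():
        if not validated:
            reasons.append(f"critical process not validated: {process}")
    return reasons
```

Returning the list of reasons rather than a bare boolean gives the cutover manager something to put in the decision minutes.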


Step 6: Post-migration controls and stabilization

Automated reconciliations

The first 48 hours after cutover are critical. A set of automated controls must run continuously:

  • Record counts by business object, compared to final extraction counts
  • Financial totals: general ledger, customer subsidiary ledger, supplier subsidiary ledger, valued inventory
  • Referential integrity: no orders pointing to non-existent items, no entries pointing to deleted accounts
  • Mandatory field completeness: no third parties without VAT code, no items without unit of measure

These controls must be scripted and executable with one click, not performed manually in spreadsheets. The objective is to detect issues in minutes, not days.
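
A scripted control set can be as simple as a dictionary of named SQL queries, each returning the number of offending rows. The sketch below uses `sqlite3` from the standard library so that it runs anywhere; the table and column names are hypothetical, and against a real ERP the same idea would go through that system's database driver or reporting API.

```python
import sqlite3

# Each control is a query returning the number of offending rows (0 = pass).
# Table and column names are illustrative assumptions.
CONTROLS = {
    "order_lines_without_item": """
        SELECT COUNT(*) FROM order_lines ol
        LEFT JOIN items i ON i.code = ol.item_code
        WHERE i.code IS NULL
    """,
    "third_parties_without_vat": """
        SELECT COUNT(*) FROM third_parties
        WHERE vat_code IS NULL OR TRIM(vat_code) = ''
    """,
    "items_without_unit": """
        SELECT COUNT(*) FROM items
        WHERE unit IS NULL OR TRIM(unit) = ''
    """,
}


def run_controls(conn: sqlite3.Connection, controls: dict) -> dict:
    """Run every control query and return the offending-row count per control."""
    return {name: conn.execute(sql).fetchone()[0] for name, sql in controls.items()}


def failed_controls(results: dict) -> list:
    """Return the names of controls that found at least one offending row."""
    return sorted(name for name, count in results.items() if count > 0)
```

Because the controls are data rather than code, adding a new check during hypercare means adding one entry to the dictionary.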

Step 7: Hypercare period and correction process

Hypercare is the period of enhanced support immediately following go-live, typically four to twelve weeks. During this period:

  • A single help desk centralizes user reports (dedicated email, Teams/Slack channel, ticketing tool). Each report is classified: migration error (missing or incorrect data), configuration error (system works but not as expected), or user error (additional training needed).

  • The data steward (or data reference) handles ongoing data corrections. This role, often filled by a key business user, is the contact point between the functional teams and the technical team.

  • A weekly review committee tracks stabilization indicators: open ticket count, month-end close timeline, inbound and outbound invoice reject rate, inventory variances found during cycle counts.

Hypercare ends when the indicators return to a level comparable to that of the legacy system. Not before.


ERP data migration checklist recap

| Step | Deliverables | Validation criteria |
| --- | --- | --- |
| 1. Mapping | Business object inventory, mapping matrix | Mapping signed by each business reference |
| 2. Audit and cleansing | Profiling report, deduplication plan | Duplicate rate reduced below 2%, mandatory fields 100% filled |
| 3. ETL pipeline | Developed and documented ETL flows, code correspondence table | Pipeline executable end-to-end without blocking error |
| 4. Dry runs | 3 documented dry runs, business validation minutes | Financial variances under 0.01%, reject rate under 1% |
| 5. Cutover | Timed cutover procedure, tested rollback plan | Cutover executed within planned window, rollback tested |
| 6. Post-migration controls | Reconciliation scripts, stabilization dashboard | Zero financial variance, confirmed referential integrity |
| 7. Hypercare | Operational help desk, review committee | Indicators returned to pre-migration level within 4-12 weeks |

Big-bang vs incremental migration: Which approach?

Two main approaches coexist, each with advantages and disadvantages.

Big-bang migration cuts over all data in a single operation, during a weekend or shutdown period. Advantage: a single outage, a single coordination effort, a single set of post-migration controls. Disadvantage: if cutover fails, the rollback covers the entire scope. This is the most common approach for single-site mid-market companies.

Incremental (or phased) migration cuts over data by entity, site, or module. Advantage: each wave is smaller and easier to test and correct. Disadvantage: source and target systems coexist for weeks or months, which requires temporary synchronization interfaces and a managed dual-entry period. This is the preferred approach for multi-site groups or very large migrations.

The choice depends on site count, data volume, business tolerance for outages, and the project team’s capacity to manage the complexity of temporary coexistence.


Key data migration roles

Three roles are essential and often missing from the project organization:

Data steward. Neither purely IT nor purely business: a hybrid profile who understands both the data model and the business rules. The data steward drives cleansing, decides ambiguous deduplication cases, and validates transformation rules. Without one, mapping decisions get made by an integrator who doesn’t know the history of the data.

Business reference. One per functional domain (finance, purchasing, logistics, sales). The business reference validates that migrated data matches business reality and signs the dry run minutes. This is the last line of defense before go-live.

Cutover manager. Orchestrates the cutover like a conductor: manages sequencing, monitors loading times, triggers controls, pronounces the go/no-go. This role often falls to the ERP project manager, but it deserves a dedicated owner on large migrations.


Don’t forget these items

Points that project teams discover too late:

  • Attachments. Scanned contracts, signed delivery notes, product photos. They are not in the business tables; they sit in blob storage or a network directory. They must be migrated too, or the loss of access must be accepted.

  • Workflows and electronic signatures. A purchase order approved in the legacy system no longer has “approved” status in the new system. Rebuild the validation history or start fresh?

  • Access rights. Mapping user profiles between the legacy and new systems is rarely a copy-paste exercise. Roles have changed, modules have been reorganized, and segregation of duties (SoD) rules differ.

  • Historical data trade-off. Migrating five years of closed orders has a cost (volume, loading time, test complexity) and a value (comparative reporting, audit traceability). If the value doesn’t justify the cost, archive to a data warehouse or a dedicated consultation tool, and migrate only live data to the ERP.


For deeper exploration, consult our complete ERP migration guide covering the entire system change project, our ERP implementation guide 2026 for overall project methodology, and our ERP testing and UAT checklist to structure the validation phase that follows data migration.