ERP Business Continuity and Disaster Recovery: Complete BCDR Guide

Complete guide to ERP business continuity and disaster recovery planning. RTO, RPO, testing methodologies, and compliance with proven strategies.

Your ERP system crashes on a Monday morning at 8:30 AM. Orders stop flowing, billing freezes, procurement goes blind. How long can your business survive? According to the Uptime Institute’s Annual Outage Analysis 2024, 54% of major outages cost over $100,000, and nearly one in five exceeds $1 million. For an SME where the ERP is the central nervous system, every hour of downtime directly translates to lost revenue and operational chaos.

Business Continuity Planning (BCP) and Disaster Recovery Planning (DRP) are no longer luxuries reserved for large corporations. With regulations such as the NIS2 directive requiring full compliance by October 17, 2026, these measures are becoming regulatory obligations for a growing number of European businesses.

This guide provides a concrete methodology for building BCP and DRP plans tailored to your ERP, with governance models, key metrics, and realistic implementation timelines.

BCP and DRP: Two complementary systems, not interchangeable

The confusion between BCP and DRP is common, even among experienced IT directors. This leads to mismatched architecture choices and poorly calibrated budgets.

BCP maintains operations without interruption

Business Continuity Planning aims to prevent any service disruption. It relies on redundancy: mirrored servers, real-time database replication, automatic failover to secondary sites. End users should ideally not perceive the incident.

In an ERP context, typical BCP involves:

  • high-availability infrastructure (active/active or active/passive clusters)
  • synchronous ERP database replication
  • automatic network failover (DNS failover, load balancer)
  • documented degraded-mode procedures for critical processes (manual order entry, interim invoices)

DRP restarts after an interruption

Disaster Recovery Planning comes into play after an interruption. Operations have stopped, systems are unavailable, and DRP defines how and how quickly we return to normal operation.

In an ERP context, DRP includes:

  • regular backups (full and incremental) stored off-site
  • pre-configured recovery environment (backup server, cloud VMs, infrastructure as code)
  • tested restoration procedure with guaranteed recovery times
  • catch-up plan for data lost between the last backup and the incident

How to combine the two for your ERP

In practice, both are necessary. BCP covers minor to moderate incidents (server failure, disk corruption, local network outage). DRP takes over when BCP has failed or when the incident is major (site destruction, widespread ransomware, natural disaster).

Criteria | BCP | DRP
Objective | Zero interruption | Quick recovery
Trigger | Automatic | Manual or semi-automatic
Target data loss | None (RPO = 0) | Minutes to hours (RPO > 0)
Infrastructure cost | High (redundancy) | Moderate (cold/warm backup)
Testing complexity | High | Medium

RTO and RPO: The two metrics that drive everything

Before designing your system, you must answer two fundamental questions.

RTO: How long can you afford to be down?

The Recovery Time Objective (RTO) is the maximum acceptable interruption duration for each business process supported by the ERP.

An RTO of 4 hours means your ERP must be operational again within 4 hours of the incident, regardless of its nature. This figure isn’t decreed arbitrarily: it’s calculated based on the financial impact of the outage.

RTO calculation method:

  1. Identify critical ERP processes (order entry, billing, production, payroll)
  2. Estimate hourly cost of interruption per process (lost revenue, contractual penalties, unproductive labor costs)
  3. Determine the threshold beyond which impact becomes unacceptable
  4. This threshold gives your target RTO

For a 100-employee industrial SME, a complete ERP outage typically costs between €5,000 and €20,000 per hour (production stoppage + salaries + delay penalties). The acceptable RTO often ranges from 2 to 8 hours.
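The calculation method above can be sketched in a few lines of Python. This is a minimal illustration rather than a costing model: the process names, the hourly costs, and the €40,000 tolerable-loss threshold are hypothetical values chosen to sit within the ranges quoted above.

```python
# Hypothetical hourly cost of an ERP outage per process, in euros (step 2).
HOURLY_COST = {
    "order_entry": 20_000,
    "production": 10_000,
    "billing": 5_000,
}

# Assumed loss beyond which the impact is judged unacceptable (step 3).
MAX_TOLERABLE_LOSS = 40_000


def target_rto_hours(hourly_cost: float, max_loss: float) -> float:
    """Hours of downtime after which the cumulative loss reaches max_loss."""
    if hourly_cost <= 0:
        raise ValueError("hourly_cost must be positive")
    return max_loss / hourly_cost


for process, cost in HOURLY_COST.items():
    rto = target_rto_hours(cost, MAX_TOLERABLE_LOSS)
    print(f"{process}: target RTO = {rto:.1f} h")
```

With these assumed inputs, the targets land between 2 and 8 hours. The most expensive process to interrupt gets the shortest RTO.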

RPO: How much data can you afford to lose?

The Recovery Point Objective (RPO) is the maximum amount of data you can accept losing, measured as the time elapsed since the last usable backup.

An RPO of 1 hour means that, in the event of an incident, you lose at most the last hour of ERP transactions. An RPO of 0 (synchronous replication) means no data loss, but at a significantly higher cost.

RPO decision matrix by ERP module:

ERP Module | Recommended RPO | Justification
Accounting / Finance | 15 min to 1 h | Critical financial data, audit constraints
Production / MES | 1 h to 4 h | Reproducible manufacturing orders, often independent IoT sensors
CRM / Sales | 1 h to 2 h | Customer orders to re-enter if lost
HR / Payroll | 4 h to 24 h | Monthly payroll processing, less volatile data
Purchasing / Inventory | 1 h to 4 h | Inventory movements reconcilable with physical inventory
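The matrix above lends itself to a quick check of a backup schedule. The sketch below assumes that, in the worst case, an incident strikes just before the next backup, so the data lost equals one full backup interval. The module identifiers are illustrative, and the limits are the upper bounds of the ranges in the matrix.

```python
# Upper bound of the recommended RPO per module, in minutes (from the matrix above).
RPO_LIMIT_MIN = {
    "accounting_finance": 60,
    "production_mes": 240,
    "crm_sales": 120,
    "hr_payroll": 1440,
    "purchasing_inventory": 240,
}


def modules_exceeding_rpo(backup_interval_min: int) -> list[str]:
    """Return the modules whose RPO limit is shorter than the backup interval.

    Worst case, the incident happens right before the next backup, so
    the data lost corresponds to one full backup interval.
    """
    return sorted(
        module
        for module, limit in RPO_LIMIT_MIN.items()
        if backup_interval_min > limit
    )


# Backups every 4 hours: which modules would miss their RPO?
print(modules_exceeding_rpo(240))
```

With a 4-hour interval, accounting and CRM fall outside their recommended RPO, which suggests a tighter schedule or replication for those modules only.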

Building your DRP/BCP in 6 steps

Step 1: Map your ERP dependencies

Your ERP doesn’t operate in isolation. It depends on dozens of technical and organizational components. Before designing any recovery solution, map:

Technical dependencies:

  • Application server(s) and database(s)
  • Interconnections with other systems (supplier EDI, banking, e-commerce, external CRM)
  • Network infrastructure (site-to-site VPN, remote access, firewall)
  • Third-party components (email server, e-signature platform, geolocation service)

Human dependencies:

  • ERP administrator (internal or external)
  • DBA (database administrator)
  • Infrastructure team / hosting provider
  • ERP integrator (for incidents requiring business expertise)

Represent these dependencies in a diagram and assign each a criticality level (vital, important, secondary). This mapping will serve as the basis for sizing the budget and prioritizing efforts.
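Before drawing the diagram, the inventory can be kept as structured data. The sketch below is one possible shape, assuming a flat list is enough to start with. The entries and their criticality levels are hypothetical examples, not a recommended classification.

```python
from dataclasses import dataclass

# Criticality levels named in the text, ordered from most to least critical.
CRITICALITY_ORDER = {"vital": 0, "important": 1, "secondary": 2}


@dataclass(frozen=True)
class Dependency:
    name: str
    kind: str          # "technical" or "human"
    criticality: str   # "vital", "important", or "secondary"


# Hypothetical inventory; the classification is an example only.
DEPENDENCIES = [
    Dependency("e-signature platform", "technical", "secondary"),
    Dependency("ERP database", "technical", "vital"),
    Dependency("supplier EDI", "technical", "important"),
    Dependency("DBA", "human", "vital"),
    Dependency("ERP integrator", "human", "important"),
]


def by_priority(deps: list[Dependency]) -> list[Dependency]:
    """Sort dependencies so that the most critical come first."""
    return sorted(deps, key=lambda d: (CRITICALITY_ORDER[d.criticality], d.name))


for dep in by_priority(DEPENDENCIES):
    print(f"{dep.criticality:<10} {dep.kind:<10} {dep.name}")
```

Sorting by criticality gives a first-pass order for budget and effort: the vital items at the top are the ones the recovery design must cover first.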

Step 2: Define disaster scenarios

Don’t prepare a generic plan. Prepare responses to specific scenarios:

Technical scenarios:

  • Main ERP server failure (hardware or hypervisor)
  • Database corruption (human error, application bug)
  • Ransomware encrypting the ERP server and local backups
  • Cloud hosting provider outage (zone or datacenter unavailability)

Environmental scenarios:

  • Loss of primary site (fire, flood, extended power outage)
  • Extended internet outage (cloud ERP unavailability)
  • Failure of critical service provider (integrator in liquidation, acquired publisher)

For each scenario, document the estimated probability, business impact (in hours of downtime and euros), and response strategy (BCP, DRP, or manual degraded procedure).
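One way to compare scenarios is to weight the impact by the probability. The sketch below ranks scenarios by expected annual loss. All probabilities, downtime figures, and the €10,000 hourly cost are assumptions for illustration and should be replaced by your own estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    annual_probability: float  # estimated chance of occurring in a given year
    downtime_hours: float      # expected ERP downtime if it occurs
    strategy: str              # "BCP", "DRP", or "manual"


# Assumed cost of one hour of ERP downtime, in euros.
HOURLY_COST = 10_000

# Hypothetical estimates for three of the scenarios listed above.
SCENARIOS = [
    Scenario("Main ERP server failure", 0.20, 4, "BCP"),
    Scenario("Ransomware on ERP and local backups", 0.05, 72, "DRP"),
    Scenario("Extended internet outage", 0.10, 12, "manual"),
]


def expected_annual_loss(s: Scenario, hourly_cost: float) -> float:
    """Probability-weighted yearly cost of a scenario, in euros."""
    return s.annual_probability * s.downtime_hours * hourly_cost


ranked = sorted(
    SCENARIOS,
    key=lambda s: expected_annual_loss(s, HOURLY_COST),
    reverse=True,
)
for s in ranked:
    loss = expected_annual_loss(s, HOURLY_COST)
    print(f"{s.name}: {loss:,.0f} EUR/year ({s.strategy})")
```

In this made-up example, the rare ransomware scenario outranks the frequent server failure because its downtime is so much longer. That is the kind of result a generic plan tends to miss.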

Step 3: Choose recovery architecture

The choice comes down to a trade-off between cost and recovery time:

Cold recovery site

  • Basic infrastructure, no data loaded
  • RTO: 24 to 72 h
  • Annual cost: €5,000 to €15,000
  • Suitable for non-critical ERPs where a 24 h RPO is acceptable

Warm recovery site

  • Pre-configured servers, daily data replication
  • RTO: 4 to 12 h
  • Annual cost: €15,000 to €40,000
  • Good compromise for most SMEs

Hot recovery site

  • Real-time replication, automatic failover
  • RTO: minutes to 1 h
  • Annual cost: €40,000 to €100,000+
  • Essential for critical ERPs (continuous production, 24/7 e-commerce)

Cloud DRaaS (Disaster Recovery as a Service)

  • On-demand recovery infrastructure from public cloud (Azure Site Recovery, AWS Elastic Disaster Recovery, OVHcloud DRP)
  • RTO: 1 to 4 h depending on configuration
  • Annual cost: €10,000 to €30,000 (pay-as-you-use)
  • Increasingly popular option as it eliminates initial hardware investment
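The trade-off between the four options can be expressed as a simple filter. The sketch below takes the upper bound of each RTO range and the lower bound of each cost range from the figures above, then picks the cheapest option that fits a target RTO. Comparing on the lower cost bound alone is a simplifying assumption. A real decision would also weigh RPO, initial investment, and operational constraints.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Architecture:
    name: str
    worst_case_rto_h: float   # upper bound of the RTO range quoted above
    min_annual_cost_eur: int  # lower bound of the annual cost range


OPTIONS = [
    Architecture("cold site", 72, 5_000),
    Architecture("warm site", 12, 15_000),
    Architecture("hot site", 1, 40_000),
    Architecture("cloud DRaaS", 4, 10_000),
]


def cheapest_meeting_rto(target_rto_h: float) -> Optional[Architecture]:
    """Cheapest option whose worst-case RTO fits within the target, if any."""
    candidates = [a for a in OPTIONS if a.worst_case_rto_h <= target_rto_h]
    return min(candidates, key=lambda a: a.min_annual_cost_eur, default=None)


for target in (72, 8, 2):
    choice = cheapest_meeting_rto(target)
    print(f"target RTO {target} h -> {choice.name if choice else 'no option'}")
```

A target below 1 hour returns no option here, because even the hot site range tops out at 1 hour. In that situation the RTO calls for a BCP-style high-availability design rather than a recovery site.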

Step 4: Document failover procedures

An ERP failover procedure must be executable by someone who isn’t the usual expert (because that expert may be unavailable on the day of the disaster). The document must contain:

For each disaster scenario:

  1. Trigger criteria (when do we failover?)
  2. Decision chain (who authorizes the failover?)
  3. Step-by-step technical actions (with screenshots and exact commands)
  4. Post-failover verifications (validation checklist: user access, data integrity, active interfaces)
  5. Return-to-normal procedure (failback)
  6. Internal and external communication (who notifies whom, with what message)

Critical point: the failback procedure (return to primary site after incident) is often neglected. Yet the return is at least as risky as the initial failover, as data modified during the degraded operation period must be resynchronized.

Step 5: Test, test, test again

An untested DRP is a DRP that doesn’t work. ANSSI explicitly recommends regular testing, and the NIS2 directive requires essential and important entities to assess the effectiveness of their risk management measures, which in practice implies regular testing.

Test types by maturity level:

Level | Test type | Frequency | What it validates
1 | Tabletop review | Quarterly | Procedures are current, responsible parties know their role
2 | Isolated restoration test | Semi-annual | Backups are usable, technical RTO is met
3 | Partial failover (1 module) | Annual | Recovery site functions on reduced scope
4 | Complete failover (entire ERP) | Annual | Actual RTO and RPO match objectives
5 | Crisis exercise (with business units) | Annual | Business teams know how to work in degraded mode

Golden rule: each test must produce a written report with identified gaps, measured times, and a corrective action plan. Without this formalized feedback, the test has no lasting value.
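The core of such a report is a comparison between objectives and measurements. The sketch below shows one possible shape for it. The class name `DrillResult` and the figures are hypothetical, and a real report would also record the gaps' causes and the corrective actions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DrillResult:
    metric: str         # "RTO" or "RPO"
    objective_h: float  # target agreed with management, in hours
    measured_h: float   # value observed during the test, in hours

    @property
    def gap_h(self) -> float:
        """Hours by which the measurement exceeds the objective (0 if met)."""
        return max(0.0, self.measured_h - self.objective_h)


def report(results: list[DrillResult]) -> list[str]:
    """One line per metric, flagging gaps that need a corrective action."""
    lines = []
    for r in results:
        status = "OK" if r.gap_h == 0 else f"GAP +{r.gap_h:.1f} h"
        lines.append(
            f"{r.metric}: objective {r.objective_h} h, "
            f"measured {r.measured_h} h -> {status}"
        )
    return lines


# Hypothetical figures from a complete failover test.
results = [DrillResult("RTO", 4, 5.5), DrillResult("RPO", 1, 0.5)]
print("\n".join(report(results)))
```

Here the RPO objective is met but the measured RTO overshoots by 1.5 hours, so the report would need a corrective action for the restoration time.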

Step 6: Maintain the system over time

DRP/BCP is a living document. It must be revised:

  • with each ERP version upgrade (new module, cloud migration, hosting provider change)
  • with each infrastructure change (new server, new network, new interconnection)
  • with each organizational evolution (new site, acquisition, outsourcing)
  • at least once a year, even without apparent changes

Designate a DRP/BCP owner (often the CIO or CISO) with a clear mandate and a dedicated budget. A plan without an owner is a dead plan.

DRP/BCP and cloud ERP: What changes

If your ERP is hosted as SaaS (Odoo Online, SAP S/4HANA Cloud Public, NetSuite, Sage Intacct), part of the DRP/BCP responsibility lies with the publisher. But be careful: only part of it.

What the SaaS publisher handles

  • Infrastructure redundancy (datacenters, database replication)
  • Regular application backups
  • Recovery plan in case of platform outage
  • Contractual SLA (annual availability, typically 99.5% to 99.9%)
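An availability percentage is easier to judge once converted into hours. The sketch below does the conversion for the two SLA levels mentioned above, assuming a 365-day year and that the SLA is measured over the full year with no excluded maintenance windows (contracts often define this differently).

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years


def allowed_downtime_hours(availability_pct: float) -> float:
    """Maximum yearly downtime permitted by an availability SLA."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return HOURS_PER_YEAR * (100 - availability_pct) / 100


for sla in (99.5, 99.9):
    hours = allowed_downtime_hours(sla)
    print(f"{sla}% availability -> up to {hours:.1f} h of downtime per year")
```

A 99.5% SLA tolerates about 43.8 hours of downtime per year and a 99.9% SLA about 8.8 hours. Either figure can exceed a 4-hour RTO in a single incident while the publisher remains within its contract.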

What remains your responsibility

  • Business data: the publisher backs up its infrastructure, not necessarily your custom reports or configuration data
  • Integrations: if your SaaS ERP is connected to EDI, e-commerce, or BI, the continuity of these flows is your problem
  • Network access: if your internet connection fails, your SaaS ERP is inaccessible, even if it works perfectly at the publisher’s end
  • Degraded processes: what do your teams do if the ERP is unavailable for 4 hours? This scenario must be documented and rehearsed
  • Data portability: in case of major publisher failure (cessation of business, serious security incident), do you have a recent and usable export of your data?

Questions to ask your cloud publisher

Before signing or renewing a SaaS ERP contract, demand written answers to these points:

  1. What is the contractually guaranteed RPO (the figure in the contract, not in marketing documentation)?
  2. Is the platform DRP tested? How frequently? Can I see the latest report?
  3. In case of datacenter disaster, what is the measured RTO from the last test?
  4. Is my data replicated in a geographically distant datacenter?
  5. Can I export all my data in a usable format (SQL dump, structured CSV)?
  6. What happens if your company ceases operations? (escrow clause, source code access)

For more on choosing between cloud and on-premise, see our cloud vs on-premise ERP comparison.

Regulatory compliance: NIS2 and beyond

What NIS2 specifically requires

The NIS2 directive, with French transposition targeting full compliance by October 17, 2026, requires essential and important entities to implement cybersecurity risk management measures, including:

  • business continuity policies, including backup management and disaster recovery
  • incident management procedures with mandatory notification within 24 hours (early warning) and 72 hours (incident notification), followed by a final report within one month
  • effectiveness assessment of risk management measures, which implies regular testing of DRP/BCP

Penalties for non-compliance can reach €10 million or 2% of annual global revenue, whichever is higher, for essential entities.
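The 24-hour and 72-hour notification windows listed above are easy to lose track of during a crisis, so it can help to compute them as soon as an incident is logged. The sketch below assumes the clock starts when the entity becomes aware of the incident. The timestamp and the dictionary keys are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Notification windows quoted above, counted from awareness of the incident.
FIRST_WINDOW = timedelta(hours=24)
SECOND_WINDOW = timedelta(hours=72)


def notification_deadlines(aware_at: datetime) -> dict[str, datetime]:
    """Latest times for the 24-hour and 72-hour notifications."""
    return {
        "within_24h": aware_at + FIRST_WINDOW,
        "within_72h": aware_at + SECOND_WINDOW,
    }


# Hypothetical incident, timestamped in UTC to avoid time zone ambiguity.
aware_at = datetime(2026, 3, 2, 8, 30, tzinfo=timezone.utc)
for step, deadline in notification_deadlines(aware_at).items():
    print(f"{step}: {deadline.isoformat()}")
```

Putting these two timestamps in the incident ticket and in the crisis communication plan gives everyone the same reference.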

Beyond NIS2

Other regulations strengthen continuity requirements:

  • DORA (Digital Operational Resilience Act) for the financial sector: mandatory digital operational resilience testing, with stress scenarios including ICT third-party providers
  • GDPR (Article 32): obligation to ensure availability and resilience of personal data processing systems
  • ISO 22301: international standard for business continuity management systems, often required by contracting authorities in tenders

For an overview of cybersecurity applied to ERP, read our ERP and cybersecurity guide.

Budget: How much to invest in your ERP DRP/BCP

The budget depends on your target RTO and ERP infrastructure size. Here are indicative ranges for an SME with 50 to 200 employees:

Initial investment

Item | Range
Risk assessment and BIA (Business Impact Analysis) | €5,000 to €15,000
DRP/BCP design (consulting firm or internal CISO) | €10,000 to €25,000
Recovery infrastructure (depending on cold/warm/hot) | €5,000 to €60,000
Technical implementation (configuration, replication, scripts) | €8,000 to €20,000
Team training and first test | €3,000 to €8,000
Initial total | €31,000 to €128,000

Annual recurring costs

Item | Range
Recovery site hosting / DRaaS | €5,000 to €40,000
Procedure maintenance and updates | €3,000 to €10,000
Annual testing (2 to 4 tests/year) | €5,000 to €15,000
Replication/backup software licenses | €2,000 to €8,000
Annual total | €15,000 to €73,000

Proportionality rule: the DRP/BCP budget typically represents 5 to 15% of the annual IT budget. If your ERP supports €10M in revenue and a day’s outage costs €50,000, a €50,000/year investment in continuity is a rational trade-off: it pays for itself as soon as it avoids a single day of outage per year.
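The two checks behind this rule can be written down explicitly. The sketch below uses the €50,000 figures from the example above. The €500,000 annual IT budget is an assumption added for illustration.

```python
def break_even_outage_days(annual_investment: float, daily_outage_cost: float) -> float:
    """Days of avoided outage per year needed to pay back the investment."""
    if daily_outage_cost <= 0:
        raise ValueError("daily_outage_cost must be positive")
    return annual_investment / daily_outage_cost


def share_of_it_budget(annual_investment: float, annual_it_budget: float) -> float:
    """Continuity spend as a percentage of the annual IT budget."""
    if annual_it_budget <= 0:
        raise ValueError("annual_it_budget must be positive")
    return 100 * annual_investment / annual_it_budget


# Figures from the example above; the 500,000 EUR IT budget is an assumption.
days = break_even_outage_days(50_000, 50_000)
share = share_of_it_budget(50_000, 500_000)
print(f"Break-even: {days:.1f} day(s) of avoided outage per year")
print(f"Share of IT budget: {share:.0f}%")
```

Under these assumptions the investment breaks even after one avoided day of outage and represents 10% of the IT budget, inside the 5 to 15% range.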

Mistakes that make your DRP useless

Here are the most frequent mistakes observed in the field:

1. Backups never tested for restoration

Having backups running every night is reassuring. But 30% of backup restorations fail on the first attempt, according to managed service provider feedback. Test complete restoration of your ERP database at least once per quarter.

2. DRP that doesn’t include interconnections

Your ERP restarts in 2 hours, but EDI with suppliers takes 48 hours to reconnect. Result: the ERP runs idle. DRP must cover the complete ecosystem, not just the application server.

3. Obsolete procedures

The DRP mentions a server decommissioned 6 months ago. The on-call contact left the company. The backup VPN uses an expired certificate. An unmaintained DRP is worse than no DRP: it gives false assurance.

4. No degraded business procedures

The technical DRP is perfect, but no one explained to sales how to enter an order on paper while waiting for the ERP to come back. Degraded processes must be documented, printed (yes, on paper), and accessible without IT access.

5. RTO and RPO defined by IT without business validation

IT decrees a 24-hour RTO because it’s technically feasible. The sales director discovers on disaster day that 24 hours without ERP means €200,000 in lost orders. RTO is a general management decision, not a technical parameter.
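Mistake 2 can be made concrete with a small calculation: the recovery time that matters is that of the slowest component the ERP needs, not that of the application server alone. The sketch below assumes components are restored in parallel. The 2-hour and 48-hour figures come from the example above, and the other entries are hypothetical.

```python
# Recovery time in hours for each component the ERP needs to be useful.
# ERP server and EDI figures come from the example; the rest are assumed.
RECOVERY_HOURS = {
    "ERP application server": 2,
    "ERP database": 2,
    "supplier EDI": 48,
    "site-to-site VPN": 4,
}


def effective_recovery(recovery_hours: dict[str, float]) -> tuple[str, float]:
    """Return the slowest component and its recovery time.

    Assumes components are restored in parallel, so the slowest one
    determines when the whole chain works again.
    """
    bottleneck = max(recovery_hours, key=recovery_hours.get)
    return bottleneck, recovery_hours[bottleneck]


name, hours = effective_recovery(RECOVERY_HOURS)
print(f"Effective recovery time: {hours} h (bottleneck: {name})")
```

Running this kind of check against the dependency map shows where the real RTO is decided, and therefore where the next euro of the continuity budget is best spent.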

Taking action: Your 3-month roadmap

Month 1: Assessment and scoping

  • Map ERP dependencies (technical + human)
  • Perform Business Impact Analysis (BIA) by process
  • Define target RTO/RPO with management
  • Choose recovery architecture (cold/warm/hot/DRaaS)

Month 2: Design and implementation

  • Write failover and failback procedures
  • Implement recovery infrastructure
  • Configure replication and backups
  • Document degraded business processes

Month 3: Testing and formalization

  • Execute first complete restoration test
  • Measure actual RTO and RPO vs objectives
  • Train teams (IT and business)
  • Formalize DRP/BCP with management signature

To validate your approach on a reduced scope, start with a 3-month POC targeting 1 critical process (billing or production). Typical budget: €15,000 to €30,000. Result: an actual measurement of your RTO and RPO rather than a theoretical estimate.

For deeper insights into ERP project structuring and role allocation, consult our guide on ERP project team composition and our complete ERP implementation guide.