Global Insurer Migrates 2,800 Alteryx Workflows to Databricks, Cuts Licensing Costs by $5.2M

MigryX Case Study • April 2026 • Insurance & Reinsurance

Executive Summary

A global insurance and reinsurance leader operating across 40 countries faced ballooning Alteryx Server and Designer licensing costs, brittle batch macro dependencies, and data pipeline throughput ceilings that limited their actuarial and risk modeling capabilities. Over 14 months, MigryX migrated all 2,800 Alteryx workflows — spanning .yxmd, .yxmc, and .yxzp package formats — to PySpark running natively on Databricks, backed by a Delta Lake storage layer. The engagement delivered over 700,000 lines of production-quality PySpark code, performance improvements of 3–7X on benchmark pipelines, and a projected $5.2 million in total cost savings over three years. The client decommissioned all Alteryx Server nodes within 60 days of final cutover.

Client Overview

The client is a multinational insurance and reinsurance group with operations spanning property & casualty, life, and specialty lines. Their data engineering function supports underwriting, actuarial reserving, claims processing, and regulatory reporting across multiple regulatory jurisdictions including Solvency II, NAIC, and Lloyd's market requirements. The organization had built a substantial Alteryx estate over eight years, starting as a business-analyst-friendly tool for ad hoc analytics and gradually expanding into a core data pipeline platform that was never designed to carry production-grade, enterprise workloads at scale.

By 2024, the Alteryx estate had grown to the point where licensing, infrastructure, and operational support consumed over $1.8 million annually. Workflows ran on a Windows-based Alteryx Server cluster that required specialized administrator knowledge and was increasingly difficult to integrate with the organization's cloud-first data strategy built on Azure Databricks and Azure Data Lake Storage Gen2.

Business Challenge

The decision to migrate was driven by a convergence of technical debt, cost pressure, and strategic platform direction. Key challenges identified during the MigryX discovery phase included:

- Annual Alteryx licensing, infrastructure, and operational support costs exceeding $1.8 million
- Brittle batch and iterative macro dependencies that made workflows fragile and difficult to change safely
- Single-node execution ceilings on Alteryx Server that throttled actuarial and risk modeling pipelines
- A Windows-based server cluster requiring specialized administrator knowledge, poorly aligned with the cloud-first strategy built on Azure Databricks and ADLS Gen2
- Metadata embedded in individual workflows, with no central data catalog or lineage

The MigryX Approach

MigryX began with a six-week discovery and complexity classification phase. The MigryX Discovery Engine ingested all 2,800 workflow files — including nested macro packages (.yxzp) — and produced a full inventory report classifying each workflow by tool composition, macro nesting depth, data volume characteristics, In-DB connectivity, and estimated migration complexity. Of the 2,800 workflows, 61% were classified as straightforward, 29% as moderate complexity, and 10% as high complexity requiring human review. This classification governed sprint prioritization throughout the 14-month engagement.
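The classification step can be illustrated with a simple scoring heuristic. This is a toy sketch only: the thresholds, feature names, and sample workflows below are invented for illustration, not the actual MigryX Discovery Engine logic, which draws on a much richer feature set (tool composition, data volumes, In-DB connectivity).

```python
# Toy complexity classifier for an Alteryx workflow inventory.
# Thresholds and features are illustrative, not the real engine.

def classify_workflow(tool_count, macro_depth, uses_indb):
    """Bucket a workflow into one of three migration-complexity tiers."""
    score = tool_count / 10 + macro_depth * 3 + (5 if uses_indb else 0)
    if score < 8:
        return "straightforward"
    if score < 15:
        return "moderate"
    return "high"  # flagged for human review

# Hypothetical inventory rows produced by a discovery scan.
inventory = [
    {"name": "claims_daily",   "tools": 18, "depth": 0, "indb": False},
    {"name": "reserving_calc", "tools": 45, "depth": 1, "indb": True},
    {"name": "solvency_feed",  "tools": 90, "depth": 4, "indb": True},
]
buckets = {
    wf["name"]: classify_workflow(wf["tools"], wf["depth"], wf["indb"])
    for wf in inventory
}
```

A classification like this is what lets sprint planning front-load the straightforward 61% for fast automated conversion while routing the high-complexity 10% into review queues early.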

The core migration engine parsed each Alteryx workflow's XML structure at the tool-node level, extracting the directed acyclic graph (DAG) of transformations and the configuration parameters for each tool. MigryX maintains a comprehensive mapping library covering all 250+ Alteryx tool types, including the full Input/Output suite, Preparation tools (Select, Formula, Filter, Sample), Join family (Join, Find Replace, Append Fields), Spatial tools, Reporting tools, and the complete Predictive analytics palette. Each tool mapping was tested against the client's actual data to validate output equivalence before promotion to production.
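The parsing step described above can be sketched with the standard library alone, since .yxmd files are XML. The document below is a heavily simplified stand-in for a real workflow file (actual .yxmd nodes carry far more configuration, and attribute layouts vary by version); the sketch only shows the core idea of extracting tool nodes and connection edges into a DAG.

```python
# Minimal sketch of tool-node-level parsing of an Alteryx workflow.
# The XML here is a simplified stand-in for a real .yxmd document.
import xml.etree.ElementTree as ET

YXMD = """
<AlteryxDocument>
  <Nodes>
    <Node ToolID="1"><GuiSettings Plugin="AlteryxBasePluginsGui.DbFileInput"/></Node>
    <Node ToolID="2"><GuiSettings Plugin="AlteryxBasePluginsGui.Filter"/></Node>
    <Node ToolID="3"><GuiSettings Plugin="AlteryxBasePluginsGui.DbFileOutput"/></Node>
  </Nodes>
  <Connections>
    <Connection><Origin ToolID="1"/><Destination ToolID="2"/></Connection>
    <Connection><Origin ToolID="2"/><Destination ToolID="3"/></Connection>
  </Connections>
</AlteryxDocument>
"""

def parse_workflow(xml_text):
    """Return (tools, edges): ToolID -> plugin name, plus the DAG edges."""
    root = ET.fromstring(xml_text)
    tools = {
        n.get("ToolID"): n.find("GuiSettings").get("Plugin")
        for n in root.iter("Node")
    }
    edges = [
        (c.find("Origin").get("ToolID"), c.find("Destination").get("ToolID"))
        for c in root.iter("Connection")
    ]
    return tools, edges

tools, edges = parse_workflow(YXMD)
```

With the DAG in hand, each tool's plugin name is looked up in the mapping library and its configuration parameters drive code generation for the corresponding PySpark operation.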

Batch and iterative macros — the highest-risk component of the estate — were converted using MigryX's macro expansion engine, which analyzes the macro's control flow, resolves parameter bindings, and emits equivalent PySpark loop constructs or Databricks Workflow parameter sweeps. For the 38 most complex macros with recursive or conditionally branching iteration logic, MigryX engineers conducted manual review sessions with the client's SMEs to validate business intent before finalizing conversion.
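The iterative-macro semantics described above can be sketched as a plain loop: the macro body runs once per iteration, records routed to the loop output re-enter the macro, and iteration stops when the loop stream is empty or a maximum count is reached. In this hedged sketch, Python lists stand in for record streams and the `halve_until_small` body is an invented example; the real conversion emits PySpark DataFrame transformations.

```python
# Sketch of how an Alteryx iterative macro maps to a loop construct.
# Lists stand in for record streams; names here are illustrative.

MAX_ITERATIONS = 100  # mirrors the macro's iteration-limit setting

def run_iterative_macro(records, body):
    """body(records) -> (finished, loop); the loop stream re-enters."""
    finished = []
    pending = records
    for _ in range(MAX_ITERATIONS):
        if not pending:
            break
        done, pending = body(pending)
        finished.extend(done)
    return finished

# Example body: halve each value, finishing once it drops below 1.
def halve_until_small(records):
    done = [r for r in records if r < 1]
    loop = [r / 2 for r in records if r >= 1]
    return done, loop

result = run_iterative_macro([8, 3, 0.5], halve_until_small)
```

Recursive or conditionally branching macros break this simple pattern, which is why the 38 hardest cases went through SME review rather than fully automated expansion.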

In-DB tool chains were converted to native PySpark DataFrame operations with Databricks-optimized execution, eliminating the mixed-execution complexity of Alteryx's push-down mode. Delta Lake was adopted as the persistent storage format, enabling ACID transactions, time travel for audit requirements, and Z-order clustering on high-cardinality join keys — directly addressing the performance bottlenecks in actuarial batch jobs. The resulting code was deployed to Databricks Workflows for orchestration, providing a fully auditable execution history and native integration with the organization's Azure DevOps CI/CD pipelines.
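The In-DB conversion pattern can be illustrated with a toy example: a Connect In-DB, Filter In-DB, Join In-DB chain becomes a single composition of native DataFrame-style operations. Plain Python lists of dicts stand in for Spark DataFrames here, and the insurance-flavored sample data is invented; in the generated code these become `pyspark.sql.DataFrame` `filter` and `join` calls executed on Databricks.

```python
# Toy illustration: an In-DB tool chain rewritten as a composed
# chain of native operations. Lists of dicts stand in for DataFrames.

policies = [
    {"policy_id": 1, "line": "property", "premium": 1200},
    {"policy_id": 2, "line": "life",     "premium": 800},
]
claims = [
    {"policy_id": 1, "paid": 450},
    {"policy_id": 1, "paid": 150},
]

def filter_rows(rows, pred):       # Filter In-DB -> df.filter(...)
    return [r for r in rows if pred(r)]

def inner_join(left, right, key):  # Join In-DB -> df.join(..., "inner")
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

property_claims = inner_join(
    filter_rows(policies, lambda r: r["line"] == "property"),
    claims,
    "policy_id",
)
```

Collapsing the chain this way is what removes Alteryx's mixed-execution split: instead of some tools running push-down in the database and others on the Alteryx engine, the whole pipeline executes in one Spark plan against Delta tables.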

Migration Architecture

| Component | Before (Alteryx) | After (Databricks) |
| --- | --- | --- |
| Compute | Alteryx Server (Windows, single node per job) | Databricks all-purpose & job clusters (autoscaling) |
| Workflow format | .yxmd / .yxmc (XML), .yxzp (zipped packages) | PySpark .py + Databricks Workflow JSON |
| Storage layer | Local Windows file shares + direct ODBC | Delta Lake on ADLS Gen2 |
| Macro execution | Iterative/batch macros (single-threaded) | PySpark loops / Databricks parameter sweeps |
| In-DB processing | Alteryx In-DB connectors (Oracle, SQL Server, Snowflake) | Native PySpark with Databricks connectors |
| Orchestration | Alteryx Scheduler + Alteryx Server API | Databricks Workflows + Azure DevOps triggers |
| Monitoring | Alteryx Server admin console (manual) | Databricks job run history + Azure Monitor alerts |
| Data catalog | None (workflow-embedded metadata only) | Databricks Unity Catalog with lineage |

Key Migration Highlights

- All 250+ Alteryx tool types covered by the MigryX mapping library, with each mapping validated against the client's actual data
- 38 recursive or conditionally branching macros converted through manual SME review sessions
- In-DB tool chains rewritten as native PySpark over Delta Lake, with Z-order clustering on high-cardinality join keys
- 90 days of parallel production operation validating 100% output equivalence
- All Alteryx Server nodes decommissioned within 60 days of final cutover

Security & Compliance

The client operates under Solvency II (EU), NAIC Model Audit Rule (US), and Lloyd's Market Association data requirements. All migration activities were conducted within the client's private Azure tenant with no data leaving the client's network boundary. MigryX's on-premise migration tooling was deployed within the client's Azure environment, and all generated code was reviewed by the client's security architecture team prior to production promotion.

Results & Business Impact

The migration delivered measurable improvements across every dimension tracked in the program's success criteria framework, validated through 90 days of parallel production operation before full Alteryx Server decommissioning.

- 2,800 Alteryx workflows migrated
- 700K+ lines of PySpark generated
- 3–7X pipeline performance improvement
- $5.2M projected savings over 3 years
- 14 months end-to-end migration duration
- 100% output equivalence validated

The $5.2 million three-year savings figure includes $3.1 million in eliminated Alteryx licensing and infrastructure costs, $1.4 million in reduced Alteryx Server administration labor (previously requiring a dedicated Windows infrastructure team), and $700K in avoided hardware refresh costs for the Windows-based Alteryx Server cluster. These savings are partially offset by increased Databricks compute spend for always-on cluster configurations, but the net position strongly favors the Databricks platform at the client's current data volumes.

"We had been living with the Alteryx estate for so long that we assumed the complexity was inherent to our workflows. MigryX showed us that most of that complexity was an artifact of the tool, not the business logic. The generated PySpark code was cleaner and more maintainable than what our own team would have written from scratch, and the performance gains on our actuarial batches transformed our daily reporting cadence."

— Head of Data Engineering, Global Insurance & Reinsurance Group

Ready to Modernize Your Alteryx Estate?

See how MigryX can accelerate your migration to Databricks with parser-driven automation. Minimal manual intervention. Full output validation.

Explore Databricks Migration →