Databricks, a leading data and AI company, announced today that it has agreed to acquire data replication startup Arcion in a deal valued at over $100 million. The acquisition marks Databricks’ first since its $1.3 billion purchase of generative AI startup MosaicML in June and comes on the heels of $500 million in new funding last month.
The acquisition is expected to strengthen Databricks’ ability to import data from a wide range of enterprise databases and SaaS applications directly into its Lakehouse Platform, which is used by more than half of the Fortune 500.
Data lakehouses have gained traction as the standard architecture for enterprise data and AI. Yet their usefulness depends on the quality and completeness of the data they contain, and ingesting data from existing databases and applications remains complicated, fragile, and costly. Databricks’ acquisition of Arcion is meant to address these challenges directly.
Founded in 2016, Arcion provides change data capture (CDC) technology that replicates data in real time across databases, data warehouses, and other enterprise systems. The startup offers more than 20 connectors for platforms including Oracle, MySQL, PostgreSQL, Salesforce, and SAP. Integrating Arcion will advance Databricks’ strategy of breaking down data silos to feed its Lakehouse Platform and new generative AI capabilities.
Databricks raised $500 million in September at a valuation of $43 billion, with plans to invest heavily in the rapidly growing generative AI space. The company aims to drive enterprise adoption of large language models and other generative AI technologies by integrating them natively into its Lakehouse Platform.
“To build analytical dashboards, data applications, and AI models, data needs to be replicated from the systems of record like CRM, ERP, and enterprise apps to the Lakehouse,” said Ali Ghodsi, CEO and co-founder of Databricks.
“Arcion’s highly reliable and easy-to-use solution will enable our customers to make that data available almost instantly for faster and more informed decision-making,” Ghodsi said.
By acquiring Arcion, Databricks aims to provide native, scalable data ingestion from a wide variety of sources into customer lakehouses. This is intended to simplify both real-time streaming and batch data transfer to power next-generation analytics and AI initiatives.
Arcion will operate as a business unit within Databricks, leveraging the generative AI expertise gained from the MosaicML acquisition. Databricks plans to continue scaling Arcion’s CDC technology across its customer base to increase data accessibility for training and deploying large neural networks.
This latest acquisition reflects Databricks’ strategy of accelerating product innovation in the data lakehouse and generative AI markets through targeted technology purchases. By simplifying enterprise data ingestion and reducing silos, the company expects to help customers see a faster return on investment as they adopt next-generation analytics and AI capabilities.