How to manage integration of a Medallion Architecture in healthcare
While FHIR provides an easier data ingestion gateway, that’s only part of the work that must happen to support data mesh approaches.

After a healthcare organization determines that HL7’s Fast Healthcare Interoperability Resources standard belongs at the perimeter as a clean ingestion gateway, the internal domain data engineers face their most critical task – refining those raw, hierarchical streams into high-performance analytical assets.
To achieve this, organizations frequently turn to the Medallion Architecture. Originally developed as a data lakehouse design pattern by Databricks, Medallion organizes data into three progressive layers of refinement – Bronze, or raw ingestion; Silver, or cleaned and conformed; and Gold, or aggregated and business-ready data or products.
However, when health systems deploy this pattern, they frequently walk into a second major architectural trap: establishing a massive, centrally managed Medallion lakehouse run by a core enterprise IT team. This recreates the centralized anti-pattern and defeats the purpose of a data mesh.
It’s easy to understand why. A single column change in an upstream EHR module or an updated operational code breaks the massive central pipeline, causing a cascading failure that blinds analysts across the entire enterprise.
To maintain true domain autonomy, health systems must adopt a strict micro-architectural framework. The Medallion pattern is not an enterprise-wide platform topology; it is an internal data engineering design pattern that must live safely within the autonomous boundary of a single domain.
The inner-domain pipeline
In a true healthcare data mesh, there is no single enterprise data lake. Instead, each individual domain – such as cardiology, radiology or revenue cycle management – runs its own isolated micro-Medallion pipeline. The rest of the organization has zero visibility into a domain's internal Bronze or Silver layers; they interact exclusively with the final Gold layer, which serves as the certified, stable data product.
To understand how this operates without centralized friction, it’s important to trace how a specific clinical domain handles incoming telemetry and transactional data. Here’s a typical flow.
The domain Bronze layer (raw capture). The domain team configures an automated, secure gateway to stream operational data or incoming edge-FHIR streams into an append-only, immutable storage layer within their cloud account.
The data in this layer can take many shapes: highly volatile, nested JSON objects, raw device outputs, other semi-structured objects or structured tables. Access is tightly restricted. Outside data scientists and enterprise analysts are barred from querying this layer, ensuring that external dependencies never bind to volatile, uncleaned data.
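As a minimal sketch of the append-only idea, the snippet below stores each payload exactly as received, alongside capture metadata and a checksum. The class names, the in-memory list and the "lab-feed" source label are illustrative assumptions; a real domain would use its cloud provider's immutable object storage.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class BronzeRecord:
    """One raw payload plus capture metadata; frozen so it cannot be mutated."""
    payload: str       # raw text exactly as received
    source: str        # hypothetical source-system label
    received_at: str   # ISO-8601 UTC capture time
    sha256: str        # checksum of the raw payload


class BronzeStore:
    """In-memory stand-in for append-only storage (illustrative only)."""

    def __init__(self):
        self._records = []

    def append(self, payload: str, source: str) -> BronzeRecord:
        record = BronzeRecord(
            payload=payload,
            source=source,
            received_at=datetime.now(timezone.utc).isoformat(),
            sha256=hashlib.sha256(payload.encode("utf-8")).hexdigest(),
        )
        self._records.append(record)
        return record

    def records(self) -> tuple:
        # Return a snapshot; there is deliberately no update or delete method.
        return tuple(self._records)


store = BronzeStore()
captured = store.append('{"resourceType": "Observation"}', "lab-feed")
```

The checksum gives later layers a way to confirm that a payload has not changed since capture.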
The domain Silver layer (the conforming engine). The domain's embedded data engineers write localized processing jobs to clean, filter and restructure the Bronze data. In this layer, the team executes tasks unique to their clinical or operational reality.
• Flattening. In the case of FHIR sources, complex, nested JSON arrays are unnested and mapped into compressed, columnar storage such as Apache Parquet files or Delta Lake tables.
• Clinical artifact filtering. Known source artifacts (such as a temporary signal drop when an ICU monitor is disconnected for patient transport) are filtered out using localized clinical rules.
• Terminology normalization. Local clinical abbreviations or custom billing codes are mapped directly to global vocabularies (like LOINC, SNOMED-CT, or RxNorm) via a domain-managed terminology service.
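Two of the Silver tasks above, artifact filtering and terminology normalization, can be sketched in a few lines. The local codes ("GLU-SER", "HR-MON"), the `in_transport` field and the zero-heart-rate rule are hypothetical; the dictionary stands in for a domain-managed terminology service, and the LOINC codes shown are the published codes for serum or plasma glucose and heart rate.

```python
# Hypothetical local-to-LOINC map; a real domain would query its terminology service.
LOCAL_TO_LOINC = {
    "GLU-SER": "2345-7",   # Glucose [Mass/volume] in Serum or Plasma
    "HR-MON": "8867-4",    # Heart rate
}


def is_transport_artifact(reading: dict) -> bool:
    """Assumed local rule: a zero heart rate logged during transport is a signal drop."""
    return (
        reading["local_code"] == "HR-MON"
        and reading["value"] == 0
        and reading.get("in_transport", False)
    )


def conform(readings: list) -> tuple:
    """Return (conformed_rows, rejected_rows) for a batch of Bronze readings."""
    conformed, rejected = [], []
    for reading in readings:
        loinc = LOCAL_TO_LOINC.get(reading["local_code"])
        if loinc is None or is_transport_artifact(reading):
            # Unmapped codes and known artifacts are set aside, not silently dropped.
            rejected.append(reading)
            continue
        conformed.append({"loinc_code": loinc, "value": reading["value"]})
    return conformed, rejected
```

Keeping the rejected rows lets the domain team audit its own rules without exposing that noise to consumers.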
The domain Gold layer (the certified data product). This layer represents the culmination of the domain's engineering lifecycle. The team takes the clean, conformed, row-level data from their Silver layer and aggregates it into highly specific, business-oriented schemas tailored to their enterprise consumers' needs.
This Gold layer is the only component exposed to the wider enterprise data catalog. It is assigned a permanent, version-controlled endpoint (such as a secure view in a cloud data warehouse). If the domain modifies its internal Bronze parsing logic because of an upstream application update, the final Gold product contract remains entirely unchanged, shielding downstream consumers from breaking changes.
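One way to make the Gold contract concrete is a check the domain runs before publishing, so that internal changes cannot leak into the product's shape. The sketch below is an assumption about how such a check might look; the contract name and column names are hypothetical.

```python
# Hypothetical published contract: column name -> expected Python type.
GOLD_CONTRACT_V1 = {
    "patient_id": str,
    "observed_at": str,
    "glucose_mg_dl": float,
    "range_flag": str,
}


def contract_violations(row: dict, contract: dict) -> list:
    """List every way a row departs from the contract; an empty list means compliant."""
    problems = []
    for column, expected_type in contract.items():
        if column not in row:
            problems.append(f"missing column: {column}")
        elif not isinstance(row[column], expected_type):
            problems.append(f"wrong type for {column}: {type(row[column]).__name__}")
    for column in row:
        if column not in contract:
            problems.append(f"unexpected column: {column}")
    return problems
```

A pipeline could refuse to refresh the Gold view whenever this list is non-empty, which keeps a Bronze-level parsing change from becoming a consumer-facing break.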

Transforming FHIR to OLAP
To visualize this micro-pattern, imagine a single laboratory test result – for example, a patient's blood glucose reading – as it traverses the domain cleanroom.
At the organizational boundary, the data arrives as a raw, multi-layered FHIR text payload. This format is highly transactional, containing nested blocks of text that bundle together the lab category, the universal medical codes, the patient and encounter reference numbers, the timestamp and the final measurement values.
When this payload lands in the domain's Bronze layer, it is stored exactly as it arrived – untouched, deeply nested and highly hierarchical.
After the domain’s internal processing cycle triggers, the engineering logic shifts the data into the Silver layer by executing three conceptual steps.
• Flattening. The engine strips away the nested objects and arrays, flattening the hierarchy into a simple table of clean rows and columns.
• Type casting. Text strings are converted into true numeric decimals for the glucose value and proper timestamp types for the time of the test, preparing the data for mathematical calculations.
• Clinical enrichment. The local engine evaluates the result against established clinical ranges and adds a "High," "Low" or "Normal" flag directly to the record, based on the domain's internal guidelines.
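The three steps above can be sketched for a single glucose Observation. The payload follows the general shape of a FHIR R4 Observation, but the identifiers are made up, and the 70 and 99 mg/dL thresholds are placeholders standing in for the domain's own guidelines, not clinical guidance.

```python
import json
from datetime import datetime
from decimal import Decimal

# Illustrative payload; identifiers are hypothetical.
RAW_OBSERVATION = """
{
  "resourceType": "Observation",
  "status": "final",
  "category": [{"coding": [{"code": "laboratory"}]}],
  "code": {"coding": [{"system": "http://loinc.org", "code": "2345-7"}]},
  "subject": {"reference": "Patient/example-123"},
  "encounter": {"reference": "Encounter/example-456"},
  "effectiveDateTime": "2024-03-01T08:30:00+00:00",
  "valueQuantity": {"value": 104.5, "unit": "mg/dL"}
}
"""

# Placeholder thresholds (mg/dL); a real domain would apply its own ranges.
LOW_LIMIT, HIGH_LIMIT = Decimal("70"), Decimal("99")


def range_flag(value: Decimal) -> str:
    """Clinical enrichment: label the value against the assumed range."""
    if value < LOW_LIMIT:
        return "Low"
    if value > HIGH_LIMIT:
        return "High"
    return "Normal"


def to_silver_row(raw: str) -> dict:
    """Flatten one nested Observation into a single typed row."""
    obs = json.loads(raw)
    # Type casting: go through str() so the decimal keeps its printed precision.
    value = Decimal(str(obs["valueQuantity"]["value"]))
    return {
        "patient_id": obs["subject"]["reference"].split("/")[-1],
        "encounter_id": obs["encounter"]["reference"].split("/")[-1],
        "loinc_code": obs["code"]["coding"][0]["code"],
        "observed_at": datetime.fromisoformat(obs["effectiveDateTime"]),
        "value": value,
        "unit": obs["valueQuantity"]["unit"],
        "range_flag": range_flag(value),
    }


row = to_silver_row(RAW_OBSERVATION)
```

This sketch reads only the first coding entry and assumes every field is present; production logic would need to handle missing elements and multiple codings.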
Finally, the data enters the Gold layer, where it is transformed into the consumer-facing data product. Here, the engine filters specifically for the standardized blood glucose codes and performs advanced analytical calculations, such as tracking a rolling average of a patient's last six laboratory entries over time.
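The Gold-layer calculation described here, a rolling average over each patient's last six glucose entries, might look like the following sketch. The row layout (`patient_id`, `observed_at`, `loinc_code`, `value`) and the set of accepted codes are assumptions; `observed_at` must be sortable, for example a datetime or a uniformly formatted ISO-8601 string.

```python
from collections import defaultdict, deque

# Assumed set of standardized glucose codes the product filters on.
GLUCOSE_CODES = {"2345-7"}


def rolling_glucose_mean(rows: list, window_size: int = 6) -> list:
    """Add a per-patient rolling mean over the most recent `window_size` readings."""
    glucose = [r for r in rows if r["loinc_code"] in GLUCOSE_CODES]
    glucose.sort(key=lambda r: (r["patient_id"], r["observed_at"]))

    # One bounded window per patient; deque drops the oldest value automatically.
    windows = defaultdict(lambda: deque(maxlen=window_size))
    output = []
    for row in glucose:
        window = windows[row["patient_id"]]
        window.append(row["value"])
        output.append({**row, "rolling_mean": sum(window) / len(window)})
    return output
```

Until a patient has six readings, the mean is taken over however many exist so far; a domain could instead choose to emit no value until the window is full.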
The resulting output is exposed as a clean, relational “spreadsheet” view. Downstream analytics teams or machine learning models can query this view directly to monitor patients for emerging crises, entirely insulated from the structural clutter of the original transactional payload.
The executive execution blueprint
Transitioning from a legacy centralized architecture to a decentralized, domain-driven mesh is an operational journey that cannot be achieved overnight. Health data executives should deploy a phased, value-driven execution roadmap such as the following.
Phase 1: Establish the platform foundation. Before modifying any clinical code, the central platform team should deploy the underlying infrastructure. It stands up an enterprise-wide data catalog that tracks data product registrations, along with the deployment templates and other self-service capabilities domain owners need to onboard to the mesh. Simultaneously, the federated governance council embeds global security policies, such as automated PHI masking and patient identity matching controls, directly into the platform's deployment templates. This ensures that any infrastructure provisioned by a domain is secure and compliant by default.
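As a rough illustration of a policy embedded in a template, the sketch below applies keyed hashing to a centrally defined list of PHI columns before any row leaves a domain pipeline. The column names, the key handling and the 16-character truncation are all assumptions, and keyed hashing is only one possible masking technique; on its own it is pseudonymization, not full de-identification.

```python
import hashlib
import hmac

# Hypothetical global policy, owned by the governance council rather than the domain.
PHI_COLUMNS = {"patient_name", "mrn", "date_of_birth"}


def mask_row(row: dict, secret_key: bytes) -> dict:
    """Replace each PHI column with a keyed hash; pass other columns through."""
    masked = {}
    for column, value in row.items():
        if column in PHI_COLUMNS:
            digest = hmac.new(secret_key, str(value).encode("utf-8"), hashlib.sha256)
            # Truncated for readability in this sketch only.
            masked[column] = digest.hexdigest()[:16]
        else:
            masked[column] = value
    return masked
```

Because the same key yields the same token for the same input, masked identifiers can still be joined across tables inside the platform.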
Phase 2: Deploy a lighthouse domain. Do not attempt a massive, multi-department launch. Select a single, high-value clinical unit with strong leadership, such as cardiology or emergency department operations, to serve as the pioneer node. Detach a dedicated squad of data engineers from the central IT department and embed them permanently within the clinical domain. Have this team build the first micro-Medallion pipeline, register their finalized Gold table in the catalog, and verify that downstream data consumers can successfully extract insights without central intervention, iterating as required to ensure the tooling is ready for broader adoption.
Phase 3: Federated scaling and monolith decommissioning. After the lighthouse domain proves the model's viability, scale horizontally. Use the validated templates to spin up independent workspaces in parallel for additional clinical domains such as radiology and oncology, and for other strategically aligned domains such as revenue cycle management or human capital. As these units assume full operational ownership of their respective Gold data products, the organization's reliance on legacy data warehouses, and the risk that comes with it, will decline, clearing the way to decommission them.
Conclusion
By combining the semantic power of edge-FHIR frameworks with the operational resiliency of localized, micro-Medallion data engineering pipelines, healthcare organizations can finally break free from the monolithic traps that have bottlenecked digital health’s use of advanced analytics.
This architecture transforms data management from a rigid IT chore into an agile, domain-driven strategic asset. Health systems that adopt this decentralized blueprint will possess the ultimate competitive advantage – the ability to scale safely, reduce central engineering backlogs and convert raw clinical streams into actionable, life-saving insights at the point of care.
In real-world healthcare enterprises, much, if not most, of the source data desired for advanced analytics does not present as FHIR data. The critical information required to run a multi-billion-dollar health system spans far beyond clinical APIs. It includes massive, non-healthcare native datasets from corporate financials, supply chain logistics, enterprise resource planning (ERP), physical plant operations and IoT devices.
Furthermore, even within the clinical sphere, massive analytical repositories — such as legacy unstructured clinical notes, longitudinal actuarial tables and raw genomic sequencing pipelines — defy traditional FHIR modeling.
Forcing these diverse enterprise data pipelines into a transactional FHIR format introduces unnecessary computational overhead, data truncation and architectural friction. True enterprise intelligence relies on multi-format ecosystems where FHIR is simply an ingestion mechanism for a specific subset of data, coexisting alongside robust relational, analytical and event-driven big-data frameworks.
In upcoming articles, we will shift our lens away from clinical exchange boundaries to explore these real-world enterprise architectures where FHIR isn't applicable. We will examine how federal and private healthcare enterprises build comprehensive data fabrics, meshes and modern data lakehouses designed to ingest, harmonize and analyze massive, heterogeneous non-FHIR data sources alongside FHIR-sourced data, ensuring that enterprise intelligence is limited only by strategic insight, not by exchange formats.
Thank you for following this three-part series on re-engineering the healthcare data ecosystem. Navigating the intersection of emerging architectures and strict health data compliance requires continuous collaboration across our industry, and feedback, critiques and shared experiences on these strategies are welcome.
Aaron Seib, PMP, FACHDM, CDMP-practitioner, is chief data interoperability officer at Goldbelt Apex LLC, and former senior vice president of strategy and innovation for NewWave Telecom & Technologies.
