Nordvik Financial Group · Mar 2022 – Present

Principal Data Architect

Remote (Amsterdam, NL)

Tech stack

AWS, Apache Iceberg, Apache Kafka, Apache Spark, dbt, Apache Airflow, Snowflake, Terraform, Python, OpenLineage

Context

Nordvik Financial Group is a Nordic payments processor handling over €40 billion in annual transaction volume across 12 European markets. When I joined, the data estate consisted of five siloed data warehouses (two Redshift clusters, one Snowflake, one BigQuery sandbox, and a legacy on-premises Teradata) with no unified lineage, 400+ dbt models with undocumented dependencies, and a 36-hour SLA for regulatory reporting.

Challenge

The immediate mandate was to consolidate the estate and cut time-to-insight for the risk and compliance teams from 36 hours to under four. The underlying challenge was harder: each business unit owned its own warehouse and resisted centralisation, so the architecture had to be federated by design, with storage shared and decoupled from compute, and transformations owned independently by each business unit.

Architectural decisions

I proposed and built an open lakehouse on Apache Iceberg (storage on S3, query engines via Athena for ad-hoc and Spark on EMR for batch), with Snowflake retained as the serving layer for dashboards. Key decisions:

  • Medallion architecture with clear ownership contracts: raw (bronze) ingested by a shared platform team; silver and gold owned by domain teams via isolated dbt projects in a monorepo.
  • Apache Kafka for real-time ingestion: replaced overnight SFTP drops with sub-minute CDC from core banking via Debezium → Kafka → Iceberg compaction jobs.
  • dbt-core + Airflow for orchestration: migrated all 400+ models to a modular monorepo with CI-enforced contracts, reducing broken-pipeline incidents from 12 per month to 1 per month.
  • Column-level lineage via OpenLineage + Marquez: gave compliance the audit trail for DORA and PSD2 without custom tooling.
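
To illustrate what a CI-enforced contract between layers can look like, here is a minimal sketch in plain Python. It is not the actual Nordvik implementation: the ColumnSpec type, the check_contract function, and the example column names are all hypothetical, and in practice a check like this may be delegated to dbt's own model contracts. The idea is that a domain team declares the columns a downstream consumer relies on, and CI fails a pull request that drops a column, changes its type, or relaxes a non-null guarantee, while purely additive columns pass.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ColumnSpec:
    """One column in a model's published schema (hypothetical type)."""
    name: str
    dtype: str
    nullable: bool = True


def check_contract(contract: list[ColumnSpec], actual: list[ColumnSpec]) -> list[str]:
    """Return human-readable violations; an empty list means the contract holds."""
    actual_by_name = {col.name: col for col in actual}
    violations = []
    for expected in contract:
        found = actual_by_name.get(expected.name)
        if found is None:
            violations.append(f"missing column: {expected.name}")
            continue
        if found.dtype != expected.dtype:
            violations.append(
                f"type changed: {expected.name} {expected.dtype} -> {found.dtype}"
            )
        if found.nullable and not expected.nullable:
            violations.append(f"nullability relaxed: {expected.name}")
    return violations


# Hypothetical contract for a silver-layer payments model.
CONTRACT = [
    ColumnSpec("transaction_id", "string", nullable=False),
    ColumnSpec("amount_eur", "decimal(18,2)", nullable=False),
    ColumnSpec("market", "string"),
]

# Hypothetical schema produced by a pull request under review.
# It changes one type, drops one column, and adds a new column.
PROPOSED = [
    ColumnSpec("transaction_id", "string", nullable=False),
    ColumnSpec("amount_eur", "double", nullable=False),
    ColumnSpec("settled_at", "timestamp"),
]

# A CI step would fail the build whenever this list is non-empty.
VIOLATIONS = check_contract(CONTRACT, PROPOSED)
for violation in VIOLATIONS:
    print(violation)
```

In this sketch, extra columns such as settled_at are deliberately allowed, so domain teams can extend their models without coordinating with every consumer; only changes that break a declared column are reported.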

Outcomes

  • Regulatory reporting SLA: 36 hours → 3.5 hours.
  • Analytical query p95 latency: 45 seconds → 12 seconds.
  • Infrastructure cost: -28% YoY after right-sizing EMR clusters and enabling Iceberg compaction.
  • Team: grew the platform squad from 4 to 11 engineers; introduced RFC process for architectural decisions.

Architecture

[Architecture diagrams 1 and 2 of 2]