Case Study
Healthcare Services Company
Data Warehouse Upgrade to Enable Scalable Analytics

Overview

A national Healthcare Services Company engaged Object Partners to upgrade their underperforming data warehouse into a modern data platform to enable scalable analytics. This client now has a more valuable data architecture and higher quality data environment.

Challenge

Our client relied on a SQL Server instance to support their business operations, essentially using custom data analyses through a web portal. With 10s of millions of rows, thousands of lines of ETL code, and dozens of different search and aggregation patterns, the SQL server was overloaded, resulting in latency in live queries and delays in data updates. This complexity was also apparent in the data and code management for all the web and reporting systems. Our client needed to scale their reporting solution to accommodate more business cases, more data sources, and use more real-time sources.

The passing of new healthcare regulations to provide better patient/provider financial protections accelerated the demand for our client’s services, and the team needed additional support. OPI’s data engineering team worked to accelerate the business and technical scale, improved the reporting structure, and designed a modern data warehouse solution in Snowflake.

Solution

OPI provided multiple resources to modernize their data platform on two fronts:

  • Migrate to a scalable modern data warehouse (Snowflake) to reduce the analytical load on the operational SQL Server
  • Accelerate SQL Server ELT development and warehouse performance to handle more use cases with higher complexity
  • Solution Part 1: Snowflake Migration
    OPI recognized the scalability issues and complexities of using a transactional RDBMS for such complex reporting, especially since much of the reporting needed to be optimized using denormalization and pre-aggregations, which reduced the fidelity of the data used in reporting. OPI recommended Snowflake and brokered a conversation between our client, Snowflake, and internal architects.

    OPI has been rebuilding the data warehouse in Snowflake using modern data engineering tools and practices, with the following highlights:

  • Ingested data from multiple sources (Salesforce, CDC, Kafka, BDPaaS, Hive, SQL Server)
  • Integrated multiple technologies as part of the stack (Snowflake, Kafka Connect, Talend, Airflow, Kubernetes)
  • Extended API connectivity to use Snowflake, in addition to SQL Server
  • Implemented data patterns storage (raw → consumption layers), historical reporting (time travel)
  • Solution Part 2: SQL Server ETL/Warehouse
    OPI onboarded quickly to own and develop SQL Server ETL improvements, expand the warehouse, and organize the development process. We identified that with the speed of development, a lot of technical debt was left behind. While adding business features, OPI implemented a modern development process including:

  • Helped improve the agile processes (story writing, acceptance testing, and story organization) to fit data analytics use cases better
  • Implemented source control and change management for the database, procedure logic, and ETL in Github
  • Migrated deployments to use automated CI/CD process in Azure DevOps instead of cumbersome manual deployments
  • Developed python-based automated testing framework used for test-driven development of new database features
  • Refactored ETL and slow-running processes using clean coding principles
  • Implemented data cataloging microsite using markdown and Github pages for users to discover, search and understand database technical design
  • Trained team on data engineering best practices like test automation, CI/CD, and documentation
  • Business Benefits

    OPI suggested various solutions for our client to consider, helped them analyze the best options, recommended the best solutions for this specific situation, and then implemented those solutions. The result for our client is a more valuable data architecture and higher quality data environment.

    The benefits of bringing skilled data engineers with experience in modern distributed data processing meant that our client could transition from maintaining existing SQL data to scaling human and system resources more effectively. The development improvements made it possible to scale from 1-2 data engineers to enabling multiple resources to contribute in a scalable, quality-focused manner. Specifically:

  • Using Snowflake reduced data merges from over 6 hours to just minutes. Other queries which took 6-10 minutes were reduced to seconds.
  • Reduced data latency by using data that was available in Kafka instead of using the batch-loaded data lake.
  • ETL Processes that took over 48 hours to complete were refactored (and tested) to run in 10 hours or less.
  • Failed ETL could be started from the last stage instead of having to start from the beginning.
  • Developers and users have been able to understand the system and answer data questions quicker with improved structure, standards, and documentation.
  • Implemented database development standards that improved code quality, consistency, and maintainability.
  • Why Object Partners

    Most data engineering teams treat testing, documentation, and the agile process as secondary concerns. OPI treats them as primary concerns because they are the foundation of having a trustworthy data system. Clean coding and proper engineering standards increase the velocity of the team in the long term while also improving the reliability and predictability of data. Our consultants build production-ready systems and not just code that gets put in production.

    OPI understands that we shouldn’t build complex systems that the client is unable to understand or own. We work very hard to prove they meet the acceptance criteria, i.e., that we built the right system. We’ve seen too many technologists at other clients build novel but very complex solutions, and the client couldn’t effectively evaluate if they got what they needed. We really care about delivering what was wanted and needed because consultants should do more than “implement the requirements.” When OPI builds out a data system, we solve problems with a holistic product view rather than focusing on one specific engineering problem.