Eric Olmsted, Ph.D.’s Post

Value Based Care Analytics

I mentioned in a previous post that successful VBC organizations must strengthen their data pipelines in order to take advantage of analytics that can drive clinical results. To that end, below are my 4 pillars of healthcare data pipeline design. Symptoms of a data pipeline with issues include delays in reporting, inconsistent values across your organizational domains, and a lack of trust from clinical users.

1) Raw Record Primacy - Healthcare data is continuously managed, massaged, and warehoused. When incorporating any healthcare data into your analytic structure, favor data that is as close to the original source as possible. Trust fields from the billing data over fields from the warehouse (e.g. UB Type of Bill is of more value than an Inpatient Flag or ED Visit ID; MRN will track patient data through an EHR better than a payer member ID).

2) Transparency - Good analysts must understand the data that is being passed to them at the end of the data pipeline. To enable trust and understanding, it is critical to be transparent and accurate about how the data was processed at each step of the journey. Two techniques I have had success with are 'direct documentation' and data lineage. 'Direct documentation' is my term for using the same tables for both data processing and data documentation. There is no need to maintain documentation separate from your code stack, as everything can be converted to a table (or file) driven structure. This prevents the inevitable disconnect between your documentation of the code and the actual operation of the code. It also allows for a data lineage engine that can walk specific fields from raw through the analytic datamart to accurately explain how the analytic data was created.

3) Conceptual Design - Many healthcare datamarts suffer from concept agglomeration, whereby multiple fields that represent the same underlying concept accumulate over time. This frequently happens during data ingestion, when engineers may be unaware of a field that already exists and mistakenly create another to serve the same need, but it can happen at any point during data processing. Be ruthless in your conceptual design, and create ontologies and hierarchies that organize the data into higher-level concepts so that humans can drill down to the specific field they need when doing data mapping. A place for everything and everything in its place will prevent significant downstream confusion.

4) Fail as Fast as Possible - The key to this is to understand what your data processing algorithm 'knows' about the data at each step in the data pipeline. It is impossible to check PMPMs on raw data, so don't design your QC process to fail only at the end. Check control totals and field gaps at the start. Once your data mapping is complete, you can add field-specific validation. Your data processing algorithm should be learning about the data at each step of the process, and the QC should be designed to fail at each step when possible.
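
As one rough illustration of the fail-fast idea in pillar 4, the sketch below runs a different set of checks at each stage: control totals and field gaps on the raw rows, then field-specific validation once mapping is done. Everything named here is an assumption made for the example (the field names member_id, service_date, and paid_amount, the QCError class, and the check_raw, check_mapped, and run_pipeline functions); it is not taken from any particular pipeline.

```python
from datetime import date


class QCError(Exception):
    """Raised the moment a pipeline stage fails one of its checks."""


def check_raw(rows, control_count, control_total, required_fields):
    """Stage 1 (raw): only control totals and field gaps are knowable here."""
    if len(rows) != control_count:
        raise QCError(f"row count {len(rows)} != control count {control_count}")
    for field in required_fields:
        gaps = sum(1 for r in rows if not r.get(field))
        if gaps:
            raise QCError(f"{gaps} row(s) missing {field}")
    total = sum(float(r["paid_amount"]) for r in rows)
    if abs(total - control_total) > 0.005:
        raise QCError(f"paid total {total:.2f} != control total {control_total:.2f}")


def check_mapped(rows, run_date):
    """Stage 2 (mapped): field-specific validation becomes possible."""
    for i, r in enumerate(rows):
        try:
            service_date = date.fromisoformat(r["service_date"])
        except ValueError as exc:
            raise QCError(f"row {i}: bad service_date {r['service_date']!r}") from exc
        if service_date > run_date:
            raise QCError(f"row {i}: service_date {service_date} is in the future")


def run_pipeline(rows, control_count, control_total, run_date):
    """Run each stage's QC in order so a bad file stops at the earliest step."""
    required = ["member_id", "service_date", "paid_amount"]  # hypothetical fields
    check_raw(rows, control_count, control_total, required)
    # A real pipeline would map raw fields to analytic fields here.
    check_mapped(rows, run_date)
    return rows
```

The control count and control total would typically come from whoever sent the file, so a truncated or duplicated extract is caught before any mapping work is spent on it. Later-stage checks such as PMPM reasonableness would follow the same pattern as additional stages.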

Eric Olmsted, Ph.D. this is so spot on. If you pull the threads, much of what you describe depends on good #clinicalinformatics work, especially your theme of getting as close to the data source as possible. Good #clinicalinformatics also facilitates far more rapid iteration as you fail fast. In addition to curating the data as close to the source as possible, clinical informaticists can take the insights from the output of your data science tools and directly alter the transactional workflows...which then creates more data. If you do it right, you create a virtuous cycle where the data feeds the workflows and the workflows feed the data: a true learning healthcare system.

Matt Schroder

Software Engineer at Optum

6mo

Ah, this puts a lot of what we’ve done into perspective for me. Thank you 💯

Alexandra Schweitzer

Executive Fellow at the Harvard Business School Social Enterprise Initiative

6mo

Gaby Alcala-Levy True for RA data too?

Yubin Park, PhD

CEO at mimilabs | LinkedIn Top Voice | Ph.D., Machine Learning and Health Data

6mo

Eric Olmsted, Ph.D. I really love this! Thanks for sharing your valuable insights!
