Needs assessment for an educational LibreHealth EHR

Really good discussions by all:

  1. We all agree to proceed with creating the educational and research EHR. At this point the NHANES data set may be the most attractive, but with limitations. Marshfield Clinic stated it might take 2 weeks to give us an estimate of the cost to duplicate the NHANES data with “EHR patient data”. We might not be able to afford the data plus programming time. Sunbiz found a reasonable grant for research education but I think IUPUI is much better positioned to apply due to its huge department than UWF with limited faculty members. Hopefully, we can be a partner with IUPUI on grants. Regardless of the origin of the data sets I think we should proceed with a statement of work and timeline so we can set the funding mechanism in action.

  2. Behind the scenes we should continue the good discussions on the viability of adding a diabetes registry function and machine learning. Both will require further research and thought.

Speaking of “R”, does anyone other than me have any contacts at the RAND Corporation?

A study to objectively determine the efficacy of donated medications in free clinics might be something pharmaceutical companies would be willing to fund.

In the meantime I started some technical analysis to understand how to use the NHANES data. Using R, I’ve successfully generated a CSV file of 2013-2014 NHANES demographics data, which could next be used to generate patient records in LibreEHR.

Not sure what it’ll take to get NHANES data into WEKA, but will do some investigation. It seems reasonable to have both options of newer machine learning tools and traditional statistical analysis packages.

It would be great to approach analytics with NHANES data using R and WEKA. WEKA can analyze CSV files or connect directly to the MySQL database (I have not done that yet). The reason I selected the 2011-2012 time period was that it included medications. In the time periods following that, medications have not been published yet. Unfortunately, there is a 3-5 year delay in publishing data, which can be quite frustrating.

@aethelwulffe has tools for generating random patient demographic records… so we should be able to connect the NHANES information in some interesting ways to the generated patient demographics …

From what I recall of Art’s tool, it generates patients from the “database end” (direct SQL into the tables) which bypasses abstraction layers of the application.

Another approach to consider is to programmatically create the patients (and other information) through the application itself, using the front-end/web layer (e.g., through scripts that are normally used for automated testing).

This has the disadvantages of added complexity and slower speed, but the framework developed will be useful in the long run and be more flexible. Instead of mapping NHANES data to specific columns in tables across the database, mapping would be done to fields on forms in the application. That would make things like creating Layout Based Forms from NHANES data easier than having to recreate the multiple insert queries in SQL. It would also make “partial” mapping easier. As an example, if a LibreEHR field exists with no source from NHANES, should the database record “null” or “empty string” for consistency? If inserting the record directly through SQL, the code would need to be examined to know for sure what LibreEHR does, but if it’s done through the application itself, the field can be left empty when the page is submitted and the result will be consistent with manually entered info.
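
To make the form-level mapping idea concrete, here is a minimal Python sketch. The form field names (`external_id`, `sex`, `age_years`, `street`, `phone_home`) are hypothetical and are not taken from LibreEHR; the NHANES variables `SEQN`, `RIAGENDR` and `RIDAGEYR` come from the demographics file, and the sample row is made up. It only builds the values a test-automation script would type into the form, leaving unmapped fields blank so the application decides how to store them.

```python
# Hypothetical form field names; the real LibreEHR field names may differ.
NHANES_TO_FORM = {
    "SEQN": "external_id",
    "RIAGENDR": "sex",
    "RIDAGEYR": "age_years",
}

# Every field on the (hypothetical) form, including ones NHANES cannot fill.
FORM_FIELDS = ["external_id", "sex", "age_years", "street", "phone_home"]

SEX_CODES = {"1": "Male", "2": "Female"}  # NHANES RIAGENDR coding


def to_form_submission(nhanes_row):
    """Build the field/value pairs a script would enter into the form."""
    # Start with every field blank, as if a person had left it empty.
    form = {field: "" for field in FORM_FIELDS}
    for variable, field in NHANES_TO_FORM.items():
        value = nhanes_row.get(variable, "")
        if variable == "RIAGENDR":
            value = SEX_CODES.get(value, "")
        form[field] = value
    return form


# Made-up sample row, for illustration only.
row = {"SEQN": "1001", "RIAGENDR": "1", "RIDAGEYR": "22"}
submission = to_form_submission(row)
```

Here `submission["street"]` stays an empty string, which is the point: the script never has to decide between “null” and “empty string”.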

To overcome some of the disadvantage of speed, we can publish “pre-populated” databases so that everyone doesn’t have to run the record generation script for themselves.

Good to know about the completeness of the datasets and the update delays. Eventually, the tools should be able to handle data from any of the available time periods seamlessly (maybe not all the way back to NHANES I, II and III, just the continuous datasets).

It would be great if we could start with one time period, such as 2011-2012, and then down the road, depending on our goals, add more patients from other time periods. The point was made that the patients studied are not as sick as a hospital-based population, so you need more patients for data mining to have an adequate number of abnormals.

A proposed “educational curriculum” for Informatics Students using this data would be to calculate MIPS quality measures from the NHANES population.

For example, from https://qpp.cms.gov/measures/quality:

  1. Diabetes: Hemoglobin A1c (HbA1c) Poor Control (>9%)
  2. Coronary Artery Disease (CAD): Angiotensin-Converting Enzyme (ACE) Inhibitor or Angiotensin Receptor Blocker (ARB) Therapy - Diabetes or Left Ventricular Systolic Dysfunction (LVEF < 40%)
  3. Statin Therapy for the Prevention and Treatment of Cardiovascular Disease

Students could learn about generating these results using R, WEKA or directly from LibreEHR.

The clinical attributes in the table Dr. Hoyt provided in an earlier message would be enough to calculate these measures, since it includes medication lists and glycohemoglobin in the lab data.
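
As a rough illustration of what a student exercise for the HbA1c poor control measure could look like, here is a simplified Python sketch. It is not the official measure logic: the 18-75 age range and the handling of patients without a result are simplifying assumptions, the record keys (`age`, `has_diabetes`, `hba1c`) are hypothetical, and the sample patients are made up.

```python
def hba1c_poor_control_rate(patients):
    """Share of diabetic patients aged 18-75 with a recorded HbA1c above 9%.

    Simplification: patients with no recorded HbA1c are left out entirely.
    """
    eligible = [
        p for p in patients
        if p["has_diabetes"] and 18 <= p["age"] <= 75 and p["hba1c"] is not None
    ]
    if not eligible:
        return None  # no one to measure
    poor = [p for p in eligible if p["hba1c"] > 9.0]
    return len(poor) / len(eligible)


# Made-up patients, for illustration only.
patients = [
    {"age": 54, "has_diabetes": True, "hba1c": 9.8},
    {"age": 61, "has_diabetes": True, "hba1c": 7.1},
    {"age": 47, "has_diabetes": True, "hba1c": 10.4},
    {"age": 70, "has_diabetes": True, "hba1c": 6.5},
    {"age": 35, "has_diabetes": False, "hba1c": 5.2},  # not diabetic
    {"age": 80, "has_diabetes": True, "hba1c": 9.5},   # outside age range
]
rate = hba1c_poor_control_rate(patients)
```

With these six records, four are eligible and two of them are above 9%, so `rate` is 0.5.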

Some other possible measures to look at are:

  1. Diabetes: Medical Attention for Nephropathy
  2. Diabetic Retinopathy: Communication with the Physician Managing Ongoing Diabetes Care
  3. Diabetes Mellitus: Diabetic Foot and Ankle Care, Peripheral Neuropathy - Neurological Evaluation
  4. Diabetes Mellitus: Diabetic Foot and Ankle Care, Ulcer Prevention - Evaluation of Footwear
  5. Diabetes: Eye Exam
  6. Diabetes: Foot Exam

Not sure if NHANES captures information related to these measures, but a follow-up “project” for the HIT student would be a gap analysis for these measures.

There actually is a full set of all measure calculations for all current US quality measures coded and probably easily importable as a module to the current LibreHealthEHR scheme. They have not been released to the project, because they took over a year of work, we are starving, and we have not made the code pay us yet, but if we starve to death by the end of March or so, the code might go to the project. I will be working in a boatyard or something to pay the bills by then anyway. I am not plugging it, but as we are on the subject, you could look at the videos on suncoastconnection.com to get the idea of how extensive they are.

There is no question that mining quality measures would be good practice for most students in healthcare. Note that the measures related to diabetes list the EHR and/or Registry as sources. At this point, just getting students to perform descriptive statistics with Excel would be a step in the right direction and get them ready for predictive analytics with R, Python, statistics packages or machine learning.

How is this scheme layout progressing, and are you two still working on an SOW?

Tony is working on an SOW, to my knowledge. Bob

There is another relevant topic related to enhancing LibreHealth EHR: creating SMART apps using FHIR. The app gallery was updated today and includes several predictive analytical apps. As you may or may not know, FHIR apps are an extremely hot topic in informatics today and may help solve interoperability challenges.

Has anyone on the forum created a SMART app? The goal would be to link an app to LibreHealth EHR and call up data using FHIR standards. I think we could create, e.g., a diabetes “dashboard” using this technology. Additionally, I suppose we could have predictive analytics for specific clinical problems. Most predictive analytics is aimed at inpatients (predict mortality, morbidity, sepsis, readmissions, etc.). Using the ambulatory EHR model, we might predict 1. diabetes, 2. hypertension, etc.
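
For anyone wondering what “calling up data using FHIR” might look like for a diabetes dashboard, here is a small Python sketch. Nothing in this thread confirms that LibreHealth EHR exposes a FHIR endpoint, so the base URL and patient id are hypothetical, and the sample Bundle is made up. It builds a standard FHIR Observation search for HbA1c results (LOINC 4548-4) and pulls the numeric values out of a returned Bundle; no network call is made.

```python
from urllib.parse import urlencode

HBA1C_LOINC = "4548-4"  # LOINC code for hemoglobin A1c in blood


def hba1c_search_url(fhir_base, patient_id):
    """Build a FHIR search URL for one patient's HbA1c observations."""
    query = urlencode({
        "patient": patient_id,
        "code": f"http://loinc.org|{HBA1C_LOINC}",
        "_sort": "-date",  # newest first
    })
    return f"{fhir_base}/Observation?{query}"


def hba1c_values(bundle):
    """Extract numeric results from a FHIR search Bundle (as a dict)."""
    values = []
    for entry in bundle.get("entry", []):
        quantity = entry["resource"].get("valueQuantity")
        if quantity is not None:
            values.append(quantity["value"])
    return values


# Hypothetical server address and patient id.
url = hba1c_search_url("https://ehr.example.org/fhir", "123")

# Made-up response, shaped like a FHIR search Bundle.
sample_bundle = {
    "resourceType": "Bundle",
    "entry": [
        {"resource": {"resourceType": "Observation",
                      "valueQuantity": {"value": 8.2, "unit": "%"}}},
        {"resource": {"resourceType": "Observation"}},  # no numeric value
    ],
}
values = hba1c_values(sample_bundle)
```

A real SMART app would also have to handle the authorization handshake before making the request, which this sketch leaves out.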

@rhoyt one of the models I would like to see is where educators make pull requests to the main project… and this would be a good example to look at with FHIR – so we can have suggested topics for students to work against the LibreHealth education toolkit … and have a forum where students showcase their work, which is subsequently displayed for other students and educators

Just a suggestion

I think that would be great. Another initiative we might link to down the road is the medical algorithm project (MEDAL), which has over 2000 evidence-based algorithms. I was unaware that they have APIs, so that might be another good student project. I have an email in with them for more details.

I am currently using about 13,000 calculations (about 34,000 actual elements) just to do the PQRS bit. MEDAL is also pretty restrictive in its use… and expensive, though I am not sure about educational licenses.

The scheme is still where I left it at the time of writing. Unfortunately, the NHANES data lacks some of the keys needed for collation when it comes to making a complete record. It always seems to come down to the fact that you don’t have a procedure code to identify stuff…

I would be happy to add procedure codes where appropriate.

Today in JMIR is a review article about open source EHRs. While they highlight OpenEMR, they don’t mention LibreHealth EHR, simply because the article was actually written before its appearance on the scene. I didn’t see an option to write a comment on this open access article, but there is an opportunity to tweet back. Might be a good opportunity to promote LibreHealth.

What am I missing with regards to procedure codes that would be an obstacle to creating useful records?

All of the tables have SEQN (Respondent sequence number) as one of the fields, which would be the appropriate primary key for collation.

The data needed to look at something like diabetes control via A1c levels across education levels seems to be available.

https://wwwn.cdc.gov/Nchs/Nhanes/2011-2012/GHB_G.htm LBXGH - Glycohemoglobin (%)

https://wwwn.cdc.gov/Nchs/Nhanes/2011-2012/DEMO_G.htm DMDEDUC2 - Education level - Adults 20+
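
As a sketch of collating on SEQN, the Python below joins a glycohemoglobin table to a demographics table and averages LBXGH per DMDEDUC2 level. It assumes both files have already been exported to CSV with those column names; the tiny inline tables are made-up stand-ins so the example runs without any downloads.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up stand-ins for CSV exports of GHB_G and DEMO_G.
ghb_csv = """SEQN,LBXGH
1,5.4
2,9.6
3,6.1
4,7.9
"""
demo_csv = """SEQN,DMDEDUC2
1,5
2,1
3,5
4,1
"""


def mean_a1c_by_education(ghb_file, demo_file):
    """Join the two tables on SEQN and average LBXGH per education level."""
    education = {row["SEQN"]: row["DMDEDUC2"] for row in csv.DictReader(demo_file)}
    groups = defaultdict(list)
    for row in csv.DictReader(ghb_file):
        level = education.get(row["SEQN"])
        if level and row["LBXGH"]:  # skip respondents missing either value
            groups[level].append(float(row["LBXGH"]))
    return {level: round(mean(values), 2) for level, values in groups.items()}


result = mean_a1c_by_education(io.StringIO(ghb_csv), io.StringIO(demo_csv))
```

The same function works on real files by passing open file handles instead of `io.StringIO` objects. A proper analysis would also need to account for the NHANES survey weights, which this sketch ignores.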

We don’t have ICD10 codes per se, but we could add some diagnoses based on responses to the medical conditions questionnaire. https://wwwn.cdc.gov/Nchs/Nhanes/2011-2012/MCQ_G.htm
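
One way the questionnaire-to-diagnosis idea could be sketched in Python is shown below. The choice of variables and the ICD-10-CM code assigned to each are my assumptions for illustration and should be checked against the MCQ_G codebook and by someone with coding expertise; the sample respondent is made up.

```python
# Assumed mapping from MCQ variables to ICD-10-CM codes; verify both the
# variable meanings and the chosen codes before using them.
MCQ_TO_ICD10 = {
    "MCQ010": "J45.909",   # ever told you have asthma
    "MCQ160B": "I50.9",    # ever told you had congestive heart failure
    "MCQ160C": "I25.10",   # ever told you had coronary heart disease
}
YES = 1  # NHANES codes "Yes" as 1 and "No" as 2


def diagnoses_from_mcq(responses):
    """Return the ICD-10 codes for every condition answered 'Yes'."""
    return [code for var, code in MCQ_TO_ICD10.items() if responses.get(var) == YES]


# Made-up respondent: asthma yes, heart failure no, coronary disease yes.
codes = diagnoses_from_mcq({"MCQ010": 1, "MCQ160B": 2, "MCQ160C": 1})
```

Unanswered or refused questions simply produce no diagnosis, which seems like the safest default for generated records.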

For billing, the encounters could be coded with the CPT for E/M, using either the same level throughout or a randomly chosen level (99201-99205 or 99211-99215).
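
A minimal Python sketch of that choice follows. The function name and its parameters are hypothetical; the code ranges are the ones given above, with the first range treated as new-patient visits and the second as established-patient visits.

```python
import random

NEW_PATIENT_EM = [str(code) for code in range(99201, 99206)]          # 99201-99205
ESTABLISHED_PATIENT_EM = [str(code) for code in range(99211, 99216)]  # 99211-99215


def pick_em_code(new_patient, rng, fixed_level=None):
    """Return an E/M code: a fixed level (1-5) if given, otherwise random."""
    codes = NEW_PATIENT_EM if new_patient else ESTABLISHED_PATIENT_EM
    if fixed_level is not None:
        return codes[fixed_level - 1]
    return rng.choice(codes)


rng = random.Random(2011)  # seeded so generated databases are reproducible
same_level = pick_em_code(True, rng, fixed_level=3)
random_level = pick_em_code(False, rng)
```

Seeding the generator matters if pre-populated databases are published, since everyone regenerating the data would then get identical encounters.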

A “laboratory code” for each of the results in Dr. Hoyt’s list of important clinical attributes might be useful but doesn’t seem strictly required.

NHANES is missing data corresponding to “visits” that could map to “encounters” as they are tracked in LibreEHR, but it seems reasonable to just create one arbitrarily. At first, maybe just make them all happen on January 1, 2011 for simplicity, but we can come up with a better scheme later (pick a pseudo-random date, or assign each record to successive days on the calendar, excluding weekends).
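
The “successive days, excluding weekends” scheme could be sketched in Python like this. The function name is hypothetical, and it assigns one encounter per weekday, which is just one of several reasonable choices.

```python
from datetime import date, timedelta


def weekday_dates(start, count):
    """Return `count` successive dates on or after `start`, skipping weekends."""
    dates = []
    current = start
    while len(dates) < count:
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            dates.append(current)
        current += timedelta(days=1)
    return dates


encounter_dates = weekday_dates(date(2011, 1, 1), 6)
```

Since January 1, 2011 fell on a Saturday, the first encounter lands on Monday, January 3, and the sixth on Monday, January 10.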