Project: Improving FHIR Analytics Using Apache Spark and Cassandra

(Prashadi) #1

Hi @sunbiz, @namratanehete, @r0bby,

During last year’s GSoC we built the base of the FHIR analytics module. Now we can focus on integrating the module into the platform and improving its features.

The current LibreHealth FHIR analytics module is powered by Apache Spark, Apache Cassandra and the Cerner Bunsen modules, together with the Spring framework. The module provides the flexibility to analyze FHIR data stored in the Cassandra database. It uses the functionality of the Cerner Bunsen module with Spark SQL, which provides a convenient interface for querying FHIR data with different parameters. The module also allows users to join multiple FHIR resources and find relationships between them through the Spark SQL query language.
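To give a rough idea of what joining FHIR resources in SQL looks like, here is a minimal sketch. It is not the module's code: it uses Python's built-in `sqlite3` as a stand-in for Spark SQL, and the flattened `patient` and `observation` tables, their columns, the sample rows and the helper `average_value_by_gender` are all hypothetical. In the real module the tables would come from Bunsen-encoded FHIR resources, whose schema is nested rather than flat.

```python
import sqlite3

# In-memory database used purely as a stand-in for a Spark SQL session.
conn = sqlite3.connect(":memory:")

# Hypothetical, flattened versions of the FHIR Patient and Observation resources.
conn.executescript("""
CREATE TABLE patient (id TEXT PRIMARY KEY, gender TEXT, birth_date TEXT);
CREATE TABLE observation (id TEXT PRIMARY KEY, subject_id TEXT, code TEXT, value REAL);
INSERT INTO patient VALUES
  ('p1', 'female', '1980-04-02'),
  ('p2', 'male', '1975-11-30');
INSERT INTO observation VALUES
  ('o1', 'p1', '8867-4', 72.0),
  ('o2', 'p1', '8867-4', 80.0),
  ('o3', 'p2', '8867-4', 65.0);
""")


def average_value_by_gender(connection, code):
    """Join observations to their patients and average one observation code per gender."""
    query = """
        SELECT p.gender, AVG(o.value) AS avg_value
        FROM observation AS o
        JOIN patient AS p ON o.subject_id = p.id
        WHERE o.code = ?
        GROUP BY p.gender
        ORDER BY p.gender
    """
    return connection.execute(query, (code,)).fetchall()


# 8867-4 is the LOINC code for heart rate.
rows = average_value_by_gender(conn, "8867-4")
print(rows)
```

The shape of the query (join on the patient reference, filter by code, aggregate) is the part that carries over; the table and column names would need to match whatever the module actually registers with Spark.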

The repository can be found here.

Objectives

  • Integrate the FHIR analytics module with the platform

  • Upgrade the Cerner Bunsen module to get the latest functionality

  • Integrate with Apache Drill

  • Enhance the query interface and incorporate it with the FHIR query builder

Prerequisites

  • A non-trivial contribution to the LibreHealth platform

  • Good understanding of Spring, FHIR, Cassandra and Spark SQL

  • Try out the module functionality

Extra Credit

  • Write a paper on the work accomplished during the project

(Shubhankar Mohapatra) #14

Hey @prashadi,

I am interested in this project and would be very glad to know more about it. I see that there are currently no issues to work on. I have built the repo from GitLab and would like to hear from you how I can start contributing.


(Prashadi) #15

Thank you for your interest, @mshubhankar. I’ll check and add a few issues. In the meantime, you don’t have to contribute to this project only. You can also contribute to existing projects such as toolkit and radiology.


(Shubhankar Mohapatra) #16

Sure. I have been trying to solve issues from the lh-ehr project, but I would also like to get some hands-on experience with this project and understand exactly how it is going to work.


(Prashadi) #17

@manushree That’s great. I’m hoping to add a set of issues soon.


(Manthan Admane) #18

Hello. My name is Manthan Admane [UserID: MisterAwesome]

Looking at the overall abstract, I am definitely interested in contributing and helping move this project forward.

At the same time, to be very honest, most of the terms [“Apache Spark”, “Apache Cassandra”, “Cerner Bunsen modules”] in the project description are daunting to me; in fact, to be equally honest, I’ve heard of them for the first time. XD

At present I am pursuing my B.Tech in Computer Science (Junior/Third Year) and am aware of related technologies like MySQL, Apache servers, etc. I am interested in web technology and so have basic competency in that area too. I feel confident (and hope) that I will pick up all the technical requisites for this project by putting in the required effort. By the way, I am already working on a non-trivial contribution to the LibreHealth platform with lh-ehr :)

Looking forward to this positively. Thank you.

P.S. I am already researching all the new terms and technologies to learn more about them ;)


(Shubhankar Mohapatra) #19

Hey @prashadi! Haven’t heard from you in a long time. How far along are we with the new issues? I have been through your blogs and I would really like to know more about where you stopped and where I can begin contributing. As I see it, Bunsen has been upgraded to 0.4 since you stopped, and so have Spark to 2.3 and HAPI to 3.6. Once we are done with those upgrades, we can add functionality for Drill and enhance the query interface (I need to know more about that as well), which should be the relatively harder part of the project.


(Prashadi) #20

@mshubhankar Sorry for the delay. I saw that you have sent a pull request to the project. Were you able to test the functionality after the upgrade? The pull request looks good to me.


(Shubhankar Mohapatra) #21

@prashadi Yeah, there’s no problem creating the war file and deploying it on the server. But there has been a weird problem lately, and it also happens without the changes that I have made. The SQL query times out and the app gets stuck on “please wait…”. I am attaching an image for your reference.

Note that this also happens without the changes in the PR, and it wasn’t happening before.


(Prashadi) #22

@mshubhankar Can you clean the webapps directory and redeploy the entire application? Do you see any errors in the backend logs?


(Shubhankar Mohapatra) #23

@prashadi I found the problem. I have both Java 8 and Java 11 on my laptop, and the war file was built with Java 8 whereas Tomcat was running on Java 11. Anyway, things are up and running now. I tested the functionality after the PR and it’s working just fine :)


(Prashadi) #24

Great, I’ll merge the pull request after I have tested it locally as well. Good work.


(Shubhankar Mohapatra) #25

Hi @prashadi @sunbiz @namratanehete @r0bby,

I’m starting to write my proposal and wanted to get my objectives sorted out. I have some questions about the objectives I have in mind and would be glad if someone could help me out with these:

  1. Upgrade FHIR analytics to work with the latest Cerner Bunsen - The latest Cerner Bunsen depends on Spark 2.3 and Hapi 0.4.6. I have made these changes and opened a PR with the updates. Another objective of the project is to get the latest functionality of Bunsen, but I’m not sure which functionality we can add to the project. Do we want some added functionality from here?

  2. Complete the TODO implementations in the utils/LibrehealthAnalyticsUtil.java file

  3. Integrate with Apache Drill - Apache Drill has good documentation for use with JDBC drivers and in the Spark environment, but there seems to be some problem integrating it with Cassandra. The documentation mentions that a custom plugin can be written for Cassandra, but I can’t find any prior work to refer to. I have also found a post which is old and deprecated but can be used as a reference. It would be great to have some information about this.

  4. Enhance query interface - Are there any particular interface enhancements in mind?

Have I missed anything, apart from writing a research paper, which I hope can be done once the objectives are complete?


(Shubhankar Mohapatra) #26

Hey @prashadi @sunbiz @namratanehete @r0bby,

I’ve written the first draft of my proposal here. Please have a look and get back to me with your feedback and comments so that I can make it better.


(Robby O'Connor) #27

I would like to remind you and everyone reading this that without doing the starter tasks, we will not consider your application at all, no matter how strong it is.


(Shubhankar Mohapatra) #28

I am sorry I missed the intro task. I have now completed intro task 1 and my code can be found here. I hope you can now review my proposal.


(Robby O'Connor) #29

Link to it in your proposal.


(Shubhankar Mohapatra) #30

I have added it under the “Contributions and work done” section of my proposal :D


(Prashadi) #31

@mshubhankar Sorry for the delay. I was away for just over a week. I’ll have a look at the proposal and get back to you.


(Judy Gichoya) #32

@mshubhankar Great start on your proposal.

I don’t see you acknowledging the work @prashadi did last year in GSoC, or how your work will improve on it.

This year the intended outcome of this project is to make it usable, especially for non-technical users, so expect a UI improvement too.
