Project: Implement Spring Data to LibreHealth Toolkit

gsoc2018-project
gsoc2018
toolkit

(Emmanuel Nyachoke) #33

@judywawira Thanks for the guidance. After reading through the older posts, I understand that the plan is to use Spring and the HAPI FHIR POJOs, with Cassandra for storage, to build a data layer that can then either back a REST API or be consumed by other projects. Is this correct?


(Prashadi) #34

@sunbiz Yes, I was looking at the option of integrating Spark and Cassandra. This option is available with the DataStax driver. So the next challenge is loading the data from Cassandra using Bunsen. FHIR bundles can be loaded into Bunsen from the file system, in JSON/XML format, with the provided APIs. I will also work on writing a blog post after I investigate this further.


(Yash D. Saraf) #35

@sunbiz @namratanehete @prashadi
I’m defining my own resource providers, which in turn call the Spring Data repository interfaces. These resource providers can be plugged directly into the HAPI JPA example here.
But when I use the HAPI FHIR structures in the repository interface, I get an IllegalArgumentException saying the class isn’t a managed type.
I also tried porting the app to Spring Boot and using Spring Boot’s EntityScan, but that threw the same error.
Any ideas?
I am currently trying to get a bare-bones Spring Boot app to work with the HAPI FHIR DSTU3 structures. I’ll report back if I find anything new.


(Yash D. Saraf) #36

I was able to get rid of the "isn’t a managed type" error by adding a Hibernate mapping file for the model (I presume similar results can be achieved using JPA’s orm.xml as well). For now I’m working with just the Patient.
The app now uses my custom resource provider, which in turn calls the Spring Data repository for the model. For now all the Patients are returned on each query; I’ll work on fixing this after I get the app running on Cassandra.


(Prashadi) #37

@yashdsaraf when we move to Spring Data, will we store the resources in the following table format? Will there be a separate table per resource?

TABLE PATIENT (
    key INT PRIMARY KEY,
    content TEXT
)

@sunbiz I was able to try out the Cassandra sample. I’m looking at the possibility of loading data from Cassandra into Spark with the above type of table. I will blog about this as soon as possible.
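
To make the key/content layout asked about above concrete, here is a minimal plain-Java sketch. It assumes one table per resource type and the whole resource serialized (for example as JSON) into a single content value. The class `SingleColumnStoreSketch` and its methods are hypothetical names for illustration; they use in-memory maps rather than Cassandra or Spring Data.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical in-memory stand-in for a keyspace with one table per resource type.
final class SingleColumnStoreSketch {

    // Resource type (e.g. "PATIENT") -> table rows, each row being key -> content.
    private final Map<String, Map<Integer, String>> tables = new HashMap<>();

    // Stores the whole serialized resource in the row's single content value.
    void save(String resourceType, int key, String content) {
        tables.computeIfAbsent(resourceType, t -> new HashMap<>()).put(key, content);
    }

    // Returns the stored content, or empty if the table or the row does not exist.
    Optional<String> find(String resourceType, int key) {
        Map<Integer, String> rows = tables.get(resourceType);
        return rows == null ? Optional.empty() : Optional.ofNullable(rows.get(key));
    }

    public static void main(String[] args) {
        SingleColumnStoreSketch store = new SingleColumnStoreSketch();
        // Assumed example payload; a real resource would be serialized by a FHIR parser.
        store.save("PATIENT", 1, "{\"resourceType\":\"Patient\",\"id\":\"1\"}");
        System.out.println(store.find("PATIENT", 1).orElse("not found"));
        System.out.println(store.find("OBSERVATION", 1).isPresent());
    }
}
```

With this layout a consumer only needs the key and the serialized content, at the cost of not being able to query individual fields on the database side.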


(Yash D. Saraf) #38

Yes, since spring data repositories treat each model as a table, we will be treating each resource (at least all those which are exposed via the api) as a table.


(Prashadi) #39

Great, thanks for the information.


(Prashadi) #40

@sunbiz @namratanehete @judywawira I have written a blog post about my current findings at https://medium.com/@prkpbandara/gsoc-librehealth-working-with-bunsen-for-fhir-analytics-using-spark-5d198057e5c5. I got a simple Spark-Cassandra sample to work. Now I’m looking at the possibility of integrating it with Bunsen.


(Yash D. Saraf) #41

@sunbiz Is it alright if I clone all the HAPI FHIR DSTU3 structures and modify them locally? Unlike with JPA, with Cassandra there are two issues:

  1. Only primitive types and collections or maps of primitive types are allowed
  2. The models have to be annotated with Table and PrimaryKey annotations

This will require me to modify seemingly all the structures.


(Saptarshi Purkayastha) #42

I don’t think we should duplicate the libraries. Can you extend the HAPI DSTU3 structures instead? The annotations can also be applied through an external configuration class. I recommend that you use that instead.


(Saptarshi Purkayastha) #43

Can you please create a separate forum post to track your work and what you will deliver in your project? Your project is to build an analytics engine, and a UI for it, based on Apache Spark and Cassandra, which may or may not use Bunsen, depending on how much we get out of it.


(Prashadi) #44

@sunbiz Sure, I’ll create a separate post.


(Yash D. Saraf) #45

@sunbiz
For the issue of only primitive types and collections or maps of primitive types being allowed, I can use custom converters to map the non-primitive types to primitive ones.
For the second issue, you mention that the annotations can also be applied through an external configuration class. I imagine you’re talking about XML-based configuration like a JPA or Hibernate mapping. I’ve looked through the Spring Data Cassandra docs, and they don’t seem to allow XML-based configuration for mapping tables and primary keys.
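
The converter idea for the first issue can be sketched roughly as below. This is only an illustration under stated assumptions: `BooleanTypeStandIn` is a hypothetical stand-in for HAPI’s BooleanType, and `SimpleConverter` is a local stand-in with the same shape as Spring’s `Converter<S, T>` interface, so that the block compiles without any dependencies. In a real app the converters would implement Spring’s interface and be registered with the Cassandra mapping configuration.

```java
// Hypothetical stand-in for a HAPI primitive wrapper such as BooleanType.
final class BooleanTypeStandIn {
    private final Boolean value;

    BooleanTypeStandIn(Boolean value) {
        this.value = value;
    }

    Boolean getValue() {
        return value;
    }
}

// Local stand-in with the same shape as Spring's Converter<S, T>.
interface SimpleConverter<S, T> {
    T convert(S source);
}

// Write side: unwraps the FHIR-style wrapper into a type Cassandra can store.
final class BooleanTypeWriteConverter implements SimpleConverter<BooleanTypeStandIn, Boolean> {
    @Override
    public Boolean convert(BooleanTypeStandIn source) {
        return source == null ? null : source.getValue();
    }
}

// Read side: wraps the stored value back into the FHIR-style wrapper.
final class BooleanTypeReadConverter implements SimpleConverter<Boolean, BooleanTypeStandIn> {
    @Override
    public BooleanTypeStandIn convert(Boolean source) {
        return source == null ? null : new BooleanTypeStandIn(source);
    }
}

final class ConverterSketch {
    public static void main(String[] args) {
        Boolean stored = new BooleanTypeWriteConverter().convert(new BooleanTypeStandIn(true));
        BooleanTypeStandIn restored = new BooleanTypeReadConverter().convert(stored);
        System.out.println(stored + " / " + restored.getValue());
    }
}
```

One converter pair is needed per non-primitive type, so this approach grows with the number of FHIR datatypes used by the exposed resources.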


(Emmanuel Nyachoke) #46

Hi, it would be nice if you used Java annotations for this. I think what @sunbiz is saying is that you will need to extend the HAPI models like so:

`@Table

public class CustomPatient extends Patient { }`

Then you can add any other annotations you need. You should generally try to avoid XML-based configuration if the same thing can be done with annotations.


(Prashadi) #47

@yashdsaraf one more question. Taking Patient as an example: will the patient representation be stored in a single column, or will you use columns such as name, age, dateOfBirth, etc. to store the patient details separately?


(Lenya Hope Nembi) #48

@enyachoke I tried this approach in my Spring Data app for Radiology but got the same error: "only primitive types and Collections or Maps of primitive types are allowed" (Full Stack Trace).

This is my example model

package com.example.cassandra.model;

import org.hl7.fhir.dstu3.model.ImagingStudy;
import org.springframework.data.cassandra.core.mapping.PrimaryKey;
import org.springframework.data.cassandra.core.mapping.Table;

@Table
public class CustomImagingStudy extends ImagingStudy {

    @PrimaryKey
    private String id;

}


(Yash D. Saraf) #49

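@prashadi I created a sample Patient model locally with three fields:

import java.util.Date;
import java.util.UUID;

import org.springframework.data.cassandra.core.mapping.PrimaryKey;
import org.springframework.data.cassandra.core.mapping.Table;

@Table
public class Patient {

    @PrimaryKey
    private UUID id;

    private String name;

    private Date birthDate;

    // getters, setters, constructors...
}

I attached it to a Spring Data repository, with the Cassandra configuration’s schema action set to drop all tables and create only the known ones on application startup.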
This gave me the following schema,

cqlsh> describe cassandra.patient

CREATE TABLE cassandra.patient (
    id uuid PRIMARY KEY,
    birthdate timestamp,
    name text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

So it seems spring data cassandra treats each field as a separate column.
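
The field-to-column behaviour observed above can be mimicked with a small plain-Java sketch. This is not how Spring Data Cassandra is implemented; it only reproduces the visible result (one lower-cased column per field) using reflection. `SamplePatient`, `ColumnSketch`, and the three type mappings are assumptions made for illustration.

```java
import java.lang.reflect.Field;
import java.util.Date;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.UUID;

// Hypothetical model mirroring the three-field sample (annotations omitted).
final class SamplePatient {
    UUID id;
    String name;
    Date birthDate;
}

final class ColumnSketch {

    // Maps the Java types used in the sample to the CQL types seen in the schema.
    static String cqlType(Class<?> type) {
        if (type == UUID.class) return "uuid";
        if (type == String.class) return "text";
        if (type == Date.class) return "timestamp";
        throw new IllegalArgumentException("No mapping for " + type.getName());
    }

    // Produces one column per declared field, keyed by the lower-cased field name.
    static Map<String, String> columnsOf(Class<?> model) {
        Map<String, String> columns = new TreeMap<>();
        for (Field field : model.getDeclaredFields()) {
            if (field.isSynthetic()) {
                continue; // skip compiler-generated fields
            }
            columns.put(field.getName().toLowerCase(Locale.ROOT), cqlType(field.getType()));
        }
        return columns;
    }

    public static void main(String[] args) {
        // prints {birthdate=timestamp, id=uuid, name=text}
        System.out.println(columnsOf(SamplePatient.class));
    }
}
```

A field whose type has no mapping raises an exception in this sketch, which loosely mirrors why nested FHIR datatypes need converters before they can be stored.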


(Prashadi) #50

Thanks for the information, @yashdsaraf. @sunbiz previously thought of storing the entire resource in a single column. For my part, I just need to know the format so that I can load the data from Cassandra appropriately.


(Saptarshi Purkayastha) #51

No, I didn’t mean using the XML configuration. I don’t see any repo for this work that I could use to point out how to solve this.


(Yash D. Saraf) #52

@sunbiz I was trying Cassandra out on a sample Spring Boot app; here’s the link. When you run it, it’ll throw an error showing Unconfigured table Patient. I was able to solve this by annotating a subclass of HAPI’s Patient model with @Table (I haven’t pushed that yet). But it still throws an error about a missing primary key attribute, which requires either the @PrimaryKey or the @Id annotation on a field (I tried applying them to an overridden getter/setter, but that didn’t seem to work).

PS: That repo also has a custom converter for BooleanType to Boolean and vice versa.