Project: Implement Spring Data to LibreHealth Toolkit

(Prashadi) #40

@sunbiz @namratanehete @judywawira I have written a blog post about my current findings at https://medium.com/@prkpbandara/gsoc-librehealth-working-with-bunsen-for-fhir-analytics-using-spark-5d198057e5c5. I got a simple Spark Cassandra sample to work. Now I’m looking at the possibility of integrating it with Bunsen.


(Yash D. Saraf) #41

@sunbiz Is it alright if I clone all the HAPI FHIR DSTU3 structures and modify them locally? Unlike with JPA, Cassandra raises two issues:

  1. Only primitive types and collections or maps of primitive types are allowed
  2. The models have to be annotated with Table and PrimaryKey annotations

This will require me to modify seemingly all the structures.


(Saptarshi Purkayastha) #42

I don’t think we should duplicate the libraries. Can you extend the HAPI DSTU3 structures instead? The annotations can also be applied through an external configuration class. I recommend that you use that instead.


(Saptarshi Purkayastha) #43

Can you please create a separate forum post to track your work and what you will deliver in your project? Your project is to build an analytics engine and a UI for it, based on Apache Spark and Cassandra, which may or may not use Bunsen, depending on how much we get out of it.


(Prashadi) #44

@sunbiz Sure, I’ll create a separate post.


(Yash D. Saraf) #45

@sunbiz
For the issue of only primitive types and collections or maps of primitive types being allowed, I can use custom converters to map the non-primitive types to primitive ones.
For the second issue, you mention that the annotations can also be done through an external configuration class. I imagine you’re talking about XML-based configuration like JPA or Hibernate mapping. I’ve looked through the Spring Data Cassandra docs, and they don’t seem to allow XML-based configuration for mapping tables and primary keys.
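
The converter idea can be sketched roughly as below. This is only an illustration under assumptions: `FhirBoolean` is a hypothetical stand-in for HAPI's `BooleanType`, and `java.util.function.Function` stands in for Spring's `Converter` interface so that the snippet compiles without any external jars. In a real app the two classes would presumably implement `org.springframework.core.convert.converter.Converter` and be registered with Spring Data Cassandra's custom conversions.

```java
import java.util.function.Function;

/** Hypothetical stand-in for HAPI's BooleanType so the sketch needs no external jars. */
final class FhirBoolean {
    private final Boolean value;

    FhirBoolean(Boolean value) {
        this.value = value;
    }

    Boolean getValue() {
        return value;
    }
}

/** Write direction: FHIR wrapper type -> primitive value a Cassandra column can hold. */
final class FhirBooleanToBoolean implements Function<FhirBoolean, Boolean> {
    @Override
    public Boolean apply(FhirBoolean source) {
        // A missing wrapper is stored as a null column value.
        return source == null ? null : source.getValue();
    }
}

/** Read direction: column value -> FHIR wrapper type. */
final class BooleanToFhirBoolean implements Function<Boolean, FhirBoolean> {
    @Override
    public FhirBoolean apply(Boolean source) {
        return new FhirBoolean(source);
    }
}
```

One such pair would be needed for every non-primitive FHIR type that appears in a mapped model, which is why the number of converters could grow quickly.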


(Emmanuel Nyachoke) #46

Hi, it would be nice if you used Java annotations for this. I think what @sunbiz is saying is that you will need to extend the HAPI models like so:

@Table
public class CustomPatient extends Patient { }

Then you can add any other annotations you need. You should generally try to avoid XML-based configuration if the same thing can be done with annotations.


(Prashadi) #47

@yashdsaraf One more question. If we take Patient as an example, will the patient representation be stored in a single column, or will you use columns such as name, age, dateOfBirth, etc. to store the patient details separately?


(Lenya Hope Nembi) #48

@enyachoke I tried this approach in my spring-data app for Radiology but got the same error: only primitive types and Collections or Maps of primitive types are allowed (Full Stack Trace).

This is my example model:

package com.example.cassandra.model;

import org.hl7.fhir.dstu3.model.ImagingStudy;
import org.springframework.data.cassandra.core.mapping.PrimaryKey;
import org.springframework.data.cassandra.core.mapping.Table;

@Table
public class CustomImagingStudy extends ImagingStudy {

    @PrimaryKey
    private String id;

}


(Yash D. Saraf) #49

@prashadi I created a sample Patient model locally with three fields:

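import java.util.Date;
import java.util.UUID;
import org.springframework.data.cassandra.core.mapping.PrimaryKey;
import org.springframework.data.cassandra.core.mapping.Table;

@Table
public class Patient {

    @PrimaryKey
    private UUID id;

    private String name;

    private Date birthDate;

    // getters, setters, constructors...
}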

and attached it to a Spring Data repository, with the Cassandra configuration’s schema action set to drop all tables and create only the known ones on application startup.
This gave me the following schema:

cqlsh> describe cassandra.patient

CREATE TABLE cassandra.patient (
    id uuid PRIMARY KEY,
    birthdate timestamp,
    name text
) WITH bloom_filter_fp_chance = 0.01
    AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
    AND comment = ''
    AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
    AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
    AND crc_check_chance = 1.0
    AND dclocal_read_repair_chance = 0.1
    AND default_time_to_live = 0
    AND gc_grace_seconds = 864000
    AND max_index_interval = 2048
    AND memtable_flush_period_in_ms = 0
    AND min_index_interval = 128
    AND read_repair_chance = 0.0
    AND speculative_retry = '99PERCENTILE';

So it seems Spring Data Cassandra treats each field as a separate column.
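
The field-to-column behaviour seen in the schema above can be mimicked with a few lines of reflection. This is a rough sketch and not how Spring Data Cassandra is actually implemented: `SamplePatient` and `ColumnPerFieldSketch` are hypothetical names, and the type mapping is an assumption limited to the three types in the sample model.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.UUID;

/** Hypothetical mirror of the three-field sample model. */
final class SamplePatient {
    UUID id;
    String name;
    Date birthDate;
}

/** Illustration only: derives one CQL column definition per declared field. */
final class ColumnPerFieldSketch {

    // Assumed type mapping, limited to the types used in the sample model.
    private static final Map<Class<?>, String> CQL_TYPES = Map.of(
            UUID.class, "uuid",
            String.class, "text",
            Date.class, "timestamp");

    static List<String> columnsFor(Class<?> type) {
        List<String> columns = new ArrayList<>();
        for (Field field : type.getDeclaredFields()) {
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue; // only instance fields become columns
            }
            String cqlType = CQL_TYPES.get(field.getType());
            if (cqlType == null) {
                // Comparable in spirit to the "only primitive types" mapping error.
                throw new IllegalArgumentException("No column type for field " + field.getName());
            }
            // Unquoted CQL identifiers are case-insensitive, hence "birthdate".
            columns.add(field.getName().toLowerCase(Locale.ROOT) + " " + cqlType);
        }
        return columns;
    }
}
```

Calling `ColumnPerFieldSketch.columnsFor(SamplePatient.class)` yields one entry per field, which lines up with the `id`, `name` and `birthdate` columns in the generated table.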


(Prashadi) #50

Thanks for the information, @yashdsaraf. @sunbiz previously thought of storing the entire resource in a single column. For my part, I just need to know the format so I can load the data from Cassandra appropriately.


(Saptarshi Purkayastha) #51

No, I didn’t mean using the XML configuration. I don’t see any repo for this work that I could use to point out how to solve this.


(Yash D. Saraf) #52

@sunbiz I was trying Cassandra out on a sample Spring Boot app, here’s the link. When you run it, it’ll throw an error showing Unconfigured table Patient. I was able to solve this by annotating a subclass of HAPI’s Patient model with @Table (I haven’t pushed that yet). But it still throws a missing primary key error, which requires either the @PrimaryKey or the @Id annotation to be applied on a field (I tried applying them on an overridden getter/setter, but that didn’t seem to work).

PS: That repo also has a custom converter for BooleanType to Boolean and vice versa.


(Yash D. Saraf) #53

@sunbiz @lehone It looks like the primary key annotation problem can be solved by declaring a new getter that returns the result of the super getter for the id and annotating it, like so:

@Table
@AccessType(AccessType.Type.PROPERTY)
public class CustomPatient extends Patient {

    @PrimaryKey
    public IdType getPrimaryId() {
        return super.getIdElement();
    }
}

I’m still getting some data type mapping errors, but the primary key not found error is no longer thrown.

Update: I’ll push this as soon as I get the app to build


(Yash D. Saraf) #54

@sunbiz The solution I mentioned above is giving me null pointer errors; perhaps your way will work. Could you please point out your solution? Here’s the repo link again if you missed it in my previous post.


(Yash D. Saraf) #55

@sunbiz For filtering results based on parameters passed in queries, I have looked at Reactive Cassandra Template and QueryDsl.
Of these:

  • QueryDsl is not supported in reactive apps
  • Reactive Cassandra Template will make the app database dependent (which I believe goes against the database independence goal).

LibreHealth Toolkit originally uses Hibernate Criteria for filtering/search purposes, which again cannot be used in reactive apps. Any thoughts on this?


(Prashadi) #56

@yashdsaraf @sunbiz Initially we discussed storing the entire content of the FHIR resource. Since FHIR resources contain both simple and hierarchical attributes, we might need to check how we can represent them in Cassandra based on key-value pairs.


(Saptarshi Purkayastha) #57

I suggest storing each resource JSON in its own datastore. References should be references only, with the referenced resource in its own datastore. I don’t think we should call these tables because it’s no longer a relational database. @prashadi does that answer your question?
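
A minimal in-memory sketch of that layout, assuming one datastore per resource type and relative references of the form `Patient/p1`. The class names `StoredResource` and `ResourceStores` are hypothetical, and plain maps stand in for Cassandra so the snippet runs on its own.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** One stored FHIR resource: its logical id plus the whole resource as JSON text. */
final class StoredResource {
    final String id;
    final String json;

    StoredResource(String id, String json) {
        this.id = id;
        this.json = json;
    }
}

/** Hypothetical in-memory stand-in: one datastore per resource type. */
final class ResourceStores {
    private final Map<String, Map<String, StoredResource>> stores = new HashMap<>();

    void save(String resourceType, StoredResource resource) {
        stores.computeIfAbsent(resourceType, t -> new HashMap<>()).put(resource.id, resource);
    }

    /** Resolves a reference such as "Patient/p1" against the datastore of that type. */
    Optional<StoredResource> resolve(String reference) {
        String[] parts = reference.split("/", 2);
        if (parts.length != 2) {
            return Optional.empty(); // not in "Type/id" form
        }
        Map<String, StoredResource> store = stores.get(parts[0]);
        return store == null ? Optional.empty() : Optional.ofNullable(store.get(parts[1]));
    }
}
```

An Observation stored this way would keep only `"subject": {"reference": "Patient/p1"}` in its JSON, and the Patient itself would be looked up separately through `resolve("Patient/p1")`.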

@yashdsaraf we don’t want to use criteria or filtering. It should be MapReduce or CQL. This is a useful example that I’ve based a project on recently - https://github.com/spring-projects/spring-data-cassandra/blob/master/src/main/asciidoc/reference/reactive-cassandra.adoc


(Yash D. Saraf) #58

@sunbiz ReactiveCassandraTemplate is the preferred way to use CQL in Spring Data Cassandra. My question is: wouldn’t using CQL make the project Cassandra dependent?
One of the goals is to “Document examples that show how spring data can generate a non relational and relational database”.
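
One possible way to keep such a dependence contained is to put the CQL behind a small interface, sketched below. Everything here is an assumption for illustration: the names `PatientSearch` and `CqlPatientSearch`, the `patient` table with `json` and `name` columns, and the `BiFunction` that stands in for a driver or template call.

```java
import java.util.List;
import java.util.function.BiFunction;

/** Store-neutral search contract; callers never see CQL (hypothetical name). */
interface PatientSearch {
    List<String> findByName(String name);
}

/** Cassandra-specific implementation: the only place that knows CQL. */
final class CqlPatientSearch implements PatientSearch {

    // Assumes "name" is part of the primary key or indexed; otherwise
    // Cassandra would reject the query unless ALLOW FILTERING is added.
    static final String FIND_BY_NAME = "SELECT json FROM patient WHERE name = ?";

    // Stand-in for a driver or template call: (statement, bound value) -> rows.
    private final BiFunction<String, Object, List<String>> executor;

    CqlPatientSearch(BiFunction<String, Object, List<String>> executor) {
        this.executor = executor;
    }

    @Override
    public List<String> findByName(String name) {
        return executor.apply(FIND_BY_NAME, name);
    }
}
```

With this shape, swapping the datastore would mean writing another `PatientSearch` implementation rather than touching the callers.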


(Saptarshi Purkayastha) #59

That’s a fine dependence for the GSoC project.