Project: Neural network-based object detection of anatomical structures and medical artifacts in Virtual Reality

This is the thread for Project: Neural network-based object detection of anatomical structures and medical artifacts in Virtual Reality

@sshivaditya this is where you will communicate – you may also use our chat. I am considering this Jitsi meet room public if you need video conferencing.

Thank you for the opportunity to work on this project. @judywawira @pri2si17 I have a few questions regarding the implementation of the model.

  1. Since the Kvasir dataset provides images with class labels only, is a classification model enough?

  2. Live inference requires a lot of computational power. Is it fine if the model instead runs on a recorded GI endoscopy video and the class labels are displayed in a corner of the frame?

Hi @sshivaditya, please fork the repository below and send us your code as an MR.

LibreHealth / LibreHealth Radiology / LH Radiology NN VR Detection · GitLab

Hi @sshivaditya, please post the docs here, as discussed last Saturday.

Observations:

Game Engine

Both Unity and Unreal Engine support VR development, and VR applications for Oculus, SteamVR and Google Cardboard can be built with either. Unreal, however, provides Python support, so external libraries such as TensorFlow can be made to work in Unreal Engine. Unity does not support TensorFlow natively, since TensorFlow C# has been phased out. Libraries like Barracuda have to be used instead, and these cannot be well optimised.

Platform to Develop for

Developing for Oculus and SteamVR would be better than Google Cardboard for two reasons. First, Google Cardboard API support is being phased out in both Unity and Unreal, and Daydream is no longer supported. Second, models get compressed during the Android build process, and inference doesn’t work properly in Unity.


Model Development

Dataset

The dataset has labelled images, unlabelled images, labelled videos and segmented images. For classification I am going to use the labelled images, which cover 22 classes.

Model

I am training two models: a custom model and a pre-trained one adapted to the Kvasir dataset. Both take a 224x224 input. For logging I have used TensorBoard.

Results

Both models were trained. The pre-trained model shows a lower generalisation error than the custom model.

CustomModel: loss: 2.4398 - accuracy: 0.7283 - f1_m: 1.1505 - precision_m: 142484.9571 - recall_m: 1.1362 - val_loss: 3.0672 - val_accuracy: 0.0990 - val_f1_m: 0.9958 - val_precision_m: 0.9958 - val_recall_m: 0.9958

PreTrained Model: loss: 0.9046 - acc: 0.7174 - f1_m: 0.2450 - precision_m: 301254.4277 - recall_m: 0.2275 - val_loss: 1.0795 - val_acc: 0.6811 - val_f1_m: 0.2366 - val_precision_m: 248592.8750 - val_recall_m: 0.2270
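Note that precision, recall and F1 are ratios bounded by 1, so the `precision_m` values in the logs above (and any `recall_m` or `f1_m` above 1) point to a problem in the custom metric functions rather than to real model quality. For cross-checking, here is a minimal sketch of the standard per-class definitions in plain Python. The helper name `per_class_prf` and the toy labels are hypothetical and are not taken from the project code.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for two equal-length label lists."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1      # correct prediction for class t
        else:
            fp[p] += 1      # class p was predicted but is wrong
            fn[t] += 1      # class t was missed

    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]   # how often the label was predicted
        actual = tp[label] + fn[label]      # how often the label really occurs
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


# Toy example with three classes (made-up labels, for illustration only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
scores = per_class_prf(y_true, y_pred)
for label, (p, r, f) in scores.items():
    print(f"class {label}: precision={p:.2f} recall={r:.2f} f1={f:.2f}")
```

Computed this way over the whole evaluation set, every value stays between 0 and 1, which makes it a quick sanity check against the numbers reported during training.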

Virtual Reality

For the virtual reality portion of the project I am going to use WebXR. This does not require installing an application on the phone, and it supports various platforms such as Oculus and Android.

References

MobileNetV2

WebXR

CustomModel

PreTrained

Project Diagram:

@sshivaditya Can you please put the F1 Scores for each class?

           precision    recall  f1-score   support

       0       0.50      0.10      0.17        10
       1       0.00      0.00      0.00        13
       2       0.98      0.52      0.68       126
       3       0.76      0.99      0.86       222
       4       0.87      0.96      0.91       206
       5       0.73      0.24      0.37       201
       6       0.55      0.91      0.68       215
       7       0.29      0.35      0.32        77
       8       0.41      0.22      0.29        55
       9       0.00      0.00      0.00         1
      10       0.89      0.95      0.92       186
      11       0.89      0.53      0.67        30
      12       0.75      0.87      0.80       202
      13       0.91      0.54      0.68        89
      14       0.92      0.81      0.86       149
      15       0.00      0.00      0.00         6
      16       0.29      0.13      0.18        46
      17       0.00      0.00      0.00         4
      18       0.00      0.00      0.00        90
      19       0.02      0.25      0.03         4
      20       0.23      0.88      0.36        25
      21       0.62      0.68      0.65       175

Net accuracy obtained: 68%. Each number represents a class. These scores were obtained on the validation set. @pri2si17

This week’s blog post link

@pri2si17 I have sent an MR.

@sshivaditya Can you please post the test dataset metrics, as that is what matters? Also, is this for the baseline model or the pre-trained MobileNet?

These metrics are from the test dataset, on which the model was not trained. They are for the pre-trained model.

Well, you wrote "validation set". The model is not good then. :expressionless: You need to improve it.

@pri2si17 I trained a new model. It is based on a BiT pre-trained model.

loss: 0.6331 - acc: 0.8733 - f1_m: 0.8899 - precision_m: 195757.4552 - recall_m: 0.9773 - val_loss: 1.2562 - val_acc: 0.7955 - val_f1_m: 0.9079 - val_precision_m: 126641.6953 - val_recall_m: 1.0619

          precision    recall  f1-score   support

       0       0.10      0.80      0.17        10
       1       0.00      0.00      0.00        13
       2       0.98      0.96      0.97       126
       3       0.97      0.97      0.97       222
       4       0.98      0.98      0.98       206
       5       0.71      0.95      0.81       201
       6       0.92      0.71      0.80       215
       7       0.37      0.40      0.39        77
       8       0.65      0.44      0.52        55
       9       0.00      0.00      0.00         1
      10       0.78      0.95      0.86       186
      11       0.81      0.83      0.82        30
      12       0.98      0.99      0.99       202
      13       1.00      0.40      0.58        89
      14       0.95      0.99      0.97       149
      15       0.00      0.00      0.00         6
      16       0.40      0.43      0.42        46
      17       0.00      0.00      0.00         4
      18       0.64      0.30      0.41        90
      19       0.00      0.00      0.00         4
      20       0.33      0.92      0.48        25
      21       0.82      0.56      0.66       175
    accuracy                           0.80      2132
   macro avg       0.56      0.57      0.54      2132
weighted avg       0.83      0.80      0.80      2132

These scores were obtained on the test set. The test set was kept separate, and the model was not trained on it. Accuracy obtained: 80%.
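The averaged rows at the bottom of the report follow directly from the per-class rows, which also shows why the macro F1 (0.54) sits so far below the accuracy (0.80): the macro average counts every class equally, so the small classes with an F1 of 0.00 pull it down, while the weighted average is dominated by the large classes. A short sketch in plain Python, using the (f1-score, support) pairs copied from the report above. The variable names are only illustrative.

```python
# (f1-score, support) for classes 0..21, copied from the report above.
rows = [
    (0.17, 10), (0.00, 13), (0.97, 126), (0.97, 222), (0.98, 206),
    (0.81, 201), (0.80, 215), (0.39, 77), (0.52, 55), (0.00, 1),
    (0.86, 186), (0.82, 30), (0.99, 202), (0.58, 89), (0.97, 149),
    (0.00, 6), (0.42, 46), (0.00, 4), (0.41, 90), (0.00, 4),
    (0.48, 25), (0.66, 175),
]

total_support = sum(support for _, support in rows)

# Macro average: plain mean over classes, ignoring how many samples each has.
macro_f1 = sum(f1 for f1, _ in rows) / len(rows)

# Weighted average: each class contributes in proportion to its support.
weighted_f1 = sum(f1 * support for f1, support in rows) / total_support

print(f"samples:     {total_support}")
print(f"macro F1:    {macro_f1:.2f}")
print(f"weighted F1: {weighted_f1:.2f}")
```

Since the inputs are already rounded to two decimals, the results match the report only to that precision.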

BiT Model

Hi @sshivaditya, the F1 score doesn’t seem to be good. We can’t rely on accuracy alone, so you need to look into it. Also, can you try ResNeXt (ResNeXt | Papers With Code)?

Also, are there any updates on the development of the VR prototype and on segmentation/detection for the polyps class? Even with the current model it would be helpful for the prototype, although the performance won’t be good.

@pri2si17 Sure, I will try ResNeXt.

Implementing the TensorFlow.js pipeline took some time, as some of the layers used are not supported in TensorFlow.js. For the VR portion, the basic WebXR setup is ready.

This week’s blog link

I agree with this.

Even though I am not mentoring this project, my suggestion would be to get the end-to-end pipeline working first, i.e. get the Unreal Engine-based VR working with any model, even if it doesn’t have good accuracy/F1 scores. WebXR support is still flaky and slow on most devices, so I’d like to see the program work on Oculus or another VR headset.

I echo both @sunbiz and @pri2si17, get something basic working. Doesn’t even have to be perfect.

We don’t expect production-ready code from GSoC projects since we know you are still learning. We just need you to have something that works and does the bare minimum, the Minimum viable product (MVP).

@sshivaditya Please follow what @sunbiz suggested. We need a POC ready with your current model. The model can be improved later.
