Many medical procedures in surgery and interventional radiology are recorded as videos used for review, training, and quality monitoring. These videos contain at least three artifacts of interest: (1) anatomical structures such as organs, tumors, and tissues; (2) medical equipment; and (3) medical information about the patient overlaid on or described in the video. Automatically identifying and labeling these artifacts would be immensely helpful for review and search.
In parallel, there is a need to scale the apprenticeship model of learning in a procedure room. Virtual reality and live video streams have gained traction in recent years as ways to provide an immersive experience of participating in such procedures. This project will therefore use deep learning to scale the apprenticeship model of training future providers, by detecting and then automatically labeling the artifacts of interest.
The following will mean successful completion of the project:
- Train an object detection model on the Kvasir dataset (a minimal training sketch follows this list)
- Convert the Kvasir video into an immersive experience on a VR headset such as Google Cardboard (or another mobile VR platform) or Oculus (see the repackaging sketch below)
- Run inference with the object detection model from step 1 inside the VR experience (one possible integration path is sketched after this list)
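For step 1, the sketch below shows one possible starting point: fine-tuning a COCO-pretrained Faster R-CNN from torchvision on Kvasir-style bounding boxes. Dataset loading is left out because the exact annotation format (e.g. Kvasir-SEG masks converted to boxes) would need to be confirmed against the data; `NUM_CLASSES` and the shape of the training loop are likewise illustrative, not prescribed by the project.

```python
# Minimal sketch: fine-tune a COCO-pretrained Faster R-CNN on Kvasir-style
# bounding-box annotations. The dataset/loader is assumed to yield
# (image_tensor, target_dict) pairs in the torchvision detection format.
import torch
import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor

NUM_CLASSES = 2  # background + one artifact class (e.g. polyp); adjust as needed

def build_model(num_classes: int = NUM_CLASSES):
    # Start from COCO-pretrained weights and swap in a new box predictor
    # sized for the Kvasir classes.
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

def train_one_epoch(model, loader, optimizer, device):
    # Note: detection DataLoaders need collate_fn=lambda b: tuple(zip(*b))
    # because images in a batch may have different sizes.
    model.train()
    for images, targets in loader:
        images = [img.to(device) for img in images]
        targets = [{k: v.to(device) for k, v in t.items()} for t in targets]
        loss_dict = model(images, targets)  # dict of component losses in train mode
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```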
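For step 2, a full Unity/C# player is beyond a sketch, but a crude first cut for a Cardboard-style viewer is to repackage the video as side-by-side frames (the same image for each eye, i.e. monoscopic "stereo"). The sketch below does this with OpenCV; the paths and codec are placeholders, and a proper Unity scene would replace this in the final project.

```python
# Minimal sketch: repackage an endoscopy video as side-by-side frames so it
# can be viewed in a simple mobile VR viewer such as Google Cardboard.
# Duplicating the frame gives a monoscopic experience, not true stereo depth.
import cv2

def to_side_by_side(src_path: str, dst_path: str):
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w * 2, h))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        out.write(cv2.hconcat([frame, frame]))  # same frame for left and right eye
    cap.release()
    out.release()
```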
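For step 3, the simplest integration path is to run the detector offline over every frame and burn the predicted boxes into the video before the repackaging step above; running the model live inside Unity (e.g. via an ONNX export consumed from C#) is the alternative and is not shown here. The score threshold and box styling below are arbitrary choices.

```python
# Minimal sketch of the offline path: annotate each frame with the trained
# detector's boxes, then feed the result into the VR repackaging step.
import cv2
import torch

@torch.no_grad()
def annotate_video(model, src_path, dst_path, device, score_thresh=0.5):
    model.eval().to(device)
    cap = cv2.VideoCapture(src_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    w = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
    h = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
    out = cv2.VideoWriter(dst_path, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        # BGR uint8 -> normalized RGB float tensor, as torchvision models expect
        tensor = torch.from_numpy(frame[:, :, ::-1].copy()).permute(2, 0, 1).float() / 255.0
        pred = model([tensor.to(device)])[0]  # eval mode returns boxes/labels/scores
        for box, score in zip(pred["boxes"], pred["scores"]):
            if score >= score_thresh:
                x1, y1, x2, y2 = map(int, box.tolist())
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
        out.write(frame)
    cap.release()
    out.release()
```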
Mentors: @judywawira @pri2si17
Skills required: Python or ML.Net, and C# programming for the Unity SDK