GSoC - Coding Period Week 1
This blog post showcases what I did last week and what I plan to do next.
Work Done This Week (June 7th to June 13th)
1. JPEG or PNG?
-
Last week I was able to convert the
.tif
files from the cell-tracking-challenge dataset into NumPy arrays. The next goal was to save these NumPy arrays as image files. -
Here’s where I made a mistake, I saved the images as
.jpeg
files. -
What I didn’t know is that JPEG is a lossy compression method, it changes pixel level values as long as it “looks good” to the human observer.
-
I used plotly to zoom into the saved
.jpeg
files, and noticed that the edges of the segmentation masks were noisy, then I plotted the images in 3D, with z axis as pixel value.
-
Turns out, JPEG is a lossy compression method - it is designed to save images using less bits even if it changes the pixel values as long as it “looks good” to a human observer.
-
I decided to save the images in
.png
format; PNG is a lossless compression method. -
Each
.tif
file contains 3D data (as slices) from a single instant of time, I’ve named the corresponding.png
files accordingly.features/F{time-point}_{slice}.png
segmentation_maps/L{time-point}_{slice}.png
-
Link to code:
2. Defining The Dataset and Dataloader in PyTorch:
- I did not want to use
torchvision.datasets.ImageFolder()
for the dataset class. - The filenames (both features and labels) contain the time-point and slice number, which could be used to identify unique feature-label pairs.
- I created a Pandas DataFrame that contains these unique id’s and saved it as a csv file, using which the custom dataloader loads valid feature-label pairs.
3. Experimenting with Gradio:
- Gradio has been all over the ML community lately, I decided to give it a try.
- The Gif below showcases a simple GUI (prototype) for the existing DevoLearn embryo segmentation model.
- This still needs a lot more work. I’ll try to resume work on this after I’m done with the proposed work for next week.
Planned:
- Upgrade the existing embryo segmentation model.
- Spend more time on the GUI for the existing segmentation model.
- Look into ways of hosting PyTorch + Gradio based apps online.