Getting Started with Instance Segmentation using IceVision
Introduction
This tutorial walks you through the different steps of training a model on the Penn-Fudan dataset. IceVision is an agnostic framework: as an illustration, we will train our model using both the fastai and PyTorch Lightning libraries.
For more information about the Penn-Fudan dataset and its corresponding parser, check out the pennfudan folder in icedata.
Installing IceVision and IceData
If you are on Colab, run the following cell; otherwise, check the installation instructions.
# IceVision - IceData - MMDetection - YOLO v5 Installation
!wget https://raw.githubusercontent.com/airctic/icevision/master/install_colab.sh
!chmod +x install_colab.sh && ./install_colab.sh
Imports
from icevision.all import *
Model
To create a model, we need to:
- Choose one of the models supported by IceVision
- Choose one of the backbones corresponding to a chosen model
- Determine the number of object classes: this is done after parsing the dataset. Check out the Parsing Section. A short preview of these steps is sketched after this list.
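As a quick preview, the sketch below strings these three steps together for the TorchVision Mask R-CNN used in this tutorial; it only reuses calls that appear in the cells below.
# 1. Choose a model type supported by IceVision (TorchVision Mask R-CNN here)
model_type = models.torchvision.mask_rcnn
# 2. Choose a backbone that belongs to that model type
backbone = model_type.backbones.resnet34_fpn()
# 3. The number of classes is only known once the dataset is parsed,
#    so the model itself is created later (see the Model section below)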
Choose a model and backbone
TorchVision
model_type = models.torchvision.mask_rcnn
backbone = model_type.backbones.resnet34_fpn()
Datasets: Penn-Fudan
The Penn-Fudan dataset is a small pedestrian detection and segmentation dataset containing 170 images with 345 labeled instances of a single class: person.
IceVision provides very handy methods for loading a dataset, parsing annotations, and more.
# Loading Data
data_dir = icedata.pennfudan.load_data()
train_ds, valid_ds = icedata.pennfudan.dataset(data_dir)
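As a quick sanity check, we can look at the split sizes and peek at a single record; this is a minimal sketch, and the exact counts depend on the train/validation split used by icedata.
# Sanity check: split sizes and one record (image, labels, bboxes, masks)
print(f"train: {len(train_ds)} samples, valid: {len(valid_ds)} samples")
print(train_ds[0])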
Displaying the same image with different transforms
Note:
Transforms are applied lazily, meaning they are only applied when we grab (get) an item. This means that, if you have augmentation (random) transforms, each time you get the same item from the dataset you will get a slightly different version of it.
samples = [train_ds[0] for _ in range(3)]
show_samples(samples, ncols=3)
DataLoader
# DataLoaders
train_dl = model_type.train_dl(train_ds, batch_size=8, num_workers=4, shuffle=True)
valid_dl = model_type.valid_dl(valid_ds, batch_size=8, num_workers=4, shuffle=False)
# show batch
model_type.show_batch(first(valid_dl), ncols=4)
Model
Now that we have determined the number of classes (num_classes), we can create our model object.
model = model_type.model(backbone=backbone, num_classes=icedata.pennfudan.NUM_CLASSES)
Metrics
metrics = [COCOMetric(metric_type=COCOMetricType.mask)]
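COCOMetric can also report box-level mAP; the sketch below tracks both mask and bbox metrics, assuming the bbox metric type is available in your IceVision version.
# Track both mask and box mAP (assumption: COCOMetricType.bbox is available)
metrics = [
    COCOMetric(metric_type=COCOMetricType.mask),
    COCOMetric(metric_type=COCOMetricType.bbox),
]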
Training
IceVision is an agnostic framework, meaning it can be plugged into other DL frameworks such as fastai and PyTorch Lightning.
You could also plug it into other DL frameworks using your own custom code.
Training using fastai
learn = model_type.fastai.learner(dls=[train_dl, valid_dl], model=model, metrics=metrics)
learn.lr_find()
SuggestedLRs(lr_min=0.00010000000474974513, lr_steep=1.737800812406931e-05)
learn.fine_tune(20, 1e-4, freeze_epochs=1)
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 2.361193 | 1.672817 | 0.000040 | 00:16 |
epoch | train_loss | valid_loss | COCOMetric | time |
---|---|---|---|---|
0 | 1.536968 | 1.494403 | 0.000147 | 00:19 |
1 | 1.516644 | 1.430333 | 0.000538 | 00:23 |
2 | 1.496924 | 1.343002 | 0.001926 | 00:22 |
3 | 1.435968 | 1.183041 | 0.002488 | 00:21 |
4 | 1.343802 | 0.962206 | 0.000928 | 00:20 |
5 | 1.199133 | 0.734787 | 0.004951 | 00:20 |
6 | 1.071547 | 0.682442 | 0.005912 | 00:20 |
7 | 0.973078 | 0.692099 | 0.006344 | 00:20 |
8 | 0.892403 | 0.649406 | 0.006396 | 00:21 |
9 | 0.837712 | 0.690478 | 0.007804 | 00:22 |
10 | 0.790692 | 0.647615 | 0.007775 | 00:23 |
11 | 0.758785 | 0.626387 | 0.002885 | 00:20 |
12 | 0.727605 | 0.609429 | 0.010509 | 00:20 |
13 | 0.706678 | 0.611301 | 0.012746 | 00:20 |
14 | 0.684067 | 0.616399 | 0.008784 | 00:20 |
15 | 0.668815 | 0.621261 | 0.011861 | 00:19 |
16 | 0.652843 | 0.615020 | 0.008686 | 00:19 |
17 | 0.649229 | 0.612351 | 0.008146 | 00:20 |
18 | 0.641258 | 0.609879 | 0.007551 | 00:22 |
19 | 0.640558 | 0.608780 | 0.007924 | 00:19 |
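Once training is done you will usually want to persist the weights. Below is a minimal sketch using plain PyTorch; the file name is hypothetical.
# Save the trained weights (hypothetical path)
torch.save(model.state_dict(), "mask_rcnn_pennfudan.pth")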
Training using Lightning
class LightModel(model_type.lightning.ModelAdapter):
def configure_optimizers(self):
return SGD(self.parameters(), lr=1e-4)
light_model = LightModel(model, metrics=metrics)
trainer = pl.Trainer(max_epochs=20, gpus=1)
trainer.fit(light_model, train_dl, valid_dl)
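With Lightning you can also persist a checkpoint directly from the trainer; the file name below is hypothetical.
# Save a Lightning checkpoint (hypothetical path)
trainer.save_checkpoint("mask_rcnn_pennfudan.ckpt")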
Show Results
model_type.show_results(model, valid_ds, detection_threshold=.5)
Inference
Predicting a batch of images
Instead of predicting a whole list of images at once, we can process small batches at a time: this option is more memory efficient.
NOTE: For a more detailed look at inference, check out the inference tutorial.
infer_dl = model_type.infer_dl(valid_ds, batch_size=4, shuffle=False)
preds = model_type.predict_from_dl(model, infer_dl, keep_images=True)
show_preds(preds=preds[:4], ncols=3)
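For small datasets you can also predict straight from the dataset without building an inference DataLoader; this sketch assumes the predict helper is available in your IceVision version.
# One-shot prediction over a (small) dataset
# (assumption: model_type.predict is available in your IceVision version)
preds = model_type.predict(model, valid_ds, keep_images=True)
show_preds(preds=preds[:4], ncols=3)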
Happy Learning!
If you need any assistance, feel free to join our forum.