How to use Mask RCNN

Installing IceVision

We usually install IceVision with [all], but [inference] can be used instead to install only the packages that the inference methods depend on.

!pip install icevision[all] icedata

Imports

from icevision.all import *

Data

We'll be using the Penn-Fudan dataset, which is already available under datasets.

data_dir = icedata.pennfudan.load_data()
class_map = icedata.pennfudan.class_map()

As usual, let's create the parser and perform a random data split.

parser = icedata.pennfudan.parser(data_dir)

train_records, valid_records = parser.parse()
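
To make the random split concrete, here is a minimal stdlib-only sketch of the idea. The helper name random_split, the 80/20 ratio, and the seed are assumptions for illustration, not IceVision's implementation; the 170 total is the number of images in Penn-Fudan.

```python
import random

def random_split(ids, probs=(0.8, 0.2), seed=42):
    """Shuffle ids and cut them into consecutive groups sized by probs (hypothetical helper)."""
    ids = list(ids)
    random.Random(seed).shuffle(ids)
    splits, start = [], 0
    for i, p in enumerate(probs):
        # The last group takes whatever remains so that no id is dropped.
        end = len(ids) if i == len(probs) - 1 else start + round(p * len(ids))
        splits.append(ids[start:end])
        start = end
    return splits

train_ids, valid_ids = random_split(range(170))
print(len(train_ids), len(valid_ids))
```

Because the shuffle is seeded, the same ids land in the same group on every run, which keeps validation numbers comparable between experiments.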

Let's use the usual aug_tfms for training transforms with two small modifications:
- Decrease the rotation limit from 45 to 10.
- Use a more aggressive crop function.

shift_scale_rotate = tfms.A.ShiftScaleRotate(rotate_limit=10)
crop_fn = partial(tfms.A.RandomSizedCrop, min_max_height=(384//2, 384), p=.5)
train_tfms = tfms.A.Adapter(
    [
        *tfms.A.aug_tfms(size=384, presize=512, shift_scale_rotate=shift_scale_rotate, crop_fn=crop_fn),
        tfms.A.Normalize(),
    ]
)
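
The effect of min_max_height=(384//2, 384) and p=.5 can be pictured with a small simulation. This is only a sketch of the sampling idea under the assumption that the crop height is drawn uniformly from the given range and that the crop is applied with probability p; sample_crop_height is a hypothetical helper, not an Albumentations function.

```python
import random

def sample_crop_height(min_max_height=(384 // 2, 384), p=0.5, rng=random):
    """Return a sampled crop height, or None when the crop is skipped (hypothetical helper)."""
    if rng.random() >= p:
        return None  # the transform is not applied to this sample
    low, high = min_max_height
    return rng.randint(low, high)  # inclusive on both ends

rng = random.Random(0)
heights = [sample_crop_height(rng=rng) for _ in range(1000)]
applied = [h for h in heights if h is not None]
print(min(applied), max(applied), len(applied) / len(heights))
```

Roughly half of the samples are cropped, and a cropped region can be as small as half of the final 384-pixel height before it is resized, which is what makes this crop more aggressive.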

And for validation transforms, the simple resize_and_pad.

valid_tfms = tfms.A.Adapter([*tfms.A.resize_and_pad(size=384), tfms.A.Normalize()])
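
As a rough illustration of what a resize-and-pad step does to an image's shape, the sketch below scales the longest side to the target size and reports how much padding is needed to reach a square. It assumes that "longest side first, then pad" behaviour; resize_and_pad_shape is a hypothetical helper and the 800x400 input is made up.

```python
def resize_and_pad_shape(width, height, size):
    """Scale the longest side to `size`, then report the padding needed for a square."""
    scale = size / max(width, height)
    new_w, new_h = round(width * scale), round(height * scale)
    # Total padding per axis; how it is distributed between the sides is not modelled here.
    pad_w, pad_h = size - new_w, size - new_h
    return (new_w, new_h), (pad_w, pad_h)

resized, padding = resize_and_pad_shape(800, 400, size=384)
print(resized, padding)
```

Unlike the training crop, this keeps the whole image and its aspect ratio, so validation results are not affected by random cropping.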

Now we can create the Dataset and take a look at how the images look after the transforms.

train_ds = Dataset(train_records, train_tfms)
valid_ds = Dataset(valid_records, valid_tfms)
samples = [train_ds[1] for _ in range(6)]
show_samples(samples, denormalize_fn=denormalize_imagenet, ncols=3, display_label=False, show=True)

Now we're ready to create the DataLoaders:

train_dl = mask_rcnn.train_dl(train_ds, batch_size=16, shuffle=True, num_workers=4)
valid_dl = mask_rcnn.valid_dl(valid_ds, batch_size=16, shuffle=False, num_workers=4)
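
To get a feel for what batch_size=16 means per epoch, the following sketch counts the batches a loader would produce. The record counts 136 and 34 assume an 80/20 split of 170 images, and batch_sizes is a hypothetical helper; whether the last partial batch is kept depends on the loader's settings.

```python
def batch_sizes(num_items, batch_size, drop_last=False):
    """List the size of each batch produced in one pass over num_items."""
    full, remainder = divmod(num_items, batch_size)
    sizes = [batch_size] * full
    if remainder and not drop_last:
        sizes.append(remainder)  # final, smaller batch
    return sizes

train_batches = batch_sizes(136, 16)  # assumed number of training records
valid_batches = batch_sizes(34, 16)   # assumed number of validation records
print(len(train_batches), len(valid_batches))
```

If GPU memory is tight, lowering batch_size increases the number of batches per epoch proportionally.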

Metrics

Metrics are a work in progress for Mask RCNN, so the metric below is left commented out.

# metrics = [COCOMetric(COCOMetricType.mask)]
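
For context, COCO-style mask metrics are built on the intersection over union (IoU) between predicted and ground-truth masks, averaged over IoU thresholds from 0.50 to 0.95. The sketch below computes mask IoU on tiny hand-made masks; it is a plain-Python illustration, not the code COCOMetric runs.

```python
def mask_iou(pred, target):
    """IoU of two same-shaped binary masks given as nested lists of 0/1."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union if union else 0.0

# Made-up 3x3 masks that overlap in one column.
pred = [[1, 1, 0], [1, 1, 0], [0, 0, 0]]
target = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]
iou = mask_iou(pred, target)
print(round(iou, 3))
```

A prediction only counts as a match at a given threshold when its IoU with a ground-truth mask reaches that threshold, so this example would fail even the loosest 0.50 cut-off.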

Model

Similarly to faster_rcnn, we just need the num_classes to create a Mask RCNN model.

model = mask_rcnn.model(num_classes=len(class_map))
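
Note that the class count passed to the model includes the background class. The sketch below uses a hypothetical stand-in for the class map, assuming index 0 is reserved for background and "person" is the only object class in Penn-Fudan.

```python
# Hypothetical stand-in for a class map: index 0 is assumed to be background.
classes = ["background", "person"]
class_to_id = {name: idx for idx, name in enumerate(classes)}

# The detection head needs one output per entry, background included.
num_classes = len(classes)
print(class_to_id, num_classes)
```

Under that assumption, a single-object dataset still yields num_classes=2.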

Training - fastai

We just need to create the learner and fine-tune.

Optional

You can use learn.lr_find() for finding a good learning rate.

learn = mask_rcnn.fastai.learner(dls=[train_dl, valid_dl], model=model)
learn.fine_tune(10, 5e-4, freeze_epochs=2)
epoch train_loss valid_loss time
0 1.609195 0.804607 00:19
1 1.115979 0.536064 00:15
epoch train_loss valid_loss time
0 0.777911 0.435238 00:22
1 0.618750 0.363259 00:19
2 0.555032 0.341701 00:18
3 0.507919 0.335720 00:19
4 0.486197 0.330338 00:19
5 0.450105 0.278701 00:19
6 0.426801 0.280237 00:18
7 0.417322 0.296473 00:16
8 0.411688 0.289382 00:17
9 0.401172 0.283575 00:17
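
The two tables above correspond to the two phases of fine_tune: a short run with the pretrained layers frozen, then a longer run with everything unfrozen. The sketch below lays out those phases as fastai documents them (halved base learning rate after unfreezing, default lr_mult of 100). Treat it as an approximation; fine_tune_phases is a hypothetical helper, not part of fastai.

```python
def fine_tune_phases(epochs, base_lr, freeze_epochs=1, lr_mult=100):
    """Describe the frozen and unfrozen phases of a fine-tuning schedule."""
    frozen = {"epochs": freeze_epochs, "frozen": True, "max_lr": base_lr}
    # After unfreezing, the base rate is halved and spread across layer groups.
    base_lr = base_lr / 2
    unfrozen = {
        "epochs": epochs,
        "frozen": False,
        "lr_range": (base_lr / lr_mult, base_lr),
    }
    return [frozen, unfrozen]

phases = fine_tune_phases(10, 5e-4, freeze_epochs=2)
for phase in phases:
    print(phase)
```

This matches the output shape: 2 rows for the frozen phase followed by 10 rows for the unfrozen phase.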

Visualize predictions

Let's grab some images from valid_ds to visualize. For more info on how to do inference, check the inference tutorial.

mask_rcnn.show_results(model, valid_ds, class_map=class_map)

Happy Learning!

If you need any assistance, feel free to join our forum.