How to use the 10,582 trainaug images with the DeeplabV3 code?

You know what I mean if you have experience training segmentation models on the Pascal VOC dataset: the dataset itself only provides 1,464 pixel-level image annotations for training, yet every paper trains on 10,582 images, a split usually called trainaug. The additional annotations come from SBD, but their format is not the same as Pascal VOC's. Fortunately, someone has already produced a converted version, SegmentationClassAug.

The DeeplabV3 code does not include the SBD annotations, for reasons we can understand, so I wrote a simple script to fill the gap.

To use the 10,582 trainaug images with the DeeplabV3 code, you just need to follow these steps:

1. Create a script named convert_voc2012_aug.sh with the following content.

#!/bin/bash
# Exit immediately if a command exits with a non-zero status.
set -e

CURRENT_DIR=$(pwd)
WORK_DIR="./pascal_voc_seg"
mkdir -p ${WORK_DIR}

cd ${WORK_DIR}
tar -xf "../VOCtrainval_11-May-2012.tar"
cp "../trainaug.txt" "./VOCdevkit/VOC2012/ImageSets/Segmentation"
unzip "../SegmentationClassAug.zip" -d "./VOCdevkit/VOC2012"
rm -r "./VOCdevkit/VOC2012/__MACOSX"

cd ${CURRENT_DIR}

# Root path for PASCAL VOC 2012 dataset.
PASCAL_ROOT="${WORK_DIR}/VOCdevkit/VOC2012"

# Remove the colormap in the ground truth annotations.
SEG_FOLDER="${PASCAL_ROOT}/SegmentationClassAug"
SEMANTIC_SEG_FOLDER="${PASCAL_ROOT}/SegmentationClassAugRaw"

echo "Removing the color map in ground truth annotations..."
python ./remove_gt_colormap.py \
  --original_gt_folder="${SEG_FOLDER}" \
  --output_dir="${SEMANTIC_SEG_FOLDER}"

# Build TFRecords of the dataset.
# First, create output directory for storing TFRecords.
OUTPUT_DIR="${WORK_DIR}/tfrecord"
mkdir -p "${OUTPUT_DIR}"

IMAGE_FOLDER="${PASCAL_ROOT}/JPEGImages"
LIST_FOLDER="${PASCAL_ROOT}/ImageSets/Segmentation"

echo "Converting PASCAL VOC 2012 dataset..."
python ./build_voc2012_data.py \
  --image_folder="${IMAGE_FOLDER}" \
  --semantic_segmentation_folder="${SEMANTIC_SEG_FOLDER}" \
  --list_folder="${LIST_FOLDER}" \
  --image_format="jpg" \
  --output_dir="${OUTPUT_DIR}"
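
Once the script finishes, it is worth sanity-checking the converted annotations. Here is a minimal check, assuming Pillow and NumPy are installed: the colormap-free label maps in SegmentationClassAugRaw should be single-channel and contain only the class indices 0-20 plus the ignore label 255.

import glob
import os

import numpy as np
from PIL import Image

RAW_DIR = "./pascal_voc_seg/VOCdevkit/VOC2012/SegmentationClassAugRaw"

files = sorted(glob.glob(os.path.join(RAW_DIR, "*.png")))
print("label maps found:", len(files))  # typically 12031 (10582 trainaug + 1449 val)

# Spot-check a handful of label maps: after remove_gt_colormap.py they should be
# single-channel images whose pixels are class indices 0-20 or the ignore label 255.
valid = set(range(21)) | {255}
for path in files[:20]:
    label = np.array(Image.open(path))
    assert label.ndim == 2, path + " is not a single-channel label map"
    assert set(np.unique(label)) <= valid, path + " contains unexpected label values"
print("spot check passed")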

2. Create a text file named trainaug.txt that lists the 10,582 trainaug image IDs, one ID per line.
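
If you do not have the ID list at hand, it can be rebuilt from the two archives of step 3: the trainaug IDs are simply the SegmentationClassAug file names minus the official val IDs. Here is a minimal sketch, assuming both archives have already been extracted into the layout used by the script above (adjust the paths to wherever you unpacked them):

import glob
import os

VOC_ROOT = "./pascal_voc_seg/VOCdevkit/VOC2012"            # layout created by convert_voc2012_aug.sh
AUG_DIR = os.path.join(VOC_ROOT, "SegmentationClassAug")   # extracted SegmentationClassAug.zip

# The val IDs must be excluded; every other image with an augmented annotation belongs to trainaug.
with open(os.path.join(VOC_ROOT, "ImageSets/Segmentation/val.txt")) as f:
    val_ids = {line.strip() for line in f if line.strip()}

aug_ids = {os.path.splitext(os.path.basename(p))[0]
           for p in glob.glob(os.path.join(AUG_DIR, "*.png"))}

trainaug_ids = sorted(aug_ids - val_ids)
print(len(trainaug_ids))  # expected: 10582

with open("trainaug.txt", "w") as f:
    f.write("\n".join(trainaug_ids) + "\n")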

3. Download the Pascal VOC 2012 dataset (VOCtrainval_11-May-2012.tar) and the SegmentationClassAug annotations (SegmentationClassAug.zip).
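
The VOC archive is served by the official PASCAL VOC host; SegmentationClassAug.zip is the community-converted annotation set mentioned above and has to be fetched from the link its author provides. A small download sketch for the VOC part (the URL below is the standard mirror and may change):

import os
import urllib.request

VOC_TAR = "VOCtrainval_11-May-2012.tar"
VOC_URL = "http://host.robots.ox.ac.uk/pascal/VOC/voc2012/" + VOC_TAR

if not os.path.exists(VOC_TAR):
    print("downloading", VOC_URL)
    urllib.request.urlretrieve(VOC_URL, VOC_TAR)

# SegmentationClassAug.zip is not hosted on the VOC server; download it manually
# and place it next to the tar file.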

4. Put all four files (convert_voc2012_aug.sh, trainaug.txt, VOCtrainval_11-May-2012.tar, SegmentationClassAug.zip) into the research/deeplab/datasets folder.

5. Make convert_voc2012_aug.sh executable (chmod +x convert_voc2012_aug.sh) and run it from research/deeplab/datasets.

6. Change the code in research/deeplab/datasets/segmentation_dataset.py from:

_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 1464,
        'trainval': 2913,
        'val': 1449,
    },
    num_classes=21,
    ignore_label=255,
)

to:

_PASCAL_VOC_SEG_INFORMATION = DatasetDescriptor(
    splits_to_sizes={
        'train': 1464,
        'trainaug': 10582,
        'trainval': 2913,
        'val': 1449,
    },
    num_classes=21,
    ignore_label=255,
)
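
The 10582 entry in splits_to_sizes must match the number of records actually written in step 1. A quick way to double-check, assuming TensorFlow 2.x is available (or TF 1.x with eager execution enabled) and that the shards follow the trainaug-* naming produced by build_voc2012_data.py:

import glob

import tensorflow as tf

# Shards written into the tfrecord folder by the conversion script
# (path relative to research/deeplab/datasets).
shards = glob.glob("./pascal_voc_seg/tfrecord/trainaug-*")
assert shards, "no trainaug TFRecord shards found; did step 1 run with trainaug.txt in place?"

# Count the serialized examples across all shards; the result should equal 10582.
num_records = sum(1 for _ in tf.data.TFRecordDataset(shards))
print("trainaug records:", num_records)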

7. Don't forget to set the train_split parameter to trainaug, either by editing its default in research/deeplab/train.py or by passing --train_split=trainaug when you launch training.