[Mask R-CNN] Segmenting DeepFashion2

Deep-Learning/[Vision] Practice 2020. 2. 11. 15:34

Data-Set : https://github.com/switchablenorms/DeepFashion2

1. Intro

I want to apply semantic segmentation to fashion. Among the publicly available fashion datasets, DeepFashion2 offers a large collection of clothing photos with annotations, so I plan to use it for roughly a year of research.

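As the name suggests, DeepFashion2 is a comprehensive fashion dataset. It collects photos from shopping stores and from consumers (presumably review photos), organized into 13 clothing categories, and is split into a training set (391K images), a validation set (34K images), and a test set (67K images).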

In addition, each item in an image is labeled with scale, occlusion, zoom-in, viewpoint, category, style, bounding box, dense landmarks, and a per-pixel mask.

The dataset examples below should make this easier to understand.

2. Data Organization

Let's examine the data organization by looking at 000001.jpg and 000001.json together.


import json
 
with open('000001.json', 'r') as f:
    content = json.load(f)
 
print(content.keys())
 
for key in content.keys():
    print("\n", content[key])

Running the code above shows that the keys are ['item2', 'source', 'pair_id', 'item1'].

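Here, source indicates whether the photo comes from a 'shop' or was taken by a 'user', and pair_id is a number shared by images from the same shop and their corresponding consumer-taken images. (According to the official dataset page, it is best to leave these two fields untouched.)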

The remaining keys, item1, item2, ..., itemn, are easier to follow alongside the image below.

The photo above is image 000001. The person in it is wearing two garments, 'trousers' and a 'T-shirt', which is why the JSON label contains two items. Which garment item1 and item2 each refer to will become clear as we go.
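
Because the number of itemN keys varies from image to image, a small helper can pull them out in numeric order. This is only a minimal sketch: the helper name item_keys and the sample dict are hypothetical stand-ins for the loaded JSON, and the values are made up rather than copied from 000001.json.

```python
def item_keys(annotation):
    """Return the 'itemN' keys of one annotation dict, sorted by N."""
    keys = [k for k in annotation if k.startswith("item") and k[4:].isdigit()]
    return sorted(keys, key=lambda k: int(k[4:]))


# Hypothetical stand-in for the parsed content of one annotation file.
sample = {
    "item2": {"category_name": "trousers"},
    "source": "user",
    "pair_id": 1,
    "item1": {"category_name": "short sleeve top"},
}

for key in item_keys(sample):
    print(key, "->", sample[key]["category_name"])
```

Sorting by the numeric suffix rather than alphabetically keeps item10 after item2 if an image ever contains that many garments.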


print(content['item1'].keys())

Now let's look at the data stored in each item. item1 contains ['segmentation', 'scale', 'viewpoint', 'zoom_in', 'landmarks', 'style', 'bounding_box', 'category_id', 'occlusion', 'category_name']. Using the information from the official page, I will go through what each field holds, in order.

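1) segmentation : a list of polygons, where each [x1, y1, ..., xn, yn] represents one polygon; a single clothing item may contain more than one polygon. In other words, these are the segmentation coordinates.

2) scale : 1 means small scale, 2 modest scale, and 3 large scale. It tells you how large the item appears.

3) viewpoint : 1 means the garment is not worn (no wear), 2 means a frontal viewpoint, and 3 means a side or back viewpoint.

4) zoom_in : 1 means no zoom-in, 2 medium zoom-in, and 3 large zoom-in.

5) landmarks : the landmark (keypoint) coordinates of the garment, stored as [x1, y1, v1, ..., xn, yn, vn], where v is the visibility. (v = 2 visible, v = 1 occluded, v = 0 not labeled)

6) style : a number that distinguishes clothing items across images sharing the same pair ID. According to the official README, a shop item and a user item form a positive commercial-consumer pair when they come from images with the same pair ID and share the same style number greater than 0. (I have not fully worked out how this is used in practice.)

7) bounding_box : the bounding box coordinates [x1, y1, x2, y2], i.e. the upper-left and lower-right corners.

8) category_id : 1 is short sleeve top, 2 long sleeve top, 3 short sleeve outwear, 4 long sleeve outwear, 5 vest, 6 sling, 7 shorts, 8 trousers, 9 skirt, 10 short sleeve dress, 11 long sleeve dress, 12 vest dress, and 13 sling dress.

9) occlusion : the degree of occlusion, where 1 is slight occlusion (including no occlusion), 2 medium occlusion, and 3 heavy occlusion.

10) category_name : the category name as a str.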
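
The numeric codes above are easier to read once they are mapped back to words. The following is a minimal sketch of such a lookup; the dictionary names, the describe_item helper, and the example item are hypothetical, while the code-to-label mappings follow the list above.

```python
# Lookup tables for the numeric codes described in the list above.
CATEGORY_NAMES = {
    1: "short sleeve top", 2: "long sleeve top",
    3: "short sleeve outwear", 4: "long sleeve outwear",
    5: "vest", 6: "sling", 7: "shorts", 8: "trousers", 9: "skirt",
    10: "short sleeve dress", 11: "long sleeve dress",
    12: "vest dress", 13: "sling dress",
}
SCALE = {1: "small", 2: "modest", 3: "large"}
VIEWPOINT = {1: "no wear", 2: "frontal", 3: "side or back"}
ZOOM_IN = {1: "none", 2: "medium", 3: "large"}
OCCLUSION = {1: "slight", 2: "medium", 3: "heavy"}


def describe_item(item):
    """Turn the numeric codes of one item dict into readable labels."""
    return {
        "category": CATEGORY_NAMES[item["category_id"]],
        "scale": SCALE[item["scale"]],
        "viewpoint": VIEWPOINT[item["viewpoint"]],
        "zoom_in": ZOOM_IN[item["zoom_in"]],
        "occlusion": OCCLUSION[item["occlusion"]],
    }


# Hypothetical item; the values are illustrative, not taken from the dataset.
example = {"category_id": 8, "scale": 2, "viewpoint": 2, "zoom_in": 1, "occlusion": 1}
print(describe_item(example))
```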

Let's look at landmarks in a little more detail, together with the picture below.

The numbers in the figure give the order of the landmarks for each category in the annotation file. In total, 294 landmarks are defined across the 13 categories.
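
Since landmarks are stored as one flat list, it helps to regroup them into (x, y, v) triplets before using them. A minimal sketch, assuming the [x1, y1, v1, ..., xn, yn, vn] layout described earlier; the function names and the sample list are hypothetical.

```python
def split_landmarks(flat):
    """Group a flat [x1, y1, v1, ...] list into (x, y, v) triplets."""
    if len(flat) % 3 != 0:
        raise ValueError("landmark list length must be a multiple of 3")
    return [tuple(flat[i:i + 3]) for i in range(0, len(flat), 3)]


def count_by_visibility(flat):
    """Count landmarks per visibility flag (2 visible, 1 occluded, 0 unlabeled)."""
    names = {2: "visible", 1: "occluded", 0: "unlabeled"}
    counts = {"visible": 0, "occluded": 0, "unlabeled": 0}
    for _, _, v in split_landmarks(flat):
        counts[names[v]] += 1
    return counts


# Made-up landmark list with four points.
sample_landmarks = [10, 20, 2, 30, 40, 1, 0, 0, 0, 5, 5, 2]
print(split_landmarks(sample_landmarks))
print(count_by_visibility(sample_landmarks))
```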

The dataset also mixes 'user' images and 'shop' images, arranged by consecutive 'pair_id'. For example, 000001.jpg is a 'user' image and 000002.jpg is a 'shop' image.

Items that belong to a matching pair therefore share the same style, which is why their bounding boxes are drawn in the same color in the picture below.
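
The pairing idea can be sketched in a few lines. This follows my reading of the official README (same pair_id and the same style number greater than 0 make a positive pair); the record layout, the function name, and the sample records are all hypothetical, so treat it as an illustration rather than the dataset's own tooling.

```python
from collections import defaultdict


def positive_pairs(records):
    """Return (user_image, shop_image) pairs sharing pair_id and a style > 0."""
    groups = defaultdict(lambda: {"user": [], "shop": []})
    for rec in records:
        if rec["style"] > 0:  # style 0 never forms a positive pair
            groups[(rec["pair_id"], rec["style"])][rec["source"]].append(rec["image"])
    pairs = []
    for key in sorted(groups):
        group = groups[key]
        pairs.extend((u, s) for u in group["user"] for s in group["shop"])
    return pairs


# Made-up records, one per clothing item.
records = [
    {"image": "000001.jpg", "source": "user", "pair_id": 1, "style": 1},
    {"image": "000002.jpg", "source": "shop", "pair_id": 1, "style": 1},
    {"image": "000003.jpg", "source": "shop", "pair_id": 1, "style": 0},
    {"image": "000004.jpg", "source": "user", "pair_id": 2, "style": 1},
]
print(positive_pairs(records))
```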

3. Mask R-CNN

To segment the DeepFashion2 dataset, I build on the Mask R-CNN sample code. I used balloon.py as the starting point; see the links below for details on balloon.py.

* balloon.py training process and results : https://kuklife.tistory.com/124?category=875154
* Mask R-CNN with Python/Tensorflow/Keras : https://github.com/matterport/Mask_RCNN

3-1. deepfashion2 to COCO format

According to the paper, DeepFashion2 works with the COCO annotation format. However, the annotations distributed with the dataset are not in COCO form, so they have to be converted. Fortunately, the authors published a script on GitHub that converts the annotations to COCO format, so let's use it.

* Conversion script : https://github.com/switchablenorms/DeepFashion2/blob/master/evaluation/deepfashion2_to_coco.py
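
To give a feel for what the conversion involves, here is a minimal sketch of turning one DeepFashion2 item into a COCO-style annotation. It is not the official script: the function name is hypothetical, landmark-to-keypoint conversion is left out, and using the box area as "area" is a simplification I am assuming for illustration.

```python
def to_coco_annotation(item, image_id, ann_id):
    """Build a COCO-style annotation dict from one DeepFashion2 item."""
    x1, y1, x2, y2 = item["bounding_box"]
    width, height = x2 - x1, y2 - y1
    return {
        "id": ann_id,
        "image_id": image_id,
        "category_id": item["category_id"],
        # COCO boxes are [x, y, width, height], not corner pairs.
        "bbox": [x1, y1, width, height],
        # Simplification: box area stands in for the mask area.
        "area": width * height,
        "segmentation": item["segmentation"],
        "iscrowd": 0,
    }


# Made-up item for illustration.
item = {
    "bounding_box": [10, 20, 110, 220],
    "category_id": 8,
    "segmentation": [[10, 20, 110, 20, 110, 220, 10, 220]],
}
print(to_coco_annotation(item, image_id=1, ann_id=1))
```

The "iscrowd" field matters later on, because the load_mask method in 3-4 reads annotation['iscrowd'] for every instance.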

Usage is as follows. After downloading the source, change the value of the num_images variable on line 107 to the number of annotations you want to convert.

For example, there are 191,961 train JSON files and 32,153 validation JSON files. To convert the train annotations to COCO format, modify the source as follows.

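# deepfashion2_to_coco.py, lines 106-109 in the author's copy
sub_index = 0 # the index of ground truth instance
for num in range(1,191961+1):  # 191,961 = number of train annotations
    # NOTE: when converting the train split, both paths below must point at
    # the train annotation and image directories, not the validation ones.
    json_name = '/.../val_annos/' + str(num).zfill(6)+'.json'
    image_name = '/.../val/' + str(num).zfill(6)+'.jpg'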

 

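Next, change the absolute path on line 234 to the root directory of the annotations being converted. A deepfashion2.json file is then created there; to keep the splits apart, I recommend renaming it, for example to "validation_json" for the validation split (the configuration in 3-3 expects train_json.json and validation_json.json).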
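
Rather than hard-coding the annotation count, it can be read from the directory itself. A minimal sketch, assuming the annotations are named with six-digit numbers such as 000001.json as in the snippet above; both helper names are hypothetical and the directory you pass in is up to your own layout.

```python
from pathlib import Path


def annotation_name(num):
    """Six-digit, zero-padded file name, matching str(num).zfill(6) + '.json'."""
    return str(num).zfill(6) + ".json"


def count_annotations(anno_dir):
    """Count the six-digit-named .json annotation files in a directory."""
    return sum(
        1
        for path in Path(anno_dir).glob("*.json")
        if len(path.stem) == 6 and path.stem.isdigit()
    )
```

The result of count_annotations can then be used in place of the literal in range(1, n + 1), which avoids editing the number by hand for each split.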

3-2. Imports & ROOT_DIR setup

Once the setup above is done, import the required libraries and set ROOT_DIR.


import os
import sys
import json
import datetime
import numpy as np
import skimage.color
import skimage.draw
import skimage.io
 
# Root directory of the project
ROOT_DIR = os.path.abspath("C:/Users/whtjd/Mask_RCNN")
 
# Import Mask RCNN
sys.path.append(ROOT_DIR)
 
from cocoapi.PythonAPI.pycocotools.coco import COCO
from cocoapi.PythonAPI.pycocotools import mask as maskUtils
from mrcnn.config import Config
from mrcnn import utils
from mrcnn.model import MaskRCNN
 
# Path to trained weights file
COCO_WEIGHTS_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
 
# Directory to save logs and model checkpoints, if not provided
# through the command line argument --logs
DEFAULT_LOGS_DIR = os.path.join(ROOT_DIR, "logs")

 

 
3-3. Configuration

It is convenient to store the absolute paths of the COCO-format annotations from 3-1 and of the images as variables before building the config, so I recommend setting them up in advance as shown below.


############################################################
#  Configurations
############################################################
 
class DeepFashion2Config(Config):
    """Configuration for training on DeepFashion2.
    Derives from the base Config class and overrides values specific
    to the DeepFashion2 dataset.
    """
    # Give the configuration a recognizable name
    NAME = "deepfashion2"
 
    # We use a GPU with 12GB memory, which can fit two images.
    # Adjust down if you use a smaller GPU.
    IMAGES_PER_GPU = 2
 
    # Number of GPUs to train on (default is 1)
    GPU_COUNT = 1
 
    # Number of classes (including background)
    NUM_CLASSES = 1 + 13  # Background + category
    
    USE_MINI_MASK = True
 
    train_img_dir = "C:/Users/whtjd/Mask_RCNN/DeepFashion2/train/image"
    train_json_path = "C:/Users/whtjd/Mask_RCNN/DeepFashion2/train/train_json.json"
    valid_img_dir = "C:/Users/whtjd/Mask_RCNN/DeepFashion2/validation/image"
    valid_json_path = "C:/Users/whtjd/Mask_RCNN/DeepFashion2/validation/validation_json.json"

 

3-4. Dataset-loading class (COCO, keypoints, masks, images)


############################################################
#  Dataset
############################################################
class DeepFashion2Dataset(utils.Dataset):
    def load_coco(self, image_dir, json_path, class_ids=None,
                  class_map=None, return_coco=False):
        """Load the DeepFashion2 dataset.
        """
 
        coco = COCO(json_path)
 
        # Load all classes or a subset?
        if not class_ids:
            # All classes
            class_ids = sorted(coco.getCatIds())
 
        # All images or a subset?
        if class_ids:
            image_ids = []
            for id in class_ids:
                image_ids.extend(list(coco.getImgIds(catIds=[id])))
            # Remove duplicates
            image_ids = list(set(image_ids))
        else:
            # All images
            image_ids = list(coco.imgs.keys())
 
        # Add classes
        for i in class_ids:
            self.add_class("deepfashion2", i, coco.loadCats(i)[0]["name"])
 
        # Add images
        for i in image_ids:
            self.add_image(
                "deepfashion2", image_id=i,
                path=os.path.join(image_dir, coco.imgs[i]['file_name']),
                width=coco.imgs[i]["width"],
                height=coco.imgs[i]["height"],
                annotations=coco.loadAnns(coco.getAnnIds(
                    imgIds=[i], catIds=class_ids, iscrowd=None)))
        if return_coco:
            return coco
        
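    def load_keypoint(self, image_id):
        """Load landmark keypoints for the given image.
        Returns:
        keypoints: array stacking the COCO 'keypoints' list of each
            instance along axis 1.
        class_ids: a 1D array of class IDs of the instances.
        """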
        image_info = self.image_info[image_id]
        if image_info["source"] != "deepfashion2":
            return super(DeepFashion2Dataset, self).load_mask(image_id)
 
        instance_keypoints = []
        class_ids = []
        annotations = self.image_info[image_id]["annotations"]
 
        for annotation in annotations:
            class_id = self.map_source_class_id(
                "deepfashion2.{}".format(annotation['category_id']))
            if class_id:
                keypoint = annotation['keypoints']
 
                instance_keypoints.append(keypoint)
                class_ids.append(class_id)
 
        keypoints = np.stack(instance_keypoints, axis=1)
        class_ids = np.array(class_ids, dtype=np.int32)
        return keypoints, class_ids
 
    def load_mask(self, image_id):
        """Load instance masks for the given image.
        Different datasets use different ways to store masks. This
        function converts the different mask format to one format
        in the form of a bitmap [height, width, instances].
        Returns:
        masks: A bool array of shape [height, width, instance count] with
            one mask per instance.
        class_ids: a 1D array of class IDs of the instance masks.
        """
        # If not a COCO image, delegate to parent class.
        image_info = self.image_info[image_id]
        if image_info["source"] != "deepfashion2":
            return super(DeepFashion2Dataset, self).load_mask(image_id)
 
        instance_masks = []
        class_ids = []
        annotations = self.image_info[image_id]["annotations"]
        # Build mask of shape [height, width, instance_count] and list
        # of class IDs that correspond to each channel of the mask.
        for annotation in annotations:
            class_id = self.map_source_class_id(
                "deepfashion2.{}".format(annotation['category_id']))
            if class_id:
                m = self.annToMask(annotation, image_info["height"],
                                   image_info["width"])
                # Some objects are so small that they're less than 1 pixel area
                # and end up rounded out. Skip those objects.
                if m.max() < 1:
                    continue
                # Is it a crowd? If so, use a negative class ID.
                if annotation['iscrowd']:
                    # Use negative class ID for crowds
                    class_id *= -1
                    # For crowd masks, annToMask() sometimes returns a mask
                    # smaller than the given dimensions. If so, resize it.
                    if m.shape[0] != image_info["height"] or m.shape[1] != image_info["width"]:
                        m = np.ones([image_info["height"], image_info["width"]], dtype=bool)
                instance_masks.append(m)
                class_ids.append(class_id)
 
        # Pack instance masks into an array
        if class_ids:
            # np.bool was removed from recent NumPy releases; use the builtin.
            mask = np.stack(instance_masks, axis=2).astype(bool)
            class_ids = np.array(class_ids, dtype=np.int32)
            return mask, class_ids
        else:
            # Call super class to return an empty mask
            return super(DeepFashion2Dataset, self).load_mask(image_id)
        
    def image_reference(self, image_id):
        """Return a reference for the image (delegates to the parent class)."""
        return super(DeepFashion2Dataset, self).image_reference(image_id)
 
    # The following two functions are from pycocotools with a few changes.
 
    def annToRLE(self, ann, height, width):
        """
        Convert annotation which can be polygons, uncompressed RLE to RLE.
        :return: RLE (run-length encoding) of the mask
        """
        segm = ann['segmentation']
        if isinstance(segm, list):
            # polygon -- a single object might consist of multiple parts
            # we merge all parts into one mask rle code
            rles = maskUtils.frPyObjects(segm, height, width)
            rle = maskUtils.merge(rles)
        elif isinstance(segm['counts'], list):
            # uncompressed RLE
            rle = maskUtils.frPyObjects(segm, height, width)
        else:
            # rle
            rle = ann['segmentation']
        return rle
 
    def annToMask(self, ann, height, width):
        """
        Convert annotation which can be polygons, uncompressed RLE, or RLE to binary mask.
        :return: binary mask (numpy 2D array)
        """
        rle = self.annToRLE(ann, height, width)
        m = maskUtils.decode(rle)
        return m
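
To make the polygon-to-mask step less of a black box, here is a pure-Python sketch of what annToMask conceptually produces for polygon annotations. It is not how pycocotools rasterizes (that library has its own routine and edge handling); the function names are hypothetical and the rule used here, "pixel centre inside the polygon", is an assumption chosen for clarity.

```python
def polygon_to_mask(polygon, height, width):
    """Rasterize a flat [x1, y1, ..., xn, yn] polygon into a 0/1 grid.

    A pixel is set when its centre lies inside the polygon (even-odd rule).
    """
    points = list(zip(polygon[0::2], polygon[1::2]))
    edges = list(zip(points, points[1:] + points[:1]))
    mask = [[0] * width for _ in range(height)]
    for row in range(height):
        cy = row + 0.5
        for col in range(width):
            cx = col + 0.5
            inside = False
            for (xa, ya), (xb, yb) in edges:
                # Does a horizontal ray from the pixel centre cross this edge?
                if (ya > cy) != (yb > cy):
                    x_cross = xa + (cy - ya) * (xb - xa) / (yb - ya)
                    if cx < x_cross:
                        inside = not inside
            mask[row][col] = int(inside)
    return mask


def item_mask(segmentation, height, width):
    """Union of all polygons of one item, mirroring the 'merge parts' idea."""
    merged = [[0] * width for _ in range(height)]
    for polygon in segmentation:
        part = polygon_to_mask(polygon, height, width)
        for r in range(height):
            for c in range(width):
                merged[r][c] |= part[r][c]
    return merged


square = [1, 1, 4, 1, 4, 4, 1, 4]
for line in polygon_to_mask(square, 5, 5):
    print(line)
```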

 

3-5. Train


def train(model, config):
    """Train the model on the DeepFashion2 train/validation splits."""
    dataset_train = DeepFashion2Dataset()
    dataset_train.load_coco(config.train_img_dir, config.train_json_path)
    dataset_train.prepare()
 
    dataset_valid = DeepFashion2Dataset()
    dataset_valid.load_coco(config.valid_img_dir, config.valid_json_path)
    dataset_valid.prepare()
 
    model.train(dataset_train, dataset_valid,
                learning_rate=config.LEARNING_RATE,
                epochs=30,
                layers='3+')

 

3-6. Splash


############################################################
#  Splash
############################################################
 
def color_splash(image, mask):
    """Apply color splash effect.
    image: RGB image [height, width, 3]
    mask: instance segmentation mask [height, width, instance count]
 
    Returns result image.
    """
    # Make a grayscale copy of the image. The grayscale copy still
    # has 3 RGB channels, though.
    gray = skimage.color.gray2rgb(skimage.color.rgb2gray(image)) * 255
    # Copy color pixels from the original color image where mask is set
    if mask.shape[-1] > 0:
        # We're treating all instances as one, so collapse the mask into one layer
        mask = (np.sum(mask, -1, keepdims=True) >= 1)
        splash = np.where(mask, image, gray).astype(np.uint8)
    else:
        splash = gray.astype(np.uint8)
    return splash
 
 
def detect_and_color_splash(model, image_path=None, video_path=None):
    assert image_path or video_path
 
    # Image or video?
    if image_path:
        # Run model detection and generate the color splash effect
        print("Running on {}".format(image_path))
        # Read image
        image = skimage.io.imread(image_path)
        # Detect objects
        r = model.detect([image], verbose=1)[0]
        # Color splash
        splash = color_splash(image, r['masks'])
        # Save output
        file_name = "splash_{:%Y%m%dT%H%M%S}.png".format(datetime.datetime.now())
        skimage.io.imsave(file_name, splash)
    elif video_path:
        import cv2
        # Video capture
        vcapture = cv2.VideoCapture(video_path)
        width = int(vcapture.get(cv2.CAP_PROP_FRAME_WIDTH))
        height = int(vcapture.get(cv2.CAP_PROP_FRAME_HEIGHT))
        fps = vcapture.get(cv2.CAP_PROP_FPS)
 
        # Define codec and create video writer
        file_name = "splash_{:%Y%m%dT%H%M%S}.avi".format(datetime.datetime.now())
        vwriter = cv2.VideoWriter(file_name,
                                  cv2.VideoWriter_fourcc(*'MJPG'),
                                  fps, (width, height))
 
        count = 0
        success = True
        while success:
            print("frame: ", count)
            # Read next image
            success, image = vcapture.read()
            if success:
                # OpenCV returns images as BGR, convert to RGB
                image = image[..., ::-1]
                # Detect objects
                r = model.detect([image], verbose=0)[0]
                # Color splash
                splash = color_splash(image, r['masks'])
                # RGB -> BGR to save image to video
                splash = splash[..., ::-1]
                # Add image to video writer
                vwriter.write(splash)
                count += 1
        vwriter.release()
    print("Saved to ", file_name)
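
The splash effect itself needs nothing beyond NumPy. The sketch below mirrors color_splash without skimage so the array shapes are easier to follow; the function name is hypothetical, and the luma weights are an assumption of this sketch (Rec. 709 style coefficients), not something taken from the code above.

```python
import numpy as np


def color_splash_np(image, mask):
    """Keep colour where any instance mask is set, grey elsewhere.

    image: uint8 array [height, width, 3]
    mask:  bool array [height, width, instance count]
    """
    # Assumed luma weights for the grey conversion.
    weights = np.array([0.2125, 0.7154, 0.0721])
    gray = image.astype(np.float64) @ weights            # [H, W]
    gray = np.repeat(gray[..., np.newaxis], 3, axis=-1)  # back to [H, W, 3]
    if mask.shape[-1] == 0:
        return gray.astype(np.uint8)
    # Collapse all instances into one layer, as color_splash does.
    any_instance = mask.any(axis=-1, keepdims=True)      # [H, W, 1]
    return np.where(any_instance, image, gray).astype(np.uint8)


# Tiny made-up example: 2x2 image, one instance covering the top-left pixel.
image = np.array([[[200, 10, 10], [10, 200, 10]],
                  [[10, 10, 200], [90, 120, 30]]], dtype=np.uint8)
mask = np.zeros((2, 2, 1), dtype=bool)
mask[0, 0, 0] = True
print(color_splash_np(image, mask))
```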

 

3-7. main


############################################################
#  main
############################################################
 
 
if __name__ == "__main__":
    ROOT_DIR = os.path.abspath("./")
    DEFAULT_LOGS_DIR = os.path.join(ROOT_DIR, "logs")
    COCO_WEIGHTS_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")
    import argparse
 
    # Parse command line arguments
    parser = argparse.ArgumentParser(
        description='Train Mask R-CNN on DeepFashion2.')
    parser.add_argument("command",
                        metavar="<command>",
                        help="'train' or 'splash'")
    parser.add_argument('--weights', required=True,
                        metavar="/path/to/weights.h5",
                        help="Path to weights .h5 file or 'coco'")
    parser.add_argument('--logs', required=False,
                        default=DEFAULT_LOGS_DIR,
                        metavar="/path/to/logs/",
                        help='Logs and checkpoints directory (default=logs/)')
    parser.add_argument('--image', required=False,
                        metavar="path or URL to image",
                        help='Image to apply the color splash effect on')
    parser.add_argument('--video', required=False,
                        metavar="path or URL to video",
                        help='Video to apply the color splash effect on')
    args = parser.parse_args()
 
    """
    # Validate arguments
    if args.command == "train":
        assert args.dataset, "Argument --dataset is required for training"
    elif args.command == "splash":
        assert args.image or args.video,\
               "Provide --image or --video to apply color splash"
    """
 
    print("Weights: ", args.weights)
    print("Logs: ", args.logs)
 
 
    # Configurations
    if args.command == "train":
        config = DeepFashion2Config()
    else:
        class InferenceConfig(DeepFashion2Config):
            # Set batch size to 1 since we'll be running inference on
            # one image at a time. Batch size = GPU_COUNT * IMAGES_PER_GPU
            GPU_COUNT = 1
            IMAGES_PER_GPU = 1
        config = InferenceConfig()
    config.display()
 
    # Create model
    if args.command == "train":
        model = MaskRCNN(mode="training", config=config,
                                  model_dir=args.logs)
    else:
        model = MaskRCNN(mode="inference", config=config,
                                  model_dir=args.logs)
 
    # Select weights file to load
    if args.weights.lower() == "coco":
        weights_path = COCO_WEIGHTS_PATH
        # Download weights file
        if not os.path.exists(weights_path):
            utils.download_trained_weights(weights_path)
    elif args.weights.lower() == "last":
        # Find last trained weights
        weights_path = model.find_last()
    elif args.weights.lower() == "imagenet":
        # Start from ImageNet trained weights
        weights_path = model.get_imagenet_weights()
    else:
        weights_path = args.weights
 
    # Load weights
    print("Loading weights ", weights_path)
    if args.weights.lower() == "coco":
        # Exclude the last layers because they require a matching
        # number of classes
        model.load_weights(weights_path, by_name=True, exclude=[
            "mrcnn_class_logits", "mrcnn_bbox_fc",
            "mrcnn_bbox", "mrcnn_mask"])
    else:
        model.load_weights(weights_path, by_name=True)
 
    # Train or evaluate
    if args.command == "train":
        train(model, config)
    elif args.command == "splash":
        detect_and_color_splash(model, image_path=args.image,
                                video_path=args.video)
    else:
        print("'{}' is not recognized. "
              "Use 'train' or 'splash'".format(args.command)) 

 

4. Training

0) How to improve accuracy (already applied in the source code)

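The train function contains layers='3+'. Mask R-CNN conveniently lets you choose which layers to train, from only the heads ('heads') up to the whole network ('all'), and adjusting this option can improve accuracy. I initially trained with the heads setting, learned about the option late, and then switched to '3+'.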
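
What a layers option selects can be illustrated with regular expressions over layer names. The patterns below are written in the style I recall from the Mask_RCNN repository's model.train(), so treat the exact strings as an assumption and check your copy of model.py; the layer names in the example are hypothetical samples, not a dump of the real network.

```python
import re

# Head layers: mask/class/box branches, RPN and FPN (assumed naming scheme).
HEADS = r"(mrcnn_.*)|(rpn_.*)|(fpn_.*)"


def _stages(first):
    """Regex for ResNet stages first..5 plus the heads."""
    parts = ["(res{0}.*)|(bn{0}.*)".format(s) for s in range(first, 6)]
    return "|".join(parts + [HEADS])


LAYER_REGEX = {
    "heads": HEADS,
    "3+": _stages(3),
    "4+": _stages(4),
    "5+": _stages(5),
    "all": r".*",
}


def trainable(layer_names, option):
    """Return the layer names that the given option would train."""
    pattern = LAYER_REGEX[option]
    return [name for name in layer_names if re.fullmatch(pattern, name)]


# Hypothetical sample of layer names.
names = ["conv1", "res2a_branch2a", "res3a_branch2a", "bn4b_branch2b",
         "res5c_branch2c", "fpn_c5p5", "rpn_class_raw", "mrcnn_mask"]
for option in ("heads", "3+", "all"):
    print(option, trainable(names, option))
```

Under these assumptions, '3+' leaves the earliest backbone layers frozen while still fine-tuning most of the network, which sits between 'heads' and 'all' in both cost and capacity.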

1) Training

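Once the code is in place, run it in much the same way as balloon.py, as shown below. The trained weights are saved in the logs folder under the root directory. With 1000 steps per epoch and 30 epochs, training takes a very long time. (As I recall, it took roughly 6 to 8 hours on a 1080Ti.)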
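
A quick back-of-the-envelope calculation shows how much data this schedule actually touches. It assumes each step processes one batch of GPU_COUNT * IMAGES_PER_GPU images (as the inference config comment in 3-7 states) and uses the 191,961 train annotation count from 3-1; the function name is hypothetical.

```python
def images_seen(steps_per_epoch, images_per_gpu, gpu_count, epochs):
    """Total number of image samples drawn over the whole training run."""
    batch_size = images_per_gpu * gpu_count
    return steps_per_epoch * batch_size * epochs


seen = images_seen(steps_per_epoch=1000, images_per_gpu=2, gpu_count=1, epochs=30)
train_images = 191961
print(seen)                           # 60000
print(round(seen / train_images, 2))  # 0.31
```

Under these assumptions the run draws about 60,000 samples, which is less than one full pass over the training set, so longer training is one plausible lever for better results.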

python Mask_RCNN/Mask_RCNN_DeepFashion2.py train --weights=coco

2) Checking the trained weights and applying the trained model

Let's try splash on images from the test folder. To see a variety of results, I checked five images in total: 000003, 000004, 000006, 000017, and 000035. The command below is an example.

python Mask_RCNN/Mask_RCNN_DeepFashion2.py splash --weights=Mask_RCNN/mask_rcnn_deepfashion2_0030.h5 --image=Mask_RCNN/DeepFashion2/test/test/image/000004.jpg

5. Evaluation

I intend to evaluate the results above using mIoU, a well-known evaluation metric for segmentation.
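
Until the evaluation is written up, here is a minimal sketch of how mIoU can be computed from per-pixel class labels. It uses one common definition (per-class intersection over union, averaged over classes that appear in the prediction or the ground truth); skipping absent classes is a choice of this sketch, and the function name and sample labels are hypothetical.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes, given flat sequences of per-pixel class ids."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for cls in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == cls and t == cls)
        union = sum(1 for p, t in zip(pred, target) if p == cls or t == cls)
        if union:  # ignore classes absent from both pred and target
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Four pixels, two classes: class 0 IoU = 1/2, class 1 IoU = 2/3.
pred = [0, 0, 1, 1]
target = [0, 1, 1, 1]
print(mean_iou(pred, target, num_classes=2))
```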

To be continued.

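6. Conclusion

Looking at the results, the model often failed to segment garments that were not fully visible, and in the third result it placed a mask on a book, which should not have been segmented at all. On closer inspection, it also tended to handle the outer edges of garments poorly.

In short, even without a full evaluation it is clear that the results are not good and the segmentation is far from perfect, but using more training data or adjusting the trained layers should make the segmentation more accurate.

Next, I plan to experiment with DeepLab v3+, often described as one of the best segmentation models, and with Panoptic DeepLab. (Until an idea for a paper comes up....)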

7. github link

To be continued.

If an error prevents you from running the experiment, please leave a comment or a guestbook entry. I would also be grateful for any suggestions on better ways to train the model.
