PyTorch Tutorial (3): A CIFAR-10 Image Classifier Using a Convolutional Neural Network (CNN)

This time we will implement a convolutional neural network (CNN).

For data, we will use the CIFAR-10 dataset, which wise predecessors made public long ago.

Please use a GPU for this example.

If you do want to run it on a CPU, write a simpler network.

Keep in mind that with the code below, the speed difference between the two is close to 30x.

The code processes the CIFAR-10 dataset with PyTorch and implements a CNN. The main points are summarized briefly below.

1. CIFAR-10 Dataset Overview

CIFAR-10 is a dataset widely used in computer vision. It consists of 60,000 32x32 color images in 10 classes (airplane, automobile, bird, cat, and so on).
It is divided into 50,000 training images and 10,000 test images.
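The figures above imply a few derived numbers that come up again later in the tutorial. The following is a minimal sketch in plain Python. The variable names are illustrative, and the even per-class split is an assumption about the standard CIFAR-10 release rather than something stated above.

```python
# Derived sizes for CIFAR-10, using only the figures quoted above.
num_classes = 10
num_train, num_test = 50_000, 10_000
channels, height, width = 3, 32, 32

values_per_image = channels * height * width      # numbers stored per image
train_per_class = num_train // num_classes        # assumes balanced classes
test_per_class = num_test // num_classes

# Size of the training set if every value were held as a float32 (4 bytes)
train_bytes_float32 = num_train * values_per_image * 4

print(values_per_image)                            # 3072
print(train_per_class, test_per_class)             # 5000 1000
print(f"{train_bytes_float32 / 1024**2:.0f} MiB")  # 586 MiB
```

The per-class counts can be checked against the directory listing printed in the preprocessing step.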

2. Main Steps of the Code

(1) Downloading the data

The image dataset is downloaded with the torchvision library.

# Download the dataset
from torchvision.datasets.utils import download_url

dataset_url = "https://s3.amazonaws.com/fast-ai-imageclas/cifar10.tgz"
download_url(dataset_url, '.')

# Extract from archive
import tarfile

data_dir = './data'
with tarfile.open('./cifar10.tgz', 'r:gz') as tar:
    tar.extractall(path=data_dir)

(2) Preprocessing the data

Open the extracted archive and check how the data is organized.
- The training set has 50,000 images and the test set has 10,000.
- The images are sorted into 10 classes.

import os
data_dir = data_dir + "/cifar10"

print(os.listdir(data_dir))
classes = os.listdir(data_dir + "/train")
print(classes)

for c in classes:
    print(f"{c:10s} (train): {len(os.listdir(data_dir + '/train/' + c))}")
    print(f"{c:10s} ( test): {len(os.listdir(data_dir + '/test/' + c))}")

Each image is converted into a Tensor of per-pixel values.

from torchvision.datasets import ImageFolder
from torchvision.transforms import ToTensor
dataset_train = ImageFolder(data_dir+'/train', transform=ToTensor())

print(f"Total size of the dataset: {len(dataset_train)}")

# RGB 32x32 pixels per image; tensor shape is 3 (channel) * 32 (height) * 32 (width)
img, label = dataset_train[0]
print(img.shape, label)
img
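To make the conversion concrete: ToTensor takes an image stored as height x width x channel integers in the range 0 to 255 and returns channel x height x width floats in the range 0.0 to 1.0. The sketch below imitates that with nested Python lists. The function name to_chw_float and the 2x2 image are hypothetical, and this is not how torchvision implements it.

```python
# Plain-Python sketch of what ToTensor does to one image:
# (H, W, C) integers in [0, 255]  ->  (C, H, W) floats in [0.0, 1.0]
def to_chw_float(image_hwc):
    height = len(image_hwc)
    width = len(image_hwc[0])
    channels = len(image_hwc[0][0])
    return [
        [[image_hwc[y][x][c] / 255.0 for x in range(width)]
         for y in range(height)]
        for c in range(channels)
    ]

# A hypothetical 2x2 RGB image: red, green / blue, white
tiny = [
    [[255, 0, 0], [0, 255, 0]],
    [[0, 0, 255], [255, 255, 255]],
]
chw = to_chw_float(tiny)
print(len(chw), len(chw[0]), len(chw[0][0]))  # 3 2 2
print(chw[0])  # red channel: [[1.0, 0.0], [0.0, 1.0]]
```

This is why img.shape in the code above has the channel dimension first.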

The training set is split 9:1 into a training subset and a validation subset.

from torch.utils.data import random_split

val_size = int(len(dataset_train)*.1)
train_size = len(dataset_train) - val_size

train_ds, val_ds = random_split(dataset_train, [train_size, val_size])
test_ds = ImageFolder(data_dir+'/test', transform=ToTensor())
len(train_ds), len(val_ds), len(test_ds)
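Conceptually, a random split shuffles the sample indices and cuts the shuffled list in two. The sketch below shows that idea in plain Python with a fixed seed. The helper split_indices is hypothetical and is not PyTorch's implementation. If a repeatable split is needed with random_split itself, it accepts a generator argument such as torch.Generator().manual_seed(42).

```python
import random

def split_indices(n_items, val_fraction, seed):
    """Shuffle range(n_items) and cut it into train and validation parts."""
    val_size = int(n_items * val_fraction)
    train_size = n_items - val_size
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)   # fixed seed -> same split every run
    return indices[:train_size], indices[train_size:]

train_idx, val_idx = split_indices(50_000, 0.1, seed=42)
print(len(train_idx), len(val_idx))        # 45000 5000
print(set(train_idx).isdisjoint(val_idx))  # True
```

Without a fixed seed the validation images differ from run to run, which makes validation scores harder to compare between experiments.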

Finally, the CIFAR-10 data is wrapped in PyTorch DataLoader objects:
- train_loader: loader for the training data
- val_loader: loader for the validation data
- test_loader: loader for the test data

from torch.utils.data import DataLoader

batch_size=128
train_loader = DataLoader(train_ds, batch_size, shuffle=True)
val_loader = DataLoader(val_ds, batch_size*2)
test_loader = DataLoader(test_ds, batch_size*2)

# Train data
for images, labels in train_loader:
    print('images.shape:', images.shape)
    break

# Validation data
for images, labels in val_loader:
    print('images.shape:', images.shape)
    break
    
# Test data
for images, labels in test_loader:
    print('images.shape:', images.shape)
    break
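The loops above print only the first batch of each loader. How many batches each loader yields, and how large the last one is, follows from the dataset sizes and batch sizes. A small sketch, assuming the default drop_last=False and the 45,000 / 5,000 / 10,000 split from the previous step (batch_stats is a hypothetical helper):

```python
import math

def batch_stats(n_samples, batch_size):
    """Return (number of batches, size of the last batch) with drop_last=False."""
    n_batches = math.ceil(n_samples / batch_size)
    last = n_samples - (n_batches - 1) * batch_size
    return n_batches, last

batch_size = 128
for name, n, bs in [("train", 45_000, batch_size),
                    ("val", 5_000, batch_size * 2),
                    ("test", 10_000, batch_size * 2)]:
    n_batches, last = batch_stats(n, bs)
    print(f"{name:5s}: {n_batches} batches, last batch has {last} images")
```

The number of batches is what len(train_loader) and len(val_loader) return, and the training loop later divides the summed loss by exactly these values.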

(3) Designing the CNN model

The model is defined by subclassing PyTorch's nn.Module class.
A typical CNN structure:
1. Several convolutional layers with ReLU activations
2. Max pooling for downsampling
3. Fully connected layers to compute the output

from torch import nn

class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels=3, out_channels=32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)
        self.conv3 = nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding=1)
        self.conv4 = nn.Conv2d(in_channels=128, out_channels=128, kernel_size=3, padding=1)
        self.conv5 = nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding=1)
        self.conv6 = nn.Conv2d(in_channels=256, out_channels=256, kernel_size=3, padding=1)

        self.batchn1 = nn.BatchNorm2d(32)
        self.batchn2 = nn.BatchNorm2d(64)
        self.batchn3 = nn.BatchNorm2d(128)
        self.batchn4 = nn.BatchNorm2d(128)
        self.batchn5 = nn.BatchNorm2d(256)
        self.batchn6 = nn.BatchNorm2d(256)
        self.relu = nn.ReLU()
        self.maxpool = nn.MaxPool2d(kernel_size=2, stride=2)

        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(256*4*4, 1024)
        self.fc2 = nn.Linear(1024, 512)
        self.fc3 = nn.Linear(512, 10)

    def forward(self, x):                           # Output shape
        x = self.relu(self.batchn1(self.conv1(x)))  #  32 * 32 * 32
        x = self.relu(self.batchn2(self.conv2(x)))  #  64 * 32 * 32
        x = self.maxpool(x)                         #  64 * 16 * 16

        x = self.relu(self.batchn3(self.conv3(x)))  # 128 * 16 * 16
        x = self.relu(self.batchn4(self.conv4(x)))  # 128 * 16 * 16
        x = self.maxpool(x)                         # 128 *  8 *  8

        x = self.relu(self.batchn5(self.conv5(x)))  # 256 *  8 *  8
        x = self.relu(self.batchn6(self.conv6(x)))  # 256 *  8 *  8
        x = self.maxpool(x)                         # 256 *  4 *  4

        x = self.flatten(x)                         # 4096 = 256*4*4
        x = self.relu(self.fc1(x))                  # 1024
        x = self.relu(self.fc2(x))                  # 512
        x = self.fc3(x)                             # 10
        return x
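The output shapes in the comments of forward can be derived from the standard size formula for convolution and pooling layers, out = floor((in + 2 * padding - kernel) / stride) + 1. A minimal sketch, assuming dilation 1 (out_size is a hypothetical helper, not a PyTorch function):

```python
def out_size(size, kernel, stride=1, padding=0):
    """Spatial output size of a conv or pooling layer (dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32
for block in range(3):
    size = out_size(size, kernel=3, padding=1)   # two 3x3 convs keep the size
    size = out_size(size, kernel=3, padding=1)
    size = out_size(size, kernel=2, stride=2)    # 2x2 max pool halves it
    print(f"after block {block + 1}: {size}x{size}")

flat_features = 256 * size * size
print(flat_features)   # 4096, the in_features of fc1
```

If the input resolution or the number of pooling stages changes, fc1 must be resized to match the new flattened size.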

import torch
from torchsummary import summary

device = 'cuda' if torch.cuda.is_available() else 'cpu'
model = SimpleCNN().to(device)
summary(model, (3, 32, 32))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
================================================================
            Conv2d-1           [-1, 32, 32, 32]             896
       BatchNorm2d-2           [-1, 32, 32, 32]              64
              ReLU-3           [-1, 32, 32, 32]               0
            Conv2d-4           [-1, 64, 32, 32]          18,496
       BatchNorm2d-5           [-1, 64, 32, 32]             128
              ReLU-6           [-1, 64, 32, 32]               0
         MaxPool2d-7           [-1, 64, 16, 16]               0
            Conv2d-8          [-1, 128, 16, 16]          73,856
       BatchNorm2d-9          [-1, 128, 16, 16]             256
             ReLU-10          [-1, 128, 16, 16]               0
           Conv2d-11          [-1, 128, 16, 16]         147,584
      BatchNorm2d-12          [-1, 128, 16, 16]             256
             ReLU-13          [-1, 128, 16, 16]               0
        MaxPool2d-14            [-1, 128, 8, 8]               0
           Conv2d-15            [-1, 256, 8, 8]         295,168
      BatchNorm2d-16            [-1, 256, 8, 8]             512
             ReLU-17            [-1, 256, 8, 8]               0
           Conv2d-18            [-1, 256, 8, 8]         590,080
      BatchNorm2d-19            [-1, 256, 8, 8]             512
             ReLU-20            [-1, 256, 8, 8]               0
        MaxPool2d-21            [-1, 256, 4, 4]               0
          Flatten-22                 [-1, 4096]               0
           Linear-23                 [-1, 1024]       4,195,328
             ReLU-24                 [-1, 1024]               0
           Linear-25                  [-1, 512]         524,800
             ReLU-26                  [-1, 512]               0
           Linear-27                   [-1, 10]           5,130
================================================================
Total params: 5,853,066
Trainable params: 5,853,066
Non-trainable params: 0
----------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 4.77
Params size (MB): 22.33
Estimated Total Size (MB): 27.11
----------------------------------------------------------------
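The parameter counts in the table can be reproduced by hand, which is a useful sanity check when changing the architecture. The sketch below uses the usual formulas for layers with bias terms. The helper names are illustrative.

```python
def conv_params(c_in, c_out, k):
    return (k * k * c_in + 1) * c_out      # weights + one bias per filter

def bn_params(channels):
    return 2 * channels                    # learnable scale and shift

def linear_params(n_in, n_out):
    return (n_in + 1) * n_out              # weights + bias

conv_channels = [(3, 32), (32, 64), (64, 128), (128, 128), (128, 256), (256, 256)]
fc_sizes = [(256 * 4 * 4, 1024), (1024, 512), (512, 10)]

total = sum(conv_params(i, o, 3) + bn_params(o) for i, o in conv_channels)
total += sum(linear_params(i, o) for i, o in fc_sizes)

print(f"{total:,}")                        # 5,853,066
print(f"{total * 4 / 1024**2:.2f} MB")     # 22.33 (float32, 4 bytes each)
```

The first fully connected layer alone holds 4,195,328 of these parameters, roughly 72% of the model.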

(4) Loss function and optimizer

Loss function: CrossEntropyLoss
Optimizer: Adam

# Loss function & Optimizer
from torch import optim

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
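nn.CrossEntropyLoss takes the raw scores (logits) from the last layer, applies a log-softmax internally, and averages the per-sample losses over the batch by default. For a single sample the loss is the negative log of the softmax probability assigned to the true class. A plain-Python sketch of that per-sample value (the function name and example scores are hypothetical):

```python
import math

def cross_entropy(logits, label):
    """-log(softmax(logits)[label]) for one sample, computed stably."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]

# An untrained 10-class model that scores every class equally
uniform_loss = cross_entropy([0.0] * 10, label=3)
print(f"{uniform_loss:.4f}")   # 2.3026 = ln(10)

# A confident, correct prediction gives a loss close to zero
confident = [0.0] * 10
confident[3] = 10.0
print(cross_entropy(confident, label=3) < 0.01)   # True
```

A loss near ln(10), about 2.30, means the model is doing no better than guessing uniformly among the 10 classes.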

(5) Training

The data is fed into the model, and forward and backward propagation are run to update the weights.
The loss is recorded for every epoch.
- Watch how the validation loss changes.

import torch.nn.functional as F

epochs = 5
for epoch in range(epochs):
    # Training phase
    running_loss_train = 0.0
    model.train()
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss_train += loss.item()

    # Validation phase
    running_loss_val = 0.0
    running_acc_val = 0
    model.eval()
    with torch.no_grad():
        for inputs, labels in val_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            loss = criterion(outputs, labels)
            running_loss_val += loss.item()
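            # argmax over the raw logits matches argmax over the softmax output,
            # so the probabilities are kept only for inspection
            probabilities = F.softmax(outputs, dim=1)
            preds = torch.argmax(outputs, 1)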
            running_acc_val += torch.sum(preds == labels).item()

    print(f'Epoch [{epoch+1:2d}/{epochs:2d}], TRN Loss: {running_loss_train / len(train_loader):.4f}, VLD Loss: {running_loss_val / len(val_loader):.4f}, VLD Acc: {running_acc_val / len(val_ds):.4f}')
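In the validation phase the predictions come from argmax over the raw outputs, not over the softmax probabilities. This is safe because softmax preserves the ordering of the scores, so the largest logit always gets the largest probability. A small plain-Python check with hypothetical scores:

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

logits = [1.2, -0.3, 4.1, 0.0, 2.5]   # hypothetical scores for 5 classes
probs = softmax(logits)

print(argmax(logits), argmax(probs))   # 2 2
print(abs(sum(probs) - 1.0) < 1e-9)    # True
```

The softmax call is therefore only needed when the probabilities themselves are of interest, for example to report the confidence of a prediction.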

3. Training Results

The CNN in this code trains reasonably well on CIFAR-10 and reaches about 78% accuracy on the test set after 5 epochs.

acc_test = 0
model.eval()
with torch.no_grad():
    for inputs, labels in test_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        outputs = model(inputs)
        probabilities = F.softmax(outputs, dim=1)
        preds = torch.argmax(outputs, 1)
        acc_test += torch.sum(preds == labels).item()
print(f'Test Acc: {acc_test / len(test_loader.dataset)*100:.2f}%')

Test Acc: 78.02%

Higher accuracy may require data augmentation, a more complex model design, hyperparameter tuning, and similar measures.
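As an illustration of data augmentation, two transformations often applied to CIFAR-10 are a random horizontal flip and a random crop after zero padding. With torchvision these are typically done by adding transforms.RandomHorizontalFlip and transforms.RandomCrop to the training transform only. The sketch below shows the idea in plain Python on a single-channel image. The function names, padding size and 3x3 image are hypothetical, and this was not part of the run that produced the result above.

```python
import random

def hflip(image):
    """Mirror a 2-D image (list of rows) left to right."""
    return [row[::-1] for row in image]

def random_crop_with_padding(image, padding, rng):
    """Zero-pad every side, then cut out a window of the original size."""
    h, w = len(image), len(image[0])
    padded_w = w + 2 * padding
    padded = ([[0] * padded_w for _ in range(padding)]
              + [[0] * padding + row + [0] * padding for row in image]
              + [[0] * padded_w for _ in range(padding)])
    top = rng.randint(0, 2 * padding)
    left = rng.randint(0, 2 * padding)
    return [row[left:left + w] for row in padded[top:top + h]]

def augment(image, rng, padding=1, flip_prob=0.5):
    if rng.random() < flip_prob:
        image = hflip(image)
    return random_crop_with_padding(image, padding, rng)

rng = random.Random(0)                       # seeded for repeatable demos
image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]    # hypothetical 3x3 single-channel image
augmented = augment(image, rng)
print(hflip(image)[0])                       # [3, 2, 1]
print(len(augmented), len(augmented[0]))     # 3 3
```

Validation and test images should be left unaugmented so that the reported accuracy stays comparable between runs.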

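4. Conclusion

This code is suitable for people who are learning PyTorch for the first time or who want to do image classification with a CNN. Starting with a relatively simple dataset such as CIFAR-10 makes it a good example for learning the basic concepts of CNNs.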

5. References

https://www.kaggle.com/code/shadabhussain/cifar-10-cnn-using-pytorch
CIFAR 10- CNN using PyTorch (Kaggle notebook)

https://medium.com/@sergioalves94/deep-learning-in-pytorch-with-cifar-10-dataset-858b504a6b54
Deep Learning in PyTorch with CIFAR-10 dataset (Medium)

License: Attribution-NonCommercial


Dr.Penguin
