PyTorch Study Notes (8): AlexNet


Contents
  • 1. Building an Experiment Class
  • 2. A Brief Introduction to AlexNet
  • 3. Implementing AlexNet
  • 4. The CIFAR-10 Dataset
  • 5. Training and Testing AlexNet
  • Appendix: Complete Code

1. Building an Experiment Class

The training and testing code for convolutional neural networks is largely the same from model to model. To keep later code concise, we first build an Experiment class and save it in the file Experiment.py.


Experiment.py

import torch
import time
import numpy as np
import matplotlib.pyplot as plt


class Experiment:

    def __init__(self, train_loader, test_loader, model, num_epochs, lr, optimizer='SGD', quiet=False):
        self.train_loader = train_loader
        self.test_loader = test_loader
        self.model = model.cuda()
        self.num_epochs = num_epochs
        self.lr = lr
        self.loss_fn = torch.nn.CrossEntropyLoss()
        self.quiet = quiet
        self.speed = 0
        self.train_loss, self.train_acc = [], []
        self.test_loss, self.test_acc = [], []
        if optimizer == 'SGD':
            self.optimizer = torch.optim.SGD(self.model.parameters(), lr=self.lr)
        elif optimizer == 'Adam':
            self.optimizer = torch.optim.Adam(self.model.parameters(), lr=self.lr)
        else:
            raise ValueError("optimizer must be 'SGD' or 'Adam'")

    def train(self):
        self.model.train()  # enable training-mode behavior (e.g. Dropout)
        correct, avg_loss = 0, 0
        for batch_idx, (X, y) in enumerate(self.train_loader):
            tic = time.time()
            X, y = X.cuda(), y.cuda()
            pred = self.model(X)
            loss = self.loss_fn(pred, y)
            self.optimizer.zero_grad()
            loss.backward()
            self.optimizer.step()
            toc = time.time()
            self.speed += toc - tic
            correct += (pred.argmax(dim=1) == y).sum().item()
            avg_loss += loss.item()  # .item() avoids keeping the autograd graph alive
        avg_loss /= (batch_idx + 1)
        correct /= len(self.train_loader.dataset)
        self.train_loss.append(avg_loss)
        self.train_acc.append(correct)
        if not self.quiet:
            print('Train Avg Loss: {:.6f},'.format(avg_loss), end=' ')
            print('Train Accuracy: {:.6f}'.format(correct))

    def test(self):
        self.model.eval()  # disable Dropout etc. during evaluation
        correct, avg_loss = 0, 0
        with torch.no_grad():
            for batch_idx, (X, y) in enumerate(self.test_loader):
                X, y = X.cuda(), y.cuda()
                pred = self.model(X)
                loss = self.loss_fn(pred, y)
                correct += (pred.argmax(dim=1) == y).sum().item()
                avg_loss += loss.item()
        avg_loss /= (batch_idx + 1)
        correct /= len(self.test_loader.dataset)
        self.test_loss.append(avg_loss)
        self.test_acc.append(correct)
        if not self.quiet:
            print('Test  Avg Loss: {:.6f},'.format(avg_loss), end=' ')
            print('Test  Accuracy: {:.6f}\n'.format(correct))

    def main(self):
        for epoch in range(self.num_epochs):
            if not self.quiet:
                print('Epoch {}\n'.format(epoch + 1) + '-' * 50)
            self.train()
            self.test()
        self.speed = self.num_epochs * len(self.train_loader.dataset) / self.speed
        print('-' * 50)
        print('{:.1f} samples/sec'.format(self.speed))
        print('-' * 50 + '\n\n' + 'Done!')

    def show(self):
        x = np.arange(1, self.num_epochs + 1)
        plt.plot(x, self.train_loss, c='royalblue', label='train loss')
        plt.plot(x, self.train_acc, c='seagreen', label='train acc', ls='dashed')
        plt.plot(x, self.test_loss, c='darkorchid', label='test loss')
        plt.plot(x, self.test_acc, c='firebrick', label='test acc', ls='dashed')
        plt.legend(loc='best')
        plt.xlabel('epoch')
        plt.grid()
        plt.show()

Usage:

from Experiment import Experiment as E

net = Net()  # instantiate the network (no need to move it to the GPU; Experiment does that automatically)
num_epochs = 20  # number of epochs
lr = 1e-2  # learning rate
e = E(train_loader, test_loader, net, num_epochs, lr)
e.main()  # train and test
e.show()  # plot the loss and accuracy curves

Experiment supports only the SGD and Adam optimizers, with SGD as the default. To switch optimizers, pass the optimizer argument:

e = E(train_loader, test_loader, net, num_epochs, lr, optimizer='Adam')

By default, e.main() prints the results of every epoch. If you only want the final plots without this per-epoch output, set quiet to True:

e = E(train_loader, test_loader, net, num_epochs, lr, quiet=True)

In addition, Experiment trains on the GPU by default.
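The class above calls .cuda() unconditionally, so it fails on a machine without a GPU. One possible tweak (a hypothetical helper, not part of the class above) is to select the device at runtime:

```python
import torch

# Hypothetical helper: prefer the GPU when one is available, otherwise CPU.
def get_device():
    return torch.device('cuda' if torch.cuda.is_available() else 'cpu')

device = get_device()
# Inside Experiment one would then write, e.g.:
#   self.model = model.to(device)
#   X, y = X.to(device), y.to(device)
print(device)
```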

2. A Brief Introduction to AlexNet

AlexNet and LeNet share very similar design philosophies (AlexNet can roughly be viewed as a larger and deeper LeNet), but there are notable differences. Compared with LeNet, AlexNet uses ReLU as the activation function, replaces average pooling with max pooling, adds Dropout to the fully connected layers as regularization, and applies image augmentation to the input dataset.
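These differences are easy to see with the corresponding PyTorch modules (a small illustrative snippet, not from the original post):

```python
import torch
from torch import nn

x = torch.tensor([[-1.0, 2.0], [3.0, -4.0]])

# ReLU simply zeroes out negative activations, unlike LeNet's saturating sigmoid.
print(nn.ReLU()(x))  # tensor([[0., 2.], [3., 0.]])

# MaxPool2d keeps the strongest response in each window instead of the mean.
pool = nn.MaxPool2d(kernel_size=2)
print(pool(x.reshape(1, 1, 2, 2)))  # tensor([[[[3.]]]])

# Dropout is only active in training mode; in eval mode it is the identity.
drop = nn.Dropout(p=0.5)
drop.eval()
print(torch.equal(drop(x), x))  # True
```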

Ignoring per-layer parameters, AlexNet adds three convolutional layers and one pooling layer relative to LeNet; the two architectures are compared in the figure below.

[Figure: LeNet vs. AlexNet architecture comparison]

As the figure shows, AlexNet targets RGB color images of size 224×224.

3. Implementing AlexNet

To keep training and testing manageable, we only consider a 10-class task here: the last layer has 10 neurons instead of the 1000 used in the original paper (otherwise training would take too long).

import torch
from torch import nn


class AlexNet(nn.Module):

    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.fc = nn.Sequential(
            nn.Linear(6400, 4096), nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096), nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 10)
        )

    def forward(self, x):
        x = self.conv(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x
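As a sanity check (not in the original post), we can feed a dummy 224×224 input through the same sequence of layers and print the shape after each stage; the 256×5×5 = 6400 features entering the first Linear layer confirm the in_features value above:

```python
import torch
from torch import nn

# Same architecture as above, written as one nn.Sequential so that we can
# inspect the output shape after every layer.
net = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=2),
    nn.Flatten(),
    nn.Linear(6400, 4096), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(4096, 4096), nn.ReLU(), nn.Dropout(p=0.5),
    nn.Linear(4096, 10),
)

X = torch.randn(1, 3, 224, 224)
for layer in net:
    X = layer(X)
    print(layer.__class__.__name__, 'output shape:', tuple(X.shape))
```

The final line printed should show an output shape of (1, 10), one logit per CIFAR-10 class.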

4. The CIFAR-10 Dataset

The CIFAR-10 dataset (see the official site) contains 60,000 images of 32×32 pixels in 10 classes, with 6,000 images per class: 50,000 images in the training set and 10,000 in the test set.

Note that AlexNet expects 224×224 images, so we need to resize the images (this is not really a wise approach; we do it here only so that AlexNet can be used).

Data preprocessing code:

import torchvision
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor, Resize

transformer = torchvision.transforms.Compose([Resize(224), ToTensor()])
train_data = torchvision.datasets.CIFAR10('/mnt/mydataset', train=True, transform=transformer, download=True)
test_data = torchvision.datasets.CIFAR10('/mnt/mydataset', train=False, transform=transformer, download=True)
train_loader = DataLoader(train_data, batch_size=100, shuffle=True, num_workers=4)
test_loader = DataLoader(test_data, batch_size=100, num_workers=4)

5. Training and Testing AlexNet

Set the learning rate to 0.05 and train for 50 epochs:

alexnet = AlexNet()
e = E(train_loader, test_loader, alexnet, 50, 0.05)
e.main()
e.show()

The results on an NVIDIA GeForce RTX 3080 Ti are shown below (only the 50th epoch and the overall curves are shown):

Epoch 50
--------------------------------------------------
Train Avg Loss: 0.011900, Train Accuracy: 0.995940
Test  Avg Loss: 1.381492, Test  Accuracy: 0.792600

--------------------------------------------------
4657.1 samples/sec
--------------------------------------------------

Done!


Starting around epoch 15, the test loss rises steadily and the test accuracy stops improving, while the training accuracy keeps approaching 1. This is likely because the learning rate stays constant throughout; decreasing it after epoch 15 might further improve the test accuracy of AlexNet.
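One standard way to implement such a schedule (not part of the Experiment class above) is PyTorch's StepLR scheduler, which multiplies the learning rate by a factor at fixed intervals. A minimal sketch with a toy model in place of alexnet:

```python
import torch
from torch import nn

# Toy model/optimizer; in practice this would be alexnet and Experiment's SGD.
model = nn.Linear(10, 10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)
# Multiply the learning rate by 0.1 every 15 epochs.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=15, gamma=0.1)

for epoch in range(50):
    # ... run one epoch of training here ...
    optimizer.step()   # dummy step; normally called once per batch
    scheduler.step()   # called once per epoch

print(optimizer.param_groups[0]['lr'])  # 0.05 -> 0.005 -> 5e-4 -> 5e-5
```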

Appendix: Complete Code

import torch
import torchvision
from torch import nn
from torch.utils.data import DataLoader
from torchvision.transforms import ToTensor, Resize
from Experiment import Experiment as E


class AlexNet(nn.Module):

    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 96, kernel_size=11, stride=4, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(96, 256, kernel_size=5, padding=2), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
            nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=2),
        )
        self.fc = nn.Sequential(
            nn.Linear(6400, 4096), nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 4096), nn.ReLU(),
            nn.Dropout(p=0.5),
            nn.Linear(4096, 10)
        )

    def forward(self, x):
        x = self.conv(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x


transformer = torchvision.transforms.Compose([Resize(224), ToTensor()])
train_data = torchvision.datasets.CIFAR10('/mnt/mydataset', train=True, transform=transformer, download=True)
test_data = torchvision.datasets.CIFAR10('/mnt/mydataset', train=False, transform=transformer, download=True)
train_loader = DataLoader(train_data, batch_size=100, shuffle=True, num_workers=4)
test_loader = DataLoader(test_data, batch_size=100, num_workers=4)

alexnet = AlexNet()
e = E(train_loader, test_loader, alexnet, 50, 0.05)
e.main()
e.show()
