MegEngine's open-source project CatVsDog brings you closer to your cat

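Introduction

Hello, cat lovers! This article introduces CatVsDog, an open-source MegEngine project that makes it easy to tell cat images from dog images, with a reported accuracy of up to 98%. With it, your mom will never again have to worry that you can't recognize your own cat!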

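Project overview

CatVsDog is a cat-versus-dog image recognition project built on the MegEngine framework. It is based on the ResNet network architecture and trained with deep learning. You can use it to build your own cat/dog image classifier with little effort.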
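
To make the "ResNet" part concrete: the defining idea of a residual network is the shortcut connection, where a block outputs F(x) + x instead of F(x) alone, so the input can pass through even when the learned transform contributes little. Below is a minimal pure-Python sketch of that pattern on plain lists. The names `relu`, `residual_block` and `zero_transform` are hypothetical and are not part of CatVsDog or MegEngine.

```python
from typing import Callable, List

Vector = List[float]


def relu(values: Vector) -> Vector:
    """Clamp negative entries to zero."""
    return [max(0.0, v) for v in values]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return relu(transform(x) + x), the shortcut pattern behind ResNet blocks."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("transform must preserve the vector length")
    return relu([a + b for a, b in zip(fx, x)])


def zero_transform(values: Vector) -> Vector:
    """A stand-in for a block that has learned nothing yet."""
    return [0.0 for _ in values]


# With a zero transform, the shortcut alone carries the (rectified) input forward.
print(residual_block([1.0, -2.0, 0.5], zero_transform))  # [1.0, 0.0, 0.5]
```

In a real network the transform is a pair of convolutions with batch normalization, and the shortcut may need a 1x1 convolution when the shape changes.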

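Project advantages

  • High accuracy: CatVsDog uses the ResNet network architecture and, after training, reaches a reported accuracy of up to 98%.
  • Easy to use: the project provides a detailed tutorial and example code, so even newcomers to deep learning can get started easily.
  • Open source and free: CatVsDog is an open-source project that you can use at no cost.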

How to use

  1. Clone the CatVsDog project:
git clone https://github.com/megstudio/CatVsDog.git
  2. Enter the CatVsDog project directory:
cd CatVsDog
  3. Install the project dependencies:
pip install -r requirements.txt
  4. Download and unzip the CatVsDog dataset:
wget https://github.com/megstudio/CatVsDog/releases/download/v1.0/CatVsDog.zip
unzip CatVsDog.zip
  5. Run the training script:
python train.py
  6. Run the test script:
python test.py
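
Before training, it can help to confirm that the archive unpacked into the layout the sample code expects. The sketch below assumes one sub-folder per class under `CatVsDog/train` and `CatVsDog/test` (the paths used in the sample code). That per-class layout, the list of image suffixes, and the `count_images` helper are assumptions made for illustration, not something the project documents.

```python
from pathlib import Path
from typing import Dict

# Assumed set of image file extensions; extend it if the dataset uses others.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images(split_dir: str) -> Dict[str, int]:
    """Map each class sub-folder of split_dir to the number of image files in it."""
    root = Path(split_dir)
    if not root.is_dir():
        raise FileNotFoundError(f"{split_dir} not found; was the archive unzipped here?")
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


if __name__ == "__main__":
    for split in ("CatVsDog/train", "CatVsDog/test"):
        try:
            print(split, count_images(split))
        except FileNotFoundError as err:
            print(err)
```

If a class folder shows zero images, or one class is far larger than the other, that is worth fixing before spending time on training.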

Example code

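# Note: the API calls below follow MegEngine 1.x; adjust them for other versions.
import megengine as mge
import megengine.data.transform as T
import megengine.functional as F
import megengine.module as M
import megengine.optimizer as optim
from megengine.autodiff import GradManager
from megengine.data import DataLoader, RandomSampler, SequentialSampler
from megengine.data.dataset import ImageFolder


class BasicBlock(M.Module):
    """Two 3x3 convolutions whose output is added to a shortcut of the input."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = M.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1)
        self.bn1 = M.BatchNorm2d(out_channels)
        self.conv2 = M.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
        self.bn2 = M.BatchNorm2d(out_channels)

        # When the shape changes, project the input with a 1x1 convolution
        # so that it can be added to the output of the two convolutions.
        if stride != 1 or in_channels != out_channels:
            self.shortcut = M.Sequential(
                M.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride),
                M.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = M.Identity()

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # Residual connection: add the shortcut branch, then apply ReLU
        return F.relu(out + self.shortcut(x))


class ResNet(M.Module):
    def __init__(self, num_classes=2):
        super().__init__()

        self.conv1 = M.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
        self.bn1 = M.BatchNorm2d(64)
        self.relu = M.ReLU()
        self.maxpool = M.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.res2a = BasicBlock(64, 64)
        self.res2b = BasicBlock(64, 64)

        self.res3a = BasicBlock(64, 128, stride=2)
        self.res3b = BasicBlock(128, 128)

        self.res4a = BasicBlock(128, 256, stride=2)
        self.res4b = BasicBlock(256, 256)

        self.res5a = BasicBlock(256, 512, stride=2)
        self.res5b = BasicBlock(512, 512)

        # Expects a 7x7 feature map here, i.e. 224x224 input images
        self.avgpool = M.AvgPool2d(kernel_size=7)
        self.fc = M.Linear(512, num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)

        x = self.res2a(x)
        x = self.res2b(x)

        x = self.res3a(x)
        x = self.res3b(x)

        x = self.res4a(x)
        x = self.res4b(x)

        x = self.res5a(x)
        x = self.res5b(x)

        x = self.avgpool(x)
        x = F.flatten(x, 1)
        x = self.fc(x)

        return x


if __name__ == '__main__':
    # Load the CatVsDog dataset (one sub-folder per class is assumed)
    train_dataset = ImageFolder('CatVsDog/train')
    test_dataset = ImageFolder('CatVsDog/test')

    # Preprocessing: resize to 224x224, normalize, convert HWC to CHW.
    # The mean/std values are the common ImageNet statistics in BGR order;
    # this is an assumption, the project may use different preprocessing.
    transform = T.Compose([
        T.Resize((224, 224)),
        T.Normalize(mean=[103.530, 116.280, 123.675], std=[57.375, 57.120, 58.395]),
        T.ToMode('CHW'),
    ])

    # Create the data loaders; batching and shuffling are handled by the samplers
    train_loader = DataLoader(
        train_dataset,
        sampler=RandomSampler(train_dataset, batch_size=32),
        transform=transform,
    )
    test_loader = DataLoader(
        test_dataset,
        sampler=SequentialSampler(test_dataset, batch_size=32),
        transform=transform,
    )

    # Create the ResNet model
    model = ResNet()

    # Define the gradient manager and the optimizer
    gm = GradManager().attach(model.parameters())
    optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    # Train the model
    for epoch in range(10):
        model.train()
        for i, (images, labels) in enumerate(train_loader):
            images = mge.tensor(images, dtype='float32')
            labels = mge.tensor(labels, dtype='int32')

            with gm:
                # Forward pass
                outputs = model(images)

                # Compute the loss
                loss = F.nn.cross_entropy(outputs, labels)

                # Backward pass
                gm.backward(loss)

            # Update the weights, then reset the gradients
            optimizer.step()
            optimizer.clear_grad()

            # Print training progress
            if i % 100 == 0:
                print(f'Epoch: {epoch+1}, Batch: {i+1}, Loss: {loss.item()}')

    # Evaluate the model (no GradManager is active, so no gradients are recorded)
    model.eval()
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = model(mge.tensor(images, dtype='float32'))
        predicted = F.argmax(outputs, axis=1).numpy()
        total += labels.shape[0]
        correct += int((predicted == labels).sum())

    accuracy = correct / total
    print(f'Accuracy: {accuracy*100:.2f}%')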
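
One detail worth checking in the model above is the input resolution. The final `AvgPool2d(kernel_size=7)` feeds a `Linear(512, num_classes)` layer, so the pooled feature map has to be 1x1, and 224x224 inputs give exactly the 7x7 map that pooling expects. The sketch below traces one spatial axis through the stages using the standard output-size formula floor((n + 2p - k) / s) + 1. The helper names `conv_out` and `trace_spatial_size` are hypothetical, and the pooling stride is assumed to equal the kernel size.

```python
from typing import Dict


def conv_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a convolution or pooling layer along one spatial axis."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_spatial_size(size: int) -> Dict[str, int]:
    """Follow one spatial axis through the stages of the sample ResNet."""
    trace = {"input": size}
    size = conv_out(size, kernel=7, stride=2, padding=3)  # conv1
    trace["conv1"] = size
    size = conv_out(size, kernel=3, stride=2, padding=1)  # maxpool
    trace["maxpool"] = size
    trace["res2"] = size  # stride 1 with 3x3 kernels and padding 1 keeps the size
    for stage in ("res3", "res4", "res5"):
        size = conv_out(size, kernel=3, stride=2, padding=1)  # first block of the stage
        trace[stage] = size
    # Average pooling with kernel 7; stride assumed equal to the kernel size
    trace["avgpool"] = conv_out(size, kernel=7, stride=7, padding=0)
    return trace


print(trace_spatial_size(224))
# {'input': 224, 'conv1': 112, 'maxpool': 56, 'res2': 56, 'res3': 28, 'res4': 14, 'res5': 7, 'avgpool': 1}
```

With 448x448 inputs the same trace ends in a 2x2 map, which flattens to 2048 features and no longer matches the 512-input linear layer, so images should be resized before they reach the model.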

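Conclusion

MegEngine's open-source project CatVsDog is a capable, easy-to-use, and free cat/dog image recognition project. It can help you build your own cat/dog image classifier with little effort and bring you closer to your cat!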