CatVsDog, an Open-Source MegEngine Project That Brings You Closer to Your Cat
Artificial Intelligence
2023-11-14 16:57:07
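Introduction
Hello, cat lovers! This article introduces CatVsDog, an open-source project built on MegEngine that makes it easy to tell cat images from dog images, with a reported accuracy of up to 98%. With it, Mom no longer has to worry that you can't recognize your own cat!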
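Project Overview
CatVsDog is a cat-vs-dog image recognition project built with the MegEngine framework. It is based on the ResNet architecture and trained with deep learning. You can use it to build your own cat/dog image classifier.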
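A two-class classifier like this typically produces one raw score (logit) per class, and softmax turns those scores into probabilities. The sketch below is framework-free and purely illustrative: the class order `("cat", "dog")` and the helper names are assumptions, since the real order depends on how the dataset assigns label indices.

```python
import math

CLASS_NAMES = ("cat", "dog")  # assumed order, for illustration only


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(logits):
    """Return the most likely class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=lambda k: probs[k])
    return CLASS_NAMES[best], probs[best]


label, prob = predict([2.0, 0.0])
print(label, round(prob, 3))  # prints: cat 0.881
```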
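Project Advantages
- High accuracy: CatVsDog uses the ResNet architecture and, after training, reaches a reported accuracy of up to 98%.
- Easy to use: The project provides a detailed tutorial and example code, so you can get started even if you are new to deep learning.
- Free and open source: CatVsDog is an open-source project that you can use free of charge.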
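The accuracy figure quoted above is the fraction of test images whose predicted class matches the true label. As a minimal, framework-free sketch (the function name `accuracy` and the toy scores are illustrative assumptions, not taken from the project), it can be computed like this:

```python
def accuracy(logits, labels):
    """Return the fraction of rows whose highest-scoring class equals the label."""
    if len(logits) != len(labels):
        raise ValueError("logits and labels must have the same length")
    if not labels:
        raise ValueError("cannot compute accuracy of an empty batch")
    correct = 0
    for scores, label in zip(logits, labels):
        # argmax: index of the largest score in this row
        predicted = max(range(len(scores)), key=lambda k: scores[k])
        if predicted == label:
            correct += 1
    return correct / len(labels)


# Toy example: 4 images, 2 classes (indices 0 and 1 are hypothetical class ids).
toy_logits = [[2.0, 0.1], [0.3, 1.5], [0.9, 0.2], [0.4, 0.6]]
toy_labels = [0, 1, 1, 1]
print(f"Accuracy: {accuracy(toy_logits, toy_labels) * 100:.2f}%")  # Accuracy: 75.00%
```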
How to Use
- Clone the CatVsDog project:
git clone https://github.com/megstudio/CatVsDog.git
- Enter the CatVsDog project directory:
cd CatVsDog
- Install the project dependencies:
pip install -r requirements.txt
- Download and unzip the CatVsDog dataset:
wget https://github.com/megstudio/CatVsDog/releases/download/v1.0/CatVsDog.zip
unzip CatVsDog.zip
- Run the training script:
python train.py
- Run the test script:
python test.py
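Before training, it can help to confirm that the archive unpacked as expected. The sketch below assumes one subfolder per class under `CatVsDog/train` and `CatVsDog/test` (inferred from the paths used in the example code, not confirmed by the project); the helper name `count_images_per_class` and the list of file suffixes are hypothetical.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def count_images_per_class(root):
    """Map each class subfolder of `root` to the number of image files in it."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"dataset folder not found: {root}")
    counts = {}
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[class_dir.name] = sum(
            1
            for f in class_dir.rglob("*")
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


if __name__ == "__main__":
    for split in ("CatVsDog/train", "CatVsDog/test"):
        try:
            print(split, count_images_per_class(split))
        except FileNotFoundError as err:
            print(err)
```

A heavily unbalanced or empty class folder is worth fixing before training, since it would make the accuracy number misleading.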
Example Code
import megengine as mge
import megengine.module as M
import megengine.functional as F


class BasicBlock(M.Module):
    """Residual block: two 3x3 convolutions whose output is added to a shortcut."""

    def __init__(self, in_channels, out_channels, stride=1):
        super().__init__()
        self.conv1 = M.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1)
        self.bn1 = M.BatchNorm2d(out_channels)
        self.conv2 = M.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1)
        self.bn2 = M.BatchNorm2d(out_channels)
        # Project the input with a 1x1 convolution only when its shape changes.
        if stride != 1 or in_channels != out_channels:
            self.shortcut = M.Sequential(
                M.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride),
                M.BatchNorm2d(out_channels),
            )
        else:
            self.shortcut = M.Identity()

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        # The shortcut is added to the main path, not chained after it.
        return F.relu(out + self.shortcut(x))


class ResNet(M.Module):
    """ResNet-18-style network with a two-class output by default."""

    def __init__(self, num_classes=2):
        super().__init__()
        # Stem: 7x7 convolution followed by max pooling.
        self.conv1 = M.Conv2d(3, 64, kernel_size=7, stride=2, padding=3)
        self.bn1 = M.BatchNorm2d(64)
        self.relu = M.ReLU()
        self.maxpool = M.MaxPool2d(kernel_size=3, stride=2, padding=1)
        # Four stages of two residual blocks each.
        self.res2a = self._make_res_block(64, 64)
        self.res2b = self._make_res_block(64, 64)
        self.res3a = self._make_res_block(64, 128, stride=2)
        self.res3b = self._make_res_block(128, 128)
        self.res4a = self._make_res_block(128, 256, stride=2)
        self.res4b = self._make_res_block(256, 256)
        self.res5a = self._make_res_block(256, 512, stride=2)
        self.res5b = self._make_res_block(512, 512)
        self.avgpool = M.AvgPool2d(kernel_size=7)
        self.fc = M.Linear(512, num_classes)

    def _make_res_block(self, in_channels, out_channels, stride=1):
        return BasicBlock(in_channels, out_channels, stride=stride)

    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.res2a(x)
        x = self.res2b(x)
        x = self.res3a(x)
        x = self.res3b(x)
        x = self.res4a(x)
        x = self.res4b(x)
        x = self.res5a(x)
        x = self.res5b(x)
        x = self.avgpool(x)
        # Flatten (N, 512, 1, 1) to (N, 512) before the classifier.
        x = F.flatten(x, 1)
        x = self.fc(x)
        return x
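if __name__ == '__main__':
    # Extra imports used only by the training script.
    import megengine.data.transform as T
    from megengine.autodiff import GradManager
    from megengine.data import DataLoader, RandomSampler, SequentialSampler
    from megengine.data.dataset import ImageFolder

    # Load the CatVsDog dataset (one subfolder per class is assumed).
    train_dataset = ImageFolder('CatVsDog/train')
    test_dataset = ImageFolder('CatVsDog/test')

    # Resize and crop to 224x224, normalize, and convert HWC to CHW.
    # The mean/std values are ImageNet statistics in BGR order (images are read with OpenCV).
    transform = T.Compose([
        T.Resize(256),
        T.CenterCrop(224),
        T.Normalize(mean=[103.530, 116.280, 123.675], std=[57.375, 57.120, 58.395]),
        T.ToMode("CHW"),
    ])

    # Create the data loaders; in MegEngine, batching and shuffling are handled by samplers.
    train_loader = DataLoader(
        train_dataset, sampler=RandomSampler(train_dataset, batch_size=32), transform=transform)
    test_loader = DataLoader(
        test_dataset, sampler=SequentialSampler(test_dataset, batch_size=32), transform=transform)

    # Create the ResNet model.
    model = ResNet()

    # Define the optimizer and the gradient manager that records the forward pass.
    optimizer = mge.optimizer.SGD(model.parameters(), lr=0.01, momentum=0.9)
    gm = GradManager().attach(model.parameters())

    # Train the model.
    for epoch in range(10):
        model.train()
        for i, (images, labels) in enumerate(train_loader):
            images = mge.tensor(images, dtype="float32")
            labels = mge.tensor(labels, dtype="int32")
            with gm:
                # Forward pass and loss computation.
                outputs = model(images)
                loss = F.nn.cross_entropy(outputs, labels)
                # Backward pass.
                gm.backward(loss)
            # Update the weights, then clear the gradients for the next step.
            optimizer.step()
            optimizer.clear_grad()
            # Print training progress.
            if i % 100 == 0:
                print(f'Epoch: {epoch+1}, Batch: {i+1}, Loss: {loss.item()}')

    # Evaluate the model; no gradients are recorded outside a GradManager context.
    model.eval()
    correct = 0
    total = 0
    for images, labels in test_loader:
        outputs = model(mge.tensor(images, dtype="float32"))
        predicted = F.argmax(outputs, axis=1).numpy()
        total += labels.shape[0]
        correct += int((predicted == labels).sum())
    accuracy = correct / total
    print(f'Accuracy: {accuracy*100:.2f}%')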
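The layer sizes in the model above only line up for certain input resolutions. As a rough check, assuming 224×224 input images (the original does not state the input size), the standard output-size formula floor((n + 2p - k) / s) + 1 can be applied layer by layer. `conv_out` and `trace_resnet_shapes` are illustrative helpers, not MegEngine functions.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def trace_resnet_shapes(size=224):
    """Follow the spatial size through the stem, four stages, and average pool."""
    shapes = {}
    size = conv_out(size, kernel=7, stride=2, padding=3)  # conv1
    shapes["conv1"] = size
    size = conv_out(size, kernel=3, stride=2, padding=1)  # maxpool
    shapes["maxpool"] = size
    for name, stride in (("res2", 1), ("res3", 2), ("res4", 2), ("res5", 2)):
        # The first 3x3 convolution of each stage carries the stride;
        # the remaining 3x3 convolutions (stride 1, padding 1) keep the size.
        size = conv_out(size, kernel=3, stride=stride, padding=1)
        shapes[name] = size
    size = conv_out(size, kernel=7, stride=7)  # avgpool with kernel_size=7
    shapes["avgpool"] = size
    return shapes


print(trace_resnet_shapes(224))
# {'conv1': 112, 'maxpool': 56, 'res2': 56, 'res3': 28, 'res4': 14, 'res5': 7, 'avgpool': 1}
```

With a 224×224 input the final feature map is 1×1 with 512 channels, which matches `M.Linear(512, num_classes)`. A much larger input such as 448×448 would leave a 2×2 map (2048 values) and no longer fit the 512-input linear layer.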
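Conclusion
CatVsDog, the open-source MegEngine project, is a capable, easy-to-use, and free cat-vs-dog image recognition project. It helps you build your own cat/dog image classifier with little effort and brings you closer to your cat!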