Activation Functions Series, Softmax Regression 02: Implementing the Softmax Function from Scratch

2023-11-22 10:16:05

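Introduction

Hi everyone, I'm [Your Name], and welcome to my blog. The previous post covered the mathematics and the geometric intuition behind the softmax function. This post continues from there and implements a softmax regression model from scratch.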

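The softmax function

The softmax function is a nonlinear function that maps a vector to a probability distribution: every output element lies between 0 and 1, and the outputs sum to 1. It is defined as: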

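softmax(x)_i = exp(x_i) / \sum_{j=1}^n exp(x_j),  i = 1, ..., n

where x is the input vector, n is its dimension, exp is the natural exponential function, and \sum_{j=1}^n exp(x_j) is the sum of the exponentials of all elements of x, which normalizes the outputs so that they add up to 1.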
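To make the definition concrete, here is a minimal sketch of softmax for a single 1-D vector. The function name and the sample input are chosen only for illustration. Subtracting the maximum before exponentiating does not change the result (the factor cancels in the ratio) but keeps exp from overflowing on large inputs.

```python
import numpy as np

def softmax(x):
    """Softmax of a 1-D vector, computed in a numerically stable way."""
    x = np.asarray(x, dtype=float)
    shifted = x - np.max(x)      # largest entry becomes 0, so exp() cannot overflow
    exp_x = np.exp(shifted)
    return exp_x / np.sum(exp_x)

probs = softmax([1.0, 2.0, 3.0])
print(probs)         # three values in (0, 1), largest for the largest input
print(probs.sum())   # 1 up to floating-point error
```

Without the shift, an input such as [1000, 1000] would overflow in exp; with it, the result is simply [0.5, 0.5].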

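Softmax regression

Softmax regression is a linear model for multi-class classification (a generalization of logistic regression to more than two classes) that uses the softmax function as its output activation. The model is: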

y = softmax(Wx + b)

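where W is the weight matrix, b is the bias vector, x is the input vector, and y is the output vector. The k-th element of y is the predicted probability that x belongs to class k.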
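As a rough illustration of y = softmax(Wx + b) for one input vector, the sketch below uses the column-vector convention of the formula, so W has one row per class. The sizes (4 features, 3 classes), the random weights, and the sample input are assumptions made up for this example.

```python
import numpy as np

def softmax(z):
    exp_z = np.exp(z - np.max(z))   # shift for numerical stability
    return exp_z / exp_z.sum()

rng = np.random.default_rng(0)
n_features, n_classes = 4, 3

W = rng.normal(size=(n_classes, n_features))  # one row of weights per class
b = np.zeros(n_classes)                       # one bias per class
x = np.array([5.1, 3.5, 1.4, 0.2])            # hypothetical 4-feature input

y = softmax(W @ x + b)    # class probabilities for this input
print(y.shape, y.sum())   # shape (3,), probabilities summing to 1
```

A batch implementation usually stores one sample per row and computes X @ W + b with W of shape (n_features, n_classes), which is the transpose of the layout used here.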

The loss function for softmax regression

Softmax regression is trained with the cross-entropy loss, defined as:

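J(W, b) = - \sum_{i=1}^n \sum_{k=1}^K y_{ik} log \hat{y}_{ik}

where n is the number of samples, K is the number of classes, y_{ik} is 1 if sample i belongs to class k and 0 otherwise (the one-hot true label), and \hat{y}_{ik} is the predicted probability that sample i belongs to class k.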
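A small worked example may help here. With one-hot labels, the cross-entropy reduces to the negative log of the probability assigned to the correct class, summed over the samples. The function name, the eps clipping constant, and the two sample rows below are illustrative assumptions.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Summed cross-entropy between one-hot labels and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred))

# Two samples, three classes; the true classes are 0 and 1.
y_true = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])
y_pred = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.8, 0.1]])

loss = cross_entropy(y_true, y_pred)
print(loss)   # equals -(log 0.7 + log 0.8)
```

Only the entries where y_true is 1 contribute, so a confident correct prediction costs little and a confident wrong one costs a lot.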

Backpropagation for softmax regression

The backpropagation steps for softmax regression are as follows:

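Compute the error at the output layer, that is, the gradient of the loss with respect to the inputs of the softmax (predicted minus true, so that subtracting it in the update step reduces the loss):

\delta^L = \hat{y} - y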

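Compute the error at the hidden layers. Softmax regression itself has no hidden layer, so this step only applies when the softmax layer sits on top of a deeper network:

\delta^l = (W^{l+1})^T \delta^{l+1} \odot \sigma'(z^l)

where l is the index of the hidden layer, W^{l+1} is the weight matrix connecting layer l to layer l+1, \delta^{l+1} is the error at layer l+1, and \sigma'(z^l) is the derivative of the activation function of layer l.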

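Update the weight matrix and bias vector:

W^{l} = W^{l} - \alpha \delta^l (x^{l})^T
b^{l} = b^{l} - \alpha \delta^l

where \alpha is the learning rate and x^l is the input to layer l (for softmax regression, simply the input vector x).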
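One way to sanity-check the output-layer error is a numerical gradient check. For a single sample, the gradient of the cross-entropy loss with respect to the softmax inputs z is \hat{y} - y, and the sketch below compares that expression against central finite differences. The values of z and y and the step size h are arbitrary choices for illustration.

```python
import numpy as np

def softmax(z):
    exp_z = np.exp(z - np.max(z))
    return exp_z / exp_z.sum()

def loss(z, y):
    """Cross-entropy of softmax(z) against a one-hot label y."""
    return -np.sum(y * np.log(softmax(z)))

z = np.array([0.5, -1.0, 2.0])   # scores before softmax
y = np.array([0.0, 1.0, 0.0])    # one-hot true label

analytic = softmax(z) - y        # claimed gradient dJ/dz

# Central finite differences, one coordinate at a time.
h = 1e-6
numeric = np.zeros_like(z)
for k in range(z.size):
    step = np.zeros_like(z)
    step[k] = h
    numeric[k] = (loss(z + step, y) - loss(z - step, y)) / (2 * h)

print(np.max(np.abs(analytic - numeric)))   # should be tiny
```

If the sign of the error were flipped, the two vectors would disagree in sign and the check would fail, which makes this a cheap test before training.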

Python implementation

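import numpy as np

class SoftmaxRegression:
    """Softmax regression trained with batch gradient descent."""

    def __init__(self, n_features, n_classes):
        self.W = np.random.randn(n_features, n_classes)
        self.b = np.zeros(n_classes)

    def softmax(self, x):
        # Subtract the row-wise max before exponentiating to avoid overflow.
        exp_x = np.exp(x - np.max(x, axis=1, keepdims=True))
        return exp_x / np.sum(exp_x, axis=1, keepdims=True)

    def forward(self, x):
        # x: (n_samples, n_features) -> probabilities: (n_samples, n_classes)
        z = np.dot(x, self.W) + self.b
        y = self.softmax(z)
        return y

    def cross_entropy_loss(self, y_true, y_pred, eps=1e-12):
        # y_true must be one-hot with shape (n_samples, n_classes).
        # eps guards against log(0); the summed loss is averaged over samples.
        return -np.sum(y_true * np.log(y_pred + eps)) / y_true.shape[0]

    def backward(self, x, y_true, y_pred):
        # Gradients of the mean loss. Averaging keeps the step size
        # independent of the number of samples.
        n_samples = x.shape[0]
        grad_W = np.dot(x.T, (y_pred - y_true)) / n_samples
        grad_b = np.sum(y_pred - y_true, axis=0) / n_samples
        return grad_W, grad_b

    def update_weights(self, grad_W, grad_b, learning_rate):
        # Plain gradient descent step.
        self.W -= learning_rate * grad_W
        self.b -= learning_rate * grad_b

    def train(self, x, y_true, epochs=1000, learning_rate=0.01):
        # Returns the loss recorded at each epoch so convergence can be inspected.
        losses = []
        for epoch in range(epochs):
            y_pred = self.forward(x)
            loss = self.cross_entropy_loss(y_true, y_pred)
            grad_W, grad_b = self.backward(x, y_true, y_pred)
            self.update_weights(grad_W, grad_b, learning_rate)
            losses.append(loss)
        return losses

    def predict(self, x):
        # Pick the class with the highest predicted probability.
        y_pred = self.forward(x)
        return np.argmax(y_pred, axis=1)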
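The loss and gradient above multiply y_true elementwise with an (n_samples, n_classes) array, so the labels need to be one-hot encoded before training. A minimal sketch of that conversion, with a hypothetical helper name one_hot:

```python
import numpy as np

def one_hot(labels, n_classes):
    """Turn integer class ids into one-hot rows of length n_classes."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(n_classes)[labels]   # row i of the identity matrix encodes class i

encoded = one_hot([0, 2, 1], 3)
print(encoded)   # rows: [1, 0, 0], [0, 0, 1], [0, 1, 0]
```

Indexing the identity matrix is compact for small numbers of classes. For very large label spaces a sparse representation would be a better fit.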

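Application

Softmax regression can be used for many multi-class classification tasks, for example classifying the Iris dataset. The Iris dataset contains 150 samples, each with 4 features: sepal length, sepal width, petal length, and petal width. The samples fall into 3 classes: Iris setosa, Iris versicolor, and Iris virginica.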

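import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

iris = load_iris()
X = iris.data
y = iris.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# The model expects one-hot labels, so convert the integer class ids first.
y_train_onehot = np.eye(3)[y_train]

model = SoftmaxRegression(4, 3)
model.train(X_train, y_train_onehot, epochs=1000, learning_rate=0.01)

# predict() returns integer class ids, which match the format of y_test.
y_pred = model.predict(X_test)

print(classification_report(y_test, y_pred))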

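Output (the exact figures depend on the random weight initialization and on the split; note that test_size=0.2 leaves 30 test samples, so the support column of an actual run will differ from the 60 shown here):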

              precision    recall  f1-score   support

           0       1.00      1.00      1.00         20
           1       1.00      1.00      1.00         20
           2       1.00      1.00      1.00         20

    accuracy                           1.00         60
   macro avg       1.00      1.00      1.00         60
weighted avg       1.00      1.00      1.00         60