Multinomial Logistic Regression Worksheet

A teacher wants to predict a student's letter grade (A, B, or C) on a test based on two features: hours studied ($x_1$) and number of absences ($x_2$). The classes are: A (class 1), B (class 2), C (class 3).

Training data:

| Student | Hours ($x_1$) | Absences ($x_2$) | Grade |
|---|---|---|---|
| Alice | 4 | 1 | A |
| Bob | 2 | 1 | B |
| Carol | 1 | 2 | C |
| Dan | 3 | 0 | A |

A multinomial logistic regression model has the following (partially trained) weight matrix $W$. Each column holds the weight vector $\mathbf{w}_k$ for one class, and the rows correspond, in order, to the bias, $x_1$, and $x_2$:

| | A (class 1) | B (class 2) | C (class 3) |
|---|---|---|---|
| bias | $-1$ | $2$ | $1$ |
| $x_1$ | $1$ | $0$ | $-1$ |
| $x_2$ | $0$ | $-2$ | $1$ |

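Key formulas:

Score for class $k$, where $\mathbf{w}_k$ is column $k$ of $W$ and $\mathbf{x}$ includes the bias term: $z_k = \mathbf{w}_k \cdot \mathbf{x}$

Softmax: $\hat{y}_k = \dfrac{e^{z_k}}{\sum_{j=1}^{3} e^{z_j}}$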

Part (a): One-Hot Encoding

Write the one-hot encoded target vector $\mathbf{y}$ for each training example.

| Student | One-hot vector $\mathbf{y}$ |
|---|---|
| Alice (A) | |
| Bob (B) | |
| Carol (C) | |
| Dan (A) | |
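
If you want to check your table afterwards, one-hot encoding can be sketched in a few lines of Python. This is a minimal illustration, not part of the original worksheet; the names `CLASSES` and `one_hot` are made up for this example, and the class order A, B, C is taken from the problem statement.

```python
# Class order follows the worksheet: A is class 1, B is class 2, C is class 3.
CLASSES = ["A", "B", "C"]

def one_hot(label, classes=CLASSES):
    """Return a list with a 1 at the label's position and 0 everywhere else."""
    if label not in classes:
        raise ValueError(f"unknown class label: {label!r}")
    return [1 if c == label else 0 for c in classes]

# Every one-hot vector has exactly one entry equal to 1.
for grade in CLASSES:
    assert sum(one_hot(grade)) == 1
```

Calling `one_hot` with a student's grade returns the vector to compare against your answer.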

Part (b): Feature Vector

Write the feature vector $\mathbf{x}$ for Bob, including the bias term. (Recall that the bias term is a 1 prepended to the feature values.)
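
The same construction can be sketched in Python. The example below uses Alice's row from the training data rather than Bob's, so the exercise stays open; `with_bias` is a hypothetical helper name chosen for this illustration.

```python
def with_bias(features):
    """Prepend the constant 1 that multiplies the bias weight."""
    return [1] + list(features)

# Alice: 4 hours studied, 1 absence (from the training table).
x_alice = with_bias([4, 1])
print(x_alice)  # [1, 4, 1]
```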

Part (c): Compute Dot Products

Using Bob's feature vector from part (b) and the weight matrix $W$, compute $z_k = \mathbf{w}_k \cdot \mathbf{x}$ for each class $k$ = A, B, C.

$z_A = \mathbf{w}_1 \cdot \mathbf{x}$ =

$z_B = \mathbf{w}_2 \cdot \mathbf{x}$ =

$z_C = \mathbf{w}_3 \cdot \mathbf{x}$ =
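
As a worked illustration of the same computation, the sketch below evaluates the three scores for Alice (not Bob). It assumes $W$ is stored as a list of rows in the order bias, $x_1$, $x_2$, so class $k$'s weight vector is column $k$; the function name `scores` is an assumption made for this example.

```python
# Weight matrix W from the worksheet: rows are bias, x_1, x_2;
# columns are classes A, B, C.
W = [
    [-1,  2,  1],  # bias
    [ 1,  0, -1],  # x_1 (hours studied)
    [ 0, -2,  1],  # x_2 (absences)
]

def scores(W, x):
    """Return z_k = w_k . x for every class k, where w_k is column k of W."""
    n_classes = len(W[0])
    return [sum(W[i][k] * x[i] for i in range(len(x))) for k in range(n_classes)]

x_alice = [1, 4, 1]  # bias term, hours, absences
z_alice = scores(W, x_alice)
print(z_alice)  # [3, 0, -2]
```

By hand, Alice's score for class A is $(-1)(1) + (1)(4) + (0)(1) = 3$, which matches the first entry.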

Part (d): Apply Softmax

Apply the softmax function to $\mathbf{z} = [z_A, z_B, z_C]$ to obtain the predicted probability vector $\hat{\mathbf{y}}$. Use the approximation $e \approx 3$.
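
To see how the approximation $e \approx 3$ plays out, here is a small Python sketch applied to Alice's scores $[3, 0, -2]$, which follow from her feature vector $[1, 4, 1]$ and $W$. The `base` parameter is an assumption added for this illustration so the same function can reproduce the hand calculation exactly with fractions; it is not a standard softmax argument.

```python
import math
from fractions import Fraction

def softmax(z, base=math.e):
    """Normalize base**z_k so the outputs are positive and sum to 1."""
    exps = [base ** z_k for z_k in z]
    total = sum(exps)
    return [e_k / total for e_k in exps]

z_alice = [3, 0, -2]

# Fraction(3) mirrors the hand calculation with e ~ 3: 27, 1, 1/9.
approx = softmax(z_alice, base=Fraction(3))
print(approx)  # [Fraction(243, 253), Fraction(9, 253), Fraction(1, 253)]

# The default base uses the true value of e, for comparison.
exact = softmax(z_alice)
```

The approximate and exact probabilities differ in value, but the largest entry is the same class in both cases for this example.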

Part (e): Prediction

Based on your answer to part (d), what class does the model predict for Bob? Is this prediction correct?
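
The prediction rule (pick the class with the largest probability, then compare with the true label) can be sketched as follows. The example again uses Alice so that Bob's answer is left to you; `predict` and `CLASSES` are hypothetical names for this illustration, and ties are broken in favor of the earlier class, which is a choice of this sketch rather than anything the worksheet specifies.

```python
CLASSES = ["A", "B", "C"]

def predict(probs, classes=CLASSES):
    """Return the label with the highest probability (first one wins on ties)."""
    best = max(range(len(probs)), key=lambda k: probs[k])
    return classes[best]

# Alice's probabilities under e ~ 3, from her scores z = [3, 0, -2].
probs_alice = [243 / 253, 9 / 253, 1 / 253]
predicted = predict(probs_alice)
is_correct = predicted == "A"  # Alice's true grade in the training data
print(predicted, is_correct)  # A True
```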