2. [10 points] Assume we have \( K \) different classes in a multi-class Softmax Regression model. The posterior probability is
\[ \hat{p}_k = \sigma(s(x))_k = \frac{\exp\left(s_k(x)\right)}{\sum_{j=1}^{K} \exp\left(s_j(x)\right)} \quad \text{for } k = 1, 2, \ldots, K, \]
where \( s_k(x) = \theta_k^T x \), the input \( x \) is an \( n \)-dimensional vector, and \( K \) is the total number of classes.

1) To learn this Softmax Regression model, how many parameters do we need to estimate? What are these parameters?

2) Consider the cross-entropy cost function \( J(\Theta) \) over \( m \) training samples \( \left\{\left(x_i, y_i\right)\right\}_{i=1,2,\ldots,m} \) below. Derive the gradient of \( J(\Theta) \) with respect to \( \theta_k \).
\[ J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} y_k^{(i)} \log\left(\hat{p}_k^{(i)}\right) \]
where \( y_k^{(i)} = 1 \) if the \( i^{\text{th}} \) instance belongs to class \( k \), and 0 otherwise.



Expert Answer


ANSWER:

1) The input \( x \) is an \( n \)-dimensional vector, and each class \( k \) has its own parameter vector \( \theta_k \in \mathbb{R}^n \) (there is no separate bias term, since \( s_k(x) = \theta_k^T x \)). With \( K \) classes we therefore need to estimate \( K \times n \) parameters in total: the vectors \( \theta_1, \theta_2, \ldots, \theta_K \), usually stacked into a parameter matrix \( \Theta \) of shape \( K \times n \).

2) Write \( s_j^{(i)} = \theta_j^T x^{(i)} \), so that \( \hat{p}_k^{(i)} = \exp(s_k^{(i)}) / \sum_{j=1}^{K} \exp(s_j^{(i)}) \). Two partial derivatives are needed. From \( \log \hat{p}_c^{(i)} = s_c^{(i)} - \log \sum_{j} \exp(s_j^{(i)}) \),
\[ \frac{\partial \log \hat{p}_c^{(i)}}{\partial s_k^{(i)}} = \mathbb{1}\{c = k\} - \hat{p}_k^{(i)}, \qquad \frac{\partial s_j^{(i)}}{\partial \theta_k} = \mathbb{1}\{j = k\}\, x^{(i)}. \]
Applying the chain rule to the cost function:
\[ \nabla_{\theta_k} J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{c=1}^{K} y_c^{(i)} \left( \mathbb{1}\{c = k\} - \hat{p}_k^{(i)} \right) x^{(i)} = -\frac{1}{m} \sum_{i=1}^{m} \left( y_k^{(i)} - \hat{p}_k^{(i)} \sum_{c=1}^{K} y_c^{(i)} \right) x^{(i)}. \]
Since the labels are one-hot, \( \sum_{c=1}^{K} y_c^{(i)} = 1 \), and the gradient simplifies to
\[ \nabla_{\theta_k} J(\Theta) = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{p}_k^{(i)} - y_k^{(i)} \right) x^{(i)}. \]
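As a sanity check (not part of the graded answer), here is a minimal NumPy sketch that evaluates \( J(\Theta) \) and the derived gradient \( \frac{1}{m}\sum_i (\hat{p}_k^{(i)} - y_k^{(i)})\, x^{(i)} \), then compares it against a central finite-difference estimate. The toy shapes, random data, and helper names (softmax_probs, cost, analytic_grad) are illustrative assumptions, not part of the problem statement.

```python
# Minimal sketch: verify the derived softmax-regression gradient numerically.
import numpy as np

rng = np.random.default_rng(0)
K, n, m = 3, 4, 5                      # classes, input dimension, samples
Theta = rng.normal(size=(K, n))        # parameter matrix: row k is theta_k
X = rng.normal(size=(m, n))            # m training inputs x_i
Y = np.eye(K)[rng.integers(0, K, m)]   # one-hot labels y_i, shape (m, K)

def softmax_probs(Theta, X):
    """p_hat[i, k] = exp(s_k(x_i)) / sum_j exp(s_j(x_i)), with s_k(x) = theta_k . x."""
    S = X @ Theta.T                            # scores s_k(x_i), shape (m, K)
    S -= S.max(axis=1, keepdims=True)          # shift for numerical stability
    E = np.exp(S)
    return E / E.sum(axis=1, keepdims=True)

def cost(Theta, X, Y):
    """Cross-entropy J(Theta) = -(1/m) sum_i sum_k y_k_i * log(p_hat_k_i)."""
    P = softmax_probs(Theta, X)
    return -np.mean(np.sum(Y * np.log(P), axis=1))

def analytic_grad(Theta, X, Y):
    """Derived result: gradient wrt theta_k is (1/m) sum_i (p_hat_k_i - y_k_i) x_i."""
    P = softmax_probs(Theta, X)
    return (P - Y).T @ X / X.shape[0]          # shape (K, n); row k = grad wrt theta_k

# Central finite-difference check of every entry of the K x n gradient matrix.
eps, num = 1e-6, np.zeros_like(Theta)
for idx in np.ndindex(Theta.shape):
    Tp, Tm = Theta.copy(), Theta.copy()
    Tp[idx] += eps
    Tm[idx] -= eps
    num[idx] = (cost(Tp, X, Y) - cost(Tm, X, Y)) / (2 * eps)

print(np.max(np.abs(num - analytic_grad(Theta, X, Y))))  # ~1e-9: formulas agree
```

The maximum absolute discrepancy is on the order of floating-point noise, which is consistent with the closed-form gradient derived above.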