In softmax_Anfany.py, line 47,
is the code l2norm = np.sum(0.5 * np.dot(self.weights.T, self.weights) / len(x))
missing the lambda coefficient for the L2 norm?
Since the default lambda is 0.002,
should it be l2norm = np.sum(0.5 * 0.002 * np.dot(self.weights.T, self.weights) / len(x))?
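
For reference, here is a minimal sketch of the change I have in mind. The weights and inputs are made-up stand-ins for self.weights and x, and the variable name lam is hypothetical (I have not checked what the lambda attribute is called in the file):

```python
import numpy as np

# Hypothetical stand-ins for self.weights and x; shapes are made up for illustration.
rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 3))
x = rng.normal(size=(10, 4))
lam = 0.002  # the default lambda mentioned above; the name `lam` is an assumption

# Term as currently written on line 47 (no lambda factor).
l2norm_current = np.sum(0.5 * np.dot(weights.T, weights) / len(x))

# Proposed term, scaled by lambda.
l2norm_proposed = np.sum(0.5 * lam * np.dot(weights.T, weights) / len(x))

print(l2norm_current, l2norm_proposed)
```

If this is right, the proposed penalty is exactly lam times the current one, so with the default of 0.002 the term in the loss would be 500 times smaller than what the code computes now.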