
Cross entropy loss in PyTorch






Cross Entropy is a loss function often used in classification problems.

A couple of weeks ago, I made a pretty big decision. It was late at night, and I was lying in my bed thinking about how I spent my day. Because I have always been one to analyze my choices, I asked myself two really important questions. Should I stop eating fries before bed? They are pretty unhealthy… Am I spending my time the way I want to spend my time? After giving these ideas some thought, I realized the answer to both of these questions was No. No, I shouldn't stop eating fries before bed. No, I wasn't spending my time the way I wanted. So a couple of weeks ago I made the choice to learn everything I possibly could about the growing field of deep learning. I also made the choice to write about everything I learn (so if you are interested in the journey, make sure to follow me). These articles are inspired by a course by Udacity called Deep Learning with PyTorch.

If you are here from my last articles, you already know that we are trying to find ways to best classify data. To best classify data we want to maximize the probability of a model and minimize its error function.
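
To make that concrete, here is a minimal sketch of cross entropy loss in PyTorch. The tensor values, the number of classes, and the target labels below are made up purely for illustration:

  import torch as T

  # three made-up samples, four classes -- values are for illustration only
  logits = T.tensor([[2.0, 0.5, 0.1, 0.1],
                     [0.2, 3.0, 0.1, 0.4],
                     [0.1, 0.2, 0.3, 2.5]])   # raw model outputs (no activation)
  targets = T.tensor([0, 1, 3])               # correct class index for each sample

  loss_func = T.nn.CrossEntropyLoss()
  loss = loss_func(logits, targets)           # mean loss over the three samples
  print(loss.item())

The better the model's raw outputs line up with the correct class indices, the smaller this value gets.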


If you create a PyTorch neural network multi-class classifier, you can use the NLLLoss() loss function with log_softmax() output activation, or you can use the CrossEntropyLoss() loss function with identity (in other words, no) output activation. Internally, the two approaches are identical, and you get the exact same results either way. I was mentally comparing the two approaches and decided that the NLLLoss() ("negative log likelihood loss") with log_softmax() output activation has one tiny advantage over the CrossEntropyLoss() with no activation approach.
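
You can verify the equivalence with a quick check; the numbers below are made up for illustration:

  import torch as T

  logits = T.tensor([[1.5, -0.3, 0.8],
                     [0.2,  2.1, -1.0]])   # two made-up samples, three classes
  targets = T.tensor([2, 1])

  # approach 1: log_softmax() output activation + NLLLoss()
  loss1 = T.nn.NLLLoss()(T.log_softmax(logits, dim=1), targets)

  # approach 2: no output activation + CrossEntropyLoss()
  loss2 = T.nn.CrossEntropyLoss()(logits, targets)

  print(loss1.item(), loss2.item())        # the two values are identical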


In the network's forward() method, the two output options look like this:

  z = T.log_softmax(self.oupt(z), dim=1)   # for NLLLoss()
  # z = self.oupt(z)                       # no activation, for CrossEntropyLoss()

Notice that for NLLLoss() I use log_softmax() output activation and for CrossEntropyLoss() I use no activation (sometimes called identity activation).
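
Putting those two lines in context, here is a rough sketch of a complete classifier. The layer sizes, layer names, and tanh() hidden activation are my own assumptions for illustration, not details from any particular demo:

  import torch as T

  class Net(T.nn.Module):
    def __init__(self):
      super(Net, self).__init__()
      self.hid1 = T.nn.Linear(4, 8)   # 4 input features, 8 hidden nodes (assumed sizes)
      self.oupt = T.nn.Linear(8, 3)   # 3 output classes (assumed)

    def forward(self, x):
      z = T.tanh(self.hid1(x))
      z = T.log_softmax(self.oupt(z), dim=1)   # for NLLLoss()
      # z = self.oupt(z)                       # no activation, for CrossEntropyLoss()
      return z

  net = Net()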


The no-output activation with CrossEntropyLoss() just doesn't look as nice. This is what I don't like about the CrossEntropyLoss() approach: the no activation sort of looks like a mistake, even though it isn't. PyTorch does have an explicit Identity module you can use:

  z = T.nn.Identity()(self.oupt(z))   # no activation

but there's no built-in identity() function, so the code is very ugly.
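
If it helps, Identity() really is just a pass-through. A tiny self-contained check, with made-up values:

  import torch as T

  ident = T.nn.Identity()
  x = T.tensor([[1.0, -2.0, 3.0]])   # made-up values
  print(ident(x))                    # identical to x: Identity() just passes values through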


For training, the two loss function choices, along with a basic SGD optimizer, are:

  loss_func = T.nn.NLLLoss()             # assumes log_softmax() output activation
  # loss_func = T.nn.CrossEntropyLoss()  # assumes no output activation

  optimizer = T.optim.SGD(net.parameters(), lr=lrn_rate)

After training, if you want to make a prediction where the output values are pseudo-probabilities that sum to 1.0, the two choices are: with NLLLoss(), the output values are log-probabilities, so you apply the exp() function to remove the log; with CrossEntropyLoss(), the output values are raw logits, which could be any values, so you apply softmax() to get values that sum to 1. The CrossEntropyLoss() with no output activation approach was introduced sometime around PyTorch version 1.0 a couple of years ago in order to make multi-class classification easier. I think ML developers are smart enough to put their big boy pants on and use an explicit output activation function.
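
Here is what that prediction step might look like for the log_softmax() variant. The stand-in model and the input values below are assumptions for illustration only (an untrained model gives meaningless probabilities, but the mechanics are the same):

  import torch as T

  # a stand-in model: 4 inputs, 3 classes, log_softmax() output (the NLLLoss() style)
  net = T.nn.Sequential(T.nn.Linear(4, 3), T.nn.LogSoftmax(dim=1))

  x = T.tensor([[0.5, 1.5, -0.2, 0.0]])   # one made-up input item
  net.eval()
  with T.no_grad():
    output = net(x)                       # log-probabilities

  probs = T.exp(output)                   # NLLLoss() style: exp() removes the log
  # probs = T.softmax(output, dim=1)      # CrossEntropyLoss() style: raw logits -> softmax()
  print(probs, probs.sum())               # pseudo-probabilities that sum to 1.0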





