Pytorch doesn't support one-hot vector?

PyTorch states in its documentation for CrossEntropyLoss that

This criterion expects a class index (0 to C-1) as the target for each value of a 1D tensor of size minibatch

In other words, it has your to_one_hot_vector function conceptually built in CEL and does not expose the one-hot API. Notice that one-hot vectors are memory inefficient compared to storing class labels.

If you are given one-hot vectors and need to go to class labels format (for instance to be compatible with CEL), you can use argmax like below:

import torch
 
labels = torch.tensor([1, 2, 3, 5])
one_hot = torch.zeros(4, 6)
one_hot[torch.arange(4), labels] = 1
 
reverted = torch.argmax(one_hot, dim=1)
assert (labels == reverted).all().item()

This code will help you with both one hot encode and multi hot encode:

import torch
batch_size=10
n_classes=5
target = torch.randint(high=5, size=(1,10)) # set size (2,10) for MHE
print(target)
y = torch.zeros(batch_size, n_classes)
y[range(y.shape[0]), target]=1
y

The output in OHE

tensor([[4, 3, 2, 2, 4, 1, 1, 1, 4, 2]])

tensor([[0., 0., 0., 0., 1.],
        [0., 0., 0., 1., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 1., 0., 0.],
        [0., 0., 0., 0., 1.],
        [0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 1., 0., 0., 0.],
        [0., 0., 0., 0., 1.],
        [0., 0., 1., 0., 0.]])

The output for MHE when I set target = torch.randint(high=5, size=(2,10))

tensor([[3, 2, 4, 4, 2, 4, 0, 4, 4, 1],
        [4, 1, 1, 3, 2, 2, 4, 2, 4, 3]])

tensor([[0., 0., 0., 1., 1.],
        [0., 1., 1., 0., 0.],
        [0., 1., 0., 0., 1.],
        [0., 0., 0., 1., 1.],
        [0., 0., 1., 0., 0.],
        [0., 0., 1., 0., 1.],
        [1., 0., 0., 0., 1.],
        [0., 0., 1., 0., 1.],
        [0., 0., 0., 0., 1.],
        [0., 1., 0., 1., 0.]])

If you need multiple OHE:

torch.nn.functional.one_hot(target)

tensor([[[0, 0, 0, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 0, 0, 0, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 1],
         [1, 0, 0, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 0, 0, 0, 1],
         [0, 1, 0, 0, 0]],

        [[0, 0, 0, 0, 1],
         [0, 1, 0, 0, 0],
         [0, 1, 0, 0, 0],
         [0, 0, 0, 1, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 0, 0, 1],
         [0, 0, 0, 1, 0]]])

Pytorch doesn't support one-hot vector?

Tags:

Python

Machine Learning

Pytorch

Related

Recent Posts