# What is the difference between the random.choices() and random.sample() functions?

The fundamental difference is that `random.choices()` can draw the element at the same position more than once (it always samples from the entire sequence, so each element is put back after being drawn: sampling *with replacement*), while `random.sample()` cannot (once an element is picked, it is removed from the population, so it is not put back: sampling *without replacement*).

Note that here *replaced* (*replacement*) should be understood as *placed back* (*placement back*) and not as a synonym of *substituted* (and *substitution*).
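To make the two sampling schemes concrete, here is a minimal sketch that mimics them by hand. The helpers `draw_with_replacement` and `draw_without_replacement` are hypothetical names used only for illustration; this is not how the `random` module implements `choices()` and `sample()` internally, only a model of the behavior described above.

```python
import random

def draw_with_replacement(population, k):
    # Hypothetical helper: every draw looks at the full population,
    # so the same position can be picked again.
    return [random.choice(population) for _ in range(k)]

def draw_without_replacement(population, k):
    # Hypothetical helper: each picked element is removed from a working copy,
    # so no position can be picked twice.
    pool = list(population)
    picked = []
    for _ in range(k):
        picked.append(pool.pop(random.randrange(len(pool))))
    return picked

items = list(range(5))
with_repl = draw_with_replacement(items, 8)        # k > len(items) is fine, repeats are certain
without_repl = draw_without_replacement(items, 5)  # some ordering of all five items
print(with_repl)
print(without_repl)
```

Asking `draw_without_replacement` for more elements than the population holds fails once the working copy is empty, which mirrors the size limit of `random.sample()`.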

To better understand it, let's consider the following example:

```
import random
random.seed(0)
ll = list(range(10))
print(random.sample(ll, 10))
# [6, 9, 0, 2, 4, 3, 5, 1, 8, 7]
print(random.choices(ll, k=10))
# [5, 9, 5, 2, 7, 6, 2, 9, 9, 8]
```

As you can see, `random.sample()` does not produce repeating elements, while `random.choices()` does.

In your example, both methods have repeating values because you have repeating values in the original sequence, but, in the case of `random.sample()`, those repeating values must come from different positions of the original input.
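A small sketch can make the point about positions visible. It samples indices rather than values, so you can see where each pick comes from; the list `values` is a made-up input, not the one from the question.

```python
import random

values = ['a', 'a', 'b']  # the value 'a' occurs at positions 0 and 1

# Sample positions instead of values to see where each pick comes from.
positions = random.sample(range(len(values)), 3)
picked = [values[i] for i in positions]

print(positions)  # some ordering of 0, 1, 2: every position appears exactly once
print(picked)     # contains 'a' twice, once from each position
```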

Finally, you cannot `sample()` more elements than the input sequence contains, while this is not an issue with `choices()`:

```
# print(random.sample(ll, 20))
# ValueError: Sample larger than population or is negative
print(random.choices(ll, k=20))
# [9, 3, 7, 8, 6, 4, 1, 4, 6, 9, 9, 4, 8, 2, 8, 5, 0, 7, 3, 8]
```

A more generic and theoretical discussion of the sampling process can be found on Wikipedia.

The basic difference is this:

- Use the `random.sample` function when you want to choose multiple random items from a list without picking the same item twice.
- Use the `random.choices` function when you want to choose multiple items from a list and repeated picks are allowed.

Here are two examples to demonstrate the difference:

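```
import random

alpha_list = ['Batman', 'Flash', 'Wonder Woman', 'Cyborg', 'Superman']

# k may exceed len(alpha_list) because items are drawn with replacement
choices = random.choices(alpha_list, k=7)
print(choices)
# e.g. ['Cyborg', 'Cyborg', 'Wonder Woman', 'Flash', 'Wonder Woman', 'Flash', 'Batman']

# k must not exceed len(alpha_list) because items are drawn without replacement
sample = random.sample(alpha_list, k=3)
print(sample)
# e.g. ['Superman', 'Flash', 'Batman']
```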

As you can see from the examples above, **with `random.choices()` you can pass a `k` greater than the length of your sequence, because `random.choices()` allows duplicates**.

Whereas if you pass a value of `k` greater than the length of the sequence to `random.sample()`, you get an error: `ValueError: Sample larger than population or is negative`.
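If `k` is not known in advance, the error can be caught explicitly. The following is a minimal sketch of that, using a made-up `population` list; `random.choices()` accepts the same `k` without complaint.

```python
import random

population = ['Batman', 'Flash', 'Wonder Woman', 'Cyborg', 'Superman']
k = len(population) + 1  # one more than the population size

try:
    random.sample(population, k)
except ValueError as err:
    message = str(err)
    print(f"sample() failed: {message}")

# choices() accepts the same k, since it draws with replacement
result = random.choices(population, k=k)
print(len(result))  # 6
```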

Now, coming to use cases:

- `random.choices(sequence, weights=None, cum_weights=None, k=1)`: use this **when you can afford to have duplicates in your sampling**. This is the very reason why we can give a value of `k` > `len(sequence)`.
- `random.sample(sequence, k)`: use this **when you can't afford to have duplicates while sampling your data**.
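As an illustration of the two use cases, consider rolling a die versus dealing cards. This is only a sketch; the names `die_faces`, `deck` and `hand` are chosen for the example.

```python
import random

# Rolling a die: every roll is independent, so repeated faces are expected.
die_faces = [1, 2, 3, 4, 5, 6]
rolls = random.choices(die_faces, k=10)  # k > len(die_faces) is fine
print(rolls)

# Dealing cards: a card that has been dealt cannot be dealt again.
deck = [rank + suit for suit in "SHDC" for rank in "A23456789TJQK"]
hand = random.sample(deck, 5)
print(hand)
```

Ten rolls of a six-sided die necessarily repeat at least one face, while the five cards in `hand` are always distinct.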

