What is a convenient way to store and retrieve boolean values in a CSV file?
Ways to store boolean values in CSV files
- Strings: Two common choices are `true` and `false`, or `True` and `False`, but I've also seen `yes` and `no`.
- Integers: `0` or `1`
- Floats: `0.0` or `1.0`
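As a rough illustration, the three encodings could be written with Python's standard `csv` module as follows. The column names and the sample flags are made up for this sketch.

```python
import csv
import io

flags = [True, False, True]  # made-up sample data

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["as_string", "as_int", "as_float"])  # hypothetical column names
for flag in flags:
    # str(True) -> 'True', int(True) -> 1, float(True) -> 1.0
    writer.writerow([str(flag), int(flag), float(flag)])

csv_text = buffer.getvalue()
print(csv_text)
```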
Let's compare the respective advantages / disadvantages:
- Strings:
  + A human can read it
  - CSV readers may keep it as a string, in which case both values evaluate to "true" when `bool` is applied to them, because any non-empty string is truthy
- Integers:
  + CSV readers might see that this column is integer, and `bool(0)` evaluates to false.
  + A bit more space efficient
  - Not totally clear that it is boolean
- Floats:
  + CSV readers might see that this column is float, and `bool(0.0)` evaluates to false.
  - Not totally clear that it is boolean
  + Possible to have null (as NaN)
The Pandas CSV reader shows the described behaviour.
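The string pitfall can also be reproduced without pandas. Below is a minimal sketch with Python's standard `csv` module, which returns every field as a string; the column names and rows are invented for the example.

```python
import csv
import io

# Invented sample data: the same flag stored as a string and as an integer.
csv_text = "as_string,as_int\ntrue,1\nfalse,0\n"

rows = list(csv.DictReader(io.StringIO(csv_text)))

# Every non-empty string is truthy, so 'false' also becomes True.
from_strings = [bool(row["as_string"]) for row in rows]

# Converting to int first gives the intended values.
from_ints = [bool(int(row["as_int"])) for row in rows]

print(from_strings)  # [True, True]
print(from_ints)     # [True, False]
```

The same care applies to the float encoding: `bool(float('nan'))` is `True`, so a NaN that stands for null has to be detected (for example with `math.isnan`) before converting to a boolean.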
Convert Bool strings to Bool values
Have a look at `mpu.string.str2bool`:
>>> from mpu.string import str2bool
>>> str2bool('True')
True
>>> str2bool('1')
True
>>> str2bool('0')
False
which has the following implementation:
def str2bool(string_, default='raise'):
    """
    Convert a string to a bool.

    Parameters
    ----------
    string_ : str
    default : {'raise', False}
        Default behaviour if none of the "true" strings is detected.

    Returns
    -------
    boolean : bool

    Examples
    --------
    >>> str2bool('True')
    True
    >>> str2bool('1')
    True
    >>> str2bool('0')
    False
    """
    true = ['true', 't', '1', 'y', 'yes', 'enabled', 'enable', 'on']
    false = ['false', 'f', '0', 'n', 'no', 'disabled', 'disable', 'off']
    if string_.lower() in true:
        return True
    elif string_.lower() in false or (not default):
        return False
    else:
        raise ValueError('The value \'{}\' cannot be mapped to boolean.'
                         .format(string_))
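To apply such a conversion while reading a CSV file, one option is to convert the column row by row. The sketch below uses the standard `csv` module and a small stand-in helper, `parse_bool`, so that it runs without mpu installed. The helper, its shortened word lists, the column names and the data are all assumptions for illustration; `mpu.string.str2bool` could be used in its place.

```python
import csv
import io

# Shortened, hypothetical word lists; not the full lists used by str2bool.
TRUE_STRINGS = {"true", "t", "1", "y", "yes", "on"}
FALSE_STRINGS = {"false", "f", "0", "n", "no", "off"}


def parse_bool(text):
    """Stand-in for str2bool: map a string to a bool or raise ValueError."""
    lowered = text.strip().lower()
    if lowered in TRUE_STRINGS:
        return True
    if lowered in FALSE_STRINGS:
        return False
    raise ValueError(f"The value {text!r} cannot be mapped to boolean.")


# Invented sample file with mixed spellings in the boolean column.
csv_text = "name,active\nalice,True\nbob,no\ncarol,1\n"

active = {
    row["name"]: parse_bool(row["active"])
    for row in csv.DictReader(io.StringIO(csv_text))
}
print(active)  # {'alice': True, 'bob': False, 'carol': True}
```

Raising on unknown values, rather than silently returning `False`, makes typos in the file visible at load time.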