Convert comma decimal separators to dots within a DataFrame
I think the earlier-mentioned answer of including decimal="," in pandas' read_csv is the preferred option.

However, I found it to be incompatible with the Python parsing engine: when using skiprows=, for example, read_csv falls back to this engine, so as far as I know you can't use skiprows= and decimal="," in the same read_csv call. Also, I haven't been able to actually get the decimal= argument to work (probably due to me, though).
The long way round I used to achieve the same result is with list comprehensions, .replace and .astype. The major downside to this method is that it needs to be done one column at a time:
import pandas as pd

df = pd.DataFrame({'a': ['120,00', '42,00', '18,00', '23,00'],
                   'b': ['51,23', '18,45', '28,90', '133,00']})

# Replace the comma with a dot in each cell, then cast the column to float
df['a'] = [x.replace(',', '.') for x in df['a']]
df['a'] = df['a'].astype(float)
Now column a holds float cells, while column b still contains strings.

Note that the .replace used here is not pandas' Series.replace but Python's built-in str.replace; the pandas version requires the cell value to be an exact match or to match a regex.
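If many columns need the conversion, a vectorized variant using pandas' Series.str.replace can handle them all in one loop; a minimal sketch, assuming every column that needs converting is still of object (string) dtype:

import pandas as pd

df = pd.DataFrame({'a': ['120,00', '42,00', '18,00', '23,00'],
                   'b': ['51,23', '18,45', '28,90', '133,00']})

# Replace commas in every string column, then cast to float
for col in df.select_dtypes(include='object').columns:
    df[col] = df[col].str.replace(',', '.').astype(float)

Unlike the list comprehension, .str.replace operates on the whole Series at once and leaves NaN values untouched.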
pandas.read_csv has a decimal parameter for this: doc

I.e. try with:
df = pd.read_csv(Input, delimiter=";", decimal=",")
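A minimal, self-contained sketch of that call, using an in-memory CSV with made-up sample data in place of Input:

import io
import pandas as pd

# Hypothetical semicolon-separated data with comma decimal separators
csv_data = io.StringIO("a;b\n120,00;51,23\n42,00;18,45")

# decimal="," makes read_csv parse the comma decimals as floats directly
df = pd.read_csv(csv_data, delimiter=";", decimal=",")
print(df.dtypes)  # both columns should come out as float64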