How do I fix the "invalid literal for int() with base 10" error in pandas?
Others might encounter this error when the string represents a float rather than an integer:
>>> int("34.54545")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: '34.54545'
The workaround is to convert to a float first and then to an int. Note that int() truncates toward zero rather than rounding:
>>> int(float("34.54545"))
34
Or, in pandas:
df.astype(float).astype(int)
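As a minimal sketch of the same idea on a single column (the Series and its values below are made up for illustration, not taken from the question), going through float first avoids the ValueError, and the final astype(int) truncates the fractional part:

```python
import pandas as pd

# Hypothetical column of numeric strings, some with a fractional part.
s = pd.Series(["34.54545", "12", "7.9"])

# s.astype(int) would raise ValueError on "34.54545";
# converting to float first works, and astype(int) then truncates toward zero.
as_int = s.astype(float).astype(int)

print(as_int.tolist())  # [34, 12, 7]
```

If rounding is wanted instead of truncation, calling .round() between the two astype calls is one option.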
I run this
int('260,327,021')
and get this
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-448-a3ba7c4bd4fe> in <module>()
----> 1 int('260,327,021')

ValueError: invalid literal for int() with base 10: '260,327,021'
I assure you that not everything in your dataframe is a number. It may look like a number, but it is a string with commas in it.
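One way to check which values are the problem, sketched here on a made-up Series (the sample values are assumptions for illustration), is to let pd.to_numeric coerce anything it cannot parse to NaN and then look at the rows that failed:

```python
import pandas as pd

# Hypothetical column mixing clean numbers with strings that only look numeric.
s = pd.Series(["260,327,021", "42", "abc"])

# errors="coerce" turns unparseable entries into NaN instead of raising.
parsed = pd.to_numeric(s, errors="coerce")

# Select the original strings that could not be parsed.
bad = s[parsed.isna()]

print(bad.tolist())  # ['260,327,021', 'abc']
```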
You'll want to remove the commas and then convert to int:
pd.Series(['260,327,021']).str.replace(',', '').astype(int)
0    260327021
dtype: int64
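The same cleanup can be applied to a DataFrame column, or avoided altogether at load time. The sketch below assumes a hypothetical CSV with a "population" column (the column names and data are invented for illustration); it shows both the str.replace route and the thousands="," option of pd.read_csv:

```python
import io
import pandas as pd

# Hypothetical CSV text; the numbers are quoted because they contain commas.
csv_text = 'city,population\nA,"260,327,021"\nB,"1,500"\n'

# Option 1: read as strings, strip the commas, then convert.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)
df["population"] = (
    df["population"].str.replace(",", "", regex=False).astype(int)
)

# Option 2: tell read_csv that "," is a thousands separator.
df2 = pd.read_csv(io.StringIO(csv_text), thousands=",")

print(df["population"].tolist())   # [260327021, 1500]
print(df2["population"].tolist())  # [260327021, 1500]
```

Option 2 is usually simpler when the data comes from a file, while option 1 works on a DataFrame that has already been loaded.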