TypeError: string indices must be integers while parsing JSON using Python?
Try replacing j = json.loads(json.dumps(jsonStr)) with j = json.loads(jsonStr).
The problem is that jsonStr is a string that encodes some object in JSON, not the actual object.
You obviously knew it was a string, because you called it jsonStr. And it's proven by the fact that this line works:
jsonStr = data.decode("utf-8")
So, jsonStr is a string. Calling json.dumps on a string is perfectly legal. It doesn't matter whether that string was the JSON encoding of some object, or your last name; you can encode that string in JSON. And then you can decode that string, getting back the original string.
So, this:
j = json.loads(json.dumps(jsonStr))
… is going to give you back the exact same string as jsonStr in j. Which you still haven't decoded to the original object.
To do that, just don't do the extra encode:
j = json.loads(jsonStr)
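Here is a minimal sketch of the difference, assuming data holds UTF-8 encoded bytes. The payload is made up for illustration, and wrong is a name used only in this example:

```python
import json

# Hypothetical payload standing in for the bytes you received.
data = b'{"name": "example", "tags": ["a", "b"]}'

jsonStr = data.decode("utf-8")            # bytes -> str (still JSON text)
wrong = json.loads(json.dumps(jsonStr))   # encode, then decode: the same str again
j = json.loads(jsonStr)                   # str -> Python object (a dict here)

print(type(wrong).__name__)  # str
print(type(j).__name__)      # dict
print(j["tags"][1])          # b
```

Indexing wrong["tags"] would raise the TypeError from the question, because wrong is still a str.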
If that isn't clear, try playing with it in an interactive terminal:
>>> import json
>>> obj = ['abc', {'a': 1, 'b': 2}]
>>> type(obj)
list
>>> obj[1]['b']
2
>>> j = json.dumps(obj)
>>> type(j)
str
>>> j[1]['b']
TypeError: string indices must be integers
>>> jj = json.dumps(j)
>>> type(jj)
str
>>> j
'["abc", {"a": 1, "b": 2}]'
>>> jj
'"[\\"abc\\", {\\"a\\": 1, \\"b\\": 2}]"'
>>> json.loads(j)
['abc', {'a': 1, 'b': 2}]
>>> json.loads(j) == obj
True
>>> json.loads(jj)
'["abc", {"a": 1, "b": 2}]'
>>> json.loads(jj) == j
True
>>> json.loads(jj) == obj
False
OK... So for people who are still lost because they are used to JS, this is what I understood after testing multiple use cases:
json.dumps does not make your string ready to be loaded with json.loads. It only encodes it according to the JSON spec (escaping quotes and backslashes along the way)!
json.loads transforms a correctly formatted JSON string into a Python object (a dictionary, when the JSON text is an object). It only works if the text follows the JSON spec (no single quotes, lowercase true and false, etc.).
Dumping JSON - An encoding story
Let's take an example!
$ obj = {"foobar": True}
This is NOT JSON! This is a Python dictionary that uses Python types (like booleans).
True is not valid JSON, so in order to send this to an API you have to serialize it to REAL JSON. That's where json.dumps comes in!
$ json.dumps({"foobar": True})
'{"foobar": true}'
See? True became true, which is real JSON. You now have a string that you can send to the real world. Good job.
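Booleans are not the only values that change spelling on the way out. A small sketch, with a made-up object, of how json.dumps maps a few Python types to their JSON equivalents:

```python
import json

# Hypothetical object; keys and values are invented for illustration.
obj = {"foobar": True, "missing": None, "point": (1, 2), "ratio": 0.5}

encoded = json.dumps(obj)
print(encoded)
# {"foobar": true, "missing": null, "point": [1, 2], "ratio": 0.5}
print(type(encoded).__name__)  # str
```

None becomes null and the tuple becomes a JSON array, but the result is always a plain str.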
Loading JSON - A decoding story
So now let's talk about json.loads.
You have a string that looks like JSON, but it's only a string, and what you want is a Python dictionary. Let's walk through the following examples:
$ string = '{"foobar": true}'
$ d = json.loads(string)
$ d
{'foobar': True}
Here we have a string that looks like JSON. You can use json.loads to transform this string into a dictionary (named d here, so it does not shadow the built-in dict) and do d["foobar"], which will return True.
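Note that json.loads does not always hand back a dictionary: it returns whatever the top-level JSON value is. A quick sketch with a few sample strings (all made up) shows the mapping:

```python
import json

# Sample JSON texts: an object, an array, a string, a number, and null.
samples = ['{"foobar": true}', '[1, 2, 3]', '"hello"', '42', 'null']

decoded = [json.loads(s) for s in samples]
for text, value in zip(samples, decoded):
    print(text, "->", type(value).__name__)
```

So if you get a str back from json.loads, the JSON you fed it was a JSON string, which is usually a sign that the data was encoded twice.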
So, why so many errors?
Well, if your JSON looks like JSON but does not actually follow the spec, for instance:
$ string = "{'foobar': true}"
$ json.loads(string)
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes
BAM! This does not work because the JSON spec does not allow single-quoted strings, only double-quoted ones...
If you swap the quotes to '{"foobar": true}' then it will work.
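If you would rather report the problem than crash, you can catch the exception. This is only a sketch: try_parse is a hypothetical helper name, while json.JSONDecodeError and its msg and pos attributes come from the standard library:

```python
import json

def try_parse(text):
    """Return (value, None) on success or (None, error description) on failure."""
    try:
        return json.loads(text), None
    except json.JSONDecodeError as exc:
        # exc.msg is the short reason, exc.pos the character offset of the problem.
        return None, f"{exc.msg} at position {exc.pos}"

bad = try_parse("{'foobar': true}")    # single quotes: not valid JSON
good = try_parse('{"foobar": true}')   # double quotes: valid JSON

print(bad)
print(good)
```

The first call reports the quoting problem instead of raising, and the second returns the decoded dictionary.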
What you probably tried is:
string = json.loads(json.dumps("{'foobar': true}"))
This JSON is invalid (check the quotes), and moreover you'll get a string as a result. Disappointed? I know...
json.dumps does NOT fix your JSON string. It wraps the whole text in a JSON string literal, escaping characters where needed, and json.loads then unwraps it again, so you get back the exact string you started with, bad quotes included.
You have to understand that json.dumps encodes and json.loads decodes!
So what you did here is encode a string and then decode the string. But it's still a string! You haven't done anything to change that fact! If you want to get from string to dictionary then you need an extra step... => A second json.loads!
Let's try that with valid JSON (no nasty single quotes):
$ obj = json.loads(json.loads(json.dumps('{"foobar": true}')))
$ obj["foobar"]
True
The JSON string went through json.dumps and got encoded. Then it went through json.loads, where it got decoded (useless... YAY). Finally, it went through json.loads AGAIN and got transformed from string to dictionary. As you can see, using json.dumps only adds a useless step in that situation.
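To see each of those three steps on its own, here is a small sketch that keeps the intermediate values (step1, step2, and step3 are names picked just for this example):

```python
import json

original = '{"foobar": true}'

step1 = json.dumps(original)   # a JSON string literal wrapping the text
step2 = json.loads(step1)      # unwrapped: the original text again
step3 = json.loads(step2)      # finally decoded into a dict

print(repr(step1))             # '"{\\"foobar\\": true}"'
print(step2 == original)       # True
print(step3["foobar"])         # True
```

Only the last json.loads does any real decoding; the first two calls cancel each other out.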
One last thing. If we do the same thing again but with bad JSON:
$ string = json.loads(json.loads(json.dumps("{'foobar': true}")))
json.decoder.JSONDecodeError: Expecting property name enclosed in double quotes
The quotes are wrong here (aren't you getting used to this by now?).
What happened here is that json.dumps encoded your string, json.loads undid that encoding (lol), and finally the second json.loads got the bad JSON, unchanged, because the first two steps canceled each other out.
TL;DR
In conclusion:
Fix your JSON yourself! Don't give json.loads wrongly formatted JSON, and don't try to mix json.loads with json.dumps to fix what only you can fix.
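If the double encoding happens upstream and you cannot fix it at the source, one possible workaround is to decode again while the result is still a string. This is a rough sketch under that assumption, not a recommended default: loads_unwrapping and max_depth are hypothetical names, and the helper will also unwrap a legitimate JSON string that happens to contain JSON text (for example "42" becomes 42).

```python
import json

def loads_unwrapping(text, max_depth=3):
    """Decode JSON text, decoding again while the result is still a str."""
    value = json.loads(text)
    depth = 1
    while isinstance(value, str) and depth < max_depth:
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            break  # a plain string value, not nested JSON text
        depth += 1
    return value

double_encoded = json.dumps(json.dumps({"foobar": True}))
print(loads_unwrapping(double_encoded))      # {'foobar': True}
print(loads_unwrapping('{"foobar": true}'))  # {'foobar': True}
print(loads_unwrapping('"hello"'))           # hello
```

It still cannot repair invalid JSON such as single-quoted keys; that part remains yours to fix.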
Hope this helped someone ;-)
Disclaimer: I'm no Python expert.
Feel free to challenge this answer in the comment section.