elasticsearch-py scan and scroll to return all documents

Did your issue get resolved?

I have a simple solution: you must update the scroll_id after every call to the scroll method, using the value returned in the latest response, like this:

response_tmp = es.scroll(scroll_id=scrollId, scroll="1m")
scrollId = response_tmp['_scroll_id']
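For context, here is a minimal sketch of a full scroll loop built around those two lines. The helper name `scroll_all` and its parameters are made up for illustration, and it assumes `es` is an elasticsearch-py client whose `search` call still accepts a `body` argument (newer client versions may prefer `query=`), so treat it as a starting point rather than the definitive way to do it.

```python
def scroll_all(es, index, query, scroll="1m", size=1000):
    """Yield every hit matching `query` by following the scroll cursor.

    `scroll_all` is a hypothetical helper, not part of elasticsearch-py.
    `es` is assumed to be an Elasticsearch client instance.
    """
    # Open the scroll context with the initial search.
    response = es.search(index=index, body=query, scroll=scroll, size=size)
    scroll_id = response["_scroll_id"]
    try:
        # An empty page of hits means the scroll is exhausted.
        while response["hits"]["hits"]:
            for hit in response["hits"]["hits"]:
                yield hit
            response = es.scroll(scroll_id=scroll_id, scroll=scroll)
            # Continue with the ID from the latest response, not the first one.
            scroll_id = response["_scroll_id"]
    finally:
        # Release the scroll context on the server once we are done.
        es.clear_scroll(scroll_id=scroll_id)
```

You would call it as `for hit in scroll_all(es, "my-index", {"query": {"match_all": {}}}): ...`, where "my-index" is a placeholder index name.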

The Python scan method generates a GET call to the REST API and tries to send your scroll_id over HTTP. The most likely cause is that your scroll_id is too large to be sent that way, so the call returns no response and you see this error.

Because the scroll_id grows with the number of shards you have, it is better to use a POST and send the scroll_id as JSON in the request body. This gets around the size limit on the HTTP call.
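As a rough illustration of the POST approach, the sketch below builds a request to the `_search/scroll` endpoint with the scroll_id in the JSON body, using only the standard library. The function name `build_scroll_request` and the base URL `http://localhost:9200` are assumptions for the example, not something taken from the question.

```python
import json
import urllib.request


def build_scroll_request(base_url, scroll_id, scroll="1m"):
    """Build a POST request that carries the scroll_id in the JSON body.

    `build_scroll_request` is a hypothetical helper; `base_url` is assumed
    to point at an Elasticsearch node, e.g. "http://localhost:9200".
    """
    payload = json.dumps({"scroll": scroll, "scroll_id": scroll_id}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_search/scroll",
        data=payload,  # the ID travels in the body, so URL length does not matter
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# To fetch the next page (requires a reachable cluster):
# with urllib.request.urlopen(build_scroll_request(url, scroll_id)) as resp:
#     page = json.load(resp)
#     scroll_id = page["_scroll_id"]
```

Because the ID is in the body, its length no longer affects the request line, however many shards contribute to it.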

Tags:

Python

Elasticsearch
