Python elasticsearch-dsl django pagination
Following the advice from Danielle Madeley, I also created a proxy to search results which works well with the latest version of django-elasticsearch-dsl==0.4.4
.
from django.utils.functional import LazyObject
class SearchResults(LazyObject):
def __init__(self, search_object):
self._wrapped = search_object
def __len__(self):
return self._wrapped.count()
def __getitem__(self, index):
search_results = self._wrapped[index]
if isinstance(index, slice):
search_results = list(search_results)
return search_results
Then you can use it in your search view like this:
paginate_by = 20
search = MyModelDocument.search()
# ... do some filtering ...
search_results = SearchResults(search)
paginator = Paginator(search_results, paginate_by)
page_number = request.GET.get("page")
try:
page = paginator.page(page_number)
except PageNotAnInteger:
# If page parameter is not an integer, show first page.
page = paginator.page(1)
except EmptyPage:
# If page parameter is out of range, show last existing page.
page = paginator.page(paginator.num_pages)
Django's LazyObject proxies all attributes and methods from the object assigned to the _wrapped attribute. I am overriding a couple of methods that are required by Django's paginator, but don't work out of the box with the Search() instances.
A very simple solution is to use MultipleObjectMixin and extract your Elastic results in get_queryset()
by overriding it. In this case Django will take care of the pagination itself if you add the paginate_by
attribute.
It should look like that:
class MyView(MultipleObjectMixin, ListView):
paginate_by = 10
def get_queryset(self):
object_list = []
""" Query Elastic here and return the response data in `object_list`.
If you wish to add filters when querying Elastic,
you can use self.request.GET params here. """
return object_list
Note: The code above is broad and different from my own case so I can not guarantee it works. I used similar solution by inheriting other Mixins, overriding get_queryset()
and taking advantage of Django's built in pagination - it worked great for me. As it was an easy fix I decided to post it here with a similar example.
I found this paginator on this link:
from django.core.paginator import Paginator, Page
class DSEPaginator(Paginator):
"""
Override Django's built-in Paginator class to take in a count/total number of items;
Elasticsearch provides the total as a part of the query results, so we can minimize hits.
"""
def __init__(self, *args, **kwargs):
super(DSEPaginator, self).__init__(*args, **kwargs)
self._count = self.object_list.hits.total
def page(self, number):
# this is overridden to prevent any slicing of the object_list - Elasticsearch has
# returned the sliced data already.
number = self.validate_number(number)
return Page(self.object_list, number, self)
and then in view i use:
q = request.GET.get('q', None)
page = int(request.GET.get('page', '1'))
start = (page-1) * 10
end = start + 10
query = MultiMatch(query=q, fields=['title', 'body'], fuzziness='AUTO')
s = Search(using=elastic_client, index='post').query(query)[start:end]
response = s.execute()
paginator = DSEPaginator(response, settings.POSTS_PER_PAGE)
try:
posts = paginator.page(page)
except PageNotAnInteger:
posts = paginator.page(1)
except EmptyPage:
posts = paginator.page(paginator.num_pages)
this way it works perfectly..