Faceting

Faceting is a way of grouping and narrowing search results by a common factor; for example, we can group all results registered on a certain date. As with More Like This, faceting is implemented by setting up a special ^search/facets/$ route on any view that inherits from the drf_haystack.mixins.FacetMixin class.

Note

Options used for faceting are not portable across search backends. Make sure to provide options suitable for the backend you’re using.

First, read the Haystack faceting docs and set up your search index for faceting.

Serializing faceted counts

Faceting is a little special in that it does not care about SearchQuerySet filtering. Faceting is performed by calling the SearchQuerySet().facet(field, **options) and SearchQuerySet().date_facet(field, **options) methods, which apply facets to the SearchQuerySet. Next, we need to call SearchQuerySet().facet_counts() to retrieve a dictionary with all the counts for the faceted fields. The drf_haystack.serializers.HaystackFacetSerializer class is designed to serialize these results.

Tip

It is possible to perform faceting on a subset of the queryset, in which case you’d have to override the get_queryset() method of the view to limit the queryset before it is passed on to the filter_facet_queryset() method.

Any serializer subclassed from the HaystackFacetSerializer is expected to have a field_options dictionary containing a set of default options passed to facet() and date_facet().

Facet serializer example

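from datetime import datetime, timedelta

from drf_haystack.serializers import HaystackFacetSerializer

# PersonIndex is your project's own haystack search index; import it from
# wherever your search indexes are defined.


class PersonFacetSerializer(HaystackFacetSerializer):

    serialize_objects = False  # Setting this to True will serialize the
                               # queryset into an `objects` list. This
                               # is useful if you need to display the faceted
                               # results. Defaults to False.

    class Meta:
        index_classes = [PersonIndex]
        fields = ["firstname", "lastname", "created"]
        field_options = {
            "firstname": {},
            "lastname": {},
            # Date facets need a range and a gap; here: the last three
            # years, in three-month buckets.
            "created": {
                "start_date": datetime.now() - timedelta(days=3 * 365),
                "end_date": datetime.now(),
                "gap_by": "month",
                "gap_amount": 3
            }
        }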

The declared field_options will be used as default options when faceting is applied to the queryset, but can be overridden by supplying query string parameters in the following format.

?firstname=limit:1&created=start_date:20th May 2014,gap_by:year

Each field can be fed options as key:value pairs. Multiple key:value pairs can be supplied, separated by the view.lookup_sep attribute (which defaults to a comma). Any start_date and end_date parameters will be parsed by the python-dateutil parser, which can handle most common date formats.
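To make the option format concrete, here is a minimal standard-library sketch of how such a value could be split into key:value pairs and layered over the declared defaults. It is an illustration only, not the HaystackFacetFilter implementation: parse_facet_options is a hypothetical helper, dates are left as plain strings rather than being parsed, and the per-key merge over the defaults is an assumption.

```python
def parse_facet_options(raw, lookup_sep=","):
    """Split 'key:value,key:value' into a dict (hypothetical helper)."""
    options = {}
    for token in raw.split(lookup_sep):
        # Split on the first ':' only, so the value may contain colons.
        key, sep, value = token.partition(":")
        if not sep:
            raise ValueError(f"expected key:value, got {token!r}")
        options[key.strip()] = value.strip()
    return options


# Defaults as they might be declared in field_options for "created".
defaults = {"gap_by": "month", "gap_amount": 3}

# Options taken from ?created=start_date:20th May 2014,gap_by:year
overrides = parse_facet_options("start_date:20th May 2014,gap_by:year")

# Assumption: query string options win over defaults, key by key.
merged = {**defaults, **overrides}
print(merged)
```

Note that every parsed value is a string; converting start_date and end_date into dates is a separate step.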

Note

  • The HaystackFacetFilter parses query string parameter options, separated with the view.lookup_sep attribute. Each option is parsed as key:value pairs where the : is a hardcoded separator. Setting the view.lookup_sep attribute to ":" will raise an AttributeError.
  • The date parsing in the HaystackFacetFilter intentionally blows up if fed a string format it can’t handle. No exception handling is done, so make sure to convert values to a format you know it can handle before passing them to the filter. In other words, don’t let your users feed their own values in here ;)
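One way to follow this advice is to build the query string in your own client code from typed values instead of passing raw user input through. The sketch below is a hedged illustration using only the standard library; facet_option_value is a hypothetical helper, not part of drf-haystack. It emits dates in ISO format (YYYY-MM-DD), which python-dateutil parses, and it drops any time component on the assumption that extra colons could clash with the key:value separator.

```python
from datetime import date, datetime
from urllib.parse import urlencode


def facet_option_value(options, lookup_sep=","):
    """Render a dict of options as 'key:value,key:value' (hypothetical helper)."""
    parts = []
    for key, value in options.items():
        if isinstance(value, datetime):
            value = value.date()       # drop the time part (see note above)
        if isinstance(value, date):
            value = value.isoformat()  # e.g. 2014-05-20
        parts.append(f"{key}:{value}")
    return lookup_sep.join(parts)


query = urlencode({
    "firstname": facet_option_value({"limit": 1}),
    "created": facet_option_value(
        {"start_date": date(2014, 5, 20), "gap_by": "year"}
    ),
})
print(query)
```

Because urlencode() percent-encodes the colon and comma, the resulting string can be appended to the facets URL as-is.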

Warning

Do not use the HaystackFacetFilter in the regular filter_backends list on the view. It will almost certainly produce errors or weird results. Faceting filters should go in the facet_filter_backends list.

Example serialized content

The serialized content looks a little different from the default Haystack faceted output. The top-level items are always queries, fields and dates, each containing the subset of fields matching that category. In the example below, we have faceted on the fields firstname and lastname, which makes them appear under the fields category. We have also faceted on the date field created, which shows up under the dates category. Each faceted result has a text, count and narrow_url attribute, which should be fairly self-explanatory.

{
  "queries": {},
  "fields": {
    "firstname": [
      {
        "text": "John",
        "count": 3,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3AJohn"
      },
      {
        "text": "Randall",
        "count": 2,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3ARandall"
      },
      {
        "text": "Nehru",
        "count": 2,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3ANehru"
      }
    ],
    "lastname": [
      {
        "text": "Porter",
        "count": 2,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=lastname_exact%3APorter"
      },
      {
        "text": "Odonnell",
        "count": 2,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=lastname_exact%3AOdonnell"
      },
      {
        "text": "Hood",
        "count": 2,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=lastname_exact%3AHood"
      }
    ]
  },
  "dates": {
    "created": [
      {
        "text": "2015-05-15T00:00:00",
        "count": 100,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=created_exact%3A2015-05-15+00%3A00%3A00"
      }
    ]
  }
}
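To relate this shape to what Haystack returns, the following sketch reshapes a facet_counts()-style dictionary, in which each faceted field maps to a list of (value, count) tuples, into the text/count entries shown above. It is an approximation for illustration, not the serializer's actual code: reshape_facet_counts is a hypothetical helper, the sample data is made up to match the example, and narrow_url is left out.

```python
from datetime import datetime


def _as_text(value):
    # Dates are rendered in ISO format, other values as plain strings.
    return value.isoformat() if hasattr(value, "isoformat") else str(value)


def reshape_facet_counts(facet_counts):
    """Turn (value, count) tuples into {"text", "count"} dicts (hypothetical)."""
    shaped = {"queries": dict(facet_counts.get("queries", {}))}
    for category in ("fields", "dates"):
        shaped[category] = {}
        for field, pairs in facet_counts.get(category, {}).items():
            shaped[category][field] = [
                {"text": _as_text(value), "count": count}
                for value, count in pairs
            ]
    return shaped


# Sample input in the shape Haystack's facet_counts() uses.
sample = {
    "fields": {"firstname": [("John", 3), ("Randall", 2), ("Nehru", 2)]},
    "dates": {"created": [(datetime(2015, 5, 15), 100)]},
    "queries": {},
}
shaped = reshape_facet_counts(sample)
print(shaped["dates"]["created"])
```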

Serializing faceted results

When a HaystackFacetSerializer class determines what fields to serialize, it will check the serialize_objects class attribute to see if it is True or False. Setting this value to True will add an additional objects field to the serialized results, which will contain the results for the faceted SearchQuerySet. The results will by default be serialized using the view’s serializer_class. If you wish to use a different serializer for serializing the results, set the drf_haystack.mixins.FacetMixin.facet_objects_serializer_class class attribute to whatever serializer you want to use, or override the drf_haystack.mixins.FacetMixin.get_facet_objects_serializer_class() method.

Example faceted results with paginated serialized objects

{
  "fields": {
    "firstname": [
      {"...": "..."}
    ],
    "lastname": [
      {"...": "..."}
    ]
  },
  "dates": {
    "created": [
      {"...": "..."}
    ]
  },
  "queries": {},
  "objects": {
    "count": 3,
    "next": "http://example.com/api/v1/search/facets/?page=2&selected_facets=firstname_exact%3AJohn",
    "previous": null,
    "results": [
      {
        "lastname": "Baker",
        "firstname": "John",
        "full_name": "John Baker",
        "text": "John Baker\n"
      },
      {
        "lastname": "McClane",
        "firstname": "John",
        "full_name": "John McClane",
        "text": "John McClane\n"
      }
    ]
  }
}

Setting up the view

Any view that inherits the drf_haystack.mixins.FacetMixin will have a special action route added as ^<view-url>/facets/$. This view action will not care about regular filtering but will by default use the HaystackFacetFilter to perform filtering.

Note

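In order to avoid confusing the filtering mechanisms in Django Rest Framework, the FacetMixin class has a couple of hooks for dealing with faceting, including:

  • facet_filter_backends: the list of filter backends used to apply faceting to the queryset. Defaults to the HaystackFacetFilter.
  • facet_serializer_class: the HaystackFacetSerializer subclass used to serialize the faceted result.
  • facet_objects_serializer_class: optional; the serializer used for the faceted objects. Defaults to the view’s serializer_class.
  • filter_facet_queryset(): filters the queryset using the backends in facet_filter_backends.
  • get_facet_objects_serializer_class(): returns the serializer class used for the faceted objects.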

To set up a view that responds to regular queries under, for example, ^search/$ and to faceted queries under ^search/facets/$, we could do something like this.

We can also change the name of the query parameter from selected_facets to one of our own choice, such as params or p. To do so, set the facet_query_params_text attribute as shown in the example.

from drf_haystack.filters import (
    HaystackAutocompleteFilter, HaystackFacetFilter, HaystackHighlightFilter
)
from drf_haystack.mixins import FacetMixin
from drf_haystack.viewsets import HaystackViewSet

# MockPerson, SearchSerializer and PersonFacetSerializer come from your
# own project.


class SearchPersonViewSet(FacetMixin, HaystackViewSet):

    index_models = [MockPerson]

    # This will be used to filter and serialize regular queries as well
    # as the results if the `facet_serializer_class` has the
    # `serialize_objects = True` set.
    serializer_class = SearchSerializer
    filter_backends = [HaystackHighlightFilter, HaystackAutocompleteFilter]

    # This will be used to filter and serialize faceted results
    facet_serializer_class = PersonFacetSerializer  # See example above!
    facet_filter_backends = [HaystackFacetFilter]   # This is the default facet filter, and
                                                    # can be left out.
    facet_query_params_text = 'params'  # Default is 'selected_facets'

Narrowing

As we have seen in the examples above, the HaystackFacetSerializer will add a narrow_url attribute to each result it serializes. Follow that link to narrow the search result.

The narrow_url is constructed like this:

  • Read all query parameters from the request
  • Get a list of selected_facets
  • Update the query parameters by adding the current item to selected_facets
  • Pop the drf_haystack.serializers.HaystackFacetSerializer.paginate_by_param parameter, if present, so that a paginated result always starts at the first page.
  • Return a serializers.Hyperlink with URL encoded query parameters

This means that for each drill-down performed, the original query parameters will be kept in order to make the HaystackFacetFilter happy. Additionally, all the previous selected_facets will be kept and applied to narrow the SearchQuerySet properly.
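The steps above can be sketched with the standard library alone. This is a rough approximation rather than the serializer's actual code: build_narrow_url is a hypothetical helper, the page parameter name "page" is an assumption, and skipping an already selected facet is inferred from the example output rather than confirmed.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def build_narrow_url(request_url, field, value, page_param="page"):
    """Return request_url narrowed by one more facet (hypothetical helper)."""
    parts = urlsplit(request_url)

    # 1. Read all query parameters from the request.
    params = parse_qs(parts.query)

    # 2. Get the list of already selected facets.
    selected = params.get("selected_facets", [])

    # 3. Add the current item, skipping it if already selected (assumption).
    item = f"{field}_exact:{value}"
    if item not in selected:
        selected = selected + [item]
    params["selected_facets"] = selected

    # 4. Drop the page parameter so the narrowed result starts at page one.
    params.pop(page_param, None)

    # 5. Return the URL with URL-encoded query parameters.
    query = urlencode(params, doseq=True)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


narrowed = build_narrow_url(
    "http://example.com/api/v1/search/facets/"
    "?page=2&selected_facets=firstname_exact%3AJohn",
    "lastname",
    "McLaughlin",
)
print(narrowed)
```

The real serializer returns a serializers.Hyperlink rather than a plain string, but the query string handling follows the same idea.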

Example narrowed result

{
  "queries": {},
  "fields": {
    "firstname": [
      {
        "text": "John",
        "count": 1,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3AJohn&selected_facets=lastname_exact%3AMcLaughlin"
      }
    ],
    "lastname": [
      {
        "text": "McLaughlin",
        "count": 1,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3AJohn&selected_facets=lastname_exact%3AMcLaughlin"
      }
    ]
  },
  "dates": {
    "created": [
      {
        "text": "2015-05-15T00:00:00",
        "count": 1,
        "narrow_url": "http://example.com/api/v1/search/facets/?selected_facets=firstname_exact%3AJohn&selected_facets=lastname_exact%3AMcLaughlin&selected_facets=created_exact%3A2015-05-15+00%3A00%3A00"
      }
    ]
  }
}