Django Haystack with Elasticsearch and Postgres
Haystack provides modular search for Django. It features a unified, familiar API that lets us plug in different search backends (such as Solr and Elasticsearch) without having to modify our code.
The Elasticsearch application built with Haystack in this tutorial can be found on GitHub at Einsteinish/Django-Haystack-Elasticsearch.
Let's install the packages in a virtualenv:
$ virtualenv venv
$ source venv/bin/activate
(venv) $ pip install -r requirements.txt
By default, Postgres uses an authentication scheme called "peer authentication" for local connections. Basically, this means that if the user's operating system username matches a valid Postgres username, that user can log in with no further authentication.
When psql is run without arguments, it also tries to connect to a database named after the current user. Our current user is 'k', and no database with that name exists, so we get the following error:
(venv)k@laptop:~/TEST/DJ2$ psql
psql: FATAL: database "k" does not exist
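The error message names a database "k" even though no database was requested. The following is a minimal sketch of how the client picks its defaults, assuming only the PGUSER and PGDATABASE environment variables are in play (service files and connection strings are ignored). The function name default_psql_target is hypothetical and exists only for illustration:

```python
import getpass
import os


def default_psql_target(environ=None):
    """Approximate the role and database psql uses when run without arguments."""
    env = os.environ if environ is None else environ
    # The role falls back to the operating system user name when PGUSER is unset.
    user = env.get("PGUSER") or getpass.getuser()
    # The database falls back to the role name when PGDATABASE is unset.
    dbname = env.get("PGDATABASE") or user
    return {"user": user, "dbname": dbname}
```

Called as default_psql_target({"PGUSER": "k"}), the sketch returns the role k and the database k, which matches the database name in the error. Passing an explicit database, for example psql -d postgres, avoids the fallback.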
During the Postgres installation, an operating system user named postgres was created to correspond to the postgres administrative user in PostgreSQL. We need to switch to this user to perform administrative tasks:
(venv)k@laptop:~/TEST/DJ2$ sudo su - postgres
We should now be in a shell session for the postgres user. Log into a Postgres session by typing:
postgres@laptop:~$ psql
psql (9.3.9)
Type "help" for help.
First, we will create a database for our Django project (haystack-elasticsearch). Each project should have its own isolated database for security reasons. We will call our database search_app_db, and then create a database user named k to connect with:
postgres=# CREATE DATABASE search_app_db;
CREATE DATABASE
postgres=# CREATE USER k WITH PASSWORD 'password';
CREATE ROLE
Afterwards, we'll modify a few of the connection parameters for the user we just created. This speeds up database operations because the correct values no longer have to be queried and set each time a connection is established.
We're setting the default encoding to UTF-8, which Django expects. We're also setting the default transaction isolation scheme to "read committed", which blocks reads from uncommitted transactions. Lastly, we are setting the timezone. By default, our Django projects will be set to use UTC:
postgres=# ALTER ROLE k SET client_encoding TO 'utf8';
ALTER ROLE
postgres=# ALTER ROLE k SET default_transaction_isolation TO 'read committed';
ALTER ROLE
postgres=# ALTER ROLE k SET timezone TO 'UTC';
ALTER ROLE
Now, all we need to do is give our database user access rights to the database we created:
postgres=# GRANT ALL PRIVILEGES ON DATABASE search_app_db TO k;
GRANT
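With the database and user in place, Django still has to be pointed at them. The excerpt below is a sketch of what the DATABASES entry in settings.py might look like; it is not taken from the original project. The database name, user, and password come from the commands above, while the HOST value and the engine name (the one used by older Django releases, newer ones use django.db.backends.postgresql) are assumptions:

```python
# settings.py (excerpt): a sketch, not the project's actual configuration.
DATABASES = {
    "default": {
        # Assumption: engine name used by older Django releases.
        "ENGINE": "django.db.backends.postgresql_psycopg2",
        "NAME": "search_app_db",   # database created with CREATE DATABASE
        "USER": "k",               # role created with CREATE USER
        "PASSWORD": "password",    # password given to CREATE USER
        "HOST": "localhost",       # assumption: connect over TCP on this machine
        "PORT": "",                # empty string means the default port
    }
}
```

Setting HOST to localhost makes the connection go over TCP rather than the local socket, so the password is typically what authenticates the user instead of peer authentication.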
We can list the databases using \l:
postgres=# \l
                                  List of databases
     Name      |  Owner   | Encoding |   Collate   |    Ctype    |   Access privileges
---------------+----------+----------+-------------+-------------+-----------------------
 bogotobogo    | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 bogotobogo2   | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 search_app_db | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =Tc/postgres         +
               |          |          |             |             | postgres=CTc/postgres+
               |          |          |             |             | sfvue=CTc/postgres   +
               |          |          |             |             | k=CTc/postgres
 searchdb      | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
               |          |          |             |             | postgres=CTc/postgres
 template1     | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/postgres          +
               |          |          |             |             | postgres=CTc/postgres
 test1         | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 testdb        | postgres | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
(9 rows)
As with most Django applications, we should add haystack to the INSTALLED_APPS within our settings.py.
INSTALLED_APPS = (
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.sites',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'django.contrib.admin',
    'haystack',
    'search_app',
)
Now, let's add the Haystack connection settings to settings.py and set a default index name.
# HAYSTACK settings
# BaseSignalProcessor is Haystack's default; it does not update the index automatically.
HAYSTACK_SIGNAL_PROCESSOR = 'haystack.signals.BaseSignalProcessor'
HAYSTACK_SEARCH_RESULTS_PER_PAGE = 12
HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': 'http://127.0.0.1:9200/',
        'INDEX_NAME': 'haystack',
    },
}
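The signal processor setting decides when the index is updated. With haystack.signals.BaseSignalProcessor, saving or deleting a model does not touch the index. If the index should follow every save and delete, Haystack also ships a RealtimeSignalProcessor. The excerpt below is a sketch of that alternative; whether the original project uses it is an assumption:

```python
# settings.py (excerpt): an alternative to the value shown above.
# RealtimeSignalProcessor updates the search index whenever a model
# instance is saved or deleted, at the cost of extra work per request.
HAYSTACK_SIGNAL_PROCESSOR = "haystack.signals.RealtimeSignalProcessor"
```

The trade-off is latency: each write to the database also triggers a call to the search backend, so sites with heavy write traffic often prefer to rebuild or update the index on a schedule instead.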
SearchIndex objects are the way Haystack determines what data should be placed in the search index and handles the flow of data in. We can think of them as being similar to Django Models or Forms in that they are field-based and manipulate/store data.
We generally create a unique SearchIndex for each type of Model we wish to index, though we can reuse the same SearchIndex between different models if we take care in doing so and our field names are very standardized.
To build a SearchIndex, all that's necessary is to subclass both indexes.SearchIndex and indexes.Indexable, define the fields we want to store data with, and define a get_model method.
We'll create the following DocumentIndex to correspond to our Document model. This code generally goes in a search_indexes.py file within the app it applies to, which allows Haystack to pick it up automatically, though that location is not required. The DocumentIndex should look like this (search_app/search_indexes.py):
from haystack import indexes

from search_app.models import Document


class DocumentIndex(indexes.SearchIndex, indexes.Indexable):
    # The primary field the search engine searches; built from a data template.
    text = indexes.CharField(document=True, use_template=True)

    def get_model(self):
        return Document
We're also providing use_template=True on the text field. This allows us to use a data template (rather than error-prone concatenation) to build the document the search engine will search. We'll need to create a new template inside our template directory, search_app/templates/search/indexes/search_app/document_text.txt.
We need to place the following into that template:
{{ object.name }}
{{ object.body }}
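To make the effect of the data template concrete, here is a plain-Python sketch of the text it produces for one object. The Document dataclass is a hypothetical stand-in for the Django model, limited to the two fields the template refers to, and render_document_text only imitates the template; it is not how Haystack renders it internally:

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Hypothetical stand-in for the Django model (only name and body)."""
    name: str
    body: str


def render_document_text(obj):
    """Imitate the data template: the name and the body separated by whitespace."""
    # A newline is used as the separator in this sketch.
    return "{}\n{}".format(obj.name, obj.body)


doc = Document(name="Haystack notes", body="Modular search for Django.")
indexed_text = render_document_text(doc)
```

Here indexed_text holds both fields in a single string, which is what ends up in the text field of the index, so a query can match words from either the name or the body.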
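To integrate Haystack with the Django admin, create search_app/search_sites.py inside our application. Note that haystack.autodiscover() is a Haystack 1.x convention; Haystack 2.x, whose API the settings and SearchIndex above use, discovers search_indexes.py files on its own, so this file is likely unnecessary there: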
import haystack

haystack.autodiscover()
Within our URLconf (search_app/urls.py), add the haystack.urls line shown at the end of urlpatterns:
from django.conf.urls import patterns, include, url
from django.contrib import admin

from search_app import settings

# Enable the admin.
admin.autodiscover()

urlpatterns = patterns('',
    url(r'^$', 'search_app.views.home', name='home'),
    url(r'^about/', 'search_app.views.about', name='about'),
    url(r'^admin/', include(admin.site.urls)),
    (r'^media/(?P<path>.*)$', 'django.views.static.serve',
        {'document_root': settings.MEDIA_ROOT}),
    # Pull in Haystack's default search URLconf.
    (r'^search/', include('haystack.urls')),
)
This will pull in the default URLconf for Haystack. It consists of a single URLconf that points to a SearchView instance. We can change this class's behavior by passing it any of several keyword arguments or override it entirely with our own view.
Our search template (search_app/templates/search/search.html for the default case) will likely be very simple. The following is enough to get going (our template/block names will likely differ):
{% extends 'index.html' %}
{% block content %}
<div class="page-header">
  <h3> Search Results </h3>
</div>
<div class="row">
  <div class="span12">
    <table class="table table-bordered">
      <thead>
        <tr>
          <th> Name </th>
          <th> Text </th>
        </tr>
      </thead>
      <tbody>
        {% for result in page.object_list %}
        <tr>
          <td>{{ result.object.name }} </td>
          <td>{{ result.object.body }} </td>
        </tr>
        {% empty %}
        <tr><td colspan="2">No results found.</td></tr>
        {% endfor %}
      </tbody>
    </table>
  </div>
</div>
{% endblock %}
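The template loops over page.object_list, which holds only the results for the current page. With HAYSTACK_SEARCH_RESULTS_PER_PAGE set to 12, the slicing can be pictured with the plain-Python sketch below. It is a stand-in for illustration only (Haystack's default view relies on Django's paginator), and the names page_slice and hits are hypothetical:

```python
import math

RESULTS_PER_PAGE = 12  # mirrors HAYSTACK_SEARCH_RESULTS_PER_PAGE


def page_slice(results, page_number, per_page=RESULTS_PER_PAGE):
    """Return the results for a 1-based page number and the total page count."""
    total_pages = max(1, math.ceil(len(results) / per_page))
    if not 1 <= page_number <= total_pages:
        raise ValueError("page number out of range")
    start = (page_number - 1) * per_page
    return results[start:start + per_page], total_pages


hits = ["doc-{}".format(i) for i in range(1, 31)]  # 30 fake search results
second_page, pages = page_slice(hits, 2)
```

With 30 results and 12 per page there are 3 pages, and second_page holds doc-13 through doc-24; the last page holds the remaining 6 results.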
With the default URL configuration, the search form has to send a GET request to /search with the query in a parameter named q. We set this up in search_app/templates/index.html:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <link rel="stylesheet" href='/media/styles/bootstrap.min.css' type="text/css" media="screen"/>
  <link rel="stylesheet" href='/media/styles/bootstrap-responsive.min.css' type="text/css" media="screen"/>
  <title>Bogotobogo Django Haystack with Elasticsearch</title>
  <style type="text/css">
    body { padding-top: 60px; }
  </style>
</head>
<body>
  <div class="navbar navbar-fixed-top">
    <div class="navbar-inner">
      <div class="container">
        <a href="/" class="brand">Bogotobogo Django Haystack with Elasticsearch</a>
        <div class="nav-collapse">
          <ul class="nav">
            <li><a href="/">Home</a></li>
            <li><a href="/about">About</a></li>
            <li><a href="/admin">Admin</a></li>
          </ul>
          <ul class="nav pull-right">
            <li class="divider-vertical"></li>
            <form action="/search" method="get" class="navbar-search pull-left">
              <input type="text" placeholder="Search" class="search-query span2" name="q">
            </form>
          </ul>
        </div> <!-- /.nav-collapse -->
      </div>
    </div> <!-- /navbar-inner -->
  </div>
  <div class="container">
    {% block content %}
    <div class="row">
      <div class="span12">
        <div class="hero-unit">
          <h1> Bogotobogo Django Haystack with Elasticsearch.</h1>
          <br/>
          <p> This example illustrates basic search features of <a href="http://www.einsteinish.com" target="_blank">einsteinish.com</a> (Search as a service powered by <a href="http://www.elasticsearch.org" target="_blank">Elasticsearch</a>). </p>
          <p> Each CRUD operation on documents is reflected in the search index in real time.</p>
          <p> To test Einsteinish's search features, please register and create resources/topics.</p>
          <p> The search feature is using <a href="http://haystacksearch.org/" target="_blank">Haystack</a> modular search for Django.
          </p>
        </div>
      </div>
    </div>
    {% endblock %}
    <hr>
    <footer class="footer"><p>2016 einsteinish.com</p> <p>Built with<a href="http://twitter.github.com/bootstrap" target="_blank"> Bootstrap</a></p></footer>
  </div>
</body>
</html>
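Submitting the form in the navbar produces a GET request whose query string carries the q parameter. The sketch below builds the same URL by hand using only the standard library; the helper name build_search_url is hypothetical, and the /search/ base path is an assumption based on the URLconf pattern shown earlier:

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(query, base="/search/"):
    """Build the URL a browser requests when the search form is submitted."""
    # urlencode escapes the query text (spaces become '+').
    return "{}?{}".format(base, urlencode({"q": query}))


url = build_search_url("django haystack")
# Decode the query string again to confirm the round trip.
params = parse_qs(urlsplit(url).query)
```

Here url is /search/?q=django+haystack, and params maps q back to the original text, which is the value the search view reads from the request.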
Please visit einsteinish.com to see Haystack with Elasticsearch in action.