
Flask-Full Text Search

Introduction to full text search engines

We are going to let our database deal with the regular data, and we are going to create a specialized database that is dedicated to text searches. Whoosh is a full text search engine written in pure Python, so it will install anywhere a Python interpreter is available. The disadvantage is that its search performance will not be up to par with engines that are written in C or C++. We are going to use Flask-WhooshAlchemy, a Flask extension that integrates a Whoosh database with Flask-SQLAlchemy models.
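To make the idea of a dedicated text search database more concrete, here is a toy inverted index, the kind of data structure full text engines are built around. This is only a sketch under our own assumptions: TinyIndex and its methods are names invented for this illustration, it is not how Whoosh is implemented, and it ignores ranking, stemming and persistence.

```python
import re
from collections import defaultdict


class TinyIndex:
    """Toy inverted index: maps each lowercase word to the ids of the documents containing it."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Split the text into words, ignoring case and punctuation.
        for word in re.findall(r"\w+", text.lower()):
            self.postings[word].add(doc_id)

    def search(self, query, limit=None):
        # A document matches only if it contains every word of the query.
        words = re.findall(r"\w+", query.lower())
        if not words:
            return []
        matches = set.intersection(*(self.postings.get(w, set()) for w in words))
        # Return ids in a stable order, optionally capped at `limit` results.
        return sorted(matches)[:limit]


index = TinyIndex()
index.add(1, "my first post")
index.add(2, "my second post")
index.add(3, "my third and last post")

print(index.search("post"))         # ids of every document containing "post"
print(index.search("second post"))  # ids of documents containing both words
```

Looking up a word is a dictionary access instead of a scan over every row, which is why this work is handed to a separate index rather than to the regular database.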

Configuration

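The configuration for Flask-WhooshAlchemy is pretty simple. We just need to tell the extension the name of the full text search database. The line below assumes that os is imported and that basedir, the base directory of the application, is already defined in the configuration file.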

WHOOSH_BASE = os.path.join(basedir, 'search.db')

Models changes

Since Flask-WhooshAlchemy integrates with Flask-SQLAlchemy, we indicate what data is to be indexed for searching in the proper model class:

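from app import app, db

import sys
if sys.version_info >= (3, 0):
    # in this setup full text search is only enabled on Python 2
    enable_search = False
else:
    enable_search = True
    # the flask.ext.* import style was removed in Flask 1.0
    import flask_whooshalchemy as whooshalchemy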

class Post(db.Model):
    __searchable__ = ['body']

    id = db.Column(db.Integer, primary_key=True)
    body = db.Column(db.String(140))
    timestamp = db.Column(db.DateTime)
    user_id = db.Column(db.Integer, db.ForeignKey('user.id'))

    def __repr__(self):
        return '<Post %r>' % (self.body)

if enable_search:
    whooshalchemy.whoosh_index(app, Post)

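The model has a new __searchable__ field, which is a list of all the database fields that will be in the searchable index. In our case we only want to index the body field of our posts. The index itself is initialized by the call to whoosh_index at the bottom of the file, which only runs when enable_search is true.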
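The role of the __searchable__ list can be pictured with a small stand-alone sketch. FakePost and text_to_index are hypothetical names made up for this illustration; FakePost is a plain Python class, not a SQLAlchemy model, and the function only mimics the idea of picking out the listed fields, not the actual internals of Flask-WhooshAlchemy.

```python
class FakePost:
    """Stand-in for a model class; not a real SQLAlchemy model."""
    __searchable__ = ['body']

    def __init__(self, id, body, private_note):
        self.id = id
        self.body = body
        self.private_note = private_note


def text_to_index(obj):
    """Collect only the attributes named in the class's __searchable__ list."""
    fields = getattr(type(obj), '__searchable__', [])
    return {name: getattr(obj, name) for name in fields}


post = FakePost(1, 'my first post', 'not meant for the index')
print(text_to_index(post))  # only the body field is selected
```

Fields that are not listed, such as private_note here, never reach the index and so can never produce a search hit.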

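Posts that were created before indexing was enabled may not be present in the full text index. To make sure the database and the full text engine are synchronized, we are going to delete all posts from the database and start over. From the Python prompt: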

from app.models import Post
from app import db
for post in Post.query.all():
    db.session.delete(post)
db.session.commit()

Searching

And now we are ready to start. First, add a few new posts to the database. We can do this from the Python prompt as well (this assumes that a user with id 1 already exists):

from app.models import User, Post
from app import db
import datetime
u = User.query.get(1)
p = Post(body='my first post', timestamp=datetime.datetime.utcnow(), author=u)
db.session.add(p)
p = Post(body='my second post', timestamp=datetime.datetime.utcnow(), author=u)
db.session.add(p)
p = Post(body='my third and last post', timestamp=datetime.datetime.utcnow(), author=u)
db.session.add(p)
db.session.commit()

Integrating full text searches into the application

To make the searching capability available to our application’s users we have to add just a few small changes.

Configuration

As far as configuration goes, we’ll just set the maximum number of search results that should be returned:

MAX_SEARCH_RESULTS = 50

SearchForm

The search form has a single required text field:

class SearchForm(Form):
    search = StringField('search', validators=[DataRequired()])

Then we need to create a search form object and make it available to all templates, since we will be putting the search form in the navigation bar that is common to all pages. The easiest way to achieve this is to create the form in the before_request handler and then store it in Flask’s global g:

from forms import SearchForm

@app.before_request
def before_request():
    g.user = current_user
    if g.user.is_authenticated():
        g.user.last_seen = datetime.utcnow()
        db.session.add(g.user)
        db.session.commit()
        g.search_form = SearchForm()

Then we add the form to our template:

<div>Microblog:
    <a href="{{ url_for('index') }}">Home</a>
    {% if g.user.is_authenticated() %}
    | <a href="{{ url_for('user', nickname=g.user.nickname) }}">Your Profile</a>
    | <form style="display: inline;" action="{{ url_for('search') }}" method="post" name="search">{{ g.search_form.hidden_tag() }}{{ g.search_form.search(size=20) }}<input type="submit" value="Search"></form>
    | <a href="{{ url_for('logout') }}">Logout</a>
    {% endif %}
</div>

We only display the form when a user is logged in. Likewise, the before_request handler only creates the form for a logged-in user, since our application does not show any content to guests who are not authenticated.

Search view function

@app.route('/search', methods=['POST'])
@login_required
def search():
    if not g.search_form.validate_on_submit():
        return redirect(url_for('index'))
    return redirect(url_for('search_results', query=g.search_form.search.data))

This function doesn’t really do much. It just collects the search query from the form and then redirects to another page, passing the query as an argument. The reason the search work isn’t done directly here is that if the user then hit the refresh button, the browser would put up a warning indicating that form data will be resubmitted. This is avoided when the response to a POST request is a redirect, because after the redirect the browser’s refresh button simply reloads the redirected page with a GET request.
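The redirect-after-POST behavior described above can be tried without a browser. The following is a minimal, framework-free sketch that uses only the Python standard library. The names prg_app and call are invented for this illustration and are not part of our application or of Flask; the sketch only imitates the two routes to show the flow of the requests.

```python
import io
from urllib.parse import parse_qs, quote, unquote
from wsgiref.util import setup_testing_defaults


def prg_app(environ, start_response):
    """Tiny WSGI app: POST /search redirects, GET /search_results/<query> renders."""
    method = environ["REQUEST_METHOD"]
    path = environ["PATH_INFO"]

    if method == "POST" and path == "/search":
        # Read the submitted form body and pull out the 'search' field.
        length = int(environ.get("CONTENT_LENGTH") or 0)
        form = parse_qs(environ["wsgi.input"].read(length).decode("utf-8"))
        query = form.get("search", [""])[0]
        # Answer the POST with a redirect instead of rendering a page.
        location = "/search_results/" + quote(query, safe="")
        start_response("302 Found", [("Location", location)])
        return [b""]

    if method == "GET" and path.startswith("/search_results/"):
        # This page can be reloaded safely: no form data is involved.
        query = path[len("/search_results/"):]
        body = ("Search results for: " + query).encode("utf-8")
        start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
        return [body]

    start_response("404 Not Found", [("Content-Type", "text/plain")])
    return [b"not found"]


def call(app, method, path, body=b""):
    """Invoke a WSGI app directly and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ.update({
        "REQUEST_METHOD": method,
        "PATH_INFO": unquote(path),  # a real server decodes the path as well
        "CONTENT_LENGTH": str(len(body)),
        "wsgi.input": io.BytesIO(body),
    })
    captured = {}

    def start_response(status, headers):
        captured["status"] = status
        captured["headers"] = dict(headers)

    chunks = app(environ, start_response)
    return captured["status"], captured["headers"], b"".join(chunks)


# Step 1: the form is submitted with POST and the answer is a redirect.
status, headers, _ = call(prg_app, "POST", "/search", b"search=my+first+post")
print(status, headers["Location"])

# Step 2: the browser follows the redirect with a plain GET.
status, _, page = call(prg_app, "GET", headers["Location"])
print(status, page.decode("utf-8"))
```

The page the user ends up on is the result of the GET in step 2, so refreshing it repeats only that request and never the form submission.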

Search results page

from config import MAX_SEARCH_RESULTS

@app.route('/search_results/<query>')
@login_required
def search_results(query):
    results = Post.query.whoosh_search(query, MAX_SEARCH_RESULTS).all()
    return render_template('search_results.html',
                           query=query,
                           results=results)

This view runs the query against the full text index stored in search.db and returns at most MAX_SEARCH_RESULTS matching posts.

The search results template:

<!-- extend base layout -->
{% extends "base.html" %}

{% block content %}
  <h1>Search results for "{{ query }}":</h1>
  {% for post in results %}
      {% include 'post.html' %}
  {% endfor %}
{% endblock %}