caching optimization flask sqlalchemy flask-admin

How to improve Flask-Admin performance for data-intensive forms / general optimization advice?


I have some models:

class Paper(db.Model):
    # Regarding lack of unique constraints -- the legacy database had
    # much duplicate data, and I hope to eventually eliminate that duplicate
    # data and then add uniqueness constraints on some columns later.
    __tablename__ = 'papers'
    __searchable__ = ['title', 'abstract', 'keywords']
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(500))
    abstract = db.Column(db.Text, nullable=True)
    doi = db.Column(db.String(50), nullable=True)
    pubmed_id = db.Column(db.String(50), nullable=True)
    link = db.Column(db.String(500), nullable=True)
    journals = db.relationship(Journal, secondary="journal_paper")
    chapters = db.relationship(Chapter, secondary="chapter_paper")
    authors = db.relationship(Author, secondary="author_paper")
    keywords = db.relationship(Keyword, secondary="keyword_paper")
    ...

I have a custom model view:

class PaperModelView(MainModelView):  # Looked up by advanced search with check if __repr__ is "PaperModelView object"
    page_size = 100
    column_list = (
        'title',
        'chapter_paper_assoc',
    )
    column_exclude_list = (
        'keyword_paper_assoc',
        'author_paper_assoc',
        'link',
        'doi',
        'pubmed_id',
    )
    column_details_list = (
        'title',
        'chapter_paper_assoc',
        'journal_paper_assoc',
        'authors',
        'abstract',
        'keywords',
        'doi',
        'pubmed_id',
        'link',
    )
    column_labels = {
        'chapter_paper_assoc': 'Print Information',
        'journal_paper_assoc': 'Publication Information',
    }
    column_filters = [
        'chapter_paper_assoc.printed',
        'journal_paper_assoc.publication_date',
        'chapters.name',
        'chapters.number',
        'journals.name',
        'authors.first_name',
        'authors.last_name',
        'keywords.keyword',
        'doi',
        'abstract',
        'link',
        'title',
    ]
    column_filter_labels = {
        'chapter_paper_assoc.printed': 'Printed',
        'journal_paper_assoc.publication_date': 'Publication Date',
        'chapters.name': 'Chapter Name',
        'chapters.number': 'Chapter Number',
        'journals.name': 'Journal Name',
        'authors.first_name': 'Author First Name',
        'authors.last_name': 'Author Last Name',
        'keywords.keyword': 'Keyword',
        'doi': 'DOI',
        'abstract': 'Abstract',
        'link': 'Link',
        'title': 'Paper Title',
    }
    column_details_exclude_list = (
        'keyword_paper_assoc',
        'author_paper_assoc',
    )
    form_excluded_columns = (
        'chapter_paper_assoc',
        'journal_paper_assoc',
        'keyword_paper_assoc',
        'author_paper_assoc',
    )
    column_formatters = {
        'journals': macro('render_journals'),
        'chapters': macro('render_chapters'),
        'authors': macro('render_authors'),
        'keywords': macro('render_keywords'),
        'chapter_paper_assoc': macro('render_chapter_papers'),
        'journal_paper_assoc': macro('render_journal_papers'),
    }
    column_formatters_export = {
        'journals': formatter,
        'chapters': formatter,
        'authors': formatter,
        'keywords': formatter,
        'chapter_paper_assoc': formatter,
        'journal_paper_assoc': formatter,
    }
    column_searchable_list = (
        'title',
        'keywords.keyword',
    )
    form_ajax_refs = {
        'authors': {
            'fields': ['first_name', 'middle_name', 'last_name'],
            'page_size': 10,
        },
        'chapters': {
            'fields': ['number', 'name',],
            'page_size': 5,
        },
        'journals': {
            'fields': ['name', 'issn'],
            'page_size': 2,
        },
        'keywords': {
            'fields': ['keyword',],
            'page_size': 10,
        }
    }

    def unique_title(form, field):
        p = Paper.query.filter_by(title=field.data).first()

        if p:
            raise ValidationError("A Paper with this title already exists")

    def unique_doi(form, field):
        p = Paper.query.filter_by(doi=field.data).first()

        if p:
            raise ValidationError("A Paper with this doi already exists")

    form_args = {
        'title': {
            'validators': [unique_title, DataRequired()]
        },
        'doi': {
            'validators': [unique_doi],
        },
        'link': {
            'validators': [DataRequired(), URL()],
        },
        'journals': {
            'validators': [Length(0,1)],  # Though it is a many-to-many field, allow only 0 or 1
        }  # 0 or 1 Journals assumption is carried to on_model_change (be careful if changing)
    }
    form_base_class = FlaskForm
    list_template = 'auth/model/paper/list.html'

    def on_model_change(self, form, model, is_created):
        """
        Perform some actions before a model is created or updated.
        Called from create_model and update_model in the same transaction (if it has any meaning for a store backend).
        By default does nothing.

        Parameters:
        form – Form used to create/update model
        model – Model that will be created/updated
        is_created – Will be set to True if model was created and to False if edited
        """

        all_chapters = list(set(form.chapters.data + form.chapters_printed.data))
        for chapter in all_chapters:

            if chapter in form.chapters_printed.data:  # if chapter in both, printed takes priority
                chapter_paper = ChapterPaper.query.filter_by(chapter_id=chapter.id, paper_id=model.id).first()

                if not chapter_paper:
                    chapter_paper = ChapterPaper(chapter_id=chapter.id, paper_id=model.id)

                chapter_paper.printed = True
                db.session.add(chapter_paper)

        journal = None
        if form.journals.data:
            journal = form.journals.data[0]

        if journal:  # Assumes only 1 journal if there are any journals in this field
            issue = form.issue.data
            volume = form.volume.data
            pages = form.pages.data
            journal_paper = JournalPaper.query.filter_by(journal_id=journal.id, paper_id=model.id).first()

            if not journal_paper:
                journal_paper = JournalPaper(journal_id=journal.id, paper_id=model.id)

            journal_paper.issue = issue
            journal_paper.volume = volume
            journal_paper.pages = pages
            db.session.add(journal_paper)

        db.session.commit()

    @expose('/new/', methods=('GET', 'POST'))
    def create_view(self):
        """
            Create model view
        """
        return_url = get_redirect_target() or self.get_url('.index_view')

        if not self.can_create:
            return redirect(return_url)

        form = ExtendedPaperForm()
        if not hasattr(form, '_validated_ruleset') or not form._validated_ruleset:
            self._validate_form_instance(ruleset=self._form_create_rules, form=form)

        if self.validate_form(form):
            # in versions 1.1.0 and before, this returns a boolean
            # in later versions, this is the model itself
            model = self.create_model(form)
            if model:
                flash('Record was successfully created', 'success')
                if '_add_another' in request.form:
                    return redirect(request.url)
                elif '_continue_editing' in request.form:
                    # if we have a valid model, try to go to the edit view
                    if model is not True:
                        url = self.get_url('.edit_view', id=self.get_pk_value(model), url=return_url)
                    else:
                        url = return_url
                    return redirect(url)
                else:
                    # save button
                    return redirect(self.get_save_return_url(model, is_created=True))

        form_opts = FormOpts(widget_args=self.form_widget_args,
                             form_rules=self._form_create_rules)

        if self.create_modal and request.args.get('modal'):
            template = self.create_modal_template
        else:
            template = self.create_template

        return self.render(template,
                           form=form,
                           form_opts=form_opts,
                           return_url=return_url)

    def on_form_prefill(self, form, id):
        """
            Perform additional actions to pre-fill the edit form.

            Called from edit_view, if the current action is rendering
            the form rather than receiving client side input, after
            default pre-filling has been performed.

            By default does nothing.

            You only need to override this if you have added custom
            fields that depend on the database contents in a way that
            Flask-admin can't figure out by itself. Fields that were
            added by name of a normal column or relationship should
            work out of the box.

            :param form:
                Form instance
            :param id:
                id of the object that is going to be edited
        """
        model = Paper.query.filter_by(id=id).first()
        form.title.data = model.title
        form.abstract.data = model.abstract
        form.pubmed_id.data = model.pubmed_id
        form.doi.data = model.doi
        form.link.data = model.link
        form.chapters.data = model.chapters
        form.authors.data = model.authors
        form.keywords.data = model.keywords
        form.chapters_printed.data = model.get_printed_chapters()
        form.journals.data = model.journals

        journal = model.journals[0] if model.journals else None  # avoid IndexError when the paper has no journal
        if journal:
            journal_paper = JournalPaper.query.filter_by(paper_id=model.id, journal_id=journal.id).first()

            if journal_paper:
                form.issue.data = journal_paper.issue
                form.volume.data = journal_paper.volume
                form.pages.data = journal_paper.pages
                form.publication_date.data = journal_paper.publication_date

    @expose('/edit/', methods=('GET', 'POST'))
    def edit_view(self):
        """
            Edit model view
        """
        return_url = get_redirect_target() or self.get_url('.index_view')

        if not self.can_edit:
            return redirect(return_url)

        id = get_mdict_item_or_list(request.args, 'id')
        if id is None:
            return redirect(return_url)

        model = self.get_one(id)

        if model is None:
            flash('Record does not exist!', 'error')
            return redirect(return_url)

        form = ExtendedPaperForm()

        if not hasattr(form, '_validated_ruleset') or not form._validated_ruleset:
            self._validate_form_instance(ruleset=self._form_edit_rules, form=form)

        if self.validate_form(form):
            if self.update_model(form, model):
                flash('Record was successfully saved', 'success')
                if '_add_another' in request.form:
                    return redirect(self.get_url('.create_view', url=return_url))
                elif '_continue_editing' in request.form:
                    return redirect(request.url)
                else:
                    # save button
                    return redirect(self.get_save_return_url(model, is_created=False))

        if request.method == 'GET':
            self.on_form_prefill(form, id)

        form_opts = FormOpts(widget_args=self.form_widget_args,
                             form_rules=self._form_edit_rules)

        if self.edit_modal and request.args.get('modal'):
            template = self.edit_modal_template
        else:
            template = self.edit_template

        return self.render(template,
                           model=model,
                           form=form,
                           form_opts=form_opts,
                           return_url=return_url)

    @property
    def can_create(self):
        return current_user.can_edit_papers()

    @property
    def can_edit(self):
        return current_user.can_edit_papers()

    @property
    def can_delete(self):
        return current_user.can_edit_papers()

    def is_accessible(self):
        if current_user.is_authenticated:
            return current_user.can_view_papers()

        return False

    # Add a batch action to this class using flask admin's syntax
    @action('cite', 'Cite', 'Create Citations for the Selected Papers?')
    def action_cite(self, ids):
        try:
            query = Paper.query.filter(Paper.id.in_(ids))
            citations = ""

            for paper in query.all():
                citation = paper.cite('APA')
                citations += "{}\n".format(citation)

            response = make_response(citations)
            response.headers[
                "Content-Disposition"
            ] = "attachment; filename=citations.txt"  # Downloadable response
            return response
        except Exception as ex:
            if not self.handle_view_exception(ex):
                raise

    @action('chapter_add', 'Add to Chapters')
    def action_chapter_add(self, ids):
        return_url = request.referrer
        try:
            return redirect(
                url_for('addtochapterview.index',
                        return_url=return_url), 307
            )
        except Exception as ex:
            if not self.handle_view_exception(ex):
                raise

    @action('chapter_remove', 'Remove From Chapters')
    def action_chapter_remove(self, ids):
        return_url = request.referrer
        try:
            return redirect(
                url_for('removefromchapterview.index',
                        return_url=return_url), 307
            )
        except Exception as ex:
            if not self.handle_view_exception(ex):
                raise

    @action('mark_as_printed', 'Mark As Printed')
    def mark_printed(self, ids):
        return_url = request.referrer
        try:
            return redirect(
                url_for('markasprintedview.index',
                        return_url=return_url), 307
            )
        except Exception as ex:
            if not self.handle_view_exception(ex):
                raise

    @action('mark_as_unprinted', 'Mark As NOT Printed')
    def unmark_printed(self, ids):
        return_url = request.referrer
        try:
            return redirect(
                url_for('markasunprintedview.index',
                        return_url=return_url), 307
            )
        except Exception as ex:
            if not self.handle_view_exception(ex):
                raise

I have a custom form:

class ExtendedPaperForm(FlaskForm):
    title = StringField()
    abstract = TextAreaField()
    doi = StringField()
    pubmed_id = StringField()
    link = StringField()
    journals = QuerySelectMultipleField(
        query_factory=_get_model(Journal),
        allow_blank=False,
    )
    issue = StringField()
    volume = StringField()
    pages = StringField()
    publication_date = DateField(format='%Y-%m-%d')
    authors = QuerySelectMultipleField(
        query_factory=_get_model(Author),
        allow_blank=False,
    )
    keywords = QuerySelectMultipleField(
        query_factory=_get_model(Keyword),
        allow_blank=True,
    )
    chapters_printed = QuerySelectMultipleField(
        query_factory=_get_model(Chapter),
        allow_blank=True,
        label="Chapters (Printed)",
    )
    chapters = QuerySelectMultipleField(
        query_factory=_get_model(Chapter),
        allow_blank=True,
        label="Chapters (All)",
    )

with the custom query factory:

def _get_model(model, order=None):
    if order:
        return lambda: db.session.query(model).order_by(order)
    return lambda: db.session.query(model)

This all works very well, except that it is very slow. The time from when the user clicks the "Create" button to when the server returns a response is, on average, about 1.1 minutes. That's too long :/.

So, I think my options are:

1) Write a better query factory function

2) Memoize the query factories for each field of the form

3) Some combination of the two

I don't know how to do any of these options. How would one even write a better query factory here? I am pretty sure that there isn't much to do on this end, since I need every option to be available for each field. I have used Flask-Cache before on a simpler project, but I don't know how to apply that knowledge here.

I imagine that I could simply memoize the query functions -- but I don't know where in my model view I would clear the cached values and rebuild them -- how often -- why -- etc.
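
One way to think about the memoization question is a small wrapper that remembers a factory's result for a fixed time and can also be cleared by hand. This is only a sketch under stated assumptions: `CachedFactory`, `ttl_seconds`, and `load_authors` are hypothetical names, and the loader is a stand-in for a real query such as `db.session.query(Author).all()`. Cached ORM objects can also end up detached from the current session, so this would need checking against the real app.

```python
import time
from typing import Any, Callable, List, Optional


class CachedFactory:
    """Memoize a zero-argument factory for `ttl_seconds` (hypothetical helper)."""

    def __init__(self, factory: Callable[[], List[Any]],
                 ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._factory = factory
        self._ttl = ttl_seconds
        self._clock = clock
        self._value: Optional[List[Any]] = None
        self._expires_at = 0.0

    def __call__(self) -> List[Any]:
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            # Materialize to a list so the expensive work happens only once
            # per time window instead of on every form render.
            self._value = list(self._factory())
            self._expires_at = now + self._ttl
        return self._value

    def invalidate(self) -> None:
        """Drop the cached value; the next call runs the factory again."""
        self._value = None


# Stand-in for an expensive database query (assumption, not the real query).
calls = {"count": 0}


def load_authors() -> List[str]:
    calls["count"] += 1
    return ["Ada Lovelace", "Alan Turing"]


cached_authors = CachedFactory(load_authors, ttl_seconds=60.0)
first = cached_authors()     # runs load_authors
second = cached_authors()    # served from the cache
cached_authors.invalidate()  # e.g. after an Author row was added
third = cached_authors()     # runs load_authors again
```

As far as I can tell, a `query_factory` only needs to be a callable that returns an iterable of model instances, so an object like this could be passed in place of the lambda returned by `_get_model`. The time limit answers "how often", and `invalidate` covers the case where the data is known to have changed.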

I don't have the strongest web programming vocabulary, so using references and documentation is proving frustrating. Could someone please provide some guidance on how to improve this process? Or like, maybe just the right place to look for inspiration in the flask-admin source code?

P.S.

If you have a general guide for scaling and optimizing a flask app that you think would help, please share in comments. Thanks.


Solution

  • No help here, and yes, sorry that this post was so vague.

    Here are some things that I did that reduced the average client response time from 95,000 ms to the expected / acceptable ~100 ms:

    • Integrated query caching

    • Eliminated joins (used dynamic loading of related objects where possible)

    • Cached the query factories and updated them on model change

    The Flask-DebugToolbar extension is great! You should use it.
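
    The third item (cached query factories that are refreshed on model change) might look roughly like the sketch below. The answer does not show its code, so everything here is an assumption: `FactoryCache`, `register`, `factory`, and `invalidate` are hypothetical names, and a plain list stands in for a database table. In a real view, the `invalidate` call would presumably go in Flask-Admin's `after_model_change` and `after_model_delete` hooks.

```python
from typing import Any, Callable, Dict, List


class FactoryCache:
    """Cache option lists per model name; clear one when that model changes."""

    def __init__(self) -> None:
        self._loaders: Dict[str, Callable[[], List[Any]]] = {}
        self._values: Dict[str, List[Any]] = {}

    def register(self, name: str, loader: Callable[[], List[Any]]) -> None:
        self._loaders[name] = loader

    def get(self, name: str) -> List[Any]:
        if name not in self._values:
            # Copy into a list so the loader only runs on a cache miss.
            self._values[name] = list(self._loaders[name]())
        return self._values[name]

    def factory(self, name: str) -> Callable[[], List[Any]]:
        """Return a zero-argument callable usable as a form's query factory."""
        return lambda: self.get(name)

    def invalidate(self, name: str) -> None:
        self._values.pop(name, None)


# A plain list stands in for the keywords table (assumption for the demo).
keywords_table = ["caching"]

cache = FactoryCache()
cache.register("Keyword", lambda: keywords_table)
keyword_factory = cache.factory("Keyword")

before = keyword_factory()        # loads and caches the current rows
keywords_table.append("flask")    # a new Keyword is saved
stale = keyword_factory()         # still the cached copy
cache.invalidate("Keyword")       # what a model-change hook would call
fresh = keyword_factory()         # reloads and sees the new row
```

    The trade-off is that the form stays fast between edits, at the cost of having to invalidate every place a cached model can change (admin views, batch actions, scripts).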