solr, django-haystack, export-to-csv, searchqueryset

Export Haystack search results


I am trying to export the results of a user's search. I am using Django + Haystack + Solr to produce the search results. Currently, to build the SearchQuerySet that gets written out as CSV, I pass the query parameters from the search results page to the view that produces the CSV and rebuild the SearchQuerySet there. This is a real pain because the search is pretty complicated (facets, multiple models, etc.), and modifying the SearchForm constantly introduces bugs. It seems like there should be an easy way to pass the results directly to the export view. Any suggestions?

EDIT

I figured out my own solution and put all the modified code in the answer. Please see below. Hopefully this prevents someone else from banging their head against the wall for a week!


Solution

  • Okay, I finally figured this out myself. I added a second submit button to the HTML for my SearchForm and used JavaScript to point the form's action at my "search_export" view. Since facets aren't passed along when the form is submitted, I had to get the facets from the request (in the search page template) and pass them to the view, rather hackishly, through the URL. The facets then have to be re-evaluated in the view. All of my code is pasted below:

    search.html

    {% block content %}
    
        <form method="get" action=".">
    
            <!-- Advanced Search Box -->
            <div class="row">
                <div class="col-lg-12">
                    <h2 class="text-center">Advanced Search</h2>
    
                    <!-- Search Form Fields Here -->
    
                    <ul class="list-inline center-block text-center adv-form">
                        <li>
                            <p><input type="submit" value="Search"></p>
                        </li>
                        <li>
                            <p><input type="submit" id="export" value="Export Results"></p>
                        </li>                            
                    </ul>
                </div>
            </div>
            <!-- End Advanced Search Box -->

        </form>
    
            <!-- Search Results are displayed here -->
    
    {% endblock %}
    
    <!-- Search Unique JS -->
    {% block js %}
    
    {{ block.super }}
    
      <script>
        $(document).ready(function () {
    
            $("#export").click(function() {
              $(this).closest("form").attr('action', "{% query_params_getlist request 'selected_facets' as facets %}{% url 'search_export' facets %}");
            });
    
        });
      </script>
    
    {% endblock %}
    <!-- End Search Unique JS -->
    

    urls.py

    from django.conf.urls import patterns, url

    urlpatterns = patterns('base.views',
        # all the other url patterns go here
        url(r'^search_export/(?P<selected_facets>\S+)/$', 'search_export', name='search_export'),    
    )
    

    base_tags.py

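    from django import template

    register = template.Library()

    # On Django 1.9+ use @register.simple_tag instead; it supports the same "as" syntax
    @register.assignment_tag
    def query_params_getlist(request, param):
        # Join every value of a repeated GET parameter into one "&"-separated string
        params = request.GET.getlist(param)
        if params:
            return '&'.join(params)
        # The URL pattern needs a non-empty segment, so return a placeholder
        return 'None'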
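
To make the round trip between the template tag and the view easier to see, here is a minimal plain-Python sketch of it, with no Django involved. The helper names `join_facets` and `split_facets` and the facet names in the usage note are hypothetical; the logic mirrors the tag above and the facet loop in the view. Note the limits of this encoding: a facet value containing "&" would be split in the wrong place, and the `\S+` in the URL pattern does not match whitespace.

```python
def join_facets(params):
    """Mirror of the template tag: one string for the URL, or a placeholder."""
    return "&".join(params) if params else "None"


def split_facets(selected_facets):
    """Mirror of the view: recover (field, value) pairs, skipping malformed parts."""
    pairs = []
    for facet in selected_facets.split("&"):
        if ":" not in facet:
            continue  # covers the 'None' placeholder and empty fragments
        field, value = facet.split(":", 1)
        if value:
            pairs.append((field, value))
    return pairs
```

For example, `split_facets(join_facets(["type_exact:12", "site_exact:7"]))` gives back the two (field, value) pairs, and the 'None' placeholder yields an empty list, so the view applies no facet filters.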
    

    views.py

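    import csv

    from django.http import HttpResponse, HttpResponseRedirect
    from django.utils.encoding import smart_str
    from haystack.query import SQ

    # AdvModelSearchForm and ControlField are project-specific; import them from your own app

    def search_export(request, selected_facets):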
        if request.method == 'GET':
            form = AdvModelSearchForm(request.GET)
            if form.is_valid():
                    qs = form.search()
    
                    #deal with facets
                    facets = selected_facets.split("&")
    
                    for facet in facets:
                        if ":" not in facet:
                            continue
    
                        field, value = facet.split(":", 1)
    
                        if value:
                            # Faceted fields are stored in a hierarchy, so I check for any document with the given facet or its children
                            control_value = ControlField.objects.filter(pk=qs.query.clean(value))
                            if control_value:
                                value_tree = control_value[0].get_descendants(include_self=True)
                                # OR together one SQ per node in the subtree
                                sq = None
                                for node in value_tree:
                                    node_sq = SQ(**{str(field): str(node.id)})
                                    sq = node_sq if sq is None else sq | node_sq
                                qs = qs.filter(sq)
    
                    response = HttpResponse(content_type='text/csv')
                    response['Content-Disposition'] = 'attachment; filename="search_results.csv"'
    
                    writer = csv.writer(response)
                    titles = []
                    rows = []
                    for result in qs:
                        row = []
                        row_dict = {}
                        # Important: this field must be a MultiValueField in the Haystack search_indexes.py file, or this won't return a list
                        properties = result.text
                        for each_prop in properties:
                            prop_pair = each_prop.split(':', 1)
                            if len(prop_pair) < 2:
                                continue
                            prop_name = smart_str(prop_pair[0].strip())
                            prop_value = smart_str(prop_pair[1].strip())
                            if prop_name not in titles:
                                column_index = len(titles)
                                titles.append(prop_name)
                            else:
                                column_index = titles.index(prop_name)
                                if column_index in row_dict:
                                    prop_value = row_dict[column_index] + '; ' + prop_value
                            row_dict[column_index] = prop_value
                        for i in range(len(titles)):
                            if i in row_dict:
                                row.append(row_dict[i])
                            else:
                                row.append('')
                        rows.append(row)
    
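                    writer.writerow(titles)
                    for each_row in rows:
                        # Rows built before a later column was first seen are shorter than the header, so pad them
                        each_row += [''] * (len(titles) - len(each_row))
                        writer.writerow(each_row)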
                    return response
    
        return HttpResponseRedirect('/failed_export/')
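
The column-building part of the view can also be tried out on its own. Below is a minimal Python 3 sketch of that logic, assuming each search result boils down to a list of "name: value" strings (as `result.text` does above). The function names `build_table` and `to_csv` and the sample data in the usage note are hypothetical, and it writes to an in-memory buffer rather than an HttpResponse. Unlike the view, it fills in the rows only after every column name has been seen, so every row has the same width as the header.

```python
import csv
import io


def build_table(records):
    """Turn lists of "name: value" strings into (titles, rows) with aligned columns."""
    titles = []      # column names in first-seen order
    row_dicts = []   # one {column name: value} dict per record
    for properties in records:
        row = {}
        for prop in properties:
            name, sep, value = prop.partition(":")
            if not sep:
                continue  # skip entries that are not "name: value"
            name, value = name.strip(), value.strip()
            if name not in titles:
                titles.append(name)
            # Repeated names within one record are joined with "; "
            row[name] = row[name] + "; " + value if name in row else value
        row_dicts.append(row)
    # Every column is known by now, so missing cells can be filled with ''
    rows = [[row.get(title, "") for title in titles] for row in row_dicts]
    return titles, rows


def to_csv(records):
    """Render the table as CSV text, header row first."""
    titles, rows = build_table(records)
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(titles)
    writer.writerows(rows)
    return buf.getvalue()
```

Calling `build_table([["Title: Vase", "Material: clay", "Material: glaze"], ["Title: Coin", "Period: Roman"]])` produces the titles Title, Material, Period, with the first row's Material cell holding "clay; glaze" and its Period cell left empty.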