Search code examples
drupalviewdrupal-7drupal-views

Drupal views glossary for non latin letters


Well, I have searched on drupal.org and stack overflow and couldn't find a similar problem, I hope i didn't miss something.

I'm building a site in drupal 7 where I have content that needs to be grouped by letters. The views build in glossary seems the best solution and it works well. The only problem is that I use non Latin letters like Č Đ Č Š Ž and the order is messed up. I need the order to be like in Gaj's Latin alphabet:

A B C Č Ć D Dž Đ E F G H I J K L Lj M N Nj O P R S Š T U V Z Ž

as you can see there is also a problem because there are letters with "two letters", but displaying Lj as L would be a solution I could live with. The biggest problem is that C and Ć is displayed under Č but Đ is separated from D, so i get:

A (1) | Č (3) | D (1) | Ž (1) | Đ (1) for example.

Is there an elegant solution for this problem? To seperate the C and Ć from Č or to merge D and Đ. Please help.


Solution

  • I am quite late to the party on this one,

    The bad news : This is entirely due to SQL collation and not Views. The usual database collation for Drupal, utf8_general_ci instructs MySQL to treat those characters as equal in GROUP and WHERE statements.

    In particular; the views query to generate a glossary is currently:

    SELECT SUBSTRING(node.title, 1, 1) AS title_truncated, COUNT(node.nid)
    AS num_records
    FROM 
    {node} node
    WHERE (( (node.status = '1') AND (node.type IN  ('my_nodetype')) ))
    GROUP BY title_truncated
    ORDER BY title_truncated ASC
    LIMIT 10 OFFSET 0
    

    The GROUP BY here obliterates any difference between collation-listed-as-equal unicode characters, because it groups and treats them as equal.

    The good news : this is quickly solved by implementing hook_views_query_alter() to instruct MySQL to GROUP and WHERE to use the collation utf8_bin for your view.

    /**
     * Implementation of hook_views_query_alter().
     */
    function mymodule_views_query_alter(&$view, &$query) {
      if ($view->name == 'my_view') {
        // Stop the glossary from treating characters equal.
        if ($view->is_attachment) {
          foreach ($query->groupby as &$condition_group) {
            if ($condition_group == 'title_truncated') {
              $condition_group = "title_truncated COLLATE utf8_bin";
            }
          }
        }
        // Rewrite the attached view to filter on unique characters.
        else {
          foreach ($query->where as &$condition_where) {
            foreach ($condition_where['conditions'] as &$cur_condition) {
              if ($cur_condition['field'] == 'SUBSTRING(node.title, 1, 1) = :node_title') {
                $cur_condition['field'] = 'SUBSTRING(node.title, 1, 1) = :node_title COLLATE utf8_bin';
              }
            }
          }
        }
      }
    }
    

    Hopefully this is helpful to anyone else encountering this situation.