regex, regex-group

Regex for matching on multiple conditions for kana + kanji


I am trying to write a regex that matches all words based on a kanji string.

For example, matching 文学生高 could return options like 文学、学生、高い, etc.

Currently, I can only return exact matches on the kanji entered:

/^[学生文高]+$/ but I would like to include records that contain these characters ([ぁ-んァ-ン]) as well.

When I try to combine the two conditions, I end up matching everything.

/^[学生文高][ぁ-んァ-ン]+$/ <-- this is ideal, as it matches on both of those conditions.

Basically, I want something that "must include characters from 学生文高 and can also include ぁ-んァ-ン, but must not consist only of ぁ-んァ-ン".

For those not so familiar with Japanese, a more English example could be: I am searching for all words that contain test, and I would like to include numbers in the results but disallow matching just numbers.

For example, a search for test could return test1, test2 but never just 1 or 2.


Solution

  • This should work: /^[ぁ-んァ-ン]*([学生文高][ぁ-んァ-ン]*)+$/

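    It matches zero or more kana at the start, then one or more groups, each consisting of one kanji from the set followed by zero or more kana. Because at least one such group is required, a string made up only of kana cannot match.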
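
As a rough illustration, the pattern could be built and tried out as follows. This is only a sketch in Python (the question does not say which language is in use). `build_pattern` and `find_matches` are hypothetical helper names, and the kana ranges are the ones from the question (ぁ-んァ-ン).

```python
import re

# Kana ranges copied from the question: hiragana ぁ-ん and katakana ァ-ン.
KANA_CLASS = "[ぁ-んァ-ン]"


def build_pattern(kanji: str) -> re.Pattern:
    """Hypothetical helper: require at least one of the given kanji, allow kana around them."""
    kanji_class = "[" + re.escape(kanji) + "]"
    # Optional leading kana, then one or more (kanji + optional kana) groups.
    return re.compile(f"^{KANA_CLASS}*(?:{kanji_class}{KANA_CLASS}*)+$")


def find_matches(kanji: str, words: list[str]) -> list[str]:
    """Hypothetical helper: keep only the words accepted by the pattern."""
    pattern = build_pattern(kanji)
    return [w for w in words if pattern.match(w)]


words = ["文学", "学生", "高い", "お高い", "たかい", "大学"]
print(find_matches("文学生高", words))
# ['文学', '学生', '高い', 'お高い']
```

Here たかい is rejected because it contains only kana, and 大学 is rejected because 大 is neither in the kanji set nor kana. The same shape covers the English analogy: `^[0-9]*(?:test[0-9]*)+$` accepts test1 and test2 but not 1 or 2.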