I have a MySQL table:
CREATE TABLE documents (
    id INT NOT NULL AUTO_INCREMENT,
    language_code CHAR(2),
    tags CHAR(30),
    text TEXT,
    PRIMARY KEY (id)
);
I have 2 questions about Solr DIH:
1) The language_code field indicates what language the text field is in. Depending on the language, I want to index text into a different Solr field.
# pseudo code
if language_code == "en":
    index "text" to Solr field "text_en"
elif language_code == "fr":
    index "text" to Solr field "text_fr"
elif language_code == "zh":
    index "text" to Solr field "text_zh"
...
Can DIH handle a use case like this? How do I configure it to do so?
2) The tags field needs to be indexed into a Solr multiValued field. Multiple values are stored in a single string, separated by commas. For example, if tags contains the string "blue, green, yellow", then I want to index the 3 values "blue", "green", and "yellow" into a Solr multiValued field.
How do I do that with DIH?
Thanks.
First, your schema needs to allow it, with something like this:
<dynamicField name="text_*" type="string" indexed="true" stored="true" />
Then, in your DIH config, use something like this:
<entity name="document" dataSource="ds1" transformer="script:ftextLang" query="SELECT * FROM documents" />
The script is defined just below the dataSource element:
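<script><![CDATA[
function ftextLang(row) {
    var name = row.get('language_code');
    var value = row.get('text');
    // Skip rows with no language code or no text, so no "text_null" field is created.
    if (name != null && value != null) {
        // Copy the text into the language-specific dynamic field, e.g. text_en.
        row.put('text_' + name, value);
    }
    return row;
}
]]></script>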
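For the second question (the comma-separated tags), the same script transformer approach might be extended along these lines. This is only a sketch: splitTags and fsplitTags are hypothetical names, and it assumes the row object exposes get and put the way ftextLang above uses them. Inside DIH the split values would likely need to be copied into a java.util.ArrayList before being put back on the row; that step is left out here so the function stays plain JavaScript.

```javascript
// Hypothetical helper: turn "blue, green, yellow" into ["blue", "green", "yellow"].
function splitTags(raw) {
    var result = [];
    if (raw === null || raw === undefined) {
        return result;
    }
    var parts = String(raw).split(',');
    for (var i = 0; i < parts.length; i++) {
        // Trim surrounding whitespace without relying on String.prototype.trim.
        var tag = parts[i].replace(/^\s+|\s+$/g, '');
        if (tag.length > 0) {
            result.push(tag);
        }
    }
    return result;
}

// Hypothetical transformer in the same shape as ftextLang.
// Assumption: in a real DIH script the array would be converted to a
// java.util.List before row.put, so Solr treats it as multiple values.
function fsplitTags(row) {
    row.put('tags', splitTags(row.get('tags')));
    return row;
}
```

If no scripting is needed, DIH's RegexTransformer also has a splitBy attribute on the field element, which may be the simpler route for this case.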