
Solr - "Error adding field … msg=Invalid Date String" when sending json data to core


I am new to Solr.
I am trying to send log files to Solr through this pipeline: log file -> Filebeat -> Logstash -> Solr.



PROBLEM

The Logstash output is fine, but Solr remains empty.
Because Logstash emits the timestamp as "2020-01-01 00:00:00.000", I tested Solr updates directly with cURL, using the following commands.

First command (the value of "my_datetime" uses the T separator and ends with Z):

curl -X POST -d '{"add":{ "doc":{"my_datetime":"2020-01-01T00:00:00.000Z"}}}' -H "Content-Type: application/json" http://localhost:8983/solr/Collection/update?commit=true

Second command (the value of "my_datetime" uses a space instead of T and has no trailing Z):

curl -X POST -d '{"add":{ "doc":{"my_datetime":"2020-01-01 00:00:00.000"}}}' -H "Content-Type: application/json" http://localhost:8983/solr/Collection/update?commit=true

QUESTION

The first command works, but the second one fails with this exception:

{
  "responseHeader":{
    "status":400,
    "QTime":13},
  "error":{
    "metadata":[
      "error-class","org.apache.solr.common.SolrException",
      "root-error-class","org.apache.solr.common.SolrException"],
    "msg":"ERROR: [doc=60146d1c-ed31-4dda-b90e-e93537b8a63a] Error adding field 'my_datetime'='2020-01-01 00:00:00.000' msg=Invalid Date String:'2020-01-01 00:00:00.000'",
    "code":400}}

ENVIRONMENT

  • Tested on Solr v8.3.0

This is my managed-schema

<fieldType name="pdates" class="solr.DatePointField" docValues="true" multiValued="true"/>

<field name="my_datetime" type="pdate" indexed="true" stored="true"/>

This is my solrconfig.xml

<updateProcessor class="solr.ParseDateFieldUpdateProcessorFactory" name="parse-date">
    <arr name="format">
        <str>yyyy-MM-dd'T'HH:mm:ss.SSSZ</str>
        <str>yyyy-MM-dd'T'HH:mm:ss,SSS[Z</str>
        <str>yyyy-MM-dd'T'HH:mm:ss.SSS</str>
        <str>yyyy-MM-dd' 'HH:mm:ss.SSS</str>
        <str>yyyy-MM-dd'T'HH:mm:ssZ</str>
        <str>yyyy-MM-dd'T'HH:mm:ss</str>
        <str>yyyy-MM-dd'T'HH:mmZ</str>
        <str>yyyy-MM-dd'T'HH:mm</str>
        <str>yyyy-MM-dd HH:mm:ss.SSSZ</str>
        <str>yyyy-MM-dd HH:mm:ss,SSSZ</str>
        <str>yyyy-MM-dd HH:mm:ss.SSS</str>
        <str>yyyy-MM-dd HH:mm:ss,SSS</str>
        <str>yyyy-MM-dd HH:mm:ssZ</str>
        <str>yyyy-MM-dd HH:mm:ss</str>
        <str>yyyy-MM-dd HH:mmZ</str>
        <str>yyyy-MM-dd HH:mm</str>
        <str>yyyy-MM-dd</str>
        <str>yyyy-MM-dd['T'[HH:mm[:ss[.SSS]]</str>
        <str>yyyy-MM-dd['T'[HH:mm[:ss[.SSS]][z</str>
        <str>yyyy-MM-dd['T'[HH:mm[:ss[.SSS]][z</str>
        <str>yyyy-MM-dd['T'[HH:mm[:ss[,SSS]][z</str>
        <str>yyyy-MM-dd HH:mm[:ss[.SSS]][z</str>
        <str>yyyy-MM-dd HH:mm[:ss[.SSS]]</str>
        <str>yyyy-MM-dd HH:mm[:ss[,SSS]][z</str>
        <str>[EEE, ]dd MMM yyyy HH:mm[:ss] z</str>
        <str>EEEE, dd-MMM-yy HH:mm:ss z</str>
        <str>EEE MMM ppd HH:mm:ss [z ]yyyy</str>
    </arr>
  </updateProcessor>


<updateRequestProcessorChain name="add-unknown-fields-to-the-schema" default="${update.autoCreateFields:true}"
           processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.DistributedUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

Thanks for your help!


Solution

  • Pass the date value in the format Solr expects.

    Solr throws this error when a datetime string has an invalid format. A Solr date field accepts only the ISO 8601 UTC form YYYY-MM-DDThh:mm:ssZ, optionally with fractional seconds.

    For your value, the correct form is 2020-01-01T00:00:00Z. That is why your first command, with its T separator and trailing Z, worked, while the second one, with a space and no Z, was rejected.

    Add an appropriate filter to your Logstash configuration to convert the date into this format before it is sent to Solr.
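
    One way to do that conversion is Logstash's date filter. The following is a minimal sketch, not a tested configuration. It assumes that the Logstash event field is also named my_datetime and that the log timestamps are in UTC; neither is stated in the question, so adjust both as needed.

```
filter {
  date {
    # Parse the text form produced by the log, e.g. "2020-01-01 00:00:00.000"
    match => ["my_datetime", "yyyy-MM-dd HH:mm:ss.SSS"]
    # Assumption: the log times are UTC; change this if they are local times
    timezone => "UTC"
    # Write the parsed timestamp back to the same field instead of @timestamp
    target => "my_datetime"
  }
}
```

    With target set, the field should hold a Logstash timestamp rather than a plain string. When the event is output as JSON, it is serialized in ISO 8601 form with a trailing Z (for example 2020-01-01T00:00:00.000Z), which matches the form used in the first cURL command.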