
Cannot index decimal values from CSV


I'm using Solr 6.6.2 and I'm trying to update a core with a CSV file of vehicle data.

The columns hold various data types: ints, strings, dates and decimal values.

The problem is with the decimal values. I have to round them to zero decimal places, otherwise I get the following error:

PS C:\solr-6.6.2\example\exampledocs> java -Dtype=text/csv -Dc="vehicles" -jar post.jar vehicles.csv

using content-type text/csv...
POSTing file vehicles.csv to [base]
SimplePostTool: WARNING: Solr returned an error #400 (Bad Request) for url: http://localhost:8983/solr/vehicles/update
SimplePostTool: WARNING: Response: 400124org.apache.solr.common.SolrExceptionjava.lang.NumberFormatExceptionERROR: [doc=d90354e7-3d73-4718-aeb5-80b0ce8fccf9] Error adding field 'Price'='7950.01' msg=For input string: "7950.01"400
SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/vehicles/update
1 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/vehicles/update...
Time spent: 0:00:01.363>
SimplePostTool version 5.0.0
Posting files to [base] url http://localhost:8983/solr/vehicles/update

What isn't helping is that I'm learning from the Solr Pluralsight videos, which use Solr version 4 and define the fields in schema.xml. That appears to be deprecated in version 6, and from what I read there should be no need to modify a schema.


Solution

  • There still is a schema, and you should create or edit it explicitly to match your values.

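    When you're running in schemaless mode, Solr guesses a field's type when it encounters the first value for that field. This guess seems to be wrong for your dataset, i.e. the first value wasn't treated as a decimal number. The most likely reason is that the first Price in the file had no decimal part, so the field was created with an integer type, and a later value such as 7950.01 then fails to parse (hence the NumberFormatException). You can see which type Solr guessed for your column in the schema browser in the Admin interface.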

    The best solution is to create an explicit schema - so you're sure that your columns are matched to a specific type.

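    You can edit the schema directly in the Admin interface, use the Schema API, or modify the schema file by hand as in previous versions. Note that with the managed schema that Solr 6 uses by default, that file is named managed-schema rather than schema.xml.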
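
As a rough illustration of the Schema API route, here is a minimal Python sketch. It takes the host, port and core name (`vehicles`) from the command in the question. The helper names are hypothetical, and the type name `tdouble` is an assumption about what your configset defines, so check the available field types in the schema browser first. `replace-field` is used because schemaless mode has already created the `Price` field.

```python
import json
import urllib.request

# Assumed values taken from the question's command line; adjust for your setup.
SOLR_BASE = "http://localhost:8983/solr"
CORE = "vehicles"


def field_url(field_name, base=SOLR_BASE, core=CORE):
    """URL of the Schema API endpoint that describes a single field."""
    return f"{base}/{core}/schema/fields/{field_name}?wt=json"


def replace_field_command(field_name, field_type):
    """Build a Schema API 'replace-field' command as a plain dict."""
    return {
        "replace-field": {
            "name": field_name,
            "type": field_type,  # e.g. "tdouble" (assumed to exist in your schema)
            "stored": True,
            "indexed": True,
        }
    }


def get_field(field_name):
    """Fetch the current definition of a field, including the guessed type."""
    with urllib.request.urlopen(field_url(field_name)) as resp:
        return json.load(resp)


def post_schema_command(command, base=SOLR_BASE, core=CORE):
    """POST a command dict to the core's /schema endpoint."""
    req = urllib.request.Request(
        f"{base}/{core}/schema",
        data=json.dumps(command).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


# Example usage against a running Solr instance (not executed here):
# print(get_field("Price"))
# post_schema_command(replace_field_command("Price", "tdouble"))
```

Changing a field's type does not convert documents that are already indexed, so after replacing the field you would normally clear the core and post the CSV again.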