Search code examples
oracleelasticsearchlogstashlogstash-jdbc

Logstash jdbc-input-plugin configuration for initial sql_last_value


I synchronise data in Oracle database and ElasticSearch instance.
Database table "SYNC_TABLE" has the following columns: "ID" which is a NUMBER, "LAST_MODIFICATION" - TIMESTAMP, "TEXT" - VARCHAR2.

I use Logstash with jdbc-input-plugin in order to perform data syncronisation on a regular basis.

This is the Logstash configuration file:

input { 
  jdbc {
    jdbc_driver_library => "ojdbc6.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "jdbc:oracle:thin:@localhost:1521:XE"
    jdbc_user => "******"
    jdbc_password => "******"
    schedule => "* * * * *"
    statement => "SELECT * from SYNC_TABLE where LAST_MODIFICATION >= :sql_last_value"
    tracking_column => "LAST_MODIFICATION"
    tracking_column_type => "timestamp"
    use_column_value => true
  }
}
output {
  elasticsearch { 
    index => "SYNC_TABLE"
    document_type => "SYNCED_DATA"
    document_id => "%{ID}"
    hosts => "localhost:9200"
  }
  stdout { codec => rubydebug }
}

I'd like to import all the data on the first run and then syncronise only diff between the last run and current time. So I expect Logstash to make the following queries:

SELECT * from SYNC_TABLE where LAST_MODIFICATION >= '1 January 1970 00:00'

and then regularly

SELECT * from SYNC_TABLE where LAST_MODIFICATION >= 'time of last run'

Documentation says that initial value for should be 1 January 1970, but I see in my logs that instead it takes current timestamp. This is the first query:

SELECT * from SYNC_TABLE where LAST_MODIFICATION >= TIMESTAMP '2017-08-14 09:17:00.481000 +00:00'

Is there any mistake in logstash configuration file that makes the logstash use current timestamp instead of default ('1 January 1970 00:00')?


Solution

  • The problem was in .logstash_jdbc_last_run file that contained the sql_last_value from previous runs. I've removed this file and restarted logstash.