I synchronise data between an Oracle database and an Elasticsearch instance.
The database table "SYNC_TABLE" has the following columns: "ID" (NUMBER), "LAST_MODIFICATION" (TIMESTAMP) and "TEXT" (VARCHAR2).
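For context, a minimal version of the table could look like this (a sketch; the VARCHAR2 length and the primary key are assumptions, not taken from the actual schema):

CREATE TABLE SYNC_TABLE (
    ID                NUMBER PRIMARY KEY,   -- assumed to uniquely identify a row
    LAST_MODIFICATION TIMESTAMP,            -- updated whenever the row changes
    TEXT              VARCHAR2(4000)        -- length is an assumption
);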
I use Logstash with the jdbc-input-plugin to synchronise the data on a regular basis.
This is the Logstash configuration file:
input {
  jdbc {
    jdbc_driver_library => "ojdbc6.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "jdbc:oracle:thin:@localhost:1521:XE"
    jdbc_user => "******"
    jdbc_password => "******"
    # run the statement every minute
    schedule => "* * * * *"
    # :sql_last_value is replaced by the plugin with the tracking column's
    # value from the previous run
    statement => "SELECT * from SYNC_TABLE where LAST_MODIFICATION >= :sql_last_value"
    tracking_column => "LAST_MODIFICATION"
    tracking_column_type => "timestamp"
    use_column_value => true
  }
}
output {
  elasticsearch {
    # Elasticsearch index names must be lowercase
    index => "sync_table"
    document_type => "SYNCED_DATA"
    document_id => "%{ID}"
    hosts => "localhost:9200"
  }
  stdout { codec => rubydebug }
}
I'd like to import all the data on the first run and then synchronise only the diff between the last run and the current time, so I expect Logstash to issue the following queries:
SELECT * from SYNC_TABLE where LAST_MODIFICATION >= '1 January 1970 00:00'
and then regularly
SELECT * from SYNC_TABLE where LAST_MODIFICATION >= 'time of last run'
The documentation says that the initial value of sql_last_value should be 1 January 1970, but I see in my logs that it uses the current timestamp instead. This is the first query:
SELECT * from SYNC_TABLE where LAST_MODIFICATION >= TIMESTAMP '2017-08-14 09:17:00.481000 +00:00'
Is there any mistake in the Logstash configuration file that makes Logstash use the current timestamp instead of the default ('1 January 1970 00:00')?
The problem was in the .logstash_jdbc_last_run file, which contained the sql_last_value from previous runs. I removed this file and restarted Logstash.
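For anyone hitting the same issue: the jdbc input plugin persists sql_last_value between runs in that file (by default under $HOME). Instead of deleting the file by hand, you can control this from the configuration. The snippet below is a sketch using the plugin's documented clean_run and last_run_metadata_path options; the path shown is an example, not a required value:

input {
  jdbc {
    # ... same settings as above ...
    # ignore any previously persisted sql_last_value and start
    # again from 1 January 1970 on the next run
    clean_run => true
    # optional: keep the state file somewhere explicit instead of
    # the default $HOME/.logstash_jdbc_last_run
    last_run_metadata_path => "/var/lib/logstash/sync_table_last_run"
  }
}

Note that clean_run resets the value on every Logstash start, so it is something to enable once for a full re-import rather than leave on permanently.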