Tags: cron, logstash

Logstash is not ingesting from PostgreSQL database into Elasticsearch


I'm trying to ingest data from a table in a PostgreSQL database running on AWS RDS into Elasticsearch using Logstash. I installed Elasticsearch, Logstash and Kibana (all current version 8.7) on an AWS EC2 instance running Ubuntu 20.04. Elasticsearch and Kibana are both running correctly, but the ingestion process stalls and no data is indexed.

Here is my Logstash configuration file:

input {
  jdbc {
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_connection_string => "jdbc:postgresql://xxx.xxx.us-east-1.rds.amazonaws.com:5432/postgres"
    jdbc_user => "xxx"
    jdbc_password => "xxx"
    schedule => "0 0 * * *"
    statement => "SELECT * FROM contacts"
  }
}

output {
  elasticsearch {
    hosts => ["https://localhost:9200"]
    user => "elastic"
    password => "xxx"
    cacert => '/etc/logstash/config/certs/http_ca.crt'
    index => "contacts"
    document_id => "%{[id]}"
  }
}

And here is what happens when I run Logstash:

WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[WARN ] 2023-05-08 04:22:20.425 [main] runner - NOTICE: Running Logstash as superuser is not recommended and won't be allowed in the future. Set 'allow_superuser' to 'false' to avoid startup errors in future releases.
[INFO ] 2023-05-08 04:22:20.435 [main] runner - Starting Logstash {"logstash.version"=>"8.7.1", "jruby.version"=>"jruby 9.3.10.0 (2.6.8) 2023-02-01 107b2e6697 OpenJDK 64-Bit Server VM 17.0.7+7 on 17.0.7+7 +indy +jit [x86_64-linux]"}
[INFO ] 2023-05-08 04:22:20.438 [main] runner - JVM bootstrap flags: [-Xms1g, -Xmx1g, -Djava.awt.headless=true, -Dfile.encoding=UTF-8, -Djruby.compile.invokedynamic=true, -XX:+HeapDumpOnOutOfMemoryError, -Djava.security.egd=file:/dev/urandom, -Dlog4j2.isThreadContextMapInheritable=true, -Djruby.regexp.interruptible=true, -Djdk.io.File.enableADS=true, --add-exports=jdk.compiler/com.sun.tools.javac.api=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.file=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.parser=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.tree=ALL-UNNAMED, --add-exports=jdk.compiler/com.sun.tools.javac.util=ALL-UNNAMED, --add-opens=java.base/java.security=ALL-UNNAMED, --add-opens=java.base/java.io=ALL-UNNAMED, --add-opens=java.base/java.nio.channels=ALL-UNNAMED, --add-opens=java.base/sun.nio.ch=ALL-UNNAMED, --add-opens=java.management/sun.management=ALL-UNNAMED]
[WARN ] 2023-05-08 04:22:20.651 [LogStash::Runner] multilocal - Ignoring the 'pipelines.yml' file because modules or command line options are specified
[INFO ] 2023-05-08 04:22:21.285 [Api Webserver] agent - Successfully started Logstash API endpoint {:port=>9600, :ssl_enabled=>false}
[INFO ] 2023-05-08 04:22:21.906 [Converge PipelineAction::Create<main>] Reflections - Reflections took 192 ms to scan 1 urls, producing 132 keys and 462 values
[INFO ] 2023-05-08 04:22:22.684 [Converge PipelineAction::Create<main>] javapipeline - Pipeline `main` is configured with `pipeline.ecs_compatibility: v8` setting. All plugins in this pipeline will default to `ecs_compatibility => v8` unless explicitly configured otherwise.
[INFO ] 2023-05-08 04:22:22.712 [[main]-pipeline-manager] elasticsearch - New Elasticsearch output {:class=>"LogStash::Outputs::ElasticSearch", :hosts=>["https://localhost:9200"]}
[INFO ] 2023-05-08 04:22:22.867 [[main]-pipeline-manager] elasticsearch - Elasticsearch pool URLs updated {:changes=>{:removed=>[], :added=>[https://elastic:xxxxxx@localhost:9200/]}}
[WARN ] 2023-05-08 04:22:23.137 [[main]-pipeline-manager] elasticsearch - Restored connection to ES instance {:url=>"https://elastic:xxxxxx@localhost:9200/"}
[INFO ] 2023-05-08 04:22:23.145 [[main]-pipeline-manager] elasticsearch - Elasticsearch version determined (8.7.1) {:es_version=>8}
[WARN ] 2023-05-08 04:22:23.145 [[main]-pipeline-manager] elasticsearch - Detected a 6.x and above cluster: the `type` event field won't be used to determine the document _type {:es_version=>8}
[INFO ] 2023-05-08 04:22:23.159 [[main]-pipeline-manager] elasticsearch - Not eligible for data streams because config contains one or more settings that are not compatible with data streams: {"index"=>"contacts"}
[INFO ] 2023-05-08 04:22:23.159 [[main]-pipeline-manager] elasticsearch - Data streams auto configuration (`data_stream => auto` or unset) resolved to `false`
[WARN ] 2023-05-08 04:22:23.161 [[main]-pipeline-manager] elasticsearch - Elasticsearch Output configured with `ecs_compatibility => v8`, which resolved to an UNRELEASED preview of version 8.0.0 of the Elastic Common Schema. Once ECS v8 and an updated release of this plugin are publicly available, you will need to update this plugin to resolve this warning.
[INFO ] 2023-05-08 04:22:23.182 [[main]-pipeline-manager] javapipeline - Starting pipeline {:pipeline_id=>"main", "pipeline.workers"=>2, "pipeline.batch.size"=>125, "pipeline.batch.delay"=>50, "pipeline.max_inflight"=>250, "pipeline.sources"=>["/etc/logstash/conf.d/postgresql.conf"], :thread=>"#<Thread:0x2fe9a4a6@/usr/share/logstash/logstash-core/lib/logstash/java_pipeline.rb:134 run>"}
[INFO ] 2023-05-08 04:22:23.187 [Ruby-0-Thread-10: /usr/share/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-output-elasticsearch-11.13.1-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:161] elasticsearch - Using a default mapping template {:es_version=>8, :ecs_compatibility=>:v8}
[INFO ] 2023-05-08 04:22:23.767 [[main]-pipeline-manager] javapipeline - Pipeline Java execution initialization time {"seconds"=>0.58}
[INFO ] 2023-05-08 04:22:24.308 [[main]-pipeline-manager] jdbc - ECS compatibility is enabled but `target` option was not specified. This may cause fields to be set at the top-level of the event where they are likely to clash with the Elastic Common Schema. It is recommended to set the `target` option to avoid potential schema conflicts (if your data is ECS compliant or non-conflicting, feel free to ignore this message)
[INFO ] 2023-05-08 04:22:24.309 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2023-05-08 04:22:24.322 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}

The process stalls here. I have checked Kibana and no indices come up. I am only trying to ingest three rows of data just to test this process out. Can someone help resolve this issue for me?


Some notes on my config file:

  • Added cacert to the output because Logstash couldn't connect to Elasticsearch over HTTPS otherwise and was throwing errors. I followed instructions to copy the http_ca.crt file from Elasticsearch into a new folder under Logstash (roughly the steps sketched below). I'm not sure whether that was necessary, but it resolved that issue.
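
A minimal sketch of that copy step, assuming the default certificate location of an Elasticsearch 8 package install; the source path is an assumption and may differ on your setup:

# Copy the Elasticsearch CA certificate into a directory Logstash can read.
# /etc/elasticsearch/certs/http_ca.crt is the assumed default for a .deb/.rpm install of Elasticsearch 8.
sudo mkdir -p /etc/logstash/config/certs
sudo cp /etc/elasticsearch/certs/http_ca.crt /etc/logstash/config/certs/
sudo chown -R logstash:logstash /etc/logstash/config/certs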

Solution

  • The schedule you've configured, "0 0 * * *", only fires at midnight, and it's 04:22 where you are, so the next run is still almost 20 hours away.

    Set it to something different, like "0 * * * *" (start of each hour) or "* * * * *" (every minute), to see something running sooner, as in the sketch below.
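
    A minimal sketch of the same jdbc input with only the schedule changed for testing (driver, connection string, credentials and query as in your original config):

    input {
      jdbc {
        jdbc_driver_class => "org.postgresql.Driver"
        jdbc_connection_string => "jdbc:postgresql://xxx.xxx.us-east-1.rds.amazonaws.com:5432/postgres"
        jdbc_user => "xxx"
        jdbc_password => "xxx"
        # run every minute while testing; switch back to "0 0 * * *" for the nightly run
        schedule => "* * * * *"
        statement => "SELECT * FROM contacts"
      }
    }

    Within a minute you should see the SELECT statement logged by the jdbc input and a contacts index appear in Kibana. Because the output sets document_id => "%{[id]}", repeated runs overwrite the same documents rather than creating duplicates, so leaving the every-minute schedule on during testing is safe.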