On my OpsCenter web page, the Schema tab does not show any of my keyspaces (0 Keyspaces | 0 Column Families), and the logs keep saying:
WARN [rollup-snapshot] 2013-11-18 20:02:47,373 42937 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,373 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,373 42938 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,373 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,373 42939 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,373 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 42940 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 42941 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 42942 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 42943 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 42944 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 42945 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 42946 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,374 42947 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 42948 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 42949 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 42950 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 42951 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 42952 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 42953 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 42954 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,375 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,376 42955 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,376 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,376 42956 operations dropped so far.
WARN [rollup-snapshot] 2013-11-18 20:02:47,376 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-18 20:02:47,376 42957 operations dropped so far.
I restarted the datastax-agent, but I still could not find any errors in the log file. Below is the agent.log file.
Startup log:
Starting DataStax agent monitor datastax_agent_monitor[ OK ]
log4j:WARN No appenders could be found for logger (org.eclipse.jetty.util.log).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
INFO [main] 2013-11-27 01:37:45,191 Loading conf files: /var/lib/datastax-agent/conf/address.yaml
INFO [main] 2013-11-27 01:37:45,260 Java vendor/version: Java HotSpot(TM) 64-Bit Server VM/1.7.0_25
INFO [main] 2013-11-27 01:37:45,261 Waiting for the config from OpsCenter
INFO [main] 2013-11-27 01:37:45,262 Attempting to determine Cassandra's broadcast address through JMX
INFO [main] 2013-11-27 01:37:45,264 Starting Stomp
INFO [main] 2013-11-27 01:37:45,264 SSL communication is disabled
INFO [main] 2013-11-27 01:37:45,264 Creating stomp connection to x.x.x.x:61620
INFO [Initialization] 2013-11-27 01:37:45,266 New JMX connection (127.0.0.1:7199)
INFO [StompConnection receiver] 2013-11-27 01:37:45,274 Reconnecting in 0s.
INFO [StompConnection receiver] 2013-11-27 01:37:45,280 Connected to x.x.x.x:61620
INFO [main] 2013-11-27 01:37:45,313 Starting Jetty server: {:port 61621, :host nil, :ssl? false, :join? false}
INFO [Jetty] 2013-11-27 01:37:45,511 Jetty server started
INFO [StompConnection receiver] 2013-11-27 01:37:45,566 Got new config from OpsCenter: {:kerberos_use_keytab true, :rollups300_ttl 2419200, :kerberos_use_ticket_cache true, :rollups60_ttl 604800, :thrift_port 9160, :ec2_metadata_api_host "x.x.x.x", :metrics_enabled 1, :rollups7200_ttl 31536000, :thrift_ssl_truststore nil, :metrics_ignored_column_families "", :cassandra_log_location "/var/log/cassandra/system.log", :thrift_rpc_interface "x.x.x.x", :thrift_ssl_truststore_password nil, :jmx_port 7199, :provisioning 0, :use_ssl 0, :kerberos_debug false, :rollups86400_ttl -1, :api_port "61621", :storage_keyspace "OpsCenter", :kerberos_renew_tgt true, :metrics_ignored_solr_cores "", :thrift_ssl_truststore_type "JKS", :metrics_ignored_keyspaces "system, system_traces, system_auth, dse_auth, OpsCenter", :rollup_subscriptions [], :cassandra_install_location ""}
INFO [StompConnection receiver] 2013-11-27 01:37:45,567 New JMX connection (127.0.0.1:7199)
INFO [Initialization] 2013-11-27 01:37:45,633 Using x.x.x.x as the cassandra broadcast address
INFO [StompConnection receiver] 2013-11-27 01:37:45,662 Starting up agent collection.
INFO [Initialization] 2013-11-27 01:37:45,714 agent RPC address is x.x.x.x
INFO [Initialization] 2013-11-27 01:37:45,715 agent RPC broadcast address is x.x.x.x
INFO [StompConnection receiver] 2013-11-27 01:37:45,721 Starting OS metric collectors (Linux)
INFO [Initialization] 2013-11-27 01:37:45,723 Clearing ssl.truststore
INFO [Initialization] 2013-11-27 01:37:45,723 Clearing ssl.truststore.password
INFO [Initialization] 2013-11-27 01:37:45,723 Setting ssl.store.type to JKS
INFO [Initialization] 2013-11-27 01:37:45,728 Clearing kerberos.service.principal.name
INFO [Initialization] 2013-11-27 01:37:45,728 Clearing kerberos.principal
INFO [Initialization] 2013-11-27 01:37:45,728 Setting kerberos.useTicketCache to true
INFO [Initialization] 2013-11-27 01:37:45,728 Clearing kerberos.ticketCache
INFO [Initialization] 2013-11-27 01:37:45,729 Setting kerberos.useKeyTab to true
INFO [Initialization] 2013-11-27 01:37:45,729 Clearing kerberos.keyTab
INFO [Initialization] 2013-11-27 01:37:45,729 Setting kerberos.renewTGT to true
INFO [Initialization] 2013-11-27 01:37:45,729 Setting kerberos.debug to false
INFO [thrift-init] 2013-11-27 01:37:45,733 Connecting to Cassandra cluster: x.x.x.x (port 9160)
INFO [StompConnection receiver] 2013-11-27 01:37:45,737 Starting Cassandra JMX metric collectors
INFO [thrift-init] 2013-11-27 01:37:45,749 Downed Host Retry service started with queue size -1 and retry delay 10s
INFO [StompConnection receiver] 2013-11-27 01:37:45,755 New JMX connection (127.0.0.1:7199)
INFO [thrift-init] 2013-11-27 01:37:45,757 Registering JMX me.prettyprint.cassandra.service_Agent Cluster:ServiceType=hector,MonitorType=hector
INFO [pdp-loader] 2013-11-27 01:37:45,834 in execute with client org.apache.cassandra.thrift.Cassandra$Client@67cf1438
INFO [thrift-init] 2013-11-27 01:37:45,836 Connected to Cassandra cluster: /Test
INFO [pdp-loader] 2013-11-27 01:37:45,844 Attempting to load stored metric values.
INFO [thrift-init] 2013-11-27 01:37:45,841 in execute with client org.apache.cassandra.thrift.Cassandra$Client@67cf1438
INFO [thrift-init] 2013-11-27 01:37:45,845 Using partitioner: org.apache.cassandra.dht.Murmur3Partitioner
INFO [jmx-metrics-1] 2013-11-27 01:37:50,748 New JMX connection (127.0.0.1:7199)
INFO [qtp131393312-25] 2013-11-27 01:38:59,902 HTTP: :get /os-metric/disk-space {} - 200
INFO [qtp131393312-24] 2013-11-27 01:39:04,468 HTTP: :get /os-metric/disk-space {} - 200
WARN [rollup-snapshot] 2013-11-27 01:42:45,841 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-27 01:42:45,842 1 operations dropped so far.
WARN [rollup-snapshot] 2013-11-27 01:42:45,842 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-27 01:42:45,842 2 operations dropped so far.
WARN [rollup-snapshot] 2013-11-27 01:42:45,843 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-27 01:42:45,843 3 operations dropped so far.
WARN [rollup-snapshot] 2013-11-27 01:42:45,843 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-27 01:42:45,843 4 operations dropped so far.
WARN [rollup-snapshot] 2013-11-27 01:42:45,843 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-27 01:42:45,843 5 operations dropped so far.
WARN [rollup-snapshot] 2013-11-27 01:42:45,844 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-27 01:42:45,844 6 operations dropped so far.
WARN [rollup-snapshot] 2013-11-27 01:42:45,844 Thrift operation queue is full, discarding thrift operation
WARN [rollup-snapshot] 2013-11-27 01:42:45,844 7 operations dropped so far.
Thrift is running:
tcp 0 0 0.0.0.0:7199 0.0.0.0:* LISTEN 498 21333533 15520/java
tcp 0 0 0.0.0.0:9160 0.0.0.0:* LISTEN 498 21334831 15520/java
Cassandra nodes are up and running.
The issue in this case was related to the number of column families created in the cluster. A large number of column families can slow down fetching the list of keyspaces and column families as well as back up metric insertion. You can configure which column families have metrics collected. See:
If you don't want to disable monitoring on clusters with a large number of column families, there are a few settings you can tweak in the agent config.
thrift_max_conns - the maximum number of concurrent connections to make to the local node
async_pool_size - the size of the thread pool that pulls from a queue of inserts and writes them into Cassandra
async_queue_size - the size of the queue of inserts to send to Cassandra; if the queue fills up, additional operations are dropped
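
To make the relationship between the pool size and the queue size concrete, here is a minimal Python sketch of a bounded insert queue drained by a fixed pool of worker threads. It is only a toy model of the behavior described above, not the agent's actual implementation: the class name BoundedInserter, its methods, and the insert_fn callback are all hypothetical, and thrift_max_conns is not modeled at all.

```python
import queue
import threading


class BoundedInserter:
    """Toy model: a bounded queue of inserts drained by a fixed worker pool.

    Hypothetical illustration only; none of these names come from the agent.
    """

    def __init__(self, pool_size, queue_size, insert_fn):
        # queue_size plays the role of async_queue_size
        self._queue = queue.Queue(maxsize=queue_size)
        self._insert_fn = insert_fn
        self.dropped = 0
        # pool_size plays the role of async_pool_size
        self._workers = [
            threading.Thread(target=self._drain, daemon=True)
            for _ in range(pool_size)
        ]

    def start(self):
        """Start the worker threads that drain the queue."""
        for worker in self._workers:
            worker.start()

    def submit(self, op):
        """Enqueue an operation; drop it and count the drop if the queue is full."""
        try:
            self._queue.put_nowait(op)
            return True
        except queue.Full:
            self.dropped += 1
            return False

    def _drain(self):
        # Each worker repeatedly takes one operation and performs the insert.
        while True:
            op = self._queue.get()
            try:
                self._insert_fn(op)
            finally:
                self._queue.task_done()

    def wait(self):
        """Block until every queued operation has been processed."""
        self._queue.join()


if __name__ == "__main__":
    written = []
    inserter = BoundedInserter(pool_size=2, queue_size=3, insert_fn=written.append)
    # Workers are not running yet, so the queue fills after three operations.
    for i in range(5):
        inserter.submit(i)
    print(inserter.dropped, "operations dropped so far.")
    inserter.start()
    inserter.wait()
    print(len(written), "operations written")
```

In this model, drops happen whenever operations arrive faster than the workers can drain them. A larger queue absorbs longer bursts, and a larger pool drains the queue faster, at the cost of more concurrent load on the node.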