I have an existing Cassandra test install that I have been testing on. The table I want to run the cassandra-stress tool against currently holds real (not live) data, around 100k rows. Can the tool be run against that existing data, or does it only work with data that the tool itself has inserted into an empty table, e.g. to measure write speeds?
cassandra-stress can certainly use existing tables. You have to create a stress profile; take a look at the YAML files for examples.
You'd have to declare the keyspace (ks) and table definitions and define the columnspec, which is essentially a profile of what the data cells in a row are expected to hold.
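As a rough illustration of what such a profile might look like, here is a minimal sketch. The keyspace name `test_ks`, the table `table1`, its columns, and the query name `read` are all hypothetical placeholders; the definitions and distributions would need to match your actual schema and data.

```yaml
# Hypothetical profile (table1.yaml); all names and values are assumptions.
keyspace: test_ks

# Should mirror the existing keyspace definition.
keyspace_definition: |
  CREATE KEYSPACE test_ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1};

table: table1

# Should mirror the existing table definition.
table_definition: |
  CREATE TABLE table1 (
    id uuid,
    name text,
    value int,
    PRIMARY KEY (id)
  );

# Describes what generated values for each column should look like.
columnspec:
  - name: id
    population: uniform(1..100000)   # roughly 100k distinct keys
  - name: name
    size: gaussian(5..30)            # text length between 5 and 30 characters

# How inserts are batched.
insert:
  partitions: fixed(1)               # one partition per batch
  batchtype: UNLOGGED
  select: fixed(1)/1                 # write every row of the partition

# Named queries that can be referenced in the ops list.
queries:
  read:
    cql: SELECT * FROM table1 WHERE id = ?
    fields: samerow
```

Keep in mind that insert operations write generated rows into the table alongside whatever data is already there, so it is probably safest to run this against a copy of the table.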
Finally you provide a workload and you're good to go:
cassandra-stress user profile=table1.yaml "ops(specname.insert=10,specname.read=1000)"
Although not directly related, this resource will be useful.