Search code examples
batch-processingsap-ase

Sybase ASE data purge batch - design & performance


I am working on a Sybase ASE (migrating to 15.7) data purge utility to be used by multiple tables/ databases to delete huge amount of unwanted older data.

  1. After receiving an input table name, automatically figure out the child tables and delete data. But, I couldn't find an hierarchical query clause like Oracle's "Connect by .. Prior" clause. Is there any other way to implement this?
  2. I am deleting data by looping through multiple transaction/ commits in small increments. After the deletes, at what interval, should I do "reorg rebuild"?
  3. Do I need to do update statistics? If I have to, what is the criteria that I should consider before doing update statistics?
  4. Some tables may be partitioned. Is there anything that I need to consider in partition's perspective?
  5. Some of our DB's (i guess index..?) are clustered. I don't have much idea about clustering. Do I need to consider anything in clustering perspective?
  6. Send Email at the end of processing. Does built-in email package similar to oracle's UTIL_SMTP?

Solution

  • Some of the points are blank right now, and I will fill them as I get a chance.

    1 - Check out this post on replicating this feature in Sybase ASE.

    2 - My post over on the dba stack covers a lot of the key points on determining when to run a reorg

    3 - Since updating statistics can be done more quickly than a reorg(which also updates statistics), it's sometimes used to help improve performance between reorgs. Deciding when to run them will depend on how quickly performance degrades when you do your purges. sp_sysmon is a valuable tool that can capture metrics to help you make your decision.

    4 - Partioned tables shouldn't really impact your purge. It's another case where it may improve performance for your deletes, as the data may be accessed more quickly than other configurations.

    5 - Not really. In theory your deletes should go a bit faster if your delete is using the clustered index. Clustered indexes are used to keep the data pages in order, as records are inserted, instead of heaping the inserts.

    6 - For Windows based systems, xp_sendmail can be used. For *nix based systems, xp_cmdshell can be used to access sendmail. The documentation for those Extended Stored Procedures is here.