$bean-save() is taking way too long per record in Sugarcrm

I am having a function that recalculates the value of calculated fields in given modules. This works fine for small modules with small number of records. However When I try this on bigger modules like Contacts or Accounts, saving each record take up to three seconds.

With more than 100,000 records this takes 83 Hours to process.

This is my code.

$moduleList = array("Accounts", "Quotes"); 

if (!defined('sugarEntry') || !sugarEntry)
die('Not A Valid Entry Point');

require_once('include/utils.php');
require_once('include/export_utils.php');
foreach( $moduleList as $module) {
    print "Updating $module...\n<br>";
    $cnt = 0;
    $moduleBean = BeanFactory::getBean($module);
    $beanList = $moduleBean->get_full_list($order_by,$where);
    if( $beanList != null ) {
        foreach($beanList as $b) {
            // These lines prevent the modified date and user from being changed.
            $b->update_date_modified = false;
            $b->update_modified_by = false;
            $b->tracker_visibility = false;
            $b->in_workflow = true;
            $b->save();
            $cnt++;
        }
    }
    print "Finished updating: $cnt records.\n<br>";

When I log the time each record takes it is then clear that $b->save(); is what takes too long.

Is there any way to speed this up ?

Solution

First of all I'd recommend finding out what part of the save takes so long, using the tool of your choice... profiler/debugger/custom Timers/etc.

Is it a hook? Optimize it or make a custom flag to skip it if not needed for the recalc.
Is it a query? Analyse the queries (EXPLAIN) to find out if adding indices to your database could help.
Is it excessive log output? Reduce it.
Is it the bean loading for related bean calculation? Optimize formulas or skip if not needed.

Other things you could try:

Prefilter which beans you want to resave using a Query (e.g. maybe record recalc not necessary if in certain state or modified after a certain related bean etc?).
Trigger updating calculated fields in your code and check if (relevant) values of the bean where actually changed after, if not then there's no need to resave it and you can skip to the next record instead.
If this is just about updating non-related calculated fields even although related calculated fields exist, then you could decide to (temporarily?) set $sugar_config['disable_related_calc_fields'] to true. That way Sugar wouldn't have to load those related beans.

If there is no hope left:

Consider running those tasks in parallel
Rewrite whatever you need to update using SQL/SugarQuery (may not be possible, depending on complexity and implications).

Notes:

Be aware that get_full_list() is deprecated since at least Sugar 7.7 and replaced by SugarQuery and $bean->fetchFromQuery(). However I'd recommend using a Query to only retrieve the IDs and use BeanFactory::retrieveBean($module, $id) to load the beans, possibly dismissing them each after computation, using BeanFactory::unregisterBean($bean); unset($bean); to help the garbage collector with keeping a low memory profile.
Also I did use get_full_list() and fetchFromQuery in the past with catastrophic side-effects due to beans not getting fully loaded when using that way instead of BeanFactory.
All records lost their email-addresses(!) as a result. I was not pleased.
Sugar 8: Especially if you have a lot of contacts, bad default indices on the table erased_fields will slow things down a lot as soon as it has a single record in it. erased_fields only has a 2-column index by default (which isn't used in important queries though), so you'll have to add 2 individual indices for those 2 columns and things will be a lot faster again.