multithreading coldfusion cfthread

Is CFThread join necessary for background processes?


Background:

This is part of a scheduled job that retrieves data from an external site (which provides a web service API for retrieving the data) and updates a database with the new information. It retrieves approximately 3,500 data items. My current scheduled job creates blocks of CFThread tasks, runs 10 threads at a time, and joins them before starting the next block of 10.

Code:

<cfset local.nMaxThreadCount = 10>
<!---retrieve a query that contains the items that need to be updated, approximately 3,500 items--->
<cfset local.qryItemsNeedingUpdate = getItemsNeedingUpdate(dtMostRecentItemPriceDate = local.qryMostRecentItemPriceDate.dtMostRecentItemPrice[1])>
<cfset local.nThreadBlocks = Ceiling(local.qryItemsNeedingUpdate.RecordCount / local.nMaxThreadCount)>

<cftry>
<cfloop index="local.nThreadBlock" from="1" to="#local.nThreadBlocks#">
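    <!---the last block may be partial; a remainder of 0 means it is a full block, not an empty one--->
    <cfif local.nThreadBlock EQ local.nThreadBlocks AND (local.qryItemsNeedingUpdate.RecordCount MOD local.nMaxThreadCount) NEQ 0>
        <cfset local.nThreadCount = local.qryItemsNeedingUpdate.RecordCount MOD local.nMaxThreadCount>
    <cfelse>
        <cfset local.nThreadCount = local.nMaxThreadCount>
    </cfif>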
    <cfset local.lstThreads = "">
    <cfloop index="local.nThread" from="1" to="#local.nThreadCount#">
        <cfset local.nQryIdx = ((local.nThreadBlock - 1) * local.nMaxThreadCount) + local.nThread>
        <cfset local.vcThreadName = "updateThread#local.qryItemsNeedingUpdate.nItemID[local.nQryIdx]#">
        <cfset local.lstThreads = ListAppend(local.lstThreads, local.vcThreadName)>

        <!---create the attributes struct to pass to a thread--->
        <cfset local.stThread = StructNew()>
        <cfset local.stThread.action = "run">
        <cfset local.stThread.name = local.vcThreadName>
        <cfset local.stThread.nItemID = local.qryItemsNeedingUpdate.nItemID[local.nQryIdx]>

        <!---spawn thread--->
        <cfthread attributecollection="#local.stThread#">
            <cfset updateItemPrices(nItemID = attributes.nItemID)>
        </cfthread>
    </cfloop>

    <!---join threads--->
    <cfthread action="join" name="#local.lstThreads#" />
</cfloop>
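<!---catches errors raised while spawning or joining; exceptions thrown inside a thread body are not propagated here--->
<cfcatch type="any">
    <cflog text="detailed error message logged here..." type="Error" file="myDailyJob" application="yes">
</cfcatch>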
</cftry>

Questions:

Is this kind of logic needed for background processes? That is, is CFThread action="join" needed? Nothing is displayed from the threads and the threads are independent (do not rely on the other threads or the process that spawned them). The threads update prices in a database and die. Is it necessary to throttle the threads, that is, run 10 at a time and join them? Could the process loop and create all 3,500 threads at once? Will ColdFusion queue the extra threads and run them as it has time?


Solution

  • "join" isn't necessary unless the spawning request needs something from the threads, for example to output info to the page after they complete.

    Threads beyond the engine's concurrent-thread limit will queue; the limit and the queuing behavior vary by the version and edition of ColdFusion you're running.

    For what you're doing, however, threads aren't what you want. You want a message queue, like ActiveMQ or Amazon SQS. You can use an event gateway, like the ActiveMQ gateway that comes with Adobe CF, or write your own if you're working with a different message queue or CF engine. (For example, I wrote a messaging system in CFML that uses Amazon SQS and Railo event gateways.)
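
If you do stay with CFThread, a fire-and-forget version without the join could look roughly like the sketch below. It assumes the same local.qryItemsNeedingUpdate query and updateItemPrices() function as in the question, and the loop index local.nRow is introduced only for this illustration. Because an exception thrown inside a thread body is not propagated to the page that spawned it, the sketch moves the try/catch into the thread itself. Whether your engine copes well with 3,500 queued threads, and whether the external API tolerates that load, is something to test rather than assume.

```cfml
<!---Sketch only: spawn one thread per item and never join.
     Assumes local.qryItemsNeedingUpdate and updateItemPrices() exist as in the question.--->
<cfloop index="local.nRow" from="1" to="#local.qryItemsNeedingUpdate.RecordCount#">
    <cfthread action="run"
              name="updateThread#local.qryItemsNeedingUpdate.nItemID[local.nRow]#"
              nItemID="#local.qryItemsNeedingUpdate.nItemID[local.nRow]#">
        <!---errors inside a thread never reach the page-level cftry, so catch and log them here--->
        <cftry>
            <cfset updateItemPrices(nItemID = attributes.nItemID)>
            <cfcatch type="any">
                <cflog text="Item #attributes.nItemID# failed: #cfcatch.message#" type="Error" file="myDailyJob" application="yes">
            </cfcatch>
        </cftry>
    </cfthread>
</cfloop>
```

With no join, the scheduled request returns as soon as the threads are created, so the log file is the only record of which items failed.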