Tags: go, concurrency, freeswitch, temporal-workflow

GoESL with Temporal: Calls Not Originating Past Certain Point in FreeSWITCH


I'm integrating GoESL (https://github.com/0x19/goesl) with Temporal to automate dialing through FreeSWITCH. The setup allows for 1,000 concurrent channels and 50 calls per second (CPS). Each dial attempt initiates a Temporal workflow that originates a call via an activity.

After roughly 96 calls have been originated successfully (the exact number varies between runs), FreeSWITCH stops processing any further calls. Nothing appears in the FreeSWITCH CLI, and the Event Socket Layer emits no events for the later attempts. However, if I stop the Temporal worker, the previously "stuck" calls show up in the FreeSWITCH CLI, which suggests they were being queued inside the GoESL client. I can confirm that the worker itself is not stuck, because it keeps starting lead workflows.

Here are the relevant code snippets:

Leads processing loop:

for _, lead := range leadResult.Leads {
    // [omitted setup and checks]

    // Checking for channel availability and sleeping to respect CPS limits
    workflow.Await(ctx, func() bool {
        return dialerQueryResponse.AvailableChannels > 0
    })

    timeToSleep := time.Second / time.Duration(dialerQueryResponse.CallsPerSecondLimit)
    workflow.Sleep(ctx, timeToSleep)

    // Dialing the lead
    fmt.Printf("dialing lead %s\n", lead)
    dialLead(lead, selectedDialer.Id, callTimeout) 
    fmt.Print("lead dialed\n\n")
}

The dial lead logic:

dialLead := func(lead string, selectedDialerId, dialerCallTimeout int) {
    // Setup child workflow context with unique ID
    cwo.WorkflowID = fmt.Sprintf("Campaign_Call_%s", lead)
    childCtx := workflow.WithChildOptions(ctx, cwo)

    // Struct to pass input to the child workflow
    input := domain.CallWorkflowInput{
        Lead:                lead,
        DialerId:            selectedDialerId,
        CampaignName:        cds.CampaignName,
        DialplanExtension:   cc.Survey.DialplanExtension,
        CallTimeout:         dialerCallTimeout,
    }

    // Executing the child workflow and handling its future
    future := workflow.ExecuteChildWorkflow(childCtx, CallWorkflow, input)
    var dialerId int
    selector.AddFuture(future, func(f workflow.Future) {
        err := f.Get(ctx, &dialerId)
        // Error handling and updating concurrency state
        // ...
    })
}

Call Workflow function:

func CallWorkflow(ctx workflow.Context, input domain.CallWorkflowInput) (int, error) {
    // [omitted setup]

    // Executing the originate call activity
    var dialLeadResult domain.DialLeadResponse
    if err := workflow.ExecuteActivity(ctx, activity.Dialer.OriginateCallActivity, dialInput).Get(ctx, &dialLeadResult); err != nil {
        // Error handling
    }

    // [omitted post-call handling]
}

Which in turn executes the originate call activity:

func (a *DialerActivities) OriginateCallActivity(ctx context.Context, input domain.DialLeadRequest) (domain.DialLeadResponse, error) {
    // [omitted client selection]

    // Command to originate the call
    cmd := fmt.Sprintf("originate {%s}%s/%s/%s 704 XML default test %s 10", variables, protocol, gateway, input.DestinationNumber, input.OriginatingNumber)
    err := selectedClient.BgApi(cmd)

    if err != nil {
        // Error handling
    }

    // [omitted response preparation]
    return domain.DialLeadResponse{ /* [omitted fields] */ }, nil
}

Has anyone experienced similar issues with GoESL or Temporal where calls seem to be queued and not executed past a certain point? Any suggestions on how to debug this situation or why terminating the Temporal worker might trigger the processing of queued calls?

What I have tried:

  • Ensuring that the throttle limits are respected.
  • Debugging with FreeSWITCH CLI and checking CDRs.
  • Going through the FreeSWITCH logs to try and find anything out of the ordinary.
  • Raising the log level for GoESL events in the FreeSWITCH settings; however, nothing was written to the log file.
  • Changing the workflow.Sleep duration from a few milliseconds to between 5 and 10 seconds, to rule out network latency as the cause.
  • Confirming that no errors are thrown in my code or logs until the workflow is killed.
  • Stopping the FreeSWITCH instance to rule out a communication problem between GoESL and FreeSWITCH. When FreeSWITCH is stopped, the logs report a communication failure; otherwise I get no logs at all.
  • Research: I found this mailing-list thread (https://lists.freeswitch.org/pipermail/freeswitch-users/2019-May/131768.html), which appears to describe the same issue, but it contains no solution.

Solution

  • I replaced the original GoESL package (https://github.com/0x19/goesl) with a different ESL client for Go (https://github.com/percipia/eslgo), and the problem went away. This suggests an underlying issue in the original GoESL package.

    I've opened an issue on the GitHub repo (https://github.com/0x19/goesl/issues/40) in case anyone runs into the same problem in the future.