c# .net task-parallel-library parallel.foreach

Multithreading issue, maybe a deadlock using Parallel.ForEach


Parallel.ForEach keeps running and my program never ends. I am unable to trace where execution goes after the first iteration. My guess is that it hits a deadlock and just keeps context switching.

private void ReadInputFile()
{
    var collection = new ConcurrentBag<PropertyRecord>();
    var lines = System.IO.File.ReadLines(InputFileName);
    int i = 0;
    int RecordsCount = lines.Count();
    Parallel.ForEach(lines, line =>
    {
        if (string.IsNullOrWhiteSpace(line))
        {
            return;                    
        }

        var tokens = line.Split(',');
        var postalCode = tokens[0];
        var country = tokens.Length > 1 ? tokens[1] : "england";

        SetLabelNotifyTwoText(
            string.Format(
                "Reading PostCode {0} out of {1}",
                Interlocked.Increment(ref i), // thread-safe counter; needs System.Threading
                RecordsCount));

        var tempRecord = GetAllAddesses(postalCode, country);
        if (tempRecord != null)
        {
            foreach (PropertyRecord r in tempRecord)
            {
                collection.Add(r);
            }
        }    
    });
}

private List<PropertyRecord> GetAllAddesses(
        string postalCode,
        string country = "england")
{
    SetLabelNotifyText("");
    progressBar1.Value = 0;
    progressBar1.Update();

    var records = new List<PropertyRecord>();
    using (WebClient w = new WebClient())
    {
        var url = CreateUrl(postalCode, country);
        var document = w.DownloadString(url);
        var pagesCount = GetPagesCount(document);
        if (pagesCount == null)
        {
            return null;
        }

        for (int i = 0; i < pagesCount; i++)
        {
            SetLabelNotifyText(
                string.Format(
                    "Reading Page {0} out of {1}",
                    i,
                    pagesCount - 1));

            url = CreateUrl(postalCode, country, i);
            document = w.DownloadString(url);
            var collection = Regex.Matches(
                document,
                "<div class=\"soldDetails\">(.|\\n|\\r)*?class=" +
                "\"soldAddress\".*?>(?<address>.*?)(</a>|</div>)" +
                "(.|\\n|\\r)*?class=\\\"noBed\\\">(?<noBed>.*?)" +
                "</td>|</tbody>");

            foreach (Match match in collection)
            {
                var r = new PropertyRecord();

                var bedroomCount = match.Groups["noBed"].Value;
                if(!string.IsNullOrEmpty(bedroomCount))
                {
                    r.BedroomCount = bedroomCount;             
                }
                else
                {
                    r.BedroomCount = "-1";
                }

                r.address = match.Groups["address"].Value;

                var line = string.Format(
                    "\"{0}\",{1}",
                    r.address,
                    r.BedroomCount);
                OutputLines.Add(line);

                records.Add(r);
            }
        }
    }

    return records;
}

It runs fine without Parallel.ForEach, but using Parallel.ForEach is a requirement.

I have debugged it: after returning from the GetAllAddesses method the first time, stepping stops responding and the program just keeps running in the background. It never comes back to any of the breakpoints I have placed.


Solution

  • As you said in the comments, your SetLabelNotifyText and SetLabelNotifyTwoText methods call Control.Invoke.

    For Control.Invoke to work, the main (UI) thread has to be free to process messages, but in your case you seem to be blocking it by calling Parallel.ForEach on it.

    Here is a minimal reproduction:

    private void button1_Click(object sender, EventArgs e)
    {
        Parallel.ForEach(Enumerable.Range(1, 100), (i) =>
        {
            Thread.Sleep(10);//Simulate some work
            this.Invoke(new Action(() => SetText(i)));
        });
    }
    
    private void SetText(int i)
    {
        textBox1.Text = i.ToString();
    }
    

    The main thread waits for Parallel.ForEach to finish, while the worker threads wait for the main thread to run their Invoke calls. The result is a deadlock.

    How to fix: use BeginInvoke instead of Invoke, or don't block the main thread.

    If this isn't the cause, please post an SSCCE (a short, self-contained, correct example); that will help us diagnose the problem.
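
Both fixes could be combined roughly as in the sketch below. It assumes a WinForms project targeting .NET 4.5 or later (for async/await). DemoForm and its Main entry point are hypothetical scaffolding added only to make the sample self-contained; button1, textBox1 and SetText mirror the reproduction above. The loop runs on a thread-pool thread so the UI thread stays free, and the workers post their updates with BeginInvoke, which returns immediately instead of waiting for the UI thread.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Windows.Forms;

// Hypothetical form, not taken from the question; it only hosts the two controls.
public class DemoForm : Form
{
    private readonly TextBox textBox1 = new TextBox { Dock = DockStyle.Top };
    private readonly Button button1 = new Button { Text = "Start", Dock = DockStyle.Bottom };

    public DemoForm()
    {
        Controls.Add(textBox1);
        Controls.Add(button1);
        button1.Click += button1_Click;
    }

    private async void button1_Click(object sender, EventArgs e)
    {
        button1.Enabled = false;

        // Run the parallel loop off the UI thread, so the UI thread
        // keeps pumping messages while the workers are busy.
        await Task.Run(() =>
            Parallel.ForEach(Enumerable.Range(1, 100), i =>
            {
                Thread.Sleep(10); // Simulate some work

                // BeginInvoke queues the update and returns at once,
                // so a worker never blocks waiting on the UI thread.
                BeginInvoke(new Action(() => SetText(i)));
            }));

        button1.Enabled = true;
    }

    private void SetText(int i)
    {
        textBox1.Text = i.ToString();
    }

    [STAThread]
    private static void Main()
    {
        Application.Run(new DemoForm());
    }
}
```

Either change on its own should already break the cycle described above; using both also keeps the window responsive while the loop runs. Because BeginInvoke is asynchronous, the updates queued by different workers are not guaranteed to reach the text box in numeric order.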