Search code examples
c#graphconnectionbinary-dataadjacency-matrix

Calculating adjacency matrix from randomly generated graphs


I have developed small program, which randomly generates several connections between the graphs (the value of the count could be randomly too, but for the test aim I have defined const value, it could be redefined in random value in any time).

Code is C#: http://ideone.com/FDCtT0

( result: Success time: 0.04s memory: 36968 kB returned value: 0 )

If you don't know, what is the adjacency matrix, go here : http://en.wikipedia.org/wiki/Adjacency_matrix

enter image description here

I think, that my version of code is rather not-optimized. If I shall work with large matrixes, which have the size: 10k x 10k.

  1. What are your suggestions, how is better to parallel calculations in this task? Should I use some of the lockers-models like semaphore etc for multi-threading calculations on large matrixes.

  2. What are your suggestions for redesigning the architecture of program. How should I prepare it for large matrixes?

  3. As you see, upper at ideone, I have showed the time execution parameter and allocated memory in RAM. What is the asymptotic value of execution of my program? Is it O(n^2)?

So I want to listen to your advice how to increase the asymptotic mark, parallel calculations with using semaphores ( or maybe better locker-model for threads ).

Thank you!

PS: SO doesn't allow to post topic without formatted code, so I'm posting in at the end (full program):

/*
    Oleg Orlov, 2012(c), generating randomly adjacency matrix and graph connections
*/

using System;
using System.Collections.Generic;

class Graph
{
    internal int id;
    private int value;
    internal Graph[] links;

    public Graph(int inc_id, int inc_value)
    {
        this.id = inc_id;
        this.value = inc_value;
        links = new Graph[Program.random_generator.Next(0, 4)];
    }
}

class Program
{
    private const int graphs_count = 10;
    private static List<Graph> list;
    public static Random random_generator;

    private static void Init()
    {
        random_generator = new Random();
        list = new List<Graph>(graphs_count);

        for (int i = 0; i < list.Capacity; i++)
        {
            list.Add(new Graph(i, random_generator.Next(100, 255) * i + random_generator.Next(0, 32)));
        }
    }

    private static void InitGraphs()
    {
        for (int i = 0; i < list.Count; i++)
        {
            Graph graph = list[i] as Graph;
            graph.links = new Graph[random_generator.Next(1, 4)];

            for (int j = 0; j < graph.links.Length; j++)
            {
                graph.links[j] = list[random_generator.Next(0, 10)];
            }

            list[i] = graph;
        }
    }

    private static bool[,] ParseAdjectiveMatrix()
    {
        bool[,] matrix = new bool[list.Count, list.Count];

        foreach (Graph graph in list)
        {
            int[] links = new int[graph.links.Length];

            for (int i = 0; i < links.Length; i++)
            {
                links[i] = graph.links[i].id;
                matrix[graph.id, links[i]] = matrix[links[i], graph.id] = true;
            }
        }

        return matrix;
    }

    private static void PrintMatrix(ref bool[,] matrix)
    {
        for (int i = 0; i < list.Count; i++)
        {
            Console.Write("{0} | [ ", i);

            for (int j = 0; j < list.Count; j++)
            {
                Console.Write(" {0},", Convert.ToInt32(matrix[i, j]));
            }

            Console.Write(" ]\r\n");
        }

        Console.Write("{0}", new string(' ', 7));

        for (int i = 0; i < list.Count; i++)
        {
            Console.Write("---");
        }

        Console.Write("\r\n{0}", new string(' ', 7));

        for (int i = 0; i < list.Count; i++)
        {
            Console.Write("{0}  ", i);
        }

        Console.Write("\r\n");
    }

    private static void PrintGraphs()
    {
        foreach (Graph graph in list)
        {
            Console.Write("\r\nGraph id: {0}. It references to the graphs: ", graph.id);

            for (int i = 0; i < graph.links.Length; i++)
            {
                Console.Write(" {0}", graph.links[i].id);
            }
        }
    }

    [STAThread]
    static void Main()
    {
        try
        {
            Init();
            InitGraphs();
            bool[,] matrix = ParseAdjectiveMatrix();
            PrintMatrix(ref matrix);
            PrintGraphs();
        }
        catch (Exception exc)
        {
            Console.WriteLine(exc.Message);
        }

        Console.Write("\r\n\r\nPress enter to exit this program...");
        Console.ReadLine();
    }
}

Solution

  • I will start from the end, if you don't mind. :)

    3) Of course, it is O(n^2). As well as the memory usage.

    2) Since sizeof(bool) == 1 byte, not bit, you can optimize memory usage by using bit masks instead of raw bool values, this will make it (8 bits per bool)^2 = 64 times less.

    1) I don't know C# that well, but as i just googled i found out that C# primitive types are atomic, which means you can safely use them in multi-threading. Then, you are to make a super easy multi-threading task: just split your graphs by threads and press the 'run' button, which will run every thread with its part of graph on itself. They are independent so that's not going to be any problem, you don't need any semaphores, locks and so on.

    The thing is that you won't be able to have an adjacency matrix with size 10^9 x 10^9. You just can't store it in the memory. But, there is an other way.
    Create an adjacency list for each vertex, which will have a list of all vertices it is connected with. After building those lists from your graph, sort those lists for each vertex. Then, you can answer on the 'is a connected to b' in O( log(size of adjacency list for vertex a) ) time by using binary search, which is really fast for common usage.

    Now, if you want to implement Dijkstra algorithm really fast, you won't need an adj. matrix at all, just those lists.

    Again, it all depends on the future tasks and constraints. You cannot store the matrix of that size, that's all. You don't need it for Dijkstra or BFS, that's a fact. :) There is no conceptual difference from the graph's side: graph will be the same no matter what data structure it's stored in.

    If you really want the matrix, then that's the solution:
    We know, that number of connections (1 in matrix) is greatly smaller than its maximum which is n^2. By doing those lists, we simply store the positions of 1 (it's also called sparse matrix), which consumes no unneeded memory.