Search code examples
pythonpython-3.xgraph-theorychemistryrdkit

graph theory - connect point in 3D space with other three nearest points (distance based)


I want to connect nodes (Atoms) with the three nearest nodes(atoms).

I am doing

ad1 = np.zeros((31,31))
for i in range(31):
    dist_i = dist_mat[i]
    cut_off = sorted(dist_i)[3]
    ad1[i, np.where((dist_i<=cut_off) & (dist_i!=0.0))] = 1

np.sum(ad1, axis=1) #[3,3,3,3........3]

np.sum(ad1, axis=0)
#array([3., 3., 2., 2., 2., 4., 2., 5., 2., 3., 2., 3., 3., 2., 3., 6., 3.,
   5., 3., 4., 2., 3., 2., 4., 3., 3., 2., 4., 3., 2., 3.])

I want np.sum(ad1, axis=0) to be all 3. That would mean all node (atoms) are connected to exactly 3 nearest nodes. As we can see node/atom 5 is connected to 4 other nodes/atoms which is wrong, I want it to be connected to exactly 3 nearest nodes. How do I do it.

Below is the distance matrix of 31 atoms (31 X 31).

0.0000,6.8223,7.5588,4.8966,7.2452,2.7778,3.7082,2.7345,7.1540,6.8273,3.6995,7.4136,4.6132,5.8456,2.8037,5.4881,8.1769,2.7361,8.3034,4.9450,4.8225,4.6152,4.8243,9.4876,7.2391,2.9941,7.4180,5.8523,7.6310,5.5996,8.1761
6.8223,0.0000,3.0097,2.8567,2.6647,5.0092,5.8451,6.8037,6.7031,4.8983,7.5806,5.2873,5.5000,7.0038,4.9530,3.9763,7.0263,4.8941,4.7416,6.8450,2.8166,2.9221,5.5502,7.0328,5.5148,7.6318,2.7456,5.1150,2.9654,4.2863,5.3168
7.5588,3.0097,0.0000,2.8679,2.9242,5.1443,6.8372,6.5621,5.5169,3.0135,6.8412,4.6886,4.2507,5.4276,5.8673,4.8804,4.7149,6.5707,2.5568,6.3820,5.0625,4.2533,5.0600,5.2125,2.9262,7.8126,4.6864,5.4224,2.6001,3.2330,4.7116
4.8966,2.8567,2.8679,0.0000,3.8064,3.0221,5.1335,4.1131,5.8284,2.8610,5.1354,3.8488,2.8086,4.9618,3.0028,2.7539,5.7401,4.1214,4.5830,5.4268,2.9346,2.8122,2.9319,6.8084,3.8015,5.8992,3.8516,4.9611,2.9212,3.0441,5.7392
7.2452,2.6647,2.9242,3.8064,0.0000,4.7907,5.1676,7.2899,4.5138,5.5206,7.0249,6.9978,5.3888,5.7435,6.2424,6.0645,5.2615,5.6079,2.8907,5.3589,4.7384,2.8152,6.6866,4.6140,4.7660,6.9472,5.3921,3.2734,4.7050,2.9157,2.6684
2.7778,5.0092,5.1443,3.0221,4.7907,0.0000,2.8352,3.0025,4.6425,5.0210,2.8371,6.4170,2.6958,3.6516,3.1115,4.9489,5.5823,3.0042,5.5496,3.0193,4.1730,2.6916,4.1796,6.7613,4.7913,2.8772,6.4156,3.6500,5.9378,2.8228,5.5754
3.7082,5.8451,6.8372,5.1335,5.1676,2.8352,0.0000,5.4535,5.0179,7.5889,4.7500,8.8240,5.4549,5.4488,4.9281,6.8616,7.0789,2.8247,6.7504,3.1734,4.9441,2.9387,6.8368,7.3498,7.0263,2.7571,7.6165,2.7391,7.8184,4.2915,5.4352
2.7345,6.8037,6.5621,4.1131,7.2899,3.0025,5.4535,0.0000,6.9646,4.8923,2.8238,5.6867,2.8002,4.7436,2.7895,4.7319,7.0025,4.5800,7.4954,5.2983,5.3961,5.3071,2.7803,8.9227,5.5914,4.2758,7.1749,6.6267,6.5470,5.1104,8.2905
7.1540,6.7031,5.5169,5.8284,4.5138,4.6425,5.0179,6.9646,0.0000,6.7047,5.0056,9.1250,4.9769,2.9634,7.5819,8.5546,2.8061,6.9800,3.6751,2.5834,7.6541,4.9881,7.6494,2.7948,4.5088,5.2110,9.1264,2.9761,7.8062,2.7941,2.8038
6.8273,4.8983,3.0135,2.8610,5.5206,5.0210,7.5889,4.8923,6.7047,0.0000,5.8617,2.7465,2.9301,5.1272,4.9558,3.9808,5.3162,6.8165,4.7404,6.8576,5.5561,5.5098,2.8152,7.0330,2.6644,7.6461,5.2912,7.0096,2.9674,4.2945,7.0258
3.6995,7.5806,6.8412,5.1354,7.0249,2.8371,4.7500,2.8238,5.0056,5.8617,0.0000,7.6266,2.9474,2.7359,4.9270,6.8677,5.4363,5.4517,6.7478,3.1721,6.8329,5.4529,4.9555,7.3438,5.1708,2.7597,8.8287,5.4452,7.8249,4.2909,7.0646
7.4136,5.2873,4.6886,3.8488,6.9978,6.4170,8.8240,5.6867,9.1250,2.7465,7.6266,0.0000,4.9529,7.5866,4.8602,2.7541,8.0299,7.1729,7.0207,8.9107,5.2910,6.5588,2.9146,9.5249,5.3922,9.1160,4.1719,8.7725,2.7497,6.4529,9.0793
4.6132,5.5000,4.2507,2.8086,5.3888,2.6958,5.4549,2.8002,4.9769,2.9301,2.9474,4.9529,0.0000,2.8097,3.9433,4.7505,4.4056,5.3122,4.8719,4.3293,5.3691,4.4387,2.8442,6.4077,2.8070,4.8696,6.5606,5.3482,5.0545,2.9281,6.2031
5.8456,7.0038,5.4276,4.9618,5.7435,3.6516,5.4488,4.7436,2.9634,5.1272,2.7359,7.5866,2.8097,0.0000,6.2398,7.3805,2.7234,6.6316,4.5653,2.8038,7.3250,5.3513,5.6427,4.8167,3.2793,4.4545,8.7753,4.6687,7.1729,2.8747,5.2395
2.8037,4.9530,5.8673,3.0028,6.2424,3.1115,4.9281,2.7895,7.5819,4.9558,4.9270,4.8602,3.9433,6.2398,0.0000,2.6964,7.9654,2.7924,7.3935,6.1211,2.8429,3.9452,2.8424,9.2720,6.2355,5.1988,4.8668,6.2430,5.2004,5.1737,7.9659
5.4881,3.9763,4.8804,2.7539,6.0645,4.9489,6.8616,4.7319,8.5546,3.9808,6.8677,2.7541,4.7505,7.3805,2.6964,0.0000,8.3401,4.7306,7.1148,7.8221,2.7718,4.7476,2.7753,9.5030,6.0617,7.5711,2.7577,7.3777,3.1397,5.7882,8.3392
8.1769,7.0263,4.7149,5.7401,5.2615,5.5823,7.0789,7.0025,2.8061,5.3162,5.4363,8.0299,4.4056,2.7234,7.9654,8.3401,0.0000,8.3066,2.9172,4.5762,8.3187,6.2166,6.9974,2.6988,2.6667,6.8504,9.0816,5.2518,7.0366,3.3010,4.3079
2.7361,4.8941,6.5707,4.1214,5.6079,3.0042,2.8247,4.5800,6.9800,6.8165,5.4517,7.1729,5.3122,6.6316,2.7924,4.7306,8.3066,0.0000,7.5094,5.3015,2.7788,2.8080,5.4012,8.9373,7.2956,4.2691,5.6901,4.7565,6.5515,5.1204,7.0169
8.3034,4.7416,2.5568,4.5830,2.8907,5.5496,6.7504,7.4954,3.6751,4.7404,6.7478,7.0207,4.8719,4.5653,7.3935,7.1148,2.9172,7.5094,0.0000,5.4477,6.7736,4.8812,6.7675,2.6661,2.8871,7.5857,7.0211,4.5660,5.1569,2.7695,2.9165
4.9450,6.8450,6.3820,5.4268,5.3589,3.0193,3.1734,5.2983,2.5834,6.8576,3.1721,8.9107,4.3293,2.8038,6.1211,7.8221,4.5762,5.3015,5.4477,0.0000,6.7806,4.3263,6.7865,5.3395,5.3642,2.6393,8.9072,2.7997,8.0722,3.1518,4.5623
4.8225,2.8166,5.0625,2.9346,4.7384,4.1730,4.9441,5.3961,7.6541,5.5561,6.8329,5.2910,5.3691,7.3250,2.8429,2.7718,8.3187,2.7788,6.7736,6.7806,0.0000,2.8411,4.6768,8.8466,6.6849,6.5259,2.9190,5.6409,4.2802,5.1556,7.0036
4.6152,2.9221,4.2533,2.8122,2.8152,2.6916,2.9387,5.3071,4.9881,5.5098,5.4529,6.5588,4.4387,5.3513,3.9452,4.7476,6.2166,2.8080,4.8812,4.3263,2.8411,0.0000,5.3713,6.4170,5.3921,4.8619,4.9490,2.8105,5.0539,2.9321,4.4129
4.8243,5.5502,5.0600,2.9319,6.6866,4.1796,6.8368,2.7803,7.6494,2.8152,4.9555,2.9146,2.8442,5.6427,2.8424,2.7753,6.9974,5.4012,6.7675,6.7865,4.6768,5.3713,0.0000,8.8413,4.7294,6.5361,5.2956,7.3262,4.2788,5.1556,8.3129
9.4876,7.0328,5.2125,6.8084,4.6140,6.7613,7.3498,8.9227,2.7948,7.0330,7.3438,9.5249,6.4077,4.8167,9.2720,9.5030,2.6988,8.9373,2.6661,5.3395,8.8466,6.4170,8.8413,0.0000,4.6127,7.9200,9.5247,4.8201,7.8081,4.1075,2.6956
7.2391,5.5148,2.9262,3.8015,4.7660,4.7913,7.0263,5.5914,4.5088,2.6644,5.1708,5.3922,2.8070,3.2793,6.2355,6.0617,2.6667,7.2956,2.8871,5.3642,6.6849,5.3921,4.7294,4.6127,0.0000,6.9513,6.9963,5.7431,4.7049,2.9171,5.2546
2.9941,7.6318,7.8126,5.8992,6.9472,2.8772,2.7571,4.2758,5.2110,7.6461,2.7597,9.1160,4.8696,4.4545,5.1988,7.5711,6.8504,4.2691,7.5857,2.6393,6.5259,4.8619,6.5361,7.9200,6.9513,0.0000,9.1125,4.4513,8.8145,4.8939,6.8395
7.4180,2.7456,4.6864,3.8516,5.3921,6.4156,7.6165,7.1749,9.1264,5.2912,8.8287,4.1719,6.5606,8.7753,4.8668,2.7577,9.0816,5.6901,7.0211,8.9072,2.9190,4.9490,5.2956,9.5247,6.9963,9.1125,0.0000,7.5797,2.7482,6.4514,8.0302
5.8523,5.1150,5.4224,4.9611,3.2734,3.6500,2.7391,6.6267,2.9761,7.0096,5.4452,8.7725,5.3482,4.6687,6.2430,7.3777,5.2518,4.7565,4.5660,2.7997,5.6409,2.8105,7.3262,4.8201,5.7431,4.4513,7.5797,0.0000,7.1676,2.8724,2.7190
7.6310,2.9654,2.6001,2.9212,4.7050,5.9378,7.8184,6.5470,7.8062,2.9674,7.8249,2.7497,5.0545,7.1729,5.2004,3.1397,7.0366,6.5515,5.1569,8.0722,4.2802,5.0539,4.2788,7.8081,4.7049,8.8145,2.7482,7.1676,0.0000,5.1559,7.0354
5.5996,4.2863,3.2330,3.0441,2.9157,2.8228,4.2915,5.1104,2.7941,4.2945,4.2909,6.4529,2.9281,2.8747,5.1737,5.7882,3.3010,5.1204,2.7695,3.1518,5.1556,2.9321,5.1556,4.1075,2.9171,4.8939,6.4514,2.8724,5.1559,0.0000,3.2927
8.1761,5.3168,4.7116,5.7392,2.6684,5.5754,5.4352,8.2905,2.8038,7.0258,7.0646,9.0793,6.2031,5.2395,7.9659,8.3392,4.3079,7.0169,2.9165,4.5623,7.0036,4.4129,8.3129,2.6956,5.2546,6.8395,8.0302,2.7190,7.0354,3.2927,0.0000

Edit 1

Thank-you, @yatu, and @mathfux but Kdtree does not produce what I want.

so from @mathfux answer if the nearest list of 3 is 0,1,2 and 0 has the nearest list of 1,4,2 then the algorithm should give more importance to distances and break the connections of 3 and 0. 3 should find other points nearest to it, instead of joining with 0. if 3 cannot find other points nearest to it then 0 has to exclude either 1 or 2 based on the distance and include 3 because 3 could not find 3 nearest points other than 0,1,2.

Also one can see 0 is connected to 25 but 25 is not connected to 0 which is wrong. if 0 is connected to 25 then 25 should also be connected to 0.

[ 0, 17,  7, 25]
[ 5, 21, 12,  6]
[ 6, 25, 27, 17]
[10, 25, 13,  7]
[12,  5,  3, 24]
[21,  5,  3,  4]
[25,  6, 10, 19]
[29,  4, 12, 24]

Solution

  • It appears that such connection might not always be possible because you are not able to hold your matrix symmetric. Actually, matrix of distance is not needed for explanation.

    import matplotlib.pyplot as plt
    points = np.array([[0,0],[0,1],[0,-1],[1,0],[-0.5,0]])
    plt.scatter(*np.transpose(points))
    for i in range(len(points)):
        plt.text(points[i][0], points[i][1], [0, 1, 2, 3, 4][i], fontsize=24)
    

    enter image description here

    If you, let's say, find 3 closest points 0, 1, 2 for point 3 this doesn't mean that 3 is in a list of closest points of 0.

    If you still want to find 3 closests points, you don't need that looping at all, just use np.argsort(dist_mat, axis=1)[:,1:4] to find indexes of closest points or np.sort(dist_mat, axis=1)[:,1:4] to find distances. In case you need to work with points instead of dist_max (which is an overhead), KDTree is a better option.