Build an adjacency matrix from distances between cosmological objects

I'm probing into the Illustris API, and gathering information from a specific cosmos simulation, for a given redshift value.

This is how I request the api:

import requests

baseUrl = 'http://www.tng-project.org/api/'
    
def get(path, params=None):
    # make HTTP GET request to path
    headers = {"api-key":"my_key"}
    r = requests.get(path, params=params, headers=headers)

    # raise exception if response code is not HTTP SUCCESS (200)
    r.raise_for_status()

    if r.headers['content-type'] == 'application/json':
        return r.json() # parse json responses automatically
    
    if 'content-disposition' in r.headers:
        filename = r.headers['content-disposition'].split("filename=")[1]
        with open(f'sky_dataset/simulations/{filename}', 'wb') as f:
            f.write(r.content)
        return filename # return the filename string
    return r

And below I get the star coordinates for a given subhalo in this particular simulation. Note that -if I'm doing it right- distances have already been converted from ckpc/h to physical kpc.

Physical coordinates are the actual distances you would measure between them if you froze space and started laying out measuring rods:

import h5py
import numpy as np

simulation_id = 100
redshift = 0.57
subhalo_id = 99

scale_factor = 1.0 / (1+redshift)
little_h = 0.704

params = {'stars':'Coordinates,GFM_Metallicity'}

url = "http://www.tng-project.org/api/Illustris-1/snapshots/z=" + str(redshift) + "/subhalos/" + str(subhalo_id)
sub = get(url) # get json response of subhalo properties
saved_filename = get(url + "/cutout.hdf5",params) # get and save HDF5 cutout file

with h5py.File(f'sky_dataset/simulations/{saved_filename}') as f:
    # NOTE! If the subhalo is near the edge of the box, you must take the periodic boundary into account! (we ignore it here)
    dx = f['PartType4']['Coordinates'][:,0] - sub['pos_x']
    dy = f['PartType4']['Coordinates'][:,1] - sub['pos_y']
    dz = f['PartType4']['Coordinates'][:,2] - sub['pos_z']
    
    rr = np.sqrt(dx**2 + dy**2 + dz**2)
    rr *= scale_factor/little_h # ckpc/h -> physical kpc

    fig = plt.figure(figsize=(12,12))
    with mpl.rc_context(rc={'axes3d.grid': True}):
        ax = fig.add_subplot(projection='3d')

        # Plot the values
        ax.scatter(dx, dy, dz)
        ax.set_xlabel('X-axis')
        ax.set_ylabel('Y-axis')
        ax.set_zlabel('Z-axis')
    plt.show()

The above plots:

as requested by one comment, I print dy, dy, dz truncated examples:

dx = [ 2.63370612e-01  3.48350511e-01 -1.23379511e-02 ...  6.63767411e+00
  1.32910697e+01  8.75469902e+00]

dy = [  0.33889825   0.21808108   0.50170807 ...   8.95542985  -9.84251952
 -16.38661054]

dz = [ -0.26469788  -0.10382767  -0.16625317 ...  -4.84708218 -13.77888398
  10.42730599]

My aim is to build a connectivity network for this system, starting with an square (simetrical) adjacency matrix, whereby any two stars (or vertices) are connected if they lie within the linking length l of 1.2 Mpc, that is:

Aij = 1 if rij ≤ l, otherwise 0

where rij is the distance between the two vertices, i and j.

How can I get this adjacency matrix, based on my linking length?

Solution

A solution using sklearn.neighbors.radius_neighbors_graph and your example data:

from sklearn.neighbors import radius_neighbors_graph

# Your example data in runnable format
dx = np.array([2.63370612e-01, 3.48350511e-01, -1.23379511e-02, 
               6.63767411e+00, 1.32910697e+01,  8.75469902e+00])
dy = np.array([0.33889825,  0.21808108,   0.50170807, 
               8.95542985, -9.84251952, -16.38661054])
dz = np.array([-0.26469788,  -0.10382767, -0.16625317, 
               -4.84708218, -13.77888398, 10.42730599])

# Build a coordinate matrix with columns x, y, z, with one star per row
X = np.column_stack([dx, dy, dz])

print(X)
[[ 2.63370612e-01  3.38898250e-01 -2.64697880e-01]
 [ 3.48350511e-01  2.18081080e-01 -1.03827670e-01]
 [-1.23379511e-02  5.01708070e-01 -1.66253170e-01]
 [ 6.63767411e+00  8.95542985e+00 -4.84708218e+00]
 [ 1.32910697e+01 -9.84251952e+00 -1.37788840e+01]
 [ 8.75469902e+00 -1.63866105e+01  1.04273060e+01]]

# Find the neighbours of each star, restricted to distance lower than radius=1.2
C = radius_neighbors_graph(X, 1.2)

# C is the connectivity matrix in Compressed Sparse Row (CSR) format. 
# For demonstration purposes, convert CSR matrix to dense representation 
# as a numpy matrix
C.todense()

matrix([[0., 1., 1., 0., 0., 0.],
        [1., 0., 1., 0., 0., 0.],
        [1., 1., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0.]])

For your example data of six stars, the connectivity matrix shows:

Star 0 (row 0) is within 1.2 distance units (kpc) of Stars 1 and 2
Stars 1 and 2 are within 1.2 kpc of each other

(You asked for a linking distance of 1.2 Mpc, which would correspond to radius=1200. For demo purposes, here I used radius=1.2, corresponding to 1.2 kpc, because all six stars are within 1.2 Mpc of each other, which would have resulted in a rather boring connectivity matrix.)