pairwise_distance_3d

halotools.mock_observables.pair_counters.pairwise_distance_3d(data1, data2, r_max, period=None, verbose=False, num_threads=1, approx_cell1_size=None, approx_cell2_size=None)

Return pairs of points separated by a three-dimensional distance smaller than or equal to the input r_max.

Note that if data1 == data2, the pairwise_distance_3d function double-counts pairs.
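The double counting can be undone after the fact. The sketch below is not part of Halotools: it builds a small toy matrix with scipy.sparse.coo_matrix as a stand-in for the returned result (the index and distance values are made up for illustration) and keeps only the entries with i < j, which also drops any self-pairs.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Toy stand-in for the matrix returned when data1 and data2 are the same sample:
# each pair appears twice, once as (i, j) and once as (j, i).
rows = np.array([0, 1, 0, 2])
cols = np.array([1, 0, 2, 0])
dists = np.array([0.5, 0.5, 0.8, 0.8])
dist_matrix = coo_matrix((dists, (rows, cols)), shape=(3, 3))

# Keep only the strict upper triangle (i < j) so each pair is counted once.
upper = dist_matrix.row < dist_matrix.col
unique_pairs = coo_matrix(
    (dist_matrix.data[upper], (dist_matrix.row[upper], dist_matrix.col[upper])),
    shape=dist_matrix.shape,
)
```

Filtering on the row and col attributes avoids converting the sparse matrix to a dense array, which matters when N1 and N2 are large.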

Parameters:

data1 : array_like

N1 by 3 numpy array of 3-dimensional positions. Values of each dimension should be between zero and the corresponding dimension of the input period.

data2 : array_like

N2 by 3 numpy array of 3-dimensional positions. Values of each dimension should be between zero and the corresponding dimension of the input period.

r_max : array_like

Radius of the spheres used to search for pairs around each point in data1. If a single float is given, the same r_max is used for every point in data1. You may optionally pass in an array of length N1, in which case each point in data1 has its own pair-search radius.

Length units assumed to be in Mpc/h, here and throughout Halotools.
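To illustrate the difference between a scalar and a per-point r_max, here is a minimal brute-force sketch in plain NumPy. The helper brute_force_pairs is hypothetical, ignores periodic boundaries, and scales as N1 * N2, so it is only meant to show the selection rule, not how Halotools performs the search.

```python
import numpy as np

def brute_force_pairs(data1, data2, r_max):
    """Hypothetical O(N1 * N2) reference search without periodic boundaries."""
    data1 = np.atleast_2d(data1).astype(float)
    data2 = np.atleast_2d(data2).astype(float)
    # Broadcast a scalar r_max to one search radius per point in data1.
    r_max = np.broadcast_to(np.asarray(r_max, dtype=float), (data1.shape[0],))
    # (N1, N2) matrix of Euclidean separations.
    d = np.linalg.norm(data1[:, None, :] - data2[None, :, :], axis=-1)
    # Pair (i, j) is kept when its separation is within the radius of point i.
    i, j = np.nonzero(d <= r_max[:, None])
    return i, j, d[i, j]

data1 = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
data2 = np.array([[1.0, 0.0, 0.0], [12.0, 0.0, 0.0]])

# Same radius for every point in data1.
i_s, j_s, d_s = brute_force_pairs(data1, data2, 1.5)
# Individual radii: point 0 searches within 0.5, point 1 within 2.5.
i_v, j_v, d_v = brute_force_pairs(data1, data2, [0.5, 2.5])
```

With the scalar radius only the pair (0, 0) is found; with the per-point radii only the pair (1, 1) is found.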

period : array_like, optional

Length-3 array defining the periodic boundary conditions (PBCs). If only one number is specified, the enclosing volume is assumed to be a periodic cube (by far the most common case). If period is set to None (the default), the PBCs are set to infinity, so no periodic wrapping is applied.
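The effect of period on a separation can be sketched with the standard minimum-image convention. The helper periodic_distance below is hypothetical and assumes positions lie between zero and period in each dimension; it illustrates the convention rather than reproducing the Halotools internals.

```python
import numpy as np

def periodic_distance(p1, p2, period):
    """Hypothetical minimum-image distance between points in a periodic box."""
    # A single number is treated as a periodic cube.
    period = np.broadcast_to(np.asarray(period, dtype=float), (3,))
    delta = np.abs(np.asarray(p1, dtype=float) - np.asarray(p2, dtype=float))
    # Along each axis, take the shorter of the direct and wrapped separations.
    delta = np.minimum(delta, period - delta)
    return np.sqrt(np.sum(delta**2, axis=-1))

Lbox = 250.0
a = np.array([1.0, 5.0, 5.0])
b = np.array([249.0, 5.0, 5.0])

d_periodic = periodic_distance(a, b, Lbox)  # separation across the box edge
d_open = np.linalg.norm(a - b)              # separation with no wrapping
```

Here the two points sit on opposite faces of the box, so the wrapped separation is 2 while the unwrapped separation is 248.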

verbose : Boolean, optional

If True, print out information and progress.

num_threads : int, optional

Number of CPU cores to use in the pair counting. If num_threads is set to the string ‘max’, use all available cores. Default is 1 thread for a serial calculation that does not open a multiprocessing pool.

approx_cell1_size : array_like, optional

Length-3 array serving as a guess for the optimal manner by which the RectangularDoubleMesh will apportion the data points into subvolumes of the simulation box. The optimum choice unavoidably depends on the specs of your machine. The default is 1/10 of the box size in each dimension, which gives reasonable performance for most use-cases. Performance can depend sensitively on this parameter, so it is highly recommended that you experiment with it when carrying out performance-critical calculations.

approx_cell2_size : array_like, optional

See comments for approx_cell1_size.
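As a rough picture of what apportioning points into subvolumes means, the sketch below bins random points into a regular grid of cells with plain NumPy. It is an assumption-laden illustration and not the RectangularDoubleMesh implementation; all variable names are made up for this example.

```python
import numpy as np

Lbox = 250.0
approx_cell_size = np.array([25.0, 25.0, 25.0])  # 1/10 of the box per dimension

# Use a whole number of cells per dimension, then recompute the exact cell size.
num_cells = np.maximum(1, np.floor(Lbox / approx_cell_size).astype(int))
cell_size = Lbox / num_cells

rng = np.random.default_rng(0)
points = rng.uniform(0, Lbox, size=(1000, 3))

# Integer cell coordinates of each point, clipped so that a point lying exactly
# on the upper box edge falls in the last cell.
cell_index = np.minimum((points // cell_size).astype(int), num_cells - 1)

# Flatten the 3-d cell coordinates into a single cell id and count occupants.
flat_id = np.ravel_multi_index(tuple(cell_index.T), tuple(num_cells))
counts = np.bincount(flat_id, minlength=int(num_cells.prod()))
```

Smaller cells mean fewer points per cell but more cells to visit, which is one reason the best cell size depends on the data and the machine.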

Returns:

distance : coo_matrix

sparse matrix in COO format containing distances between the ith entry in data1 and jth in data2.
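A COO matrix stores three parallel arrays, which scipy.sparse exposes as the row, col and data attributes. The sketch below uses a hand-built toy matrix as a stand-in for the returned result (the values are made up) to show how the pair indices and distances can be read off.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Toy stand-in for a returned distance matrix with N1 = 3 and N2 = 4.
dist_matrix = coo_matrix(
    (np.array([0.2, 0.7, 0.4]), (np.array([0, 0, 2]), np.array([1, 3, 0]))),
    shape=(3, 4),
)

i = dist_matrix.row   # indices into data1
j = dist_matrix.col   # indices into data2
d = dist_matrix.data  # separation of each stored pair

# Number of stored neighbours for each point in data1.
n_neighbors = np.bincount(i, minlength=dist_matrix.shape[0])
```

In this toy case the first point of data1 has two stored neighbours, the second has none, and the third has one.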

Examples

For demonstration purposes we create randomly distributed sets of points within a periodic cube of side length Lbox.

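>>> import numpy as np
>>> from halotools.mock_observables.pair_counters import pairwise_distance_3d
>>> Npts1, Npts2, Lbox = 1000, 1000, 250.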
>>> period = [Lbox, Lbox, Lbox]
>>> r_max = 1.0
>>> x1 = np.random.uniform(0, Lbox, Npts1)
>>> y1 = np.random.uniform(0, Lbox, Npts1)
>>> z1 = np.random.uniform(0, Lbox, Npts1)
>>> x2 = np.random.uniform(0, Lbox, Npts2)
>>> y2 = np.random.uniform(0, Lbox, Npts2)
>>> z2 = np.random.uniform(0, Lbox, Npts2)

We transform our x, y, z points into the array shape used by the pair-counter by taking the transpose of the result of numpy.vstack. This boilerplate transformation is used throughout the mock_observables sub-package:

>>> data1 = np.vstack([x1, y1, z1]).T
>>> data2 = np.vstack([x2, y2, z2]).T
>>> dist_matrix = pairwise_distance_3d(data1, data2, r_max, period=period)