pairwise_distance_xy_z

halotools.mock_observables.pair_counters.pairwise_distance_xy_z(data1, data2, rp_max, pi_max, period=None, verbose=False, num_threads=1, approx_cell1_size=None, approx_cell2_size=None)

Return the pairs of points whose xy-projected separation is less than or equal to rp_max and whose separation along z is less than or equal to pi_max.

Note that if data1 and data2 are the same set of points, pairwise_distance_xy_z double-counts pairs.
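The search criterion and the double-counting behaviour can be illustrated with a brute-force NumPy sketch. This is not the Halotools implementation: the helper brute_force_xy_z_pairs is hypothetical, it ignores periodic boundaries, and it is only practical for small inputs. It also reports each point as its own neighbour at zero separation, which may or may not match what pairwise_distance_xy_z does.

```python
import numpy as np

def brute_force_xy_z_pairs(data1, data2, rp_max, pi_max):
    """Hypothetical helper: indices (i, j) of pairs inside the search cylinder.

    No periodic boundary conditions are applied.
    """
    # Coordinate differences for every (i, j) combination, shape (N1, N2, 3)
    delta = data1[:, None, :] - data2[None, :, :]
    rp = np.hypot(delta[..., 0], delta[..., 1])  # xy-projected separation
    pi = np.abs(delta[..., 2])                   # separation along z
    return np.nonzero((rp <= rp_max) & (pi <= pi_max))

points = np.array([[0.0, 0.0, 0.0],
                   [0.5, 0.0, 1.0],
                   [5.0, 5.0, 5.0]])

# Passing the same array twice yields both (0, 1) and (1, 0),
# plus the self-matches (0, 0), (1, 1) and (2, 2).
i, j = brute_force_xy_z_pairs(points, points, rp_max=1.0, pi_max=2.0)

# Keep each unordered pair once and drop the self-matches
unique = i < j
pairs = list(zip(i[unique].tolist(), j[unique].tolist()))
```

Filtering on i < j is one simple way to recover a list of unique pairs when both inputs are the same sample.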

Parameters:
data1 : array_like

N1 by 3 numpy array of 3-dimensional positions. Values of each dimension should be between zero and the corresponding dimension of the input period.

data2 : array_like

N2 by 3 numpy array of 3-dimensional positions. Values of each dimension should be between zero and the corresponding dimension of the input period.

rp_max : array_like

Radius of the cylinder used to search for neighbors around each galaxy in data1. If a single float is given, the same rp_max is used for every galaxy in data1. You may instead pass an array of length N1, in which case each point in data1 gets its own projected search radius.

Length units assumed to be in Mpc/h, here and throughout Halotools.

pi_max : array_like

Half-length of the cylinder used to search for neighbors around each galaxy in data1. If a single float is given, the same pi_max is used for every galaxy in data1. You may instead pass an array of length N1, in which case each point in data1 gets its own cylinder half-length.

Length units assumed to be in Mpc/h, here and throughout Halotools.

period : array_like, optional

Length-3 array defining the periodic boundary conditions in each dimension. If only one number is specified, the enclosing volume is assumed to be a periodic cube (by far the most common case). If period is None (the default), no periodic boundary conditions are applied.

verbose : bool, optional

If True, print out information and progress.

num_threads : int, optional

Number of CPU cores to use in the pair counting. If num_threads is set to the string 'max', all available cores are used. Default is 1, which runs a serial calculation without opening a multiprocessing pool.

approx_cell1_size : array_like, optional

Length-3 array serving as a guess for the optimal manner by which the RectangularDoubleMesh will apportion the data points into subvolumes of the simulation box. The optimum choice unavoidably depends on the specs of your machine. The default is 1/10 of the box size in each dimension, which gives reasonable performance for most use-cases. Performance can vary sensitively with this parameter, so it is highly recommended that you experiment with it when carrying out performance-critical calculations.

approx_cell2_size : array_like, optional

See comments for approx_cell1_size.
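To make the roles of period, rp_max and pi_max concrete, here is a minimal NumPy sketch of separations measured under the minimum-image convention, followed by a per-point search cylinder. It assumes this is the convention implied by the periodic boundary conditions described above; the helper periodic_xy_z_separations is hypothetical and is not part of Halotools.

```python
import numpy as np

def periodic_xy_z_separations(data1, data2, period):
    """Hypothetical helper: xy-projected and z separations in a periodic box."""
    # A single number means a periodic cube; expand it to one length per axis
    period = np.broadcast_to(np.asarray(period, dtype=float), (3,))
    delta = np.abs(data1[:, None, :] - data2[None, :, :])
    # A separation longer than half the box is shorter going the other way round
    delta = np.where(delta > 0.5 * period, period - delta, delta)
    return np.hypot(delta[..., 0], delta[..., 1]), delta[..., 2]

data1 = np.array([[1.0, 1.0, 1.0],
                  [100.0, 100.0, 100.0]])
data2 = np.array([[249.5, 1.0, 249.0],
                  [100.0, 103.0, 100.0]])

rp, pi = periodic_xy_z_separations(data1, data2, period=250.0)

# One search radius and one half-length per point in data1,
# broadcast across all points in data2
rp_max = np.array([2.0, 1.0])
pi_max = np.array([3.0, 1.0])
inside = (rp <= rp_max[:, None]) & (pi <= pi_max[:, None])
```

Here the first point of data1 and the first point of data2 sit on opposite faces of the box, yet they are neighbours once the wrap-around is taken into account. The second point of data1 has a tighter cylinder, so its nearby companion in data2 is excluded.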

Returns:
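distance : coo_matrix

Sparse matrix in COO format containing the distances between the ith entry in data1 and the jth entry in data2. Note that the example below unpacks the return value into two such matrices, perp_dist_matrix and para_dist_matrix, whose names indicate the xy-projected and z separations respectively.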
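For readers unfamiliar with the COO layout, the sketch below builds a small matrix by hand and reads the pairs back out. It assumes the returned object is a scipy.sparse.coo_matrix; the index and distance values are made up purely for illustration.

```python
import numpy as np
from scipy.sparse import coo_matrix

# Made-up result: 3 points in data1, 4 points in data2, two pairs found
row = np.array([0, 2])       # indices i into data1
col = np.array([1, 3])       # indices j into data2
dist = np.array([0.4, 0.9])  # separation of each stored pair
distance = coo_matrix((dist, (row, col)), shape=(3, 4))

# Each stored entry is one (i, j, separation) triple
for i, j, d in zip(distance.row, distance.col, distance.data):
    print(f"data1[{i}] and data2[{j}] are separated by {d}")

# Number of stored neighbours for every point in data1
counts = np.bincount(distance.row, minlength=distance.shape[0])
```

Only pairs that satisfy the search criterion are stored, so the matrix stays small even when N1 times N2 is large.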

Examples

For demonstration purposes we create randomly distributed sets of points within a periodic cube of side length Lbox = 250 Mpc/h.

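>>> import numpy as np
>>> from halotools.mock_observables.pair_counters import pairwise_distance_xy_z
>>> Npts1, Npts2, Lbox = 1000, 1000, 250.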
>>> period = [Lbox, Lbox, Lbox]
>>> rp_max = 1.0
>>> pi_max = 2.0
>>> x1 = np.random.uniform(0, Lbox, Npts1)
>>> y1 = np.random.uniform(0, Lbox, Npts1)
>>> z1 = np.random.uniform(0, Lbox, Npts1)
>>> x2 = np.random.uniform(0, Lbox, Npts2)
>>> y2 = np.random.uniform(0, Lbox, Npts2)
>>> z2 = np.random.uniform(0, Lbox, Npts2)

We transform our x, y, z points into the array shape used by the pair-counter by taking the transpose of the result of numpy.vstack. This boilerplate transformation is used throughout the mock_observables sub-package:

>>> data1 = np.vstack([x1, y1, z1]).T
>>> data2 = np.vstack([x2, y2, z2]).T
>>> perp_dist_matrix, para_dist_matrix = pairwise_distance_xy_z(data1, data2, rp_max, pi_max, period=period)