Formatting your xyz coordinates for Mock Observables calculations
The mock_observables package adopts a specific convention for
how its functions accept spatial coordinate inputs.
If you have a collection of Npts coordinates for either Ndim=2 or Ndim=3,
the convention is that you will pass a multi-dimensional Numpy array
of shape (Npts, Ndim) storing the coordinates.
All the mock_observables functions that operate on multi-dimensional data
follow this convention. For example, tpcf, void_prob_func and mean_delta_sigma
all accept data formatted as an ndarray of shape (Npts, 3),
while angular_tpcf accepts an ndarray of shape (Npts, 2).
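As a minimal sketch of the convention, the snippet below checks that an array has the expected (Npts, Ndim) layout before it is handed to a function. The helper name has_expected_shape is hypothetical and is not part of mock_observables; it only illustrates the shape requirement using Numpy.

```python
import numpy as np


def has_expected_shape(coords, ndim):
    """Return True if coords is a 2-d array of shape (Npts, ndim).

    Hypothetical helper for illustration only; not part of mock_observables.
    """
    coords = np.asarray(coords)
    return coords.ndim == 2 and coords.shape[1] == ndim


# Five points in three dimensions: one row per point, one column per axis.
pos_3d = np.zeros((5, 3))
# The transposed layout (Ndim, Npts) does not follow the convention.
pos_transposed = pos_3d.T

ok_3d = has_expected_shape(pos_3d, 3)
ok_transposed = has_expected_shape(pos_transposed, 3)
```

Here ok_3d is True and ok_transposed is False, because the transposed array has shape (3, 5) rather than (5, 3).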
Example of how to transform your coordinates
Suppose you have a collection of x, y, z arrays storing the spatial positions of halos or galaxies.
>>> Npts = int(1e5)
>>> Lbox = 250
>>> import numpy as np
>>> x = np.random.uniform(0, Lbox, Npts)
>>> y = np.random.uniform(0, Lbox, Npts)
>>> z = np.random.uniform(0, Lbox, Npts)
To bundle these arrays into the multi-dimensional array shape
used by the mock_observables package, stack them and transpose the result:
>>> pos = np.vstack((x, y, z)).T
The pos array now has shape (Npts, 3) and can be passed directly, for example,
to the tpcf function as the first positional argument.
If you had two-dimensional data instead:
>>> ra = np.random.uniform(0, 2*np.pi, Npts)
>>> dec = np.random.uniform(-np.pi/2., np.pi/2, Npts)
>>> angular_coords = np.vstack((ra, dec)).T
The angular_coords array now has shape (Npts, 2) and can be passed directly, for example,
to the angular_tpcf function as the first positional argument.
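For reference, Numpy's column_stack builds the same (Npts, Ndim) array in one step, and slicing recovers the individual coordinates. This is a sketch using only Numpy; the small Npts and the variable names are chosen here for illustration.

```python
import numpy as np

Npts, Lbox = 1000, 250.0
rng = np.random.default_rng(seed=0)
x = rng.uniform(0, Lbox, Npts)
y = rng.uniform(0, Lbox, Npts)
z = rng.uniform(0, Lbox, Npts)

pos = np.vstack((x, y, z)).T          # shape (Npts, 3)
pos_alt = np.column_stack((x, y, z))  # same values, same shape

# Each column of pos is one of the original coordinate arrays.
x_back, y_back, z_back = pos[:, 0], pos[:, 1], pos[:, 2]
```

Either form works; the point is that each row holds one point and each column holds one coordinate axis.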
Using the return_xyz_formatted_array convenience function
When using the mock_observables package,
the above transformation is so common that there is a convenience function
dedicated to handling it:
>>> from halotools.mock_observables import return_xyz_formatted_array
>>> pos = return_xyz_formatted_array(x, y, z)
For this basic use, return_xyz_formatted_array gives the same result as
stacking the arrays with numpy.vstack and transposing.
However, the return_xyz_formatted_array function comes
with two additional features that are worthy of special mention.
Applying redshift-space distortions
For some science targets, you may wish to apply redshift-space distortions to your
coordinates before computing the observable statistic.
For example, redshift-space distortions (RSD) have a very significant impact on galaxy group identification,
and so most applications using the FoFGroups feature
will want to account for this effect.
To do so, you can use the velocity_distortion_dimension keyword argument together
with the velocity keyword argument, which stores an array with
the peculiar velocity in whatever dimension you want to distort. In the code below,
we’ll apply redshift-space distortions assuming the default cosmology and redshift:
>>> velz = np.random.normal(loc=0, scale=100, size=Npts)
>>> pos_zdist = return_xyz_formatted_array(x, y, z, velocity=velz, velocity_distortion_dimension='z')
Under the distant-observer approximation,
the pos_zdist array includes the effect of redshift-space distortions,
so that pos_zdist[:, 0] and pos_zdist[:, 1]
can serve as the directions perpendicular to the line-of-sight,
and pos_zdist[:, 2] as the direction parallel to the line-of-sight.
You may wish to use the return_xyz_formatted_array function to apply realistic redshift-space
distortions for mock galaxy samples “observed” at higher redshift, and/or assuming a different cosmology.
This can be handled using the redshift and/or cosmology keyword arguments:
>>> from astropy.cosmology import Planck15
>>> redshift = 0.45
>>> velz = np.random.normal(loc=0, scale=100, size=Npts)
>>> pos_zdist = return_xyz_formatted_array(x, y, z, velocity=velz, velocity_distortion_dimension='z', cosmology=Planck15, redshift=redshift)
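To make the transformation concrete, here is a rough sketch of a distant-observer redshift-space shift written with Numpy only. It is not the halotools implementation: it assumes positions in Mpc/h, velocities in km/s, a flat LCDM expansion rate with a matter density Om0 chosen for illustration, and a periodic box of side Lbox. The function name distort_z_sketch is hypothetical.

```python
import numpy as np


def distort_z_sketch(z, velz, redshift=0.0, Om0=0.3, Lbox=250.0):
    """Shift z by the line-of-sight peculiar velocity (illustrative only).

    Assumes z in Mpc/h, velz in km/s, flat LCDM with matter density Om0,
    and a periodic box of side Lbox. Not the halotools implementation.
    """
    # Dimensionless expansion rate E(z) = H(z) / H0 for flat LCDM.
    efunc = np.sqrt(Om0 * (1.0 + redshift) ** 3 + (1.0 - Om0))
    # Comoving displacement v / (a H); H0 = 100 h km/s/Mpc for Mpc/h units.
    shift = velz * (1.0 + redshift) / (100.0 * efunc)
    # Wrap points that leave the box back into it (periodic boundaries).
    return (z + shift) % Lbox


rng = np.random.default_rng(seed=0)
Npts, Lbox = 1000, 250.0
x, y, z = rng.uniform(0, Lbox, (3, Npts))
velz = rng.normal(loc=0, scale=100, size=Npts)

# Only the line-of-sight coordinate changes; x and y are left untouched.
z_dist = distort_z_sketch(z, velz, redshift=0.45, Lbox=Lbox)
pos_zdist_sketch = np.column_stack((x, y, z_dist))
```

Under these assumptions, a velocity of 100 km/s at redshift zero corresponds to a shift of 1 Mpc/h along the line of sight, which gives a sense of the scale of the effect.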
Selecting subsamples
The return_xyz_formatted_array function also accepts a mask keyword argument
that lets you retrieve a specific subsample of your coordinates.
Let’s see how this works in a realistic example:
retrieving the spatial positions of quiescent and star-forming samples
in a mock galaxy catalog.
>>> from halotools.empirical_models import PrebuiltSubhaloModelFactory
>>> model = PrebuiltSubhaloModelFactory('smhm_binary_sfr')
>>> from halotools.sim_manager import FakeSim
>>> halocat = FakeSim()
>>> model.populate_mock(halocat)
Our model now has a mock object attached to it with a galaxy_table
storing the mock galaxies in the form of an Astropy Table.
>>> x = model.mock.galaxy_table['x']
>>> y = model.mock.galaxy_table['y']
>>> z = model.mock.galaxy_table['z']
>>> red_sample_mask = model.mock.galaxy_table['quiescent'] == True
>>> red_pos = return_xyz_formatted_array(x, y, z, mask=red_sample_mask)
>>> blue_pos = return_xyz_formatted_array(x, y, z, mask=~red_sample_mask)
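The effect of the mask can be pictured with plain Numpy boolean indexing. The sketch below assumes that masking is equivalent to selecting rows of the stacked array, which is an illustration rather than a description of the halotools internals; the quiescent array here is randomly generated stand-in data.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
Npts = 1000
x, y, z = rng.uniform(0, 250.0, (3, Npts))
# Stand-in boolean flag; a real catalog would supply this column.
quiescent = rng.uniform(0, 1, Npts) < 0.5

pos = np.column_stack((x, y, z))
red_pos = pos[quiescent]    # rows where the flag is True
blue_pos = pos[~quiescent]  # rows where the flag is False
```

The two subsamples together contain every point exactly once, and each keeps the (Npts, 3) layout with a smaller Npts.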