:orphan: .. _mock_obs_pos_formatting: ************************************************************************** Formatting your xyz coordinates for Mock Observables calculations ************************************************************************** The `~halotools.mock_observables` package adopts a specific convention for how its functions accept spatial coordinate inputs. If you have a collection of *Npts* coordinates for either *Ndim=2* or *Ndim=3*, the convention is that you will pass a multi-dimensional Numpy array of shape *(Npts, Ndim)* storing the coordinates. All the `~halotools.mock_observables` functions that operate on multi-dimensional data follow this convention. For example, `~halotools.mock_observables.tpcf`, `~halotools.mock_observables.void_prob_func` and `~halotools.mock_observables.mean_delta_sigma` all accept data formatted as `~numpy.ndarray` of shape *(Npts, 3)*, while `~halotools.mock_observables.angular_tpcf` accepts a `~numpy.ndarray` of shape *(Npts, 2)*. Example of how to transform your coordinates =============================================== Suppose you have a collection of *x, y, z* arrays storing the spatial positions of halos or galaxies. >>> Npts = int(1e5) >>> Lbox = 250 >>> import numpy as np >>> x = np.random.uniform(0, Lbox, Npts) >>> y = np.random.uniform(0, Lbox, Npts) >>> z = np.random.uniform(0, Lbox, Npts) In order to bundle these arrays into the shape of the multi-dimensional array used by the `~halotools.mock_observables` package: >>> pos = np.vstack((x, y, z)).T The ``pos`` array is now formatted in a form that can be directly passed, for example, to the `~halotools.mock_observables.tpcf` function as the first positional argument. If you had two-dimensional data instead: >>> ra = np.random.uniform(0, 2*np.pi, Npts) >>> dec = np.random.uniform(-np.pi/2., np.pi/2, Npts) >>> angular_coords = np.vstack((ra, dec)).T The ``angular_coords`` array is now formatted in a form that can be directly passed, for example, to the `~halotools.mock_observables.angular_tpcf` function as the first positional argument. Using the `~halotools.mock_observables.return_xyz_formatted_array` convenience function ========================================================================================= When using the `~halotools.mock_observables` package, the above transformation is so commonly encountered that there is a convenience function dedicated to handling it: >>> from halotools.mock_observables import return_xyz_formatted_array >>> pos = return_xyz_formatted_array(x, y, z) There is no difference between using `~halotools.mock_observables.return_xyz_formatted_array` or `numpy.vstack`. However, the `~halotools.mock_observables.return_xyz_formatted_array` function comes with two additional features that are worthy of special mention. Applying redshift-space distortions --------------------------------------- For some science targets, you may wish to apply redshift-space distortions to your coordinates before computing the observable statistic. For example, RSD has a very significant impact on galaxy group identification, and so most applications using the `~halotools.mock_observables.FoFGroups` feature will want to account for this effect. To do, you can use the ``velocity_distortion_dimension`` keyword argument together with the ``velocity`` keyword storing an array with the peculiar velocity in whatever dimension you want to distort. In the code below, we'll apply redshift-space distortions assuming the default cosmology and redshift: >>> velz = np.random.normal(loc=0, scale=100, size=Npts) >>> pos_zdist = return_xyz_formatted_array(x, y, z, velocity=velz, velocity_distortion_dimension='z') Under the distant-observer approximation, the ``pos_zdist`` array includes the effect of redshift-space distortions, so that pos_zdist[:, 0] and pos_zdist[:,1] slices can serve as the directions perpendicular to the line-of-sight, and pos_zdist[:, 2] the direction parallel to the line-of-sight. You may wish to use the `return_xyz_formatted_array` function to apply realistic z-space distortions for mock galaxy samples "observed" at higher redshift, and/or assuming a different cosmology. This can be handled using the ``redshift`` and/or ``cosmology`` keyword arguments: >>> from astropy.cosmology import Planck15 >>> redshift = 0.45 >>> velz = np.random.normal(loc=0, scale=100, size=Npts) >>> pos_zdist = return_xyz_formatted_array(x, y, z, velocity=velz, velocity_distortion_dimension='z', cosmology=Planck15, redshift=redshift) Selecting subsamples ----------------------- There is an additional feature of the `~halotools.mock_observables.return_xyz_formatted_array` function that allows you to retrieve a specific subsample of your coordinates. Let's see how this works in a realistic example: retrieving the spatial positions of quiescent and star-forming samples in a mock galaxy catalog. >>> from halotools.empirical_models import PrebuiltSubhaloModelFactory >>> model = PrebuiltSubhaloModelFactory('smhm_binary_sfr') >>> from halotools.sim_manager import FakeSim >>> halocat = FakeSim() >>> model.populate_mock(halocat) Our ``model`` now has a ``mock`` object attached to it with a ``galaxy_table`` storing the mock galaxies in the form of an Astropy `~astropy.table.Table`. >>> x = model.mock.galaxy_table['x'] >>> y = model.mock.galaxy_table['y'] >>> z = model.mock.galaxy_table['z'] >>> red_sample_mask = model.mock.galaxy_table['quiescent'] == True >>> red_pos = return_xyz_formatted_array(x, y, z, mask = red_sample_mask) >>> blue_pos = return_xyz_formatted_array(x, y, z, mask = ~red_sample_mask)