MockFactory

class halotools.empirical_models.MockFactory(**kwargs)

Bases: object

Abstract base class responsible for populating a simulation with a synthetic galaxy population.

MockFactory is an abstract base class, and cannot be instantiated. Concrete sub-classes of MockFactory such as HodMockFactory and SubhaloMockFactory are the objects used to populate simulations with galaxies.

Parameters:

halocat : object

Object containing the halo catalog and other associated data. Produced by CachedHaloCatalog.

model : object

A model built by a sub-class of ModelFactory.

Attributes Summary

number_density Comoving number density of the mock galaxy catalog.
satellite_fraction Fraction of mock galaxies that are satellites.

Methods Summary

compute_fof_group_ids([zspace, b_perp, b_para]) Method computes the friends-of-friends group IDs of the mock galaxy catalog after (optionally) placing the mock into redshift space.
compute_galaxy_clustering([include_crosscorr]) Built-in method for all mock catalogs to compute the galaxy clustering signal.
compute_galaxy_matter_cross_clustering([…]) Built-in method for all mock catalogs to compute the galaxy-matter cross-correlation function.
populate(**kwargs) Method populating halos with mock galaxies.

Attributes Documentation

number_density

Comoving number density of the mock galaxy catalog.

Returns:

number_density : float

Comoving number density in units of \((h/Mpc)^{3}\).
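As a rough illustration of this quantity, the comoving number density is the galaxy count divided by the comoving volume. The sketch below assumes a periodic cubic box; the function name and the inputs `num_galaxies` and `lbox` are hypothetical and are not attributes documented on this page.

```python
def comoving_number_density(num_galaxies, lbox):
    """Return galaxies per unit comoving volume, in (h/Mpc)^3.

    num_galaxies : number of mock galaxies (hypothetical input)
    lbox : side length of a cubic box in Mpc/h (hypothetical input)
    """
    if lbox <= 0:
        raise ValueError("lbox must be positive")
    return num_galaxies / lbox**3


# 250,000 galaxies in a box of side 250 Mpc/h
density = comoving_number_density(250_000, 250.0)
```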

satellite_fraction

Fraction of mock galaxies that are satellites.
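A minimal sketch of what this fraction measures, assuming each galaxy carries a `gal_type` label equal to either 'centrals' or 'satellites' (the labels used in the examples on this page). The helper name is hypothetical and a plain list stands in for the catalog column.

```python
def satellite_fraction(gal_types):
    """Fraction of labels equal to 'satellites' in a sequence of gal_type labels."""
    if len(gal_types) == 0:
        raise ValueError("empty catalog")
    num_satellites = sum(1 for label in gal_types if label == 'satellites')
    return num_satellites / len(gal_types)


# Toy catalog: 2 satellites out of 5 galaxies
gal_types = ['centrals', 'satellites', 'centrals', 'satellites', 'centrals']
frac = satellite_fraction(gal_types)
```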

Methods Documentation

compute_fof_group_ids(zspace=True, b_perp=0.2, b_para=0.75, **kwargs)

Method computes the friends-of-friends group IDs of the mock galaxy catalog after (optionally) placing the mock into redshift space.

Parameters:

zspace : bool, optional

Boolean determining whether we apply redshift-space distortions to the positions of galaxies using the distant-observer approximation. Default is True.

b_perp : float, optional

Maximum linking length in the perpendicular direction, normalized by the mean separation between galaxies. Default is set in model_defaults module.

b_para : float, optional

Maximum linking length in the line-of-sight direction, normalized by the mean separation between galaxies. Default is set in model_defaults module.

num_threads : int, optional

Number of CPU cores to use in the calculation. Default is maximum number available.

Returns:

ids : array

Integer array containing the group ID of each mock galaxy.

Notes

The compute_fof_group_ids method bound to mock instances is just a convenience wrapper around the FoFGroups class. If you wish for greater control over how the group IDs are computed, see the group_ids documentation.
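To make the normalization of b_perp and b_para concrete, the sketch below converts the dimensionless linking lengths into lengths in Mpc/h. It assumes the mean separation between galaxies is the inverse cube root of the number density, which is the usual friends-of-friends convention; the helper name is hypothetical and this is not the FoFGroups implementation.

```python
def fof_linking_lengths(number_density, b_perp=0.2, b_para=0.75):
    """Convert dimensionless linking lengths to lengths in Mpc/h.

    Assumption: mean inter-galaxy separation = number_density ** (-1/3),
    with number_density in (h/Mpc)^3.
    """
    if number_density <= 0:
        raise ValueError("number_density must be positive")
    mean_separation = number_density ** (-1.0 / 3.0)
    return b_perp * mean_separation, b_para * mean_separation


# A density of 1e-3 (h/Mpc)^3 corresponds to a mean separation of 10 Mpc/h
l_perp, l_para = fof_linking_lengths(1e-3)
```

With the default values, the line-of-sight linking length is larger than the perpendicular one, which allows for the elongation of groups along the line of sight in redshift space.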

compute_galaxy_clustering(include_crosscorr=False, **kwargs)

Built-in method for all mock catalogs to compute the galaxy clustering signal.

Parameters:

variable_galaxy_mask : scalar, optional

Any value used to construct a mask to select a sub-population of mock galaxies. See examples below.

include_crosscorr : bool, optional

Only for simultaneous use with a variable_galaxy_mask-determined mask. If include_crosscorr is set to False (the default option), the method will return the auto-correlation function of the subsample of galaxies determined by the input variable_galaxy_mask. If include_crosscorr is True, the method will return the auto-correlation of the subsample, the cross-correlation of the subsample and the complementary subsample, and the auto-correlation of the complementary subsample, in that order. See examples below.

mask_function : function, optional

Function object returning a masking array when operating on the galaxy_table. More flexible than the simpler variable_galaxy_mask option because mask_function allows for the possibility of multiple simultaneous cuts. See examples below.

rbins : array, optional

Bins in which the correlation function will be calculated. Default is set in model_defaults module.

num_threads : int, optional

Number of CPU cores to use in the calculation. Default is maximum number available.

Returns:

rbin_centers : array

Midpoint of the bins used in the correlation function calculation.

correlation_func : array

If not using any mask (the default option), method returns the correlation function of the full mock galaxy catalog.

If using a mask, and if include_crosscorr is False (the default option), method returns the correlation function of the subsample of galaxies determined by the input mask.

If using a mask, and if include_crosscorr is True, the method will return the auto-correlation of the subsample, the cross-correlation of the subsample and the complementary subsample, and the auto-correlation of the complementary subsample, in that order. See the example below.

Notes

The compute_galaxy_clustering method bound to mock instances is just a convenience wrapper around the tpcf function. If you wish for greater control over how your galaxy clustering signal is estimated, see the tpcf documentation.
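The relation between the input rbins and the returned rbin_centers can be sketched as follows. This assumes arithmetic midpoints of adjacent bin edges, which is one reading of "midpoint" in the Returns section; the helper name and the bin edges are hypothetical.

```python
def bin_midpoints(rbins):
    """Arithmetic midpoint of each pair of adjacent bin edges."""
    return [0.5 * (low + high) for low, high in zip(rbins[:-1], rbins[1:])]


rbins = [0.1, 1.0, 10.0]  # hypothetical bin edges in Mpc/h
rbin_centers = bin_midpoints(rbins)
```

Note that N bin edges produce N - 1 midpoints, so the returned correlation function arrays have one fewer element than rbins.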

Examples

Compute two-point clustering of all galaxies in the mock:

>>> r, clustering = mock.compute_galaxy_clustering() 

Compute two-point clustering of central galaxies only:

>>> r, clustering = mock.compute_galaxy_clustering(gal_type = 'centrals') 

Compute two-point clustering of quiescent galaxies, star-forming galaxies, as well as the cross-correlation:

>>> r, quiescent_clustering, q_sf_cross_clustering, star_forming_clustering = mock.compute_galaxy_clustering(quiescent = True, include_crosscorr = True) 

Finally, suppose we wish to ask a very targeted question about how some physical effect impacts the clustering of galaxies in a specific halo mass range. For example, suppose we wish to study the two-point function of satellite galaxies residing in cluster-mass halos. For this we can use the more flexible mask_function option to select our population:

>>> def my_masking_function(table):
...     result = (table['halo_mvir'] > 1e14) & (table['gal_type'] == 'satellites')
...     return result
>>> r, cluster_sat_clustering = mock.compute_galaxy_clustering(mask_function = my_masking_function)
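
To show what a masking function selects, and what "the complementary subsample" refers to, here is a dependency-free sketch. It uses a dictionary of plain lists as a stand-in for the galaxy table, so the cut is written element by element rather than with array operations; `toy_table` and `toy_masking_function` are hypothetical names.

```python
def toy_masking_function(table):
    """Select satellites in cluster-mass halos, one boolean per galaxy."""
    return [
        (mass > 1e14) and (gal_type == 'satellites')
        for mass, gal_type in zip(table['halo_mvir'], table['gal_type'])
    ]


# Stand-in for the galaxy table: one entry per mock galaxy
toy_table = {
    'halo_mvir': [5e12, 2e14, 3e14, 8e13],
    'gal_type': ['centrals', 'satellites', 'centrals', 'satellites'],
}

mask = toy_masking_function(toy_table)

# Row indices of the selected subsample and of its complement
selected = [i for i, keep in enumerate(mask) if keep]
complement = [i for i, keep in enumerate(mask) if not keep]
```

Every galaxy falls in exactly one of the two subsamples, which is why a single mask is enough to define both auto-correlations and the cross-correlation.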

compute_galaxy_matter_cross_clustering(include_complement=False, seed=None, **kwargs)

Built-in method for all mock catalogs to compute the galaxy-matter cross-correlation function.

Parameters:

variable_galaxy_mask : scalar, optional

Any value used to construct a mask to select a sub-population of mock galaxies. See examples below.

include_complement : bool, optional

Only for simultaneous use with a variable_galaxy_mask-determined mask. If include_complement is set to False (the default option), method will return the cross-correlation function between a random downsampling of dark matter particles and the subsample of galaxies determined by the input variable_galaxy_mask. If include_complement is True, method will also return the cross-correlation between the dark matter particles and the complementary subsample. See examples below.

mask_function : function, optional

Function object returning a masking array when operating on the galaxy_table. More flexible than the simpler variable_galaxy_mask option because mask_function allows for the possibility of multiple simultaneous cuts. See examples below.

rbins : array, optional

Bins in which the correlation function will be calculated. Default is set in model_defaults module.

num_threads : int, optional

Number of CPU cores to use in the calculation. Default is maximum number available.

seed : integer, optional

Random number seed used when drawing random numbers with numpy.random. Useful when deterministic results are desired, such as during unit-testing. Default is None, producing stochastic results.

Returns:

rbin_centers : array

Midpoint of the bins used in the correlation function calculation.

correlation_func : array

If not using a mask (the default option), method returns the cross-correlation function between a random downsampling of dark matter particles and the full mock galaxy catalog.

If using a mask, and if include_complement is False (the default option), method returns the cross-correlation function between a random downsampling of dark matter particles and the subsample of galaxies determined by the input mask.

If using a mask, and if include_complement is True, method will also return the cross-correlation between the dark matter particles and the complementary subsample. See examples below.

Notes

The compute_galaxy_matter_cross_clustering method bound to mock instances is just a convenience wrapper around the tpcf function. If you wish for greater control over how your galaxy-matter cross-correlation signal is estimated, see the tpcf documentation.
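The role of the seed argument in the random downsampling of dark matter particles can be illustrated with a small sketch. It uses Python's standard `random` module as a stand-in for numpy.random, and `downsample_particles` is a hypothetical helper, not the routine used by this method.

```python
import random


def downsample_particles(particles, num_keep, seed=None):
    """Randomly keep num_keep particles without replacement.

    A fixed seed makes the draw repeatable; seed=None gives a
    different draw on each call.
    """
    rng = random.Random(seed)
    return rng.sample(particles, num_keep)


particles = list(range(1000))  # stand-in for dark matter particle indices
first = downsample_particles(particles, 10, seed=43)
second = downsample_particles(particles, 10, seed=43)
```

Two calls with the same seed return the same subsample, which is the behavior that makes seeded results suitable for unit tests.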

Examples

Compute two-point clustering between all mock galaxies and dark matter particles:

>>> r, galaxy_matter_clustering = mock.compute_galaxy_matter_cross_clustering() 

Compute the same quantity but for central galaxies only:

>>> r, central_galaxy_matter_clustering = mock.compute_galaxy_matter_cross_clustering(gal_type = 'centrals')

Compute the galaxy-matter cross-clustering for quiescent galaxies and for star-forming galaxies:

>>> r, quiescent_matter_clustering, star_forming_matter_clustering = mock.compute_galaxy_matter_cross_clustering(quiescent = True, include_complement = True) 

Finally, suppose we wish to ask a very targeted question about how some physical effect impacts the clustering of galaxies in a specific halo mass range. For example, suppose we wish to study the galaxy-matter cross-correlation function of satellite galaxies residing in cluster-mass halos. For this we can use the more flexible mask_function option to select our population:

>>> def my_masking_function(table):
...     result = (table['halo_mvir'] > 1e14) & (table['gal_type'] == 'satellites')
...     return result
>>> r, cluster_sat_clustering = mock.compute_galaxy_matter_cross_clustering(mask_function = my_masking_function)

populate(**kwargs)

Method populating halos with mock galaxies.

By calling the populate method of your mock, you will repopulate the halo catalog with a new realization of the model based on whatever values of the model parameters are currently stored in the param_dict of the model.

For documentation specific to the populate method of subhalo-based models, see halotools.empirical_models.SubhaloMockFactory.populate; for HOD-style models see halotools.empirical_models.HodMockFactory.populate.

Notes

Note the difference between the halotools.empirical_models.MockFactory.populate method and the closely related method halotools.empirical_models.ModelFactory.populate_mock. The populate_mock method is bound to a composite model instance and is called the first time a composite model is used to generate a mock. Calling the populate_mock method creates the MockFactory instance and binds it to the composite model. From then on, if you want to repopulate a new Universe with the same composite model, you should instead call the populate method bound to model.mock. The reason for this distinction is that calling populate_mock triggers a large number of relatively expensive pre-processing steps and self-consistency checks that need only be carried out once. See the Examples section below for an explicit demonstration.

In particular, if you are running an MCMC type analysis, you will choose your halo catalog and completeness cuts, and call populate_mock with the appropriate arguments. Thereafter, you can explore parameter space by changing the values stored in the param_dict dictionary attached to the model, and then calling the populate method bound to model.mock. Any changes to the param_dict of the model will automatically propagate into the behavior of the populate method.

Normally, repeated calls to the populate method should not increase the RAM usage of halotools because a new mock catalog is created and the old one deleted. However, on certain machines the memory usage was found to increase over time. If this is the case and memory usage is critical, you can try calling gc.collect() immediately following the call to mock.populate to manually invoke Python’s garbage collection.
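The pattern described in these notes (change param_dict, call populate, optionally collect garbage) can be sketched without halotools installed. `ToyMock` below is a hypothetical stand-in for model.mock and the way it sizes its catalog is invented for illustration; only the loop structure and the `gc.collect()` call reflect the notes above.

```python
import gc


class ToyMock:
    """Hypothetical stand-in for model.mock; not part of halotools."""

    def __init__(self, param_dict):
        self.param_dict = param_dict  # shared with the caller, not copied
        self.galaxy_table = None

    def populate(self):
        # Replace the old catalog with one built from the current parameters
        size = int(round(100 * self.param_dict['logMmin']))
        self.galaxy_table = [0.0] * size


param_dict = {'logMmin': 12.0}
mock = ToyMock(param_dict)

sizes = []
for logMmin in (12.0, 12.1, 12.2):
    param_dict['logMmin'] = logMmin  # update the parameter in place
    mock.populate()                  # repopulate using the new value
    sizes.append(len(mock.galaxy_table))
    gc.collect()                     # optionally force garbage collection
```

Because the stand-in reads the shared dictionary at call time, each populate call reflects the most recent parameter value, mirroring the way changes to param_dict propagate in the notes above.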

Examples

We’ll use a pre-built HOD-style model to demonstrate basic usage. The same syntax applies to subhalo-based models.

>>> from halotools.empirical_models import PrebuiltHodModelFactory
>>> model_instance = PrebuiltHodModelFactory('zheng07')

Here we will use a fake simulation, but you can populate mocks using any instance of CachedHaloCatalog or UserSuppliedHaloCatalog.

>>> from halotools.sim_manager import FakeSim
>>> halocat = FakeSim()
>>> model_instance.populate_mock(halocat)

Your model_instance now has a mock attribute bound to it. You can call the populate method bound to the mock, which will repopulate the halo catalog with a new Monte Carlo realization of the model.

>>> model_instance.mock.populate()

If you want to change the behavior of your model, just change the values stored in the param_dict. Differences in the parameter values will change the behavior of the mock-population.

>>> model_instance.param_dict['logMmin'] = 12.1
>>> model_instance.mock.populate()