HodMockFactory

class halotools.empirical_models.HodMockFactory(Num_ptcl_requirement=300, halo_mass_column_key='halo_mvir', **kwargs)

Bases: halotools.empirical_models.MockFactory

Class responsible for populating a simulation with a population of mock galaxies based on an HOD-style model built by the HodModelFactory class.

Can be thought of as a factory that takes a model and simulation halocat as input, and generates a mock galaxy population. The returned collection of galaxies possesses whatever attributes were requested by the model, such as xyz position, central/satellite designation, star-formation rate, etc.

See Tutorial on the algorithm for HOD-based mock-making for an in-depth tutorial on the mock-making algorithm.

Parameters:

halocat : object, keyword argument

Object containing the halo catalog and other associated data. Produced by CachedHaloCatalog.

model : object, keyword argument

A model built by a sub-class of HodModelFactory.

populate : boolean, optional

If set to False, the class will perform all pre-processing tasks but will not call the model to populate the galaxy_table with mock galaxies and their observable properties. Default is True.

Num_ptcl_requirement : int, optional

Requirement on the number of dark matter particles in the halo. The column defined by the halo_mass_column_key string will have a cut placed on it: all halos with halocat.halo_table[halo_mass_column_key] < Num_ptcl_requirement*halocat.particle_mass will be thrown out immediately after reading the original halo catalog into memory. Default is 300.

halo_mass_column_key : string, optional

This string must be the name of a column of the input halo catalog. The column defined by this string will have a cut placed on it: all halos with halocat.halo_table[halo_mass_column_key] < Num_ptcl_requirement*halocat.particle_mass will be thrown out immediately after reading the original halo catalog into memory. Default is 'halo_mvir'.
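The completeness cut described for Num_ptcl_requirement and halo_mass_column_key can be sketched with plain NumPy. This is a minimal illustration rather than the halotools implementation: the structured array is a hypothetical stand-in for halocat.halo_table, and the particle mass and halo masses are made-up numbers.

```python
import numpy as np

# Hypothetical stand-ins for halocat.halo_table and halocat.particle_mass
halo_table = np.array(
    [(1, 5.0e10), (2, 2.0e11), (3, 4.0e12), (4, 2.9e10)],
    dtype=[("halo_id", "i8"), ("halo_mvir", "f8")],
)
particle_mass = 1.0e8  # assumed value, in the same units as halo_mvir

Num_ptcl_requirement = 300
halo_mass_column_key = "halo_mvir"

# Halos below this mass are resolved by fewer than Num_ptcl_requirement particles
mass_cut = Num_ptcl_requirement * particle_mass

# Keep only the halos that pass the cut
keep = halo_table[halo_mass_column_key] >= mass_cut
halo_table = halo_table[keep]
```

With these numbers the threshold is 3e10, so only the halo with mass 2.9e10 is discarded.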

Methods Summary

allocate_memory([seed]) Method allocates the memory for all the numpy arrays that will store the information about the mock.
estimate_ngals([seed]) Method to estimate the number of galaxies produced by the mock.populate() method.
populate([seed]) Method populating host halos with mock galaxies.
preprocess_halo_catalog(halocat) Method to pre-process a halo catalog upon instantiation of the mock object.

Methods Documentation

allocate_memory(seed=None)

Method allocates the memory for all the numpy arrays that will store the information about the mock. These arrays are bound directly to the mock object.

The main bookkeeping devices generated by this method are _occupation and _gal_type_indices.
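A rough sketch of how this kind of bookkeeping can work is given here. It assumes that _occupation holds the per-halo galaxy counts for each galaxy type and that _gal_type_indices holds the slice of the galaxy arrays owned by each type; both readings are assumptions, and the variable names and values are illustrative only.

```python
import numpy as np

# Assumed layout: number of galaxies of each type in each of four halos
occupation = {
    "centrals": np.array([1, 0, 1, 1]),
    "satellites": np.array([2, 0, 0, 5]),
}

# Assign each galaxy type a contiguous slice of the galaxy arrays
gal_type_indices = {}
first = 0
for gal_type, counts in occupation.items():
    last = first + int(counts.sum())
    gal_type_indices[gal_type] = slice(first, last)
    first = last
ngals = first

# Pre-allocate one array per galaxy property, then fill it slice by slice
halo_x = np.array([10.0, 20.0, 30.0, 40.0])
galaxy_host_x = np.zeros(ngals)
for gal_type, counts in occupation.items():
    # Repeat each halo's value once per galaxy it hosts
    galaxy_host_x[gal_type_indices[gal_type]] = np.repeat(halo_x, counts)
```

Allocating the full arrays once and writing into slices avoids repeatedly concatenating per-type tables.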

estimate_ngals(seed=None)

Method to estimate the number of galaxies produced by the mock.populate() method. It runs one realization of all mc_occupation methods and reports the total number of galaxies produced, without allocating any extra memory for the galaxy tables. Note that mock.populate() will invoke a new call to all mc_occupation methods and can therefore produce a different number of galaxies.
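The reason the estimate can differ from a later populate call is that each call is an independent Monte Carlo draw. The toy sketch below assumes Poisson-distributed occupations with made-up mean values; the real mc_occupation methods are defined by the model, and total_galaxies is a hypothetical helper.

```python
import numpy as np

def total_galaxies(mean_occupation, seed=None):
    """Draw one Poisson realization per halo and return the total count."""
    rng = np.random.default_rng(seed)
    return int(rng.poisson(mean_occupation).sum())

# Made-up mean number of galaxies per halo
mean_occupation = np.array([0.2, 1.5, 3.0, 8.0, 0.0])

estimate = total_galaxies(mean_occupation, seed=43)
same_seed = total_galaxies(mean_occupation, seed=43)   # reproduces the estimate
other_seed = total_galaxies(mean_occupation, seed=44)  # independent realization
```

Only a repeated seed guarantees the same total; a different seed, or no seed at all, gives an independent realization whose total will generally differ.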

populate(seed=None, **kwargs)

Method populating host halos with mock galaxies.

By calling the populate method of your mock, you will repopulate the halo catalog with a new realization of the model based on whatever values of the model parameters are currently stored in the param_dict of the model.

For an in-depth discussion of how this method is implemented, see the Tutorial on the algorithm for HOD-based mock-making section of the documentation.

Parameters:

masking_function : function, optional

Function object used to place a mask on the halo table prior to calling the mock generating functions. The function should accept a single positional argument storing a table and return a boolean numpy array that will be used as a fancy indexing mask. All masked halos will be ignored during mock population. Default is None.

enforce_PBC : bool, optional

If set to True, after galaxy positions are assigned, the model_helpers.enforce_periodicity_of_box function will re-map satellite galaxies whose positions spilled over the edge of the periodic box. Default is True. This variable should only ever be set to False when using the masking_function to populate a specific spatial subvolume, as in that case PBCs no longer apply.

seed : int, optional

Random number seed used in the Monte Carlo realization. Default is None, which will produce stochastic results.
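A masking function of the kind described above might look like the following sketch. The column names halo_x, halo_y and halo_z, the half_box value, and the structured array standing in for the halo table are all assumptions made for illustration.

```python
import numpy as np

def octant_mask(table, half_box=125.0):
    """Return a boolean array that is True for halos with x, y, z < half_box."""
    return (
        (table["halo_x"] < half_box)
        & (table["halo_y"] < half_box)
        & (table["halo_z"] < half_box)
    )

# Hypothetical stand-in for the halo table
halo_table = np.array(
    [
        (10.0, 20.0, 30.0),
        (200.0, 20.0, 30.0),
        (100.0, 124.0, 1.0),
        (126.0, 126.0, 126.0),
    ],
    dtype=[("halo_x", "f8"), ("halo_y", "f8"), ("halo_z", "f8")],
)

mask = octant_mask(halo_table)   # one boolean per halo
selected = halo_table[mask]      # table restricted by the mask
```

Such a function would be passed through the masking_function keyword of populate. Because it selects a spatial subvolume, this is the case in which enforce_PBC=False is appropriate.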

Notes

Note the difference between the halotools.empirical_models.HodMockFactory.populate method and the closely related method halotools.empirical_models.HodModelFactory.populate_mock. The populate_mock method is bound to a composite model instance and is called the first time a composite model is used to generate a mock. Calling the populate_mock method creates the HodMockFactory instance and binds it to the composite model. From then on, if you want to repopulate a new Universe with the same composite model, you should instead call the populate method bound to model.mock. The reason for this distinction is that calling populate_mock triggers a large number of relatively expensive pre-processing steps and self-consistency checks that need only be carried out once. See the Examples section below for an explicit demonstration.

In particular, if you are running an MCMC-type analysis, you will choose your halo catalog and completeness cuts, and call halotools.empirical_models.ModelFactory.populate_mock with the appropriate arguments. Thereafter, you can explore parameter space by changing the values stored in the param_dict dictionary attached to the model, and then calling the populate method bound to model.mock. Any changes to the param_dict of the model will automatically propagate into the behavior of the populate method.

Normally, repeated calls to the populate method should not increase the RAM usage of halotools, because a new mock catalog is created and the old one deleted. However, on certain machines the memory usage was found to increase over time. If this is the case and memory usage is critical, you can try calling gc.collect() immediately following the call to mock.populate to manually invoke Python's garbage collection.
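The garbage-collection pattern can be sketched as follows. The populate_stand_in function is a hypothetical placeholder for mock.populate(), so the block runs without a halo catalog; only the placement of gc.collect() is the point.

```python
import gc

def populate_stand_in(n_gals):
    """Hypothetical placeholder for mock.populate(): build a throwaway catalog."""
    return [float(i) for i in range(n_gals)]

collected_counts = []
for step in range(3):
    # Rebinding the name releases the previous catalog
    galaxy_table = populate_stand_in(10_000)
    # Force a collection pass right after repopulating
    collected_counts.append(gc.collect())
```

gc.collect() returns the number of unreachable objects it found, which can be logged to check whether the extra pass is doing anything on a given machine.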

Examples

>>> from halotools.empirical_models import PrebuiltHodModelFactory
>>> model_instance = PrebuiltHodModelFactory('zheng07')

Here we will use a fake simulation, but you can populate mocks using any instance of CachedHaloCatalog or UserSuppliedHaloCatalog.

>>> from halotools.sim_manager import FakeSim
>>> halocat = FakeSim()
>>> model_instance.populate_mock(halocat)

Your model_instance now has a mock attribute bound to it, which is an instance of the HodMockFactory class. You can call the populate method bound to the mock, which will repopulate the halo catalog with a new Monte Carlo realization of the model.

>>> model_instance.mock.populate()

If you want to change the behavior of your model, just change the values stored in the param_dict. The param_dict attribute is a python dictionary storing the values of all parameters in the model. Differences in the parameter values will change the behavior of the mock-population.

>>> model_instance.param_dict['logMmin'] = 12.1
>>> model_instance.mock.populate()

preprocess_halo_catalog(halocat)

Method to pre-process a halo catalog upon instantiation of the mock object. This pre-processing includes identifying the catalog columns that will be used by the model to create the mock, building lookup tables associated with the halo profile, and possibly creating new halo properties.

Parameters:

logrmin : float, optional

Minimum radius used to build the lookup table for the halo profile. Default is set in model_defaults.

logrmax : float, optional

Maximum radius used to build the lookup table for the halo profile. Default is set in model_defaults.

Npts_radius_table : int, optional

Number of control points used in the lookup table for the halo profile. Default is set in model_defaults.
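To make the role of these three parameters concrete, the sketch below builds a lookup table for a cumulative mass profile and inverts it by interpolation. It is an illustration under stated assumptions, not the halotools code: logrmin and logrmax are taken to be base-10 logarithms of the radius scaled by the virial radius, the parameter values are arbitrary, and an NFW profile with a fixed concentration is used as the example halo profile.

```python
import numpy as np

# Assumed values; the real defaults are set in model_defaults
logrmin, logrmax, Npts_radius_table = -3.0, 0.0, 101

# Control points in scaled radius r / r_vir, evenly spaced in log10
radius_table = np.logspace(logrmin, logrmax, Npts_radius_table)

def nfw_cumulative_mass_fraction(scaled_radius, conc):
    """Fraction of the virial mass enclosed within scaled_radius (NFW)."""
    def m(x):
        return np.log1p(x) - x / (1.0 + x)
    return m(conc * scaled_radius) / m(conc)

conc = 10.0  # assumed concentration
cumulative_table = nfw_cumulative_mass_fraction(radius_table, conc)

# Invert the table: map uniform random numbers to scaled radii
rng = np.random.default_rng(0)
u = rng.uniform(cumulative_table[0], 1.0, size=5)
sampled_radii = np.interp(u, cumulative_table, radius_table)
```

A denser table (larger Npts_radius_table) reduces interpolation error at the cost of a longer pre-processing step.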