HodMockFactory
- class halotools.empirical_models.HodMockFactory(Num_ptcl_requirement=300, halo_mass_column_key='halo_mvir', **kwargs)
Bases: MockFactory
Class responsible for populating a simulation with a population of mock galaxies based on an HOD-style model built by the HodModelFactory class.
It can be thought of as a factory that takes a model and a simulation halocat as input and generates a mock galaxy population. The returned collection of galaxies possesses whatever attributes were requested by the model, such as xyz position, central/satellite designation, star-formation rate, etc.
See Tutorial on the algorithm for HOD-based mock-making for an in-depth tutorial on the mock-making algorithm.
- Parameters:
- halocat : object, keyword argument
Object containing the halo catalog and other associated data. Produced by CachedHaloCatalog.
- model : object, keyword argument
A model built by a sub-class of HodModelFactory.
- populate : boolean, optional
If set to False, the class will perform all pre-processing tasks but will not call the model to populate the galaxy_table with mock galaxies and their observable properties. Default is True.
- Num_ptcl_requirement : int, optional
Requirement on the number of dark matter particles in the halo. The column defined by the halo_mass_column_key string will have a cut placed on it: all halos with halocat.halo_table[halo_mass_column_key] < Num_ptcl_requirement*halocat.particle_mass will be thrown out immediately after reading the original halo catalog in memory. The default value is set in Num_ptcl_requirement (300 in the signature above).
- halo_mass_column_key : string, optional
This string must be a column of the input halo catalog. The mass cut described under Num_ptcl_requirement is applied to the column named by this string. Default is 'halo_mvir'.
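The mass cut described above can be sketched in plain Python. This is only an illustration of the inequality, not the halotools implementation: the particle mass, its units, and the toy mass column are hypothetical values chosen for the example, and a real catalog would supply them through halocat.particle_mass and halocat.halo_table.

```python
# Hypothetical stand-in values; a real catalog provides these through
# halocat.particle_mass and halocat.halo_table[halo_mass_column_key].
particle_mass = 1.0e8        # assumed mass of one simulation particle
Num_ptcl_requirement = 300   # default shown in the class signature
halo_mvir = [1.0e10, 2.9e10, 3.0e10, 5.0e12]  # toy 'halo_mvir' column

# Halos lighter than this are resolved by too few particles.
mass_threshold = Num_ptcl_requirement * particle_mass

# The cut throws out halos with mass < threshold, so a halo sitting
# exactly at the threshold is kept.
kept = [m for m in halo_mvir if not (m < mass_threshold)]
print(mass_threshold, kept)
```

Raising Num_ptcl_requirement raises the threshold proportionally, which removes more of the poorly resolved, low-mass halos before any galaxies are assigned.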
Methods Summary
- allocate_memory([seed]): Method allocates the memory for all the numpy arrays that will store the information about the mock.
- estimate_ngals([seed]): Method to estimate the number of galaxies produced by the mock.populate() method.
- populate([seed]): Method populating host halos with mock galaxies.
- preprocess_halo_catalog(halocat): Method to pre-process a halo catalog upon instantiation of the mock object.
Methods Documentation
- allocate_memory(seed=None)
Method allocates the memory for all the numpy arrays that will store the information about the mock. These arrays are bound directly to the mock object.
The main bookkeeping devices generated by this method are _occupation and _gal_type_indices.
- estimate_ngals(seed=None)
Method to estimate the number of galaxies produced by the mock.populate() method. It runs one realization of all mc_occupation methods and reports the total number of galaxies produced, without allocating any extra memory for the galaxy tables. Note that a subsequent call to mock.populate() will invoke a new call to all mc_occupation methods and can therefore produce a different number of galaxies.
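The reason an estimate and a later population can disagree is that each is a separate Monte Carlo draw. The following minimal sketch illustrates that idea with a toy occupation model. The function names, the step-like mean occupation, and the Bernoulli draw are assumptions made for the example and are not the halotools mc_occupation methods.

```python
import random

def toy_mean_occupation(log_mass, log_mmin=12.0):
    """Hypothetical mean number of centrals: 1 above log_mmin, 0.1 below."""
    return 1.0 if log_mass >= log_mmin else 0.1

def one_realization(log_masses, rng):
    """Draw a 0/1 occupation for every halo and return the total count."""
    return sum(1 for lm in log_masses if rng.random() < toy_mean_occupation(lm))

# Toy halo catalog: 300 halos with log10 mass from 11.0 to 13.99.
log_masses = [11.0 + i / 100 for i in range(300)]

estimate = one_realization(log_masses, random.Random(1))  # role of estimate_ngals
later = one_realization(log_masses, random.Random(2))     # role of a later populate call
repeat = one_realization(log_masses, random.Random(1))    # same seed, same draw

print(estimate, later, repeat)
```

Only draws that share a seed are guaranteed to agree, so the estimate is best treated as a typical value rather than an exact prediction.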
- populate(seed=None, **kwargs)
Method populating host halos with mock galaxies.
By calling the populate method of your mock, you will repopulate the halo catalog with a new realization of the model based on whatever values of the model parameters are currently stored in the param_dict of the model.
For an in-depth discussion of how this method is implemented, see the Tutorial on the algorithm for HOD-based mock-making section of the documentation.
- Parameters:
- masking_function : function, optional
Function object used to place a mask on the halo table prior to calling the mock generating functions. The function should accept a single positional argument storing a table and return a boolean numpy array that will be used as a fancy indexing mask. All masked halos will be ignored during mock population. Default is None.
- enforce_PBC : bool, optional
If set to True, after galaxy positions are assigned, model_helpers.enforce_periodicity_of_box will re-map satellite galaxies whose positions spilled over the edge of the periodic box. Default is True. This variable should only ever be set to False when using the masking_function to populate a specific spatial subvolume, as in that case PBCs no longer apply.
- seed : int, optional
Random number seed used in the Monte Carlo realization. Default is None, which will produce stochastic results.
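The two spatial options above can be illustrated with a small dependency-free sketch. It is not the halotools code: subvolume_mask and wrap_periodic are hypothetical helpers, plain lists stand in for numpy arrays, and the box size and the 'halo_x' column values are made up for the example.

```python
def subvolume_mask(halo_table, xmax=125.0):
    """Hypothetical masking function: takes a table, returns one boolean per halo."""
    return [x < xmax for x in halo_table["halo_x"]]

def wrap_periodic(x, period):
    """Map a coordinate back into [0, period), as periodic boundaries require."""
    return x % period

# Masking acts on the halo table before any galaxies are generated.
halo_table = {"halo_x": [20.0, 130.0, 249.0]}  # toy table with one column
mask = subvolume_mask(halo_table)

# Periodic wrapping acts on galaxy positions after they are assigned.
Lbox = 250.0                        # assumed box side length
satellite_x = [-1.5, 10.0, 251.0]   # two satellites spilled past the box edges
wrapped = [wrap_periodic(x, Lbox) for x in satellite_x]

print(mask, wrapped)
```

The wrap moves a satellite that left through one face back in through the opposite face, which is only meaningful when the full periodic volume is being populated.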
Notes
Note the difference between the halotools.empirical_models.HodMockFactory.populate method and the closely related method halotools.empirical_models.HodModelFactory.populate_mock. The populate_mock method is bound to a composite model instance and is called the first time a composite model is used to generate a mock. Calling the populate_mock method creates the HodMockFactory instance and binds it to the composite model. From then on, if you want to repopulate a new Universe with the same composite model, you should instead call the populate method bound to model.mock. The reason for this distinction is that calling populate_mock triggers a large number of relatively expensive pre-processing steps and self-consistency checks that need only be carried out once. See the Examples section below for an explicit demonstration.
In particular, if you are running an MCMC type analysis, you will choose your halo catalog and completeness cuts, and call halotools.empirical_models.ModelFactory.populate_mock with the appropriate arguments. Thereafter, you can explore parameter space by changing the values stored in the param_dict dictionary attached to the model, and then calling the populate method bound to model.mock. Any changes to the param_dict of the model will automatically propagate into the behavior of the populate method.
Normally, repeated calls to the populate method should not increase the RAM usage of halotools because a new mock catalog is created and the old one deleted. However, on certain machines the memory usage was found to increase over time. If this is the case and memory usage is critical, you can try calling gc.collect() immediately following the call to mock.populate to manually invoke Python's garbage collection.
Examples
>>> from halotools.empirical_models import PrebuiltHodModelFactory
>>> model_instance = PrebuiltHodModelFactory('zheng07')
Here we will use a fake simulation, but you can populate mocks using any instance of CachedHaloCatalog or UserSuppliedHaloCatalog.
>>> from halotools.sim_manager import FakeSim
>>> halocat = FakeSim()
>>> model_instance.populate_mock(halocat)
Your model_instance now has a mock attribute bound to it, which is an instance of the HodMockFactory class. You can call the populate method bound to the mock, which will repopulate the halo catalog with a new Monte Carlo realization of the model.
>>> model_instance.mock.populate()
If you want to change the behavior of your model, just change the values stored in the param_dict. The param_dict attribute is a Python dictionary storing the values of all parameters in the model. Differences in the parameter values will change the behavior of the mock-population.
>>> model_instance.param_dict['logMmin'] = 12.1
>>> model_instance.mock.populate()
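The repopulation pattern from the Notes (update param_dict, call populate, optionally call gc.collect()) can be sketched as a loop. To keep the sketch runnable without a halo catalog, ToyMock below is a hypothetical stand-in for model.mock with a deliberately trivial populate method. It only mimics the fact that the mock reads the shared param_dict each time it is called.

```python
import gc

class ToyMock:
    """Hypothetical stand-in for model.mock; not the halotools class."""

    def __init__(self, param_dict, halo_logmass):
        self.param_dict = param_dict      # shared dict, so edits propagate
        self.halo_logmass = halo_logmass
        self.galaxy_table = []

    def populate(self):
        # Build a fresh table on every call; the old one becomes garbage.
        threshold = self.param_dict["logMmin"]
        self.galaxy_table = [m for m in self.halo_logmass if m >= threshold]

param_dict = {"logMmin": 12.05}
mock = ToyMock(param_dict, [11.0 + i / 10 for i in range(30)])

counts = []
for logMmin in (12.05, 12.55):       # e.g. successive MCMC proposals
    param_dict["logMmin"] = logMmin  # change the parameter in place
    mock.populate()                  # cheap repopulation, no re-processing
    gc.collect()                     # optional: only if memory usage grows
    counts.append(len(mock.galaxy_table))

print(counts)
```

With the real classes, the loop body would call model_instance.mock.populate() after editing model_instance.param_dict, exactly as in the Examples above.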
- preprocess_halo_catalog(halocat)
Method to pre-process a halo catalog upon instantiation of the mock object. This pre-processing includes identifying the catalog columns that will be used by the model to create the mock, building lookup tables associated with the halo profile, and possibly creating new halo properties.
- Parameters:
- logrmin : float, optional
Minimum radius used to build the lookup table for the halo profile. Default is set in model_defaults.
- logrmax : float, optional
Maximum radius used to build the lookup table for the halo profile. Default is set in model_defaults.
- Npts_radius_table : int, optional
Number of control points used in the lookup table for the halo profile. Default is set in model_defaults.
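As a rough illustration of how these three parameters could define a lookup grid, the sketch below places control points evenly in log10 of the radius. This is an assumption made for the example: the source does not state how the table is spaced, the radius_table helper is hypothetical, and the numbers used are not the values in model_defaults.

```python
def radius_table(logrmin, logrmax, npts):
    """Return npts radii evenly spaced in log10 from 10**logrmin to 10**logrmax."""
    step = (logrmax - logrmin) / (npts - 1)
    return [10.0 ** (logrmin + i * step) for i in range(npts)]

# Hypothetical settings: four control points spanning three decades in radius.
radii = radius_table(logrmin=-3.0, logrmax=0.0, npts=4)
print(radii)
```

Increasing the number of control points refines the grid between the same two end points, which trades memory and setup time for a finer table.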