Source code notes on HodModelFactory¶
This section of the documentation provides detailed notes on the source code implementation of the HodModelFactory class.
The purpose of the HodModelFactory class is to provide a flexible, standardized platform for building HOD-based models that can directly populate simulations with mock galaxies. The goal is to make it easy to swap new modeling features in and out of the framework while maintaining a uniform syntax. This way, when you want to study one particular feature of the galaxy-halo connection, you can focus exclusively on developing that feature, leaving the factory to take care of the remaining aspects of the mock population. This tutorial describes in detail how the HodModelFactory accomplishes this standardization.
Outline¶
We will start in Overview of the factory design with a high-level
description of how the class creates a composite model from
a set of independently-defined features. In Inferring a model dictionary from the constructor inputs we describe how the factory’s __init__ constructor parses the large number of optional inputs into a model dictionary.
In Consistency checks and mock-population bookkeeping we outline the various bookkeeping devices and consistency checks that the factory performs in order to (1) ensure that the input model dictionary provides sufficient and self-consistent information, and (2) place the instance into a form that can communicate directly with the HodMockFactory. In Inheriting behaviors from the component models we cover the process by which the appropriate methods of the component models are inherited by the composite model. The syntax for using a composite model to create mock catalogs is covered in The populate_mock convenience method.
We conclude in Further reading by pointing to sections of documentation covering related aspects such as the algorithm for using HodModelFactory instances to populate mocks.
Overview of the factory design¶
The HodModelFactory
has virtually no behavior of its own;
it should instead be thought of as a container class that collects together
behaviors that are defined elsewhere. These behaviors are defined in
component models, which are instances of Halotools classes that typically provide
a single, specialized mapping between halos and some specific galaxy property.
By composing these individual mappings together,
the output of the factory is a composite model for the galaxy-halo connection in which
any number of user-defined galaxy properties is simultaneously modeled.
Although there are numerous options for the form of the arguments passed to HodModelFactory, the basic input is a model dictionary.
A model dictionary is just an ordinary python dictionary that stores the collection of component model instances whose behaviors are being unified together by the factory.
The model dictionary contains all of the information necessary to inform the HodModelFactory how to build a composite model from the components.
Each component model in a model dictionary typically has each of the following three private attributes:
_methods_to_inherit
_galprop_dtypes_to_allocate
_mock_generation_calling_sequence
Each of these three attributes will be explained in detail below. Briefly, _methods_to_inherit is a list of strings that tells the HodModelFactory which methods in the component model should be carried over into the composite model.
The _galprop_dtypes_to_allocate attribute informs the HodMockFactory of the shape and name of every Numpy array that should be allocated for every galaxy property assigned by the component model. The _mock_generation_calling_sequence attribute specifies the sequential order in which the methods of the component model should be called by the composite model during mock population.
Again, we will discuss these and other bookkeeping devices in more detail below.
For now, simply observe what is accomplished by these three pieces of information.
Each component model is effectively giving the factory the following message:
“I want you to know about the following methods, and only the following methods, and I will take care of how they will be computed: _methods_to_inherit; I need you to make sure that when you call these methods, the following arrays will be passed to them: _galprop_dtypes_to_allocate; when you use me to make a mock, I need you to call these methods in the following sequence: _mock_generation_calling_sequence”. In this way, not only is all the physically relevant behavior defined in the component models, but the component models themselves provide the instructions for how they should be used.
The job of the HodModelFactory is simply to follow these instructions, and to ensure that mutually consistent messages are received from the set of components in the model dictionary. In the remaining sections of this tutorial, we will walk step-by-step through the tasks carried out when a new composite model is built by creating an instance of the HodModelFactory class.
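To make the three-part message concrete, here is a minimal sketch of what a component model might look like. The class ToyCentralOccupation, its threshold parameter, and the model dictionary key are hypothetical and are not part of Halotools; a plain list of (name, type) tuples stands in for the Numpy array information that a real component would provide.

```python
class ToyCentralOccupation:
    """Hypothetical component model, written only for illustration."""

    def __init__(self, gal_type="centrals", threshold=1e12):
        self.gal_type = gal_type
        self.feature_name = "occupation"
        self.param_dict = {"threshold_" + gal_type: threshold}

        # "I want you to know about these methods, and only these methods."
        self._methods_to_inherit = ["mc_occupation"]
        # "Make sure these arrays exist when you call my methods."
        # (Stand-in for Numpy dtype information: name and type code.)
        self._galprop_dtypes_to_allocate = [("halo_num_" + gal_type, "i4")]
        # "When making a mock, call my methods in this order."
        self._mock_generation_calling_sequence = ["mc_occupation"]

    def mc_occupation(self, halo_masses):
        # Toy rule: one central galaxy in every halo above the threshold.
        threshold = self.param_dict["threshold_" + self.gal_type]
        return [1 if mass >= threshold else 0 for mass in halo_masses]

    def helper_not_inherited(self):
        # Not listed in _methods_to_inherit, so a factory would ignore it.
        return "internal"


component = ToyCentralOccupation()
# A model dictionary is an ordinary dictionary of component instances.
model_dictionary = {"centrals_occupation": component}
occupations = component.mc_occupation([5e11, 2e12])
```

In this sketch the component alone decides which of its methods are exposed: helper_not_inherited exists on the class but is never advertised to the factory.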
Inferring a model dictionary from the constructor inputs¶
The first thing the __init__ constructor of HodModelFactory does is pass all its arguments to the _parse_constructor_kwargs method, which simply extracts (if present) galaxy_selection_func, halo_selection_func and model_feature_calling_sequence from the arguments passed to __init__; all remaining arguments will be interpreted as model dictionary inputs.
For an explanation of galaxy_selection_func and halo_selection_func, see the ModelFactory docstring.
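The parsing step described above can be sketched as follows. This is not the Halotools source: the function parse_constructor_kwargs and its return values are hypothetical, and strings stand in for component model instances.

```python
# Keyword names that the text says are extracted from the constructor inputs.
RESERVED_KEYS = (
    "galaxy_selection_func",
    "halo_selection_func",
    "model_feature_calling_sequence",
)


def parse_constructor_kwargs(**kwargs):
    """Hypothetical stand-in for the _parse_constructor_kwargs method."""
    remaining = dict(kwargs)  # work on a copy of the inputs
    supplementary = {
        key: remaining.pop(key) for key in RESERVED_KEYS if key in remaining
    }
    # Everything left over is interpreted as a model dictionary input.
    return remaining, supplementary


input_model_dictionary, supplementary_kwargs = parse_constructor_kwargs(
    centrals_occupation="component A",      # placeholder for a component
    satellites_occupation="component B",    # placeholder for a component
    model_feature_calling_sequence=[
        "centrals_occupation",
        "satellites_occupation",
    ],
)
```

Here input_model_dictionary holds only the two feature entries, while the calling sequence ends up in supplementary_kwargs.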
When the constructor of the ModelFactory super-class is called after parsing the inputs, exact copies of all arguments passed to HodModelFactory are bound to the instance.
This allows all composite model instances to remember the
exact set of instructions from which they were built.
As we will see, this is useful because it simplifies the process of building
alternate versions of any particular composite model instance.
As described in The model_feature_calling_sequence mechanism, the model_feature_calling_sequence determines the order in which the component models will be called during mock population. This order is determined by the build_model_feature_calling_sequence method.
Once this order is determined, the model_dictionary attribute is bound to the instance using the appropriate order:
self.model_dictionary = collections.OrderedDict()
for key in self._model_feature_calling_sequence:
    self.model_dictionary[key] = copy(self._input_model_dictionary[key])
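The same reordering can be tried outside the class as a standalone sketch. The feature names and the plain dictionaries standing in for component models are hypothetical; only the loop mirrors the lines shown above.

```python
import collections
from copy import copy

# Hypothetical inputs, supplied in arbitrary order.
input_model_dictionary = {
    "satellites_profile": {"gal_type": "satellites"},
    "centrals_occupation": {"gal_type": "centrals"},
    "satellites_occupation": {"gal_type": "satellites"},
}
# Assumed calling sequence for this illustration.
calling_sequence = [
    "centrals_occupation",
    "satellites_occupation",
    "satellites_profile",
]

# Rebuild the dictionary so that iteration follows the calling sequence.
model_dictionary = collections.OrderedDict()
for key in calling_sequence:
    model_dictionary[key] = copy(input_model_dictionary[key])
```

Because each entry is copied, the ordered dictionary holds its own objects rather than references to the inputs.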
In the next section, we will see how the model_dictionary attribute is used to create a number of bookkeeping mechanisms used to verify self-consistency between the model features, and also to facilitate communication between the composite model and the HodMockFactory.
Consistency checks and mock-population bookkeeping¶
After the model dictionary has been built, the __init__ constructor creates a handful of lists and dictionaries and binds these to the instance with the following lines of code:
# Build up and bind several lists from the component models
self.set_gal_types()
self.build_prim_sec_haloprop_list()
self.build_publication_list()
self.build_new_haloprop_func_dict()
self.build_dtype_list()
self.set_warning_suppressions()
self.set_model_redshift()
self.set_inherited_methods()
self.build_init_param_dict()
These methods examine each of the component models, perform various self-consistency tests, and create standardized attributes that allow the composite model to communicate with the HodMockFactory to populate mocks.
For a description of the most important methods in this standardization process,
see Composite Model Bookkeeping Mechanisms. At the end of this
sequence of function calls, the instance is prepared to inherit the behavior of
the primary methods of the component models, which we cover in the next section.
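As a rough illustration of this kind of bookkeeping, the sketch below collects the galaxy types and merges the array declarations of several components, raising an error on a conflict. The functions collect_gal_types and merge_dtypes, the dictionary layout, and the particular conflict rule are all assumptions made for this example; they are not the Halotools implementations of set_gal_types or build_dtype_list.

```python
def collect_gal_types(components):
    """Return the unique gal_type names, in order of first appearance."""
    gal_types = []
    for component in components:
        if component["gal_type"] not in gal_types:
            gal_types.append(component["gal_type"])
    return gal_types


def merge_dtypes(components):
    """Merge (name, type) declarations, rejecting inconsistent repeats."""
    merged = {}
    for component in components:
        for name, type_code in component["galprop_dtypes"]:
            # setdefault stores the first declaration and returns it afterwards.
            if merged.setdefault(name, type_code) != type_code:
                raise ValueError("Inconsistent type for property " + name)
    return list(merged.items())


# Hypothetical components, represented as plain dictionaries.
components = [
    {"gal_type": "centrals", "galprop_dtypes": [("halo_num_centrals", "i4")]},
    {"gal_type": "satellites", "galprop_dtypes": [("halo_num_satellites", "i4")]},
    {"gal_type": "satellites", "galprop_dtypes": [("x", "f4"), ("y", "f4")]},
]

gal_types = collect_gal_types(components)
dtype_list = merge_dtypes(components)
```

The point of the sketch is only that the composite model ends up with a single, consistent set of lists built from what the components declare.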
Inheriting behaviors from the component models¶
Once all of the above lists and dictionaries of the composite model have been created, the HodModelFactory finally inherits the behaviors of the component models.
This is done with the set_primary_behaviors method.
This is the most important function in the entire factory. Although it is only a few lines, it is sufficiently complicated to warrant detailed discussion. First, we reproduce the source below:
for component_model in self.model_dictionary.values():
    gal_type = component_model.gal_type
    feature_name = component_model.feature_name
    for methodname in component_model._methods_to_inherit:
        new_method_name = methodname + '_' + gal_type  # line 1
        new_method_behavior = self.update_param_dict_decorator(
            component_model, methodname)  # line 2
        setattr(self, new_method_name, new_method_behavior)  # line 3
        setattr(getattr(self, new_method_name), 'gal_type', gal_type)  # line 4
        setattr(getattr(self, new_method_name), 'feature_name', feature_name)  # line 5
In this double-for loop, we iterate over every method that the composite model
should inherit from the collection of component models.
For each method that we inherit, line 3 binds the newly-defined method to the composite model instance.
Line 1 builds the name of this newly-defined method by appending an underscore and the gal_type to the name that appears in the component model; for example, a component method named mc_occupation with a gal_type of centrals would be bound as mc_occupation_centrals. Line 2 modifies the component model method behavior with the update_param_dict_decorator decorator.
This modification is very important for the reasons described in The update_param_dict_decorator mechanism.
Note how the use of getattr and setattr allows the component models to entirely dictate what is inherited by the composite model. This high-level python feature is what makes possible the flexibility of the model factories.
Note also that in lines 4 and 5 we bind additional data to the newly created methods of the composite model. In particular, every method passed from a component model to a composite model has the name of its associated gal_type and feature_name bound to it. This information informs the HodMockFactory of the appropriate data to pass to each component model
(see the conclusion of the Final stage: galaxy properties assigned after the mc_occupation methods section of the
Tutorial on the algorithm for HOD-based mock-making for details about how the mock factory uses this information).
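The inheritance loop can be exercised on toy classes, as in the sketch below. ToyComponent and ToyComposite are hypothetical and are not Halotools classes. The body of update_param_dict_decorator here is an assumption about what such a decorator does (copying the composite model's current parameter values into the component before delegating the call); the documentation section named above remains the authoritative description.

```python
class ToyComponent:
    """Hypothetical component model with a single inheritable method."""

    def __init__(self, gal_type, feature_name, threshold):
        self.gal_type = gal_type
        self.feature_name = feature_name
        self.param_dict = {"threshold_" + gal_type: threshold}
        self._methods_to_inherit = ["mc_occupation"]

    def mc_occupation(self, halo_masses):
        threshold = self.param_dict["threshold_" + self.gal_type]
        return [int(mass >= threshold) for mass in halo_masses]


class ToyComposite:
    """Hypothetical container that inherits behaviors from its components."""

    def __init__(self, **model_dictionary):
        self.model_dictionary = model_dictionary
        self.param_dict = {}
        for component in model_dictionary.values():
            self.param_dict.update(component.param_dict)
        self.set_primary_behaviors()

    def update_param_dict_decorator(self, component, method_name):
        # Assumed behavior: sync parameters, then call the component method.
        def decorated(*args, **kwargs):
            for key in component.param_dict:
                if key in self.param_dict:
                    component.param_dict[key] = self.param_dict[key]
            return getattr(component, method_name)(*args, **kwargs)
        return decorated

    def set_primary_behaviors(self):
        for component in self.model_dictionary.values():
            for method_name in component._methods_to_inherit:
                new_name = method_name + "_" + component.gal_type
                behavior = self.update_param_dict_decorator(component, method_name)
                setattr(self, new_name, behavior)
                # Tag the inherited function with its origin.
                setattr(getattr(self, new_name), "gal_type", component.gal_type)
                setattr(getattr(self, new_name), "feature_name", component.feature_name)


model = ToyComposite(
    centrals_occupation=ToyComponent("centrals", "occupation", 1e12),
    satellites_occupation=ToyComponent("satellites", "occupation", 1e13),
)
before = model.mc_occupation_centrals([2e12])
model.param_dict["threshold_centrals"] = 5e12  # change a composite parameter
after = model.mc_occupation_centrals([2e12])
```

In this sketch, changing a value in the composite model's param_dict changes the result of the inherited method, and each inherited method carries the gal_type and feature_name of the component it came from.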
The populate_mock convenience method¶
No matter what the component model features are, all instances of HodModelFactory can directly populate halo catalogs with mock galaxies with the populate_mock method. To populate the default halo catalog, the syntax for this is:
from halotools.empirical_models import HodModelFactory
from halotools.sim_manager import CachedHaloCatalog

model = HodModelFactory(**model_dictionary)
halocat = CachedHaloCatalog()
model.populate_mock(halocat)
The HodModelFactory.populate_mock method is just a convenience wrapper around the HodMockFactory.populate method.
You can also populate alternative halo catalogs:
from halotools.sim_manager import CachedHaloCatalog
my_halocat = CachedHaloCatalog(simname=my_simname, redshift=my_redshift)
model.populate_mock(my_halocat)
You can use the syntax above to populate any instance of either CachedHaloCatalog or UserSuppliedHaloCatalog.
Further reading¶
The mock-population algorithm is documented in detail in Tutorial on the algorithm for HOD-based mock-making.