SubhaloModelFactory¶
- class halotools.empirical_models.SubhaloModelFactory(**kwargs)[source]¶
Bases:
ModelFactory
Class used to build models of the galaxy-halo connection in which galaxies live at the centers of subhalos.
See Tutorial on building a subhalo-based model for an in-depth description of how to build subhalo-based models, demonstrated by a sequence of increasingly complex examples. If you do not wish to build your own model but want to use one provided by Halotools, see PrebuiltSubhaloModelFactory instead.
All subhalo-based composite models can directly populate catalogs of dark matter halos. For an in-depth description of how Halotools implements mock generation, see Tutorial on the algorithm for subhalo-based mock-making.
The arguments passed to the SubhaloModelFactory constructor determine the features of the model returned by the factory. This works in one of two ways, both of which are demonstrated in the examples below.
Building a new model from scratch.
You can build a model from scratch by passing in a sequence of model_features, each of which is an instance of a component model. The factory then composes these independently defined components into a composite model.
Building a new model from an existing model.
It is also possible to add features to, or swap features in, a previously built composite model instance, allowing you to create new models from existing ones. To do this, pass in a baseline_model_instance and any set of model_features. Any model_features keyword that matches a feature name of the baseline_model_instance will replace that feature in the baseline_model_instance; all other model_features that you pass in will augment the baseline_model_instance with new behavior.
Regardless of which set of features you use to build your model, the returned object can be used to directly populate a halo catalog with mock galaxies using the populate_mock method, as shown in the examples below.
- Parameters:
- *model_features : sequence of keyword arguments, optional
Each keyword you use will be interpreted as the name of a feature in the composite model, e.g. ‘stellar_mass’ or ‘star_formation_rate’; the value bound to each keyword must be an instance of a component model governing the behavior of that feature. See the examples section below.
- baseline_model_instance : SubhaloModelFactory instance, optional
If passed to the constructor, the model_dictionary bound to the baseline_model_instance will be treated as the baseline dictionary. Any additional keyword arguments passed to the constructor that appear in the baseline dictionary will be treated as model features that replace the corresponding component model in the baseline dictionary. Any model features passed to the constructor that do not appear in the baseline dictionary will be treated as new features that augment the baseline model with new behavior. See the examples section below.
- model_feature_calling_sequence : list, optional
Determines the order in which your component features will be called during mock population.
Some component models may depend explicitly on the value of another galaxy property being modeled. In such a case, you must pass a model_feature_calling_sequence list, ordered in the desired calling sequence.
A classic example is a stellar-to-halo-mass relation that depends explicitly on the star formation rate designation of the galaxy (active or quiescent). For this example, the model_feature_calling_sequence would be model_feature_calling_sequence = ['sfr_designation', 'stellar_mass', …].
The default behavior is to assume that no model feature depends explicitly on any other, in which case the component models appearing in the model_features keyword arguments will be called in random order, except that any stellar_mass and/or luminosity features are given primacy.
- galaxy_selection_func : function object, optional
Function object that imposes a cut on the mock galaxies. The function should take a length-k Astropy table as a single positional argument and return a length-k numpy boolean array that will be treated as a mask over the rows of the table. If not None, the mask defined by galaxy_selection_func will be applied to the galaxy_table after the table is generated by the populate_mock method. Default is None.
- halo_selection_func : function object, optional
Function object used to place a cut on the input table. If the halo_selection_func keyword argument is passed, the input to the function must be a single positional argument storing a length-N structured numpy array or Astropy table; the function output must be a length-N boolean array that will be used as a mask. Halos that are masked will be entirely neglected during mock population.
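The selection functions described above can be sketched as follows. This is a minimal illustration rather than code taken from Halotools: the function name select_massive_halos, the column name halo_mvir and the mass threshold are assumptions chosen for the example, and the sketch assumes that rows where the mask is True are the ones kept (the usual NumPy boolean-indexing convention).

```python
import numpy as np


def select_massive_halos(table):
    """Return a length-N boolean mask, one entry per row of ``table``."""
    # 'halo_mvir' and the 1e12 threshold are assumed for illustration only.
    return table['halo_mvir'] > 1e12


# Small structured array standing in for a halo table.
halos = np.array(
    [(1, 5e11), (2, 3e12), (3, 8e13)],
    dtype=[('halo_id', 'i8'), ('halo_mvir', 'f8')],
)

mask = select_massive_halos(halos)
selected = halos[mask]  # boolean indexing keeps the rows where mask is True
```

A function with this shape could be passed as halo_selection_func; a galaxy_selection_func would look the same but operate on columns of the mock galaxy_table.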
Examples
As described above, there are two different ways to build models using the SubhaloModelFactory. Here we demonstrate each in turn.
In the first example we’ll show how to build a model from scratch using the model_features option. We’ll build a composite model from two component models: one modeling stellar mass, one modeling star formation rate designation. We will use the Behroozi10SmHm class to model stellar mass, and the BinaryGalpropInterpolModel class to model whether galaxies are quiescent or star-forming. See the docstrings of these classes for more information about their behavior.
>>> from halotools.empirical_models import Behroozi10SmHm
>>> stellar_mass_model = Behroozi10SmHm(redshift = 0.5)
>>> from halotools.empirical_models import BinaryGalpropInterpolModel
>>> sfr_model = BinaryGalpropInterpolModel(galprop_name = 'quiescent_designation', galprop_abscissa = [12, 15], galprop_ordinates = [0.25, 0.75])
At this point we have two component model instances, stellar_mass_model and sfr_model. The following call to the factory uses the model_features option described above:
>>> from halotools.empirical_models import SubhaloModelFactory
>>> model_instance = SubhaloModelFactory(stellar_mass = stellar_mass_model, sfr = sfr_model)
The feature names we have chosen are ‘stellar_mass’ and ‘sfr’, and to each feature we have attached a component model instance.
In this particular example the assignments of stellar mass and SFR designation are entirely independent, so no other arguments are necessary. However, if you are building a model in which one or more components depend explicitly on some other feature, you can use the model_feature_calling_sequence argument; this is a list of feature names whose order determines the sequence in which the components will be called during mock population:
>>> model_instance = SubhaloModelFactory(stellar_mass = stellar_mass_model, sfr = sfr_model, model_feature_calling_sequence = ['stellar_mass', 'sfr'])
For more details about this optional argument, see The model_feature_calling_sequence mechanism.
Whatever features your composite model has, you can use the populate_mock method to create a Monte Carlo realization of the model by populating any dark matter halo catalog in your cache directory:
>>> from halotools.sim_manager import CachedHaloCatalog
>>> halocat = CachedHaloCatalog(simname = 'bolshoi', redshift = 0.5)
>>> model_instance.populate_mock(halocat)
Your model_instance now has a mock attribute storing a synthetic galaxy population. See the populate_mock docstring for details.
There are also convenience functions for estimating the clustering signal predicted by the model. For example, the following method repeatedly populates the Bolshoi simulation with galaxies, computes the 3-d galaxy clustering signal of each mock, computes the median clustering signal in each bin, and returns the result:
>>> r, xi = model_instance.compute_average_galaxy_clustering(num_iterations = 5, simname = 'bolshoi', redshift = 0.5)
In this next example we’ll show how to build a new model from an existing one using the baseline_model_instance option. We will start from the composite model built in the first example above. Here we’ll build a new model that is identical to the model_instance above, except that we use the Moster13SmHm class to model stellar mass.
>>> from halotools.empirical_models import Moster13SmHm
>>> moster_model = Moster13SmHm(redshift = 0.5)
>>> new_model_instance = SubhaloModelFactory(stellar_mass = moster_model, baseline_model_instance = model_instance)
The model_feature_calling_sequence works in the same way as it did in the first example.
>>> new_model_instance = SubhaloModelFactory(stellar_mass = moster_model, baseline_model_instance = model_instance, model_feature_calling_sequence = ['stellar_mass', 'sfr'])
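The replace-or-augment rule for baseline_model_instance can be pictured as a dictionary update. The sketch below is only an illustration of that rule, not the factory's actual code: merge_model_features is a hypothetical helper, plain strings stand in for component model instances, and the size feature with its HypotheticalSizeModel value is invented for the example.

```python
def merge_model_features(baseline_features, **model_features):
    """Return a new feature dictionary built from a baseline dictionary."""
    merged = dict(baseline_features)  # copy, so the baseline is left untouched
    for name, component in model_features.items():
        # An existing name is replaced; a new name augments the model.
        merged[name] = component
    return merged


# Strings stand in for component model instances.
baseline = {'stellar_mass': 'Behroozi10SmHm', 'sfr': 'BinaryGalpropInterpolModel'}
new_features = merge_model_features(
    baseline,
    stellar_mass='Moster13SmHm',   # replaces the baseline feature
    size='HypotheticalSizeModel',  # adds a feature the baseline lacked
)
```

Here the sfr feature is carried over unchanged, stellar_mass is swapped, and size is added, while the baseline dictionary itself is not modified.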
Methods Summary
- build_dtype_list()
Create the _galprop_dtypes_to_allocate attribute that determines the name and data type of every galaxy property that will appear in the mock galaxy_table.
- build_init_param_dict()
Create the param_dict attribute of the instance.
- build_model_feature_calling_sequence(supplementary_kwargs)
Method uses the model_feature_calling_sequence passed to __init__, if available.
- build_prim_sec_haloprop_list()
Method builds the _haloprop_list of strings.
- build_publication_list()
Method collects together all publications from each of the component models.
- populate_mock(halocat[, masking_function])
Method used to populate a simulation with a Monte Carlo realization of a model.
- restore_init_param_dict()
Reset all values of the current param_dict to the values the class was instantiated with.
- set_calling_sequence()
Method used to determine the sequence of function calls that will be made during mock population.
- set_inherited_methods()
Function determines which component model methods are inherited by the composite model.
- set_primary_behaviors(**kwargs)
Creates names and behaviors for the primary methods of SubhaloModelFactory that will be used by the outside world.
- set_warning_suppressions()
Method used to determine whether a warning should be issued if the build_init_param_dict method detects the presence of multiple appearances of the same parameter name.
- update_param_dict_decorator(component_model, ...)
Decorator used to propagate any possible changes in the composite model param_dict down to the appropriate component model param_dict.
Methods Documentation
- build_dtype_list()[source]¶
Create the _galprop_dtypes_to_allocate attribute that determines the name and data type of every galaxy property that will appear in the mock galaxy_table.
This attribute is determined by examining the _galprop_dtypes_to_allocate attribute of every component model and building a composite set of all these dtypes, enforcing self-consistency in cases where the same galaxy property appears more than once.
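The idea of building a composite set of dtypes with a consistency check can be sketched in plain Python. This is an assumption-laden illustration, not the Halotools implementation: build_composite_dtypes is a hypothetical helper, dtypes are represented as simple (name, type code) pairs, and raising ValueError on a conflict is one possible reading of "enforcing self-consistency".

```python
def build_composite_dtypes(component_dtypes):
    """Merge (name, type_code) pairs from several components into one list."""
    composite = {}
    for dtypes in component_dtypes:
        for name, type_code in dtypes:
            # The same property may appear twice, but only with the same type.
            if name in composite and composite[name] != type_code:
                raise ValueError(
                    "Inconsistent dtype for %r: %r vs %r"
                    % (name, composite[name], type_code))
            composite[name] = type_code
    return list(composite.items())


# Two stand-in components that both declare 'stellar_mass' consistently.
stellar_dtypes = [('stellar_mass', 'f4')]
sfr_dtypes = [('quiescent', 'bool'), ('stellar_mass', 'f4')]
composite = build_composite_dtypes([stellar_dtypes, sfr_dtypes])
```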
- build_init_param_dict()[source]¶
Create the param_dict attribute of the instance. The param_dict is a dictionary storing the full collection of parameters controlling the behavior of the composite model.
The param_dict dictionary is determined by examining the param_dict attribute of every component model and building up a composite dictionary from them. It is permissible for the same parameter name to appear more than once among a set of component models, but a warning will be issued in such cases.
Notes
In MCMC applications, the items of param_dict define the parameter set explored by the likelihood engine. Changing the values of the parameters in param_dict will propagate to the behavior of the component models when the relevant methods are called.
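A rough sketch of how a composite parameter dictionary might be assembled, with a warning for repeated names, is given below. It is not the library's code: build_composite_param_dict and the suppress_warning flag are hypothetical, the parameter names are invented, and letting the last component win on a repeated name is an assumption.

```python
import warnings


def build_composite_param_dict(component_param_dicts, suppress_warning=False):
    """Merge the param_dict of every component into a single dictionary."""
    composite = {}
    for param_dict in component_param_dicts:
        for key, value in param_dict.items():
            if key in composite and not suppress_warning:
                warnings.warn(
                    "Parameter name %r appears in more than one component model"
                    % key)
            composite[key] = value  # assumption: the last component wins
    return composite


# Hypothetical parameter names; no name is repeated, so no warning is issued.
composite = build_composite_param_dict(
    [{'param_a': 12.0}, {'param_b': 0.25}])
```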
- build_model_feature_calling_sequence(supplementary_kwargs)[source]¶
Method uses the model_feature_calling_sequence passed to __init__, if available. If no such argument was passed, the method chooses a mostly random order for the calling sequence, except that any feature named stellar_mass or luminosity is always called first in the absence of explicit instructions to the contrary.
- Parameters:
- supplementary_kwargs : dict
Dictionary storing all keyword arguments passed to the __init__ constructor that were not part of the input model dictionary.
- Returns:
- model_feature_calling_sequence : list
List of strings specifying the order in which the component models will be called upon during mock population to execute their methods.
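The ordering rule described above can be illustrated with a short sketch. The helper default_calling_sequence is hypothetical and is not the method's real implementation; in particular, it keeps the remaining features in their input order, whereas the text describes that order as mostly random.

```python
def default_calling_sequence(feature_names, user_sequence=None):
    """Return the order in which features would be called."""
    if user_sequence is not None:
        # An explicit sequence from the user always takes precedence.
        return list(user_sequence)
    # Otherwise, stellar_mass and luminosity (if present) go first.
    primary = [name for name in ('stellar_mass', 'luminosity')
               if name in feature_names]
    remaining = [name for name in feature_names if name not in primary]
    return primary + remaining


sequence = default_calling_sequence(['sfr', 'stellar_mass', 'size'])
explicit = default_calling_sequence(
    ['sfr', 'stellar_mass'], user_sequence=['sfr', 'stellar_mass'])
```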
- build_prim_sec_haloprop_list()[source]¶
Method builds the _haloprop_list of strings.
This list stores the names of all halo catalog columns that appear as either prim_haloprop_key or sec_haloprop_key of any component model. For all strings appearing in _haloprop_list, the mock galaxy_table will have a corresponding column storing the halo property inherited by the mock galaxy.
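Collecting the halo property names from the components might look like the sketch below. It is illustrative only: build_haloprop_list is a hypothetical helper, SimpleNamespace objects stand in for component models, and the column names halo_mpeak and halo_nfw_conc are assumed for the example.

```python
from types import SimpleNamespace


def build_haloprop_list(component_models):
    """Collect the unique prim/sec haloprop keys of all components."""
    haloprops = []
    for component in component_models:
        for attr in ('prim_haloprop_key', 'sec_haloprop_key'):
            key = getattr(component, attr, None)  # the attribute is optional
            if key is not None and key not in haloprops:
                haloprops.append(key)
    return haloprops


# Stand-in components; the column names are assumptions.
components = [
    SimpleNamespace(prim_haloprop_key='halo_mpeak'),
    SimpleNamespace(prim_haloprop_key='halo_mpeak',
                    sec_haloprop_key='halo_nfw_conc'),
]
haloprop_list = build_haloprop_list(components)
```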
- build_publication_list()[source]¶
Method collects together all publications from each of the component models.
- populate_mock(halocat, masking_function=None, **kwargs)[source]¶
Method used to populate a simulation with a Monte Carlo realization of a model.
After calling this method, the model instance will have a new mock attribute. You can then access the galaxy population via model.mock.galaxy_table, an Astropy Table.
See Tutorial on the algorithm for subhalo-based mock-making for an in-depth tutorial on the mock-making algorithm.
- Parameters:
- halocat : object
Either an instance of CachedHaloCatalog or UserSuppliedHaloCatalog.
- masking_function : function, optional
Function object used to place a mask on the halo table prior to calling the mock generating functions. The function should accept a single positional argument storing a table and return a boolean numpy array that will be used as a fancy indexing mask. All masked halos will be ignored during mock population. Default is None.
Notes
Note the difference between the halotools.empirical_models.SubhaloMockFactory.populate method and the closely related method halotools.empirical_models.SubhaloModelFactory.populate_mock. The populate_mock method is bound to a composite model instance and is called the first time a composite model is used to generate a mock. Calling the populate_mock method creates the SubhaloMockFactory instance and binds it to the composite model. From then on, if you want to repopulate a new Universe with the same composite model, you should instead call the populate method bound to model.mock. The reason for this distinction is that calling populate_mock triggers a large number of relatively expensive pre-processing steps and self-consistency checks that need only be carried out once. See the Examples section below for an explicit demonstration.
In particular, if you are running an MCMC-type analysis, you will choose your halo catalog and completeness cuts, and call populate_mock with the appropriate arguments. Thereafter, you can explore parameter space by changing the values stored in the param_dict dictionary attached to the model, and then calling the populate method bound to model.mock. Any changes to the param_dict of the model will automatically propagate into the behavior of the populate method.
Examples
Here we’ll use a pre-built model to demonstrate basic usage. The syntax shown below is the same for all composite models, whether they are pre-built by Halotools or built by you with SubhaloModelFactory.
>>> from halotools.empirical_models import PrebuiltSubhaloModelFactory
>>> model_instance = PrebuiltSubhaloModelFactory('behroozi10')
Here we will use a fake simulation, but you can populate mocks using any instance of CachedHaloCatalog or UserSuppliedHaloCatalog.
>>> from halotools.sim_manager import FakeSim
>>> halocat = FakeSim()
>>> model_instance.populate_mock(halocat)
Your model_instance now has a mock attribute bound to it. You can call the populate method bound to the mock, which will repopulate the halo catalog with a new Monte Carlo realization of the model.
>>> model_instance.mock.populate()
If you want to change the behavior of your model, just change the values stored in the param_dict. Differences in the parameter values will change the behavior of the mock population.
>>> model_instance.param_dict['scatter_model_param1'] = 0.25
>>> model_instance.mock.populate()
- restore_init_param_dict()[source]¶
Reset all values of the current param_dict to the values the class was instantiated with.
Primary behaviors are reset as well, as this is how the inherited behaviors get bound to the values in param_dict.
- set_calling_sequence()[source]¶
Method used to determine the sequence of function calls that will be made during mock population. The methods of each component model will be called one after the other; the order in which the component models are called upon is determined by _model_feature_calling_sequence. When each component model is called, the sequence of methods that are called for that component is determined by the _mock_generation_calling_sequence attribute bound to the component model instance. See The model_feature_calling_sequence mechanism for further details.
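The two-level ordering described above (features first, then the methods within each feature) amounts to a nested loop. The sketch below is a simplified illustration, not the method's actual body: flatten_calling_sequence is a hypothetical helper, SimpleNamespace objects stand in for component models, and the method names are invented.

```python
from types import SimpleNamespace


def flatten_calling_sequence(model_dictionary, feature_sequence):
    """List (feature, method name) pairs in the order they would be called."""
    calls = []
    for feature in feature_sequence:  # outer order: the feature sequence
        component = model_dictionary[feature]
        # Inner order: the component's own calling sequence.
        for method_name in component._mock_generation_calling_sequence:
            calls.append((feature, method_name))
    return calls


# Stand-in components with invented method names.
model_dictionary = {
    'stellar_mass': SimpleNamespace(
        _mock_generation_calling_sequence=['mc_stellar_mass']),
    'sfr': SimpleNamespace(
        _mock_generation_calling_sequence=['mc_sfr_designation', 'mc_sfr']),
}
calls = flatten_calling_sequence(model_dictionary, ['sfr', 'stellar_mass'])
```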
- set_inherited_methods()[source]¶
Function determines which component model methods are inherited by the composite model.
Each component model should have a _mock_generation_calling_sequence attribute that provides the sequence of method names to call during mock population. Additionally, each component should have a _methods_to_inherit attribute that determines which methods will be inherited by the composite model. The _mock_generation_calling_sequence list should be a subset of _methods_to_inherit. If any of the above conditions fail, no exception will be raised during the construction of the composite model. Instead, an empty list will be forcibly attached to each component model for which these lists may have been missing. Also, for each component model, if there are any elements of _mock_generation_calling_sequence that were missing from _methods_to_inherit, all such elements will be forcibly added to that component model’s _methods_to_inherit.
Finally, each component model should have an _attrs_to_inherit attribute that determines which attributes will be inherited by the composite model. If any component models did not implement the _attrs_to_inherit attribute, an empty list is forcibly added to the component model.
After calling the set_inherited_methods method, it will therefore be entirely safe to run a for loop over each component model’s _methods_to_inherit and _attrs_to_inherit, even if these lists were forgotten or irrelevant to that particular component.
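The defensive defaults described above could be sketched as follows. This is an illustration under stated assumptions rather than the Halotools source: ensure_inheritance_lists is a hypothetical helper, a SimpleNamespace stands in for a component model, and the method name mc_stellar_mass is invented.

```python
from types import SimpleNamespace


def ensure_inheritance_lists(component):
    """Attach missing lists and make the calling sequence a subset."""
    for attr in ('_mock_generation_calling_sequence',
                 '_methods_to_inherit',
                 '_attrs_to_inherit'):
        if not hasattr(component, attr):
            setattr(component, attr, [])  # missing list: attach an empty one
    for name in component._mock_generation_calling_sequence:
        if name not in component._methods_to_inherit:
            component._methods_to_inherit.append(name)
    return component


# A stand-in component that only declared its calling sequence.
component = SimpleNamespace(
    _mock_generation_calling_sequence=['mc_stellar_mass'])
ensure_inheritance_lists(component)
```

After this call, looping over component._methods_to_inherit and component._attrs_to_inherit is safe even though neither list was defined on the component.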
- set_primary_behaviors(**kwargs)[source]¶
Creates names and behaviors for the primary methods of SubhaloModelFactory that will be used by the outside world.
Notes
The new methods created here are given standardized names, for consistent communication with the rest of the package. This consistency is particularly important for mock-making, so that the SubhaloModelFactory can always call the same functions regardless of the complexity of the model.
The behaviors of the methods created here are defined elsewhere; set_primary_behaviors just creates a symbolic link to those external behaviors.
- set_warning_suppressions()[source]¶
Method used to determine whether a warning should be issued if the build_init_param_dict method detects the presence of multiple appearances of the same parameter name.
If any of the component model instances have a _suppress_repeated_param_warning attribute that is set to the boolean True value, then no warning will be issued even if there are multiple appearances of the same parameter name. This spares the user warning messages in cases where it is understood that there will be no conflicting behavior.
- update_param_dict_decorator(component_model, func_name)[source]¶
Decorator used to propagate any possible changes in the composite model param_dict down to the appropriate component model param_dict.
The methods bound to the composite model are decorated versions of the methods defined in the component models. The decoration is done with update_param_dict_decorator. For each function that gets bound to the composite model, the decorator searches the param_dict of the component_model associated with the function and updates all matching keys in that param_dict with the values in the param_dict of the composite. This way, all the user needs to do is make changes to the composite model param_dict. Then, when calling any method of the composite model, the changed values of the param_dict automatically propagate down to the component model before calling upon its behavior. This allows the composite_model to control the behavior of functions that it does not define.
- Parameters:
- component_model : obj
Instance of the component model in which the behavior of the function is defined.
- func_name : string
Name of the method in the component model whose behavior is being decorated.
- Returns:
- decorated_func : function
Function object whose behavior is identical to the behavior of the function in the component model, except that the component model param_dict is first updated with any possible changes to corresponding parameters in the composite model param_dict.
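The propagation pattern described above can be sketched with an ordinary Python wrapper. This is a toy illustration, not the library's decorator: make_param_syncing_method, ToyComponent, ToyComposite and the scatter parameter are all hypothetical names introduced for the example.

```python
import functools


def make_param_syncing_method(composite, component, func_name):
    """Wrap a component method so it first syncs matching parameters."""
    func = getattr(component, func_name)

    @functools.wraps(func)
    def decorated_func(*args, **kwargs):
        # Copy every matching key from the composite down to the component.
        for key in component.param_dict:
            if key in composite.param_dict:
                component.param_dict[key] = composite.param_dict[key]
        return func(*args, **kwargs)

    return decorated_func


class ToyComponent:
    def __init__(self):
        self.param_dict = {'scatter': 0.2}

    def mean_scatter(self):
        return self.param_dict['scatter']


class ToyComposite:
    def __init__(self, component):
        self.param_dict = dict(component.param_dict)
        self.mean_scatter = make_param_syncing_method(
            self, component, 'mean_scatter')


component = ToyComponent()
composite = ToyComposite(component)
composite.param_dict['scatter'] = 0.4  # change only the composite
value = composite.mean_scatter()       # the component sees the new value
```

The design choice here is that the user only ever edits the composite's dictionary; the wrapper performs the copy at call time, so the component never needs to know about the composite.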