SubhaloModelFactory

class halotools.empirical_models.SubhaloModelFactory(**kwargs)

Bases: halotools.empirical_models.ModelFactory

Class used to build models of the galaxy-halo connection in which galaxies live at the centers of subhalos.

See Tutorial on building a subhalo-based model for an in-depth description of how to build subhalo-based models, demonstrated by a sequence of increasingly complex examples. If you do not wish to build your own model but want to use one provided by Halotools, instead see PrebuiltSubhaloModelFactory.

All subhalo-based composite models can directly populate catalogs of dark matter halos. For an in-depth description of how Halotools implements this mock-generation, see Tutorial on the algorithm for subhalo-based mock-making.

The arguments passed to the SubhaloModelFactory constructor determine the features of the model that are returned by the factory. This works in one of two ways, both of which have explicit examples provided below.

  1. Building a new model from scratch.

You can build a model from scratch by passing in a sequence of model_features, each of which is an instance of a component model. The factory then composes these independently defined components into a composite model.

  2. Building a new model from an existing model.

It is also possible to add features to, or swap features in, a previously built composite model instance, allowing you to create new models from existing ones. To do this, you pass in a baseline_model_instance and any set of model_features. Any model_features keyword that matches a feature name of the baseline_model_instance will replace that feature in the baseline_model_instance; all other model_features that you pass in will augment the baseline_model_instance with new behavior.
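
The replace-or-augment rule described above behaves like a dictionary update. The following is a minimal sketch of that idea only: merge_model_features is a hypothetical helper, not part of Halotools, and plain strings stand in for component model instances.

```python
def merge_model_features(baseline_dictionary, **model_features):
    """Hypothetical helper: combine a baseline feature dictionary with new features.

    Keys already present in the baseline are replaced; all other keys are added.
    """
    merged = dict(baseline_dictionary)  # copy, so the baseline is left untouched
    merged.update(model_features)
    return merged


# Strings stand in for component model instances in this illustration.
baseline = {'stellar_mass': 'Behroozi10SmHm', 'sfr': 'BinaryGalpropInterpolModel'}
new_model = merge_model_features(
    baseline, stellar_mass='Moster13SmHm', morphology='some_new_component')
```

Here the 'stellar_mass' entry is swapped, 'sfr' is carried over unchanged, and 'morphology' (an invented feature name) is added as new behavior.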

Regardless of what set of features you use to build your model, the returned object can be used to directly populate a halo catalog with mock galaxies using the populate_mock method, as shown in the example below.

Parameters:

**model_features : sequence of keyword arguments, optional

Each keyword you use will be interpreted as the name of a feature in the composite model, e.g. ‘stellar_mass’ or ‘star_formation_rate’; the value bound to each keyword must be an instance of a component model governing the behavior of that feature. See the examples section below.

baseline_model_instance : SubhaloModelFactory instance, optional

If passed to the constructor, the model_dictionary bound to the baseline_model_instance will be treated as the baseline dictionary. Any additional keyword arguments passed to the constructor that appear in the baseline dictionary will be treated as model features that replace the corresponding component model in the baseline dictionary. Any model features passed to the constructor that do not appear in the baseline dictionary will be treated as new features that augment the baseline model with new behavior. See the examples section below.

model_feature_calling_sequence : list, optional

Determines the order in which your component features will be called during mock population.

Some component models may have explicit dependence upon the value of some other galaxy property being modeled. In such a case, you must pass a model_feature_calling_sequence list, ordered in the desired calling sequence.

A classic example is a stellar-to-halo-mass relation that has explicit dependence on the star formation rate of the galaxy (active or quiescent). For this example, you would pass model_feature_calling_sequence = ['sfr_designation', 'stellar_mass', …].

Default behavior is to assume that no model feature has explicit dependence upon any other, in which case the component models appearing in the model_features keyword arguments will be called in random order, except that any stellar_mass and/or luminosity features are called first.

galaxy_selection_func : function object, optional

Function object that imposes a cut on the mock galaxies. Function should take a length-k Astropy table as a single positional argument, and return a length-k numpy boolean array that will be treated as a mask over the rows of the table. If not None, the mask defined by galaxy_selection_func will be applied to the galaxy_table after the table is generated by the populate_mock method. Default is None.

halo_selection_func : function object, optional

Function object used to place a cut on the input table. If the halo_selection_func keyword argument is passed, the input to the function must be a single positional argument storing a length-N structured numpy array or Astropy table; the function output must be a length-N boolean array that will be used as a mask. Halos that are masked will be entirely neglected during mock population.
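
As a rough illustration of the two selection-function signatures described above, the sketch below defines one function of each kind and applies it to a small structured numpy array. The column names ('halo_mvir', 'stellar_mass'), the thresholds, and the table values are assumptions made for this example; the same column-name indexing is assumed to work when the input is an Astropy table.

```python
import numpy as np


def halo_selection_func(halos):
    """Return a boolean mask keeping halos above an assumed mass threshold."""
    return halos['halo_mvir'] > 1e12


def galaxy_selection_func(galaxies):
    """Return a boolean mask keeping galaxies above an assumed stellar mass threshold."""
    return galaxies['stellar_mass'] > 1e10


# Small structured arrays standing in for the real tables (values are made up).
halos = np.array([(1e11,), (5e12,), (2e13,)], dtype=[('halo_mvir', 'f8')])
halo_mask = halo_selection_func(halos)

galaxies = np.array([(3e9,), (4e10,)], dtype=[('stellar_mass', 'f8')])
galaxy_mask = galaxy_selection_func(galaxies)
```

Each function returns one boolean per input row, which is the shape of output the parameter descriptions above require.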

Examples

As described above, there are two different ways to build models using the SubhaloModelFactory. Here we give demonstrations of each in turn.

In the first example we’ll show how to build a model from scratch using the model_features option. We’ll build a composite model from two component models: one modeling stellar mass, one modeling star formation rate designation. We will use the Behroozi10SmHm class to model stellar mass, and the BinaryGalpropInterpolModel class to model whether galaxies are quiescent or star-forming. See the docstrings of these classes for more information about their behavior.

>>> from halotools.empirical_models import Behroozi10SmHm
>>> stellar_mass_model = Behroozi10SmHm(redshift = 0.5)
>>> from halotools.empirical_models import BinaryGalpropInterpolModel
>>> sfr_model = BinaryGalpropInterpolModel(galprop_name = 'quiescent_designation', galprop_abscissa = [12, 15], galprop_ordinates = [0.25, 0.75])

At this point we have two component model instances, stellar_mass_model and sfr_model. The following call to the factory uses the model_features option described above:

>>> from halotools.empirical_models import SubhaloModelFactory
>>> model_instance = SubhaloModelFactory(stellar_mass = stellar_mass_model, sfr = sfr_model)

The feature names we have chosen are ‘stellar_mass’ and ‘sfr’, and to each feature we have attached a component model instance.

In this particular example the assignments of stellar mass and SFR designation are entirely independent, and so no other arguments are necessary. However, if you are building a model in which one or more of your components has explicit dependence on some other feature, then you can use the model_feature_calling_sequence argument; this is a list of the feature names whose order determines the sequence in which the components will be called during mock population:

>>> model_instance = SubhaloModelFactory(stellar_mass = stellar_mass_model, sfr = sfr_model, model_feature_calling_sequence = ['stellar_mass', 'sfr'])

For more details about this optional argument, see The model_feature_calling_sequence mechanism.

Whatever features your composite model has, you can use the populate_mock method to create a Monte Carlo realization of the model by populating any dark matter halo catalog in your cache directory:

>>> from halotools.sim_manager import CachedHaloCatalog
>>> halocat = CachedHaloCatalog(simname = 'bolshoi', redshift = 0.5) 
>>> model_instance.populate_mock(halocat) 

Your model_instance now has a mock attribute storing a synthetic galaxy population. See the populate_mock docstring for details.

There are also convenience functions for estimating the clustering signal predicted by the model. For example, the following method repeatedly populates the Bolshoi simulation with galaxies, computes the 3-d galaxy clustering signal of each mock, computes the median clustering signal in each bin, and returns the result:

>>> r, xi = model_instance.compute_average_galaxy_clustering(num_iterations = 5, simname = 'bolshoi', redshift = 0.5) 

In this next example we’ll show how to build a new model from an existing one using the baseline_model_instance option. We will start from the composite model built in Example 1 above. Here we’ll build a new model that is identical to the model_instance above, except that we use the Moster13SmHm class to model stellar mass.

>>> from halotools.empirical_models import Moster13SmHm
>>> moster_model = Moster13SmHm(redshift = 0.5)
>>> new_model_instance = SubhaloModelFactory(stellar_mass = moster_model, baseline_model_instance = model_instance)

The model_feature_calling_sequence works in the same way as it did in Example 1.

>>> new_model_instance = SubhaloModelFactory(stellar_mass = moster_model, baseline_model_instance = model_instance, model_feature_calling_sequence = ['stellar_mass', 'sfr'])

Methods Summary

build_dtype_list() Create the _galprop_dtypes_to_allocate attribute that determines the name and data type of every galaxy property that will appear in the mock galaxy_table.
build_init_param_dict() Create the param_dict attribute of the instance.
build_model_feature_calling_sequence(…) Method uses the model_feature_calling_sequence passed to __init__, if available.
build_prim_sec_haloprop_list() Method builds the _haloprop_list of strings.
build_publication_list() Method collects together all publications from each of the component models.
populate_mock(halocat[, masking_function]) Method used to populate a simulation with a Monte Carlo realization of a model.
restore_init_param_dict() Reset all values of the current param_dict to the values the class was instantiated with.
set_calling_sequence() Method used to determine the sequence of function calls that will be made during mock population.
set_inherited_methods() Function determines which component model methods are inherited by the composite model.
set_model_redshift()
set_primary_behaviors(**kwargs) Creates names and behaviors for the primary methods of SubhaloModelFactory that will be used by the outside world.
set_warning_suppressions() Method used to determine whether a warning should be issued if the build_init_param_dict method detects the presence of multiple appearances of the same parameter name.
update_param_dict_decorator(component_model, …) Decorator used to propagate any possible changes in the composite model param_dict down to the appropriate component model param_dict.

Methods Documentation

build_dtype_list()

Create the _galprop_dtypes_to_allocate attribute that determines the name and data type of every galaxy property that will appear in the mock galaxy_table.

This attribute is determined by examining the _galprop_dtypes_to_allocate attribute of every component model, and building a composite set of all these dtypes, enforcing self-consistency in cases where the same galaxy property appears more than once.
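
One way to picture this composition is sketched below. It is not the Halotools implementation: merge_galprop_dtypes is a hypothetical helper, the dtype strings are example choices, and raising an error on conflicting declarations is an assumption about what "enforcing self-consistency" could mean.

```python
def merge_galprop_dtypes(component_dtype_lists):
    """Hypothetical helper: merge (name, type) pairs from several components.

    Raises ValueError when the same galaxy property is declared with
    two different data types (an assumed way of enforcing consistency).
    """
    merged = {}
    for dtype_list in component_dtype_lists:
        for name, dtype in dtype_list:
            if name in merged and merged[name] != dtype:
                raise ValueError(
                    "Inconsistent dtypes for %r: %r vs %r" % (name, merged[name], dtype))
            merged[name] = dtype
    return list(merged.items())


# Two components that both declare 'stellar_mass', consistently.
stellar_mass_dtypes = [('stellar_mass', 'f4')]
sfr_dtypes = [('quiescent_designation', 'bool'), ('stellar_mass', 'f4')]
composite_dtypes = merge_galprop_dtypes([stellar_mass_dtypes, sfr_dtypes])
```

The repeated 'stellar_mass' entry appears only once in composite_dtypes because both components agree on its type.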

build_init_param_dict()

Create the param_dict attribute of the instance. The param_dict is a dictionary storing the full collection of parameters controlling the behavior of the composite model.

The param_dict dictionary is determined by examining the param_dict attribute of every component model, and building up a composite dictionary from them. It is permissible for the same parameter name to appear more than once amongst a set of component models, but a warning will be issued in such cases.

Notes

In MCMC applications, the items of param_dict define the possible parameter set explored by the likelihood engine. Changing the values of the parameters in param_dict will propagate to the behavior of the component models when the relevant methods are called.
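
The merging behavior described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the library's code: build_composite_param_dict and its suppress_warning flag are hypothetical, and 'some_other_param' is an invented parameter name.

```python
import warnings


def build_composite_param_dict(component_param_dicts, suppress_warning=False):
    """Hypothetical helper: merge component param_dicts into one dictionary.

    A repeated parameter name is permitted, but triggers a warning
    unless suppress_warning is True.
    """
    composite = {}
    for param_dict in component_param_dicts:
        for key, value in param_dict.items():
            if key in composite and not suppress_warning:
                warnings.warn("Parameter name %r appears more than once" % key)
            composite[key] = value
    return composite


# Record warnings so the repeated name can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    composite = build_composite_param_dict([
        {'scatter_model_param1': 0.2},
        {'scatter_model_param1': 0.2, 'some_other_param': 1.0},
    ])
```

In this sketch the repeated 'scatter_model_param1' key produces a single warning, and the composite dictionary holds one entry per distinct parameter name.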

build_model_feature_calling_sequence(supplementary_kwargs)

Method uses the model_feature_calling_sequence passed to __init__, if available. If no such argument was passed, the method chooses a mostly random order for the calling sequence. The only exception is that features named stellar_mass or luminosity are always called first in the absence of explicit instructions to the contrary.

Parameters:

supplementary_kwargs : dict

Dictionary storing all keyword arguments passed to the __init__ constructor that were not part of the input model dictionary.

Returns:

model_feature_calling_sequence : list

List of strings specifying the order in which the component models will be called upon during mock population to execute their methods.
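
The ordering rule for this method can be sketched as follows. The helper default_calling_sequence is hypothetical, the feature name 'morphology' is invented, and the shuffle simply stands in for "no guaranteed order"; the actual Halotools logic may differ in detail.

```python
import random


def default_calling_sequence(feature_names, explicit_sequence=None):
    """Hypothetical helper mirroring the ordering rule described above."""
    if explicit_sequence is not None:
        # An explicitly supplied sequence always wins.
        return list(explicit_sequence)
    # Features named stellar_mass or luminosity go first, if present.
    priority = [name for name in ('stellar_mass', 'luminosity') if name in feature_names]
    remaining = [name for name in feature_names if name not in priority]
    random.shuffle(remaining)  # the rest are called in no particular order
    return priority + remaining


sequence = default_calling_sequence(['sfr', 'morphology', 'stellar_mass'])
explicit = default_calling_sequence(
    ['sfr', 'stellar_mass'], explicit_sequence=['sfr', 'stellar_mass'])
```

Without an explicit sequence, 'stellar_mass' is placed first and the other features follow in arbitrary order; with one, the supplied order is returned unchanged.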

build_prim_sec_haloprop_list()

Method builds the _haloprop_list of strings.

This list stores the names of all halo catalog columns that appear as either prim_haloprop_key or sec_haloprop_key of any component model. For all strings appearing in _haloprop_list, the mock galaxy_table will have a corresponding column storing the halo property inherited by the mock galaxy.
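
A minimal sketch of how such a list could be gathered is shown below. ToyComponent and collect_haloprop_names are hypothetical stand-ins, and the column names 'halo_mpeak' and 'halo_nfw_conc' are example choices only.

```python
class ToyComponent:
    """Hypothetical stand-in for a component model instance."""

    def __init__(self, prim_haloprop_key=None, sec_haloprop_key=None):
        # Only set the attributes that were actually supplied.
        if prim_haloprop_key is not None:
            self.prim_haloprop_key = prim_haloprop_key
        if sec_haloprop_key is not None:
            self.sec_haloprop_key = sec_haloprop_key


def collect_haloprop_names(components):
    """Gather each distinct prim/sec haloprop key once, in first-seen order."""
    names = []
    for component in components:
        for attr in ('prim_haloprop_key', 'sec_haloprop_key'):
            key = getattr(component, attr, None)
            if key is not None and key not in names:
                names.append(key)
    return names


components = [ToyComponent('halo_mpeak'), ToyComponent('halo_mpeak', 'halo_nfw_conc')]
haloprop_list = collect_haloprop_names(components)
```

Each name in haloprop_list would then correspond to one halo-property column carried over into the mock galaxy_table.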

build_publication_list()

Method collects together all publications from each of the component models.

populate_mock(halocat, masking_function=None, **kwargs)

Method used to populate a simulation with a Monte Carlo realization of a model.

After calling this method, the model instance will have a new mock attribute. You can then access the galaxy population via model.mock.galaxy_table, an Astropy Table.

See Tutorial on the algorithm for subhalo-based mock-making for an in-depth tutorial on the mock-making algorithm.

Parameters:

halocat : object

Either an instance of CachedHaloCatalog or UserSuppliedHaloCatalog.

masking_function : function, optional

Function object used to place a mask on the halo table prior to calling the mock generating functions. Calling signature of the function should be to accept a single positional argument storing a table, and returning a boolean numpy array that will be used as a fancy indexing mask. All masked halos will be ignored during mock population. Default is None.

Notes

Note the difference between the halotools.empirical_models.SubhaloMockFactory.populate method and the closely related method halotools.empirical_models.SubhaloModelFactory.populate_mock. The populate_mock method is bound to a composite model instance and is called the first time a composite model is used to generate a mock. Calling the populate_mock method creates the SubhaloMockFactory instance and binds it to the composite model. From then on, if you want to repopulate a new Universe with the same composite model, you should instead call the populate method bound to model.mock. The reason for this distinction is that calling populate_mock triggers a large number of relatively expensive pre-processing steps and self-consistency checks that need only be carried out once. See the Examples section below for an explicit demonstration.

In particular, if you are running an MCMC type analysis, you will choose your halo catalog and completeness cuts, and call populate_mock with the appropriate arguments. Thereafter, you can explore parameter space by changing the values stored in the param_dict dictionary attached to the model, and then calling the populate method bound to model.mock. Any changes to the param_dict of the model will automatically propagate into the behavior of the populate method.

Examples

Here we’ll use a pre-built model to demonstrate basic usage. The syntax shown below is the same for all composite models, whether they are pre-built by Halotools or built by you with SubhaloModelFactory.

>>> from halotools.empirical_models import PrebuiltSubhaloModelFactory
>>> model_instance = PrebuiltSubhaloModelFactory('behroozi10')

Here we will use a fake simulation, but you can populate mocks using any instance of CachedHaloCatalog or UserSuppliedHaloCatalog.

>>> from halotools.sim_manager import FakeSim
>>> halocat = FakeSim()
>>> model_instance.populate_mock(halocat)

Your model_instance now has a mock attribute bound to it. You can call the populate method bound to the mock, which will repopulate the halo catalog with a new Monte Carlo realization of the model.

>>> model_instance.mock.populate()

If you want to change the behavior of your model, just change the values stored in the param_dict. Differences in the parameter values will change the behavior of the mock-population.

>>> model_instance.param_dict['scatter_model_param1'] = 0.25
>>> model_instance.mock.populate()

restore_init_param_dict()

Reset all values of the current param_dict to the values the class was instantiated with.

Primary behaviors are reset as well, as this is how the inherited behaviors get bound to the values in param_dict.

set_calling_sequence()

Method used to determine the sequence of function calls that will be made during mock population. The methods of each component model will be called one after the other; the order in which the component models are called upon is determined by _model_feature_calling_sequence. When each component model is called, the sequence of methods that are called for that component is determined by the _mock_generation_calling_sequence attribute bound to the component model instance. See The model_feature_calling_sequence mechanism for further details.
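
The two-level ordering described above (features first, then each component's own methods) can be sketched as a nested loop. ToyComponent, flatten_calling_sequence, and the method names 'mc_stellar_mass' and 'mc_quiescent_designation' are hypothetical names used only for illustration.

```python
class ToyComponent:
    """Hypothetical stand-in for a component model instance."""

    def __init__(self, calling_sequence):
        self._mock_generation_calling_sequence = calling_sequence


def flatten_calling_sequence(model_dictionary, model_feature_calling_sequence):
    """Return (feature_name, method_name) pairs in the order they would be called."""
    calls = []
    for feature_name in model_feature_calling_sequence:
        component = model_dictionary[feature_name]
        for method_name in component._mock_generation_calling_sequence:
            calls.append((feature_name, method_name))
    return calls


model_dictionary = {
    'stellar_mass': ToyComponent(['mc_stellar_mass']),
    'sfr': ToyComponent(['mc_quiescent_designation']),
}
calls = flatten_calling_sequence(model_dictionary, ['sfr', 'stellar_mass'])
```

With the feature sequence ['sfr', 'stellar_mass'], every method of the 'sfr' component is listed before any method of the 'stellar_mass' component.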

set_inherited_methods()

Function determines which component model methods are inherited by the composite model.

Each component model should have a _mock_generation_calling_sequence attribute that provides the sequence of method names to call during mock population. Additionally, each component should have a _methods_to_inherit attribute that determines which methods will be inherited by the composite model. The _mock_generation_calling_sequence list should be a subset of _methods_to_inherit. If any of the above conditions fail, no exception will be raised during the construction of the composite model. Instead, an empty list will be forcibly attached to each component model for which these lists may have been missing. Also, for each component model, if there are any elements of _mock_generation_calling_sequence that were missing from _methods_to_inherit, all such elements will be forcibly added to that component model’s _methods_to_inherit.

Finally, each component model should have an _attrs_to_inherit attribute that determines which attributes will be inherited by the composite model. If any component models did not implement the _attrs_to_inherit, an empty list is forcibly added to the component model.

After calling the set_inherited_methods method, it will therefore be entirely safe to run a for loop over each component model’s _methods_to_inherit and _attrs_to_inherit, even if these lists were forgotten or irrelevant to that particular component.
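
The defaulting rules above can be illustrated with a short sketch. ToyComponent and fill_in_inheritance_lists are hypothetical, as are the method names; the code only mirrors the behavior described in the text, not the library source.

```python
class ToyComponent:
    """Hypothetical stand-in for a component model instance."""


def fill_in_inheritance_lists(component):
    """Attach missing lists and make the calling sequence a subset of the inherited methods."""
    for attr in ('_mock_generation_calling_sequence',
                 '_methods_to_inherit',
                 '_attrs_to_inherit'):
        if not hasattr(component, attr):
            setattr(component, attr, [])  # a missing list becomes an empty list
    for method_name in component._mock_generation_calling_sequence:
        if method_name not in component._methods_to_inherit:
            component._methods_to_inherit.append(method_name)
    return component


# A component that defines none of the lists.
bare = fill_in_inheritance_lists(ToyComponent())

# A component whose calling sequence is not yet a subset of its inherited methods.
partial = ToyComponent()
partial._mock_generation_calling_sequence = ['mc_stellar_mass']
partial._methods_to_inherit = ['mean_stellar_mass']
fill_in_inheritance_lists(partial)
```

After the call, looping over _methods_to_inherit or _attrs_to_inherit is safe for both components, and 'mc_stellar_mass' has been added to the inherited methods of the second one.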

set_model_redshift()

set_primary_behaviors(**kwargs)

Creates names and behaviors for the primary methods of SubhaloModelFactory that will be used by the outside world.

Notes

The new methods created here are given standardized names, for consistent communication with the rest of the package. This consistency is particularly important for mock-making, so that the SubhaloModelFactory can always call the same functions regardless of the complexity of the model.

The behaviors of the methods created here are defined elsewhere; set_primary_behaviors just creates a symbolic link to those external behaviors.

set_warning_suppressions()

Method used to determine whether a warning should be issued if the build_init_param_dict method detects the presence of multiple appearances of the same parameter name.

If any of the component model instances have a _suppress_repeated_param_warning attribute that is set to the boolean True value, then no warning will be issued even if there are multiple appearances of the same parameter name. This allows the user to not be bothered with warning messages for cases where it is understood that there will be no conflicting behavior.
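
As a small sketch of this check, assuming components are plain objects that may or may not carry the attribute, the decision reduces to a single any() call. ToyComponent and should_suppress_warning are hypothetical names.

```python
class ToyComponent:
    """Hypothetical stand-in for a component model instance."""

    def __init__(self, suppress=None):
        # Leave the attribute undefined unless a value was supplied.
        if suppress is not None:
            self._suppress_repeated_param_warning = suppress


def should_suppress_warning(components):
    """True if any component explicitly sets the suppression flag to True."""
    return any(
        getattr(component, '_suppress_repeated_param_warning', False) is True
        for component in components
    )


suppressed = should_suppress_warning([ToyComponent(), ToyComponent(True)])
not_suppressed = should_suppress_warning([ToyComponent(), ToyComponent(False)])
```

A single component with the flag set to True is enough to silence the warning for the whole composite model in this sketch.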

update_param_dict_decorator(component_model, func_name)

Decorator used to propagate any possible changes in the composite model param_dict down to the appropriate component model param_dict.

The methods bound to the composite model are decorated versions of the methods defined in the component models. The decoration is done with update_param_dict_decorator. For each function that gets bound to the composite model, this decorator searches the param_dict of the component_model associated with the function, and updates all matching keys in that param_dict with the values in the param_dict of the composite. This way, all the user needs to do is make changes to the composite model param_dict. Then, when calling any method of the composite model, the changed values of the param_dict automatically propagate down to the component model before calling upon its behavior. This allows the composite model to control the behavior of functions that it does not define.

Parameters:

component_model : obj

Instance of the component model in which the behavior of the function is defined.

func_name : string

Name of the method in the component model whose behavior is being decorated.

Returns:

decorated_func : function

Function object whose behavior is identical to the behavior of the function in the component model, except that the component model param_dict is first updated with any possible changes to corresponding parameters in the composite model param_dict.
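
The propagation mechanism can be sketched with toy classes. Everything below (ToyComponent, ToyComposite, the scatter method) is hypothetical and only mirrors the behavior described above; the parameter name 'scatter_model_param1' is reused from the populate_mock example for continuity.

```python
from functools import wraps


class ToyComponent:
    """Hypothetical component model with one parameter and one method."""

    def __init__(self):
        self.param_dict = {'scatter_model_param1': 0.2}

    def scatter(self):
        return self.param_dict['scatter_model_param1']


class ToyComposite:
    """Hypothetical composite model that inherits a decorated method."""

    def __init__(self, component):
        self.param_dict = dict(component.param_dict)
        self.scatter = self.update_param_dict_decorator(component, 'scatter')

    def update_param_dict_decorator(self, component_model, func_name):
        func = getattr(component_model, func_name)

        @wraps(func)
        def decorated_func(*args, **kwargs):
            # Copy matching keys from the composite down to the component
            # before calling the component's own behavior.
            for key in component_model.param_dict:
                if key in self.param_dict:
                    component_model.param_dict[key] = self.param_dict[key]
            return func(*args, **kwargs)

        return decorated_func


composite = ToyComposite(ToyComponent())
before = composite.scatter()
composite.param_dict['scatter_model_param1'] = 0.25
after = composite.scatter()
```

Only the composite param_dict is edited between the two calls, yet the second call sees the new value because the decorator pushes it down to the component first.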