
class halotools.utils.SampleSelector

Bases: object

Container class for commonly used sample selections.

Methods Summary


host_halo_selection([return_subhalos])

Method divides the sample into host halos and subhalos, and returns either the hosts alone or both the hosts and the subhalos, depending on the value of the input return_subhalos.

property_range([lower_bound, upper_bound, ...])

Method makes a cut on an input table column based on an input upper and lower bound, and returns the cut table.


split_sample(**kwargs)

Method divides a sample into subsamples based on the percentile ranking of a given property.

Methods Documentation

static host_halo_selection(return_subhalos=False, **kwargs)

Method divides the sample into host halos and subhalos, and returns either the hosts alone or both the hosts and the subhalos, depending on the value of the input return_subhalos.
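
As a rough illustration of the host/subhalo division, the sketch below works on plain Python dictionaries rather than an Astropy Table, so it runs without halotools installed. It assumes that host halos are flagged by halo_upid == -1, a convention this section does not state. The helper name split_hosts_and_subs is hypothetical and is not part of the halotools API.

```python
# Hypothetical stand-in for a halo table: one dict per halo.
# Assumption: a host halo has halo_upid == -1; any other value marks a subhalo.
halos = [
    {"halo_id": 1, "halo_upid": -1, "halo_mvir": 3e12},
    {"halo_id": 2, "halo_upid": 1, "halo_mvir": 5e10},
    {"halo_id": 3, "halo_upid": -1, "halo_mvir": 8e11},
    {"halo_id": 4, "halo_upid": 3, "halo_mvir": 2e10},
]


def split_hosts_and_subs(rows, return_subhalos=False):
    """Return the hosts, or (hosts, subs) when return_subhalos is True."""
    hosts = [row for row in rows if row["halo_upid"] == -1]
    if not return_subhalos:
        return hosts
    subs = [row for row in rows if row["halo_upid"] != -1]
    return hosts, subs


# Default call: hosts only.
hosts = split_hosts_and_subs(halos)

# With return_subhalos=True the subhalos come back as a second value.
hosts_again, subs = split_hosts_and_subs(halos, return_subhalos=True)
```

The point of the sketch is the return shape: one collection by default, two when return_subhalos is set.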

static property_range(lower_bound=-inf, upper_bound=inf, return_complement=False, host_halos_only=False, subhalos_only=False, **kwargs)

Method makes a cut on an input table column based on an input upper and lower bound, and returns the cut table.

Parameters

table : Astropy Table object, keyword argument

key : string, keyword argument

Column name that will be used to apply the cut.

lower_bound : float, optional keyword argument

Minimum value for the input column of the returned table. Default is \(-\infty\).

upper_bound : float, optional keyword argument

Maximum value for the input column of the returned table. Default is \(+\infty\).

return_complement : bool, optional keyword argument

If True, property_range also returns the table elements that do not pass the cut, as the second return value. Default is False.

host_halos_only : bool, optional keyword argument

If True, property_range will use the host_halo_selection method to make an additional cut on the sample so that only host halos are returned. Default is False.

subhalos_only : bool, optional keyword argument

If True, property_range will use the host_halo_selection method to make an additional cut on the sample so that only subhalos are returned. Default is False.

Returns

cut_table : Astropy Table object
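
The cut and its complement can be sketched in plain Python as follows. This is only an illustration of the behavior described above, not the halotools implementation: the function name range_cut is hypothetical, rows are plain dictionaries instead of an Astropy Table, and the bounds are assumed to be inclusive, which this section does not specify.

```python
import math

# Hypothetical stand-in for a halo table: one dict per halo.
halos = [
    {"halo_id": 1, "halo_mvir": 5e11},
    {"halo_id": 2, "halo_mvir": 2e12},
    {"halo_id": 3, "halo_mvir": 9e12},
    {"halo_id": 4, "halo_mvir": 4e13},
]


def range_cut(rows, key, lower_bound=-math.inf, upper_bound=math.inf,
              return_complement=False):
    """Keep rows whose `key` value lies in [lower_bound, upper_bound].

    Assumption: both bounds are inclusive.
    """
    passed = [r for r in rows if lower_bound <= r[key] <= upper_bound]
    if not return_complement:
        return passed
    # The complement is every row that failed the cut.
    failed = [r for r in rows if not (lower_bound <= r[key] <= upper_bound)]
    return passed, failed


in_range, out_of_range = range_cut(
    halos, "halo_mvir", lower_bound=1e12, upper_bound=1e13,
    return_complement=True)
```

With the default infinite bounds every row passes, so a call such as range_cut(halos, "halo_mvir") returns the full sample.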


To demonstrate the property_range method, we will start out by loading a table of halos into memory using the FakeSim class:

>>> from halotools.sim_manager import FakeSim
>>> from halotools.utils import SampleSelector
>>> halocat = FakeSim()
>>> halos = halocat.halo_table

To make a cut on the halo catalog to select halos in a specific mass range:

>>> halo_sample = SampleSelector.property_range(table=halos, key='halo_mvir', lower_bound=1e12, upper_bound=1e13)

To apply this same cut, and also only select host halos passing the cut, we use the host_halos_only keyword:

>>> host_halo_sample = SampleSelector.property_range(table=halos, key='halo_mvir', lower_bound=1e12, upper_bound=1e13, host_halos_only=True)

The same applies if we want only subhalos returned, except that we now use the subhalos_only keyword:

>>> subhalo_sample = SampleSelector.property_range(table=halos, key='halo_mvir', lower_bound=1e12, upper_bound=1e13, subhalos_only=True)

static split_sample(**kwargs)

Method divides a sample into subsamples based on the percentile ranking of a given property.

Parameters

table : Astropy Table object, keyword argument

key : string, keyword argument

Column name that will be used to define the percentiles.

percentiles : keyword argument

Sequence of percentiles used to define the returned subsamples. If percentiles has more than one element, the elements must be monotonically increasing. If percentiles is length-N, there will be N+1 returned subsamples.



To demonstrate the split_sample method, we will start out by loading a table of halos into memory using the FakeSim class:

>>> from halotools.sim_manager import FakeSim
>>> from halotools.utils import SampleSelector
>>> halocat = FakeSim()
>>> halos = halocat.halo_table

We can use split_sample to divide the sample into low-Vmax and high-Vmax subsamples:

>>> sample_below_median, sample_above_median = SampleSelector.split_sample(table=halos, key='halo_vmax', percentiles=0.5)

Likewise, we can divide the sample into quartiles:

>>> lowest, lower, higher, highest = SampleSelector.split_sample(table=halos, key='halo_zhalf', percentiles=[0.25, 0.5, 0.75])

The following alternative syntax is also supported:

>>> subsample_collection = SampleSelector.split_sample(table=halos, key='halo_zhalf', percentiles=[0.25, 0.5, 0.75])
>>> lowest, lower, higher, highest = subsample_collection
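
To make the "N percentiles give N+1 subsamples" rule concrete, here is a minimal plain-Python sketch of a percentile split. It is not the halotools implementation. The name split_by_percentile is hypothetical, rows are plain dictionaries, and mapping each percentile p to the cut index int(p * n) of the rank-ordered sample is an assumption made for illustration.

```python
def split_by_percentile(rows, key, percentiles):
    """Split rows into len(percentiles) + 1 groups ranked by `key`."""
    # Accept a single number as well as a sequence of numbers.
    if isinstance(percentiles, (int, float)):
        percentiles = [percentiles]
    percentiles = list(percentiles)
    if any(b <= a for a, b in zip(percentiles, percentiles[1:])):
        raise ValueError("percentiles must be monotonically increasing")

    ranked = sorted(rows, key=lambda row: row[key])
    n = len(ranked)
    # Assumption: percentile p corresponds to cut index int(p * n).
    edges = [0] + [int(p * n) for p in percentiles] + [n]
    # Consecutive edge pairs delimit the N + 1 subsamples.
    return [ranked[lo:hi] for lo, hi in zip(edges, edges[1:])]


# Hypothetical sample of eight halos with made-up halo_vmax values.
halos = [{"halo_id": i, "halo_vmax": float(v)}
         for i, v in enumerate([310, 120, 250, 90, 400, 180, 220, 150])]

below_median, above_median = split_by_percentile(halos, "halo_vmax", 0.5)
quartiles = split_by_percentile(halos, "halo_vmax", [0.25, 0.5, 0.75])
```

Because the function returns a list, the result can be unpacked directly or kept as a collection, mirroring the two call styles shown above.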