SampleSelector

class halotools.utils.SampleSelector

Bases: object

Container class for commonly used sample selections.

Methods Summary

host_halo_selection([return_subhalos]) Method divides the sample into host halos and subhalos, and returns either the hosts alone or both the hosts and the subhalos, depending on the value of the input return_subhalos.
property_range([lower_bound, upper_bound, …]) Method makes a cut on an input table column based on an input upper and lower bound, and returns the cut table.
split_sample(**kwargs) Method divides a sample into subsamples based on the percentile ranking of a given property.

Methods Documentation

static host_halo_selection(return_subhalos=False, **kwargs)

Method divides the sample into host halos and subhalos, and returns either the hosts alone or both the hosts and the subhalos, depending on the value of the input return_subhalos.
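The division described above can be illustrated with a minimal pure-Python sketch. This is not the halotools implementation: the function name host_halo_selection_sketch, the is_host helper, the toy rows, and the convention that a host halo is marked by halo_upid equal to -1 are all assumptions made for illustration, as is the choice to return the hosts and subhalos as a pair.

```python
# Illustrative sketch only; not the halotools implementation.
# Assumption: a halo counts as a host when its 'halo_upid' field equals -1.

def is_host(halo):
    """Hypothetical helper: True when the halo has no parent halo."""
    return halo["halo_upid"] == -1


def host_halo_selection_sketch(halos, return_subhalos=False):
    """Split rows into hosts and subhalos.

    Returns only the hosts by default, or a (hosts, subhalos) pair
    when return_subhalos is True.
    """
    hosts = [h for h in halos if is_host(h)]
    if not return_subhalos:
        return hosts
    subhalos = [h for h in halos if not is_host(h)]
    return hosts, subhalos


# Toy data standing in for a halo table.
toy_halos = [
    {"halo_id": 1, "halo_upid": -1},
    {"halo_id": 2, "halo_upid": 1},
    {"halo_id": 3, "halo_upid": -1},
]
hosts_only = host_halo_selection_sketch(toy_halos)
hosts, subs = host_halo_selection_sketch(toy_halos, return_subhalos=True)
```

In this sketch every row lands in exactly one of the two groups, so the hosts and subhalos together always reproduce the input sample.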

static property_range(lower_bound=-inf, upper_bound=inf, return_complement=False, host_halos_only=False, subhalos_only=False, **kwargs)

Method makes a cut on an input table column based on an input upper and lower bound, and returns the cut table.

Parameters:

table : Astropy Table object, keyword argument

key : string, keyword argument

Column name that will be used to apply the cut

lower_bound : float, optional keyword argument

Minimum value for the input column of the returned table. Default is \(-\infty\).

upper_bound : float, optional keyword argument

Maximum value for the input column of the returned table. Default is \(+\infty\).

return_complement : bool, optional keyword argument

If True, property_range also returns the table elements that do not pass the cut, as a second return value. Default is False.

host_halos_only : bool, optional keyword argument

If True, property_range will use the host_halo_selection method to make an additional cut on the sample so that only host halos are returned. Default is False.

subhalos_only : bool, optional keyword argument

If True, property_range will use the host_halo_selection method to make an additional cut on the sample so that only subhalos are returned. Default is False.

Returns:

cut_table : Astropy Table object

Table containing the rows of the input table that pass the cut.

Examples

To demonstrate the property_range method, we will start out by loading a table of halos into memory using the FakeSim class:

>>> from halotools.sim_manager import FakeSim
>>> from halotools.utils import SampleSelector
>>> halocat = FakeSim()
>>> halos = halocat.halo_table

To make a cut on the halo catalog to select halos in a specific mass range:

>>> halo_sample = SampleSelector.property_range(
...     table=halos, key='halo_mvir', lower_bound=1e12, upper_bound=1e13)

To apply this same cut, and also only select host halos passing the cut, we use the host_halos_only keyword:

>>> host_halo_sample = SampleSelector.property_range(
...     table=halos, key='halo_mvir', lower_bound=1e12, upper_bound=1e13,
...     host_halos_only=True)

The same applies if we want only subhalos returned, except that we now use the subhalos_only keyword:

>>> subhalo_sample = SampleSelector.property_range(
...     table=halos, key='halo_mvir', lower_bound=1e12, upper_bound=1e13,
...     subhalos_only=True)
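
The effect of the bounds and of return_complement can be sketched in plain Python. This is an illustration under stated assumptions, not the halotools code: the function name property_range_sketch and the toy rows are hypothetical, and treating both bounds as inclusive is an assumption that the documentation above does not confirm.

```python
import math

# Illustrative sketch only; not the halotools implementation.
# Assumption: both bounds are treated as inclusive.


def property_range_sketch(rows, key, lower_bound=-math.inf,
                          upper_bound=math.inf, return_complement=False):
    """Keep the rows whose value for `key` lies within the bounds.

    When return_complement is True, also return the rows that fail
    the cut, as the second element of a pair.
    """
    def passes(row):
        return lower_bound <= row[key] <= upper_bound

    passing = [r for r in rows if passes(r)]
    if not return_complement:
        return passing
    failing = [r for r in rows if not passes(r)]
    return passing, failing


# Toy data standing in for a halo table.
toy_rows = [{"halo_mvir": m} for m in (5e11, 2e12, 9e12, 3e13)]
in_range, out_of_range = property_range_sketch(
    toy_rows, "halo_mvir", lower_bound=1e12, upper_bound=1e13,
    return_complement=True)
```

With the default bounds of negative and positive infinity, every finite value passes the cut, so the sketch returns the full sample unchanged.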

static split_sample(**kwargs)

Method divides a sample into subsamples based on the percentile ranking of a given property.

Parameters:

table : Astropy Table object, keyword argument

key : string, keyword argument

Column name that will be used to define the percentiles

percentiles : array_like

Sequence of percentiles used to define the returned subsamples. If percentiles has more than one element, the elements must be monotonically increasing. If percentiles is length-N, there will be N+1 returned subsamples.

Returns:

subsamples : list

List of the N+1 subsamples, ordered from the lowest to the highest values of the input column.

Examples

To demonstrate the split_sample method, we will start out by loading a table of halos into memory using the FakeSim class:

>>> from halotools.sim_manager import FakeSim
>>> from halotools.utils import SampleSelector
>>> halocat = FakeSim()
>>> halos = halocat.halo_table

We can use split_sample to divide the sample into high-Vmax and low-Vmax subsamples:

>>> sample_below_median, sample_above_median = SampleSelector.split_sample(
...     table=halos, key='halo_vmax', percentiles=0.5)

Likewise, we can divide the sample into quartiles:

>>> lowest, lower, higher, highest = SampleSelector.split_sample(
...     table=halos, key='halo_zhalf', percentiles=[0.25, 0.5, 0.75])

The following alternative syntax is also supported:

>>> subsample_collection = SampleSelector.split_sample(
...     table=halos, key='halo_zhalf', percentiles=[0.25, 0.5, 0.75])
>>> lowest, lower, higher, highest = subsample_collection
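
To make the relationship between N percentiles and N+1 subsamples concrete, here is a minimal pure-Python sketch of a percentile-rank split. It is not the halotools implementation: the function name split_sample_sketch and the toy rows are hypothetical, and the way boundaries are placed (truncating p times the sample size to an integer index after sorting) and the strictly increasing check are assumptions chosen for simplicity.

```python
# Illustrative sketch only; not the halotools implementation.
# Assumptions: percentiles are fractions between 0 and 1, and each
# boundary falls at index int(p * n) of the rank-ordered sample.


def split_sample_sketch(rows, key, percentiles):
    """Divide rows into len(percentiles) + 1 rank-ordered subsamples."""
    # Accept either a single number or a sequence of numbers.
    try:
        cuts = list(percentiles)
    except TypeError:
        cuts = [percentiles]
    if any(b <= a for a, b in zip(cuts, cuts[1:])):
        raise ValueError("percentiles must be monotonically increasing")

    ranked = sorted(rows, key=lambda r: r[key])
    n = len(ranked)
    # Index boundaries of the subsamples within the ranked list.
    edges = [0] + [int(p * n) for p in cuts] + [n]
    return [ranked[lo:hi] for lo, hi in zip(edges, edges[1:])]


# Toy data standing in for a halo table.
toy_rows = [{"halo_vmax": v} for v in (300, 100, 400, 200, 250, 150, 350, 50)]
below, above = split_sample_sketch(toy_rows, "halo_vmax", 0.5)
quartiles = split_sample_sketch(toy_rows, "halo_vmax", [0.25, 0.5, 0.75])
```

Because the boundaries partition the ranked list, each row appears in exactly one subsample, and the subsamples are ordered from lowest to highest values of the chosen column.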