Quickstart guide to analyzing halo catalogs
In this section of the documentation we’ll give a quick demonstration
of how information in Halotools-formatted halo catalogs is organized.
In particular, you’ll see how to access both the halo catalog metadata
and the Astropy Table storing the tabular halo data.
For more in-depth information about how to analyze halo catalogs, see the Tutorials on analyzing halo catalogs section of the documentation. This quickstart guide assumes you have followed the Getting started with Halotools section of the documentation, so that you already have the default halo catalog stored on your machine.
Loading cached halo catalogs into memory
To load the default halo catalog into memory, just instantiate
the CachedHaloCatalog class with no arguments:
from halotools.sim_manager import CachedHaloCatalog
halocat = CachedHaloCatalog()
You may find it useful to read the documentation of the
CachedHaloCatalog class together with this quickstart guide.
The default halo catalog in Halotools is the redshift-zero Bolshoi simulation with halos identified using Rockstar. This is reflected in the metadata of the halo catalog:
print(halocat.simname, halocat.halo_finder, halocat.redshift)
bolshoi rockstar -0.0003
Loading alternative catalogs
As described in the documentation on the CachedHaloCatalog class,
you can access any cached halo catalog using the same syntax as above, but using
keyword arguments to specify which cached catalog you’d like. For example, if you
have used the halotools/scripts/download_additional_halocat.py script to
download the Bolshoi-Planck z = 0.5 snapshot, then you can load that catalog
into memory as follows:
halocat = CachedHaloCatalog(simname='bolplanck', redshift=0.5)
Note that the CachedHaloCatalog class
works with any Halotools-formatted halo catalog stored in any disk location,
not just Halotools-provided snapshots stored in the default cache location.
This includes your own reductions of the publicly available Rockstar catalogs
and/or your own proprietary simulation with halos identified by whatever method you prefer.
Organization of halo information
A Halotools-formatted halo catalog comes equipped with both the tabular data associated with the halos, and metadata about the simulation snapshot. In this quickstart guide, we’ll demonstrate how to access both kinds of information in the two sections below.
Accessing the tabular data storing the halo catalog
The catalog of halos itself is stored as the halo_table attribute in
the form of an Astropy Table object:
halos = halocat.halo_table
To see what halo properties are available, you can use the keys
method, just like a Python dictionary:
print(halos.keys())
['halo_vmax_firstacc', 'halo_dmvir_dt_tdyn', 'halo_macc', 'halo_scale_factor', 'halo_vmax_mpeak', 'halo_m_pe_behroozi', 'halo_xoff', 'halo_spin', 'halo_scale_factor_firstacc', 'halo_c_to_a', 'halo_mvir_firstacc', 'halo_scale_factor_last_mm', 'halo_scale_factor_mpeak', 'halo_pid', 'halo_m500c', 'halo_id', 'halo_halfmass_scale_factor', 'halo_upid', 'halo_t_by_u', 'halo_rvir', 'halo_vpeak', 'halo_dmvir_dt_100myr', 'halo_mpeak', 'halo_m_pe_diemer', 'halo_jx', 'halo_jy', 'halo_jz', 'halo_m2500c', 'halo_mvir', 'halo_voff', 'halo_axisA_z', 'halo_axisA_x', 'halo_axisA_y', 'halo_y', 'halo_b_to_a', 'halo_x', 'halo_z', 'halo_m200b', 'halo_vacc', 'halo_scale_factor_lastacc', 'halo_vmax', 'halo_m200c', 'halo_vx', 'halo_vy', 'halo_vz', 'halo_dmvir_dt_inst', 'halo_rs', 'halo_nfw_conc', 'halo_hostid', 'halo_mvir_host_halo']
You can read about the conventions used to define subhalos vs. host halos in the Rockstar halo and subhalo nomenclature conventions section of the documentation. For a thorough discussion of the meaning of each column in these halo catalogs, see the appendix of Rodriguez-Puebla et al. (2016).
You can select a particular sample of halos using a NumPy boolean mask:
mask = (halos['halo_mvir'] > 1e12) & (halos['halo_mvir'] < 2e12) & (halos['halo_upid'] == -1)
milky_way_halos = halos[mask]
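To see how such a mask behaves without loading a full catalog, here is a minimal sketch on a tiny synthetic table. The NumPy structured array and all of its values are invented for illustration and merely stand in for halocat.halo_table; the column names follow the list printed above, and halo_upid == -1 is used to pick out host halos, as in the mask above.

```python
import numpy as np

# Synthetic stand-in for halocat.halo_table (all values are invented).
halos = np.array(
    [
        (1, -1, 1.5e12),  # host halo inside the mass range
        (2, 1, 1.2e12),   # subhalo of halo 1 (halo_upid != -1)
        (3, -1, 5.0e11),  # host halo, below the mass range
        (4, -1, 1.9e12),  # host halo inside the mass range
        (5, -1, 3.0e13),  # host halo, above the mass range
    ],
    dtype=[("halo_id", "i8"), ("halo_upid", "i8"), ("halo_mvir", "f8")],
)

# Each comparison yields a boolean array; & combines them element-wise.
# The parentheses are required because & binds more tightly than < and >.
mask = (
    (halos["halo_mvir"] > 1e12)
    & (halos["halo_mvir"] < 2e12)
    & (halos["halo_upid"] == -1)
)
milky_way_halos = halos[mask]

print(milky_way_halos["halo_id"])
```

Only rows for which every condition holds survive the selection, so the subhalo is dropped even though its mass falls inside the requested range.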
Accessing the snapshot metadata
All metadata associated with a Halotools-formatted halo catalog is
accessible via attributes of the CachedHaloCatalog object:
print(halocat.redshift, halocat.Lbox)
0.4966 250.0
The Lbox attribute can be useful in performing calculations, for
example in accounting for the periodic boundary conditions of the
simulation. There are also many attributes dedicated to rigorously
keeping track of how a halo catalog was processed.
For example, during the initial processing of the halo catalog, cuts may
have been placed on certain columns of the halo catalog. If you
processed your halo catalog using the
halotools.sim_manager.RockstarHlistReader, every cut you used to
reduce the halo catalog will have a corresponding attribute reminding
you of the choice you made during the data reduction. In the
Halotools-provided snapshots, any (sub)halo that never had more than 300
particles at any point in its assembly history was discarded. The
halo_mpeak column of the halo table stores the largest value of the
virial mass ever attained by the halo throughout its assembly history,
and so this 300-particle cut is reflected by the halo_mpeak_row_cut_min
attribute of the halo catalog:
print("Minimum halo_mpeak = %.2e" % halocat.halo_mpeak_row_cut_min)
Minimum halo_mpeak = 4.05e+10
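The printed value can be checked with simple arithmetic: the cut is the particle-number threshold multiplied by the mass of a single simulation particle. The sketch below assumes a particle mass of 1.35e8 Msun/h, the commonly quoted value for Bolshoi; that number and the variable names are assumptions made for illustration, not values read from a catalog.

```python
# Assumed inputs (not read from a catalog):
particle_mass = 1.35e8   # Msun/h, commonly quoted Bolshoi particle mass (assumption)
min_num_particles = 300  # particle-number threshold quoted in the text

# Smallest peak mass a (sub)halo can have and still pass the cut.
min_mpeak = min_num_particles * particle_mass

print("Minimum halo_mpeak = %.2e" % min_mpeak)
```

Under these assumptions the product is 4.05e10, matching the value of halo_mpeak_row_cut_min shown above.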
Because simple bookkeeping errors are so common in simulation analysis, you
may find Halotools useful for avoiding buggy results even if
CachedHaloCatalog is the only feature of the package that you use.
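As an illustration of how Lbox can enter a calculation with periodic boundary conditions, the sketch below computes the separation between two positions along one axis using the minimum-image convention. The function periodic_separation is hypothetical and is not part of Halotools; it assumes positions lie between 0 and Lbox and share its length units, and the box size of 250 simply reuses the Lbox value printed earlier.

```python
import numpy as np


def periodic_separation(x1, x2, Lbox):
    """Per-axis separation between x1 and x2 in a periodic box of side Lbox.

    Hypothetical helper (not a Halotools function). Assumes both positions
    lie in [0, Lbox], so the result never exceeds Lbox / 2.
    """
    dx = np.abs(np.asarray(x1, dtype=float) - np.asarray(x2, dtype=float))
    # If the direct distance exceeds half the box, the shorter path
    # wraps around the edge of the box.
    return np.where(dx > Lbox / 2.0, Lbox - dx, dx)


Lbox = 250.0  # same value as the Lbox printed earlier

# Two points near opposite faces of the box are close neighbors once wrapped.
print(periodic_separation(1.0, 249.0, Lbox))

# Two points in the interior are unaffected by the wrapping.
print(periodic_separation(100.0, 120.0, Lbox))
```

Ignoring the wrap-around would report the first pair as 248 length units apart instead of 2, which is the kind of bookkeeping error that having Lbox stored alongside the catalog helps avoid.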