group_member_generator
- halotools.utils.group_member_generator(data, grouping_key, requested_columns)
Generator used to loop over grouped data and yield requested properties of the members of each group.

When running a for loop over group_member_generator, you are repeatedly sent arrays storing properties of the data entries that share a common grouping_key value. This lets you perform whatever intra-group calculation you wish at each iteration, with one iteration per group. The generator also sends you the indices of the input data corresponding to the yielded group members, allowing you to create new columns for your data table that store the results of your intra-group calculations.

Before calling group_member_generator, the input data must be sorted by the grouping_key so that data[grouping_key] is monotonic.

Common applications of group_member_generator include subhalo analysis (e.g., calculating host halo mass) and galaxy group analysis (e.g., calculating total stellar mass or group-centric position). The Examples section below shows basic usage. There are also three tutorials demonstrating common applications in more detail.
- Parameters:
- data : structured Numpy ndarray or Astropy Table
- grouping_key : string
Name of the column that defines how the input data are grouped, e.g., group_id or halo_hostid. The input data must be sorted such that the array stored in data[grouping_key] is monotonic.
- requested_columns : list of strings
List of column names that will be yielded by the generator. As you loop over the generator, one array is yielded for every string entry in requested_columns. It is permissible for requested_columns to be an empty list, in which case the group_data_list yielded at each iteration will also be an empty list.
- Returns:
- first_idx, last_idx : int
These two integers delimit the rows of the input data yielded at each iteration; the Examples below use them as the slice [first_idx:last_idx].
- group_data_list : list
List of arrays storing the requested group member properties. There will be one element of group_data_list for every element of the input requested_columns. Each element is a Numpy ndarray with a length equal to the number of members of the group.
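To make the yielded values concrete, here is a minimal sketch of the contract described above, written in plain NumPy. It is not the Halotools implementation: the function name `sketch_group_member_generator` and the toy columns `group_id` and `mass` are hypothetical, and the half-open slice convention `data[first_idx:last_idx]` is an assumption inferred from how the indices are used in the Examples section.

```python
import numpy as np


def sketch_group_member_generator(data, grouping_key, requested_columns):
    """Illustrative stand-in for the documented behavior, NOT the Halotools code."""
    keys = np.asarray(data[grouping_key])
    if len(keys) == 0:
        return
    # The input must already be sorted by the grouping key.
    diffs = np.diff(keys)
    if not (np.all(diffs >= 0) or np.all(diffs <= 0)):
        raise ValueError("data[grouping_key] must be monotonic")
    # Row indices where a new group starts, and the matching exclusive ends.
    starts = np.concatenate(([0], np.flatnonzero(diffs != 0) + 1))
    ends = np.concatenate((starts[1:], [len(keys)]))
    for first_idx, last_idx in zip(starts, ends):
        # One array per requested column; an empty request gives an empty list.
        group_data_list = [data[col][first_idx:last_idx] for col in requested_columns]
        yield int(first_idx), int(last_idx), group_data_list


# Toy structured array with hypothetical columns, already sorted by group_id.
toy = np.array(
    [(1, 10.0), (1, 2.0), (2, 5.0), (3, 7.0), (3, 1.0), (3, 0.5)],
    dtype=[("group_id", "i8"), ("mass", "f8")],
)

# Store each group's total mass on every member row of that group.
total_mass = np.zeros(len(toy))
for first, last, (masses,) in sketch_group_member_generator(toy, "group_id", ["mass"]):
    total_mass[first:last] = masses.sum()
```

Because each iteration hands back both the slice bounds and the member arrays, the result of any intra-group calculation can be written straight into a full-length array and attached to the table as a new column.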
Examples
First let’s retrieve a Halotools-formatted halo catalog storing some randomly generated data.
>>> from halotools.sim_manager import FakeSim
>>> halocat = FakeSim()
>>> halos = halocat.halo_table
As described in Rockstar halo and subhalo nomenclature conventions, the halo_hostid is a natural grouping key for a halo table. Let’s use this key to calculate the host halo mass of all halos in the data table.

First we build the generator:
>>> from halotools.utils import group_member_generator
>>> halos.sort(['halo_hostid', 'halo_upid'])
>>> grouping_key = 'halo_hostid'
>>> requested_columns = ['halo_mvir']
>>> group_gen = group_member_generator(halos, grouping_key, requested_columns)
Then we loop over it:
>>> import numpy as np
>>> result = np.zeros(len(halos))
>>> for first, last, member_props in group_gen:
...     masses = member_props[0]
...     host_mass = masses[0]
...     result[first:last] = host_mass
>>> halos['halo_mvir_host_halo'] = result
Inside the scope of the loop, the first two yielded integers allow us to access the appropriate slice of the array being calculated. The member_props list stores only a single element, the masses array, which holds the value of halo_mvir for each member of the host + subhalo system. Because we have sorted the halos by both halo_hostid and halo_upid, within each halo_hostid grouping the host system appears first, since -1 is smaller than any value of halo_upid stored by a subhalo. Thus by selecting the first element of the masses array, we select the virial mass of the host halo.

We can also use the group_member_generator to compute more complicated quantities. For example, let’s calculate the mass-weighted mean spin of all halo members. Note that our halo table is already sorted, so we save CPU time by not re-sorting it.
>>> grouping_key = 'halo_hostid'
>>> requested_columns = ['halo_mvir', 'halo_spin']
>>> group_gen = group_member_generator(halos, grouping_key, requested_columns)
>>> result = np.zeros(len(halos))
>>> for first, last, member_props in group_gen:
...     masses = member_props[0]
...     spins = member_props[1]
...     mass_weighted_avg_spin = np.sum(masses*spins)/np.sum(masses)
...     result[first:last] = mass_weighted_avg_spin
>>> halos['halo_mass_weighted_avg_spin'] = result
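The two ingredients of the examples above, the two-key sort that places each host first and the mass-weighted mean, can be checked on a small table without Halotools. The following sketch uses a plain NumPy structured array with made-up values. The column names mirror the Halotools ones, but the data, the manual group-boundary search, and the use of `np.average` with `weights` are illustrative choices, not taken from the Halotools source.

```python
import numpy as np

# Made-up halos: hosts have halo_upid == -1, subhalos store their host's id.
halos = np.array(
    [(12, 7, 7, 1e11, 0.02),
     (7, -1, 7, 1e13, 0.04),
     (3, -1, 3, 5e12, 0.03),
     (15, 7, 7, 3e11, 0.10),
     (9, 3, 3, 2e11, 0.05)],
    dtype=[("halo_id", "i8"), ("halo_upid", "i8"), ("halo_hostid", "i8"),
           ("halo_mvir", "f8"), ("halo_spin", "f8")],
)

# Sort by host id, then by halo_upid, so the -1 of each host sorts first.
halos = np.sort(halos, order=["halo_hostid", "halo_upid"])

# Group boundaries: a new group starts wherever halo_hostid changes.
hostids = halos["halo_hostid"]
starts = np.concatenate(([0], np.flatnonzero(np.diff(hostids)) + 1))
ends = np.concatenate((starts[1:], [len(halos)]))

host_mass = np.zeros(len(halos))
weighted_spin = np.zeros(len(halos))
for first, last in zip(starts, ends):
    masses = halos["halo_mvir"][first:last]
    spins = halos["halo_spin"][first:last]
    host_mass[first:last] = masses[0]  # first member of each group is the host
    # Mass-weighted mean: sum(m * s) / sum(m), not divided by the member count.
    weighted_spin[first:last] = np.average(spins, weights=masses)
```

Note that `np.average` with `weights` normalizes by the sum of the weights, so the result stays within the range of the member spins regardless of the mass units.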