sid.contacts

This module contains everything related to contacts and matching.

Module Contents

Functions

calculate_contacts(contact_models: Dict[str, Dict[str, Any]], states: pandas.DataFrame, params: pandas.DataFrame, seed: itertools.count) → pandas.DataFrame

Calculate number of contacts of different types.

calculate_infections_by_contacts(states: pandas.DataFrame, recurrent_contacts: numpy.ndarray, random_contacts: numpy.ndarray, params: pandas.DataFrame, indexers: Dict[str, numba.typed.List], assortative_matching_cum_probs: numba.typed.List, contact_models: Dict[str, Dict[str, Any]], group_codes_info: Dict[str, Dict[str, Any]], susceptibility_factor: numpy.ndarray, virus_strains: Dict[str, Any], seasonality_factor: pandas.Series, seed: itertools.count) → Tuple[pandas.Series, pandas.Series, pandas.DataFrame]

Calculate infections from contacts.

_reduce_random_contacts_with_infection_probs(random_contacts: numpy.ndarray, probs: numpy.ndarray, seed: int) → numpy.ndarray

Reduce the number of random contacts stochastically.

_calculate_infections_by_recurrent_contacts(recurrent_contacts: numpy.ndarray, infectious: numpy.ndarray, cd_infectious_true: numpy.ndarray, immunity: numpy.ndarray, virus_strain: numpy.ndarray, group_codes: numpy.ndarray, indexers: numba.typed.List, infection_probs: numpy.ndarray, susceptibility_factor: numpy.ndarray, contagiousness_factor: numpy.ndarray, immunity_resistance_factor: numpy.ndarray, infection_counter: numpy.ndarray, seed: int) → Tuple[numpy.ndarray]

Match recurrent contacts and record infections.

_calculate_infections_by_random_contacts(random_contacts: numpy.ndarray, infectious: numpy.ndarray, cd_infectious_true: numpy.ndarray, immunity: numpy.ndarray, virus_strain: numpy.ndarray, group_codes: numpy.ndarray, assortative_matching_cum_probs: numba.typed.List, indexers: numba.typed.List, susceptibility_factor: numpy.ndarray, contagiousness_factor: numpy.ndarray, immunity_resistance_factor: numpy.ndarray, infection_counter: numpy.ndarray, seed: int) → Tuple[numpy.ndarray]

Match random contacts and record infections.

choose_other_group(a, cdf)

Choose a group out of a, given cumulative choice probabilities.

choose_other_individual(a, weights)

Return an element of a, if weights are not all zero, else return -1.

_get_index_refining_search(u, cdf)

Get the index of the first element in cdf that is larger than u.

create_group_indexer(states: pandas.DataFrame, group_code_name: str) → numba.typed.List

Create the group indexer.

post_process_contacts(contacts, states, contact_models)

Post-process contacts.

_sum_preserving_round(arr)

Round values in an array, preserving the sum as well as possible.

_consolidate_reason_of_infection(was_infected_by_recurrent: numpy.ndarray, was_infected_by_random: numpy.ndarray, contact_models: Dict[str, Dict[str, Any]]) → pandas.Series

Consolidate reason of infection.

_numpy_replace(x: numpy.ndarray, replace_to: Dict[Any, Any])

Replace values in a NumPy array with a dictionary.

calculate_contacts(contact_models: Dict[str, Dict[str, Any]], states: pandas.DataFrame, params: pandas.DataFrame, seed: itertools.count) → pandas.DataFrame

Calculate number of contacts of different types.

Parameters
Returns

A DataFrame with one column per active contact model, containing the number of contacts.

Return type

contacts (pandas.DataFrame)

calculate_infections_by_contacts(states: pandas.DataFrame, recurrent_contacts: numpy.ndarray, random_contacts: numpy.ndarray, params: pandas.DataFrame, indexers: Dict[str, numba.typed.List], assortative_matching_cum_probs: numba.typed.List, contact_models: Dict[str, Dict[str, Any]], group_codes_info: Dict[str, Dict[str, Any]], susceptibility_factor: numpy.ndarray, virus_strains: Dict[str, Any], seasonality_factor: pandas.Series, seed: itertools.count) → Tuple[pandas.Series, pandas.Series, pandas.DataFrame]

Calculate infections from contacts.

This function mainly converts the relevant parts from states and contacts into numpy arrays or other objects that are supported in numba nopython mode and then calls _calculate_infections_by_contacts_numba() to calculate the infections by contact.

Parameters
  • states (pandas.DataFrame) – See The states DataFrame.

  • recurrent_contacts (numpy.ndarray) – An array with boolean entries for each person and recurrent contact model.

  • random_contacts (numpy.ndarray) – An array with integer entries indicating the number of contacts for each person and random contact model.

  • params (pandas.DataFrame) – See params.

  • indexers (Dict[str, numba.typed.List]) – The indexer is a dictionary with one entry for recurrent and random contact models. The values are Numba lists containing Numba lists for each contact model. Each list holds indices for each group in the contact model.

  • assortative_matching_cum_probs (numba.typed.List) – The list contains one entry for each random contact model. Each entry holds an n_groups × n_groups transition matrix where probs[i, j] is the cumulative probability that an individual from group i meets someone from group j.

  • contact_models (Dict[str, Dict[str, Any]]) – The contact models.

  • group_codes_info (Dict[str, Dict[str, Any]]) – The name of the group code column for each contact model.

  • susceptibility_factor (numpy.ndarray) – A multiplier which scales the infection probability due to susceptibility.

  • virus_strains (Dict[str, Any]) – A dictionary with the keys "names", "contagiousness_factor" and "immunity_resistance_factor" holding the different contagiousness factors and immunity resistance factors of multiple viruses.

  • seasonality_factor (pandas.Series) – A multiplier which scales the infection probabilities due to seasonality. The index contains the factor model names.

  • seed (itertools.count) – Seed counter to control randomness.

Returns

Tuple containing

  • infected (pandas.Series): Boolean Series that is True for newly infected people.

  • n_has_additionally_infected (pandas.Series): A series with counts of people an individual has infected in this period by contact.

  • missed_contacts (pandas.DataFrame): Counts of missed contacts for each contact model.

  • was_infected_by (numpy.ndarray): An array indicating the contact model which caused the infection.

Return type

(tuple)
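
To illustrate the layout of assortative_matching_cum_probs described in the parameters above, here is a minimal sketch that turns a matrix of meeting probabilities into row-wise cumulative probabilities. The helper name to_cumulative_rows and the matrix values are hypothetical, and plain Python lists stand in for the NumPy arrays that sid works with.

```python
from itertools import accumulate


def to_cumulative_rows(probs):
    """Return the row-wise cumulative sums of a matrix (hypothetical helper)."""
    return [list(accumulate(row)) for row in probs]


# Hypothetical meeting probabilities for two groups. Row i is the group of the
# individual, column j is the group of the person met. Each row sums to one.
probs = [
    [0.75, 0.25],
    [0.5, 0.5],
]

cum_probs = to_cumulative_rows(probs)
# cum_probs[i][j] is the probability that someone from group i meets a person
# from a group with index <= j, so the last entry of each row is 1.
```

Storing cumulative rather than raw probabilities means a group can be drawn with a single uniform random number and a lookup in the corresponding row.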

_reduce_random_contacts_with_infection_probs(random_contacts: numpy.ndarray, probs: numpy.ndarray, seed: int) → numpy.ndarray

Reduce the number of random contacts stochastically.

The remaining random contacts are those that would lead to an infection if one person is infectious and the other is susceptible, the susceptible person has the highest susceptibility in the population according to the susceptibility_factor, and the infectious person carries the most contagious virus strain according to virus_strain.

The copy is necessary as we need the original random contacts for debugging.

Parameters
  • random_contacts (numpy.ndarray) – An integer array containing the number of contacts per individual for each random (non-recurrent) contact model.

  • probs (numpy.ndarray) – An array containing one infection probability for each random contact model.

  • seed (int) – The seed.

Returns

random_contacts (numpy.ndarray): Same shape as contacts. Equal to contacts for recurrent contact models, less than or equal to contacts otherwise.
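
The stochastic reduction can be pictured as follows. This is a sketch only, under the assumption that every contact of a model is kept independently with that model's infection probability. The name thin_contacts is hypothetical, and nested lists stand in for the 2d integer array.

```python
import random


def thin_contacts(random_contacts, probs, seed):
    """Keep each contact of model j independently with probability probs[j]."""
    rng = random.Random(seed)
    # Build a new nested list so the caller's contacts stay untouched.
    reduced = []
    for row in random_contacts:
        reduced.append(
            [sum(rng.random() < probs[j] for _ in range(n)) for j, n in enumerate(row)]
        )
    return reduced


# One row per individual, one column per random contact model.
contacts = [[3, 0], [10, 2], [1, 5]]
reduced = thin_contacts(contacts, probs=[0.5, 0.1], seed=0)
```

Every entry of reduced lies between zero and the corresponding entry of contacts, and the input is left unchanged, which mirrors the note above about keeping the original contacts for debugging.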

_calculate_infections_by_recurrent_contacts(recurrent_contacts: numpy.ndarray, infectious: numpy.ndarray, cd_infectious_true: numpy.ndarray, immunity: numpy.ndarray, virus_strain: numpy.ndarray, group_codes: numpy.ndarray, indexers: numba.typed.List, infection_probs: numpy.ndarray, susceptibility_factor: numpy.ndarray, contagiousness_factor: numpy.ndarray, immunity_resistance_factor: numpy.ndarray, infection_counter: numpy.ndarray, seed: int) → Tuple[numpy.ndarray]

Match recurrent contacts and record infections.

Parameters
  • recurrent_contacts (numpy.ndarray) – 2d integer array with number of contacts per individual. There is one row per individual in the state and one column for each contact model where model["model"] != "meet_group".

  • infectious (numpy.ndarray) – 1d boolean array that indicates if a person is infectious. This is not directly changed after an infection.

  • cd_infectious_true (numpy.ndarray) – 1d integer array with countdown values until an individual is infectious.

  • immunity (numpy.ndarray) – 1d float array indicating the immunity level.

  • virus_strain (numpy.ndarray) –

  • group_codes (numpy.ndarray) – 2d integer array with the index of the group used in the first stage of matching.

  • indexers (numba.typed.List) – Nested typed list. The i_th entry of each inner list contains the indices of the i_th group. There is one inner list per contact model.

  • infection_probs (numpy.ndarray) – An array containing the infection probabilities for each recurrent contact model.

  • susceptibility_factor (numpy.ndarray) – A multiplier which scales the infection probability.

  • contagiousness_factor (numpy.ndarray) – Virus-strain-dependent contagiousness factor.

  • immunity_resistance_factor (numpy.ndarray) – Virus-strain-dependent immunity resistance factor. This factor determines how prior immunity influences infection probabilities. Higher values make existing immunity less effective at preventing an infection.

  • infection_counter (numpy.ndarray) – An array counting the infections caused by each individual.

  • seed (int) – Seed value to control randomness.

Returns

(tuple) Tuple containing

  • newly_infected (numpy.ndarray): 1d integer array that is -1 for individuals who are not newly infected and set to the virus strain of infection for individuals who are infected.

  • infection_counter (numpy.ndarray): 1d integer array, counting how many individuals were infected by an individual.

  • was_infected_by (numpy.ndarray): An array indicating the contact model which caused the infection.

_calculate_infections_by_random_contacts(random_contacts: numpy.ndarray, infectious: numpy.ndarray, cd_infectious_true: numpy.ndarray, immunity: numpy.ndarray, virus_strain: numpy.ndarray, group_codes: numpy.ndarray, assortative_matching_cum_probs: numba.typed.List, indexers: numba.typed.List, susceptibility_factor: numpy.ndarray, contagiousness_factor: numpy.ndarray, immunity_resistance_factor: numpy.ndarray, infection_counter: numpy.ndarray, seed: int) → Tuple[numpy.ndarray]

Match random contacts and record infections.

Parameters
  • random_contacts (numpy.ndarray) – 2d integer array with number of contacts per individual. There is one row per individual in the state and one column for each contact model where model["model"] != "meet_group".

  • infectious (numpy.ndarray) – 1d boolean array that indicates if a person is infectious. This is not directly changed after an infection.

  • cd_infectious_true (numpy.ndarray) – 1d integer array with countdown values until an individual is infectious.

  • immunity (numpy.ndarray) – 1d float array indicating the immunity level.

  • virus_strain (numpy.ndarray) –

  • group_codes (numpy.ndarray) – 2d integer array with the index of the group used in the first stage of matching.

  • assortative_matching_cum_probs (numba.typed.List) – List of arrays of shape (n_groups, n_groups). arr[i, j] is the cumulative probability that an individual from group i meets someone from group j.

  • indexers (numba.typed.List) – Nested typed list. The i_th entry of each inner list contains the indices of the i_th group. There is one inner list per contact model.

  • susceptibility_factor (numpy.ndarray) – A multiplier which scales the infection probability.

  • contagiousness_factor (numpy.ndarray) – Virus-strain-dependent contagiousness factor.

  • immunity_resistance_factor (numpy.ndarray) – Virus-strain-dependent immunity resistance factor. This factor determines how prior immunity influences infection probabilities. Higher values make existing immunity less effective at preventing an infection.

  • infection_counter (numpy.ndarray) – An array counting the infections caused by each individual.

  • seed (int) – Seed value to control randomness.

Returns

(tuple) Tuple containing

  • newly_infected (numpy.ndarray): 1d integer array that is -1 for individuals who are not newly infected and set to the virus strain of infection for individuals who are infected.

  • infection_counter (numpy.ndarray): 1d integer array, counting how many individuals were infected by an individual.

  • missed (numpy.ndarray): Matrix which contains unmatched random contacts.

choose_other_group(a, cdf)

Choose a group out of a, given cumulative choice probabilities.

Note: This function is also used in sid-germany.

choose_other_individual(a, weights)

Return an element of a, if weights are not all zero, else return -1.

Implementation is similar to _choose_one_element.

numpy.argmax() returns the first index for multiple maximum values.

Note: This function is also used in sid-germany.

Parameters
Returns

An element of a or -1

Return type

choice (int or float)

Example

>>> choose_other_individual(np.arange(3), np.array([0, 0, 5]))
2
>>> choose_other_individual(np.arange(3), np.zeros(3))
-1
>>> chosen = choose_other_individual(np.arange(3), np.array([0.1, 0.5, 0.7]))
>>> chosen in [0, 1, 2]
True
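
The behaviour shown in the example can be approximated with a weighted draw over cumulative weights. The sketch below is not the library's implementation. The name choose_individual_sketch is hypothetical, the weights are assumed to be non-negative and non-empty, and the random number generator is passed in explicitly to keep the draw reproducible.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def choose_individual_sketch(a, weights, rng):
    """Draw one element of a with probability proportional to its weight."""
    cum_weights = list(accumulate(weights))
    total = cum_weights[-1]
    if total == 0:
        # Nobody can be chosen if all weights are zero.
        return -1
    u = rng.random() * total
    # First position whose cumulative weight exceeds u, clamped to the last
    # valid index to guard against floating-point rounding.
    index = min(bisect_right(cum_weights, u), len(a) - 1)
    return a[index]


rng = random.Random(42)
only_candidate = choose_individual_sketch([0, 1, 2], [0, 0, 5], rng)
nobody = choose_individual_sketch([0, 1, 2], [0, 0, 0], rng)
```

Elements with zero weight are never selected because their cumulative weight does not increase past the previous entry.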

_get_index_refining_search(u, cdf)

Get the index of the first element in cdf that is larger than u.

The algorithm does a refining search. We first iterate over cdf in larger steps to find a subset of cdf in which we have to look at each element.

The step size in the first iteration is the square root of the length of cdf, which minimizes runtime in expectation if u is a uniform random variable.

Parameters
  • u (float) – A uniform random draw.

  • cdf (numpy.ndarray) – 1d array with cumulative probabilities.

Returns

The selected index.

Return type

int

Example

>>> cdf = np.array([0.1, 0.6, 1.0])
>>> _get_index_refining_search(0, cdf)
0
>>> _get_index_refining_search(0.05, cdf)
0
>>> _get_index_refining_search(0.55, cdf)
1
>>> _get_index_refining_search(1, cdf)
2
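
The two-pass idea described above can be sketched in plain Python. This is an illustration under stated assumptions, not the Numba-compiled function itself: the name refining_search_sketch is hypothetical, cdf is assumed to be non-empty and non-decreasing, and the last index is returned when no element is larger than u, as in the final doctest.

```python
import math


def refining_search_sketch(u, cdf):
    """Return the first index with cdf[index] > u, or the last index if none."""
    last = len(cdf) - 1
    step = max(1, math.isqrt(len(cdf)))

    # Coarse pass: jump ahead in steps of about sqrt(n) until an entry
    # exceeds u or the end of the sequence is reached.
    i = 0
    while i < last and cdf[i] <= u:
        i = min(i + step, last)

    # Fine pass: go back at most one step and look at each element.
    j = max(0, i - step)
    while j < last and cdf[j] <= u:
        j += 1
    return j


cdf = [0.1, 0.6, 1.0]
indices = [refining_search_sketch(u, cdf) for u in (0, 0.05, 0.55, 1)]
# indices == [0, 0, 1, 2]
```

With a step of about sqrt(n), both passes take on the order of sqrt(n) comparisons in the worst case, compared with n for a plain linear scan.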

create_group_indexer(states: pandas.DataFrame, group_code_name: str) → numba.typed.List

Create the group indexer.

The indexer is a list where the positions correspond to the group number defined by assortative variables. The values inside the list are one-dimensional integer arrays containing the indices of states belonging to the group.

If there are no assortative variables, all individuals are assigned to a single group with code 0 and the indexer is a list where the first position contains all indices of states.

When an assortative variable is factorized, missing values receive -1 as the group key. Thus, we remove all negative group keys from the indexer.

Parameters
Returns

The i_th entry are the indices of the i_th group.

Return type

indexer (numba.typed.List)
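
The structure of the indexer can be illustrated without pandas or Numba. In this sketch a plain list of integer group codes stands in for the group code column of states, plain lists stand in for the typed list and its integer arrays, and the name build_group_indexer is hypothetical.

```python
def build_group_indexer(group_codes):
    """Collect, for each group code, the positions of its members."""
    n_groups = max(group_codes, default=-1) + 1
    indexer = [[] for _ in range(n_groups)]
    for position, code in enumerate(group_codes):
        # Negative codes mark missing values and are left out of the indexer.
        if code >= 0:
            indexer[code].append(position)
    return indexer


# Hypothetical group codes for six individuals; -1 marks a missing value.
codes = [0, 2, -1, 1, 0, 2]
indexer = build_group_indexer(codes)
# indexer == [[0, 4], [3], [1, 5]]
```

If every individual carries the code 0, the result is a list with a single entry holding all positions, which corresponds to the case without assortative variables.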

post_process_contacts(contacts, states, contact_models)

Post-process contacts.

The number of contacts for random models is rounded such that the sum of contacts is preserved.

_sum_preserving_round(arr)

Round values in an array, preserving the sum as well as possible.

The function loops over the elements of the array and accumulates the difference between each value and that value rounded down to the nearest integer. Whenever the accumulated deviation reaches a predefined threshold, 1 is added to the rounded-down current element and the accumulated deviation is reduced by 1.

Parameters

arr (numpy.ndarray) – A one-dimensional array whose values should be rounded.

Returns

Array with sum preserved rounded values.

Return type

arr (numpy.ndarray)

Example

>>> arr = np.full(10, 5.2)
>>> _sum_preserving_round(arr)
array([5., 5., 6., 5., 5., 5., 5., 6., 5., 5.])
>>> arr = np.full(2, 1.9)
>>> _sum_preserving_round(arr)
array([2., 2.])
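
The loop described above can be written out as a short sketch. The threshold of 0.5 is an assumption chosen because it reproduces the doctest results. The name sum_preserving_round_sketch is hypothetical, and a plain list of non-negative floats stands in for the NumPy array.

```python
import math


def sum_preserving_round_sketch(values, threshold=0.5):
    """Round values to integers while keeping the total as close as possible."""
    rounded = []
    deviation = 0.0
    for value in values:
        floor_value = math.floor(value)
        # Accumulate what is lost by rounding this value down.
        deviation += value - floor_value
        if deviation >= threshold:
            # Enough has accumulated: round this element up instead.
            rounded.append(floor_value + 1)
            deviation -= 1
        else:
            rounded.append(floor_value)
    return rounded


first = sum_preserving_round_sketch([5.2] * 10)
# first == [5, 5, 6, 5, 5, 5, 5, 6, 5, 5]
second = sum_preserving_round_sketch([1.9, 1.9])
# second == [2, 2]
```

In the first case the rounded values sum to 52, which equals the sum of the inputs. In the second case the sum moves from 3.8 to 4, the closest total that integers allow.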

_consolidate_reason_of_infection(was_infected_by_recurrent: numpy.ndarray, was_infected_by_random: numpy.ndarray, contact_models: Dict[str, Dict[str, Any]]) → pandas.Series

Consolidate reason of infection.

_numpy_replace(x: numpy.ndarray, replace_to: Dict[Any, Any])

Replace values in a NumPy array with a dictionary.