How to implement testing for Covid-19¶
This notebook shows you how to implement testing in your model and demonstrates how the spread of the disease can change if people change their behavior due to the information.
Testing has proven to be a key counter measure to the Covid-19 epidemy. It is necessary to gain information on people’s health statuses and to change their behavior or the behavior of their recent contacts. The goal is to interrupt a chain of infections early on and prevent superspreaders or exponential increases in infection numbers.
Testing also enables to study a multitude of new scenarios which can be helpful for policy makers to enact interventions to get the disease under control.
[1]:
%matplotlib inline
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from sid import get_simulate_func
from sid.config import INDEX_NAMES
from sid.shared import load_epidemiological_parameters
The stages¶
Testing is implemented in three stages to offer full flexibility for researchers to implement different testing schemes. We will go through each stage in detail. To summarise, the three stages are
Calculating the demand for tests.
Allocating tests to individuals.
Processing tests in laboratories.
Demand models¶
In the first stage, the demand for tests is calculated. The user can specify multiple demand models where each model represents a channel through which an individual can ask for a test. An individual can request a test if she is symptomatic, if she works with elderly people, or if she has the potential to be a superspreader (e.g. teachers).
Each model is a dictionary similar to a contact model. It has the following keys:
A
"loc"
entry subsets the parameters passed to the function to facilitate access."start"
and"end"
are dates which can be parsed with :func:pandas.Timestamp
and represent the period in which the testing model is active.Under the
"model"
key is a function which assigns a probability to each individual which is the likelihood by which the person demands a test. Below, you find an example for an demand model where individuals ask for a test when they experience symptoms, have not already received a positive test result and when they are not waiting for a test result. The return of the demand model can be a :class:pandas.Series
or :class:numpy.ndarray
.
[2]:
def demand_test_if_experience_symptoms(states, params, seed):
"""Demand test if person experiences symptoms.
This demand model assumes that individuals request a test for a
Covid-19 infection if they experience symptoms, have not requested
a test before which is still pending and have not received a positive
test result with a probability of 1.
Args:
states (pandas.DataFrame): The states of the individuals.
params (pandas.DataFrame): A DataFrame with parameters.
seed (int): A provided seed to ensure reproducibility.
Returns:
demand_probability (numpy.ndarray, pandas.Series): An array or a series
which contains the probability for each individual demanding a test.
"""
s = states.symptomatic & ~states.pending_test & ~states.knows_immune
return s
[3]:
testing_demand_models = {
"symptoms": {
"model": demand_test_if_experience_symptoms,
# "loc": ...,
# "start": "2019-02-29",
# "end": pd.Timestamp("2019-07-30"),
}
}
Allocation models¶
In the second phase, available tests are distributed to individuals with allocation models. If an individual is assigned a test, the test will be administered and is ready to be processed by an laboratory in the next stage.
Allocation models have to purposes: First, they allow to resolve instances in which the demand for tests exceeds the available number of tests which is given by
params.loc[("testing", "allocation", "rel_available_tests"), "value"]
per 100,000 individuals.
If allocation models assign more tests than are available, a warning is issued, but the simulation continues. A warning is also raised, if an individual receives multiple tests.
Secondly, allocation models allow to give some groups preferred access to tests. For example, policy makers may decide that health care workers, teachers or other professions have a higher priority.
Allocations models are similarly defined as demand models. Inside sid, we loop over all custom allocation models which can return a boolean series indicating individuals which received a test. When a model returns a series with allocated tests, demands_test
is set to False
for individuals who have received a test. The following allocation model receives the updated demands_test
and a new count of all individuals already receiving a test in this period which can be used to limit the
allocation.
The second argument is demands_test
which is a series indicating individuals who demand a test.
[4]:
def allocate_tests(n_allocated_tests, demands_test, states, params, seed):
"""Allocate tests to individuals who demand a test.
For simplicity, we assume that everyone has the same chance of
receiving a test to resolve excess demand.
Args:
n_allocated_tests (int): The number of individuals who already
received a test in this period from previous allocation models.
demands_test (pandas.Series): A series with boolean entries
where ``True`` indicates individuals asking for a test.
states (pandas.DataFrame): The states of the individuals.
params (pandas.DataFrame): A DataFrame with parameters.
seed (int): A provided seed to ensure reproducibility.
Returns:
allocated_tests (numpy.ndarray, pandas.Series): An array or a
series which indicates which individuals received a test.
"""
np.random.seed(seed)
n_available_tests = int(
params.loc[("testing", "allocation", "rel_available_tests"), "value"]
* len(states)
/ 100_000
)
selected_states = states.loc[demands_test]
n_allocated_tests = min(len(selected_states), n_available_tests)
locs = selected_states.sample(n=n_allocated_tests).index
allocated_tests = pd.Series(index=states.index, data=False)
allocated_tests.loc[locs] = True
return allocated_tests
[5]:
testing_allocation_models = {"direct_allocation": {"model": allocate_tests}}
Processing models¶
Now, we are in the third and final stage of the testing module. Here, all administered tests are collected and assigned to laboratories which will return the test result after some duration. With the test result, individuals are able to adjust their behavior or institutions can regulate behavior.
The processing models have two purposes similar to allocation models. First, they allow to resolve instances in which the demand for tests exceeds the available number of tests which is given by
params.loc[("testing", "processing", "rel_available_capacity"), "value"]
per 100,000 individuals.
If processing models evaluate more test than they are capable of, a warning is issued, but the simulation continues.
Secondly, processing models allow to assign a higher priority to some tests. For example, if laboratories are not able to handle the flood of administered tests anymore, policy makers may decide that laboratories should prioritize newly administered tests to give feedback more quickly (last-in first-out principle).
Similarly, to the functions of allocation models, processing models receive a count called n_to_be_processed_tests
which represents the number individuals whose test has already started to be processed. The individuals who administered a test which is waiting to be processed are indicated via states["pending_test"]
. Whenever an processing model returned a boolean series indicating individuals whose test is processed, states["pending_test"]
is set to False
for those individuals
and n_to_be_processed_tests
is updated.
[6]:
def process_tests(n_to_be_processed_tests, states, params, seed):
"""Process tests.
For simplicity, we assume that all tests are processed immediately, without
further delay.
Args:
n_to_be_processed_tests (int): Number of individuals whose test is
already set to be processed.
states (pandas.DataFrame): The states of the individuals.
params (pandas.DataFrame): A DataFrame with parameters.
seed (int): A provided seed to ensure reproducibility.
Returns:
started_processing (numpy.ndarray, pandas.Series): An array or series
with boolean entries indicating which tests started to be processed.
"""
np.random.seed(seed)
n_available_capacity = int(
params.loc[("testing", "processing", "rel_available_capacity"), "value"]
* len(states)
/ 100_000
)
selected_states = states.loc[states.pending_test]
n_processed_tests = min(len(selected_states), n_available_capacity)
locs = selected_states.sample(n=n_processed_tests).index
to_be_processed_tests = pd.Series(index=states.index, data=False)
to_be_processed_tests.loc[locs] = True
return to_be_processed_tests
[7]:
testing_processing_models = {"direct_processing": {"model": process_tests}}
Parameters¶
As seen before, sid requires two parameter values to set the number of available tests and the number of tests which can be processed by laboratories each day.
params.loc[("testing", "allocation", "rel_available_tests"), "value"]
params.loc[("testing", "processing", "rel_available_capacity"), "value"]
If you require a more flexible parametrisation, e.g., daily changing resources, set the parameters to infinity with np.inf
to avoid warnings. Then, use :func:functools.partial
to attach the data on resources to the function.
Countdowns¶
Testing requires some new countdowns which are explained here.
cd_received_test_result_true
can be used to set the processing duration of the test.cd_knows_immune_false
andcd_knows_infectious_false
are countdowns which are started when an individual receives a positive test result and at whose end the individual does not know about her health status anymore. The state variablesknows_immune
andknows_infectious
can be used to change the behaviour of individuals in contact models.
Prepare the rest¶
To compare a model without testing with the same model with testing, we need to prepare some other objects.
Initial states¶
[8]:
available_ages = [
"0-9",
"10-19",
"20-29",
"30-39",
"40-49",
"50-59",
"60-69",
"70-79",
"80-100",
]
ages = np.random.choice(available_ages, size=10_000)
regions = np.random.choice(["North", "South"], size=10_000)
households = np.random.choice(5_000, size=10_000)
initial_states = pd.DataFrame(
{"age_group": ages, "region": regions, "household": households}
)
initial_states = initial_states.astype("category")
initial_states["id"] = initial_states.index
Initial infections¶
Initial infections are specified via the initial conditions. We start the simulation with 100 infected individuals.
[9]:
initial_conditions = {"initial_infections": 100}
Contact models¶
The model has three contact models for households (recurrent), close and distant contacts. We assume that with a 90% chance a complete household will stay at home if any member of the household received a positive test result for at least 14 days or until the person is no longer infectious.
[10]:
def _if_hh_member_tested_positive_hh_stays_home(contacts, states):
# Everyone who received a positive test result will stay home for 14 days
# after the infection.
condition = states.knows_immune & (
states.cd_ever_infected.ge(-13) | states.knows_infectious
)
# Select only 90% of people who will stay at home.
people_w_positive_test = states.loc[condition].index
people_w_positive_test_staying_at_home = np.random.choice(
people_w_positive_test, size=int(len(people_w_positive_test) * 0.9)
)
condition.loc[
condition.index.difference(people_w_positive_test_staying_at_home)
] = False
# Everyone who is in a household with someone who has been tested positive
# will stay at home.
household_level_condition = condition.groupby(states.household).transform("any")
contacts.loc[household_level_condition] = 0
return contacts
def meet_distant(states, params, seed):
possible_n_contacts = np.arange(10)
contacts = np.random.choice(possible_n_contacts, size=len(states))
contacts = pd.Series(contacts, index=states.index)
contacts = _if_hh_member_tested_positive_hh_stays_home(contacts, states)
return contacts
def meet_close(states, params, seed):
possible_n_contacts = np.arange(5)
contacts = np.random.choice(possible_n_contacts, size=len(states))
contacts = pd.Series(contacts, index=states.index)
contacts = _if_hh_member_tested_positive_hh_stays_home(contacts, states)
return contacts
def meet_household(states, params, seed):
# everyone always meets their household members.
return pd.Series(1, states.index)
assort_by = ["age_group", "region"]
contact_models = {
"household": {
"model": meet_household,
"assort_by": "household",
"is_recurrent": True,
},
"close": {"model": meet_close, "assort_by": assort_by, "is_recurrent": False},
"distant": {"model": meet_distant, "assort_by": assort_by, "is_recurrent": False},
}
Parameters¶
[11]:
inf_params = pd.read_csv("infection_probs_testing.csv", index_col=INDEX_NAMES)
assort_probs = pd.read_csv("assort_by_params.csv", index_col=INDEX_NAMES)
immunity_params = pd.read_csv("immunity_params.csv", index_col=INDEX_NAMES)
disease_params = load_epidemiological_parameters()
params = pd.concat([disease_params, inf_params, assort_probs, immunity_params])
[12]:
# Allow for 50 ICU beds for 10,000 individuals.
params.loc[("health_system", "icu_limit_relative", "icu_limit_relative"), "value"] = 500
# Allow to distribute and process 50 tests per day.
params.loc[("testing", "allocation", "rel_available_tests"), "value"] = 500
params.loc[("testing", "processing", "rel_available_capacity"), "value"] = 500
/home/tim/miniconda3/envs/sid/lib/python3.9/site-packages/IPython/core/interactiveshell.py:2898: PerformanceWarning: indexing past lexsort depth may impact performance.
result = self._run_cell(
Run the simulation¶
[13]:
run_simulation_with_testing = get_simulate_func(
params=params,
initial_states=initial_states,
initial_conditions=initial_conditions,
contact_models=contact_models,
duration={"start": "2020-02-27", "periods": 200},
testing_demand_models=testing_demand_models,
testing_allocation_models=testing_allocation_models,
testing_processing_models=testing_processing_models,
saved_columns={"testing_states": "pending_test"},
seed=0,
path=".sid_with_testing",
)
result_with_testing = run_simulation_with_testing(params=params)
Start the simulation...
2020-09-13: 100%|██████████| 200/200 [00:56<00:00, 3.52it/s]
[14]:
run_simulation_without_testing = get_simulate_func(
params=params,
initial_states=initial_states,
initial_conditions=initial_conditions,
contact_models=contact_models,
duration={"start": "2020-02-27", "periods": 200},
saved_columns={"testing_states": "pending_test"},
seed=0,
path=".sid_without_testing",
)
result_without_testing = run_simulation_without_testing(params=params)
Start the simulation...
2020-09-13: 100%|██████████| 200/200 [00:29<00:00, 6.71it/s]
[15]:
fig, axs = plt.subplots(8, 2, figsize=(12, 18), sharey="row", sharex=True)
fig.subplots_adjust(bottom=0, wspace=0.2, hspace=0.2)
fig.suptitle(
"Comparison of simulation without testing (left) and with testing (right).",
y=0.92,
fontsize=14,
)
for i, result in enumerate([result_without_testing, result_with_testing]):
df = result["time_series"].compute()
df.resample("D", on="date")["ever_infected"].mean().plot(ax=axs[0, i])
df.resample("D", on="date")["infectious"].mean().plot(ax=axs[1, i])
df.resample("D", on="date")["dead"].sum().plot(ax=axs[2, i])
infectious_last_seven_days = df.cd_infectious_false.between(-7, 0)
df.loc[infectious_last_seven_days].resample("D", on="date")[
"n_has_infected"
].mean().plot(ax=axs[3, i])
df.resample("D", on="date")["knows_infectious"].sum().plot(ax=axs[4, i])
df.resample("D", on="date")["knows_immune"].sum().plot(ax=axs[5, i])
df.resample("D", on="date")["pending_test"].sum().plot(ax=axs[6, i])
df.resample("D", on="date")["needs_icu"].sum().plot(ax=axs[7, i])
for ax in axs.flatten():
ax.set_xlabel("")
ax.spines["right"].set_visible(False)
ax.spines["top"].set_visible(False)
for i in range(2):
axs[0, i].set_title("Share of Infected People")
axs[1, i].set_title("Share of Infectious People in the Population")
axs[2, i].set_title("Total Number of Deaths")
axs[3, i].set_title("$R_t$ (Effective Reproduction Number)")
axs[4, i].set_title("Number of People who know they are infectious")
axs[5, i].set_title("Number of People who know they are immune")
axs[6, i].set_title("Number of pending tests.")
axs[7, i].set_title("Number of patients in ICU.")
plt.show()