We have developed a modeling framework to capture the key components of pathogen transmission in public transport networks. This is a general framework that could be used across a wide range of pathogens, modes of transport (including multimodal) and public transport networks, although here we focus on transmission of COVID-19 on trains. Figure 4 gives an overview of the modeling framework. There are four main components:
1. Transit assignment engine: simulates the movement of passengers in the public transport network. For our case study, this is the train network in Sydney, Australia.
2. Modeling the spread of infection: the transit disease spread model. For our COVID-19 case study, this includes two forms of transmission: direct (person-to-person) and fomite (person-to-surface-to-person).
3. Results: an analytical module that provides summary statistics and visualizations of the results.
4. Seeding: a community-wide transmission model that provides information on disease seeding (the expected number of infectious passengers on the Sydney train network).
In addition to these components, our modeling framework requires input data and the specification of the scenarios of interest (e.g. including any non-pharmaceutical interventions considered).

Each component of the modeling framework is described in more detail in the sections “Transit assignment engine” through “Seeding”. The modular structure allows for flexibility in the choice of geographic location, pathogen of interest, and scenarios studied. The agent-based model provides detailed results to support operational and strategic decisions.
In our case study, the framework assigns each passenger to a train route based on traveler smart card data and a computed shortest path. Depending on capacity and current occupancy, each segment of the route is assigned to a transit service vehicle (in this case a train). The trains run according to a predefined timetable. Each time a passenger completes their journey, the transit disease model is triggered and that passenger is assigned an exposure status (i.e., whether the passenger became infected during their journey). The information about new infections is sent to the visualization module to be displayed on a dashboard. At the end of each simulated day and at the end of the simulation horizon, the total number of new infections is calculated and reported for final evaluation. The SAfE Transport modules are described in more detail under “Transit assignment engine” to “Seeding”.
Transit assignment engine
The transit assignment engine is an agent-based simulation platform that assigns trip demand to transit service supply. It supports SAfE Transport by providing the network of contacts between traveling passengers (i.e. who they are in contact with and for how long).
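As an illustration of what a contact network of this kind contains, the following minimal sketch derives pairwise contact durations from ride records. The record layout `(passenger_id, section_id, board_time, alight_time)`, the function name `contact_durations` and the sample data are hypothetical and are not taken from the engine itself.

```python
from itertools import combinations


def contact_durations(rides):
    """Return {(passenger_a, passenger_b): shared minutes}.

    rides: list of (passenger_id, section_id, board_time, alight_time),
    a hypothetical record layout; times are in minutes.
    Two passengers are in contact while they occupy the same section.
    """
    contacts = {}
    for (pa, sa, ba, aa), (pb, sb, bb, ab) in combinations(rides, 2):
        if sa != sb or pa == pb:
            continue  # different vehicle section, or the same passenger
        overlap = min(aa, ab) - max(ba, bb)
        if overlap > 0:
            key = tuple(sorted((pa, pb)))
            contacts[key] = contacts.get(key, 0) + overlap
    return contacts


# Made-up example: p1 and p2 share a section for 10 minutes.
rides = [
    ("p1", "train7-car1-front", 0, 20),
    ("p2", "train7-car1-front", 10, 30),
    ("p3", "train7-car1-rear", 0, 30),
]
contacts = contact_durations(rides)
```

The pairwise loop is quadratic in the number of rides, which is adequate for a sketch; a production engine would presumably group rides by section first.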
The engine depends on input data in the form of:
1. Trip demand (i.e. trip origin, destination and trip start time for each passenger), and
2. Transit service supply (i.e. routes and timetables of public transport services).
The full architecture is included in Supplementary File 1. The total number of journeys on the Sydney train network before COVID-19 was about 8 million per week. To account for the decline in patronage during COVID-19, we assume demand is 10% of the pre-COVID-19 total, which is consistent with real-life observations [27, 29, 30]. The supply of transit services in Sydney has not changed due to COVID-19. Trip demand is captured by smart card data, which include tap-on and tap-off records with GPS coordinates and timestamps. The shortest path router uses these data to calculate a set of time-dependent shortest paths for each traveling passenger using the classic Dijkstra algorithm. During the simulation, passengers are tracked and detailed dynamic outputs are collected for each agent, link (between two consecutive stops/stations), stop/station, service vehicle, service line and the entire network.
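The time-dependent shortest path step can be sketched as an earliest-arrival variant of Dijkstra's algorithm over timetable connections. This is a minimal sketch, not the router's actual implementation: the timetable format, the function name `earliest_arrival` and the stops "A", "B" and "C" are assumptions made for illustration.

```python
import heapq


def earliest_arrival(timetable, origin, destination, depart_time):
    """Time-dependent Dijkstra returning (arrival_time, path).

    timetable maps a stop to a list of (dep_time, next_stop, arr_time)
    connections, in minutes after midnight (hypothetical format).
    Returns (None, []) if the destination cannot be reached.
    """
    best = {origin: depart_time}
    previous = {}
    queue = [(depart_time, origin)]
    while queue:
        time_now, stop = heapq.heappop(queue)
        if stop == destination:
            break
        if time_now > best.get(stop, float("inf")):
            continue  # stale queue entry
        for dep, nxt, arr in timetable.get(stop, []):
            # A connection is usable only if it departs after we arrive.
            if dep >= time_now and arr < best.get(nxt, float("inf")):
                best[nxt] = arr
                previous[nxt] = stop
                heapq.heappush(queue, (arr, nxt))
    if destination not in best:
        return None, []
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    return best[destination], path[::-1]


# Made-up timetable: a direct A->C service and a faster A->B->C option.
timetable = {
    "A": [(480, "B", 490), (485, "C", 520)],
    "B": [(492, "C", 505), (489, "C", 500)],
}
arrival, path = earliest_arrival(timetable, "A", "C", 478)
```

The sketch ignores vehicle capacity and transfer times, both of which the engine described above takes into account when assigning passengers to vehicles.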
Modeling the spread of infection on trains
At its core, the disease spread model on trains provides a probability of becoming infected for each susceptible passenger based on current and past travel of infectious passengers in the same spatial area. The model uses a number of simplifying assumptions, the most important of which is that we ignore any age-related effects (all agents are identical). We also assume homogeneous mixing within the considered spatial area, including an even distribution of passengers throughout the train. To make the homogeneous mixing assumption reasonable, we use a half-carriage as the spatial area in our model, owing to the double-deck structure (upper and lower deck) of the train carriages in Sydney (Waratah design).
The overall probability that a susceptible person will be infected is based on the standard probabilistic statement of one minus the probability that they were not infected. This allows us to consider separately the probability of infection from each of the infectious passengers and from the surfaces in the half-carriage. The transmission model uses the attack rate regression expression from the study by Hu et al. [19] (Fig. 4 in [19], average of all seats) both directly, for the empirical person-to-person part of our transmission model, and for calibrating the mechanistic person-to-surface-to-person model. The mechanistic model is based on the transmission pathway model developed by Atkinson and Wein [1]. This approach models the fraction of viral droplets deposited on surfaces and the effective dose based on the surface concentration. We then use a standard dose-response model for the probability of infection from contaminated surfaces (fomites). Further details of the transmission model are provided in Supplementary File 1, along with sensitivity analyses of key parameters.
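The "one minus the probability of not being infected" statement can be illustrated with a short sketch. The exponential dose-response form `1 - exp(-k * dose)` and the names `infection_probability`, `fomite_dose` and `k` are assumptions made for illustration; the functional form and parameter values actually used are those given in Supplementary File 1.

```python
import math


def infection_probability(direct_probs, fomite_dose=0.0, k=1.0):
    """P(infected) = 1 - P(escaping every source of infection).

    direct_probs: one transmission probability per infectious passenger
        sharing the half-carriage.
    fomite_dose, k: effective surface dose and rate constant of an
        exponential dose-response curve (an assumed functional form).
    """
    p_fomite = 1.0 - math.exp(-k * fomite_dose)
    escape = 1.0 - p_fomite
    for p in direct_probs:
        escape *= 1.0 - p  # escape infection from this passenger too
    return 1.0 - escape


# Two infectious passengers with made-up probabilities 0.1 and 0.2:
# 1 - (0.9 * 0.8) = 0.28
p = infection_probability([0.1, 0.2])
```

Treating the sources as independent is what allows each infectious passenger and the surfaces to be modeled separately and then combined in a single product.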
Seeding
Our primary disease spread model, as described in “Modeling the spread of infection on trains”, only considers transmission on the public transport network (and is specifically calibrated for trains). In order to keep the number of infectious passengers traveling on trains identical in all simulations for comparability, we use a deterministic compartment model to approximate transmission dynamics in the general community.
We use the standard progression structure “susceptible – exposed – infectious – recovered”, repeating the compartments “exposed” and “infectious” to better account for the distribution of time spent in these states (see e.g. [20]). We refer to this as the “SEEIIR” model for short. We start the deterministic SEEIIR model with 2000 cases of infection, using Sydney’s population of 5.73 million in 2019 [4] and a basic reproduction number of 2.5 [6, 23, 45]. We combine estimates of the proportion of the population that commutes by public transport (approximately 20% according to 2006, 2001 and 2016 census data [2, 28]) and the proportion of those commuters who use the train (approx. 50.9% [39]) to arrive at an estimate that 10% of the Sydney population uses trains. Further details are contained in Supplementary File 1, including a table listing the numbers used to seed infectious passengers for each of the 7 days.
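A deterministic SEEIIR model of this kind can be sketched with forward-Euler integration. The population, initial cases, basic reproduction number and train share below come from the text; the latent and infectious periods, the time step, and the choice to place the initial cases in the first infectious compartment are illustrative assumptions, not the values in Supplementary File 1.

```python
# Public transport share x train share of commuters, about 0.10 (from the text).
TRAIN_SHARE = 0.20 * 0.509


def seeiir(population=5.73e6, initial_cases=2000.0, r0=2.5,
           latent_days=3.0, infectious_days=4.0, days=7, dt=0.01):
    """Forward-Euler integration of a deterministic SEEIIR model.

    Returns one (S, E1, E2, I1, I2, R) tuple per simulated day.
    latent_days, infectious_days and dt are assumed values.
    """
    sigma = 2.0 / latent_days      # rate out of each exposed stage
    gamma = 2.0 / infectious_days  # rate out of each infectious stage
    beta = r0 / infectious_days    # so that R0 = beta * infectious_days
    # Assumption: all initial cases start in the first infectious stage.
    state = (population - initial_cases, 0.0, 0.0, initial_cases, 0.0, 0.0)
    steps_per_day = round(1.0 / dt)
    daily = []
    for _ in range(days):
        for _ in range(steps_per_day):
            s, e1, e2, i1, i2, r = state
            new_infections = beta * s * (i1 + i2) / population
            flows = (
                -new_infections,
                new_infections - sigma * e1,
                sigma * e1 - sigma * e2,
                sigma * e2 - gamma * i1,
                gamma * i1 - gamma * i2,
                gamma * i2,
            )
            state = tuple(x + dt * dx for x, dx in zip(state, flows))
        daily.append(state)
    return daily


def infectious_train_users(state):
    """Expected infectious train users for one daily state."""
    return TRAIN_SHARE * (state[3] + state[4])


daily = seeiir()
seeds = [infectious_train_users(state) for state in daily]
```

Splitting the exposed and infectious states into two stages each gives Erlang-distributed rather than exponentially distributed durations, which is the reason given above for repeating the compartments.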
Face mask wearing scenarios
The main objective of this work was to investigate the likely effects of different proportions of passengers wearing face masks (i.e. face mask coverage). How a passenger’s mask-wearing status affects the transit transmission model depends on whether the passenger is susceptible or infectious. Mask wearing by infectious people reduces virus shedding. Mask wearing by susceptible individuals is known to reduce the overall likelihood of infection [38], and does so through two separate mechanisms: (i) it explicitly affects the likelihood of being infected through the direct transmission component; and (ii) it reduces the effective dose from surfaces for the fomite transmission component.
To parameterize the effect on viral shedding for infectious passengers, we use the filtration efficiency of a two-ply cloth mask as described in Howard et al. [18]. They report that two-ply cloth masks reduce exposure to infectious particles by 88-94% and have filtration efficiencies of 80-90%. We use 90% in our modeling because it lies within both ranges, and reducing exposure to infectious particles is arguably the most important consideration.
For the reduction in susceptibility, the model was calibrated by minimizing the sum of squared errors to achieve the reported odds ratio of 0.22 for wearing a mask [38]. Further information can be found in Supplementary File 1.
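The two mask effects can be illustrated numerically. The sketch below applies the 90% filtration efficiency as a multiplier on shedding, and shows what an odds ratio of 0.22 implies for a given infection probability. The paper calibrates model parameters to reproduce this odds ratio rather than applying it directly, so the direct conversion here is only an illustration; the function names are hypothetical.

```python
FILTRATION_EFFICIENCY = 0.90  # two-ply cloth mask, value used in the text
MASK_ODDS_RATIO = 0.22        # reported odds ratio for mask wearing [38]


def shedding_multiplier(wears_mask):
    """Fraction of virus still shed by an infectious passenger."""
    return 1.0 - FILTRATION_EFFICIENCY if wears_mask else 1.0


def apply_odds_ratio(p, odds_ratio=MASK_ODDS_RATIO):
    """Scale the odds p / (1 - p) by odds_ratio and convert back to a probability."""
    return odds_ratio * p / (1.0 - p + odds_ratio * p)


# Made-up baseline: a 30% infection probability without a mask.
p_unmasked = 0.3
p_masked = apply_odds_ratio(p_unmasked)
```

For small baseline probabilities the odds ratio is close to a relative risk, so the masked probability is then roughly 0.22 times the unmasked one.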