October 26, 2017
Disclaimer: The designations employed and the presentation of material throughout this publication do not imply the expression of any opinion whatsoever on the part of UNESCO concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. The ideas and opinions expressed in this publication are those of the authors; they are not necessarily those of UNESCO and do not commit the Organization. 
Citation: Mitchell, J. and Akram, S. 2017. Pathogen Specific Persistence Modeling Data. In: J.B. Rose and B. Jiménez-Cisneros (eds) Global Water Pathogen Project. http://www.waterpathogens.org (M. Yates (eds) Part 4 Management of Risk from Excreta and Wastewater) http://www.waterpathogens.org/book/pathogenspecificpersistencemodelingdata Michigan State University, E. Lansing, MI, UNESCO.
Last published: October 26, 2017 
Persistence modeling facilitates the accurate simulation of the stages of growth, survival, and death of microorganisms in environmental matrices by describing changes in microbial population size over time. The most commonly used model for simulating persistence patterns of pathogens is the first-order exponential one-parameter model. However, persistence curves for many microorganisms do not follow this classic linear trend, as evidenced by decades of studies across the growing field of predictive microbiology and for microbes in many different matrices. Therefore, both linear and nonlinear curves (models) must be evaluated in order to provide an accurate description of pathogen- and matrix-specific persistence (or inactivation).
Seventeen linear and nonlinear persistence models were used to find the best models for describing the persistence of water microbes (bacteria, viruses, bacteriophages, bacteroidales, and protozoa) in human urine, wastewater, freshwater, marine water, groundwater, biosolids and manure. A total of 30 datasets containing 180 different pathogen (or indicator)/matrix combinations were used in this study to find the best fitting models through linear regression techniques to describe persistence subject to various conditions. Like the exponential decay model, these models contain general parameter(s) that mathematically describe the relationship between reductions in microbial populations and time. The models do not contain explanatory variables to isolate the effects of environmental conditions (e.g. temperature, UV exposure) on inactivation. Overall, three models (JM2, JM1, and Gamma) were found to be the best fitting models across the entire data set and represented 59%, 34% and 25% of the data sets, respectively. JM2 fit the persistence data best in all environmental matrices except human urine and groundwater, in which JM1 performed best at describing the persistence patterns. Across pathogens, JM2 was the best model for bacteria, bacteriophages, and bacteroidales, whereas viruses were best fit by JM1. The models which best describe the persistence pattern of each pathogen or indicator in a matrix under different treatments, and their corresponding parameters, are presented in this chapter. In addition, T90 and T99 values, which are commonly used to specify the time required for a pathogen concentration to decrease by one and two log units, respectively, are reported for all the datasets and compared between matrices and microorganism types.
While these metrics are often used to describe pathogen persistence, they are only relevant in the linear region of the persistence curve and will be misleading if an incorrect model is assumed or a model other than the best fitting model is used to estimate them. Therefore, the results in this chapter, which contain the best fitting models and parameters along with the associated T90 and T99 calculations for various pathogen/matrix combinations, can reduce uncertainty in estimates of pathogen population size in water environments over time.
Mathematical models are commonly used to describe the microbial inactivation of pathogens persisting in environmental matrices and to predict population sizes for subsequent human health risk calculations, which may lead to treatment decisions. Persistence modeling facilitates the accurate simulation of the stages of growth, survival, and death of microorganisms in different matrices by describing the mathematical relationship between the population size of microorganisms and time, using regression techniques to estimate kinetic parameters (constants). While T90 and T99 values are commonly used to indicate the time required for a pathogen concentration to decrease by one and two log units, respectively, these metrics are only relevant in the linear region of a persistence curve and may be misleading if a more complex persistence pattern (e.g. a curve with a shoulder or a tail) more accurately describes a specific pathogen in a specific matrix. The significance of using mathematical models to describe kinetics lies in their ability to describe the persistence pattern of specific pathogens in different environments, enabling engineers and policy makers to predict the absolute microorganism population at any given time through interpolation or extrapolation, not just the population size relative to the initial conditions, as with T90 or T99.
The most commonly used model for simulating persistence patterns of pathogens is the first-order exponential one-parameter model, which was originally developed to describe the inactivation of microorganisms by chemical disinfectants (Chick 1908). However, as predictive microbial modeling has developed as a field over many years, persistence curves that do not follow this classic linear pattern have been observed for microorganisms in a number of environments. Therefore, accurate description of nonlinear persistence curves is essential. A common explanation for this nonlinearity in persistence is that the population of microorganisms may consist of several subpopulations, each with different inactivation kinetics. Linear curves, curves with a “shoulder” (a delay before attenuation begins), curves with “tailing” (attenuation slows with time), and sigmoidal curves (both a shoulder and a tail) are the four most typically observed patterns of bacterial decay (Xiong et al. 1999). Shoulders represent a gradual initial phase of inactivation, while tailing can indicate that some microorganisms are intrinsically resistant or are protected by various factors. Figure 1 shows a schematic representation of different patterns of persistence of microorganisms.
Figure 1. Schematic representation of four different persistence patterns
As the first order model cannot describe more complex persistence patterns such as shoulders, tailing, and sigmoidal curves noted above, several other models were developed and tested over time.
Table 1 shows the list of the microbial persistence models, which were utilized for the studies described in this chapter as well as their corresponding equations. These models are mostly empirical and have three or fewer model parameters.
A literature review was conducted as described in the previous chapter, “Persistence of Pathogens in Sewage and Other Water Types”. However, the search was expanded to include papers with die-off data for additional water matrix terms: sludge, urine and manure. For this analysis, raw data of microorganism concentration vs. time were obtained from the original authors of the peer-reviewed journal articles or digitized from the figures in the publications. A total of 95 studies contained acceptable data for modeling, comprising 304 individual data sets, each describing a pathogen and matrix combination under various environmental conditions (urine, n=9; freshwater, n=131; wastewater, n=16; biosolids, n=42; marine, n=60; groundwater, n=46). While environmental conditions (temperature, the presence of UV light or indigenous microbiota, for example) are known to influence decay rates, the focus of this analysis is not to describe empirical data sets with explanatory variables through regression modeling, as this is reported in the original papers. The purpose of this chapter is to summarize pathogen-specific decay rates using the generalized persistence models that best fit decay data sets for specific water environments under specific conditions.
Seventeen previously established linear and nonlinear persistence models were evaluated to determine the best models for describing the persistence of different bacteria, viruses, bacteriophages, bacteroidales, and protozoa in human urine, wastewater, biosolids and manure, freshwater, marine water, and groundwater matrices. The chapter includes a selected representation of pathogen, marker, or indicator data sets across the 6 environmental matrices noted above. A total of 30 studies are summarized in this chapter, which include 180 different pathogen (or indicator)/matrix combinations. Indicator bacteria are typically used to detect the level of fecal contamination in environments and are generally not pathogenic to humans, while pathogens are microorganisms that can cause disease. Table 2 shows the best fitting models for all collected data. The JM2, JM1, and gamma models were the best fit for 58.7%, 33.7% and 25.0% of the experimental data, respectively; these percentages sum to more than 100% because more than one model can be selected as best fitting for a single data set. JM2 fit the persistence data best in all matrices except human urine and groundwater, where JM1 performed best at describing the persistence patterns.
Table 3 shows the best fitting models for different pathogen and indicator types. JM2 was the best model for bacteria, bacteriophages, and bacteroidales. For viruses, JM1 best described the persistence curves.
Table 4 shows the best fitting models for specific pathogens and indicators. The JM2 model best described the persistence curves for E. coli, Enterococci, Salmonella, and HF183. The persistence of MS2 was best described by the JM1 model, and the exponential model was the best fitting model for Adenovirus.
The times required for the concentration of a type of microorganism in a specific environment to decrease by one and two log units are called T90 and T99, respectively. These values are calculated and summarized using the best fitting models and their corresponding parameters for every pathogen (or indicator)/matrix combination. The predicted numbers of days needed to achieve 90% and 99% reductions (T90 and T99) of pathogens and indicators in different matrices are summarized in Tables 5 and 6. The tables also show the ranges and standard deviations of the T90 and T99 values calculated for every pathogen (or indicator)/matrix combination.
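As an illustration of how T90 and T99 depend on the assumed model, the following Python sketch computes both values in closed form for the first-order model and numerically for an arbitrary non-increasing persistence curve. The function names, the bisection approach, and the biphasic example curve with its parameter values are assumptions made for illustration only; they are not the models or fitted parameters reported in the tables of this chapter.

```python
import math


def t_log_reduction_first_order(k, logs):
    """Time for N(t) = N0 * exp(-k * t) to drop by `logs` log10 units."""
    return logs * math.log(10) / k


def t_log_reduction_numeric(log10_survival, logs, t_max, tol=1e-9):
    """Bisection search for the time at which log10(N(t)/N0) reaches -logs.

    Assumes log10_survival is continuous and non-increasing on [0, t_max].
    Returns None if the reduction is not reached within t_max.
    """
    if log10_survival(t_max) > -logs:
        return None
    lo, hi = 0.0, t_max
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if log10_survival(mid) > -logs:
            lo = mid
        else:
            hi = mid
    return hi


def biphasic(t, f=0.99, k1=2.0, k2=0.1):
    """Hypothetical tailing curve: a fraction f decays fast, the rest slowly."""
    return math.log10(f * math.exp(-k1 * t) + (1.0 - f) * math.exp(-k2 * t))


# T90 and T99 (in the time units of k1 and k2) for the hypothetical tailing curve
t90 = t_log_reduction_numeric(biphasic, 1, 200.0)
t99 = t_log_reduction_numeric(biphasic, 2, 200.0)
print(f"T90 = {t90:.2f}, T99 = {t99:.2f}, ratio = {t99 / t90:.2f}")
```

For the first-order model T99 is always exactly twice T90. For the hypothetical tailing curve the ratio is larger than two, so doubling a T90 value under a log-linear assumption would understate how long the population persists.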
Results from studies on the persistence of various human pathogenic bacteria, viruses, bacteroidales, protozoa, and bacteriophages in human urine, wastewater, freshwater, marine water, groundwater, and biosolids were collected to find the best-fit mathematical models for different pathogen (or indicator)/matrix combinations.
The best fitting models for the different pathogen combinations in a human urine matrix, along with the corresponding fitted parameters, are presented in Table 7. The pathogens studied were adenovirus and MS2 bacteriophage. Data were obtained from an as yet unpublished study conducted by Dr. Tamar Kohn of the Swiss Federal Institute of Technology in Lausanne.
The best fitting models for the different pathogen combinations in wastewater matrices, along with the corresponding fitted parameters, are presented in Table 8. The pathogens studied were bacteria, viruses, bacteroidales, and protozoa, and the matrices were treated and untreated wastewaters.
The best fitting models for the different pathogen combinations in manure and biosolids matrices, along with the corresponding fitted parameters, are presented in Table 9. The pathogens studied were bacteria, viruses, bacteroidales, and bacteriophages, and the matrices were composted manure, sludge, manure, and freshwaters contaminated with feces.
The best fitting models for the different pathogen combinations in freshwater matrices, along with the corresponding fitted parameters, are presented in Tables 10a, 10b, and 10c. The pathogens studied were bacteria, viruses, bacteroidales, bacteriophages, and protozoa, and the matrices were lake and river waters.
The best fitting models for the different pathogen combinations in marine water matrices, along with the corresponding fitted parameters, are presented in Table 11. The pathogens studied were bacteria and viruses, and the matrices were seawater, laboratory-prepared saltwater, and isolated estuarine waters.
The best fitting models for the different pathogen combinations in groundwater matrices, along with the corresponding fitted parameters, are presented in Table 12. The pathogens studied were viruses, a protozoan, and a bacteriophage.
As the first-order model could not describe the persistence pattern of various pathogen/matrix combinations, several other models were developed and tested over time. The logistic model was originally developed to describe sigmoid-shaped decay curves in microbiology (Kamau et al. 1990). The Fermi model was originally applied to describe the influence of electric field intensity on the persistence of a microbial population (Peleg 1995). In addition, an exponentially damped polynomial model can be used to describe tailing survival curves (Cavalli-Sforza et al. 1983). Juneja and Marks developed two models, denoted here as JM1 and JM2, to describe the fate of foodborne pathogens in food processing operations (Juneja et al. 2003; Juneja et al. 2006). JM1 was mostly used to simulate decay with a convex shape and to describe the tail in decay curves over long periods. JM2 was frequently used to simulate nonlinear persistence curves in thermal inactivation. Different variations of the Gompertz model, denoted Gz2 and Gz3, were developed to predict the number of microorganisms persisting under stressed environmental conditions (Wu et al. 2004; Gil et al. 2011). Along with the logistic model, Gompertz functions are frequently used to fit sigmoidal kinetics (Membre et al. 1997). The Gz2 and Gz3 models can describe log-linear kinetics, shoulders, and tailing effects (Zwietering et al. 1990; Chhabra et al. 1999). Weibull and lognormal models are commonly used for thermal and nonthermal disinfection in different matrices. The Weibull model can predict linear, concave, convex, and sigmoidal curves (Coroller et al. 2006). The other commonly used persistence model is Gamma. This is a simple model with few parameters and is commonly used to simulate microbial persistence in water matrices under varying environmental conditions (van Gerwen & Zwietering 1998).
The broken-line models Bi and Bi2 were originally developed to simulate multiple breakpoints in decay curves (Muggeo 2003). The double exponential model, which was originally developed to simulate thermal inactivation, can represent linear and biphasic persistence curves (Abraham et al. 1990). The sigmoidal models sA and sB, on the other hand, are typically used to describe concave inactivation curves (Peleg 2006).
This chapter specifies and discusses the most appropriate mathematical models to describe the persistence patterns of various types of pathogens in different matrices, and compares the best fit models and decay rates in the specific pathogen (or indicator)/matrix combinations.
Maximum likelihood estimation and Bayesian Information Criterion (BIC) values were used to assess the goodness of fit among the 17 persistence models that were fit to each pathogen (or indicator)/matrix combination. The models with the lowest absolute BIC values were selected as the best fitting models. Differences of less than 2 in BIC values are not considered strong evidence for model selection. Therefore, if the BIC values of several models for a pathogen/matrix combination were within 2 of the lowest value, all of these models were considered best fit models in this chapter.
BIC is defined as:
BIC = k \ln(n) - 2 \ln(L_{m})
where k is the total number of parameters, n is the number of data points in the observed data (x), and L_{m} is the maximum likelihood of the model. L_{m} is defined as:
L_{m} = p(x \mid \theta, M)
where θ are the parameter values which maximize the likelihood function, and M is the model used.
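The selection rule described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the chapter: the function names, the placeholder model names, and the log-likelihood values are hypothetical, and the sketch assumes each model has already been fit so that its maximized log-likelihood ln(L_{m}) is available.

```python
import math


def bic(k, n, log_likelihood):
    """BIC = k * ln(n) - 2 * ln(L_m), with ln(L_m) passed in directly."""
    return k * math.log(n) - 2.0 * log_likelihood


def best_models(bic_by_model, threshold=2.0):
    """Return every model whose BIC is within `threshold` of the lowest BIC."""
    lowest = min(bic_by_model.values())
    return sorted(name for name, value in bic_by_model.items()
                  if value - lowest < threshold)


# Hypothetical fits to one data set of n = 10 observations:
# model name -> (number of parameters k, maximized log-likelihood)
n = 10
candidates = {
    "model_A": (1, -12.0),
    "model_B": (2, -8.0),
    "model_C": (3, -7.5),
}
scores = {name: bic(k, n, ll) for name, (k, ll) in candidates.items()}
selected = best_models(scores)
print(selected)
```

In this made-up example, model_B has the lowest BIC and model_C falls within 2 units of it, so both are retained as best fitting, while model_A is rejected. Retaining ties in this way is why a single data set can count toward more than one model.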
Raw data from different studies were obtained or extracted, analyzed, and fit to various persistence models using the R statistical language (R Development Core Team, 2013). Data-capturing software, GetData Graph Digitizer (http://www.getdatagraphdigitizer.com), was used to digitize figures in the published papers where data values were not specifically stated.
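To make the fitting step concrete, the sketch below fits only the simplest candidate, the first-order model, to a small concentration time series in Python rather than R. It is an assumption-laden illustration: the data points are invented, the function name is hypothetical, and the fit assumes independent Gaussian errors on log10 concentration, under which ordinary least squares gives the maximum likelihood estimates. The chapter does not state the error model that was used.

```python
import math


def fit_first_order(times, log10_conc):
    """Least-squares fit of log10 N(t) = log10 N0 - (k / ln 10) * t.

    Returns the decay rate k, the fitted log10 N0, and the maximized
    Gaussian log-likelihood (assumed error model, see lead-in).
    """
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(log10_conc) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, log10_conc))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    residuals = [y - (intercept + slope * t) for t, y in zip(times, log10_conc)]
    sigma2 = sum(r * r for r in residuals) / n  # ML estimate of error variance
    log_lik = -0.5 * n * (math.log(2.0 * math.pi * sigma2) + 1.0)
    return {"k": -slope * math.log(10), "log10_N0": intercept,
            "log_likelihood": log_lik}


# Invented example data: time in days, concentration in log10 units
days = [0, 1, 2, 3, 4]
log10_counts = [6.0, 5.1, 3.9, 3.1, 2.0]
fit = fit_first_order(days, log10_counts)
print(fit)
```

The returned log-likelihood is the quantity ln(L_{m}) that enters the BIC, so the same pattern could be repeated for each candidate model before comparing BIC values. Nonlinear models would need an iterative optimizer instead of the closed-form solution used here.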