Paper No. 17-04

Cheung S, Hutton JL and Brettschneider JA

Review of sojourn time calculation models used in breast cancer screening

Abstract: For decades, researchers have been estimating the sojourn time of breast cancers, primarily to identify a suitable round length for a new breast screening programme. In recent years, research on sojourn time has instead aimed to evaluate round lengths in existing breast screening programmes. Lead time is also a well-studied topic, and it is used to adjust the survival time of screen-detected cancer patients in studies on the efficacy of screening programmes.

Because sojourn time and lead time cannot be observed directly, mathematical models are built to estimate them. However, the terminology used in these models is often not well defined, which can make it difficult for researchers to learn the models and interpret their results. In this study, the models and their parameters are reviewed.

The definition of sojourn time is inconsistent: the end point of this period is the time at which the cancer becomes clinically detectable in some papers and the time of clinical detection in others. The starting point of the lead time for a screen-detected cancer is the diagnosis time, which is approximated by the screening date in lead time modelling. Sensitivity has been wrongly defined as the sensitivity of mammography in most studies of breast cancer sojourn time. Instead, it should be the sensitivity of the whole diagnostic process within the chosen screening programme.

Two main types of stochastic process model are used for sojourn time estimation: time recurrence models and Markov chain models. These models simplify the natural history of breast cancer and, as a result, they carry numerous assumptions. The progressive assumption, in particular, is likely to be invalid. The commonly used estimation methods are maximum likelihood, least squares and Bayesian inference. The study settings and cohorts often affect the model and the data, which in turn affect the estimates. Researchers should understand the nature of the disease, the study settings, the population cohorts and the data collection process in order to build an appropriate model for the study cohort.
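
To make the quantities above concrete, the following is a minimal simulation sketch of a progressive three-state natural history (disease-free, preclinical, clinical) with periodic screening. It is not one of the reviewed models. The function name `simulate_lead_times`, the exponential sojourn time, the uniform onset time and the independence of successive screens are all assumptions made for illustration only. Here `sensitivity` stands for the sensitivity of the whole diagnostic process, and lead time is measured from the screening date to the time of clinical detection.

```python
import random


def simulate_lead_times(n, mean_sojourn, interval, sensitivity, seed=0):
    """Simulate n cancers under an assumed progressive three-state model.

    Assumptions (illustrative, not taken from the reviewed papers):
    - onset of the preclinical phase is uniform within one screening round;
    - sojourn time is exponential with mean `mean_sojourn`;
    - screens take place at interval, 2*interval, ... and each screen
      independently detects a preclinical cancer with probability
      `sensitivity`.

    Returns (lead_times, clinical_cases): the lead times of the
    screen-detected cancers and the number of clinically detected cancers.
    """
    rng = random.Random(seed)
    lead_times = []
    clinical_cases = 0
    for _ in range(n):
        onset = rng.uniform(0.0, interval)
        sojourn = rng.expovariate(1.0 / mean_sojourn)
        clinical_time = onset + sojourn
        screen_time = interval  # first screen after onset
        detected = False
        # Screen repeatedly while the cancer is still preclinical.
        while screen_time < clinical_time:
            if rng.random() < sensitivity:
                # Lead time: screening date to clinical detection time.
                lead_times.append(clinical_time - screen_time)
                detected = True
                break
            screen_time += interval
        if not detected:
            clinical_cases += 1
    return lead_times, clinical_cases


if __name__ == "__main__":
    # Hypothetical inputs: 3-year mean sojourn time, 3-year round length.
    leads, clinical = simulate_lead_times(10_000, 3.0, 3.0, 0.9)
    print("screen-detected:", len(leads), "clinically detected:", clinical)
    if leads:
        print("mean lead time:", sum(leads) / len(leads))
```

Varying `interval` and `sensitivity` in this sketch shows how the split between screen-detected and clinically detected cancers depends on both the round length and the sensitivity, which is why the definition of sensitivity matters when sojourn time is estimated from such counts.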