1 Introduction
This thesis is about applied and methodological Bayesian statistics. It is applied and methodological in that its primary concerns are real-world questions and the means to answer them. The statistical approach is Bayesian because probability theory is used to arrive at conclusions based on models for observed data.
The applied focus of this thesis is obtaining the strategic information needed to plan the response to the HIV (human immunodeficiency virus) epidemic in sub-Saharan Africa (SSA). More than 40 years after the beginning of the epidemic, HIV is the largest annual cause of disability-adjusted life years (DALYs) among non-infants in SSA [Global Burden of Disease Collaborative Network (2019); Figure 1.1]. Quantification of the epidemic using statistics is a crucial part of the public health response. Effective implementation of HIV prevention and treatment requires strategic information. However, producing suitable estimates of relevant indicators is complicated by a range of statistical challenges.
The data used were gathered in national household surveys or routinely collected from healthcare facilities providing HIV services. An important feature of these data is the location and time at which observations were recorded. Spatio-temporal data have important recurring commonalities across a diverse range of application settings. The work conducted in this thesis uses and aspires to contribute to techniques from spatio-temporal statistics.
Computation is an essential part of modern statistical practice. Each project in this thesis, and the thesis itself, is accompanied by R (R Core Team 2022) code, hosted on GitHub at https://github.com/athowes. To facilitate reproducible research, the R package orderly (FitzJohn et al. 2023) was used to structure code repositories.
1.1 Chapter overview
This thesis is structured as follows:
- Chapter 2 provides an overview of the HIV/AIDS epidemic and describes the challenges faced by surveillance efforts.
- Chapter 3 introduces the statistical concepts and notation used throughout the thesis, focusing on Bayesian modelling and computation, spatio-temporal statistics, and survey methods.
- Chapter 4: The prevailing model for spatial structure used in small-area estimation (Besag, York, and Mollié 1991) was intended to analyse a grid of pixels. In disease mapping, areas correspond to the administrative divisions of a country, which are typically not a grid. I used simulation and survey data studies to evaluate the practical consequences of this concern.
- Chapter 5: Adolescent girls and young women are a demographic group at disproportionate risk of HIV infection. The Global AIDS Strategy recommends prioritising interventions on the basis of behaviour to prevent the most new infections using the limited available resources. I estimated the size of behavioural risk groups across priority countries to enable implementation of this strategy. Additionally, I assessed the potential benefits of the strategy in terms of numbers of new infections prevented. This work (Howes et al. 2023) was included in the UNAIDS (Joint United Nations Programme on HIV/AIDS) Global AIDS Update 2022 and 2023.
- Chapter 6: The Naomi small-area estimation model (Eaton et al. 2021) is used by countries to estimate district-level HIV indicators. First, to allow for compatibility with Naomi, I implemented integrated nested Laplace approximations using automatic differentiation, opening the door to a new class of fast, flexible, and accurate Bayesian inference algorithms. The implementation was demonstrated using models for a clinical trial of an epilepsy drug, and for the prevalence of the parasitic worm Loa loa. Second, I developed an approximate Bayesian inference method combining adaptive Gauss-Hermite quadrature with principal components analysis. I applied these methods to data from Malawi, and analysed the consequences of the inference method choice for policy-relevant outcomes.
- Chapter 7: Finally, I discuss contributions of the research, avenues for future work, and some broader reflections.
Though chronological order is recommended, Chapters 4, 5 and 6 may be read in any order, or as stand-alone studies, if preferred.