Tuberculosis (TB) remains a major public health challenge in Brazil despite long-standing control strategies. Understanding its temporal dynamics is essential for surveillance and policy planning. This study investigates the monthly tuberculosis incidence rate in Brazil using Singular Spectrum Analysis (SSA) and compares its performance with the Mann-Kendall (MK) trend test and ARIMA modeling. SSA was applied with a window length of 36 months to decompose the series into trend, seasonal, and noise components. The MK test was used to assess the statistical significance of the long-term trend, while ARIMA modeling evaluated short-term predictive performance. The results reveal a statistically significant decreasing trend (τ = −0.33, p < 0.001), weak but persistent seasonality, and structural changes in recent years. SSA proved superior in identifying intrinsic temporal patterns, while MK confirmed trend significance and ARIMA complemented the analysis with forecasting capability. The integrated framework provides a robust approach for epidemiological time-series analysis of tuberculosis.
ARIMA; Brazil; Mann–Kendall test; Singular spectrum analysis; Tuberculosis; Seasonality
Tuberculosis (TB) remains one of the leading infectious causes of morbidity and mortality worldwide, particularly in low- and middle-income countries (WHO, 2023). Brazil is among the 30 countries with the highest TB burden, despite continuous investments in diagnosis, treatment, and surveillance (WHO, 2023).
Long-term epidemiological time series are often affected by non-stationarity, demographic changes, health policy interventions, and external shocks such as the COVID-19 pandemic. Traditional parametric models, including regression-based and ARIMA approaches, rely on assumptions that may not be fully satisfied in such contexts (Box et al., 2015). Consequently, data-adaptive and non-parametric techniques have gained increasing attention in epidemiological research.
Singular Spectrum Analysis (SSA) is a powerful non-parametric method that decomposes time series into interpretable components such as trend and seasonality without requiring stationarity or linearity assumptions (Golyandina & Zhigljavsky, 2013). SSA has been successfully applied in environmental sciences and epidemiology to reveal hidden temporal structures in complex datasets (Hassani, 2007).
This study applies SSA to the monthly tuberculosis incidence rate in Brazil and compares its performance with the Mann–Kendall trend test and ARIMA modeling. The objective is to provide a comprehensive methodological assessment and demonstrate the advantages of combining complementary approaches for epidemiological time-series analysis.
Monthly tuberculosis data were obtained from the Brazilian National Notifiable Diseases Information System (SINAN/DATASUS), covering the period from January 2001 to December 2022. The analysis focused on the monthly tuberculosis incidence rate (TI), expressed as the number of reported cases per 100,000 inhabitants.
The use of incidence rates, rather than absolute case counts, minimizes the influence of population growth and allows consistent temporal comparisons across the study period [1]. The final dataset consisted of 264 monthly observations with no missing values.
Data source and preprocessing
This time-series analysis was conducted using monthly tuberculosis incidence data in Brazil from January 2001 to December 2022. Tuberculosis case records were obtained from the Brazilian National Notifiable Diseases Information System (SINAN), the official surveillance system for compulsory notifiable diseases managed by the Ministry of Health. Annual population estimates were retrieved from the Brazilian Institute of Geography and Statistics (IBGE).
Monthly tuberculosis incidence rates (TI) were calculated as the number of reported cases per 100,000 inhabitants, providing a population-standardized metric suitable for long-term temporal analysis. The final dataset consisted of 264 continuous monthly observations with no missing values, and no data imputation was required.
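The rate calculation described above can be illustrated with a minimal sketch. It assumes that each annual population estimate is reused for all twelve months of its year; the source does not state how annual estimates were matched to monthly counts. The function names and the example figures are hypothetical and are not SINAN or IBGE data.

```python
def monthly_incidence_rate(cases, population, per=100_000):
    """Return reported cases per `per` inhabitants for one month."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population * per


def rates_from_annual_population(monthly_cases, annual_population):
    """Compute monthly rates from monthly counts and annual populations.

    Assumption of this sketch: the annual estimate for a year is used
    as the denominator for every month of that year.

    monthly_cases: dict mapping (year, month) -> reported case count
    annual_population: dict mapping year -> population estimate
    """
    return {
        (year, month): monthly_incidence_rate(count, annual_population[year])
        for (year, month), count in sorted(monthly_cases.items())
    }


# Hypothetical figures, for illustration only.
example_cases = {(2001, 1): 6_000, (2001, 2): 5_400}
example_population = {2001: 150_000_000}
example_rates = rates_from_annual_population(example_cases, example_population)
```

Other denominator choices, such as interpolating population between annual estimates, would change the rates slightly and are equally compatible with the description given here.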
Exploratory data analysis
An exploratory analysis was performed to visualize the temporal evolution of tuberculosis incidence and to identify potential trends, variability, and seasonal patterns. This step included time-series visualization and monthly boxplot analysis to assess intra-annual variability. Exploratory results guided the selection of appropriate analytical methods and window length for Singular Spectrum Analysis.
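The monthly boxplot step amounts to grouping observations by calendar month and summarizing each group by its quartiles. The sketch below shows one way to compute those summaries, assuming the series starts in January and using the inclusive quartile method; both choices, the function name, and the synthetic input are assumptions made for illustration.

```python
from statistics import median, quantiles


def monthly_summary(values, start_month=1):
    """Group a monthly series by calendar month.

    Returns {month: (q1, median, q3)} for every month that has at
    least two observations. `start_month` is the calendar month
    (1 to 12) of the first value in the series.
    """
    by_month = {m: [] for m in range(1, 13)}
    for i, value in enumerate(values):
        month = (start_month - 1 + i) % 12 + 1
        by_month[month].append(value)

    summary = {}
    for month, vals in by_month.items():
        if len(vals) < 2:
            continue  # quartiles need at least two points
        q1, _, q3 = quantiles(vals, n=4, method="inclusive")
        summary[month] = (q1, median(vals), q3)
    return summary


# Synthetic three-year series (not the study data): month m takes the
# values m, m + 1, and m + 2 in successive years.
synthetic = [float(m + year) for year in range(3) for m in range(1, 13)]
summary = monthly_summary(synthetic)
```

Comparing the medians and interquartile ranges across the twelve months gives the same information that the boxplots display graphically.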
Singular Spectrum Analysis (SSA)
Singular Spectrum Analysis (SSA) is a non-parametric, data-adaptive technique used to decompose a time series into interpretable components such as trend, seasonal oscillations, and noise. The SSA procedure consists of four main steps: embedding, singular value decomposition (SVD), grouping, and reconstruction [2].
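For reference, the four steps can be written compactly in the standard basic-SSA notation. The symbols below (N for the series length, L for the window length, K for the number of lagged vectors, and λ_i, U_i, V_i for the eigentriples) are conventional notation assumed here rather than notation taken from this study.

```latex
% Embedding: series x_1, ..., x_N, window length L, K = N - L + 1
\mathbf{X} = [X_1 : \dots : X_K], \qquad
X_j = (x_j, x_{j+1}, \dots, x_{j+L-1})^{\mathsf{T}}, \qquad
\mathbf{X}_{r,c} = x_{r+c-1}

% SVD: d = rank(X), eigentriples (\sqrt{\lambda_i}, U_i, V_i)
\mathbf{X} = \sum_{i=1}^{d} \mathbf{X}_i, \qquad
\mathbf{X}_i = \sqrt{\lambda_i}\, U_i V_i^{\mathsf{T}}

% Grouping: for an index set I (e.g. trend or seasonal)
\mathbf{X}_I = \sum_{i \in I} \mathbf{X}_i

% Reconstruction by diagonal averaging, t = 1, ..., N
\tilde{x}^{(I)}_t = \frac{1}{|A_t|} \sum_{(r,c) \in A_t} (\mathbf{X}_I)_{r,c},
\qquad A_t = \{(r,c) : r + c = t + 1,\ 1 \le r \le L,\ 1 \le c \le K\}

% Share of variance attributed to component i
\frac{\lambda_i}{\sum_{j=1}^{d} \lambda_j}
```

With N = 264 observations and L = 36, the trajectory matrix has K = 264 − 36 + 1 = 229 columns.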
A window length of L = 36 months was selected to adequately capture annual and interannual variability, which is appropriate for monthly epidemiological series [3]. The trajectory matrix was constructed from the original incidence series and decomposed using SVD. The leading eigenvalues and eigenvectors were examined to identify dominant components.
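A minimal NumPy sketch of basic SSA is given below to make the embedding, SVD, and diagonal-averaging steps concrete. It is an illustration under stated assumptions, not the implementation used in this study: the function name is hypothetical, the series is not centered before decomposition, and the example input is synthetic. Which elementary components belong to the trend or to the seasonal group depends on the data and has to be decided by inspecting the singular values and vectors.

```python
import numpy as np


def ssa_decompose(series, window):
    """Basic SSA: embedding, SVD, and diagonal averaging.

    Returns (components, variance_share). components[i] is the series
    reconstructed from the i-th elementary matrix, and
    variance_share[i] = sigma_i**2 / sum(sigma**2).
    """
    x = np.asarray(series, dtype=float)
    n = x.size
    if not 1 < window < n:
        raise ValueError("window must satisfy 1 < window < len(series)")
    k = n - window + 1

    # Embedding: column j of the trajectory matrix is x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])

    # SVD of the (window x k) trajectory matrix.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)

    components = np.empty((s.size, n))
    for i in range(s.size):
        elem = s[i] * np.outer(u[:, i], vt[i])
        # Diagonal averaging: after flipping the rows, each
        # anti-diagonal of `elem` becomes an ordinary diagonal.
        flipped = elem[::-1]
        components[i] = [
            flipped.diagonal(d).mean() for d in range(-window + 1, k)
        ]
    return components, s**2 / np.sum(s**2)


# Synthetic monthly series (not the study data): slow decline plus a
# small annual cycle, 264 points, window of 36 as in the text.
t = np.arange(264)
synthetic = 4.5 - 0.005 * t + 0.1 * np.sin(2 * np.pi * t / 12)
components, share = ssa_decompose(synthetic, window=36)

# Grouping is a modelling decision; here only the leading elementary
# component is taken as a first approximation of the trend.
leading = components[0]
```

Summing all rows of `components` returns the original series, so any grouping of indices splits the series additively. When the series is not centered, the leading component also carries the mean level, which tends to make its variance share large.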
Trend and seasonal components were reconstructed using diagonal averaging based on the grouping of dominant SSA modes. The relative contribution of each component was assessed using the proportion of variance explained by the corresponding eigenvalue.
Mann–Kendall trend test
To statistically assess the presence of a monotonic trend in the tuberculosis incidence series, the Mann–Kendall (MK) test was applied. This non-parametric test does not require assumptions of normality and is robust to outliers and non-linear behavior [4].
The MK test statistic, Kendall’s tau (τ), and the associated p-value were computed for the original monthly incidence series. The test was used to confirm the statistical significance of the long-term trend identified through SSA.
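The computation behind the test can be sketched in a few lines of standard-library Python. This is an illustrative version, assuming the usual normal approximation with continuity correction, the standard tie correction to the variance of S, and τ computed as S divided by the number of pairs; it applies no correction for serial correlation, and the function name is hypothetical.

```python
import math
from collections import Counter


def mann_kendall(values):
    """Two-sided Mann-Kendall trend test.

    Returns (s, tau, z, p), where s is the MK statistic, tau is
    s / (n * (n - 1) / 2), z is the continuity-corrected normal
    score, and p is the two-sided p-value.
    """
    x = list(values)
    n = len(x)
    if n < 3:
        raise ValueError("need at least three observations")

    # S: number of increasing pairs minus number of decreasing pairs.
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = x[j] - x[i]
            s += (diff > 0) - (diff < 0)

    # Variance of S with the correction for groups of tied values.
    tie_term = sum(g * (g - 1) * (2 * g + 5) for g in Counter(x).values())
    var_s = (n * (n - 1) * (2 * n + 5) - tie_term) / 18

    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0

    p = math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))
    tau = s / (n * (n - 1) / 2)
    return s, tau, z, p


# Synthetic, strictly decreasing series (not the study data).
s_stat, tau, z_score, p_value = mann_kendall([10, 9, 8, 7, 6, 5, 4, 3, 2, 1])
```

A negative τ with a small p-value indicates a monotonic decrease. The double loop is quadratic in the series length, which is not a concern for a few hundred monthly observations.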
ARIMA modeling
Autoregressive Integrated Moving Average (ARIMA) models were used as a benchmark parametric approach for comparison with SSA results. The tuberculosis incidence series was differenced to achieve stationarity, and model parameters were selected based on standard diagnostic procedures and the Akaike Information Criterion (AIC) [5].
ARIMA modeling was employed primarily to evaluate short-term predictive performance and to contrast parametric modeling assumptions with the decomposition-based insights provided by SSA.
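The differencing and AIC-based order selection described above can be illustrated with a deliberately simplified stand-in. The sketch below differences the series once and fits autoregressive models of increasing order by conditional least squares, comparing them by a Gaussian AIC on a common sample. It omits the moving-average term, so it is not a full ARIMA fit and is not the procedure used in this study; the function name and the simulated input are hypothetical.

```python
import numpy as np


def ar_aic_table(series, max_p=3):
    """Difference once, then fit AR(1) ... AR(max_p) by least squares.

    Returns {p: (coef, aic)}, where coef[0] is the intercept and
    coef[1:] are the AR coefficients. All orders are fitted on the
    same observations so that their AIC values are comparable.
    """
    d = np.diff(np.asarray(series, dtype=float))
    y = d[max_p:]
    if y.size <= max_p + 1:
        raise ValueError("series too short for the requested max_p")

    results = {}
    for p in range(1, max_p + 1):
        # Column for lag k holds d[t - k] aligned with y[t].
        lags = np.column_stack(
            [d[max_p - k : d.size - k] for k in range(1, p + 1)]
        )
        design = np.column_stack([np.ones(y.size), lags])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        resid = y - design @ coef
        sigma2 = resid @ resid / y.size
        # Gaussian log-likelihood evaluated at the fitted parameters.
        loglik = -0.5 * y.size * (np.log(2 * np.pi * sigma2) + 1)
        n_params = p + 2  # intercept, p AR terms, innovation variance
        results[p] = (coef, 2 * n_params - 2 * loglik)
    return results


# Simulated series (not the study data): first differences follow an
# AR(1) process with coefficient 0.6.
rng = np.random.default_rng(0)
shocks = rng.normal(size=2000)
diffs = np.zeros(2000)
for t in range(1, 2000):
    diffs[t] = 0.6 * diffs[t - 1] + shocks[t]
simulated = np.cumsum(diffs)

table = ar_aic_table(simulated, max_p=3)
best_order = min(table, key=lambda p: table[p][1])
```

The order with the lowest AIC is preferred among the candidates. A complete ARIMA(p,d,q) fit additionally estimates moving-average terms by maximum likelihood, which requires a dedicated time-series routine.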
Ethical considerations
This study was based exclusively on publicly available secondary data obtained from open-access governmental databases. As no individual-level or identifiable information was used, ethical approval by a research ethics committee was not required.
Descriptive statistics
Table 1 summarizes the descriptive statistics of the monthly tuberculosis incidence rate. The mean incidence rate was 3.79 cases per 100,000 inhabitants, with values ranging from 2.67 to 4.92 cases per 100,000 inhabitants, indicating moderate variability over the study period.
| Statistic | Value |
| --- | --- |
| Number of observations | 264 |
| Mean | 3.787 |
| Standard deviation | 0.378 |
| Minimum | 2.669 |
| 25th percentile | 3.533 |
| Median | 3.742 |
| 75th percentile | 4.002 |
| Maximum | 4.92 |
Table 1: Descriptive statistics of monthly tuberculosis incidence rate (TI), Brazil (2001–2022).
SSA decomposition
The SSA eigenvalue spectrum (Figure 1) shows that the first component explains approximately 99.5% of the total variance, indicating a dominant trend structure.
The reconstructed trend (Figure 2) reveals a persistent decline in tuberculosis incidence from the early 2000s until the mid-2010s, followed by stabilization and increased variability in recent years. The seasonal component exhibits a weak but consistent annual cycle, suggesting modest seasonal modulation of TB incidence.
Figure 1: SSA eigenvalue spectrum (scree plot) for the first ten components.
The eigenvalue spectrum (Figure 1) reveals a pronounced dominance of the first SSA component, highlighting the importance of long-term structural dynamics in tuberculosis incidence.
Figure 2: SSA decomposition of tuberculosis incidence: original series, trend, and seasonal components.
The SSA decomposition (Figure 2) highlights a clear long-term decreasing trend in tuberculosis incidence, while the seasonal component shows low-amplitude but recurrent annual oscillations, indicating secondary seasonal effects.
Monthly seasonality (boxplot analysis)
Figure 3 presents the monthly boxplot distribution of tuberculosis incidence. Higher median values are observed in March and August, while lower medians occur in February and December. The interquartile ranges indicate moderate month-to-month variability, supporting the presence of weak but persistent seasonality consistent with SSA results.
Figure 3: Monthly boxplot distribution of tuberculosis incidence rate (TI) in Brazil.
Figure 3 highlights modest but consistent seasonal differences in TB incidence, with higher median values in March and August and lower medians in February and December, supporting the presence of weak seasonality observed in the SSA reconstruction.
Mann–Kendall trend test
The Mann–Kendall test confirms a statistically significant decreasing trend in tuberculosis incidence (Table 2), corroborating the long-term decline identified by SSA.
| Statistic | Value |
| --- | --- |
| Kendall’s τ | −0.327 |
| p-value | 2.51 × 10⁻¹⁵ |
Table 2: Mann–Kendall trend test results for monthly tuberculosis incidence rate.
The Mann–Kendall test indicated a statistically significant decreasing trend in tuberculosis incidence over the study period.
ARIMA modeling
An ARIMA(1,1,1) model provided an adequate fit to the differenced series, with AIC = 91.52. While the model showed reasonable short-term predictive capability, it did not explicitly separate trend and seasonal components (Table 3).
| Metric | Value |
| --- | --- |
| ARIMA order | (1,1,1) |
| Akaike Information Criterion (AIC) | 91.52 |
Table 3: ARIMA model specification and performance metrics.
The ARIMA(1,1,1) model provided an adequate statistical fit, although its interpretability was limited compared to SSA.
The present study provides a comprehensive analysis of long-term tuberculosis incidence in Brazil using Singular Spectrum Analysis, complemented by the Mann–Kendall trend test and ARIMA modeling. The results consistently indicate a statistically significant declining trend in tuberculosis incidence over the past two decades, corroborating national and global assessments of TB control progress [6].
Similar declining trends in TB incidence have been reported in other middle-income countries undergoing sustained public health interventions, although the magnitude and temporal stability of these trends vary considerably across regions [2]. The stabilization and increased variability observed in the most recent years of the series may reflect disruptions in healthcare access and case detection associated with the COVID-19 pandemic, as documented in several recent studies [1].
The SSA decomposition revealed a weak but persistent seasonal component, which is consistent with previous findings suggesting seasonal modulation of tuberculosis linked to climatic conditions, population behavior, and diagnostic delays [3]. Although seasonality accounted for a small proportion of total variance, its consistent presence across years underscores the importance of considering intra-annual variability in TB surveillance and planning.
Compared to parametric approaches, SSA demonstrated superior capability in separating long-term trends from short-term fluctuations without requiring stationarity assumptions. While ARIMA models remain valuable for short-term forecasting, their sensitivity to structural breaks and limited epidemiological interpretability have been widely reported in infectious disease applications [2]. The Mann–Kendall test complemented the analysis by providing formal statistical confirmation of the monotonic trend detected through SSA.
Overall, the integrated methodological framework adopted in this study aligns with recent recommendations advocating the combined use of decomposition-based methods and statistical testing to improve the robustness of epidemiological time-series analyses [1].
Singular Spectrum Analysis proved to be a robust and informative tool for analyzing monthly tuberculosis incidence in Brazil. When combined with the Mann–Kendall test and ARIMA modeling, SSA provides a comprehensive framework for trend detection, seasonal analysis, and forecasting. This approach can be readily extended to other infectious diseases and public health indicators.
Citation: de Souza A (2026) Singular Spectrum Analysis of Monthly Tuberculosis Incidence in Brazil: A Comparative Methodological Assessment. J Pulm Med Respir Res 11: 095.
Copyright: © 2026 Amaury de Souza, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.