Case 02 – Colombia (🇨🇴)

Exploring environmental and accessibility drivers of dengue risk across Colombian regions

1 Context

Note

Between 2015 and 2023, Colombia experienced recurrent dengue epidemics influenced by climatic anomalies, urban expansion, and unequal access to healthcare.
High-temperature clusters and increased precipitation have been associated with outbreaks in departments such as Antioquia, Valle del Cauca, and Santander, whereas rural areas face delays in case detection due to travel-time barriers.

1.1 Objective

Visualize how precipitation, temperature, and rural access interact with dengue incidence in Colombia between 2016 – 2024.

2 Functions from land4health()

Variable Function Description
Dengue cases l4h_dengue_cases() Extracts harmonized weekly dengue case data from the OpenDengue Project for a specified date range and country, returning a tidy tibble suitable for spatial or temporal analyses.
Surface temperature l4h_surface_temp() Extracts daytime or nighttime Land Surface Temperature (LST) for a user-defined region and time range using the MODIS MOD11A1.061 product. The function supports summarizing the temperature data over each date using a selected statistic (e.g., mean or median)
Precipitation l4h_terra_climate() Extracts monthly TerraClimate indicators (e.g., precipitation, ) aggregated over a user-defined region using Earth Engine, returning climate-ready covariates.

3 Workflow

3.1 Libraries

We need to verify and load all the necessary packages required for this case. It ensures that dependencies for spatial, environmental, and epidemiological analyses are properly installed and available. If any package is missing, install it manually with: install.packages()

for(p in c("geodata","sf","terra","tidyterra","rmapshaper","mapview","tmap",
  "tidyverse","lubridate","purrr","ggrepel",
  "ggplot2","ggpubr","RColorBrewer", "mapview", "stringi", "patchwork", "cowplot"
)){
  if(!require(p, character.only = TRUE)){
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

library(land4health)
# ee_Authenticate()   # Run once if not authenticated
ee_Initialize()
── rgee 1.1.7 ─────────────────────────────────────── earthengine-api 0.1.317 ── 
 âś” user: not_defined 
 âś” Initializing Google Earth Engine:
 âś” Initializing Google Earth Engine:  DONE!

 âś” Earth Engine account: users/land4health_2 

 âś” Python Path: C:/Users/yomal/AppData/Local/Programs/Python/Python39/python.exe 
──────────────────────────────────────────────────────────────────────────────── 

3.2 Spatial data preparation

We first load and define the administrative boundaries that will serve as the spatial reference for all analyses. The shapefiles are retrieved from the GADM dataset via the geodata package.

colombia_sf <- geodata::gadm(country = "CO", level = 1, path = tempdir()) |> 
  st_as_sf()

colombia_sf <- rmapshaper::ms_simplify(colombia_sf, keep = 0.01)  

rmapshaper::ms_simplify() function simplifies the spatial geometry by removing redundant vertices while maintaining the overall shape and topology of the polygons.

3.3 Dengue cases

We retrieved harmonized weekly dengue case data from the OpenDengue Project, an open-access repository compiling surveillance information reported by national ministries of health across Latin America. Using the l4h_dengue_cases() function from the Land4Health library, we extracted all available records for Colombia between January and December 2021.

dengue_spat <- l4h_dengue_cases(
    from = "2021-01-01",
    to = "2021-12-31",
    data_type = "temporal",
    country = "Colombia",
    cache = TRUE,
    quiet = TRUE,
  )
Code
dengue_week <- dengue_spat %>%
  filter(!is.na(state)) %>%                              
  mutate(week = lubridate::week((as.Date(date_start)))) %>%        
  group_by(week) %>%
  summarise(total_cases = sum(cases, na.rm = TRUE)) %>%  #
  arrange(week)

ggplot(dengue_week, aes(x = week, y = total_cases)) +
  geom_col(fill = "#d73027", color = "white", width = 0.8) + 
  geom_line(aes(y = total_cases), color = "#67000d", linewidth = 0.6, alpha = 0.8) +  
  geom_point(color = "#67000d", size = 1.5, alpha = 0.8) +
  scale_x_continuous(breaks = seq(1, 52, by = 1)) +  #
  labs(
    x = "Epidemiological week",
    y = "Dengue cases (2021)"
  ) +
  theme_classic(base_size = 8) +
  theme(
    plot.title = element_text(face = "bold", size = 15, hjust = 0.5),
    axis.text = element_text(color = "black"),
    axis.line = element_line(color = "gray40"),
    panel.grid.major.y = element_line(color = "gray90", size = 0.3),
    panel.grid.minor = element_blank(),
    axis.ticks = element_line(color = "gray60"),
    plot.background = element_rect(fill = "white", color = NA)
  )

3.4 Land Surface Temperature (LST)

We extracted Land Surface Temperature (LST) data to characterize spatiotemporal variations in surface temperature across Colombia, using the l4h_surface_temp() function. The resulting dataset provides daily temperature estimates (in °C) derived from the MODIS MOD11A1.061 product, enabling the analysis of thermal patterns and their relationship with climatic drivers such as precipitation.

#Temperature
lst_day <- l4h_surface_temp(
  from = "2021-01-01",
  to   = "2021-12-31",
  region = colombia_sf,
  band = "day",
  stat = "mean")

3.5 Precipitation

We extracted daily precipitation (pr) data from the TerraClimate dataset to characterize spatial and temporal rainfall variability across Colombia during 2021. Using the l4h_terra_climate()

#Precipitation
out_pr <- l4h_terra_climate(
  from = "2021-01-01",
  to   = "2021-12-31",
  band = "pr",
  region = colombia_sf,
  stat = "mean",
  scale = 5000
)

This code extracts and summarizes Land Surface Temperature (LST) and precipitation data by administrative region (NAME_1) for a specific month, defined by the variable selected_month (in this case, March). First, the lst_day dataset is processed by converting the date variable into a monthly index using month(), grouping observations by region and month, and calculating the mean surface temperature for each group. The st_union() function merges geometries that belong to the same administrative region, and finally the dataset is filtered to keep only the selected month. The same procedure is then applied to the precipitation dataset (out_pr), where the mean monthly precipitation is computed for each region. As a result, two spatial datasets are obtained—one for temperature and another for precipitation—aggregated at the regional level and aligned temporally to allow direct comparison or visualization with mapping tools such as mapview() or sync().

selected_month <- 3 

lst_monthly <- lst_day %>%
  mutate(month = month(as.Date(date))) %>%
  group_by(NAME_1, month) %>%
  summarise(mean_temp = mean(value, na.rm = TRUE),
            geometry = st_union(geometry),
            .groups = "drop") %>% filter(month == selected_month)

pr_monthly <- out_pr %>%
  mutate(month = month(as.Date(date))) %>%
  group_by(NAME_1, month) %>%
  summarise(mean_pr = mean(value, na.rm = TRUE),
            geometry = st_union(geometry),
            .groups = "drop") %>% filter(month == selected_month)

Map shows a clear thermal gradient across Colombia, with higher temperatures concentrated in the northern and northeastern regions—particularly along the Caribbean coast and parts of the Magdalena River valley—while cooler conditions dominate the southern Amazonian departments (e.g., Caquetá, Putumayo, Amazonas). This spatial pattern aligns with expected latitudinal and altitudinal influences: coastal lowlands and arid valleys exhibit greater solar exposure, whereas forested and high-altitude zones maintain lower surface temperatures.

Code
map_temp <- mapview(
  lst_monthly,
  zcol = "mean_temp",
  layer.name = paste0("LST – (°C)"),
  col.regions = colorRampPalette(c("#2c7bb6", "#abd9e9", "#ffffbf", "#fdae61", "#d7191c"))(5),
  at = pretty(lst_monthly$mean_temp, n = 5),  
  alpha = 0.9
)

map_pr <- mapview(
  pr_monthly,
  zcol = "mean_pr",
  layer.name = paste0("Precipitation (mm)"),
  col.regions = colorRampPalette(c("#8c510a", "#d8b365", "#f6e8c3", "#5ab4ac", "#01665e"))(5),
  alpha = 0.9
)

map_pr | map_temp

In contrast, the southern and southwestern Amazonian regions register the highest rainfall, while northern and central areas—especially along the Caribbean coast and inter-Andean valleys—show drier conditions. This reflects Colombia’s climatic duality, where equatorial convection and Amazonian moisture drive rainfall in the south, while northern lowlands experience more seasonal or arid regimes. Together, these maps highlight a negative correlation between temperature and precipitation: warmer areas tend to be drier, and wetter regions tend to be cooler, a pattern consistent with Colombia’s topography and climatic zonation.

3.6 How dengue cases vary in different conditions?

Code
dengue_month <- dengue_spat %>%
  filter(!is.na(state)) %>%
  mutate(
    month_num  = month(as.Date(date_start)),
    month_name = month(as.Date(date_start), label = TRUE)
  ) %>%
  group_by(state, month_num, month_name) %>%
  summarise(total_cases = sum(cases, na.rm = TRUE), .groups = "drop") %>%
  arrange(state, month_num,month_name)%>%
  mutate(
    state_clean = stri_trans_general(state, "Latin-ASCII") |> toupper()
  )

pr_monthly <- out_pr %>%
  mutate(month = month(as.Date(date))) %>%
  group_by(NAME_1, month) %>%
  summarise(mean_pr = mean(value, na.rm = TRUE),
            geometry = st_union(geometry),
            .groups = "drop")%>%
  mutate(
    NAME_1_clean = stri_trans_general(NAME_1, "Latin-ASCII") |> toupper()  
  )

dengue_pr_joined <- pr_monthly %>%
  left_join(
    dengue_month,
    by = c("NAME_1_clean" = "state_clean", "month" = "month_num")
  )

months_available <- unique(dengue_pr_joined$month_name)

biv_pal <- c(
  "1-1" = "#e8e8e8", "2-1" = "#b5c0da", "3-1" = "#6c83b5",
  "1-2" = "#b8d6be", "2-2" = "#90b2b3", "3-2" = "#567994",
  "1-3" = "#73ae80", "2-3" = "#5a9178", "3-3" = "#2a5a5b"
)

maps <- list()

for (m in months_available) {
  
  df <- dengue_pr_joined %>%
    filter(month_name == m) %>%
    filter(!is.na(mean_pr), !is.na(total_cases))
  
  if (nrow(df) > 5) {
    q_precip_vals <- quantile(df$mean_pr, probs = seq(0, 1, length.out = 4), na.rm = TRUE)
    q_dengue_vals <- quantile(df$total_cases, probs = seq(0, 1, length.out = 4), na.rm = TRUE)
    q_precip_vals <- sort(na.omit(unique(q_precip_vals)))
    q_dengue_vals <- sort(na.omit(unique(q_dengue_vals)))
    if (length(q_precip_vals) < 2 || length(q_dengue_vals) < 2) next
    
    df <- df %>%
      mutate(
        q_precip = findInterval(mean_pr, q_precip_vals, all.inside = TRUE),
        q_dengue = findInterval(total_cases, q_dengue_vals, all.inside = TRUE),
        biv_code = paste0(q_precip, "-", q_dengue)
      )
    
    p <- ggplot(df) +
      geom_sf(aes(fill = biv_code), color = "white", size = 0.2) +
      scale_fill_manual(values = biv_pal, na.value = "gray90") +
      theme_void() +
      labs(title = paste0(m)) +
      theme(
        plot.title = element_text(hjust = 0.5, size = 11, face = "bold"),
        legend.position = "none"
      )
    
    maps[[as.character(m)]] <- p
  }
}

map_faceted <- wrap_plots(maps, ncol = 4, nrow = 3) +
  plot_annotation(
    theme = theme(plot.title = element_text(size = 14, face = "bold"))
  )

n_breaks <- 3
var1_label <- "Precipitation"
var2_label <- "Dengue cases"

legend_df <- expand.grid(v1_class = 1:n_breaks, v2_class = 1:n_breaks) %>%
  mutate(fill = as.vector(t(biv_pal)))

legend_plot <- ggplot(legend_df, aes(x = v1_class, y = v2_class, fill = fill)) +
  geom_tile(color = "white") +
  scale_fill_identity() +
  coord_equal() +
  theme_void() +
  theme(plot.margin = margin(10, 10, 10, 10)) +
  annotate("segment", x = 1, xend = n_breaks + 0.5, y = 0.3, yend = 0.3,
           arrow = arrow(length = unit(0.25, "cm")), color = "black", size = 0.6) +
  annotate("segment", x = 0.3, xend = 0.3, y = 1, yend = n_breaks + 0.5,
           arrow = arrow(length = unit(0.25, "cm")), color = "black", size = 0.6) +
  annotate("text", x = (n_breaks + 1) / 2, y = -0.2,
           label = paste0(var1_label),
           size = 2, fontface = "bold") +
  annotate("text", x = 0, y = (n_breaks + 1) / 2,
           label = paste0(var2_label),
           size = 2, fontface = "bold", angle = 90)

final_plot <- ggdraw() +
  draw_plot(map_faceted, 0, 0, 1, 1) +                       
  draw_plot(legend_plot, 0.73, 0.05, 0.27, 0.27)          

final_plot