Playing With Fire and Priors: Learning the Limits of Bayesian Linear Regression with PyMC

This notebook demonstrates the complete workflow for building a Bayesian simple linear regression model in PyMC that predicts wildfire size solely from wind speed. We formalize the model in statistical notation, run prior predictive simulations to validate our assumptions, and then generate posterior distributions through Markov Chain Monte Carlo (MCMC) sampling. Unfortunately, the analysis reveals that wind speed alone is a weak (quite honestly, a terrible) predictor of fire size. Nonetheless, the work offers practical insights into the importance of model diagnostics and the pitfalls of violating the assumptions of linearity and homoscedasticity when working with real-world data!

Read More

An Intro to PyMC and the Language for Describing Statistical Models

This notebook introduces the language of statistical model notation used to describe Bayesian models, demonstrating how mathematical notation translates directly into executable PyMC code. Using a real-world example of Vancouver Island Coastal Wolves (a recently discovered subspecies known for behaviour distinct from other wolf populations, such as the ability to swim great distances and the significant prevalence of marine organisms in their diet), we build a Bayesian model to estimate gender ratios from limited sample data, emphasizing how posterior distributions reveal the full range of plausible outcomes rather than single point estimates.
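As a rough illustration of how model notation maps onto code, the sketch below takes the model F ~ Binomial(N, p) with p ~ Uniform(0, 1) and approximates the posterior of p on a grid. It uses only the Python standard library as a stand-in for PyMC, and both the sample (7 females among 10 wolves) and the choice of prior are hypothetical, not taken from the notebook.

```python
from math import comb


def grid_posterior(n_female, n_total, grid_size=1001):
    """Grid-approximate the posterior of p for
    F ~ Binomial(N, p),  p ~ Uniform(0, 1)."""
    grid = [i / (grid_size - 1) for i in range(grid_size)]
    # With a uniform prior every candidate p has the same prior weight,
    # so the unnormalised posterior is just the binomial likelihood.
    unnorm = [
        comb(n_total, n_female) * p**n_female * (1 - p) ** (n_total - n_female)
        for p in grid
    ]
    total = sum(unnorm)
    return grid, [w / total for w in unnorm]


def credible_interval(grid, probs, mass=0.9):
    """Equal-tailed interval holding `mass` of the posterior probability."""
    lo_tail = (1 - mass) / 2
    hi_tail = 1 - lo_tail
    cum = 0.0
    lower = upper = None
    for p, w in zip(grid, probs):
        cum += w
        if lower is None and cum >= lo_tail:
            lower = p
        if upper is None and cum >= hi_tail:
            upper = p
    return lower, upper


# Hypothetical sample: 7 females among 10 observed wolves.
grid, probs = grid_posterior(n_female=7, n_total=10)
mean_p = sum(p * w for p, w in zip(grid, probs))
low, high = credible_interval(grid, probs)
print(f"posterior mean of p: {mean_p:.3f}")
print(f"90% credible interval: ({low:.3f}, {high:.3f})")
```

The interval is the point of the exercise: with only ten observations it stays wide, which a single point estimate would hide.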

Read More

Why Most Introductory Examples of Bayesian Statistics Misrepresent It

This notebook introduces the basic idea behind Bayes’ Theorem and highlights some key differences between Bayesian Statistics and the widely taught traditional branch of statistics, commonly known as Frequentist Statistics. Furthermore, we challenge the traditional medical testing example used in countless textbooks to introduce Bayesian Inference, arguing that using fixed constants (also known as point estimates) misrepresents the true nature of Bayesian Statistics. Rather than plugging single values into Bayes’ Theorem, we take a more faithful Bayesian approach by considering probability distributions of possible outcomes, revealing how uncertainty in disease prevalence affects diagnostic accuracy.
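To make the contrast concrete, here is a minimal sketch that computes the probability of disease given a positive test twice: once with a fixed prevalence, and once with prevalence drawn from a distribution. Every number in it (95% sensitivity and specificity, 1% prevalence, a Beta(2, 198) prior) is a hypothetical chosen for illustration and is not taken from the notebook.

```python
import random


def ppv(prevalence, sensitivity, specificity):
    """P(disease | positive test) from Bayes' Theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)


# Hypothetical test characteristics.
SENSITIVITY = 0.95
SPECIFICITY = 0.95

# Textbook version: prevalence is a single fixed number.
point_estimate = ppv(0.01, SENSITIVITY, SPECIFICITY)

# Distribution version: prevalence is uncertain. Beta(2, 198) has mean 0.01,
# so it is centred on the same value but admits other plausible prevalences.
rng = random.Random(42)
draws = sorted(
    ppv(rng.betavariate(2, 198), SENSITIVITY, SPECIFICITY)
    for _ in range(10_000)
)
low, median, high = draws[500], draws[5000], draws[9500]

print(f"point estimate:        {point_estimate:.3f}")
print(f"median over prior:     {median:.3f}")
print(f"central 90% of draws:  ({low:.3f}, {high:.3f})")
```

The single number from the textbook calculation sits inside a much wider range once prevalence is allowed to vary, which is the uncertainty the fixed-constant version leaves out.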

Read More

How to Augment Wildfire Datasets with Historical Weather Data using Python and Google Earth Engine

This notebook demonstrates how to augment wildfire datasets with historical weather data using Python and Google Earth Engine’s ERA5 dataset. We transform basic fire records (coordinates and timestamps) into enriched datasets containing comprehensive environmental context, including temperature, wind patterns, humidity, and soil conditions, all of which are critical for fire risk modeling and analysis. The workflow includes automatic retry logic for handling API timeouts and incremental batch saving to prevent data loss during long processing runs.
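The retry and batch-saving pattern can be sketched roughly as follows, using only the standard library. The helper names (`fetch_with_retry`, `enrich_in_batches`), the use of `TimeoutError`, the CSV output, and the fake weather lookup are all assumptions made for illustration; the notebook's actual Earth Engine calls and error types may differ.

```python
import csv
import os
import tempfile
import time


def fetch_with_retry(fetch, record, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(record), retrying on TimeoutError with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(record)
        except TimeoutError:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the failure
            sleep(base_delay * 2 ** (attempt - 1))


def enrich_in_batches(records, fetch, out_path, fieldnames, batch_size=50, **retry_kwargs):
    """Enrich records and append each completed batch to a CSV file."""
    with open(out_path, "w", newline="") as f:
        csv.DictWriter(f, fieldnames=fieldnames).writeheader()
    saved = 0
    for start in range(0, len(records), batch_size):
        batch = [
            {**rec, **fetch_with_retry(fetch, rec, **retry_kwargs)}
            for rec in records[start:start + batch_size]
        ]
        # Reopen, append, and close per batch, so a failure partway through
        # a long run leaves every previously completed batch on disk.
        with open(out_path, "a", newline="") as f:
            csv.DictWriter(f, fieldnames=fieldnames).writerows(batch)
        saved += len(batch)
    return saved


# Demo with a hypothetical stand-in for the weather lookup: it times out
# on the first call for each record and succeeds on the second.
calls = {}


def flaky_weather_lookup(record):
    calls[record["fire_id"]] = calls.get(record["fire_id"], 0) + 1
    if calls[record["fire_id"]] == 1:
        raise TimeoutError("simulated API timeout")
    return {"temperature_c": 20.0 + record["fire_id"]}


fires = [{"fire_id": i, "lat": 49.0, "lon": -123.0} for i in range(5)]
out_path = os.path.join(tempfile.mkdtemp(), "fires_enriched.csv")
n_saved = enrich_in_batches(
    fires,
    flaky_weather_lookup,
    out_path,
    fieldnames=["fire_id", "lat", "lon", "temperature_c"],
    batch_size=2,
    sleep=lambda seconds: None,  # skip real waiting in the demo
)
print(f"saved {n_saved} enriched records to {out_path}")
```

Passing `sleep` as a parameter keeps the backoff testable without real delays; in a long run you would leave it at the default.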

Read More

You're up and running!

Next, you can update your site name, avatar, and other options using the _config.yml file in the root of your repository.

Read More