ocha-stratus Documentation¶
Contents
Overview¶
ocha-stratus is a Python package developed by the Data Science team at the Centre for Humanitarian Data for basic data access and storage operations against internally-managed Azure cloud infrastructure. It provides utilities for:
Reading and writing various data formats to Azure Blob Storage:
Parquet files
CSV files
Shapefiles
Cloud Optimized GeoTIFFs (COGs)
Managing Azure PostgreSQL database connections and operations
Supporting development and production environments
Quick Start¶
Install the package:
pip install ocha-stratus
Azure Blob Storage¶
import ocha_stratus as stratus
# Upload a pandas DataFrame as CSV
stratus.upload_csv_to_blob(df, "data.csv", stage="dev")
# Load it back
df = stratus.load_csv_from_blob("data.csv", stage="dev")
Azure PostgreSQL Database¶
import ocha_stratus as stratus
# Get database connection
engine = stratus.get_engine(stage="dev")
# Perform upsert operation
stratus.postgres_upsert(table, conn, keys, data_iter)
Load COGs and clip by a GeoDataFrame¶
import ocha_stratus as stratus
gdf = stratus.load_codab_from_blob(
iso3="NGA",
admin_level=0
)
date_range = ["2024-01-01", "2024-02-01", "2024-03-01"]
ds = stratus.stack_cogs("era5", date_range, "dev", clip_gdf=gdf)
Environment Configuration¶
This package depends on the following environment variables:
# Development Environment
DSCI_AZ_BLOB_DEV_SAS=your_dev_sas_token
DSCI_AZ_DB_DEV_PW=your_dev_db_password
DSCI_AZ_DB_DEV_UID=your_dev_db_uid
DSCI_AZ_BLOB_DEV_SAS_WRITE=your_dev_sas_token_w_write_permissions
DSCI_AZ_DB_DEV_PW_WRITE=your_dev_db_password_w_write_permissions
DSCI_AZ_DB_DEV_UID_WRITE=your_dev_db_uid_w_write_permissions
DSCI_AZ_DB_DEV_HOST=your_dev_db_host
# Production Environment
DSCI_AZ_BLOB_PROD_SAS=your_prod_sas_token
DSCI_AZ_DB_PROD_PW=your_prod_db_password
DSCI_AZ_DB_PROD_UID=your_prod_db_uid
DSCI_AZ_BLOB_PROD_SAS_WRITE=your_prod_sas_token_w_write_permissions
DSCI_AZ_DB_PROD_PW_WRITE=your_prod_db_password_w_write_permissions
DSCI_AZ_DB_PROD_UID_WRITE=your_prod_db_uid_w_write_permissions
DSCI_AZ_DB_PROD_HOST=your_prod_db_host