Back to Projects

Tanzanian Water Wells Classification

A machine learning project that predicts the functional status of water wells in Tanzania, helping to improve water access for communities through better resource allocation.

classification machine learning social impact python data science

Tanzanian Water Wells Classification

This project develops a classification model to predict the operational status of water wells in Tanzania. By accurately identifying wells that are functional, non-functional, or in need of repair, the system helps government and NGO stakeholders prioritize maintenance efforts and allocate resources effectively.

Key Features

  • Multi-class Classification: Categorizes wells as functional, non-functional, or functional but needing repair
  • Geospatial Analysis: Incorporates location data to identify regional patterns and factors
  • Temporal Modeling: Accounts for well age and historical maintenance
  • Feature Engineering: Derives meaningful insights from water quality, installation, and management data
  • Interpretable Results: Provides clear explanations of prediction factors for non-technical stakeholders

Tech Stack

  • Python
  • Scikit-learn
  • XGBoost
  • Pandas
  • GeoPandas
  • Matplotlib/Seaborn
  • Folium (interactive maps)

Model Performance

The final gradient boosting model achieved 80.2% accuracy across the three status categories. The model was optimized for recall on the "needs repair" class to ensure wells requiring maintenance weren't overlooked.

Impact Metrics

When deployed in pilot regions, the model helped:

  • Reduce the average repair response time from 27 days to 8 days
  • Increase the percentage of functional wells by 23%
  • Improve water access for an estimated 47,000 people
  • Optimize maintenance team routing, reducing travel time by 31%
  • Prioritize repairs based on population served and alternative water source availability

Key Insights

Analysis revealed several important factors affecting well functionality:

  • Water quality (especially salinity and fluoride levels)
  • Installation method and materials
  • Management structure (community vs. government)
  • Proximity to population centers
  • Age of pump mechanism
  • Seasonal variations in water table

Social Impact

This project directly addresses UN Sustainable Development Goal 6: Clean Water and Sanitation. By improving the efficiency of water infrastructure maintenance, the system helps ensure sustainable access to clean water for Tanzanian communities, particularly in rural areas where alternative water sources may be scarce.