CC-BY-4.0
Open Water Quality Dataset
Free U.S. water quality data by ZIP code — violations, lead/copper levels, radon zones, and Home Safety Scores. Updated daily from EPA SDWIS.
- ZIP codes: 1,990
- Total violations recorded: 53,959
- States + DC covered: 51
- Last updated: 2026-03-16
Dataset Fields
One row per ZIP code. Numeric fields are null when data is unavailable for that ZIP.
| Field | Description |
|---|---|
| zip | 5-digit ZIP code |
| city | City name |
| state | 2-letter state abbreviation |
| system_name | Primary water system name |
| pwsid | EPA Public Water System ID |
| population | Population served by primary system |
| water_source | SW = Surface Water, GW = Groundwater |
| total_violations | Total violations in past 5 years |
| health_violations | Health-based violations in past 5 years |
| unresolved_violations | Currently unresolved violations |
| lead_level_mg_l | 90th percentile lead level (mg/L), null if no data |
| copper_level_mg_l | 90th percentile copper level (mg/L), null if no data |
| radon_zone | EPA radon zone (1 = highest risk, 3 = lowest), null if no data |
| home_safety_score | Composite score 0–100, null if insufficient data |
| home_safety_grade | A/B/C/D/F letter grade |
| latitude | ZIP centroid latitude |
| longitude | ZIP centroid longitude |
| contaminant_count | Number of distinct health-based contaminants |
| health_contaminant_names | Semicolon-separated list of health-based contaminant names |
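The field notes above suggest two things to handle when loading the file: ZIP codes should stay as text, and `health_contaminant_names` needs splitting before it can be analyzed. The following is a minimal sketch of one way to do that with pandas. The two sample rows are made up for illustration and are not taken from the dataset, and `load_water_quality` and `contaminant_list` are hypothetical names, not part of any published API.

```python
import io

import pandas as pd

# Two made-up rows that follow the documented schema (NOT real data).
SAMPLE_CSV = """zip,city,state,total_violations,health_violations,lead_level_mg_l,health_contaminant_names
02134,Sampletown,MA,3,1,0.004,Nitrate;Total Coliform
90001,Exampleville,CA,0,0,,
"""


def load_water_quality(source):
    """Read the CSV, keeping ZIP codes as strings so leading zeros survive."""
    df = pd.read_csv(source, dtype={"zip": str})
    # Split the semicolon-separated names into lists; empty cells become [].
    df["contaminant_list"] = (
        df["health_contaminant_names"]
        .fillna("")
        .apply(lambda s: [name.strip() for name in s.split(";") if name.strip()])
    )
    return df


df = load_water_quality(io.StringIO(SAMPLE_CSV))
print(df[["zip", "lead_level_mg_l", "contaminant_list"]])
```

The same function should accept the CSV URL in place of the `io.StringIO` object, assuming the published file uses the column names listed in the table.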
Usage Examples
Python (pandas)
import pandas as pd
# Read ZIP codes as strings so any leading zeros (e.g. 02134) are preserved
df = pd.read_csv('https://zipcheckup.com/data/open/zipcheckup-water-quality.csv', dtype={'zip': str})
print(df.head())
# Filter by state
ca_zips = df[df['state'] == 'CA']
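Beyond filtering, a per-state rollup is a common next step. The sketch below shows one possible approach using `groupby` with named aggregations. The rows are made up for illustration (not real data), and the output column names such as `zips_with_lead_data` are hypothetical choices rather than fields in the dataset.

```python
import pandas as pd

# Made-up rows following the documented schema (NOT real data).
df = pd.DataFrame({
    "zip": ["02134", "02135", "90001"],
    "state": ["MA", "MA", "CA"],
    "health_violations": [1, 0, 2],
    "lead_level_mg_l": [0.004, None, 0.020],  # None models a null lead value
})

# One row per state. "count" skips nulls, so zips_with_lead_data only
# counts ZIPs that actually report a lead level.
summary = df.groupby("state").agg(
    zip_count=("zip", "count"),
    health_violations=("health_violations", "sum"),
    zips_with_lead_data=("lead_level_mg_l", "count"),
    max_lead_mg_l=("lead_level_mg_l", "max"),
)
print(summary)
```

Counting non-null lead values separately from the ZIP count helps avoid reading a missing measurement as a clean result.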
R
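# Read ZIP codes as character so any leading zeros are preserved
df <- read.csv('https://zipcheckup.com/data/open/zipcheckup-water-quality.csv', colClasses = c(zip = 'character'))
head(df)
# Filter ZIPs with health violations; which() drops rows where the value is NA
risky <- df[which(df$health_violations > 0), ]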
JavaScript (browser / Node.js)
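// Requires a runtime with a global fetch (browsers, Node.js 18+)
fetch('https://zipcheckup.com/data/open/zipcheckup-water-quality.json')
  .then(r => r.json())
  .then(data => {
    // A null lead level compares as false here, so ZIPs without data are excluded
    const withLead = data.filter(z => z.lead_level_mg_l > 0.005);
    console.log(`ZIPs with lead above 0.005 mg/L: ${withLead.length}`);
  })
  .catch(err => console.error('Failed to load dataset:', err));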
License: CC-BY-4.0
This dataset is released under the Creative Commons Attribution 4.0 International license. You are free to share and adapt the data for any purpose, including commercial use, as long as you provide attribution to ZipCheckup.com.
Required attribution: "Data sourced from ZipCheckup.com, based on EPA SDWIS data."
Also Available On
- HuggingFace Datasets (coming soon)
- Kaggle (coming soon)
- Data.world (coming soon)
Data Source
Water quality data is sourced from the U.S. EPA Safe Drinking Water Information System (SDWIS). Radon zone data comes from the EPA's county-level radon potential map. Home Safety Scores are a ZipCheckup composite metric; the full methodology is documented separately.
Disclaimer: This dataset reflects reported EPA SDWIS records and is not an assurance of current water safety. Water quality can change. For the most current information, contact your local water utility or request a Consumer Confidence Report.