Data Sources

Louis Roehrs edited this page Feb 14, 2023 · 3 revisions

The IRS publishes lists of exempt organizations, including 501(c)(3)s, by region in its Exempt Organizations Business Master File Extract (EO BMF):

https://www.irs.gov/charities-non-profits/exempt-organizations-business-master-file-extract-eo-bmf

A CSV for the Pacific Region is here: https://www.irs.gov/pub/irs-soi/eo3.csv

The data dictionary is here: https://www.irs.gov/pub/irs-soi/eo_info.pdf

EIN,NAME,ICO,STREET,CITY,STATE,ZIP,GROUP,SUBSECTION,AFFILIATION,CLASSIFICATION,RULING,DEDUCTIBILITY,FOUNDATION,ACTIVITY,ORGANIZATION,STATUS,TAX_PERIOD,ASSET_CD,INCOME_CD,FILING_REQ_CD,PF_FILING_REQ_CD,ACCT_PD,ASSET_AMT,INCOME_AMT,REVENUE_AMT,NTEE_CD,SORT_NAME

We care about EIN, NAME, STREET, CITY, STATE, ZIP, STATUS, DEDUCTIBILITY, NTEE_CD, and ACTIVITY. Any others?
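A minimal sketch of trimming each row down to the columns listed above, using only the Python standard library. The function name `read_wanted_columns` and the sample row are made up for illustration; the sample is not real IRS data.

```python
import csv
import io

# Columns named above as the ones we care about.
WANTED = ["EIN", "NAME", "STREET", "CITY", "STATE", "ZIP",
          "STATUS", "DEDUCTIBILITY", "NTEE_CD", "ACTIVITY"]


def read_wanted_columns(csv_file):
    """Yield one dict per row, restricted to the WANTED columns.

    Columns missing from the input come back as empty strings.
    """
    reader = csv.DictReader(csv_file)
    for row in reader:
        yield {col: (row.get(col) or "").strip() for col in WANTED}


# Made-up sample with a subset of the real header; NOT real IRS data.
SAMPLE = io.StringIO(
    "EIN,NAME,ICO,STREET,CITY,STATE,ZIP,STATUS,DEDUCTIBILITY,NTEE_CD,ACTIVITY\n"
    "000000001,EXAMPLE ORG,% J DOE,1 MAIN ST,SAN FRANCISCO,CA,94103-0000,01,1,B20,000000000\n"
)

rows = list(read_wanted_columns(SAMPLE))
```

For the real file, something like `open("eo3.csv", newline="")` could be passed in place of `SAMPLE`, assuming the CSV has been downloaded locally under that name.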

For categorization, we will need to map ACTIVITY codes (assigned to non-profits before 1995) to NTEE codes (used for newer non-profits). The goal of the mapping is to help users find non-profits they care about.
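One possible shape for that mapping, as a sketch only. It assumes the ACTIVITY field is three 3-digit codes concatenated into one 9-digit string (this should be checked against the data dictionary), and that the first letter of NTEE_CD is the major category. The crosswalk table `ACTIVITY_TO_NTEE_MAJOR` is hypothetical and its two entries are placeholders, not a vetted mapping.

```python
# Hypothetical crosswalk from 3-digit ACTIVITY codes to NTEE major-group
# letters. Placeholder entries for illustration; verify before use.
ACTIVITY_TO_NTEE_MAJOR = {
    "030": "B",  # placeholder: school-type activity -> Education
    "001": "X",  # placeholder: church-type activity -> Religion
}


def split_activity(activity):
    """Split ACTIVITY into 3-digit codes, dropping '000' fillers.

    Assumes the field is three 3-digit codes in one 9-digit string.
    """
    digits = activity.strip().zfill(9)
    codes = [digits[i:i + 3] for i in range(0, 9, 3)]
    return [code for code in codes if code != "000"]


def category_for(row):
    """Return an NTEE major-group letter for a row, or None if unknown.

    Prefers NTEE_CD when present; otherwise falls back to the first
    ACTIVITY code found in the crosswalk.
    """
    ntee = (row.get("NTEE_CD") or "").strip()
    if ntee:
        return ntee[0].upper()
    for code in split_activity(row.get("ACTIVITY") or ""):
        if code in ACTIVITY_TO_NTEE_MAJOR:
            return ACTIVITY_TO_NTEE_MAJOR[code]
    return None
```

Returning `None` for unmapped rows keeps them visible, so we can see how many older organizations the crosswalk still misses.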

We could use this list as a starting point for scraping. First we would filter it to the SF Bay Area, to certain categories, and to STATUS == 1. (Which categories?)
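The filter could look roughly like this. Everything here is an assumption to be revisited: the file has no county column, so the Bay Area is approximated by a hand-picked set of 3-digit ZIP prefixes (`BAY_AREA_ZIP3`), the category set `WANTED_NTEE_MAJOR` is a placeholder until we decide which categories we want, and STATUS is compared as an integer in case the CSV stores it zero-padded.

```python
# Rough, hand-picked ZIP prefixes standing in for "SF Bay Area".
# An assumption for illustration, not an official definition.
BAY_AREA_ZIP3 = {"940", "941", "943", "944", "945", "946",
                 "947", "948", "949", "950", "951", "954"}

# Placeholder NTEE major groups; the real list is still undecided.
WANTED_NTEE_MAJOR = {"B", "P"}


def is_active(row):
    """True when STATUS is 1, tolerating zero-padding such as '01'."""
    status = (row.get("STATUS") or "").strip()
    return status.isdigit() and int(status) == 1


def in_bay_area(row):
    """True for California rows whose ZIP starts with a listed prefix."""
    state = (row.get("STATE") or "").strip().upper()
    zip_code = (row.get("ZIP") or "").strip()
    return state == "CA" and zip_code[:3] in BAY_AREA_ZIP3


def keep(row):
    """Apply all three filters: status, region, and category."""
    ntee = (row.get("NTEE_CD") or "").strip().upper()
    return is_active(row) and in_bay_area(row) and ntee[:1] in WANTED_NTEE_MAJOR
```

Note that this version drops rows with a blank NTEE_CD, so older organizations that only carry ACTIVITY codes would need the mapping step before filtering by category.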
