@alexrichey
Contributor

@alexrichey alexrichey commented Dec 8, 2025

Adds the following to PLUTO

  • Additional FAR values
  • MIH Areas for Lots
  • Transit Zones for Lots (two addl fields)

Per Matt, all new fields should be at the very end of the file. Explanations are in the commit messages, or comments in the sql. Logic for assignments for MIH Areas and Transit Zones is at the top of those SQL files.

I'm running a build now, and will post a link to the output when it finishes. I'll hold off from merging these changes until confirming logic with GIS and checking that their scripts can gracefully handle the changes.

How to review? Commits are atomic, and there aren't that many changes. The additional FAR values are straightforward (just continuing an existing pattern) and for MIH Area and Transit Zone there are maps below to visualize the logic and edge cases.

TODO

  • MIH: AR check with Zoning about a threshold.
  • Transit Zone:
    • Determine the severity of a mis-assignment
    • Determine how / Implement something to hold the tz steady.
    • Double check pct logic with Zoning

MIH Areas

The questionable splits for MIH Areas can be found in the qaqc_int__mihareas_questionable_assignments view in my build schema in postgres (ar_pluto_new_fields_1893)

Some Questionable Splits

Overall, only 7 lots are split between multiple MIH Areas. Below are some odd cases.

The following BBLs are split fairly evenly between two MIH Areas.

BBL: 3037050014
The southern MIH area covers 50.9% of the BBL. However, you can see that the property is in the northern MIH area.
This one might actually be wrong.
image


BBL: 1020130029
This lot is split between an "Option 1 and 2" area and an "Option 1" area for One45 Harlem for All. However, I don't think this one matters: I think these lots are going to be demolished and redrawn into the MIH Areas as part of the One45 Harlem for All rezoning.
image


BBL: 2042260001
This is just sort of an odd case. This is PELHAM PARKWAY, and not that it matters terribly, but it probably shouldn't be in either MIH Area. However, it meets the threshold for both, with the slight winner being 1776 Eastchester Road - Option 2, which covers 21% of the lot.
image
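
A minimal sketch of the assignment rule as described above: keep the areas that cover at least a threshold share of the lot, rank them by coverage, and take the top one. A lot with more than one qualifying area is a "questionable" split. The function names, the 10% default threshold, and the 49.1% figure for the northern area are illustrative assumptions, not values taken from the build.

```python
from typing import Dict, List, Optional, Tuple


def rank_areas(pct_by_area: Dict[str, float], min_pct: float = 10.0) -> List[Tuple[str, float]]:
    """Return (area, pct) pairs meeting the threshold, highest coverage first."""
    qualifying = [(area, pct) for area, pct in pct_by_area.items() if pct >= min_pct]
    # Sort by coverage descending; break exact ties by name so the result is stable.
    return sorted(qualifying, key=lambda item: (-item[1], item[0]))


def assign_area(pct_by_area: Dict[str, float], min_pct: float = 10.0) -> Optional[str]:
    """Pick the qualifying area with the largest coverage, or None if none qualify."""
    ranked = rank_areas(pct_by_area, min_pct)
    return ranked[0][0] if ranked else None


def is_questionable(pct_by_area: Dict[str, float], min_pct: float = 10.0) -> bool:
    """A lot is questionable when more than one area meets the threshold."""
    return len(rank_areas(pct_by_area, min_pct)) > 1


# Illustrative input for BBL 3037050014: only the 50.9% figure comes from the PR;
# the northern share is an assumed value for the example.
lot_3037050014 = {"south": 50.9, "north": 49.1}
assigned = assign_area(lot_3037050014)
flagged = is_questionable(lot_3037050014)
```

With these inputs the lot goes to the southern area and is flagged, which is the behavior described for 3037050014 and the reason it looks like a candidate for a manual correction.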

Transit Zones

Basically all lots have a transit zone, except for ~2k that are either in the water or are large parks. (e.g. Central Park doesn't have a transit zone, understandably)

The questionable splits for Transit Zones can be found in the qaqc_int__transit_zones_questionable_assignments table. There are ~800 lots split between multiple zones.

Questionable Splits (that probably don't matter)

A walkway in Staten Island:
image

A Park in Queens
image

Questionable Splits (that might have implications)

There are a number of cases where the Transit Zones lines were just drawn poorly, and there's not much we can do about it.

BBL = 4125920057

This lot is split 47% left zone / 53% right zone.
image

And if you zoom out and look at its block, that seems to indicate that our choice is correct.
image

BBL = 4036620026

One more for good measure.
image

We assign cases like this correctly, but there are certainly other cases where we don't. We could try to get clever and, where a lot is ambiguous, give preference to the zone that the rest of its block falls in. We could also introduce manual corrections.
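
A rough sketch of what a block-based preference could look like, not something implemented in this PR. It relies on the BBL layout (1-digit borough, 5-digit block, 4-digit lot), so the first six characters identify the block. The function names, the zone labels, and the neighboring BBLs in the example are hypothetical.

```python
from collections import Counter
from typing import Dict, List


def block_key(bbl: str) -> str:
    """Borough digit plus 5-digit block, i.e. the first six characters of a BBL."""
    return bbl[:6]


def resolve_with_block(
    ambiguous: Dict[str, List[str]],
    settled: Dict[str, str],
) -> Dict[str, str]:
    """Resolve ambiguous lots using the zones of unambiguous lots on the same block.

    ambiguous: bbl -> candidate zones, ordered by coverage (largest first)
    settled:   bbl -> zone, for lots with a single qualifying zone
    """
    votes: Dict[str, Counter] = {}
    for bbl, zone in settled.items():
        votes.setdefault(block_key(bbl), Counter())[zone] += 1

    resolved: Dict[str, str] = {}
    for bbl, candidates in ambiguous.items():
        block_votes = votes.get(block_key(bbl), Counter())
        # max() returns the first maximal element, so when the block gives no
        # signal (or a tie) the candidate with the larger coverage still wins.
        resolved[bbl] = max(candidates, key=lambda zone: block_votes[zone])
    return resolved


# Hypothetical neighbors on the same block as 4125920057 (47% left / 53% right).
settled_lots = {"4125920001": "right", "4125920002": "right", "4125920003": "left"}
result = resolve_with_block({"4125920057": ["right", "left"]}, settled_lots)
```

Falling back to coverage order when the block has no settled lots keeps the current behavior as the default, so a rule like this would only change assignments where the block actually disagrees with the coverage ranking.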

@alexrichey alexrichey changed the title from "Ar pluto new fields 1893" to "PLUTO CoY Fields" Dec 8, 2025
@codecov

codecov bot commented Dec 8, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 74.68%. Comparing base (bc8faa2) to head (66a0007).
⚠️ Report is 17 commits behind head on main.

@alexrichey alexrichey force-pushed the ar-pluto-new-fields-1893 branch 3 times, most recently from db078c1 to 7de5489 Compare December 8, 2025 22:43
@alexrichey alexrichey linked an issue Dec 10, 2025 that may be closed by this pull request
@alexrichey alexrichey force-pushed the ar-pluto-new-fields-1893 branch 6 times, most recently from 3079a14 to b9d6271 Compare December 17, 2025 22:20
@alexrichey alexrichey force-pushed the ar-pluto-new-fields-1893 branch 9 times, most recently from 25f51b0 to 20adcf1 Compare January 6, 2026 20:37
@alexrichey alexrichey force-pushed the ar-pluto-new-fields-1893 branch 2 times, most recently from 6580133 to 609b07d Compare January 7, 2026 16:54
This was previously pulling in Inclusionary Designated Housing
@alexrichey alexrichey force-pushed the ar-pluto-new-fields-1893 branch 3 times, most recently from 90a1600 to 501b2a3 Compare January 7, 2026 20:17
@alexrichey alexrichey force-pushed the ar-pluto-new-fields-1893 branch from 501b2a3 to 2939f8b Compare January 7, 2026 20:17
@alexrichey alexrichey marked this pull request as ready for review January 7, 2026 20:20
@fvankrieken
Contributor

I feel like for some of these cases, we either just leave or manually correct (using the same corrections framework as other corrections), and then if anyone ever flags something just add corrections.

Like 3037050014 seems like a prime case for correcting.

Maybe we create a little csv/shp export of "questionable" bbls as part of the build? I feel like this is maybe more of a Matt question, or could kick it to Transportation/Zoning/Sam and see if anyone wants to make tweaks to the official boundaries
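
One possible shape for such an export, as a sketch only: write the questionable assignments to a CSV with one row per lot and candidate area. The column names and the function name are assumptions. In the build the rows would presumably come from the qaqc views mentioned in the description rather than from a hard-coded list.

```python
import csv
import io
from typing import Dict, Iterable

# Hypothetical column layout for the export.
FIELDS = ["bbl", "area", "pct_covered"]


def questionable_to_csv(rows: Iterable[Dict[str, object]]) -> str:
    """Serialize questionable assignments to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)
    return buffer.getvalue()


# Example row using the Pelham Parkway lot and the 21% figure from the description.
csv_text = questionable_to_csv(
    [{"bbl": "2042260001", "area": "1776 Eastchester Road - Option 2", "pct_covered": 21}]
)
```

Keeping the BBL as a string avoids it being reinterpreted as a number by whatever reads the file.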

@alexrichey alexrichey force-pushed the ar-pluto-new-fields-1893 branch from 2939f8b to e4ab9bd Compare January 8, 2026 16:17
mnffar = b.mnffar,
affresfar = b.affresfar
FROM dcp_zoning_maxfar AS b
WHERE a.zonedist1 = b.zonedist;
Contributor

So yeah looks like this happens before any splitting

Contributor Author

Phew! Seems like we should just drop every subsequent UPDATE statement, and fill in our table with some dashes. Or, as Sam pointed out... why not just use zeros instead of dashes 🤷‍♂️

Contributor

Zeroes also make sense to me lol. The dashes force the csv type to text, for better and for worse
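
To illustrate the point about dashes, here is a simplified model of how a CSV reader might infer a column's type. It is not the behavior of any specific tool, and the function name and sample values are made up for the example.

```python
from typing import Iterable


def infer_column_type(values: Iterable[str]) -> str:
    """Return "numeric" if every value parses as a float, otherwise "text"."""
    for value in values:
        try:
            float(value)
        except ValueError:
            # A single non-numeric placeholder such as "-" makes the column text.
            return "text"
    return "numeric"


# Hypothetical FAR column values, once with a dash placeholder and once with a zero.
with_dashes = infer_column_type(["6.5", "2.0", "-"])
with_zeros = infer_column_type(["6.5", "2.0", "0"])
```

Under this model a dash anywhere in the column makes the whole column text, while zeros keep it numeric. The cost is that a zero no longer distinguishes "no value" from a real FAR of 0.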

@alexrichey alexrichey force-pushed the ar-pluto-new-fields-1893 branch from 4c69799 to 84c034b Compare January 8, 2026 16:28
) AS row_number
FROM zone_totals
WHERE total_pct_covered >= 10;
ANALYZE transit_zones_bbl_to_tz_ranked;
Contributor

Did you test if this has any effect? Or is this a bit of a carry-over from other similar scripts? I just would be surprised if this has much effect on the update

Contributor Author

@alexrichey alexrichey Jan 8, 2026

It actually does! The subsequent UPDATE against PLUTO was taking about 1 min 45 s without the ANALYZE, and about a minute with it. Apparently pg's autovacuum daemon (which is what runs ANALYZE automatically) doesn't kick off immediately after a table is created, so even with indexes, pg doesn't have enough stats to plan the query intelligently.

I think that UPDATE statements against large tables are very very slow, regardless of indexing and analyzing. Wouldn't be surprised if the immutable DBT approach sped things up significantly.

Contributor Author

Though I thought I'd put an index on BBL for this table... actually, IIRC I tried adding it, and it didn't really help

Contributor

Interesting. Yeah just joining a bunch of tables with sensible indexes will certainly be at least a tad better than this approach

Contributor

> Though I thought I'd put an index on BBL for this table... actually, IIRC I tried adding it, and it didn't really help

Makes sense, it's already using the pluto index. It has to go through all records of one of the tables regardless. Plus it takes time to create that index in the first place.

@alexrichey
Contributor Author

> I feel like for some of these cases, we either just leave or manually correct (using the same corrections framework as other corrections), and then if anyone ever flags something just add corrections.
>
> Like 3037050014 seems like a prime case for correcting.
>
> Maybe we create a little csv/shp export of "questionable" bbls as part of the build? I feel like this is maybe more of a Matt question, or could kick it to Transportation/Zoning/Sam and see if anyone wants to make tweaks to the official boundaries

@fvankrieken totally agree - I pinged GIS to start the conversation about what type of QA outputs they want. I'd imagined just pointing them at tables in our build schema, but csvs/shapefiles might be ideal. Will probably add something after talking to them.

@damonmcc damonmcc linked an issue Jan 12, 2026 that may be closed by this pull request
@damonmcc damonmcc removed a link to an issue Jan 12, 2026

Development

Successfully merging this pull request may close these issues.

add CoY PLUTO columns

5 participants