-
Notifications
You must be signed in to change notification settings - Fork 3
PLUTO CoY Fields #2108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
PLUTO CoY Fields #2108
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
db078c1 to
7de5489
Compare
3079a14 to
b9d6271
Compare
25f51b0 to
20adcf1
Compare
6580133 to
609b07d
Compare
This was previously pulling in Inclusionary Designated Housing
90a1600 to
501b2a3
Compare
501b2a3 to
2939f8b
Compare
|
I feel like for some of these cases, we either just leave or manually correct (using the same corrections framework as other corrections), and then if anyone ever flags something just add corrections. Like 3037050014 seems like a prime case for correcting. Maybe we create a little csv/shp export of "questionable" bbls as part of the build? I feel like this is maybe more of a Matt question, or could kick it to Transportation/Zoning/Sam and see if anyone wants to make tweaks to the official boundaries |
products/pluto/models/qaqc/intermediate/qaqc_int__mihareas_questionable_assignments.sql
Show resolved
Hide resolved
2939f8b to
e4ab9bd
Compare
products/pluto/models/qaqc/intermediate/qaqc_int__mihareas_questionable_assignments.sql
Outdated
Show resolved
Hide resolved
products/pluto/models/qaqc/intermediate/qaqc_int__transit_zones_questionable_assignments.sql
Show resolved
Hide resolved
| mnffar = b.mnffar, | ||
| affresfar = b.affresfar | ||
| FROM dcp_zoning_maxfar AS b | ||
| WHERE a.zonedist1 = b.zonedist; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So yeah looks like this happens before any splitting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Phew! Seems like we should just drop every subsequent UDPATE statement, and fill in our table with some dashes. Or, as Sam pointed out... why not just use zeros instead of dashes 🤷♂️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Zeroes also make sense to me lol. The dashes force the csv type to text, for better and for worse
4c69799 to
84c034b
Compare
| ) AS row_number | ||
| FROM zone_totals | ||
| WHERE total_pct_covered >= 10; | ||
| ANALYZE transit_zones_bbl_to_tz_ranked; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you test if this has any effect? Or is this a bit of a carry-over from other similar scripts? I just would be surprised if this has much effect on the update
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It actually does! The subsequent update against PLUTO was taking about 1:45mins without, then a minute with the ANALYZE. Apparently the pg Analyze Daemon doesn't kick off immediately after a table is created, so even with indexes, pg doesn't have stats enough to query intelligently.
I think that UPDATE statements against large tables are very very slow, regardless of indexing and analyzing. Wouldn't be surprised if the immutable DBT approach sped things up significantly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I thought I'd put an index on BBL for this table... actually, IIRC I tried adding it, and it didn't really help
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting. Yeah just joining a bunch of tables with sensible indexes will certainly be at least a tad better than this approach
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though I thought I'd put an index on BBL for this table... actually, IIRC I tried adding it, and it didn't really help
Makes sense, it's already using the pluto index. Has to go through all records of one of the tables regardless. Plus it takes time to make create that index in the first place.
@fvankrieken totally agree - I pinged GIS to start the conversation about what type of QA outputs they want. I'd imagined just pointing them at tables in our build schema, but csvs/shapefiles might be ideal. Will probably add something after talking to them. |
Adds the following to PLUTO
Per Matt, all new fields should be at the very end of the file. Explanations are in the commit messages, or comments in the sql. Logic for assignments for MIH Areas and Transit Zones is at the top of those SQL files.
I'm running a build now, and will post a link to the output when it finishes. I'll hold off from merging these changes until confirming logic with GIS and checking that their scripts can gracefully handle the changes.
How to review? Commits are atomic, and there aren't that many changes. The additional FAR values are straightforward (just continuing an existing pattern) and for MIH Area and Transit Zone there are maps below to visualize the logic and edge cases.
TODO
MIH Areas
The questionable splits for MIH Areas can be found in the
qaqc_int__mihareas_questionable_assignmentsview in my build schema in postgres (ar_pluto_new_fields_1893)Some Questionable Splits
Overall, only 7 Lots are split between multiple MIH areas. Below are some odd cases
The following BBLs are split fairly evenly between two MIH Areas.
BBL: 3037050014

The southern MIH area covers 50.9% of the BBL... however, you can see that that property is in the northern MIH area.
This one might actually be wrong
BBL: 1020130029

These are split between an
Option 1 and 2andOption 1area forOne45 Harlem for All. However, I don't think this one matters - I think these lots are going to be demolished and redrawn into the MIH Areas as part of theOne45 Harlem for Allrezoning.BBL: 2042260001

This is just sort of an odd case. This is PELHAM PARKWAY, and not that it matters terribly, but it probably shouldn't be in either MIH Area. However, it meets the threshold for both, with the slight winner being 1776 Eastchester Road - Option 2, which covers 21% of the lot.
Transit Zones
Basically all lots have a transit zone, except for ~2k that are either in the water or are large parks. (e.g. Central Park doesn't have a transit zone, understandably)
The questionable splits for MIH Areas can be found in the
qaqc_int__transit_zones_questionable_assignmentstable. There are ~800 lots split between multiple zones.Questionable Splits (that probably don't matter)
A walkway in Staten Island:

A Park in Queens

Questionable Splits (that might have implications)
There are a number of cases where the Transit Zones lines were just drawn poorly, and there's not much we can do about it.
BBL = 4125920057
This lot is split 47% left zone / 53% right zone.

And if you zoom out and look at it's block, that seems to indicate that our choice is correct.

BBL = 4036620026
One more for good measure.

We assign cases like this correctly, but there are certainly other cases where we don't. We could try to get clever, and in cases where we find ambiguity then give some preference based on the block. We could also introduce manual corrections.