Add detailed site-breakage data #71
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -80,11 +80,68 @@ | |
| "additionalProperties": false, | ||
| "properties": { | ||
| "breakage": { | ||
| "description": "List of URLs to issues where actual site breakage is tracked. If available, point the link directly to a comment with clear STR", | ||
| "description": "List of issues tracking known site breakage.", | ||
| "type": "array", | ||
| "items": { | ||
| "type": "string", | ||
| "format": "uri" | ||
| "anyOf": [ | ||
| { | ||
| "type": "string", | ||
| "format": "uri", | ||
| "description": "URL of tracking issue" | ||
| }, | ||
| { | ||
| "type": "object", | ||
| "properties": { | ||
| "url": { | ||
| "type": "string", | ||
| "format": "uri", | ||
| "description": "URL of tracking issue" | ||
| }, | ||
| "site": { | ||
| "type": "string", | ||
| "format": "uri", | ||
| "description": "URL of broken site or page" | ||
| }, | ||
| "platform": { | ||
| "type": "array", | ||
| "items": { | ||
| "type": "string", | ||
| "enum": ["all", "desktop", "mobile", "windows", "macos", "linux"], | ||
| "description": "List of affected platforms. Default is 'all'" | ||
| } | ||
| }, | ||
| "last_reproduced": { | ||
| "type": "string", | ||
| "format": "date", | ||
| "description": "Most recent date the issue was successfully reproduced" | ||
| }, | ||
| "intervention": { | ||
|
Contributor
Author
Do we ever have > 1 intervention per site?
Contributor
Author
For the "same" issue, that is; I imagine we could have a site affected by multiple kinds of breakage that get separate interventions.
Member
While >1 intervention per site is possible in theory (hasn't happened yet), more than one intervention per site per issue is unlikely. (Edit: yeah, what you just wrote. Your reply wasn't there yet when I hit "comment".)
Contributor
I think there is a Google Maps intervention that has CSS and JS files.
Member
Yeah, but that's "the same" intervention, and we should probably link to the definition instead of the individual files (in that case? in general?).
Contributor
True, and I guess in this case the JS intervention is the more important one; we could link either to it or to the definition. In general, it seems individual files would be easier to locate, if the intervention is removed ( |
||
| "type": "string", | ||
| "format": "uri", | ||
| "description": "URL of intervention that is shipping or has been shipped. Link to the code in the GitHub repository, and use the canonical URLs to ensure persistence over time" | ||
| }, | ||
| "impact": { | ||
| "type": "string", | ||
| "enum": ["site_broken", "feature_broken", "significant_visual", "minor_visual", "unsupported_message"], | ||
| "description": "Type of breakage" | ||
| }, | ||
| "affects_users": { | ||
| "type": "string", | ||
| "enum": ["all", "some", "few"], | ||
| "description": "What fraction of users are affected. 'all' where any site user is likely to run into the issue, 'some' for issues that are common but many users will not experience, and 'few' where the breakage depends on an unusual configuration or similar." | ||
| }, | ||
| "resolution": { | ||
| "type": "string", | ||
| "enum": ["site_changed", "site_fixed"], | ||
|
Contributor
Should
Contributor
Author
I thought
Contributor
oops sorry - I thought
Contributor
Author
I was assuming it would be easier to treat "still broken" as the missing value default rather than an explicit state.
Contributor
ok, that makes sense :)
Member
This raises the question again of "what do we do with KB-entries about platform issues that have been fixed". If we, in some future, fix Keeping that in mind, we probably need different "fixed" resolutions, depending on what happened:
or something along those lines.
Contributor
Author
I think if the platform issue is fixed we should either a) have some top-level status field that records that or b) move the entry into a different directory. I think I favour b) because it seems easier to implement and also probably makes tooling easier (you just ignore all the "fixed" issues for most operations by not looking in that directory). My reason for not treating "fixed_by_intervention" as a "fixed" status is that the issue isn't really fixed, and we already have a field to link to the intervention, if there is one.
Member
This is true, but that distinction is relevant for our ranking. If we have an issue with critical breakage on critical sites, but we can and do ship interventions for that, the issue should probably be ranked lower than an issue we can't ship interventions for.
Contributor
Author
Right, but I was imagining the presence of an intervention would cut the computed priority quite a lot. Are there cases where we have an intervention but the issue isn't really "fixed" by the intervention?
Member
I don't think so. If we can't completely fix an issue with an intervention, we usually don't even bother shipping it. So if you want to take the presence of an intervention as an additional signal, not just the state, then yeah, your version works fine. :) |
||
| "description": "If the issue no longer reproduces on this site, the kind of change that happened. 'site_changed' if there was a general redesign or the site is no longer online, 'site_fixed' if the specific issue was patched." | ||
| }, | ||
| "notes": { | ||
| "type": "string", | ||
| "description": "Any additional notes about why the other fields for this issue are set to the given values." | ||
| } | ||
| } | ||
| } | ||
| ] | ||
| } | ||
| }, | ||
| "platform_issues": { | ||
|
|
||
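The `breakage` items in the schema above can be either a bare URL string or an object, and the review thread settles on treating a missing `resolution` as "still broken". A minimal sketch of how tooling might normalize both forms into one record shape follows. The function names `normalize_breakage` and `is_still_broken` and the `example.com` URLs are hypothetical and not part of this PR; the enum values are copied from the diff.

```python
from typing import Optional, Union

# Enum values copied from the schema in this diff.
PLATFORMS = {"all", "desktop", "mobile", "windows", "macos", "linux"}
IMPACTS = {"site_broken", "feature_broken", "significant_visual",
           "minor_visual", "unsupported_message"}
AFFECTS_USERS = {"all", "some", "few"}
RESOLUTIONS = {"site_changed", "site_fixed"}


def normalize_breakage(item: Union[str, dict]) -> dict:
    """Return a uniform record for either form of a `breakage` item.

    Hypothetical helper: it only checks the enums shown in the diff and
    fills in defaults; it is not a full JSON Schema validator.
    """
    # The string form is just the tracking-issue URL.
    record = {"url": item} if isinstance(item, str) else dict(item)
    # Schema description: the default platform is 'all'.
    record.setdefault("platform", ["all"])
    # Review discussion: a missing `resolution` means "still broken".
    record.setdefault("resolution", None)

    for platform in record["platform"]:
        if platform not in PLATFORMS:
            raise ValueError(f"unknown platform: {platform!r}")
    for field, allowed in (("impact", IMPACTS), ("affects_users", AFFECTS_USERS)):
        if field in record and record[field] not in allowed:
            raise ValueError(f"unknown {field}: {record[field]!r}")
    resolution: Optional[str] = record["resolution"]
    if resolution is not None and resolution not in RESOLUTIONS:
        raise ValueError(f"unknown resolution: {resolution!r}")
    return record


def is_still_broken(record: dict) -> bool:
    """A record with no `resolution` is treated as still broken."""
    return record["resolution"] is None


# Both forms end up with the same keys (placeholder URLs).
plain = normalize_breakage("https://example.com/issues/1")
detailed = normalize_breakage({
    "url": "https://example.com/issues/2",
    "site": "https://example.com/page",
    "impact": "feature_broken",
    "resolution": "site_fixed",
})
```

With this shape, ranking code can check `is_still_broken(record)` and the presence of `intervention` without caring which form the entry was written in.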
Wonder if we want to store just the domain name here? Though probably not a big deal as we could parse and extract it to determine site rank on the computing stage :)
Yes, I think for the Tranco list the hard part is actually knowing what the eTLD+1 is, since I think that's what it's storing (e.g. we don't want to look up `drive.google.com` but just `google.com`). So I think putting the whole URL is better because a) it's easier to copy and b) it preserves information in case we can do more accurate lookups in the future, without causing any additional difficulty because we still need to parse it and do a public suffix list lookup.
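To illustrate the eTLD+1 step mentioned above, here is a rough sketch of reducing a full `site` URL to its registrable domain. It assumes a tiny hard-coded suffix set (`TOY_PUBLIC_SUFFIXES`) as a stand-in for the real Public Suffix List, and it ignores the wildcard and exception rules that the real list contains; `etld_plus_one` is a hypothetical name.

```python
from typing import Optional
from urllib.parse import urlsplit

# Tiny, hypothetical stand-in for the Public Suffix List.
TOY_PUBLIC_SUFFIXES = frozenset({"com", "org", "uk", "co.uk"})


def etld_plus_one(url: str, suffixes=TOY_PUBLIC_SUFFIXES) -> Optional[str]:
    """Return the public suffix plus one label, or None if nothing matches."""
    host = urlsplit(url).hostname
    if not host:
        return None
    labels = host.lower().rstrip(".").split(".")
    # Try the longest candidate suffix first, so "co.uk" wins over "uk".
    for i in range(1, len(labels)):
        if ".".join(labels[i:]) in suffixes:
            return ".".join(labels[i - 1:])
    return None


print(etld_plus_one("https://drive.google.com/drive/my-drive"))  # google.com
print(etld_plus_one("https://www.bbc.co.uk/news"))               # bbc.co.uk
```

Because the full URL is stored, this reduction can be swapped for a more accurate lookup later without touching the data.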
That made me think that we have logic for this on webcompat.com (it extracts the URL from the issue body, checks the first-level domain and then the lower-level domain (if no rank is found), and gets the ranking from the Tranco DB or Alexa DB) https://github.com/webcompat/webcompat.com/blob/main/webcompat/webhooks/helpers.py#L122. We use it for adding priority labels, but I was planning on changing it to add a numerical rank in webcompat/webcompat.com#3707.
Maybe I could add an API as well, which we can call to get the rank when computing the overall impact? We have Tranco data and can fetch more per-country data from Alexa (while it's still accessible).
Yeah, an API that takes a URL or a (full) domain and returns the site rank sounds amazing. I was looking at implementing it, but it makes sense that you already have it for webcompat.com and calling an API is easier than writing all the logic :)
Alright, I'll look into it then!
Although, thinking about it, for an API we would presumably want to have rate limits to avoid abuse, and that could make the client code more complex than just downloading the Tranco list and matching each part of the domain from the end. So if you don't get to the API part, don't worry.
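For comparison, the "download the list and match from the end" approach could look roughly like the sketch below. It assumes the list is a CSV of `rank,domain` rows and reads "matching from the end" as trying the full host first and then dropping leading labels until something is found; `load_ranks`, `site_rank`, and the sample rows are hypothetical.

```python
import csv
import io
from typing import Dict, Optional
from urllib.parse import urlsplit

# Made-up sample rows in the assumed `rank,domain` layout.
SAMPLE_LIST = "1,google.com\n2,example.org\n"


def load_ranks(csv_text: str) -> Dict[str, int]:
    """Parse `rank,domain` rows into a {domain: rank} mapping."""
    ranks = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) == 2:
            ranks[row[1].strip().lower()] = int(row[0])
    return ranks


def site_rank(url: str, ranks: Dict[str, int]) -> Optional[int]:
    """Look up the rank for a URL, falling back to shorter domain suffixes."""
    host = urlsplit(url).hostname
    if not host:
        return None
    labels = host.lower().split(".")
    # Drop leading labels one at a time, stopping before the bare TLD:
    # drive.google.com, then google.com.
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in ranks:
            return ranks[candidate]
    return None


ranks = load_ranks(SAMPLE_LIST)
rank = site_rank("https://drive.google.com/drive/my-drive", ranks)
```

This skips the public suffix lookup entirely, at the cost of possibly matching a parent domain that is not the same site.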