Skip to content
This repository was archived by the owner on Apr 29, 2024. It is now read-only.
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
168 changes: 168 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

36 changes: 32 additions & 4 deletions data/sidebar.yml
Original file line number Diff line number Diff line change
Expand Up @@ -16,9 +16,37 @@ solutions:

references:
breakage:
- https://github.com/webcompat/web-bugs/issues/11622
- https://github.com/webcompat/web-bugs/issues/55685
- https://github.com/webcompat/web-bugs/issues/67848
- https://github.com/webcompat/web-bugs/issues/106830
- url: https://github.com/webcompat/web-bugs/issues/11622
site: http://www.election.gov.np
platform:
- all
intervention: https://github.com/mozilla-extensions/webcompat-addon/blob/4a3094f52d561925620ac441d5c4423766ec5c29/src/webextension/injections/js/bug1472081-election.gov.np-window.sidebar-shim.js
impact: feature_broken
affects_users: all
resolution: site_changed
- url: https://github.com/webcompat/web-bugs/issues/55685
site: http://www.susu09.com/forum.php
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if we want to store just the domain name here? Though probably not a big deal as we could parse and extract it to determine site rank on the computing stage :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think for the tranco list the hard part is actually knowing what the eTLD+1 is, since I think that's what it's storing (e.g. we don't want to lookup drive.google.com but just google.com). So I think putting the whole URL is better because a) it's easier to copy and b) it preserves information in case we can do more accurate lookups in the future, without causing any additional difficulty because we still need to parse it and do a public suffix list lookup.

Copy link
Contributor

@ksy36 ksy36 Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think for the tranco list the hard part is actually knowing what the eTLD+1 is, since I think that's what it's storing (e.g. we don't want to lookup drive.google.com but just google.com).

That made me think that we have logic for this on webcompat.com (it extracts url from the issue body, checks first level domain and then less level domain (if no rank is found), gets the ranking from Tranco db or Alexa db) https://github.com/webcompat/webcompat.com/blob/main/webcompat/webhooks/helpers.py#L122. We use it for adding priority labels, but I was planning on changing it to add numerical rank in webcompat/webcompat.com#3707.

Maybe I could add an API as well, which we can call to get the rank when computing the overall impact? We have Tranco data and can fetch more per country data from Alexa (while it's still accessible).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, an API that takes a URL or a (full) domain and returns the site rank sounds amazing. I was looking at implementing it, but it makes sense that you already have it for webcompat.com and calling an API is easier than writing all the logic :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I'll look into it then!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although, thinking about it, for an API we would presumably want to have rate limits to avoid abuse, and that could make the client code more complex than just downloading the tranco list and matching each part of the domain from the end. So if you don't get to the API part don't worry.

platform:
- all
last_reproduced: 2020-07-27
impact: site_broken
affects_users: all
resolution: site_fixed
- url: https://github.com/webcompat/web-bugs/issues/67848
site: https://indichords.com/
platform:
- all
last_reproduced: 2021-03-08
impact: feature_broken
affects_users: all
resolution: site_fixed
- url: https://github.com/webcompat/web-bugs/issues/106830
site: https://www.pontefractfhs.org.uk
platform:
- all
last_reproduced: 2022-07-09
impact: site_broken
affects_users: all
resolution: site_fixed
platform_issues:
- https://bugzilla.mozilla.org/show_bug.cgi?id=1428302
63 changes: 60 additions & 3 deletions schemas/entry.schema.json
Original file line number Diff line number Diff line change
Expand Up @@ -80,11 +80,68 @@
"additionalProperties": false,
"properties": {
"breakage": {
"description": "List of URLs to issues where actual site breakage is tracked. If available, point the link directly to a comment with clear STR",
"description": "List of issues tracking known site breakage.",
"type": "array",
"items": {
"type": "string",
"format": "uri"
"anyOf": [
{
"type": "string",
"format": "uri",
"description": "URL of tracking issue"
},
{
"type": "object",
"properties": {
"url": {
"type": "string",
"format": "uri",
"description": "URL of tracking issue"
},
"site": {
"type": "string",
"format": "uri",
"description": "URL of broken site or page"
},
"platform": {
"type": "array",
"items": {
"type": "string",
"enum": ["all", "desktop", "mobile", "windows", "macos", "linux"],
"description": "List of affected platforms. Default is 'all'"
}
},
"last_reproduced": {
"type": "string",
"format": "date",
"description": "Most recent date the issue was successfully reproduced"
},
"intervention": {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we ever have > 1 intervention per site?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the "same" issue that is; I imagine we could have a site affected by multiple kinds of breakage that get seperate interventions.

Copy link
Member

@denschub denschub Jul 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While >1 intervention per site is possible in theory (hasn't happened yet), more than one intervention per site per issue is unlikely. (Edit: yeah, what you just wrote. Your reply wasn't there yet when I hit "comment")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is a google maps intervention that has css and js files
https://github.com/mozilla-extensions/webcompat-addon/blob/main/src/data/injections.js#L170

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, but that's "the same" intervention, and we should probably link to the definition, instead of the individual files (in that case? in general?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, and I guess in this case the js intervention is the more important one.. could either link to it, or the definition.

In general, it seems individual files would be easier to locate, if the intervention is removed (
provided we use canonical urls 🙂), like you mentioned in #70 (comment)

"type": "string",
"format": "uri",
"description": "URL of intervention that is shipping or has been shipped. Link to the code in the GitHub repository, and use the canonical URLs to ensure persistance over time"
},
"impact": {
"type": "string",
"enum": ["site_broken", "feature_broken", "significant_visual", "minor_visual", "unsupported_message"],
"description": "Type of breakage"
},
"affects_users": {
"type": "string",
"enum": ["all", "some", "few"],
"description": "What fraction of users are affected. 'all' where any site user is likely to run into the issue, 'some' for issues that are common but many users will not experience, and 'few' where the breakage depends on an unusual configuration or similar."
},
"resolution": {
"type": "string",
"enum": ["site_changed", "site_fixed"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should site_broken be also an option? (noticed it is used in one of the sidebar breakage reports)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought site_changed could also cover "this site just doesn't exist anymore" because the use case I had in mind was tracking how often sites patch our issues directly vs how often they persist until the site undergoes larger changes; for the latter "broken" and "redesigned" are basically the same. But maybe there is some other case where we would benefit from tracking the difference?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oops sorry - I thought site_broken is used in resolution, but it's actually one of the impact fields.
I guess my question is more about whether we want to track that site is still "broken" (i.e. issue reproducible) in the resolution field, or such issues will have no resolution field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was assuming it would be easier to treat "still broken" as the missing value default rather than an explicit state.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, that makes sense :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This raises the question again of "what do we do with KB-entries about platform issues that have been fixed".

If we, in some future, fix <marquee>. Do we delete marquee.yml? Do we remove all site breakage because we no longer see some? Do we move the entry into a "fixed" folder? imho, having historic data is good, because it allows us to go back in time and check back on things we've succeeded on.

Keeping that in mind, we probably need different "fixed" resolutions, depending on what happened:

  • fixed_by_intervention
  • fixed_by_platform_change
  • fixed_by_site

or something along those lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if the platform issue is fixed we should either a) have some top-level status field that records that or b) move the entry into a different directory. I think I favour b) because it seems easier to implement and also probably makes tooling easier (you just ignore all the "fixed" issues for most operations by not looking in that directory).

My thinking for not considering "fixed_by_intervention" as a "fixed" status is that the issue isn't really fixed and we already have a field to link to the intervention, if there is one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that the issue isn't really fixed and we already have a field to link to the intervention, if there is one.

This is true, but that distinction is relevant for our ranking. If we have an issue with critical breakage on critical sites, but we can and do ship interventions for that, the issue should probably be ranked lower than an issue we can't ship interventions for.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, but I was imagining the presence of an intervention would cut the computed priority quite a lot. Are there cases where we have an intervention but the issue isn't really "fixed" by the intervention?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think so. If we can't completely fix an issue with an intervention, we usually don't even bother shipping it. So if you want to take the presence of an intervention as an additional signal, not just the state, then yeah, your version works fine. :)

"description": "If the issue no longer reproduces on this site, the kind of change that happened. 'site_change' if there was a general redesign or the site is no longer online, 'site_fixed' if the specific issue was patched."
},
"notes": {
"type": "string",
"description": "Any additional notes about why the other fields for this issue are set to the given values."
}
}
}
]
}
},
"platform_issues": {
Expand Down
Loading