
What to do with old open patches for unmaintained/inactive repositories when not even the uploader responds
Closed, Resolved (Public)

Description

Criteria to identify inactive code repositories:

  1. Premise: no patches have been merged to the repository in at least 180 days (localization updates don't count).
  2. If there are open changesets submitted without any review or stuck with 0/ 1 after 90 days, the repository is labelled POSSIBLY INACTIVE. Ideally, a notification would be sent to the identified maintainers and other contributors to the project.
  3. If there are open changesets submitted without any review or stuck with 0/ 1 after 180 days, the repository is labelled INACTIVE.
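
The three criteria above can be sketched as a small classification function. This is only an illustration, not part of any agreed tooling: the function name, its parameters, and the "ACTIVE" label for repositories that match neither state are assumptions, and the 90/180-day thresholds are treated as inclusive.

```python
from typing import Optional

MERGE_PREMISE_DAYS = 180     # criterion 1: no non-localization merge for this long
POSSIBLY_INACTIVE_DAYS = 90  # criterion 2
INACTIVE_DAYS = 180          # criterion 3


def classify_repository(days_since_last_merge: int,
                        oldest_unreviewed_change_days: Optional[int]) -> str:
    """Label a repository according to the criteria in the description.

    days_since_last_merge: days since the last merged patch, ignoring
        localization updates.
    oldest_unreviewed_change_days: age in days of the oldest open changeset
        still waiting for review, or None if there is no such changeset.
    """
    # The premise must hold before any label applies.
    if days_since_last_merge < MERGE_PREMISE_DAYS:
        return "ACTIVE"
    # Without an open changeset waiting for review, neither label applies.
    if oldest_unreviewed_change_days is None:
        return "ACTIVE"
    if oldest_unreviewed_change_days >= INACTIVE_DAYS:
        return "INACTIVE"
    if oldest_unreviewed_change_days >= POSSIBLY_INACTIVE_DAYS:
        return "POSSIBLY INACTIVE"
    return "ACTIVE"
```

For example, a repository with no merges for 200 days and a changeset waiting for 100 days would come out as POSSIBLY INACTIVE under these assumptions.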

Background

There is this pattern in Gerrit:

  • Old open patch(es) waiting for review.
  • No active maintainers around.
  • For the past 1-2 years, commit history only shows localization updates and an occasional hotfix from an external developer (usually a WMF employee).
  • Not even the initial uploader responds when asking for updates in the changeset.

It would be useful to have a process agreed and documented for these cases.

  • Should these changesets be marked as abandoned after, say, a year?
  • Should the wiki page related to this extension be marked as unmaintained, welcoming patches only after there are maintainers?
  • Should we remove the repository from our metrics on http://korma.wmflabs.org/ ? (this is only a problem when there are patches open for review, not -1 or WIP)

See for instance https://gerrit.wikimedia.org/r/#/c/148020/

Actions

See draft in https://www.mediawiki.org/wiki/Gerrit/Inactive_projects

To be marked as inactive

According to the criteria above, this is the initial list of repositories that could be marked as inactive and be filtered out from code review metrics in korma:

As per project descriptions in Gerrit:

  • analytics/kraken - (deprecated) Wikimedia's self-service data platform See the refinery repos for the currently used code.
  • analytics/kraken/deploy - (deprecated) Deployment project for Kraken. See the refinery repos for the currently used code.
  • labs/incubator - TO BE DELETED
  • mediawiki/extensions/InterwikiMagic - OBSOLETE. This extension is now obsolete, as it has been integrated into the Interwiki extension; see bug #68241 and gerrit:147755. ShoutWiki Interwiki Magic is a MediaWiki extension that fetches interwiki links (as opposed to interlanguage links) from $wgSharedDB while still fetching interlanguge links from the local database.
  • mediawiki/extensions/Narayam - ARCHIVED - Input method extension
  • mediawiki/extensions/OpenSearchXml - ARCHIVED - MediaWiki extension OpenSearchXml.
  • mediawiki/extensions/ProxyListDb - ARCHIVED - MediaWiki extension ProxyListDb.
  • mediawiki/extensions/skins - [OLD AND OBSOLETE] Collection of MediaWiki skins For current, maintained, functional skins, please see their respective mediawiki/skins/* repositories. All new skins' repositories should follow the mediawiki/skins/ naming convention. This repository is a left-over from the SVN era.
  • mediawiki/extensions/WebFonts - ARCHIVED - Dynamic font embedding for Mediawiki pages
  • mediawiki/extensions/WikibaseSolr - DEPRECATED (in favor of ElasticSearch support for Wikibase) MediaWiki extension WikibaseSolr.
  • mediawiki/php/wikidiff - PHP extension wikidiff (obsolete)
  • operations/debs/libanon - DEPRECATED, do not use. We have imported the source of libanon and are using a different repository to create debian packages. Use this: https://gerrit.wikimedia.org/r/#/admin/projects/analytics/libanon
  • operations/debs/sartoris - DELETE ME
  • operations/puppet/cdh4 - This repository has been deprecated in favor of https://gerrit.wikimedia.org/r/#/admin/projects/operations/puppet/cdh
  • operations/software/mwprof/reporter - MwProf profiling data webapp. -- DEPRECATED https://phabricator.wikimedia.org/T97509
  • wikimedia/fundraising/civicrm - Deprecated.


Event Timeline


Can you still upload a review request for a read-only repository? (It was said above that we should not prevent people from doing that.)

Nope, Gerrit rejects patches (tested with mediawiki/extensions/Narayam):

! [remote rejected] HEAD -> refs/publish/master (project is read only)

[...] it would be very nice to rename the repository under a different Gerrit namespace (such as /archived/ or /attic/) and have them set read-only in Gerrit [...]

If repos change their namespace when being (un-)archived, it would make it unnecessarily hard to link to them reliably.

@demon has the same concern. For abandoned / archived repos I am not sure it is much of a problem though.

Also, Gerrit does not allow renaming repos.

I suggested that since that is what OpenStack does. They have some (outdated?) manual detailing the steps to rename a repo. It needs a couple of UPDATEs to the database while Gerrit is down, plus moving the git repo files on the server. http://docs.openstack.org/infra/system-config/gerrit.html#renaming-a-project

Other gerrit sites:

  • set repos to read-only (newer Gerrits show a lock icon for read-only repos in the repo overview. Also, one can easily detect read-only projects through gerrit's API), and
  • prepend a marker (e.g.: Deprecated.) to the project description.

E.g.: https://gerrit-review.googlesource.com/#/admin/projects/?filter=bugzilla shows hooks-bugzilla with a lock icon, and its project description starts with "Deprecated.".

Our old Gerrit lacks the lock icon, but setting repos read-only and prepending a marker text to the project description is also done in WMF's Gerrit. E.g.: https://gerrit.wikimedia.org/r/#/admin/projects/mediawiki/extensions/OpenSearchXml

read-only and prefixing the description is probably good enough. I recently did that for three extensions:

$ gerrit ls-projects --type CODE --description|grep ARCHIVE
mediawiki/extensions/Narayam - ARCHIVED - Input method extension
mediawiki/extensions/OpenSearchXml - ARCHIVED - MediaWiki extension OpenSearchXml.
mediawiki/extensions/WebFonts - ARCHIVED - Dynamic font embedding for Mediawiki pages
$
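
A minimal sketch of how such a listing could be post-processed to pick out repositories whose description starts with a marker. It assumes each line has the "name - description" shape shown in the output above; the function names and the set of marker words are hypothetical.

```python
from typing import List, Tuple

# Assumed marker words; the thread mentions ARCHIVED and "Deprecated." style prefixes.
MARKERS = ("ARCHIVED", "DEPRECATED", "OBSOLETE")


def parse_project_line(line: str) -> Tuple[str, str]:
    """Split one listing line into (project name, description)."""
    # Only the first " - " separates the name from the description.
    name, _, description = line.partition(" - ")
    return name.strip(), description.strip()


def flagged_projects(listing: str, markers: Tuple[str, ...] = MARKERS) -> List[str]:
    """Return names of projects whose description starts with a marker word."""
    flagged = []
    for line in listing.splitlines():
        if not line.strip():
            continue
        name, description = parse_project_line(line)
        # Case-insensitive prefix match against any of the markers.
        if description.upper().startswith(markers):
            flagged.append(name)
    return flagged
```

Matching on the description prefix only works as long as the marker convention is applied consistently, which is why it complements rather than replaces the read-only state.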

Annoyingly ls-projects does not let you filter out read only repositories. The Gerrit REST API to list projects returns ProjectInfo which does not have the state, but it can be grabbed from the config entry point.
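
As a rough sketch of the REST route: Gerrit prefixes its JSON responses with the line `)]}'`, which has to be stripped before parsing. The snippet assumes the project config response carries a `state` field that is omitted for active projects; the helper names are made up for illustration, and nothing here contacts a real server.

```python
import json

# Gerrit prepends this line to JSON responses to prevent XSSI attacks.
XSSI_PREFIX = ")]}'"


def parse_gerrit_json(body: str):
    """Strip Gerrit's XSSI prefix, if present, and decode the JSON payload."""
    if body.startswith(XSSI_PREFIX):
        body = body[len(XSSI_PREFIX):]
    return json.loads(body)


def is_read_only(config: dict) -> bool:
    """Tell whether a decoded project config describes a read-only project."""
    # Assumption: a missing "state" key means the project is active.
    return config.get("state", "ACTIVE") == "READ_ONLY"
```

The response body would come from the project's config endpoint; fetching it is left out here on purpose.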

So in short, I am fine with just marking read-only, optionally prefixing the description.

Reminder that it would be great to also do some updates to
https://www.mediawiki.org/wiki/Developers/Maintainers
whilst doing this other work.

As this task is about inactive repos, "some updates" means removing any mentioned maintainers from that page? Or anything else?

PS: Thanks a lot for working on the templates!

I suggested that since that is what OpenStack does.

Whatever requires steps that require shell access should be out of scope for this task T102920 and should please go to a separate enhancement request.
Let's keep the "Actions" section in the task description with stuff that can be prepared by anybody (which e.g. still includes preparing a CI patch as anybody can get Gerrit access).

As this task is about inactive repos, "some updates" means removing any mentioned maintainers from that page? Or anything else?

Yup.

PS: Thanks a lot for working on the templates!

Oh... checking now (6 months after I changed the templates) it looks like the links to phab are all 404ing due to capitalization inconsistencies. :-(
E.g. links to https://phabricator.wikimedia.org/tag/mediawiki-extensions-CSS/ instead of https://phabricator.wikimedia.org/tag/mediawiki-extensions-css/
I don't know how to best fix that, either. (I'll stop tangenting on this task, now... :)

Whatever requires steps that require shell access should be out of scope for this task T102920 and should please go to a separate enhancement request.
Let's keep the "Actions" section in the task description with stuff that can be prepared by anybody (which e.g. still includes preparing a CI patch as anybody can get Gerrit access).

I will probably get rid of the 'archived' template in CI, since it clutters the configuration file, in favor of marking the repository read-only and removing it from CI.

I will probably get rid of the 'archived' template in CI, since it clutters the configuration file, in favor of marking the repository read-only and removing it from CI.

That will prevent people from picking up an abandoned extension again by uploading a patch. Then people can only upload their patches to Phabricator or GitHub. I don't think we should prevent people from using Gerrit unless we get rid of Gerrit. I don't think the answer to too many review requests is preventing people from making them.

I will probably get rid of the 'archived' template in CI, since it clutters the configuration file, in favor of marking the repository read-only and removing it from CI.

@hashar: When marking a repository read-only (in Git? in Gerrit? in CI?), at which specific point/step would a code contributor realize that there is not much sense in contributing a patch to such a repository?
I'm after defining some process here that allows more or less anybody to trigger some kind of "this is not active code but if you're interested you could become the maintainer"-notification to anybody interested in contributing to such an inactive repository. And so far using CI for that looked feasible at least.

@hashar: When marking a repository read-only (in Git? in Gerrit? in CI?), at which specific point/step would a code contributor realize that there is not much sense in contributing a patch to such a repository?
I'm after defining some process here that allows more or less anybody to trigger some kind of "this is not active code but if you're interested you could become the maintainer"-notification to anybody interested in contributing to such an inactive repository. And so far using CI for that looked feasible at least.

It seems repositories being archived have their MediaWiki page flagged as such and the repository is cleaned up. So one would figure it out as soon as they find the repo, or at worst once they clone it.

For the three extensions I have marked readonly recently:

There is some cruft, such as OpenSearchXml and WebFonts lacking a README to make it clear the extension is archived, and WebFonts still having the code in the repo. But that is surely solvable.

It seems repositories being archived have their MediaWiki page flagged as such and the repository is cleaned up.

Many archived extensions seem to have been fully superseded by some modern equivalent.

From the description:

INACTIVE projects would be filtered out from Wikimedia code review metrics on http://korma.wmflabs.org/ (how exactly, needs to be defined).

Now we have a way to blacklist repositories from Gerrit metrics in Korma.

I have started to list in the description repos to be marked as inactive. I want to blacklist them in korma sooner in order to offer cleaner data for T88531: Goal: Organize a Gerrit Cleanup Day on September 23, 2015 and T107562: Tech community KPIs for the WMF metrics meeting. If nobody complains soon, I will start blacklisting in korma. If there are complaints about a specific repo, de-blacklisting is easy.

Is it correct that if they are marked as archived in the CI they don't need to be blacklisted in korma?

I, too, don't understand what's currently proposed to be done with those repos. Only setting up the automatic V-1 (consequences)? Some of those extensions seem to have a non-negligible usage so we don't want to make them read-only.

I, too, don't understand what's currently proposed to be done with those repos. Only setting up the automatic V-1 (consequences)? Some of those extensions seem to have a non-negligible usage so we don't want to make them read-only.

Not doing that but there being nobody to review and merge patches isn't improving anything. With an automatic -1 at least it is obvious that a maintainer is needed.
Which repos do you think should not get automatic -1s?
Can you find someone willing to maintain them?

Is it correct that if they are marked as archived in the CI they don't need to be blacklisted in korma?

Good question. I'm not sure, but I can check. How can I get the list of fail-archived-repositories (marked as archived in the CI)?

Also a question: can those archived repositories still get localization updates merged? Can they get these patches removing deprecated functions, done by developers like anomie "with parachute"? That would be desirable, since these are the type of recent commits you can see in many of these inactive repositories.

can those archived repositories still get localization updates merged?

No. At least in gerrit, read-only really means read-only.
(Not sure about Diffusion though)

Can they get these patches removing deprecated functions, done by developers like anomie "with parachute"?

No. (Same as the above).

That would be desirable, since these are the type of recent commits you can see in many of these inactive repositories.

If we want to keep such updates, the repos are not really "unmaintained". Someone still cares for them, if it's only (half-)automatic commits.
In that case, Gerrit's "read-only" status is out of the game.

How can I get the list of fail-archived-repositories (marked as archived in the CI)?

It's in https://git.wikimedia.org/blob/integration/config.git/HEAD/zuul/layout.yaml :

  • name: mediawiki/extensions/ClickTracking
  • name: mediawiki/extensions/OpenSearchXml
  • name: mediawiki/extensions/Parsoid
  • name: mediawiki/extensions/Vector
  • name: mediawiki/extensions/WikiGrok

QChris's answer was about the read-only Gerrit setting. Here is the answer about repos that the CI considers archived:

Is it correct that if they are marked as archived in the CI they don't need to be blacklisted in korma?

Good question. I'm not sure, but I can check. How can I get the list of fail-archived-repositories (marked as archived in the CI)?

Also a question: can those archived repositories still get localization updates merged? Can they get these patches removing deprecated functions, done by developers like anomie "with parachute"?

Technically you can bypass a -1 made by the CI. But if you do then no automated tests were made by the CI. So this should be avoided.

That would be desirable, since these are the type of recent commits you can see in many of these inactive repositories.

I think that is a contradiction we should resolve to not do work that benefits no one.

I see no reason to modify a repo if it is archived, unless there is a maintainer willing to work on it. It is unmaintained, meaning nobody is there to keep it working, thus there is no reason to make compatibility fixes to it. There is also no reason to translate it. So the repo should get removed from translatewiki.net before marking it archived. Working on it while not maintaining it wastes work and still doesn't help any users of the extension. If someone helped the users of the extension enough so that they would fix it if it breaks, then it would by definition be maintained and should not be marked archived.

I think it is more important to not be unclear about if an extension is maintained than what technical measure we use to show that (archived in CI or readonly in gerrit). By being unclear we also prevent people from noticing that a maintainer is needed. If there is a maintainer needed we should clearly say so.

True, silly me hadn't realized that translation updates and compatibility fixes are only useful when someone is planning to release a new version for the potential users of that software.

I wonder whether translatewiki.net has a way of archiving that is as simple to "unarchive" as Gerrit's, in case a new maintainer shows up.

It's very easy to enable or disable an extension in translatewiki.net.

It's very easy to enable or disable an extension in translatewiki.net.

Ah, thanks. Good to hear. What would be the steps that allowed someone (like me) to "make that happen"?

To remove an extension from translatewiki.net it seems one removes it from https://phabricator.wikimedia.org/diffusion/GTWN/browse/master/groups/MediaWiki/mediawiki-extensions.txt possibly leaving behind a comment saying it is archived because it is not maintained anymore.
But one can also instead create a task with I18n to find someone else to do it.

To remove an extension from translatewiki.net it seems one removes it from https://phabricator.wikimedia.org/diffusion/GTWN/browse/master/groups/MediaWiki/mediawiki-extensions.txt possibly leaving behind a comment saying it is archived because it is not maintained anymore.
But one can also instead create a task with I18n to find someone else to do it.

Thanks a lot for investigating! Would be awesome if @Nemo_bis or @siebrand could confirm above steps so we can make them "part of the process" in the task summary.

To remove an extension from translatewiki.net it seems one removes it from https://phabricator.wikimedia.org/diffusion/GTWN/browse/master/groups/MediaWiki/mediawiki-extensions.txt possibly leaving behind a comment saying it is archived because it is not maintained anymore.
But one can also instead create a task with I18n to find someone else to do it.

Thanks a lot for investigating! Would be awesome if @Nemo_bis or @siebrand could confirm above steps so we can make them "part of the process" in the task summary.

Correct. If a patch is submitted, please add @Nikerabbit, @Raymond and @siebrand as reviewers.

Any further comments before I call this "final"?
To me the Criteria and Actions sections above look ready. So I'll document this in a new "Inactive / unmaintained projects" section on mw:Gerrit/Project_ownership soon and then close this task as resolved.

(And thanks to everybody who has participated in this interesting discussion so far!)

If a repository is inactive, I would like it to be read-only in Gerrit to prevent new patches, and to unconfigure it from CI entirely. Every repo has some kind of maintenance overhead.

If I get the last comment right, that would mean for this task's summary:

It's impossible to assess that without a definition of INACTIVE.

If I get the last comment right, that would mean for this task's summary:

Sounds good CI wise and Gerrit wise, thank you! Additionally we will want to update the project description and prefix it with the repo status / friendly hint.

It's impossible to assess that without a definition of INACTIVE.

Sure we need to have properly defined terms. Luckily both INACTIVE and POSSIBLY INACTIVE have been defined on the task detail of T102920.

I'm not sure it's desirable to disable translations on an extension just because there were no merged patches in 180 days: how many extensions are in this state? Probably hundreds.

No patches merged AND open changesets waiting for review for more than 180 days. I don't think there are hundreds of extensions in this situation.

In any case, the point is that such extensions are not being maintained, nobody is preparing new releases, and the translations made by volunteers are not being released to users. If those translators knew, they would probably focus their time on other, active projects.

The number of extensions which do "releases" is near zero: only MLEB and the Semantic Bundle, basically.

..., and the translations made by volunteers are not being released to users. If those translators knew, they would probably focus their time on other, active projects.

New/updated translations are committed nearly every day for every extension. That means, via Git clone and on mw.org/Special:ExtensionDistributor the latest translations are always available.

Update the description; leaving the translatewiki part as "to be discussed" as per last comments here.

As there is no agreement on disabling the repository on translatewiki.net, it was removed and is not mentioned in the actions to take on the wikipage.

I have transferred the definitions and proposed actions from the task summary to the new wikipage https://www.mediawiki.org/wiki/Gerrit/Inactive_projects and marked it as a draft.

The list of current actions isn't exactly short, rather manual, and some of them require special permissions, but that's what's currently feasible given the technical infrastructure and constraints.
How many people will actually perform these actions and how to "easily" identify such repositories (via e.g. korma statistics or such) is left to the future and a different task.
I've also added a preliminary link to that draft page on https://www.mediawiki.org/w/index.php?title=Gerrit/Project_ownership&type=revision&diff=1894368&oldid=1813794

So we are pretty much done here; I'm only leaving this open to leave a few days to see if the edit on [[mw:Gerrit/Project ownership]] will trigger notifications that provide more input. Then I'll remove the "Draft" banner and close this.

(Note to myself: I'm wondering how "archived" is defined in [[mw:Category:Archived extensions]] compared to the also existing "unmaintained".)

As there is no agreement on disabling the repository on translatewiki.net, it was removed and is not mentioned in the actions to take on the wikipage.

That might be an approach worthy of Solomon the Wise ;) . Because then the bot committing changes from translatewiki.net will be refused when uploading a patch to Gerrit. I hope we can reach a better agreement.

As there is no agreement on disabling the repository on translatewiki.net, it was removed and is not mentioned in the actions to take on the wikipage.

Because then the bot committing changes from translatewiki.net will be refused when uploading a patch to Gerrit. I hope we can reach a better agreement.

Hah, thanks Jan. Alright, so I hope the translatewiki folks can find some agreement here.

Maybe the agreement is that, as a first step, if an extension is still receiving localization patches, then it would not be blocked in Gerrit, but we would still mark it as inactive in the Extension page, remove it from Korma, etc. I guess a developer willing to work on a patch will check the extension page at least.

Inactivity does not define whether localisation updates are useful or not; that is determined by whether the extension is used somewhere and whether the version in use is periodically updated to the newest version available [1].

[1] That is, someone using an old version of an extension with an old version of MediaWiki does not count, as the extension might still be broken in practice.

Hence for translatewiki.net, it makes sense to separate enhancements from bug fixes when considering whether the extension is still relevant or not. In other words, we want to support inactive but working extensions, but we do not want to support inactive and broken extensions.

The definition of broken is of course a matter for discussion. For example, not answering issues reported by translators in translatewiki.net is, from our perspective, a bug in the extension and thus a good reason to drop support for it.

I would suggest there is some kind of path from inactive to broken, and translatewiki.net support would be dropped for broken extensions. Due to the nature of software development, inactive software has a tendency to become broken over time, but that time might be years during which the software is still useful.

Have you also considered how the case of extensions being purposefully discontinued (e.g. WebFonts and Narayam) would relate or fit in this process?

cscott subscribed.

Updated the "dead" parser-related repositories in the description.

Please, let's blacklist the 26 repositories mentioned in the description before the end of the month (say by next Tuesday, latest), so we can have cleaner metrics for September.

let's blacklist the 26 repositories mentioned in the description before the end of the month (say by next Tuesday, latest), so we can have cleaner metrics for September.

https://github.com/Bitergia/mediawiki-repositories/pull/3

Is it correct that if they are marked as archived in the CI they don't need to be blacklisted in korma?

Good question. I'm not sure, but I can check.

Can you first do that? Then blacklisting would not be needed and we wouldn't need to maintain so many places with the same information.

Also same question for read-only repos (they can't contain open patches as these are rejected at upload)?

After a quick look at that blacklist, I have no idea why certain repos are on it. Namely, at least, integration/raita (rCIRA). How is that blacklist worthy?

Noting here that I updated the etherpad with Fundraising Tech's list of deprecated repos.

Aklapper raised the priority of this task from Medium to High. Sep 27 2015, 6:27 PM

I updated the etherpad with Fundraising Tech's list of deprecated repos.

Thanks! Created https://github.com/Bitergia/mediawiki-repositories/pull/4 to blacklist them on korma.wmflabs.org

After a quick look at that blacklist, I have no idea why certain repos are on it. Namely, at least, integration/raita (rCIRA). How is that blacklist worthy?

The idea behind it is to exclude pulled third-party repos from our stats, see T103984. But mistakes happen and git.wikimedia.org/summary/integration/raita.git looks rather in-house.
@greg: Anything else to remove?

Summarizing the outcome: We don't mark repositories read-only in Gerrit or disable them on twn, so that "very stable" inactive repos can still receive translation updates. We leave patches open in Gerrit but remove the repositories from our code review metrics and CI tests, and we encourage inviting contributors to consider becoming maintainers.

That's our current answer to "what to do". Hence I'm closing this task as resolved.

Thanks everybody who took part in this discussion and provided input! Very appreciated.

This process can be found at https://www.mediawiki.org/wiki/Gerrit/Inactive_projects
I admit I have doubts how many people will follow those several steps listed there, given our many infrastructure tools, but let's continue improving our process while doing.

After a quick look at that blacklist, I have no idea why certain repos are on it. Namely, at least, integration/raita (rCIRA). How is that blacklist worthy?

The idea behind it is to exclude pulled third-party repos from our stats, see T103984. But mistakes happen and git.wikimedia.org/summary/integration/raita.git looks rather in-house.
@greg: Anything else to remove?

integration/consistency
integration/doc
integration/docroot
integration/junitdiff
integration/raita