
Leading items

Welcome to the LWN.net Weekly Edition for September 12, 2024

Please enjoy this week's edition, and, as always, thank you for supporting LWN.net.


A mess in the Python community

By Jake Edge
September 11, 2024

The Python community has been roiled, to a certain extent, by an action taken by the steering council (SC): the three-month suspension of a (weirdly) unnamed Python core developer. Tim Peters is the developer in question, as he has acknowledged, though it could easily be deduced from the SC message. Peters has been involved in the project from its early days and, among many other things, is the author of PEP 20 ("The Zen of Python"). The suspension was due to violations of the project's code of conduct that stem from the discussion around a somewhat controversial set of proposed changes to the bylaws for the Python Software Foundation (PSF) back in mid-June.

Changes

The proposed bylaw changes were, apparently, debated on a non-public PSF mailing list (psf-vote) around the same time they were posted to the PSF category of the Python discussion forum. Only one of the three changes sparked much discussion on the forum; it would change the way that PSF fellows can be removed for code-of-conduct violations. PSF fellows are members of the community who have been recognized "for their extraordinary efforts and impact upon Python, the community, and the broader Python ecosystem".

The proposal announcement focused on removing fellows, who are given PSF membership for life, but the actual change would allow removing any PSF member "as a consequence of breaching any written policy of the Foundation, specifically including our Code of Conduct". Instead of requiring a two-thirds majority of PSF members (which includes all of the fellows) to remove a member, the proposed wording would simply require a majority vote of the PSF board of directors to do so. All three of the changes to the bylaws passed easily, as noted in a mid-July announcement, though the controversial change received notably less support than the other two.

The main objection is that a simple majority is seen, by Peters and others, as too low a bar; Peters, who was a board member for 12 years, argued that since nearly all board decisions are unanimous anyway, it would seem reasonable to require unanimity (or at least two-thirds) of the board to remove a fellow. There was a lengthy discussion of multiple aspects of the change and why some thought it might not be such a good idea to allow a simple majority, particularly for some future PSF board that had some kind of ill intent. Others thought that the scenario was far-fetched. There were quite a few more posters in support of the change over the course of the discussion, though the posting volume of those opposed (or, at least, questioning it) was dramatically higher. There are, however, logistical problems in making any alterations to the proposed changes, board member Christopher Neugebauer said, so increasing the number of votes needed would have to wait for the next board-election cycle in a year's time.

Everyone seemed to agree that there needs to be a way for fellows to be removed by the board; even without the changes, other types of members could have been removed by the board using its existing powers, but not fellows. Some thread participants thought that a supermajority (e.g. two-thirds) might be better, but that could be handled with an amendment down the road; for now, it is better to have a way to remove misbehaving fellows without trying to do so in a public vote among all of the PSF members. The conversation went on for two weeks or so, with 175, often quite long, posts, but it was seemingly all fairly cordial, without rancor or seriously angry exchanges. There was disagreement, certainly, but little that stands out, at least in comparison to other public disputes in mailing lists and elsewhere over the years.

Peters was an eager, probably overeager, participant in the discussion in that thread, and in another that was split off to discuss questions and thoughts about the legal advice the PSF board received with regard to the changes. The discussions were sprawling, often heading in multiple directions, some of which could easily be considered off-topic—and were. Some of the posts were flagged by readers, leading to them being temporarily hidden by the Discourse forum software, which, in turn, led to complaints that dissent was being suppressed. Eventually, a moderator put the discussion into slow mode in hopes of getting it back on track.

During the final stages of the discussion, Neugebauer posted a FAQ from the board that addressed the various questions that had been posed in the discussion. While it does not—cannot—say so outright, the FAQ certainly gives the impression that there are fellows currently under consideration for removal. There was an undercurrent in the discussion about fellows whose behavior is in violation of the code of conduct, perhaps sometimes in non-Python spaces, who are being tied to the PSF through the fellowship designation. SC member Thomas Wouters said that he has seen "ample evidence that there are, in fact, Fellows who repeatedly flaunt the CoC [code of conduct]". In another "example", Alan Vezina outlined a problematic scenario that is hypothetical, but "also grounded in real events".

SC statement

On July 11, well after the discussions about the proposed change had died down, Gregory P. Smith posted a statement from the SC titled: "Inclusive communications expectations in Python spaces". It said that during those discussions:

[...] we witnessed some disturbingly unprofessional conversations take place, involving many well known individuals. These comments have alienated many members in our community and act to prevent others from wanting to join and become a part of Python's future.

What followed was a list of examples that were clearly aimed directly at Peters. In fact, the third entry points to "resisting soft conduct moderation", which had apparently just been tried with Peters without success. As he noted in the resulting thread, he was contacted privately about his behavior, though he did not agree with the complaints. The posted message seems to be something of a "brushback pitch" from the SC, warning Peters to back off "[w]hile code of conduct discussions are ongoing"—or face the consequences. From his responses, that seems to be how Peters saw it as well.

Beyond resisting the SC's efforts to have him moderate his posts, the other two examples listed are not particularly convincing, at least to Peters and others who responded. The first ("Bringing up examples of sexual harassment and making light of workplace sexual harassment training.") apparently refers to a post that decries how workplace sexual harassment is handled. Perhaps that message also runs afoul of the second example ("Utilizing emojis and turns of phrase in ways that can be misconstrued or perceived entirely differently by different people.") since Peters did use the "wink" emoji—something that makes frequent appearances in his posts. The real crux of the matter may well be contained in the rest of the second example:

Be mindful when writing that your audience is a much broader diverse professional community than the small collegial group of insiders that Python evolved from decades ago. Some communication styles that were unfortunately common in the past are rightfully recognized as inappropriate today.

As might be guessed, Peters replied at length; multiple other people also replied, both in support of the SC's position and in support of Peters. The SC clearly struggled with the wording, but it did itself no favors with the examples it chose. However, the committee, and the Python community as a whole, perhaps, are trying to turn the page on the old ways. The belief is that those old ways are chasing people away from the project and the SC is the ultimate arbiter of how to stop that. Peters would likely agree that he is a clear and obvious example of those old ways.

Underneath the surface, though, is a general disagreement about the code of conduct and how it is applied. The SC statement said:

When someone tells you something said earlier was problematic, listen. That is not an opportunity for debate. That is an expression of pain. It is time to stop and reflect upon what has happened.

But Peters and others are concerned that goes too far in one direction without applying "reasonable person" standards. Peters noted that there are "more than just a few PSF members" who are worried that code-of-conduct actions will ruin their careers, which keeps them from participating fully. Meanwhile, there is a two-way street with regard to these kinds of complaints. "When someone is saying they find the SC's actions or statements are over the line, that too is an expression of pain."

Smith said that the examples listed in the statement came directly from community-member complaints:

They do not owe anyone an explanation of how or why they interpret things perhaps different than yourself. The important point is to recognize that people do. The right course of action is to believe them, accept that, and learn. Not try to tell them they are wrong.

He also said that those who are facing code-of-conduct actions are fully responsible for that state of affairs. A single complaint is not likely to result in removal from the project; it is instead repeated patterns of "disrespectful behavior" coupled with an inability or unwillingness to "learn and improve" when they are pointed out. While the statement was not aimed at a single person, since multiple people were involved, Peters "was one big part", so the SC did contact him privately to discuss things.

For his part, Peters said that he appreciated the SC contacting him, but that he did not agree with what was said:

The original expressed concern was that I was "making light of the SH [sexual harassment] itself". Which I replied couldn't be read that way by a reasonable person. I didn't get a response, but next thing I saw was a public post in which the claim had morphed into that I was "making light of workplace SH training". Which is also an implausible reading, although slightly less implausible.

[...] I decided to let it go, since there's no point fighting confirmation bias. If you're determined to take offense, offense is what you'll find.

Brendan Barnwell was also concerned that, based on "the tenor of the comments from the moderation team/SC/PSF board", the decision essentially comes down to: "'If Person A does something and Person B interprets that in a way they find objectionable, it is automatically (or at least by default) Person A who must accept they are in the wrong.'" But that is not fully reflecting the code-of-conduct guideline of "'being [respectful] of differing viewpoints and experiences'", he said. Sometimes Person B needs to rethink their interpretation too.

Like Barnwell, several complained that the examples in the SC statement were hard to interpret, so that they really did not help others who might be trying to learn what to avoid. Karl Knechtel, Chris McDonough (a PSF fellow, like Peters), and "Paddy3118" specifically called out the wording. While Smith said that the SC felt that it could not really be more specific in its statement, no one else seemed interested in defending the wording of the examples given.

Suspension

In any case, though he was warned a few different ways, Peters did not change his behavior. He continued posting, started new potentially controversial threads, participated in two threads on members removing themselves from the Python discussion forum, and generally kept doing what he has always done. One of those "I'm leaving" threads was from former PSF chairman, board member, and emeritus fellow Steve Holden, while the other was from Knechtel. The latter thread managed to get Knechtel an indefinite suspension from the forum, though the thread is somewhat difficult to follow due to edits, either by the moderators or Knechtel himself. Around the same time, former board member David Mertz renounced his PSF fellow status, which was, apparently, the final straw for him, as Mertz drew an indefinite forum suspension as well.

So it probably came as no surprise to anyone who was paying attention that the SC suspended Peters as a core developer for three months on August 7. The announcement, which was posted by Wouters, mostly quoted from the code-of-conduct work group's recommendation, which had a bullet list of ten items. As might be guessed, those items are unlikely to sway those who feel like Peters has been mistreated—much the reverse, in truth. The announcement specifically does not mention Peters, though the first bullet item clearly and obviously identifies him: "Overloading the discussion of the bylaws change (47 out of 177 posts in topic at the time the moderators closed the topic), which created an atmosphere of fear, uncertainty, and doubt, which encouraged increasingly emotional responses from other community members." The list is likely to be a litmus test of sorts; some will nod approvingly, while others will find it severely lacking for a variety of reasons.

SC member Emily Morehouse pointed out that it is often difficult to communicate what led to a specific code-of-conduct action; the list provided is only meant to give the flavor of the offenses:

This is not the first time that there was no singular incident or sentence that comprised an offense. The list of behaviors, in my opinion, is not a list of individual things so egregious that resulted in a suspension. They are examples that attempt to summarize an overall communication style that pushed boundaries too far and caused harm to multiple people in our community.

[...] In CoC violations in general, I wish that there was a way to paint the whole picture for the community at large, but it's unfair to both the person(s) who report issues and the people who are being suspended. I do hear the feedback on this and hope that our communications and process can be improved in the future.

Throughout the various threads (and others that have not been mentioned), there have been concerns expressed about the moderation of forum threads. Those manifest as posted complaints, of course, as well as, undoubtedly, private messages to the moderators group, some of which are less than friendly and likely wildly abusive at times. That makes the moderators sensitive to criticism, thus more willing to moderate posts of that nature, which is seen as more evidence that moderation is out of control. In addition, moderation is perceived as a tool of those in power in the Python world (e.g. SC, PSF board), so its use in controversial threads like the bylaws-change discussion just perpetuates the vicious cycle.

Moderation in any community is a hard and thankless job; we certainly struggle with it here at LWN. Trying to apply rigid boundaries to human (mis)conduct is always fraught, so moderators will always be in the "wrong" with some portion of the audience. In the final analysis, Peters will have to recognize that his communication style is no longer welcome in the Python world, then either decide to change how he communicates—or bow out. The latter would be unfortunate, since some large percentage of his posts are useful, interesting, helpful, amusing, or some combination of those, without crossing or even approaching any of the code-of-conduct "lines".

Peters's return from his three-month suspension is far from guaranteed, though. In previous cases, it has required a request to the SC for reinstatement, which has generally not happened with others in his shoes. But it is not really a surprise that Peters is already missed. Serhiy Storchaka asked about contacting Peters for his technical expertise, which Wouters seemingly discouraged; Storchaka specifically said he was not trying to route around the SC decision, but might be able to continue to work with Peters through GitHub. Meanwhile, in a thread about possible changes to the voting system for SC elections, Guido van Rossum noted that the voting-system expert he knows is banned: "Maybe we can wait until his 3-month ban expires and ask him for advice?"

There are lots of other aspects to these incidents, including several other threads, a withdrawn call for a vote of no confidence in the SC, and more. The no-confidence-vote and the voting-system threads both indicate that more changes may be coming to try to address some of the complaints and concerns that were raised by these events. Where any of that will lead is unclear at this point, but may warrant a look down the road. In the meantime, one has to hope that Peters, the creator of Timsort and so much more, will find his way back to the community he helped found more than 30 years ago.


Attracting and retaining Debian contributors

By Jake Edge
September 9, 2024
DebConf

Many projects struggle with attracting and retaining contributors; Debian is no different in that regard. At DebConf24, Carlos Henrique Lima Melara and Lucas Kanashiro gave a presentation about efforts that the Brazilian Debian community has made to increase participation. Their ideas and the lessons learned can be applied more widely, both for other Debian communities and for other projects.

Kanashiro introduced himself as a software engineer, Debian developer since 2016, and an Ubuntu core developer working on Ubuntu Server at Canonical. Melara said that "everybody calls me Charles", and that is his name within Debian; he is a computer engineer interested in security and operating systems, working on embedded systems at Toradex. He has recently become a Debian developer.

History

Eight or nine years ago, a local Debian community was formed in Brasília, which is the capital of Brazil, "right in the middle of the country", Kanashiro said. He was working on Debian without any local colleagues at the time and wanted to try to get others involved. So he and his former professor began holding events and installfests, which worked pretty well, "but in the end we were not able to engage people" and retain them in a local community.

[Carlos Henrique Lima Melara]

Later, he was able to work with some university students who were interested and they formed a group that met monthly, sometimes in conjunction with a local Linux users group. His goal, though, was not to be a users group; instead he wanted to have a local community of Debian contributors. But, then, along came the pandemic, and everything stopped.

A few months after the start of the pandemic, the group started to have online meetings. That is "when things started to get some traction", largely because people from other cities were able to join in. The group also started to be able to contribute to Debian; it now has four Debian developers, five Debian maintainers, and those numbers increase every semester.

Brazil now has an engaged and active community of Debian contributors. He showed a picture of attendees at the MiniDebConf Belo Horizonte that was held in April; there were around 225 participants, Kanashiro said. He also showed a picture of all of the Brazilian attendees at last year's DebConf in Kochi, India; there were 14 people in the picture. The Brazilian Debian community likes to promote what it is doing, by showing up and participating in events like those, but it is also interested in improvement and in helping other communities to improve as well.

To that end, Melara asked the audience how many were Debian developers (lots) and maintainers (two); then he asked how many "have a thriving local community where you are living?" There were a few hands raised, which is "nice", he said, but is indicative of the problem that they have been trying to solve, both in the city of Brasília and in the whole country of Brazil.

There are two things that they think are important to overcome in order to create and build a new local community. The first is to lower the barriers for newcomers to contribute. But once you have attracted a newcomer and gotten them to contribute, how can they be retained so that they become long-term contributors or even become Debian maintainers and developers? The main goal of the talk, he said, is to relate what the Brazilian community has done "to make those two things a reality".

Barriers and problems

Kanashiro described some of the barriers for newcomers that the Brazilian community has encountered. Documentation for newcomers is often missing, or is spread around in various places if it exists, so it is hard to find. Another problem is that there are too many different tools and workflows for Debian packaging, which is confusing to newcomers. Searching for packaging tools, or talking to different Debian developers about it, will lead to too many options. Communicating via mailing lists and IRC is the norm in Debian, but "that does not work well" for the "new generation"; if you want to bring those folks into the project, there needs to be some kind of workaround. "Of course, we will not change everything, because the community works like that, but, for sure, we need to find a way to lower this barrier."

English is not an easy language for many potential newcomers in Brazil, but most of the documentation that is available is written in English, so that needs to be addressed. People, especially new participants, are shy about asking for help; they do not know who the right people are to ask about how to contribute, and they worry that, even if they did, those people are so busy that a newcomer should not bother them. The final barrier that he relayed is the technical jargon, terms, and acronyms that pervade Debian and its packaging; newcomers find it hard to understand what people are even saying when they are talking about those topics.

Melara said that the list of barriers is long; he had some more to add to the pile. Newcomers obviously already know about Git, rebasing, packaging tools, using the command line, and more, right? "Yeah, no, that's not true." Most of the new people are coming from Windows, so they are not familiar with the command line and only have a general sense of how computers and operating systems work. As noted earlier, the diversity of tools and workflows is intimidating as well.

Once you get someone interested, engaged, and perhaps even doing some packaging work or fixing a bug, he asked, how do you encourage them to continue to contribute in the long term? "It's difficult to make this transition from a first-time contributor to a long-term one."

Once a contributor gets to the point where they know how to use the tools, fix bugs, and do packaging tasks, local community members help them to become Debian maintainers. But there is a similar problem when that happens, he said; they get upload permissions for the packages they maintain, but there is no clear path forward. The goal is for the Debian-maintainer role not to get stale.

The last problem is with visibility for local groups. For those doing packaging, there are various statistics that can show the work that is going on, which can help increase the visibility of the group. But there are other tasks that are not really measured that way, including putting together events like the MiniDebConf. Those things are important for the community but are not as visible.

So, with those barriers and problems in mind, Melara said, they wanted to talk about the solutions that the local group has come up with. These things have been worked out over the last four or five years; they would be presenting what they are doing now, which "seems to work for us".

Solutions

Kanashiro said that the documentation problems are being addressed with a wiki for Debian Brasil. They had encountered a number of problems with using the official Debian wiki due to IP-address blocks, which restricted access and required pinging people to get addresses unblocked. Beyond that, the Debian wiki is not particularly modern, while the Debian Brasil wiki, which is based on Wiki.js, has Markdown-editing capability and allows editing pages side-by-side with their rendering, both of which make it easier to use.

[Lucas Kanashiro]

All of the wiki content is in Portuguese, since it is targeting Brazilians. There are some basic tutorials on setting up an environment to do packaging work, the structure of a Debian package, using packaging tools, and so on. There are also videos from previous events on topics of interest. "Everything is there, everything is in Portuguese". Since it is a wiki, it can be edited, of course; if someone has a problem when they follow the steps on the wiki, they are asked to edit the item to improve the document for the next person.

This helps address two of the barriers: lack of documentation for newcomers and the prevalence of documentation in English. They could have translated the existing documentation, he said, but what they have is easier and works better for the Brazilian community, at least for now. There has been some interest in updating the Debian wiki to something modern, which might make it easier to merge the two at some point.

The tool-proliferation problem has been fixed by the simple expedient of choosing "a very opinionated set of tools" for newcomers to use for packaging, Melara said. These are the same tools that he, Kanashiro, and others use, so they can easily review the work that is being done and "make sure that everything is OK for uploading", Melara said. The process uses sbuild, git-buildpackage (or gbp), DEP-14 Git layouts, and the Salsa GitLab instance for Debian.
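As a rough illustration of what such an opinionated setup can look like, the sketch below uses Python's configparser to generate a small gbp.conf (typically kept at debian/gbp.conf) that points git-buildpackage at DEP-14-style branch names. The branch names follow the DEP-14 recommendations; the specific option values, and the notion that Debian Brasil configures its packages exactly this way, are assumptions for illustration and were not shown in the talk.

```python
import configparser
import io

# Assumption: DEP-14-style branch names for a package maintained in Debian.
# The talk did not show Debian Brasil's actual configuration.
config = configparser.ConfigParser()
config["DEFAULT"] = {
    "debian-branch": "debian/latest",      # packaging branch
    "upstream-branch": "upstream/latest",  # imported upstream sources
    "pristine-tar": "True",                # keep data to regenerate tarballs
}

# Render the configuration to text instead of writing to disk, so the
# sketch has no side effects.
buffer = io.StringIO()
config.write(buffer)
gbp_conf_text = buffer.getvalue()
print(gbp_conf_text)
```

Fixing one such layout for everyone is what lets reviewers check newcomers' work without first working out how each repository is organized.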

All of the review comments are picked up by a bot to send to the Matrix channel for the group. That allows others to see what changes people are asking for in review and to learn from that. There is a "very, very detailed workflow", which is well documented and only requires learning two tools and Git; that is the starting point for newcomers, though they can use other tools and workflows once they get their feet under them. This helps address the problems with too many tools and workflows; "it's pretty easy for newcomers".

The Brazilian community is using modern tools for communication, Kanashiro said. It is difficult to make big changes in the overall Debian community, but their local community can make its own way. IRC is the default chat mechanism for Debian, but younger people do not use IRC; in Brazil, they use Telegram instead. It is not practical to expect them to leave Telegram and move to IRC, so Debian Brasil has created a bridge that connects IRC, Matrix, and Telegram so that people can choose the one that works best for them "and everyone will be there discussing things".

One of the big complaints about IRC is that messages get lost when people are not logged in, so Debian Brasil provides a Convos service for its users. Convos is a web client and server-side "bouncer" for IRC, which handles message persistence so that users can always get the messages sent. These tools have been helping with the communication barrier that was identified, but they also help with long-term commitment to the community since people can track and review all of the different comments being made on package reviews, merge requests, and other activities.

Helping people to transition from proprietary tools to free ones ("as in free speech, not free beer") is another thing that the group does, Melara said. There is an ongoing effort to convince people to move from Telegram to Matrix or IRC. It takes a long time, he said, but he thinks that progress is being made there.

Another way to help keep people involved is to keep in touch frequently. Debian Brasil has weekly online meetings using Jitsi. As might be guessed, those calls are in Portuguese. Normally the first hour and a half or so are spent on problems that people are having; after that, it becomes a "happy hour" where people talk about other things. It is a bonding experience that helps make newcomers more comfortable by seeing the other members in a social setting.

It is important to be consistent and not skip any weeks, because if people try to attend and the meeting is not held, that may make them less likely to come the next time. So Melara tries to be there every week; his slide had a picture from DebCamp the previous week of a few Brazilians attending the Jitsi meeting from Korea at 7am Friday, which was the usual 7pm Thursday meeting time in Brazil.
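The time-zone arithmetic behind that slide can be checked with a short sketch using Python's standard zoneinfo module. The calendar date is an assumption (an arbitrary Thursday in late July 2024); the article gives only the weekday and the hour. The zone names America/Sao_Paulo and Asia/Seoul are likewise chosen here to stand in for "Brazil" and "Korea".

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Assumption: an illustrative Thursday (July 25, 2024); the article gives
# only the weekday and the hour of the meeting, not the date.
meeting_in_brazil = datetime(
    2024, 7, 25, 19, 0, tzinfo=ZoneInfo("America/Sao_Paulo")
)

# Convert the same instant to Korean local time.
meeting_in_korea = meeting_in_brazil.astimezone(ZoneInfo("Asia/Seoul"))

print(meeting_in_brazil.strftime("%A %H:%M"))
print(meeting_in_korea.strftime("%A %H:%M"))
```

With Brasília at UTC-3 and Korea at UTC+9, the twelve-hour difference turns a Thursday-evening meeting into a Friday-morning one, matching the 7am start described in the talk.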

Debian Brasil has a well-defined process for packaging contributions, Kanashiro said; it is based around using issues and merge requests on Salsa. Newcomers often know about using GitHub, so it is easy to get them using similar features on Salsa. Instead of going to a Debian mailing list to ask for help in English, they can get help in Portuguese via chat. There is a real effort to provide feedback quickly, he said, rather than having them wait weeks or months, which can happen in other parts of the Debian project. They have been successful in getting new people to contribute this way, he said.

The process can be easily tracked because tags can be assigned in GitLab, Melara said. That allows the progress to be tracked by email or via the web client; reviews can be made and checked in email and the comments get relayed to the Matrix channel so that people can keep up with what others are doing. "Yeah, it's pretty nice."

Reviews are handled by any available Debian developer or maintainer, but they have also developed a mentorship program to try to help new Debian maintainers to keep contributing. The mentor is a Debian developer who gets assigned to the maintainer to help them, on a one-to-one basis, explore different possibilities of things to work on. This program is new to Debian Brasil; it is still being worked on, including in a discussion in an informal birds-of-a-feather (BoF) session earlier in DebConf, Melara said.

But the plan is for it to be more than just working with the mentor, since they may be unavailable at times. So there will also be group discussions to share knowledge and look for tasks to work on beyond just the handful of packages that a maintainer is responsible for. That way, there will be faster responses with input from more people, which is especially helpful for harder questions and problems.

Application

Other local communities can use some of what they have learned, Kanashiro said. He suggested that Debian could create "virtual environments for local groups" that come with "a bunch of different tools that helped us a lot". Debian Brasil has people available to set up and run all of those tools, but other local communities may not.

A local group could ask to have this environment set up for its members. They would get a wiki area for the group with some amount of storage reserved for them. The IRC bouncer would be configured and started. There would be a way to generate statistics for the activities of the group. Providing "an easy way to do all of this for any local community" would be "great to have" for Debian.

The younger generation ("Gen Z") is not interested in lengthy videos, Melara said, so they have been working on three-to-four-minute videos explaining basic packaging concepts, answering FAQs, and so on. The localization team created videos a ways back, which were well-received, so they are trying to do something similar for packaging.

Eventually, they would like to collect them all into a video course on packaging, Kanashiro said. It would present their opinionated view on packaging tools and processes; much of it is already done and is being used. The goal is to start with everything in Portuguese, naturally, but eventually to also have them in English. Getting the video content onto the Debian PeerTube is in the works as well.

There are some open questions, Melara said. He wondered if the contribution and mentorship processes would make sense elsewhere; can those be applied or adapted for other local communities? Are the problems that have been identified mostly Brazilian in nature or are they more widespread? There was not a lot of time left in the session at that point, so those questions mostly went unanswered.

A packaging-team member from the audience said that regular calls were useful, but that they are hard to arrange when people are distributed across the globe. For local groups with participants in similar time zones, it is much easier. Mentoring people using GitLab issues "is a really good idea", he said, which makes it easier to see "what's going on and what's not". Kanashiro suggested that recording the calls and making them available can help those who cannot attend them live.

Samuel Henrique, who is part of the Debian Brasil group, pointed out that there are participants in the call from elsewhere, including him from Ireland; they have managed to find a time that works, though "it's not perfect" since he ends up staying up until 3am most Fridays. Melara said that some people come early, some stay late, and some do both.

There is a question of how to promote the work that local groups do, Kanashiro said. One way is to do more frequent blog posts and make them available on planet.debian.org so that others can see what the group has been doing. They have also been talking with the people working on contributors.debian.org about adding data sources for things like event organization, meeting attendance, and various group activities. Currently, those things are not tracked.

The final item in their presentation was about new Debian maintainers getting more exposure to the whole project and, in particular, to people outside of the Brazilian community. The mentors are often encouraging these people to broaden their horizons and to get in touch with the project-wide Debian mentors organization. The intent is to get them working on more complex projects once they come up to speed on packaging. But the local mentoring is a new program for Debian Brasil, so they are still working it all out and are interested in what others are thinking.

With that, time expired on the session, which provided some ideas that may well be applicable to other Debian local groups—and beyond. It is interesting to note that some of the choices that are being made at the local level (e.g. opinionated packaging, moving away from IRC-only) are going to be difficult to push at the distribution-project level, at least for some time to come. The relative success of local organizations at attracting and retaining new people could—slowly—start pushing Debian as a whole toward more modern approaches. Time will tell.

A WebM video of the talk is available for those who are interested.

[I would like to thank the Linux Foundation, LWN's travel sponsor, for its assistance in visiting Busan for DebConf24.]


The trouble with iowait

By Jonathan Corbet
September 10, 2024

CPU scheduling is a challenging job; since it inherently requires making guesses about what the demands on the system will be in the future, it remains reliant on heuristics, despite ongoing efforts to remove them. Some of those heuristics take special note of tasks that are (or appear to be) waiting for fast I/O operations. There is some unhappiness, though, with how this factor is used, leading to a couple of patches taking rather different approaches to improve the situation.

In theory, a task that is waiting for short-term I/O (a state referred to in the kernel as "iowait") will need to execute soon. That means that there can be some advantages to treating the task as if it were executing now. The kernel maintains a one-bit field (called in_iowait) in the task_struct structure to mark such tasks. This bit is set prior to waiting for an I/O operation that is expected to be fast (typically a block I/O operation) and cleared once the operation completes. The kernel then uses this information in a couple of ways:

  • When an iowait task wakes on completion of the I/O, the scheduler will inform the CPU-frequency governor. The governor, in turn, may choose to run the target CPU at a higher frequency than it otherwise would. Normally, the CPU-frequency decision is driven by the level of utilization of the processor, but tasks performing a lot of I/O may not run up a lot of CPU time. That can lead the CPU-frequency governor to choose a slower frequency than is optimal, with the result that the next I/O operation is not launched as quickly and throughput suffers. Raising the frequency for iowait tasks is meant to help them keep the I/O pipeline full.
  • If a CPU goes idle, the system will normally try to put it into a lower-power state to save energy. The deeper the sleep state, though, the longer it takes for the CPU to get back to work when a runnable task is placed on it. The number of iowait tasks queued for a CPU is used as a signal indicating upcoming CPU demand; the presence of those tasks can cause the governor to choose a shallower sleep state than it would otherwise.

In theory, judicious use of the in_iowait flag can lead to significantly improved throughput for I/O-intensive tasks, and there are cases where that is demonstrably true. But the iowait handling can bring other problems, and its effectiveness is not always clear.

Iowait and io_uring

Back in July 2023, Andres Freund encountered a performance problem in the kernel. It was not quite as sensational as certain other problems he has run across, but still seemed worth fixing. He noticed that PostgreSQL processes using io_uring ran considerably slower (as in, 20-40% slower) than those using normal, synchronous I/O. In the synchronous case, the in_iowait flag was set, keeping the CPU out of deeper sleep states; that was not happening in the io_uring case. Freund's proposed fix was to set the in_iowait flag for tasks waiting on the io_uring completion queue; that recovered the lost performance and more. Io_uring maintainer Jens Axboe was quickly convinced; he merged the patch for the 6.5 kernel, and marked it for inclusion into the stable updates as well.

Once that patch was distributed in stable kernels, though, problem reports like this one from Phil Elwell began to appear. Suddenly, tasks on the system were showing 100% iowait time, which looked like a confusing change of behavior: "I can believe that this change hasn't negatively affected performance, but the result is misleading," Elwell commented.

The root of the problem is the treatment of the iowait state as being something close to actually running. User-space tools (like top or mpstat) display it separately and subtract it from the idle time; the result is the appearance of a CPU that is running constantly, even though the CPU is actually idle almost all of the time. That can result in the creation of confused humans, but also seemingly can confuse various system-management tools as well, leading them to think that a task with a lot of iowait time has gone off the rails.
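As a rough illustration of where such numbers come from, the sketch below computes per-state percentages from two samples of the aggregate "cpu" line in /proc/stat. The field order (user, nice, system, idle, iowait, and so on) follows the proc(5) man page; the function names and the sample values are invented for this example, and real tools like top or mpstat may differ in the details.

```python
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line of /proc/stat into a dict of ticks."""
    parts = line.split()
    if parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def percentages(before, after):
    """Return the share of time spent in each state between two samples."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {name: 0.0 for name in FIELDS}
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


# Hypothetical samples: the CPU did almost no work between them, but a
# task spent nearly the whole interval marked as being in iowait.
sample_1 = parse_cpu_line("cpu 1000 0 500 8000 100 0 0 0 0 0")
sample_2 = parse_cpu_line("cpu 1010 0 505 8000 1085 0 0 0 0 0")
usage = percentages(sample_1, sample_2)
print(f"idle {usage['idle']:.1f}%  iowait {usage['iowait']:.1f}%")
```

With these made-up samples, the idle share comes out at zero even though the processor executed almost nothing; all of the waiting time lands in the iowait column instead.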

Axboe responded with a change causing in_iowait to only be set in cases where there were active operations outstanding; it was merged later in the 6.5 cycle. That addressed the immediate reports, but has not put an end to the complaints overall. For example, in February, David Wei pointed out that tools can still be confused by high iowait times; he included a patch to allow users to configure whether the in_iowait flag would be set or not. That patch went through a few variants, but was never merged.

Pavel Begunkov had objected to an early version of Wei's patch, saying that exposing more knobs to user space was not the right approach. Instead, he said, it would be better to separate the concepts of reporting iowait time to user space and influencing CPU-frequency selection.

It took a while, but Axboe eventually went with that approach. His patch series, now in its sixth version, splits the in_iowait bit into two. One of those (still called in_iowait) is used in CPU-frequency decisions, while the other (in_iowait_acct) controls whether the process appears to be in the iowait state to user space. Most existing code in the kernel sets both bits, yielding the same user-space-visible behavior as before, but io_uring can set in_iowait alone. That, Axboe hopes, will bring an end to complaints about excessive iowait time.
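The effect of the split can be modeled in a few lines. What follows is a toy model in Python, not kernel code: the Task class and the helper functions are hypothetical, and only the two flag names come from the patch series as described above.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Toy stand-in for the two flags; not the kernel's task_struct."""
    name: str
    in_iowait: bool = False       # hint used for CPU-frequency decisions
    in_iowait_acct: bool = False  # shown as iowait in user-space statistics


def block_io_wait(task):
    """Model of most existing kernel paths: both bits are set."""
    task.in_iowait = True
    task.in_iowait_acct = True


def io_uring_wait(task):
    """Model of the io_uring path: only the hint bit is set."""
    task.in_iowait = True
    task.in_iowait_acct = False


def wants_boost(tasks):
    """True if any waiting task should influence the governor."""
    return any(t.in_iowait for t in tasks)


def reported_iowait(tasks):
    """Number of tasks that user space would see in the iowait state."""
    return sum(1 for t in tasks if t.in_iowait_acct)


db = Task("database-using-io_uring")
cp = Task("cp")
io_uring_wait(db)
block_io_wait(cp)
print(wants_boost([db]), reported_iowait([db]))
print(wants_boost([cp]), reported_iowait([cp]))
```

In this model the io_uring waiter still influences the governor but contributes nothing to the iowait figure that monitoring tools see, which is the separation that Begunkov had asked for.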

This change is not universally popular; Peter Zijlstra expressed some frustration over the seeming papering-over of the problem: "are we really going to make the whole kernel situation worse just because there's a bunch of broken userspace?" User space is what it is, though, and Axboe's patch set can address some of the complaints coming from that direction — in the short term, at least.

Eliminating iowait

The discussion on the visibility of the iowait state has brought to the fore a related topic: does the iowait mechanism make any sense at all? Or might iowait be a heuristic that has outlived its time? Christian Loehle thinks that may be the case, and is working to remove the iowait behavior entirely.

There are a number of problems with how iowait works now, he said. A CPU-idle governor might keep a CPU in a higher-power state in anticipation that some iowait tasks will soon become runnable, but there is no guarantee that any of those tasks will actually wake in a short period of time. "Fast I/O" is not defined anywhere, and the kernel has no sense for how long an I/O operation will actually take. So the CPU could be wasting power with nothing to do. When a task does wake, the scheduler will pick what appears to be the best CPU to run it on; that may not be the CPU that was kept hot for it.

Boosting a CPU's frequency after a task wakes may appear to avoid these problems, but there are problems there too. A task can migrate at any time, leaving its boosted CPU behind. The targeted tasks run for short periods of time; the fact that they do not use a lot of CPU time is why the separate boosting mechanism was seen as necessary in the first place. But changing a CPU's frequency is not an instant operation; the iowait task is likely to have gone back to sleep before the CPU ramps up to the new speed. That means that the CPU must be kept at the higher speed while the task sleeps, so that the boost can at least help it the next time it wakes. But, again, nobody knows when that will be or if the task will wake on the same CPU.

On top of all this, Loehle asserted that CPU-frequency boosting is often not helpful to I/O-intensive tasks in any case. All of this reasoning (and more) can be found in the above-linked patch series, which removes the use of iowait in CPU-idle and CPU-frequency management entirely. On the idle side, Loehle noted that the timer events oriented (TEO) governor gets good results despite having never used iowait, showing that the iowait heuristics are not performance-critical. So, along with removing the use of iowait, the patch series makes TEO into the default CPU-idle governor, in place of the menu governor that is the default in current kernels.

Loehle insisted that the iowait heuristics are only useful for "synthetic benchmarks". For the io_uring case described above, he said, the real problem was the CPU-idle governor using iowait (or the lack thereof) to put the CPU into a deeper sleep state. His patch series removes that behavior, so there is no longer any need for io_uring to set the in_iowait flag — or for changes to how iowait tasks are reported to user space.

He clearly thinks that this is the proper way to solve the problem; he described Axboe's patch series as "a step in the wrong direction". Axboe, though, does not want to wait for the iowait removal to run its course; his series solves the problem he is facing, he said, and it can always be removed later if iowait goes away.

Chances are that things will play out more-or-less that way. Axboe's patches could land as soon as 6.12, bringing an end (hopefully) to complaints about how io_uring tasks appear to be using a lot of CPU time. Heuristics, though, have been built up over a long time and can be harder to get rid of; there will be a need for a lot of testing and benchmarking to build confidence that changing the iowait heuristics will not cause some workloads to slow down. So Loehle's patch series can be expected to take rather longer to get to a point where it can be merged.


Testing AI-enhanced reviews for Linux patches

By Joe Brockmeier
September 6, 2024

Code review is in high demand, and short supply, for most open-source projects. Reviewer time is precious, so any tool that can lighten the load is worth exploring. That is why Jesse Brandeburg and Kamel Ayari decided to test whether tools like ChatGPT could review patches to provide quick feedback to contributors about common problems. In a talk at the Netdev 0x18 conference this July, Brandeburg provided an overview of an experiment using machine learning to review emails containing patches sent to the netdev mailing list. Large language models (LLMs) will not be replacing human reviewers anytime soon, but they may be a useful addition to help humans focus on deeper reviews instead of simple rule violations.

I was unable to attend the Netdev conference in person, but had the opportunity to watch the video of the talk and refer to the slides. It should be noted that the idea of using machine-learning tools to help with kernel development is not entirely new. LWN covered a talk by Sasha Levin and Julia Lawall in 2018 about using machine learning to distinguish patches that fix bugs from other patches, so that the bug-fix patches could make it into stable kernels. We also covered the follow-up talk in 2019.

Using LLMs to assist reviews, though, seems to be a new approach. During the introduction to the talk, Brandeburg noted that Ayari was out of the country on sabbatical and unable to co-present. The work that Brandeburg discussed during the presentation was not yet publicly available, though he said that there were plans to upload a paper soon with more detail. He also mentioned later in the talk that the point was to discuss what's possible rather than the specific technical implementation.

Why AI?

Brandeburg said that the interest in using LLMs to help with reviews was not because it's a buzzword, but because it has the potential to do things that have been hard to do with regular programming. He also clarified that he did not want to replace people at all, but to help them because the people doing reviews are overwhelmed. "We see 2,500 messages a month on netdev, 10,000-plus messages a month on LKML", he said. Senior reviewers have to respond "for the seven thousandth time on the mailing list" to a contributor to fix their formatting. "It gets really tedious" and wastes reviewers' time to have to correct simple things.

There are tools to help reviewers already, of course, but they are more limited. Brandeburg mentioned checkpatch, which is a Perl script that checks for errors in Linux kernel patches. He said it is pretty good at what it does, but it is "horrible for adapting to different code and having any context". It may be able to spot a single-line error, but it is "not great at telling you 'this function is too long'".

The experiment

For the experiment, Brandeburg said that he and Ayari used the ChatGPT-4o LLM and started giving it content to "make it into a reviewer that is an expert at making comments about simple things". He said that they created a "big rule set" using kernel documentation, plus his and other people's experience, to set the scope of what ChatGPT would review. "We don't really want AI to be just, you know, blowing smoke at everybody."

Having a tool provide feedback on the simple things, he said, would allow him to use his "experience and knowledge and context and history, the human part that I bring to the equation". Another benefit is that the tool could be consistent. Looking through the mailing list, "people get inconsistent responses even on the simple things". For example, patches may lack correct subject lines or have terrible commit messages but "someone commits it anyway".

Brandeburg said that they tried to build experiments that would see if AI reviews could work, and compare its results with real replies as people worked through reviews and posts on the netdev list. He displayed a few slides that compared LLM review to "legacy automation" as well as human reviews and walked through some examples of feedback given by each. The LLM reviews actually offer suggestions or help, he said, but reviewers often do not. "They say stuff like 'hey, will you fix it?' or 'hey, can you go read the documentation?'" But ChatGPT gives good feedback in human-readable language. In addition, LLMs are "super great at reading paragraphs and understanding what they're trying to say" which is something that tools like checkpatch cannot do.

Another thing that LLMs excel at is judging if a commit message is written in imperative mood. The patch submission guidelines ask for changes to be described "as if you are giving orders to the codebase to change its behaviour". It is, he said, really hard to write programs that can interpret text to judge this the way that an LLM can.
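To see why a conventional program struggles here, consider a deliberately naive checker. This is a sketch invented for this article, not anything shown in the talk; the word lists and function names are hypothetical, and the heuristic only looks at the first word of a subject line.

```python
import re

# Hypothetical, deliberately tiny word lists; a real checker would need
# far more than this and would still get many cases wrong.
NON_IMPERATIVE_SUFFIXES = ("ed", "ing")
NON_IMPERATIVE_WORDS = {"fixes", "adds", "removes", "changes",
                        "updates", "this", "i", "we"}


def first_word(subject):
    """Return the first word after an optional 'subsystem:' prefix."""
    summary = subject.rsplit(":", 1)[-1]
    match = re.search(r"[A-Za-z']+", summary)
    return match.group(0).lower() if match else ""


def looks_imperative(subject):
    """Guess whether a subject line starts with an imperative verb."""
    word = first_word(subject)
    if not word or word in NON_IMPERATIVE_WORDS:
        return False
    return not word.endswith(NON_IMPERATIVE_SUFFIXES)


for subject in ("net: fix null dereference",
                "net: fixed null dereference",
                "net: speed up lookups"):
    print(subject, "->", looks_imperative(subject))
```

The first two subjects are classified as one would hope, but the third is rejected because "speed" happens to end in "ed". Patching such cases one at a time is the kind of work that never ends, which is presumably why a model that reads the whole sentence is attractive for this task.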

Brandeburg said that there was something else that LLMs could, in theory, do that would be "very, very hard" for him as a reviewer: go back and look at previous revisions of a patch series to see if previous comments had been addressed. It would take him "hours and hours" for each series to look at all of the comments he had made. Sometimes "little stuff sneaks through because the reviewer's tired, or you switch reviewers mid-series". An LLM could be much better at going back to review previous discussions about a patch to take into account for the latest patch series.

LLMs can do something else that "legacy" tools cannot: they can make things up, or "hallucinate" in the industry terminology. Brandeburg said that they saw the LLM make mistakes "occasionally", if a patch was "really tiny" or if the LLM did not have enough context. He mentioned one instance where a #define used a negative number that the LLM flagged as an error. It also did not make sense to him as a reviewer, so he posted to the netdev mailing list about it "and found out that the code was perfectly correct". He said that was great feedback for him and the AI because it helps to refine its rules based on new information.

Humans did provide better coverage of technical and specific issues, which is "exactly what we want them to be doing". People are great at providing context and history, things that are "almost impossible" for an LLM to do. The LLM is only reviewing the content of the patch, which leaves a lot of missing context. Replies from people tended to be "all over the place", though. One of the slides in the presentation (slide 11) compared "AI versus human" comments as a percentage of issues covered. It showed only 9.3% "overlap" between human reviewers and the AI commenting on the same issues.

Questions

A member of the audience asked if that meant that humans were "basically ignoring all the style issues". Brandeburg said, "yeah, that's what we found." Human reviewers "didn't want to talk about the stupid stuff". In fact, he cited instances of people on LKML telling other reviewers to "quit complaining about the stupid stuff". He said that he understood why someone who does a lot of reviews would say that, but that letting "trivial problems" slide meant that the long-term quality of the codebase would suffer.

Another audience member asked if the LLM ever said, "looks good to me" or simply did not have a reply for a patch. They observed that it is often hard for an LLM to say "I don't know" in response to a question. Brandeburg said that it was set up so that it could make comments if it had them, and not make comments if it didn't. He added that he was certainly not ready to have the AI add an "Acked-by" or "Signed-off-by" tag to patches.

Someone else in the audience said that this seemed like great work, but wondered what the plans were for getting human feedback if the AI has an incorrect response to a patch. Brandeburg said that he envisioned posting the rule set to a public Git repository and allowing pull requests to revise and complete the rules.

One attendee asked if Brandeburg and Ayari had compared the LLM tool's output to checkpatch, noting that some people may not comment on issues that checkpatch would pick up anyway. Brandeburg said that he did not imagine it replacing checkpatch. "I think this is an added tool that [...] adds more context and ability to do things that checkpatch can't". He acknowledged that comparing results might help answer the question of whether human reviewers simply ignored things that they knew checkpatch would catch.

As the session was running out of time, Brandeburg took a final question about whether this LLM would reply to spam messages. He said that it probably would if the mail made it through to the mailing list, but he joked that "hopefully, the spam doesn't have code content in it" and wouldn't be committed by a maintainer who wasn't paying close attention.

He closed the session by inviting people to read through the slides, which have answers to frequently asked questions like "will this replace us all as developers?" He added, "I don't think so because we need humans to be smart and do human things, and AI to do AI things".

Brandeburg did not go into great detail about plans to implement changes based on the experiment and its findings. However, the "Potential Future Work" slide in his presentation lists some ideas for what might happen next. This includes ideas like making an LLM-review process into a Python library for reviewers, a GitHub-actions-style system for providing patch and commit message suggestions, as well as fully automated replies and inclusion into the bot tests if the community likes LLM-driven reviews.

Human reviewers are still going to be in high demand for decades to come, but LLM-driven tools might just make the work a little easier and more pleasant before too long.


Application monitoring with OpenSnitch

By Daroc Alden
September 5, 2024

OpenSnitch is an "interactive application firewall". Like other firewalls, it uses a series of rules to decide what network traffic should be permitted. Unlike many other firewalls, though, OpenSnitch does not ask the user to create a list of rules ahead of time. Instead, the list of rules can be built up incrementally as applications make connections — and the user can peruse both the rules that have built up over time, and statistics on the connections that have been attempted.

The OpenSnitch project was started in 2017 by Simone Margaritelli as a native Linux alternative to the Little Snitch firewall application for Apple devices. Usually, firewalls focus on blocking unwanted inbound connections; both Snitches, on the other hand, specialize in blocking unwanted outgoing connections — hopefully foiling unwanted tracking, advertising, and malware connections to command-and-control servers. Over the past seven years, the GPLv3-licensed project has accepted contributions from 80 other contributors, and grown into a capable firewall. Version 1.6.6, released at the beginning of July, contains a small set of bug fixes and improvements.

OpenSnitch produces both .deb and .rpm packages for installation, but the project is only included in the official package lists for Debian, Ubuntu, Arch, and NixOS. The Debian package maintainer, Gustavo Iñiguez Goya, is also the project's most prolific contributor. There are two separate packages, one for the core daemon, and one for the user interface.

Using OpenSnitch

Upon first starting OpenSnitch the user is presented with a series of notifications like this:

OpenSnitch's strength is in its interactivity. Every time an application attempts to open a connection for which there is not an existing firewall rule, the user is shown a dialog that lets them select the appropriate action. If the user doesn't respond within a set amount of time, a configurable default action is taken. In either case, the action is recorded as a new rule, so that further connections from the same application don't trigger new dialogs. Rules can refer to specific executables (identified by path or by hash), command lines, mount points, source and destination IP addresses and ports, users, processes, or protocols. While the popups don't allow one to be too granular, the rule-editor also supports setting regular expressions or other conditions for all of these attributes. The advantage of identifying executables by hash is that malware cannot surreptitiously hijack existing programs to circumvent the firewall rules. The disadvantage is that the rules will require updating after an operating-system update; whether this extra work is worth the security compared to identifying programs by path is, of course, up to the user.
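The trade-off between the two ways of identifying an executable can be shown with a short sketch. It is written in Python purely for illustration; the rule layout, the function names, and the choice of SHA-256 are all assumptions made for this example and do not describe OpenSnitch's actual implementation.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256"):
    """Hash a file's contents in chunks (algorithm chosen for illustration)."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_rule(path, rule):
    """Check a hypothetical rule that pins a path and, optionally, a hash."""
    if path != rule["path"]:
        return False
    expected = rule.get("hash")
    return expected is None or file_digest(path) == expected


# A temporary file stands in for an installed executable.
with tempfile.TemporaryDirectory() as tmp:
    exe = os.path.join(tmp, "browser")
    with open(exe, "wb") as handle:
        handle.write(b"original program")
    by_path = {"path": exe}
    by_hash = {"path": exe, "hash": file_digest(exe)}

    # Replace the contents, as either malware or a system update would.
    with open(exe, "wb") as handle:
        handle.write(b"tampered or updated program")
    path_still_matches = matches_rule(exe, by_path)
    hash_still_matches = matches_rule(exe, by_hash)

print(path_still_matches, hash_still_matches)
```

The path-only rule keeps matching after the file changes, while the hash-pinned rule stops matching. That is the protection against hijacked binaries, and also the reason that such rules need attention after every update.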

The project's "Getting started" documentation suggests setting the default action to "allow" for a while, so that OpenSnitch can build up a profile of what connections are normal. Then the user can inspect the set of created rules and determine how they would like to move forward. While the initial flood of notifications is a tad overwhelming, once the first rules are in place for the applications currently in use, everything becomes much more manageable. The main user interface (written in Qt, using Qt's Python bindings) allows one to inspect which firewall actions have been taken recently, peruse the list of accumulated rules, look at what domains are being connected to, and other related administrative actions:

One nice touch is that the user interface does not necessarily have to run on the same computer as the daemon itself. This lets users run the daemon on a fleet of machines, and have them report back to a single centralized computer — a design that makes OpenSnitch useful for more than just personal use. The user interface is not without flaws, however. It disagreed with my usual choice of window manager (Sway), showing black boxes in the rule-editing dialog, until I set QT_QPA_PLATFORM to xcb as recommended in the release notes. The images above are from GNOME, which seems to have no trouble with the user interface.

The graphical user interface isn't the only option for configuring OpenSnitch, however. Users can also write rules directly in JSON format. Or, more reasonably, export them from one computer to use on another. The JSON format was previously a bit complicated to write by hand, but the 1.6.6 release included some cleanup to make rules easier to write. Here's an example from the documentation of a rule that blocks executables with a non-empty LD_PRELOAD environment variable from making connections:

    {
      "created": "2024-05-31T23:39:28+02:00",
      "updated": "2024-05-31T23:39:28+02:00",
      "name": "000-block-ld-preload",
      "description": "",
      "action": "reject",
      "duration": "always",
      "enabled": true,
      "precedence": true,
      "nolog": false,
      "operator": {
        "operand": "process.env.LD_PRELOAD",
        "data": "^(\\.|/).*",
        "type": "regexp",
        "sensitive": false
      }
    }
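To get a feel for what the "operator" portion of that rule matches, the pattern can be tried out in Python. This is only a sketch: OpenSnitch itself is not written in Python, the env_operator_matches() helper is hypothetical, and treating "sensitive" as a case-sensitivity switch is an assumption made here.

```python
import json
import re

# The "operator" object from the rule, parsed from its JSON form.
RULE_OPERATOR = json.loads(r'''
{
  "operand": "process.env.LD_PRELOAD",
  "data": "^(\\.|/).*",
  "type": "regexp",
  "sensitive": false
}
''')


def env_operator_matches(operator, environ):
    """Hypothetical evaluator for a 'process.env.*' regexp operator."""
    prefix = "process.env."
    if operator["type"] != "regexp" or not operator["operand"].startswith(prefix):
        raise ValueError("this sketch only handles process.env regexp operators")
    variable = operator["operand"][len(prefix):]
    value = environ.get(variable, "")
    # Assumption: "sensitive" selects case-sensitive matching.
    flags = 0 if operator["sensitive"] else re.IGNORECASE
    return re.search(operator["data"], value, flags) is not None


for env in ({"LD_PRELOAD": "/tmp/hook.so"},
            {"LD_PRELOAD": "./hook.so"},
            {"LD_PRELOAD": "libhook.so"},
            {}):
    print(env, "->", env_operator_matches(RULE_OPERATOR, env))
```

In this sketch the pattern matches values that begin with "." or "/", so an unset variable does not trigger the rule; neither does a bare library name with no path component.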

Implementation details

OpenSnitch can monitor processes in three different ways: using BPF, using the kernel's auditing support, or by watching files in /proc. The preferred method is BPF, because it is both more performant and has fewer problems with intercepting connections under some conditions. OpenSnitch ships with three BPF programs: one that intercepts DNS requests, one that tracks processes as they start and terminate, and one that tracks non-DNS connections. They require a kernel newer than 5.5 to function properly, because they depend on the bpf_probe_read_user_str() helper function. Of the three methods OpenSnitch supports, BPF is the only one that can intercept connections made from kernel space — such as might be caused by a kernel compromise.

If BPF cannot be used, OpenSnitch can instead rely on the kernel's audit subsystem. This requires auditd, the user-space component of the system, to be installed and configured properly. Unfortunately, unlike BPF, the audit subsystem doesn't permit intercepting and blocking the connection attempt itself. Instead, OpenSnitch uses iptables to redirect outgoing connections to a queue that it monitors. When a new connection is made, the program uses netlink and information from the audit subsystem to determine which process is responsible. Once it has the information needed to apply its rules, it either drops the connection or sets up a new temporary iptables rule for it.

When the audit system is not available, OpenSnitch gets the information it needs by reading files from /proc — which is available on most systems, making it a reasonable fallback option. The downside is that not all connections can be tracked this way. For example, most connections are visible under /proc/net/, but connections made from a container are instead under /proc/<PID>/net/ for the process identifier (PID) of the container. Because the decision-making parts of OpenSnitch run in user space, the non-BPF methods can cause problems when the computer is under heavy load; connections can be closed or otherwise interfered with before OpenSnitch can find the relevant information.
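For the curious, the connection tables under /proc are plain text and easy to read. The sketch below parses lines in the /proc/net/tcp format, which prints IPv4 addresses and ports in hexadecimal; it assumes a little-endian machine, handles IPv4 only, and uses invented function names. It is not how OpenSnitch itself does the job, and the sample line is made up.

```python
import socket
import struct


def decode_ipv4(hex_endpoint):
    """Decode 'AABBCCDD:PPPP' as printed on a little-endian machine."""
    addr_hex, port_hex = hex_endpoint.split(":")
    packed = struct.pack("<I", int(addr_hex, 16))
    return socket.inet_ntoa(packed), int(port_hex, 16)


def parse_tcp_line(line):
    """Extract endpoints, state, UID, and socket inode from one table row."""
    fields = line.split()
    return {
        "local": decode_ipv4(fields[1]),
        "remote": decode_ipv4(fields[2]),
        "state": fields[3],
        "uid": int(fields[7]),
        "inode": int(fields[9]),
    }


def read_connections(path="/proc/net/tcp"):
    """Read every row of a TCP table, skipping the header line."""
    with open(path) as handle:
        next(handle)
        return [parse_tcp_line(line) for line in handle if line.strip()]


# Made-up row: 10.0.2.15:41938 connected to 192.0.2.10:443.
SAMPLE = ("   1: 0F02000A:A3D2 0A0200C0:01BB 01 00000000:00000000 "
          "00:00000000 00000000  1000        0 45678 1 0000000000000000 "
          "20 4 30 10 -1")
conn = parse_tcp_line(SAMPLE)
print(conn["local"], conn["remote"], conn["inode"])
```

The inode number is the piece that ties a connection back to a process, since each open socket shows up as a "socket:[inode]" link under /proc/<PID>/fd/. Reading a container's table would mean calling read_connections() with a path under /proc/<PID>/net/ instead of the default.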

Despite the performance and reliability improvements compared to searching /proc, the other two methods remain officially experimental. Still, they have been functional for several years at this point, and using BPF by default seems like a reasonable choice.

In all, OpenSnitch provides a unique offering compared to other firewall solutions. While there are other products that focus on blocking outbound connections — such as Pi-hole — OpenSnitch's interactivity and easy configuration make it a good choice for personal use. The project is not perfect, but it is a useful tool for preventing unwanted connections, and definitely something that security-conscious users should consider.


Page editor: Jonathan Corbet


Copyright © 2024, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds