[Project proposal] Improve searchability and filtering of PageTriage feed
Open, Needs Triage, Public

Description

Summary
PageTriage is a MediaWiki extension that allows patrollers on the English Wikipedia to track, categorize, and deal with problematic new pages. One of its features is the VueJS-based New pages feed, which lets patrollers filter for pages they might want to patrol based on certain criteria. However, these filters are limited, and there has been interest in the community in introducing newer filters and, in general, improving the ability to search for specific content on the New pages feed.

As part of this project, we would like to enhance the filtering and searching capabilities of the New pages feed. In particular, we would like to add AI-based topic prediction (leveraging the ORES API), the ability to search for a specific keyword in an article, filtering by how many pageviews an article gets, and searching by how similar a page is to deleted pages. Other ideas are also welcome, but they might need to be reviewed by community members before being implemented.

Reachout: Our dedicated technical channel is #page-triage, located at the NPP Discord. For general Google Summer of Code questions, feel free to reach out at the Wikimedia Zulip.

Skills: VueJS, some familiarity with PHP

Mentors: @Soda, @TheresNoTime

Size: 350 hours

Difficulty: Intermediate

Microtasks:

  • Create a small independent tool/web app that interacts with any Wikimedia API and displays some information about an article. The tool must have a frontend built using VueJS and the Wikimedia Codex UI library. Include a link to the source code in your proposal.
  • Set up the PageTriage extension (using these draft instructions) along with MediaWiki-Docker.
  • Attempt to solve one (or more) small task(s) (there are a few good beginner tasks at good first task, or you could choose one from PageTriage that you are able to understand). Make sure to specifically link to the tasks you have attempted in your proposal.

Core deliverables:

  • T218132 Add ORES topic prediction to the NewPagesFeed and allow filtering by the same
  • T207238 Special:NewPageFeed - add option to filter by pageviews
  • T207761 Keyword Search for New Pages Feed
  • T327955 See and filter with percent similarity to top deleted revision

Event Timeline

Soda renamed this task from Improve searchability and filtering of PageTriage feed to [Project proposal] Improve searchability and filtering of PageTriage feed. Feb 12 2024, 5:49 PM

Hi @Soda

Thank you for sharing your project proposal. Kindly add the following details to your proposal:

  • Brief summary: In 8-10 lines
  • Skills: Programming skills, specific technologies and Phabricator project tags
  • Microtasks: Links to easy and self-contained tasks on Phabricator that students could work on to get familiar with the project. GSoC / Outreachy candidates are required to complete microtasks during the application period to prove their ability to work on a three-month-long project.
  • Size: 90, 175 or 350 hours
  • Difficulty: Kindly add an easy, intermediate or hard difficulty rating. This helps less experienced folks avoid getting overwhelmed, so they can focus on reviewing easy project ideas.

Feel free to reach out with any questions.

Thank you,

Hi @Soda

I have added this project to our GSoC 2024 MediaWiki page: https://www.mediawiki.org/wiki/Google_Summer_of_Code/2024#Ideas_for_projects

Kindly share your project via Wikitech by replying to this thread: https://lists.wikimedia.org/hyperkitty/list/[email protected]/thread/Y7PRNX3SMKLTT6ABLGYADTLT2NQ7MKJE/

If this is your first time mentoring via GSoC, I recommend reviewing this guide for mentors: https://www.mediawiki.org/wiki/Google_Summer_of_Code/Mentors. Additionally, I have added you and your co-mentor to our Zulip chat, where you can connect with fellow mentors for ongoing support and collaboration.

Greetings,
Apologies for asking basic questions here, but after looking through the project list for GSoC '24, I found this task to be the most interesting.
Can someone please tell me how to proceed and get familiar with this project?
As Phabricator is new to me, any help with this would be appreciated.
Thanks

I am Ahmed Sobhy, a computer engineering student at Cairo University.

I like the idea, and the skill set needed fits perfectly with mine.

I wanted to get more details about how the extension currently works and what the feature/idea would look like at the end of the project. Also, I want to ask if there are any resources or research that could help me get ready to work with you.

Thank you!

@AhmedSobhyOfficial Take a look at Special:NewPagesFeed on the English Wikipedia to see the current extension and how it works :)

@FireNdIce3 @AhmedSobhyOfficial As a start, y'all can try and figure out how to get a local instance of MediaWiki running (instructions here) and try to install the PageTriage extension (draft instructions here) :)

Btw, if you have more questions, feel free to ping me on Wikimedia Zulipchat :)

@Soda, I already have MediaWiki set up and will proceed with the PageTriage extension installation. I have previously contributed to WikiEduDashboard and the InlineComments extension under Wikimedia. That being said, will my previous contributions help my proposal get selected?

Feel free to put any and all contributions that you have made in your proposal (the more context you give us about why you are a good fit for the project the better).

Hello @Soda, I'm currently in the process of setting up PageTriage, but I'm encountering some issues with the Docker setup. Could you please advise me on where I can seek assistance for this?
Do we have a dedicated channel for this project?

Here are my notes on how to set up MediaWiki Docker PageTriage on Windows: https://en.wikipedia.org/wiki/User:Novem_Linguae/Essays/Docker_tutorial_for_Windows_(WSL)

Our dedicated channel is #page-triage, located at https://discordapp.com/invite/heF3xPu

You could also seek assistance in this Phab ticket by sharing more details, such as the error message.

Thank you for this. I solved the error by following resources available on Google.

Hi @Soda, I'm working on the first pre-GSoC task. Due to CORS restrictions, I'm unable to receive responses from the URL. Should I consider implementing a backend solution using PHP or Node.js, or should I use a CORS proxy?

I believe you can append &origin=* to the request URL to get around the CORS restriction (see this page for more info). MediaWiki has a very useful API sandbox that shows the different parameters and how they affect the output.
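As a rough illustration of the suggestion above, here is a minimal TypeScript sketch of a request URL that carries origin=*. The English Wikipedia endpoint, the helper names buildApiUrl and fetchPageInfo, and the action=query / prop=info parameters are illustrative assumptions, not something prescribed in this task.

```typescript
// Assumed endpoint for illustration; any wiki's api.php should work similarly.
const ENDPOINT = "https://en.wikipedia.org/w/api.php";

// Hypothetical helper: builds an Action API URL that includes origin=*,
// which requests anonymous (not logged-in) cross-origin access.
function buildApiUrl(params: Record<string, string>): string {
  const query = new URLSearchParams({
    format: "json",
    origin: "*",
    ...params,
  });
  return `${ENDPOINT}?${query.toString()}`;
}

// Hypothetical helper: fetches basic page info for a title from the browser.
async function fetchPageInfo(title: string): Promise<unknown> {
  const url = buildApiUrl({ action: "query", prop: "info", titles: title });
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  return response.json();
}
```

With this approach the request goes out without any login cookies, so it should be enough for read-only lookups like the first microtask, without a backend or a CORS proxy.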

Hi @Soda, have you seen my VueJS project? I sent you an invite as a collaborator and added it to my initial proposal (still being modified).

Thanks, I will look into it!

Hi and thank you for your interest! Please check thoroughly https://www.mediawiki.org/wiki/New_Developers (and all of its communication section!). The page covers how to get started, assigning tasks, task status, how to find a codebase, how to create patches, where to ask general development questions and where to get help with setup problems, and how to ask good questions. Thanks a lot! :)

I would like to work on this task as a GSoC intern.

Hi and thank you for your interest! Please check thoroughly https://www.mediawiki.org/wiki/New_Developers (and all of its communication section!). The page covers how to get started, assigning tasks, task status, how to find a codebase, how to create patches, where to ask general development questions and where to get help with setup problems, and how to ask good questions. Thanks a lot! :)