Currently, the namespace and article title are merged into a single tag, but it would make life easier to have a separate <namespace> tag.
Version: unspecified
Severity: enhancement
drdee
Feb 28 2011, 12:16 AM
F7546: add_namespace_to_page.diff
Nov 21 2014, 11:27 PM
Status | Subtype | Assigned | Task
---|---|---|---
Invalid | | ArielGlenn | T29772 XML data post-processing (tracking)
Resolved | | ArielGlenn | T29775 namespace should have it's own XML tag
I will briefly expand on this. Right now, to determine whether an article belongs to the main namespace, you have to rule out every other namespace: you iterate over all the local namespace names and check that the article title does not start with any of them. Only if none of them match can you conclude that the article belongs to the main namespace. This is a lot of extra work, and a separate <namespace>0</namespace> tag would be ideal.
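As a rough illustration of the title matching described above, here is a minimal Python sketch. The NAMESPACES table and the namespace_of helper are hypothetical names made up for this example; a real parser would build the table from the namespace list in the dump header and would also have to handle aliases and case rules, which this sketch ignores.

```python
# Hypothetical namespace table (local name -> id), for illustration only.
NAMESPACES = {"Talk": 1, "User": 2, "User talk": 3, "Category": 14}


def namespace_of(title: str) -> int:
    """Infer the namespace id from a full page title (0 means main)."""
    prefix, sep, _rest = title.partition(":")
    if sep:
        # Compare the text before the first colon with every known name.
        for name, ns_id in NAMESPACES.items():
            if prefix == name:
                return ns_id
    # Nothing matched, so the page is assumed to be in the main namespace.
    return 0
```

Note that a main-namespace title containing a colon, such as "2001: A Space Odyssey", still has to go through the full comparison before it can be classified, which is the extra work a dedicated tag would avoid.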
I agree. The text matching that is currently necessary shouldn't be needed. Besides the suggested <namespace> tag, I would also suggest the more concise <ns> tag or, even better, simply adding an attribute (either "ns" or "namespace") to the <title> tag.
Just a note: please make sure the XML dump version number is bumped whenever a dump feature is added, so that dump parsers that need to work with all dump versions can enable support for features based on the version number. Checking the version can make the code faster than probing for each feature. The version number wasn't changed when the <redirect> tag was added.
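To show what version-based feature detection could look like on the parser side, here is a small Python sketch. It assumes the dump's root element carries a "version" attribute; FIRST_VERSION_WITH_NS is a hypothetical threshold chosen only for illustration, not the version in which any tag was actually introduced.

```python
import xml.etree.ElementTree as ET

# Hypothetical: first schema version assumed to carry a namespace tag.
FIRST_VERSION_WITH_NS = (0, 5)


def parse_version(text: str) -> tuple:
    """Turn a dotted version string such as '0.4' into (0, 4)."""
    return tuple(int(part) for part in text.split("."))


def supports_ns_tag(dump_xml: str) -> bool:
    """Decide from the root element's version whether to expect the tag."""
    root = ET.fromstring(dump_xml)
    return parse_version(root.get("version", "0")) >= FIRST_VERSION_WITH_NS
```

A parser could call supports_ns_tag once per dump and fall back to title matching only for older versions. For a full-size dump, a streaming reader would be a better fit than ET.fromstring, which loads the whole document into memory.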
sumanah wrote:
Added the "patch" and "need-review" keywords; Mark hopes to get someone to review the patch soon.
This patch looks ok to me.
Bear in mind that it's possible for the namespaces to change in the middle of a run, for example if a custom namespace is added to accommodate content that the community wishes to move out of the main namespace. That won't happen often, but dump users will probably get bitten by it once in a while.