Topic on Extension talk:CategoryTree

Creating an index extension addition

Jamzze (talkcontribs)

Hi all, following a conversation I had at the Wikipedia Village pump, I wanted to use this talk page to suggest an extension. If this should be posted elsewhere, please do let me know!


Proposed idea: to create an automated index function that displays, A-Z on a single page, all the pages within a given category and within a given number of levels of its sub-categories.


Background: Currently, index articles throughout Wikipedia are built manually by linking various pages on a dedicated page (examples: sociology, physics, environment) and displaying them alphabetically A-Z. All the pages related to a given topic are shown one after another in a paragraph, rather than a list, under "A", "B", "C", etc. headings. The formatting also varies: sociology displays its whole index on one page, whilst physics has a dedicated page for each letter of the index.


Current limitations: manually curating the pages related to a given project is highly time consuming, as it requires dedicated users to maintain these index pages. It requires knowing which pages already exist on Wikipedia in order to know what to include, leaves time gaps between a page being created and being added to the index, and delays the removal of pages that are later judged not to be related to a given project.


Current workarounds:

  • Categorytree: in a number of conversations, I have been pointed to the categorytree extension as a workaround to automate an index. However, it does not serve the purpose of an index, as it shows pages sorted by sub-category rather than a top-level summary of all the pages shown alphabetically. Its current output is also not formatted for use as an index: its drop-down menus and sub-categories are distinctly different from the A-Z paragraph layouts associated with indexes. This makes it unsuitable as an index tool without further functionality.
  • External solutions: there have also been a number of suggestions to use external tools like petscan to create a list and then add the saved query to an index page. However, as index pages are established elements of Wikipedia and other wiki projects, this approach is limited: it is not overseen by wiki support, it is subject to dead links, and the results are not visible within the wiki page itself. This makes it unsuitable in the long term as an indexing tool for wiki pages.


Suggested request/ solution:

To create an extension of categorytree that would allow pages from an assigned category to be displayed on a page A-Z, with the following:

  • Similar to categorytree, the ability to specify a category of interest and to include a given number (x) of sub-category levels when generating an index of relevant pages.
  • Instead of categorytree's drop-down menu, output formatted similarly to other index pages (see Background above). This would entail generating a heading in Wikipedia's heading style for each letter (e.g. heading "A", etc.) and displaying all the pages within the given category that start with that letter under that heading.
  • As some indexes give each letter its own page (to cope with the size of the index; see physics in Background above), it would also be good to be able to limit the generated index to a certain letter, e.g. creating an index of only the pages starting with "Z" from the given category(ies).
  • As categories and their sub-categories fork, a page can appear in them more than once. This functionality would need to remove any duplicated pages from the index.
  • Within each generated paragraph, the page names would follow one after another, rather than in a drop-down list. For example, this would look something like: "Acid mine drainage - Acid rain - Adsorption Method for Sampling of Dioxins and Furans - Aegean Sea (oil spill)".
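
To make the requested behaviour concrete, here is a minimal Python sketch of the steps listed above: collect pages down to a set sub-category depth, drop duplicates, group by first letter, and optionally keep a single letter. It is only an illustration, not how categorytree works. The `SUBCATS` and `PAGES` dictionaries and the names `collect_pages` and `build_index` are made up for the example, and a real implementation would read categories from the wiki rather than from hard-coded data.

```python
from collections import defaultdict

# Hypothetical sample data standing in for the wiki's category tables.
SUBCATS = {
    "Environment": ["Pollution", "Oil spills"],
    "Pollution": ["Oil spills"],
    "Oil spills": [],
}
PAGES = {
    "Environment": ["Acid rain", "Zero waste"],
    "Pollution": ["Acid mine drainage", "Acid rain"],
    "Oil spills": ["Aegean Sea (oil spill)"],
}


def collect_pages(root, depth):
    """Collect pages in `root` and its sub-categories, `depth` levels down.

    Using sets means a page or category reached twice is only counted once.
    """
    seen_cats, pages = set(), set()
    frontier = [root]
    for _ in range(depth + 1):
        next_frontier = []
        for cat in frontier:
            if cat in seen_cats:
                continue
            seen_cats.add(cat)
            pages.update(PAGES.get(cat, []))
            next_frontier.extend(SUBCATS.get(cat, []))
        frontier = next_frontier
    return pages


def build_index(root, depth=1, letter=None):
    """Return {letter: "Page - Page - ..."}; `letter` limits the output to one letter."""
    groups = defaultdict(list)
    for title in sorted(collect_pages(root, depth), key=str.casefold):
        initial = title[0].upper()
        if letter is None or initial == letter.upper():
            groups[initial].append(title)
    return {initial: " - ".join(titles) for initial, titles in groups.items()}
```

For example, `build_index("Environment", depth=1)` lists "Acid rain" once even though it sits in two categories, and `build_index("Environment", depth=1, letter="Z")` returns only the "Z" paragraph.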

Note: I do not have a working knowledge of how categorytree functions or how it can be adapted. The above are suggestions for the functionality I am looking for to make an automated index possible; I am posting here as categorytree seems the most similar existing function. If adapting categorytree cannot achieve the above and entirely new functionality is needed, do let me know.


Benefits of suggestion

  1. Up-to-date and concise indexes: Indexes play an important role within encyclopedic projects as a form of navigation and subject finding. They complement categories as a wayfinding tool: instead of being presented thematically, they are presented alphabetically to give the reader an all-in-one view of the articles included around a given subject. With an automatically generated index, users of a given Wikipedia index can be assured that they are looking at a concise A-Z index for their topic of interest, including all the articles they might be interested in.
  2. Maintenance burden decreased: Automatically generated indexes reduce the time and effort users currently spend maintaining manually curated index pages. That effort can be redirected elsewhere, such as category maintenance.
  3. More reflective of recent changes: Relying on the category tagging of pages ensures that changes to pages, such as moving them to other project areas, deletion, merging, etc., are shown more quickly within the index of a given thematic area.
  4. Helps highlight irrelevant pages: Generating index pages from categories also provides a new tool for highlighting irrelevant or missed pages in the thematic areas that use indexes. A page's inclusion in or exclusion from an index can point editors to mistagged categories, missing category tags, etc., which would help the overall upkeep of a given area of Wikipedia articles.
  5. Trying not to reinvent the wheel: using categories, rather than suggesting a totally new system for generating an index, (hopefully) helps to reduce the effort needed to create this functionality. It is based on something editors are already familiar with and builds on existing ways of organising the wiki project.


Limitations of suggestion

  1. Development time/ help needed: I am not knowledgeable in the ins and outs of wikicode, etc., so someone with more intimate knowledge will be needed to make this request a possibility, requiring additional time from other user(s). However, if developed, I believe this would be a worthwhile feature, usable across Wikipedia and other wiki projects. Collaboration on it would produce a useful feature that wiki projects could adapt, making the time spent developing it worthwhile.
  2. Massive category interconnections: as the index would be generated from existing categories, it could grow almost endlessly because categories are so interconnected on Wikipedia. However, I have suggested above including a limit similar to the one categorytree offers, setting the number of sub-category levels that the index is generated from. This would help to limit the size of the index and keep it useful for users.
  3. Current state of categorisation networks: the index would be generated from current categories and their current networked relations. For a number of projects, this might produce an automated index with problems, from missing articles to unwanted articles, compared with their manually curated index. However, this can be overcome with a cleanup of article categories (e.g. removing categories from pages that are not relevant, reworking category relations so that they show up at the right level to be included in the index, etc.). I also feel this would only be a real issue for large and broad wiki projects, like Sociology, whereas more tightly defined areas, like the environment, would face less of a problem. The automated index would also be beneficial in providing a visual outline of what, for example, "Sociology" currently includes on Wikipedia in terms of articles, and would act as an indicator of pages that need attention (e.g. to be recategorised or removed from sociology).
  4. Testing: the user creating an automated index would need to test it to ensure that it reflects the articles that should be included within the given topic.


I hope this offers an overview of what I think would be a valuable function for a lot of Wikipedia projects. If further elaboration is needed on any part, do let me know.

Tacsipacsi (talkcontribs)

Another limitation of your suggestion is that changes are invisible on watchlists. People may put the index pages on their watchlists in order to be notified about new articles (or even removals, which may be due to vandalism), but if the list is generated by the extension, these changes don’t cause edits and thus don’t cause watchlist notifications. Using a bot instead would make the lists less reflective of recent changes (as it would run only at given intervals, probably once or a few times a day), but it would keep watchlists usable.

Jamzze (talkcontribs)

I think a bot would be helpful for this - having the downside of it being less reflective of changes would be outweighed by the benefits this brings for watchability.
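
As a rough illustration of the bot approach discussed above, the Python sketch below regenerates the index wikitext and saves only when it differs from the current page text, so that each real change produces one edit that watchlists can pick up. The function names and the `save` callback are assumptions made up for the example. How a bot would actually fetch and save pages is left out.

```python
def render_index(groups):
    """Render {letter: [titles]} as wikitext: one heading per letter, links joined by " - "."""
    lines = []
    for letter in sorted(groups):
        lines.append(f"== {letter} ==")
        lines.append(" - ".join(f"[[{title}]]" for title in groups[letter]))
    return "\n".join(lines)


def update_if_changed(current_text, groups, save):
    """Call `save(new_text)` only if the regenerated index differs from the page.

    `save` is a hypothetical callback standing in for whatever the bot uses to edit.
    Returns True if an edit was made, False otherwise.
    """
    new_text = render_index(groups)
    if new_text.strip() == current_text.strip():
        return False  # nothing changed, so no edit and no watchlist noise
    save(new_text)
    return True
```

Run at intervals, this would leave the page history untouched on runs where the category contents have not changed.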
