Categorizing questions with tags

Problem: we need a way to identify questions about specific topics. As a visitor, I want to refine my search using well-identified categories (for example: programming language, geographical location, plant, tractate, …). As a potential answerer, I want to browse (or receive notifications about, etc.) questions about the topics that I’m an expert on.

Solution sketch: tags work well on Stack Exchange.

MVP features:

  • Each question has 1–5 tags.
  • Tag names are ASCII letters and a few punctuation signs.
  • Anybody with a certain privilege can create a tag.

Justification for MVP: a dozen posts in, we already find the categories on this forum too restrictive, and aren’t happy that only admins can create categories. The early days of a site are when a lot of tags get created, so we need a low-friction way to do that.

We may want to refine how they work, to have fancy displays, to allow non-ASCII characters, to add tag wikis, to add tag-specific guidance, to have renaming and mass-retagging tools, to define synonyms, maybe <gasp> have some form of tag hierarchy. But none of that is MVP.

4 Likes

Tags are a must. Fancy editing, not so much. Suggest for MVP:

  • Simple admin page for “moderator/admin” (however we define that) initially. Version 2 can let users with X reputation edit tags, etc.
  • The initial tag list for each topic site can be created somewhat arbitrarily from the top tags of a comparable SE site.
  • Tags should allow, but not require, a free-form text explanation.

Suggestion for MVP: each tag has an optional text description. The purpose of the description is to explain to people (primarily people asking questions) when to use that tag.

Ultimately we’ll want the longer description (wiki, on SE) too. I don’t think it’s required for the MVP, but if it’s easy we should try to do them together.

3 Likes

It’s been a week since this thread had any activity. Are we agreed on MVP for tags? Specifically:

  • Each question has 1–5 tags.
  • Tag names are ASCII letters and a few punctuation signs.
  • Each tag has an optional text description. (Wiki is not required for MVP but nice to have.)
  • Anybody with a certain privilege can create a tag.
  • Anybody with a certain privilege can edit tag descriptions.
  • People can browse the list of tags.
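A minimal sketch of how the first two rules could be checked when a question is submitted. The thread does not say which punctuation signs are allowed, so the set used here ("-", ".", "+", "#") is purely an assumption, as are the names `TAG_NAME_RE` and `validate_tags`.

```python
import re

# Assumption: the thread never says which punctuation is allowed; "-", ".", "+"
# and "#" are placeholders chosen for illustration only.
TAG_NAME_RE = re.compile(r"[A-Za-z][A-Za-z.+#-]*")
MIN_TAGS = 1
MAX_TAGS = 5


def validate_tags(tags):
    """Return a list of problems with a question's tags (empty list = valid)."""
    problems = []
    if not MIN_TAGS <= len(tags) <= MAX_TAGS:
        problems.append(f"expected {MIN_TAGS}-{MAX_TAGS} tags, got {len(tags)}")
    for tag in tags:
        # fullmatch rejects anything outside ASCII letters and the chosen signs
        if not TAG_NAME_RE.fullmatch(tag):
            problems.append(f"invalid tag name: {tag!r}")
    if len(set(tags)) != len(tags):
        problems.append("duplicate tags")
    return problems
```

Returning a list of problems rather than raising on the first one lets the ask-a-question form report everything that needs fixing in one go.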

(Data import from SE should include tags, but let’s do that under data import as it’s not really a requirement of tags themselves.)

3 Likes

  • The tag page should show how many questions use the tag.
  • When you select a tag, you should be able to see (on mouseover or similar) its description and the number of questions that use it. That encourages people to use the more popular of two (or more) similar tags, allowing one-off tags to be merged or removed later.
  • There needs to be some (moderator or high-reputation) ability to merge and delete tags. See https://diy.meta.stackexchange.com/questions/1476/we-should-finish-this for an example of the problems that occasionally need to be fixed.
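The usage count and the merge operation could look roughly like the sketch below. It assumes a hypothetical layout with a `Tags` table and a `PostTags` link table (question id, tag id), uses SQLite only so that it runs on its own, and leaves out the privilege check entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Tags (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
    CREATE TABLE PostTags (
        PostId INTEGER NOT NULL,
        TagId INTEGER NOT NULL REFERENCES Tags (Id),
        PRIMARY KEY (PostId, TagId)
    );
    INSERT INTO Tags (Id, Name) VALUES (1, 'plumbing'), (2, 'plumber');
    INSERT INTO PostTags (PostId, TagId) VALUES (10, 1), (11, 1), (11, 2), (12, 2);
""")


def tag_counts(conn):
    """Map each tag name to the number of questions that use it."""
    rows = conn.execute("""
        SELECT t.Name, COUNT(pt.PostId)
        FROM Tags t LEFT JOIN PostTags pt ON pt.TagId = t.Id
        GROUP BY t.Id
    """)
    return dict(rows)


def merge_tags(conn, source_id, target_id):
    """Retag every question from source to target, then delete source."""
    with conn:  # one transaction: commit on success, roll back on error
        # OR IGNORE skips questions that already carry the target tag.
        conn.execute(
            "UPDATE OR IGNORE PostTags SET TagId = ? WHERE TagId = ?",
            (target_id, source_id),
        )
        # Rows skipped above still point at the source tag; drop them.
        conn.execute("DELETE FROM PostTags WHERE TagId = ?", (source_id,))
        conn.execute("DELETE FROM Tags WHERE Id = ?", (source_id,))
```

A question tagged with both names ends up with the surviving tag once, not twice, which is the case a naive `UPDATE` would trip over.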
3 Likes

We appear to have reached consensus here.

  • Each question has 1–5 tags.
  • Tag names are alphanumerics and a few punctuation signs.
  • Each tag has an optional text description. (Wiki is not required for MVP but nice to have.)
  • Anybody with a certain privilege can create a tag.
  • Anybody with a certain privilege can edit tag descriptions.
  • People can browse the list of tags.

I have altered “ASCII letters” to “alphanumerics”; there is little reason to exclude Unicode and, given that it will be allowed by default, it would take more technical work to disallow it. Please reply if you believe this is mistaken.

This topic will close 36 hours after the last reply. If I have missed or misconstrued anything, please reply here.

(Added to MVP list.)

5 Likes

Added to requirements; please update if this changes.

I propose to restrict the MVP to ASCII because normalization is hard. Which of these characters are equivalent? aAΑА𐌀𝖠𝐴𝛢ᎪÁÁĂӐᾸ The right answer may well depend on the community (especially on what language it’s in).
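To make the difficulty concrete, here is a small sketch using Python's standard `unicodedata` module. The `fold` helper (NFKC normalization followed by case folding) is only one candidate canonical form chosen for illustration, not something proposed in this thread, and the sample characters are look-alikes of the kind listed above.

```python
import unicodedata

# A few look-alikes of "A", written as escapes so the code points are explicit.
candidates = {
    "LATIN CAPITAL A": "A",
    "GREEK CAPITAL ALPHA": "\u0391",
    "CYRILLIC CAPITAL A": "\u0410",
    "MATHEMATICAL SANS-SERIF CAPITAL A": "\U0001D5A0",
    "A WITH ACUTE (precomposed)": "\u00C1",
    "A + COMBINING ACUTE": "A\u0301",
}


def fold(name):
    """NFKC-normalize, then case-fold: one possible canonical form."""
    return unicodedata.normalize("NFKC", name).casefold()


for label, ch in candidates.items():
    print(f"{label:35} -> {fold(ch)!r}")
```

With this particular choice, the mathematical A folds to the same string as the Latin A, and the two ways of writing A with an acute accent agree with each other, but Greek alpha and Cyrillic A stay distinct from Latin A even though they look identical. Whether that is the right outcome is exactly the community-dependent question.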

1 Like

Sometimes 5 tags is just one fewer than should really be applied. The limit seems targeted at SO, not other sites. Is there really such a great problem with 6 tags, or even 7?

1 Like

IMHO, the right way to store tag information is with a linked table. There is some push (and I actually agree for some items) to cram as much as possible into the main record to minimize joins. However, it can actually be much easier and much more flexible to use linked tables for things like tags. If we do that (and I really think we should) then the “max number of tags per question” is simply one configuration item in the database and nothing more. A community or an entire instance of Codidact could then adjust as needed.
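A rough sketch of what "one configuration item in the database" might mean in practice. The `CommunitySettings` table, the `max_tags_per_question` key and both helper functions are hypothetical names invented for this example; SQLite is used only to keep it runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical per-community settings table; not part of any agreed schema.
    CREATE TABLE CommunitySettings (
        CommunityId INTEGER NOT NULL,
        Name TEXT NOT NULL,
        Value TEXT NOT NULL,
        PRIMARY KEY (CommunityId, Name)
    );
    INSERT INTO CommunitySettings VALUES (1, 'max_tags_per_question', '7');
""")

DEFAULT_MAX_TAGS = 5  # the MVP limit, used when a community has no override


def max_tags(conn, community_id):
    """Look up the community's tag limit, falling back to the default."""
    row = conn.execute(
        "SELECT Value FROM CommunitySettings WHERE CommunityId = ? AND Name = ?",
        (community_id, "max_tags_per_question"),
    ).fetchone()
    return int(row[0]) if row else DEFAULT_MAX_TAGS


def can_add_tag(conn, community_id, current_tag_count):
    """True if a question with this many tags may receive one more."""
    return current_tag_count < max_tags(conn, community_id)
```

Because the tags live in a link table, raising the limit for community 1 from 5 to 7 is a single row change here, with no schema migration.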

3 Likes

Agreed. The ideal database structure for this would look like something along these lines - AFAIK, this is how SE does it too:

CREATE TABLE codidact.Posts (
    -- primary key referenced by PostTags below
    Id BIGINT NOT NULL PRIMARY KEY AUTO_INCREMENT,
    -- ... lots of other fields
    -- tags get cached on the post as a string for easy display without a join
    Tags VARCHAR(1000) NOT NULL
);

CREATE TABLE codidact.Tags (
    Id BIGINT NOT NULL PRIMARY KEY AUTO_INCREMENT,
    Name VARCHAR(255) NOT NULL,
    Description TEXT
);

CREATE TABLE codidact.PostTags (
    PostId BIGINT NOT NULL,
    TagId BIGINT NOT NULL,

    PRIMARY KEY (PostId, TagId),
    CONSTRAINT FKPostTagsPosts FOREIGN KEY (PostId) REFERENCES codidact.Posts (Id),
    CONSTRAINT FKPostTagsTags FOREIGN KEY (TagId) REFERENCES codidact.Tags (Id)
);

This way, we have tags cached on the post as a string - probably just space-separated for ease of use - which means we can display them on the post page without a join. We also have them stored as an N-N association through the PostTags table, which means we can still index and query them easily.
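One consequence of storing tags twice is that the two copies have to be updated together. The sketch below shows one way to do that, under several assumptions: it uses SQLite rather than the MySQL-style DDL above so that it runs standalone, `set_tags` and `posts_with_tag` are made-up helper names, tag names passed in are assumed to be unique, and unknown tags are created on the fly (a real implementation would check the tag-creation privilege first).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Posts (Id INTEGER PRIMARY KEY, Title TEXT NOT NULL,
                        Tags TEXT NOT NULL DEFAULT '');
    CREATE TABLE Tags (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE);
    CREATE TABLE PostTags (
        PostId INTEGER NOT NULL REFERENCES Posts (Id),
        TagId INTEGER NOT NULL REFERENCES Tags (Id),
        PRIMARY KEY (PostId, TagId)
    );
""")


def set_tags(conn, post_id, names):
    """Replace a post's tags, updating the link table and the cached string."""
    with conn:  # single transaction so the two copies cannot drift apart
        conn.execute("DELETE FROM PostTags WHERE PostId = ?", (post_id,))
        for name in names:
            # Create the tag if it does not exist yet (simplification).
            conn.execute("INSERT OR IGNORE INTO Tags (Name) VALUES (?)", (name,))
            conn.execute(
                "INSERT INTO PostTags (PostId, TagId) "
                "SELECT ?, Id FROM Tags WHERE Name = ?",
                (post_id, name),
            )
        # Space-separated cache, readable without any join.
        conn.execute(
            "UPDATE Posts SET Tags = ? WHERE Id = ?", (" ".join(names), post_id)
        )


def posts_with_tag(conn, name):
    """Find post ids by tag through the link table."""
    rows = conn.execute(
        "SELECT pt.PostId FROM PostTags pt JOIN Tags t ON t.Id = pt.TagId "
        "WHERE t.Name = ? ORDER BY pt.PostId",
        (name,),
    )
    return [r[0] for r in rows]
```

Displaying a post reads only `Posts.Tags`; searching by tag goes through `PostTags`. Any code path that changes tags without going through something like `set_tags` risks leaving the cache stale, which is the main cost of this design.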

2 Likes

In cases where you see multiple identical titles, tags would help distinguish between them. Basically wherever you can see a question’s title, try to always show tags too. That should remove the need to put it in the title.

Also, search on SE is too basic and non-interactive for the kind of auto-suggestion logic that can be built in 2019, so let’s imagine a better future and implement it together in code.

3 Likes

I never agreed with the “don’t put tags in titles” rule. Titles without tags show up in important places, including that drop-down list of questions that might be dupes of the one you’re trying to ask. The title should contain all the essential information, somehow – whether that’s words typed by the user or importing tags or some hybrid. (A 5-tag question might make for a bloated title.)

4 Likes

@BornToSuffer I don’t see any shortcomings of tags on SE in your post.

No, that’s wrong. The right way to write this question is almost always “How do I do X in a?”. There is no rule or guideline on Stack Exchange that specifically excludes information conveyed by tags from titles. A minority of users push for such a rule, but their only rationale is the catchphrase “don’t put tags in titles”, which does not mean that.

Stack Exchange does this automatically by prepending the most popular tag to the title in some contexts.

Stack Exchange does this with tag wiki excerpts.

1 Like

This topic was automatically closed 36 hours after the last reply. New replies are no longer allowed.