A URL-pattern proposal

We need to settle on a pattern for URLs within Codidact.

I suggest the following:

(Everything is prepended with /community/[x] for technical reasons; I’m not concerned with that prefix here.)

For user-supplied items (outside a category):
/[object(pl)]/[id]/[url-friendly-name]
or
/[object(pl)]/[id]/[url-friendly-name]/[action]  ([url-friendly-name] may be "-" to be considered blank)

For user-supplied items (inside a category):
/category/[cat-name]/[object(pl)]/[id]/[url-friendly-name]
or
/category/[cat-name]/[object(pl)]/[id]/[url-friendly-name]/[action]  ([url-friendly-name] may be "-" to be considered blank)

For administrative items (not user-supplied):

/[object(sg)]/[url-friendly-name]

Examples of valid URLs according to these rules:

  • /category/main/questions/42/why-is-codidact-so-awesome
  • /category/meta/questions/101/-/vote
  • /users/1/luap42
  • /users/1/luap42/suspend
  • /admin/suspended-users
  • /auth/logout

Examples of invalid URLs (what’s wrong is given in parentheses):

  • /category/main/question/42/why-is-codidact-so-awesome (singular)
  • /category/meta/questions/101/vote (no “-”)
  • /users/luap42 (no ID)
  • /users/suspend/1/luap42 (wrong order)
  • /admins/suspended-users (plural)
  • /logout (no group)
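To make the rules above easier to check, here is a minimal sketch of a validator. The sets of plural objects, singular groups and actions are assumptions drawn only from the examples in this post, and treating an action name in the name slot as "missing “-”" is just one possible reading of the rule, not a settled design.

```python
import re

# Assumed vocabularies, taken only from the examples in this thread;
# a real deployment would derive them from its own route table.
PLURAL_OBJECTS = {"questions", "users"}
SINGULAR_GROUPS = {"admin", "auth"}
ACTIONS = {"vote", "suspend"}

# A url-friendly-name is "-" (blank) or lowercase words joined by hyphens.
NAME = r"(?:-|[a-z0-9]+(?:-[a-z0-9]+)*)"

USER_ITEM = re.compile(
    r"^(?:/category/(?P<category>[a-z0-9-]+))?"
    r"/(?P<object>[a-z]+)/(?P<id>[0-9]+)"
    rf"/(?P<name>{NAME})"
    r"(?:/(?P<action>[a-z]+))?$"
)
ADMIN_ITEM = re.compile(rf"^/(?P<group>[a-z]+)/(?P<name>{NAME})$")


def is_valid(path: str) -> bool:
    """Check a path (without the /community/[x] prefix) against the proposal."""
    m = USER_ITEM.match(path)
    if m:
        action = m.group("action")
        return (
            m.group("object") in PLURAL_OBJECTS
            # An action name in the name slot means the "-" placeholder is missing.
            and m.group("name") not in ACTIONS
            and (action is None or action in ACTIONS)
        )
    m = ADMIN_ITEM.match(path)
    return bool(m) and m.group("group") in SINGULAR_GROUPS
```

With these assumed sets, all six "valid" examples pass and all six "invalid" examples are rejected.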

What do you think of this? Could the language/style be simplified? Please help me with phrasing.

On action-type links, you don’t need to include the URL-friendly name, even in “blank” form. These will overwhelmingly be POST routes, and even for those that aren’t, SEO is unnecessary.

So, these:

  • /category/meta/questions/101/-/vote
  • /users/1/luap42/suspend

Should (in contrast to your “invalid” section) be these:

  • /category/meta/questions/101/vote
  • /users/1/suspend

I mostly agree with that, too. However, I am not sure whether it might lead to routing problems, for example when a user is named “suspend” (or a meta question is titled “Vote!”).

Therefore it might be hard to distinguish between the url-friendly-name and the action.

As category names will be a very limited, instance-wide group (i.e., I would not expect them to vary by community other than “each community will use a subset of available categories”), there should not be any need to have /category included. So:

  • /main/questions/42/why-is-codidact-so-awesome
  • /meta/questions/101/-/vote

As of now we are looking at a group of categories along the lines of (not all will be MVP, not all will necessarily ever happen, and there may be others):

  • main - Maybe this should be qa instead (but it is the “main” one and the only “universal” except meta)
  • meta
  • blog
  • canonical
  • wiki
  • chat
  • discussion
  • forum
    etc.

You solve that by putting the action routes first (or whatever method .NET Core uses for giving them higher priority), so that the trailing segment is treated as the friendly title only if no action matches.

Also, in many cases, the action link will be of the type /category/main/posts/101/vote rather than /category/main/questions/101/vote, which also goes some way to alleviating that issue.
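A rough sketch of that priority rule, written in Python rather than .NET Core purely for illustration: the segment after the ID is checked against a set of known actions first, and only falls back to being a friendly title if nothing matches. The `ACTIONS` set and the `resolve` helper are hypothetical names, not part of any existing codebase.

```python
# Assumed action names, taken from the examples in this thread.
ACTIONS = {"vote", "suspend"}


def resolve(path):
    """Split a path into its parts, giving actions priority over titles.

    Returns None when the path does not look like /[object]/[id][/tail],
    optionally preceded by /category/[cat-name].
    """
    parts = path.strip("/").split("/")
    category = None
    if parts[0] == "category" and len(parts) >= 2:
        category, parts = parts[1], parts[2:]
    if len(parts) < 2 or not (parts[1].isascii() and parts[1].isdigit()):
        return None
    obj, item_id, tail = parts[0], int(parts[1]), parts[2:]
    if len(tail) > 1:
        return None
    action = slug = None
    if tail:
        if tail[0] in ACTIONS:
            action = tail[0]  # action routes win
        else:
            slug = tail[0]  # anything else is only a friendly title
    return {"category": category, "object": obj, "id": item_id,
            "action": action, "slug": slug}
```

Under this scheme /users/7/suspend is always the suspend action, even if user 7 happens to be named “suspend”; that user's profile stays reachable as /users/7 because the ID alone identifies the page.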

I thought we were letting communities define those categories. For example, I could imagine a community deciding that its set of canonical posts should be called “documentation”; that doesn’t mean we need to add another category to the whole instance (confusing people configuring new communities).

Category names are going to be emergent; patterns will become apparent over time, but we can’t know the full set of ones that will be used now.

Sure, that’ll work for some of them. But probably not all (“users” comes to mind).

That’s probably true.

If the category names are really that variable then we may have to do /category/[categoryname]/stuff-to-actually-do. I was hoping to avoid that extra level, though arguably it doesn’t matter all that much since people can (and will) either start out at a deep link via Google/etc. or start at a Community home page and drill down from there.

My thought was that different categories would have different inherent characteristics (e.g., whether they have Answers or not). If we wanted to allow “documentation” vs. “canonical” etc. then we have a few options (using this specific example, but could apply to any category type):

“category” identifier in URL

  • Define “canonical” as the canonical (yes, that was intentional) example of that category type.
  • Each community can have 0 or more “canonical” categories, as long as they have different names.
  • Each community can change the name of their canonical category, as long as it does not conflict with any other category.
  • All URLs into categories (including main/qa) have /category/ or some other instance-specific or community-specific (but always the same for categories within a community) identifier (e.g., a Spanish site might use categoria)

Advantage: Maximum flexibility in # of categories and names of categories

Shorter URL with Rename

  • Define “canonical” as the canonical example of that category type.
  • Each community can have 0 or more “canonical” categories, as long as they have different names.
  • Each community can change the name of their canonical category, as long as it does not conflict with any other category or with any other top-level URL identifier (e.g., users, admin, auth)

Advantage: Shorter URL. Flexible # of categories but not (absolutely speaking) names of categories.

Shorter URL without Rename

  • Define “canonical” as the canonical example of that category type.
  • Each community can have 0 or 1 “canonical” categories.
  • Each community can change the display name of their canonical category, but can’t change the URL identifier - e.g., /canonical/questions/blah… even if the page actually displays it as “Documentation”

Advantage: Shorter URL. Simpler URL parsing.
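The practical difference between the first two options shows up when a community renames a category. A minimal sketch of the check, assuming the reserved identifiers are just the three mentioned above (the function name and the `prefixed` flag are made up for illustration):

```python
# Assumed reserved top-level identifiers, from the examples in this thread.
RESERVED_TOP_LEVEL = {"users", "admin", "auth"}


def can_use_category_name(name, existing, prefixed):
    """Decide whether a community may use `name` for a category.

    prefixed=True  -> URLs look like /category/<name>/...
    prefixed=False -> URLs look like /<name>/... and share the top level
                      with users, admin, auth and so on.
    """
    if name in existing:
        return False  # must not clash with another category
    if not prefixed and name in RESERVED_TOP_LEVEL:
        return False  # would shadow a top-level route
    return True
```

With the /category/ prefix a community could even call a category “users”; without it, that name has to be refused.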

I don’t understand what you mean by this; can you clarify? Are you talking about baking in a category name? I don’t think most sites are going to have a “canonical” category at all, and those that do are more likely to call it “wiki” or maybe “doc” because users know what those words mean. If we want to bake in anything, it’d be meta, not “canonical”. But I think we should leave it all to configuration and not bake in anything.

There will be some standard configurations of name + post type(s) allowed + tag set. That doesn’t mean that every site would use them exactly the same way.

Build a flexible platform. Give communities good packages of configuration so they don’t have to tune a hundred knobs if they don’t want to. Let communities build what’s best for them.

My browser’s address bar easily fits e.g. https://long-community-name.codidact.com/category/long-category-name/questions/122211/a-long-question-title-aaaa-bbbbbbbb-cccc-asfkjajflkjaslkdfjlajflaklsdfja. On smaller monitors that might get cut off, but I don’t think that’s much of an issue.

Having /category/ in the URL is more explicit and allows people to understand the URL pattern and use it to navigate manually.

Everything after /questions/122211 in your example is optional anyway – human-readable info but not required to identify the page to be served.

Well, based on your response and others, it sounds like /category/[category-name] is the way to go (what I gave as the first option, “category” identifier in URL). It is the most flexible; it just puts one more field in the URL than strictly necessary.

Presumably the one additional variant would be “default to main, default to questions”, so that if someone goes to:

writing.codidact.com

it would automatically rewrite/map/redirect/whatever (the specifics could be done a few different ways) to an effective location of

writing.codidact.com/category/qa/

(if qa is the “main” category for writing). That would be the home page, though it would likely be the same as (or a variant of) the list of most recent questions at:

writing.codidact.com/category/qa/questions/
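One way to picture the "default to main, default to questions" rule is as a small mapping from the requested path to the listing that gets served. This is only a sketch; `listing_for` and its parameters are hypothetical, and whether this is a rewrite or a redirect is left open.

```python
def listing_for(path, main_category, default_object="questions"):
    """Map a listing path to (category, object), filling in defaults.

    Returns None for paths that are not category listings.
    """
    parts = [p for p in path.split("/") if p]
    if not parts:
        # Community root: default to the main category and to questions.
        return (main_category, default_object)
    if parts[0] == "category" and len(parts) == 2:
        # /category/<name>/: default to questions within that category.
        return (parts[1], default_object)
    if parts[0] == "category" and len(parts) == 3:
        return (parts[1], parts[2])
    return None
```

For the writing community with qa as its main category, "/", "/category/qa/" and "/category/qa/questions/" all map to the same listing.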

Why even have the category in the URL? It seems unnecessary to me.

The category needs to be in the URL for a bunch of reasons. Using my last example:

writing.codidact.com/category/qa/questions/

will give a very different page from:

writing.codidact.com/category/blog/questions/ (Blogs)
or
writing.codidact.com/category/wiki/questions/ (Wiki)

That being said, a deep link to a question could be shortened from:

writing.codidact.com/category/qa/questions/12345/what-should-I-do-about-my-awful-protagonist

to

writing.codidact.com/category/qa/questions/12345/ (remove the human readable title - will still work fine)

and potentially to:

writing.codidact.com/posts/12345/

(The word “category” and the category name “qa” are not needed to identify a post since they are all in one database).

and even:

codidact.com/posts/12345/

since all communities within an instance will share a database, there is no duplication of ID.

That being said, I would probably argue against allowing URLs with no subdomain or community identifier (except for specific instance-level pages), as that could lead to some confusing links. But allowing:

writing.codidact.com/posts/12345/

as a shortlink is almost as good as a (ugh) bit.ly while still remaining within our system and easy to unambiguously parse/redirect/display.
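Since the ID alone identifies the post, expanding such a shortlink is a single lookup. A minimal sketch, with an in-memory dict standing in for the shared posts table; the record fields and the `expand_shortlink` name are assumptions for illustration only.

```python
import re

# Stand-in for the shared posts table; the fields are assumed.
POSTS = {
    12345: {
        "community": "writing",
        "category": "qa",
        "object": "questions",
        "slug": "what-should-I-do-about-my-awful-protagonist",
    },
}

SHORTLINK = re.compile(r"^/posts/([0-9]+)/?$")


def expand_shortlink(path, posts=POSTS, base_domain="codidact.com"):
    """Turn /posts/<id>/ into the full URL, or None if it cannot be resolved."""
    m = SHORTLINK.match(path)
    if not m:
        return None
    post = posts.get(int(m.group(1)))
    if post is None:
        return None
    return (
        f"{post['community']}.{base_domain}"
        f"/category/{post['category']}/{post['object']}"
        f"/{m.group(1)}/{post['slug']}"
    )
```

Here /posts/12345/ expands to writing.codidact.com/category/qa/questions/12345/what-should-I-do-about-my-awful-protagonist without the category or title ever appearing in the shortlink.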

Please make that /category/something/posts/12345 – posts not questions. Not all top-level things will be questions (for example, blog), but all questions are posts. Let’s use the general name.

That’s fine. I deliberately used /posts/ in the shortest URL for that reason. It has an advantage not just for blog vs. canonical vs. wiki etc. but also for the MVP questions vs. answers - they are all posts, all in one table, etc.

For a listing of the questions sure, but for individual pages, what’s wrong with this as the canonical URL for an individual Q&A page?

writing.codidact.com/questions/12345/what-should-I-do-about-my-awful-protagonist

@cellio suggested (and I agree) using posts instead of questions as it is more generic - fits whether it is a Q or A or Wiki or Blog or whatever.

My proposal, to give a rough idea: <base_url> / <category or action> / <human readable identifier> / <if needed: id>

We should also keep “category or action” identifiers short. Now, examples:

https://writing.codidact.com/q/what-should-I-do-about-my-awful-protagonist

And, if there is a question with the same name:

https://writing.codidact.com/q/what-should-I-do-about-my-awful-protagonist/2

Answers would be fragments:

https://writing.codidact.com/q/what-should-I-do-about-my-awful-protagonist#2 (second answer by date)

User profiles (user names need to be unique, anyways):

https://writing.codidact.com/u/some_user

If the account is deleted later on, the next some_user would also get an id: https://writing.codidact.com/u/some_user/2

Additionally, we should provide our own URL shortener. Shortened URLs could look like this: https://codidact.com/s/<hex number>

This makes it impossible to reliably update URLs to match title edits. IDs should always be in the URL, that way the engine can just issue redirects as needed.
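To illustrate the redirect idea: with the ID in the URL, the title part is purely decorative, so a stale or missing title can be corrected with a permanent redirect. A sketch under the assumption that URLs look like /posts/<id>/<title>; the `route` function and the example titles are hypothetical.

```python
import re

POST_PATH = re.compile(r"/posts/([0-9]+)(?:/([^/]*))?/?")


def route(path, current_slugs):
    """Return (status, location) for a post URL.

    current_slugs maps post ID -> slug of the current title. Only the ID
    is used for lookup; the slug in the request is merely compared.
    """
    m = POST_PATH.fullmatch(path)
    if not m or int(m.group(1)) not in current_slugs:
        return (404, None)
    post_id = int(m.group(1))
    canonical = f"/posts/{post_id}/{current_slugs[post_id]}"
    if m.group(2) == current_slugs[post_id]:
        return (200, canonical)
    # Title was edited (or omitted): send the client to the current URL.
    return (301, canonical)
```

After a title edit, old links such as /posts/12345/old-title keep working and simply redirect to the new title, which a title-only URL cannot guarantee.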
