What are we trying to build?

Late to the party and I apologize as I have not read everything (in this thread much less on the site). I also don’t have a lot of personal time, so my participation will be limited. Just wanted to throw my thoughts into what I would look for in an SE alternative and you can take or leave it as you like.

One of the great advantages that SE has is in the number of different communities that it hosts. This is also one of the greatest disadvantages of their family of sites. Users (old, but especially new) often are confused as to where they should post a particular question and this results in a large number of post closures due to off topic posts, especially on sites that may have topics that overlap to some degree. This adds to frustration and wasted time by posters and members of the communities. Or users who might participate on one topic may not do so as they spend their limited time on only a few communities.

How would I re-imagine such a site? A much more ambitious effort than the present SE, one that isn’t evolved from old technology. Make use of today’s much easier access to processing power, cheap storage, and advancements in machine learning (which should be key to any such new endeavor in my opinion). Take advantage of advances in technology that were simply not available when SO was first being put together.

Build a single site for any topic and start users at a “blank slate” where they can search existing questions/answers or ask questions. They may only see some sort of “hot topics” list of questions which allows them to start “curating” their view of the site.

As users make use of the site, the site learns about what interests them; what they visit, what they like/dislike (i.e. vote), what they ask about or answer, etc. As the site learns about a user, it can start showing them other questions in which they may have an interest. Give them the ability to split things that are found for them into different self-configurable “views” of some sort (work related, hobby, etc).

You now have a site that never has the need for a question being closed as “off topic.” Closures now only need to exist because they are duplicates or “bad” in some way.

Let the site “tag” questions (and answers?) automatically based on post content, but allow the community to adjust the tagging with reinforcement learning (disliking tags, liking potential tags, etc). This should allow the site to better adjust which users see which questions automatically.

Let the tags be divisible into more specific tags in some sort of tree structure. If the question is tagged with “Networking” let me add the “Wireless” sub-tag to it rather than having “Wireless” as a totally separate and unrelated tag. This would also allow for content to be tagged more specifically. For example separating server network interface configuration (which often carries a “Networking” tag) from questions about networks (switches/routers/etc) from questions about social networking.
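
To make the sub-tag idea a bit more concrete, here is a minimal sketch that treats each tag as a path from a root. The category names and the `is_under` and `matching` helpers are made up for illustration; nothing here is a proposed final taxonomy.

```python
# A tag is stored as a path from the root; all names below are illustrative only.
Tag = tuple[str, ...]

NETWORKING: Tag = ("Technology", "Networking")
WIRELESS: Tag = NETWORKING + ("Wireless",)
SERVER_NIC: Tag = ("Technology", "Servers", "Networking")
SOCIAL: Tag = ("Interpersonal", "Social", "Networking")


def is_under(tag: Tag, ancestor: Tag) -> bool:
    """True if tag equals ancestor or sits somewhere beneath it."""
    return tag[: len(ancestor)] == ancestor


def matching(questions: dict[str, Tag], followed: Tag) -> list[str]:
    """Titles of the questions whose tag falls under the followed tag."""
    return [title for title, tag in questions.items() if is_under(tag, followed)]
```

Someone following `NETWORKING` would see questions tagged `WIRELESS`, but not the server interface or social networking ones, even though all three paths end in a "Networking"-flavored name.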

Let all users “vote” on everything, but when users vote on something (questions, answers, tags, etc), those votes aren’t equal. Weight them as appropriate to what they are voting on and their “activity” in that “area” (based on their votes, what they have posted, others’ votes on their posts, etc). For example, if SE ran this way, my votes (likes/dislikes) on “wireless networking” question/answers/tags in particular may carry the most weight, “networking” a fair amount, “security” a small amount and “programming” topics very little. It might only take a few users active in “Wireless” to vote to add/remove the “Wireless” tag on a question, but it would take MANY more users who have little/no activity with wireless questions to do the same.
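
A rough sketch of what that weighting could look like, assuming the site already has some per-topic activity score between 0 and 1 for each user. The `Voter` class, the `floor` value and the `threshold` are all placeholders I picked for the example, not anything that exists.

```python
from dataclasses import dataclass, field


@dataclass
class Voter:
    name: str
    # Hypothetical per-topic activity scores in [0, 1]; how they are learned is left open.
    activity: dict[str, float] = field(default_factory=dict)


def vote_weight(voter: Voter, topic: str, floor: float = 0.05) -> float:
    """Every vote counts a little (the floor); activity in the topic adds to it."""
    return floor + (1.0 - floor) * voter.activity.get(topic, 0.0)


def tag_score(votes: list[tuple[Voter, int]], topic: str) -> float:
    """Sum of +1/-1 votes, each scaled by the voter's weight in the topic."""
    return sum(direction * vote_weight(voter, topic) for voter, direction in votes)


def tag_applies(votes: list[tuple[Voter, int]], topic: str, threshold: float = 2.0) -> bool:
    """The tag sticks once the weighted score reaches the (arbitrary) threshold."""
    return tag_score(votes, topic) >= threshold
```

With these numbers, three users with 0.9 activity in “Wireless” clear the threshold on their own, while it takes dozens of users with no wireless activity to do the same.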

Weighted votes can also be used to help automatically combat issues like “sock puppets.” If a high percentage of a user’s votes are toward questions/answers of a single user (or a small group of users), decrease the vote weight. Accounts that are only created to make a couple of votes won’t carry the weight to seriously impact any sort of ranking.
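
As a sketch of how the sock puppet damping might work: compute a multiplier from how concentrated a user’s votes are on one author, and from how many votes the account has cast at all. The function name and the `min_votes` cutoff are assumptions for the example, and a real system would want something less crude.

```python
from collections import Counter


def concentration_factor(vote_targets: list[str], min_votes: int = 10) -> float:
    """Return a multiplier in [0, 1] to apply to a voter's weight.

    vote_targets lists the author of each post the voter has voted on.
    """
    if not vote_targets:
        return 0.0
    # Accounts with only a handful of votes ramp up gradually.
    history = min(1.0, len(vote_targets) / min_votes)
    # Share of this voter's votes that went to their single most-voted-for author.
    top_share = Counter(vote_targets).most_common(1)[0][1] / len(vote_targets)
    return history * (1.0 - top_share)
```

An account that only ever votes for one author ends up with a factor of 0, and a brand-new account with two votes stays near the bottom no matter whom it votes for.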

Don’t show vote counts. These aren’t really productive and can influence the voting of other users. Just rank/sort the answers “silently” by their weighted vote.

Let the community determine the correct answer to a question. This prevents the “unanswered” questions because the original poster never selects an answer or where they accept what is actually an incorrect answer. Mark the answer that is accepted by the user who asks the question in some way and give that “vote” more weight, but don’t make it the deciding factor for the correct answer. Let vote weight decline over time in some fashion, allowing for correct answers to change over time when necessary (how many times is an answer posted 8 years ago no longer the “correct” answer today?).
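
One possible way to combine those pieces, sketched under a few assumptions: each vote already has a topic weight, vote value halves every `half_life_days`, and the asker’s acceptance counts as an ordinary vote multiplied by `asker_bonus`. The two-year half-life and the bonus of 3 are arbitrary numbers chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    weight: float      # voter's weight for this topic (assumed computed elsewhere)
    direction: int     # +1 or -1
    age_days: float    # days since the vote was cast
    from_asker: bool = False


def decayed_score(votes: list[Vote], half_life_days: float = 730.0,
                  asker_bonus: float = 3.0) -> float:
    """Weighted vote total where each vote loses half its value per half-life."""
    total = 0.0
    for v in votes:
        decay = 0.5 ** (v.age_days / half_life_days)
        bonus = asker_bonus if v.from_asker else 1.0
        total += v.direction * v.weight * bonus * decay
    return total


def best_answer(answers: dict[str, list[Vote]]) -> str:
    """The 'correct' answer is simply the one with the highest decayed score."""
    return max(answers, key=lambda name: decayed_score(answers[name]))
```

Under these settings, an 8-year-old answer with ten upvotes (one of them the asker’s acceptance) scores lower than a fresh answer with two recent upvotes, so the newer answer would rise to the top.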

Minimize the need for moderation as much as possible. Let moderators focus on issues/disputes that need user intervention (rude/abusive posts, spam, etc). Don’t make them housekeepers for maintaining the content of the site (closing topics, adjusting tags, etc). As users are more active on the site, give them more options to help maintain content (again based on what the site has learned about the user). If users take advantage of those options, give them more opportunity.

Introduce some sort of gamification. It isn’t perfect, but clearly it works and helps foster community activity or so many successful sites/services wouldn’t make use of it. Just figure out a way that doesn’t create a large separation between “elite” users and other users (especially new users). Something more cosmetic than functional would likely work for most aspects of the site.

Anyhow, those are some of my rambling thoughts on how I would build a more ideal SE alternative, if I had the time/ability/resources to do so. I know details of implementation would be difficult for such a site, but if done well it is something I would personally much prefer to spend my time on than the SE network of sites.

3 Likes

@dustypaws I’m also lost at .NET, but I’d be lost with ruby also.

IMO the best option is to have a clear API definition so that people can create their own backends in different languages if they want. This being an open source project, and wanting to leave the door open to decentralization in the future (not MVP launch), making sure the design is not married to any one platform or language is important.

So from that perspective it doesn’t matter if it’s C# or Ruby or anything else, just as long as the API is clearly defined. Inputs and outputs are what matter; the rest is semantics.
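
To illustrate what “inputs and outputs are what matter” could mean in practice, here is a tiny sketch using Python purely as notation. The `QABackend` interface, its two methods and the `Question` fields are invented for the example; a real definition would more likely live in a language-neutral spec.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Question:
    id: int
    title: str
    body: str


class QABackend(Protocol):
    """Hypothetical contract: any backend that offers these calls is acceptable."""

    def ask(self, title: str, body: str) -> Question: ...
    def get(self, question_id: int) -> Question | None: ...


class InMemoryBackend:
    """One possible implementation; a C# or Ruby service could stand in for it."""

    def __init__(self) -> None:
        self._questions: dict[int, Question] = {}

    def ask(self, title: str, body: str) -> Question:
        question = Question(len(self._questions) + 1, title, body)
        self._questions[question.id] = question
        return question

    def get(self, question_id: int) -> Question | None:
        return self._questions.get(question_id)


def round_trip(backend: QABackend) -> bool:
    """Client code depends only on the contract, not on the implementation."""
    asked = backend.ask("How do I configure OSPF?", "Details go here.")
    return backend.get(asked.id) == asked
```

`round_trip` never looks at which class it was handed, which is the property you would want from any backend written against the shared definition.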

2 Likes

To me, this is what I want from the new platform in terms of technical culture. More canonical posts. More rewards for merging posts together. Actually, I’d change the language from “click here to ask a question” to “click here to submit a topic”, or something of that nature.

Make it harder to submit and publish a question, because it’s curating a high-quality list of Q&A posts, and your “plz fix my code” question isn’t going to be seen by many eyeballs in the future. I’d make the apprentice system that was in rough beta on SE mandatory - you’re not going to get your post publicly visible until your teacher says “it’s good”.

I always thought Stack Overflow optimized for readers of the future, not the actual question asker. (Granted, with recent changes like more reputation for questions, that’s no longer the case).

Now, this is a very extreme position, and not every question is canonical.

7 Likes

That’s the one thing I find discouraging, especially for new users:

  • you’re a hobbyist
  • you have a coding-related problem
  • you stumble upon SO (mostly via Google, etc…)
  • you ask a question which doesn’t live up to the very comprehensive standards of the SO staff
    …and then at least one of x things will happen:
  • You get a duplicate question notice
  • Your question gets locked
  • You receive a bazillion downvotes for what, in your mind, was a perfectly valid question

Will a new user that has been treated like that ever be an active contributor to the site? I don’t think so.

Just a straight +1 from me.

3 Likes

Vote counts and breakdowns are very useful on answers. On SE (until I found the right userscript) it bothered me that I couldn’t see this to know whether an answer I was considering following was controversial or community consensus. A score of 10 means very different things if that’s +10/-0 vs. +25/-15. The latter should give me pause.

But maybe that’s not so important for questions.

Votes on questions serve two audiences: the OP, who’s trying to get an answer, and everybody else – the people providing answers today, the people who’ll find that page in a year via Google, and the people looking for the best content from the site. For the OP, votes convey important information about how the question is being received. I’ve been thinking that if we don’t show votes we need some way to communicate, at least broadly, that the community has issues with this question – but maybe that’s not correct? If people leave helpful comments then the comments tell you what you need to fix; if people downvote but nobody comments, what can the OP do with that feedback?

That’s the OP case, but there’s also everybody else. Voting on questions is useful – for granting privileges, for weighting search results, for feeling good about a well-asked question (for the OP), and probably other things too. We do want to show it sometimes.

I don’t know how to reconcile those concerns. I’ve always preferred more information to less information, but users have varying feelings about this.

5 Likes

Exactly! And I’m just taking the POV of the new user.

I remember that when I was new to PHP, I had a question, so I asked. And as far as I can remember, I received no comments or answers, just some downvotes and a lockdown. That just felt unjustified from my perspective, because at that point in time, I couldn’t have phrased my question any differently. Remember, I came there to get help.

3 Likes

Are they though? Take the +25/-15 example. If the +25 is from seasoned professionals and the -15 are from “hobby” level users, what does that say?

Flip that on its head. If the +25 votes are by “hobby” users and the -15 are from seasoned professionals, what is the takeaway? And the positive relative “score” of the answer makes it easier for those who aren’t sure to give it additional up votes (well if others like it, it must be good, right?).

Or it could simply be that “user X” pissed off 30 people in chat and half of those went out and voted down a few of X’s answers. So what do the votes really tell anyone about the answer?

Unless a post gets some statistically significant number of votes, the count doesn’t actually provide valuable information. Sorting by weighted votes would be a better indicator of good answers. I agree that sorting isn’t enough by itself (after all, if every answer is bad, the top will only be the least bad answer). Perhaps, as additional feedback for users, generate a more generic ranking metric: something like giving up to three thumbs up/down to indicate how well received an answer is by the community (I like a more simplistic visual solution personally), or rating community approval on a 100 point scale.
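
Here is one way the thumbs idea could be tied to statistical significance. This is my own choice of technique, not something anyone proposed above: it uses the Wilson score interval on the share of upvotes, and shows no thumbs at all until that interval clears 50% in one direction. The cut points between one, two and three thumbs are arbitrary.

```python
import math


def wilson_interval(up: int, down: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for the upvote share (z=1.96 is roughly 95%)."""
    n = up + down
    if n == 0:
        return (0.0, 1.0)
    p = up / n
    centre = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    denom = 1 + z * z / n
    return ((centre - margin) / denom, (centre + margin) / denom)


def thumbs(up: int, down: int) -> int:
    """Return -3..+3; 0 means the votes so far are not conclusive."""
    low, high = wilson_interval(up, down)
    if low > 0.5:
        return min(3, 1 + int((low - 0.5) * 6))
    if high < 0.5:
        return -min(3, 1 + int((0.5 - high) * 6))
    return 0
```

With this sketch, +10/-0 earns two thumbs up, while +25/-15 and +3/-0 both show nothing yet: the first because the community is split, the second because three votes are too few to say much.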

As a positive feedback mechanism for the author, you could show only the author of the post the count but this can also be indicated by a more generic ranking system as well (if my post has three thumbs down, I know it needs to be improved).

1 Like

What’s the alternative to votes? Without votes, how would you sort answers? I don’t think anybody here wants to build Yahoo Answers – that nugget of info you crave is down there, 37 posts in, if you choose to wade through all the other stuff.

We don’t know who those 40 voters are, no. The best we can do is to assume that the distribution of votes is somewhat consistent – that experts and non-experts are voting across the site, not favoring one case over another, and it all washes out. Coupled with this, of course, we need some means of dealing with voting fraud including targeted voting.

Voting isn’t perfect, but it’s widespread, it generally works on SE where our core users are coming from, and it beats the alternatives. (Of which I can think of two: no votes, which I already addressed, and screening of voters for credentials, which is a non-starter IMO.)

5 Likes

I don’t recall anyone posting that there shouldn’t be votes of some sort. If you go back and read my first two posts here, I certainly never say such a thing. Whether that is an up/down system or a like/dislike system or something else entirely, voting is clearly an important aspect to a Q&A site. I also think everyone should be able to vote (and that the votes should be weighted).

I suggested (and I believe others as well) that there isn’t a value in displaying the actual vote counts. They aren’t necessary if the mechanisms sorting posts (and providing any sort of rating metric) are working correctly.

1 Like

@YLearn has a point, imo. There’s no need to display the vote count: as long as it’s stored somewhere in the backend, the system can do the sorting and whatever else it needs to do. Users could still receive some kind of rep points, and those could still be public.

That’s a super fascinating idea to me – I’ve noticed there are different “classes” of questions on SO, though. A lot of people join the site because “I’m having trouble with this thing right now and I’m not finding any answers.” It’s true many of these questions are answered, but the problem is that the person asking needs help understanding what to ask. If the barrier for asking questions is too high, people won’t want to participate. But we also want to avoid millions of duplicates.

What do you think of having two or three “classes” of questions? The more canonical ones - almost wiki-like - and the personal / problem-solving ones?

6 Likes

I’d still prefer the SO way of displaying questions, I guess I like the timeline-y feel \o/. What I could imagine is grouping of similar questions. So that if you click on a question to read it, it somewhere displays a selection of similar / related, preferably already answered, questions. Whether the grouping is done by tags or some other kind of backend-magic could be tried / decided at a later date… :)

4 Likes

Sorry for misunderstanding you. I thought you were saying the votes didn’t have value because we don’t know the voters’ qualifications. I see now that you were only talking about display.

2 Likes

So you want Quora? Ok, Quora has many downsides, but one of its downsides is precisely that it is a single undifferentiated site. There’s a huge amount of cruft. The obvious reason is a lack of curation, but even if Quora wanted to push for more curation, it would be extremely hard because there’s no community to do the curation. Stack Exchange works a lot better because every site has an audience. Even sites that have overlapping topics (e.g. Super User, Server Fault, Unix & Linux, Ask Different, Ask Ubuntu, etc.) do their curation differently. They have a different idea of what makes a good question and what makes a good answer. This is simply impossible on Quora.

And then they have a culture clash with the next user who has an overlapping, but different view. And the next, and so on. With no community, you aren’t going to get much curation.

Maybe, but it’s hard. And if you go down that path, why limit yourself to a tree? [windows-linux-interoperability] is a child of both Windows and Linux.
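
If tags are allowed more than one parent, the structure becomes a directed acyclic graph rather than a tree. A minimal sketch of that, where the `PARENTS` table and the `operating-systems` tag are invented for the example:

```python
# Each tag maps to its parents; a tag may have more than one. Names are illustrative.
PARENTS: dict[str, set[str]] = {
    "windows-linux-interoperability": {"windows", "linux"},
    "windows": {"operating-systems"},
    "linux": {"operating-systems"},
    "operating-systems": set(),
}


def ancestors(tag: str, parents: dict[str, set[str]] = PARENTS) -> set[str]:
    """Collect every tag reachable by following parent links upward."""
    seen: set[str] = set()
    stack = [tag]
    while stack:
        for parent in parents.get(stack.pop(), set()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen
```

A question tagged `windows-linux-interoperability` would then surface for people following either `windows` or `linux`, which a strict tree cannot express.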

Here’s another reason why a single site for everyone doesn’t work: social networking has nothing to do with computer networking. If a tag name means fundamentally different things to different people, it isn’t a good tag name. If your audience is everyone then pretty much nothing is a good tag name.

Showing vote counts to active participants is not good, indeed. But a vast majority of the users of the site are anonymous readers. For them, vote counts are useful: if a question has two answers, it makes a big difference whether one is +2 and the other is +1, or one is +100 and the other is +1. Which isn’t to say that the actual vote count should be shown — maybe the +1 answer came years after the +100 answer and is actually better but is languishing because no one has seen it yet. But some form of rating that isn’t just ordering is useful.

Right. The concept of “accepted answer” as done on Stack Exchange made some sense, but it doesn’t withstand the passage of time. If the asker has gone away, there’s no way to change the accepted answer anymore. This wasn’t a big problem when SO was a month old and most askers were still around, but it is a major problem now that SO is a decade old and most askers aren’t around anymore. Old questions do receive better answers often enough for this to be an issue.

Beware that gamification is a source of perverse incentives, even if it’s cosmetic. After all SE reputation is purely cosmetic beyond 20k, yet a sizable number of people will post any crap they can get away with to get a bigger number.

5 Likes

If you get a correct duplicate notice, that’s good news: you’re getting an answer. If you get an incorrect one, that’s a problem, but it’s unrelated to voting. In Rating and moderating questions and askers we discussed rating “usefulness” and “skillfulness” separately. I think that usefulness should be shown (it’s a tool for readers to find the most useful questions) and should be unbounded, but skillfulness should not be shown at all and should be sublinear and bounded (so if you receive 10 negative feedback “points” on the same question, it only has a little more impact than if you receive a single one).
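
For the “sublinear and bounded” part, one function with that shape is a saturating exponential. This is only a sketch: the `cap` and `scale` values are arbitrary, and any concave function with a ceiling would fit the description equally well.

```python
import math


def skillfulness_penalty(negative_points: int, cap: float = 1.0, scale: float = 0.5) -> float:
    """Impact of negative feedback on one question: grows fast, then flattens at cap."""
    return cap * (1.0 - math.exp(-negative_points / scale))
```

With these settings a single negative point already gives about 0.86 of the maximum impact and ten points give just under 1.0, so piling on adds very little.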

1 Like

On many SE sites, what defines a canonical question is being linked in a tag wiki. Only a very small fraction of questions rate this, and the question itself doesn’t need to be shown as canonical.

Wikipedia has ratings on some articles. There are ratings for both content quality and topic importance. Maybe we can draw inspiration from this?

7 Likes

It’s a chicken and egg thing really. If you discourage new users who might ask questions and receive useful answers, fewer questions will be asked/answered, which in turn reduces the site’s value to users just seeking an answer which they’re unable to find because nobody cared to ask.

No apology necessary from my viewpoint. Type is easy to misinterpret and I am not always the best when communicating in type so any misunderstanding is just as likely my fault.

Votes certainly have value and I personally think the more a user is “positively active” in a topic, the more their votes on the topic (and related topics to some degree) should weigh in the calculation. I just don’t think the votes have as much value to the end users individually and that often the user’s (mis)perception of what the votes actually mean can lead to false assumptions.

To give the votes value, you would need some statistically significant number of votes. Since there is no guarantee that all posts will get a significant number of votes, it is better to provide other feedback to end users than vote counts.

In the sense that there is one single point of entry, yes. Keep in mind though that Quora, like SE, is based on older technology and ideas. They have continued to try to evolve their platform, but they are still tied to the legacy structure/data they initially started with 10+ years ago.

If you are trying to create something new, base it on what is cutting edge now. Don’t try to make a “better version” of another platform. Or do you simply want a “better clone” of a platform? I personally would like to see something new that stands apart from existing platforms.

Steep the site in machine learning to customize the user experience. Combined with weighted voting, this can be a very potent way to create a common platform that would allow communities to naturally form within it, as users with similar interests will automatically begin to cross paths, find each other, and have more say in how the content they are interested in is curated.

Who cares if users have overlapping, different views? They are curating their own view(s), not the view for everyone. Their choices and actions on the site help to determine what content is shown to them. The more they use the site, the more the site becomes tailored for the types of content with which they want to interact. And the more they interact with the content they see, the more impact they have when curating the content of the site. This is something easily within the realm of machine learning today.

A tree structure is something I am familiar with, so it made an easy example for me, but if someone has a better organizational paradigm I am all for it. The point I was trying to make is to make tags/categories/whatever into related pieces of information rather than something entirely independent. Which ties into…

Your first point is exactly what I was saying. But “Technology->Computers->Networking” (an example and not a suggested final product), “Technology->Computers->Servers->Networking” and “Interpersonal->Social->Networking” are now three completely different “categories” even if they are all “Networking.”

And yes, it is complicated which is why I suggested the site determine the “categories” and allow users to vote on removing or adding potential “tags” (which can be used as reinforcement learning…notice how I keep coming back to these machine learning concepts?).

Agreed, thus my statement of “It isn’t perfect.” But what it comes down to is that gamification works in promoting activity. Whether you like it or not, it gets the job done.

The post you linked to has less to do with gamification and more with people answering quickly to get their answers voted on more often to stay at the top. These issues could be addressed to some degree by some of the other suggestions in my posts, namely weighted voting and votes decreasing in value over time.

2 Likes

I think what’s missing from this “one site, customized views” idea is communities. The smaller communities, especially, work on SE because they’re made up of people with a common purpose, who get to know each other and who keep bumping into each other. Newcomers are always welcome of course but there’s a common, recognizable core. (This doesn’t work at SO scale, though it might at SO-tag scale.)

You might argue that ad-hoc communities would form, but I don’t think it’d be the same. It wouldn’t be predictable in the way that going to site X and expecting to run into A and B and get back to that question from C is.

Also, the “one big bucket” model can exacerbate tensions from differences. Someone once proposed (on Meta.SE) combining all the religion sites into one – “you guys basically all have the same interests and you can sort it with tags, right?”. Um, no – the core axioms are too different, and that pervades everything. Consider a question about the book of Genesis, asked by a Jew – and it gets answers about how “that means Jesus” and “that’s a foreshadowing of Muhammad” and so on. I sure don’t want to participate on a site like that, where every time I turn around I’m being evangelized on a religion site by people whose religions call for converting or subverting me. I did participate on a site that turned into that, and it was bad.

I’m using religion as an example because it’s very personal to many of the people who are interested enough to use such a site at all. But this comes up in other areas too, probably even technical ones.

When I want to see (or ask) questions about Judaism I want to go to Mi Yodeya, where we all acknowledge some foundational principles. When I want to see (or ask) questions about Mac OS, I want to go to Ask Different, where I won’t be told to just use Windows. Et cetera. Sure, I could dig those questions out of a machine-learning-aided vast pool, losing some signal and picking up some noise along the way, but I’m coming to this site not just for questions but for my community. And my community shouldn’t be at the whim of an ML algorithm.

A community might choose to have a broad scope, like SO does or (at a much smaller scale) like Writing does, but it should be decided by the people who are there using the site.

9 Likes

I think with machine learning, you will still have groups of users that come together with a common purpose, get to know each other and keep bumping into each other. It is just through a different process. But once you have the “core” site working, if you want to tailor entry sites with “specific starting views” to help facilitate the process, that should be doable.

The same? Likely not. But just because it isn’t the same doesn’t make it bad; change can be good. After all, part of the reason people are here is that they don’t want “the same” experience as site Y or Z.

It really comes down to implementation and we are still discussing generalities here, but if users can configure their view(s) so they have one that gives the same content as “site X” would, there is no reason they wouldn’t run into A and B and get back to that question from C just as they would if it were a statically created small site.

Certainly a concern and would take some thought to implement well for a larger community. While I do understand your expressed concern, let me give it a spin that currently isn’t addressed by the multiple small community model. Say I am a relative outsider to religion and I have a question about the book of Genesis. Maybe I don’t realize that Jewish, Christian and Muslim traditions all might have a different view on the topic but it might benefit me in my own personal search for understanding or open up new world views for me? Or maybe I specifically want the views of more than one of these traditions? Does it make sense that I have to join multiple communities and ask the same question?

Then again, where do we stop with small communities? We can further subdivide Christianity into Catholic, Orthodox, and Protestant and maybe I only want the view of one of them. Where do I ask without getting views from the others?

We can divide those up into even more specific groups/denominations/cultures. Or maybe by theological differences (Covenant vs. Dispensationalism, etc)?

Understood, and I am using Christianity as I am more familiar with it when using examples than other traditions where I would have to stumble through the Hasidic vs. Sephardic divide (purposely mixing bad divisions to make my point, I apologize in advance if I offend anyone as that was not my intent).

And when I want to set up my Linux box as a gateway router participating in OSPF and have questions, do I ask on Network Engineering, Server Fault, Superuser, Unix & Linux, Ask Ubuntu or possibly somewhere else? There are probably people at all those sites that can at least answer part of my question, but cross posting is generally frowned upon, so where should I go? Do I need to wait and post bounties at each if I don’t get a good response before moving to the next?

And if you ask about “Mac OS”, weighted voting means that the users who carry more weight on “Mac OS” questions would likely dislike/downvote “just use Windows” responses, making them inconsequential.

I can also dig out the product I want to buy at Amazon, even if I use a generic search term (and the machine learning they use on their site isn’t all cutting edge). Yes, there may be a lot of “noise” initially, but I can with a few clicks sort out much of it and narrow my search significantly. Yes, the selection of products at Amazon is huge, but I don’t have to go to six different sites for slightly different products. I used to love NewEgg, but I hardly visit there anymore because the experience is just so much easier for me at Amazon to get anything I need.

Communities are made up of people and people don’t have singular interests. I often have limited time, so I have one SE site I frequent most days (that still has a lot of noise I don’t need or care about), several more I frequent probably once a week, a dozen more I frequent once a month, and probably at least 20 more that I have some interest in but haven’t had time for or have only visited a few times. How do these fractured communities serve me as a user and how can I participate in them when I don’t have the time to visit a dozen different sites on a regular basis?

Why tie people to a select few communities? What if there are people similar to me that I would like to form a community with but miss crossing paths because their priorities are a little different than mine and they spend more time on a different site? Should what I perceive as my community be at the whim of what selection of small sites someone else believes should be available and I have the time to visit?

And ultimately, if I am a new user, how am I supposed to know where to post my question? Maybe it is clear cut, maybe it isn’t. We have dealt with frustrated users on our site where they have asked their question on three previous sites, had it closed and told to go ask elsewhere. How is that serving the user?

That is certainly one viewpoint. I am merely trying to provide another. And there is no perfect answer. Discourse is good and perhaps someone reading these posts will come up with a completely different solution that neither of us has thought of before.

I think it should be up to the user to decide what content they want to interact with (or not interact with) when visiting the site, and as such be able to find, interact and build communities with people they might otherwise miss in a fractured site. I think today’s level of machine learning can make this possible in ways that weren’t possible even two or three years ago and I think it can be far better for the user than what SE or other Q&A sites provide today.