I have come here from Stats.StackExchange and, like others here, am dissatisfied with how it is organized.
Over there I wondered whether a more decentralized website/format could be possible.
What I imagine is organising the Q&A site as a set of protocols that allow posting and retrieving Q&A and the edits made to them (and potentially also governing comments and votes, although this could be done separately).
I imagine that the content could be hosted on a peer-to-peer network or on a network of servers. The content would then be handled in a format that is very flexible in how it can be spread out to the world.
This type of organisation would disentangle the host of the content/backend from the host of the frontend, and reduce any dependency on the owner of the frontend website (which is currently StackExchange/StackOverflow). It might also make hosting the site cheaper (by that I mean the physical part: the hardware and electricity).
Is this viable in this Codidact project, or is it either a step too far or a silly idea?
There has already been some discussion in the Discord system (gradually being moved to this forum) about decentralized systems. Bottom line: my opinion, shared by some others though not a 100% consensus, is that it just doesn't work well in the real world. There are a bunch of technical/logistical reasons why it is just not likely to work well.
What this project does allow (no matter how the details of each instance are eventually done) is for another person or organization to copy the entire project and then do whatever they want (within license privileges which are still to be determined, but which will be on the relatively open side of things). So if Alice & Bob decide they want to have their own Codidact server with their own data, therefore separate (decentralized) from the main project, they can do that without a problem. And if they want to fork the code to include distributed decentralized storage across multiple servers, they will be free to do that too - but I don't think it will make sense for the main project.
I'd still like us to keep the door open to decentralization. Even if, for the foreseeable future, we're working toward a single-hosted Codidact system with multiple communities, please build it in a way that a community (or several) hosted elsewhere could still join the network without moving data.
This means APIs (for joining, sharing content, network-wide stuff of whatever sort) and also some sort of governance policy: what criteria apply if a site wants to join, or do we say anybody can? There might be a community that needs to be separate for legal or national-censorship reasons, on the one hand, and we'd want to support that. And on the other hand, we might be unwilling to put the Codidact stamp of approval on a site dedicated to child porn. In between lies much murkiness.
The key question, I think, is then "the network" vs. "the same software". Just as many different chat/forum/bulletin board/etc. systems for many years (going back to dial-up modem days) have allowed you to clone the software and make your own separate system, I think the same would apply here. Allowing others to "associate" as part of a network really complicates things in terms of:
Governance
Legality of particular content, including country-specific issues
"Be nice" (if a site doesn't want to "be nice", we can't stop them from using the software, but we should not be part of a network with them - and defining "nice" gets a bit tough...)
API - needs to do a lot more to handle the back & forth of various bits of data to keep everything synchronized, particularly since we can't control/maintain other people's servers
and probably other things I haven't thought of yet
So I am really inclined to the following model: a Codidact instance can have multiple topic sites but is governed by one group of people and resides on one server (or a group of servers, but in the technical sense of load-sharing etc., not as in "really different"), and anyone else can make their own Codidact server for any reason they want. We could, relatively easily, provide a single sign-on/authentication system so that other Codidact instances could very loosely associate, but the profile details (beyond username/authentication), Q&A content, etc. would be totally separate, which should, I hope, avoid a lot of technical issues and also legal issues (i.e., if we aren't moderating a particular topic site then we are not responsible for what someone posts there).
I tend to hold a similar opinion on this. Decentralization is a nice thing to have, but there are significant technical challenges in implementing it that we certainly don't want to address in MVP, because it'll slow us down considerably.
Once we're closer to being set up and running, we do need to turn consideration to things like - as Monica said - the criteria that apply to sites that want to join, and we'll need to balance that against our available hosting resources. Ultimately, the more sites we host, the more it'll cost in terms of both technical resources and human resources to staff and support. That'll be offset some by the added publicity providing some degree of boost to our funding, but there's likely to be a disparity and we need to make sure we balance that.
That said, being open-source software gives us a degree of decentralization for free - since anyone can download the software and set up their own instance of it, sites that we don't feel able to host for whatever reason on the "official" instance can set up for themselves. To that end, I do feel that we should make APIs and data dumps and the like available to make it easy for communities to migrate - both in to a Codidact instance (such as an SE community moving to our "official" instance), and out of one (such as a community that doesn't like our governance wanting to set up for themselves).
I agree with most of this. In case I wasn't clear, I'm only asking that we develop in such a way that we could enable a distributed network, i.e. different Codidact instances talking to each other to link profiles, propagate network-wide announcements, maintain a global listing of network sites, etc. I'm not saying we need to do any of that early on (and we might not do it at all), but I'd like to keep the possibility open.
Anybody can take Codidact and set it up; we're not locking down the software at all. So, as we go, we should track what's needed to actually set up an instance, to enable others to actually do that.
Decentralization is hard to build. A decentralized information repository is harder to use: at any point, not everything is easily accessible, if at all (DenverCoder9, what do you see that I can't?). A decentralized information repository is also harder to share: different people have a different view of the available information ("Look at the three answers to this question." "Huh? It only has two.").
The problem we're trying to solve is creating a repository of knowledge, not aggregating existing knowledge. How is it useful to spread the information among sites?
I strongly oppose peer-to-peer storage. I want information to be available to everyone, not lost because nobody happens to be sharing a particular block anymore.
There is a dependency on the owner of the frontend anyway. If it isn't the content server, it's the content server directory, i.e. whatever serves as the entry point to the platform. We can solve ownership of the entry point through legal means, with a suitably governed nonprofit organization. We don't need any fancy technical solution for that.
This feels like a solution in search of a problem to solve. It definitely creates many more problems.
I meant decentralization at the level of the community, not at the level of the individual post. The latter would be completely unworkable.
On SE, Ask Ubuntu and Math Overflow are sites with external affiliations where someone else might have preferred to host the community. (I'm not deeply familiar with the history of either site.) SE has several single-product sites that might fit better into sponsoring organizations' structures, or not. Also on SE, Christianity has long struggled with some of SE's expectations (at least based on reading their meta) and perhaps would struggle with ours but be happy to stand up their own server.
Maybe it's not a strong use case; I haven't thought deeply about it. I just worry a little that if we say you can only do this through our network, then we could have SE's problems several years from now. Maybe it's premature to think about that now; I thought it wouldn't be hard to keep the door open on decentralized communities, and so suggested it. If that makes the project way more complex, then I agree that those wanting it need to make a stronger case.
Maybe I am getting too old and I am nostalgic about the old (but much simpler) internet, when content seemed to be more spread out. Several companies found that not so useful and gathered everything into one single place. Now all searching is done on Google, all online sales on Amazon, all social stuff on Facebook (or whatever they bought up), all videos on YouTube. The internet is gravitating towards a commercialized monoculture (there are some alternatives, but they remain limited).
Creating a fork is nice, but the spaghetti that you eat with it is the nice stuff. Maybe I am too naive about the technological part of a project like this. I imagine that the actual basis/content of the sites, as long as it follows some standard, could be swapped or shared from place to place, and that the content can be kept separate. Then the communities can grow much more independently, while possibly sharing repositories of questions and answers (which can be simple and only need to contain some sort of version control and a way to credit the originators) on a shared server (but it will be, due to the split architecture, easier to move around and rebuild elsewhere).
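To make the "some sort of version control and a way to credit the originators" part a bit more concrete, here is a minimal sketch in Python. It is only an illustration of how small such a repository record could be, not a proposal for Codidact's actual schema; the names `Revision`, `Post`, `edit` and `contributors` are all made up for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Revision:
    """One immutable edit of a post (hypothetical structure)."""
    author: str  # who made this edit, kept so the originator can be credited
    body: str    # full text of the post after this edit


@dataclass
class Post:
    """A question or answer stored as an append-only list of revisions."""
    post_id: str
    revisions: list[Revision] = field(default_factory=list)

    def edit(self, author: str, body: str) -> None:
        # Append a new revision; earlier revisions are never rewritten.
        self.revisions.append(Revision(author, body))

    @property
    def current_body(self) -> str:
        # The latest revision is what a frontend would display.
        return self.revisions[-1].body

    def contributors(self) -> list[str]:
        # Authors in order of first contribution, without duplicates.
        return list(dict.fromkeys(r.author for r in self.revisions))
```

Any frontend that understands this record could show the post, its history and its credits, which is the property that would make the content easy to move around and rebuild elsewhere.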
It is a bit like how I dislike that my Facebook profile is stuck to Facebook. I cannot easily take off the Facebook coat and put on a different coat. There is all sorts of integration that must be cut out. I can take all the posted photos and videos, but basically all the history, links with friends, and interactions in messages and posts are lost.
So what I am thinking about for a decentralized Q&A is something analogous to what https://diasporafoundation.org/ is to Facebook. At least I am personally reluctant (and I imagine others might be as well) to start again creating questions and answers on just another copy of StackExchange/StackOverflow. How much different is this new Q&A site gonna be from SE/SO if it ends up as a very similar web 2.0 concept (contrary to, say, this definition of web 3.0)? What prevents the same problems from coming back?
@MasonWheeler Which is problematic, if our value system is supposed to value creating a strong community. Having a system that makes it easy to break up the community is then antithetical to our values.
Not necessarily. Keep in mind three things:
There are really two products here: Codidact == Software for a Q&A site and TBD == An instance of Codidact with content imported from SE (exactly which/what TBD) and new content with a core initial group of largely former SE users who would like to start a new Q&A community. We have to build Codidact before we can start an instance of it to support our community goals, and doing so open-source is supported by most of the people involved at this time (certainly myself included).
Just because something is open-source does not mean it will actually get used "elsewhere" to any significant degree. Sometimes that is the case (look at the number of Linux distros), sometimes it is not.
There may be additional communities that, for a variety of reasons, want their own instance:
a group wants to have a private Q&A system like SE Teams but have full control over the details
a group wants to have a public Q&A but is not compatible with the "be nice" or other policies enforced by the Codidact development group on its own instance; this would allow them to use the software without (for better or worse) any of the policies.
a group in another country is unable (due to government restrictions limiting outside access to sites that have relatively "free as in speech" content) to use our primary instance of Codidact but would still like to benefit from the environment it provides by setting up their own instance inside their country for similar uses (well, without as much "free as in speech" on certain topics, but with "be nice" and multiple topic sites, etc.)
The software is open source. The content is open source. So there's no technical or legal obstacle to forking. All you have to do is to convince people to join your community. What problem does "decentralization" (and I still don't really understand what you mean by that) solve?
To put it another way, in what way is Wikipedia overly centralized? What would a decentralized Wikipedia look like?
I think I have a good idea on how to build a decentralized network without impacting the development of codidact significantly:
codidact remains as a public server with its own backend datastore
stack exchange remains as a public server with its own datastore
other sites (on codidact or se or other software) can also be added
all that is needed is a communication mechanism (an API) that connects rep/questions/answers/comments/users across different sites.
Such an API can be built on top of SE's API (albeit in one direction only), but I see no major problem in implementing it as an add-on to codidact or other similar software.
How would this work? On a cron (e.g. daily or hourly), new content can be exchanged via the API. Each server participating in the network will have a list of other servers to sync with. Each server is free to do whatever it wants with the content (respecting CC-WIKI, of course). Typically the API implementation would have some server reputation mechanism to prune the incoming firehose.
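As a rough illustration of that sync step, here is a small in-memory sketch in Python. Everything in it is an assumption made for the example: the `Server` and `Item` classes, the `trust` scores and the `min_trust` threshold are hypothetical, and `export_items` stands in for what would really be an HTTP API call made from a cron job.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Item:
    item_id: str  # globally unique, here "<origin>/<local id>" (an assumption)
    origin: str   # name of the server where the item was first posted
    author: str
    body: str


@dataclass
class Server:
    name: str
    # Other servers this one pulls from (excluded from repr/compare to avoid cycles).
    peers: list["Server"] = field(default_factory=list, repr=False, compare=False)
    # Hypothetical server reputation: origin name -> score between 0 and 1.
    trust: dict[str, float] = field(default_factory=dict)
    min_trust: float = 0.5
    store: dict[str, Item] = field(default_factory=dict)

    def post(self, local_id: str, author: str, body: str) -> Item:
        # Create a new item that originates on this server.
        item = Item(f"{self.name}/{local_id}", self.name, author, body)
        self.store[item.item_id] = item
        return item

    def export_items(self) -> list[Item]:
        # Stand-in for the API endpoint a peer would call over the network.
        return list(self.store.values())

    def sync(self) -> int:
        # Pull from every peer; keep only unseen items from trusted origins.
        added = 0
        for peer in self.peers:
            for item in peer.export_items():
                if item.item_id in self.store:
                    continue  # already have it
                if self.trust.get(item.origin, 0.0) < self.min_trust:
                    continue  # prune the firehose: origin not trusted enough
                self.store[item.item_id] = item
                added += 1
        return added
```

In this sketch an item keeps its original `origin` and `author` as it is passed along, so attribution survives even when the content reaches a server only indirectly through another peer. Edits, deletions and conflicting copies are deliberately left out; those are where most of the real difficulty would be.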
Having this network has significant advantages:
if a server goes off the grid or gets paywalled or becomes evil, their content will already be shared across the network
the community can be local to a server (which is good) but it can also migrate easily to any other server without losing content (which is also good)
different servers can serve different purposes (e.g. one server could be read-only and focused on being a googleable reference, another focused on curating content, another on community building)
Regarding the feasibility of the system, it has been implemented in the past, e.g. with FidoNet. Newsgroups are also distributed systems, but without many of the essential features we'd have.
@sklivvz Reading your post as a user of the system rather than a developer: "blah blah blah tech stuff blah blah blah more tech stuff"
As a user, what is this all about? I know that "servers" are involved in storage and communication, but why should I care how they work? I'm a member of my community. I go to this site that I've bookmarked (or I can type a URL, or I can search for the name in a web search engine), and I see the same content and the same people every day (modulo whatever's changed since yesterday).
The first major difference for a user is that if their server becomes evil or goes down, they can (almost) seamlessly go elsewhere and find their content already attributed.
The other major difference is that you'd see extra content (and in some cases significant amounts of it) appearing with a slight delay on your "daily" server.
Youâd also see some notice of other servers, depending on how the server implements it, but I think this could be fairly similar in appearance to how stack exchange implements its own network, except the servers would be different nodes.
Maybe I mean that the site should be modular. Yes you can take the entire code and database and put it somewhere else, but can you? What if you want to change feature X how is it gonna impact Y and Z? What if you want to just use the content but it has been organised in a way that worked for the particular site but becomes awkward to reuse somewhere else?
I guess that what I am steering at is "to have the site set up in a more independent way", independent from presentation. Keep the content in a format that is as general as possible and does not involve any particular style of the website. The goal is to have a database with questions and answers. All the frivolous additions should stay away from the main data.
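A small Python sketch of what I mean by keeping the main data general. This is only my own illustration under assumed names (`CoreQuestion`, `CoreAnswer`, `export_question`, `import_question`); it is not how any existing site stores its data. The core record holds nothing but the questions and answers, and it round-trips through plain JSON so that any other site could read it.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class CoreAnswer:
    author: str
    body_markdown: str


@dataclass
class CoreQuestion:
    # Only the essential content: no votes, badges, themes or page layout.
    title: str
    author: str
    body_markdown: str
    answers: list[CoreAnswer]


def export_question(q: CoreQuestion) -> str:
    # Serialise to plain JSON that does not depend on any particular site.
    return json.dumps(asdict(q), sort_keys=True)


def import_question(text: str) -> CoreQuestion:
    # Rebuild the record from JSON produced by export_question.
    data = json.loads(text)
    answers = [CoreAnswer(**a) for a in data.pop("answers")]
    return CoreQuestion(answers=answers, **data)
```

Site-specific extras (votes, reputation, styling) would then live in separate records that refer to the core data, so they can be dropped or replaced without touching it.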
How you organise the site does influence the shape of the content a lot. In SE/SO questions and answers you have a lot of large contributions by single users. This makes keeping track of "ownership" and ratings like "reputation" important, and that metadata becomes important too (but is not easy to transport/move to another site, because it is private data).
This contrasts with wiki-type posts and articles, which are much more like many little contributions from a lot of different people. The "ownership" is much less important.
(To explain better how I feel about this: on Wikipedia I have much less trouble editing a piece of text directly. However, on SE I either place a comment or post an answer/question myself. On SE I somehow feel obstructed from editing someone else's post and feel it is much more "their" work.)
A wiki can be copied one-to-one, but SE-style Q&A cannot be copied 100%, because the way it is organised creates some connection to the site (e.g. see how we discuss the transfer of reputation and votes, which would not be an issue for wiki-style data).