The production system (i.e., the primary instance of Codidact) will have a number of hosting requirements which place it beyond the minimal servers currently provided at no charge by various people involved in development of Codidact. The needs primarily fall into three categories:
Web Server
The web server actually runs the code that powers the web site. Many shared hosting systems (not really a consideration here, but for comparison) like ICDSoft (currently hosting the barebones codidact.org page), GoDaddy and others are limited in what software you can actually run on the server. In our case, we need to run the primary stack (currently planned as C#/ASP.NET) and we need flexibility to control many aspects of the server which are not generally practical on a typical shared server. In addition to full (or “nearly full” - with almost any remote server there are always some things you can’t/shouldn’t do) control of the server, we will need regular data backups, high speed connections and a high-reliability data center.
Database Server
We are currently planning on PostgreSQL for the database engine. Typical shared hosting includes MySQL and limited storage & speed - we’re already planning for much more than that. We can set up our own PostgreSQL server but a managed solution provides a number of benefits with respect to backup, bandwidth, availability and other factors.
File Server
Any web server can also function as a file server. However, in addition to the typical, relatively small batch of JS, CSS and site image files (logos, avatars, etc.), we will also need quite a bit of storage for image files uploaded by users.
Ideally, all 3 of these systems should be scalable to handle more users and storage as we need them without locking us into any long-term contract or significant up-front costs.
There are a number of possible solutions:
Typical Shared Hosting
Simply not an option here. For anyone who does not already know, any hosting company that offers “unlimited bandwidth” and/or “unlimited storage”, etc. for $5 or $10 (or even $100) per month is flat-out lying. TANSTAAFL. Shared hosting has its place for sites that are relatively small and expected to stay small. We need to plan BIG. Note however that a really good shared hosting company (ICDSoft is the only one I put in that category - my list of “not so good” is a long one…) does a great job of taking care of web server backup, firewalls, database backup, hardware repairs (move you to a new server when there is a problem), etc. But shared hosting just doesn’t do the job for our grand plans.
Your Own Server
By this, I mean a service such as Rackspace where you buy a specific server (or group of servers to expand, etc.) and run whatever you want on the server. That can work great if your system is sized just right. In my experience, you either end up with a server that is too small - and have to rush to expand when you hit a limit - or too big - and are then paying for more than you need. In addition, no matter what they (the data center/management company) say, I have found that when you have your “own” server, you do not get the same level of support when problems arise that you get with one of the newer cloud services (below) or good shared hosting (above). Again, I do not feel this is a viable option for us.
AWS/Azure/Google Cloud
AWS (Amazon), Azure (Microsoft) and Google Cloud (Google) are 3 services that offer various combinations of servers with some great advantages:
- Pay for only what you need
- Expand when you want very easily
- Multiple data centers with automatic live backup of databases (AWS calls this “Multi-AZ”, for multiple availability zones)
- Easy replacement of failed servers
- Easy backup of web servers on a scheduled basis
- Super high bandwidth
and many other advantages.
I only have personal experience with AWS (a few years now with my largest customer and also a few smaller projects). Specific features that AWS has that I believe make it a good fit for us:
- Web Server = EC2
The primary web server platform in AWS is EC2. You can run pretty much anything on it (there are terms of service limitations, as with almost any service, but the servers themselves can run an Amazon-customized Linux distro, many other versions of Linux or even Windows, though I don’t recommend that). You can associate as much storage (normally SSD) as you want with an EC2 instance. You can spin up multiple EC2 instances to handle increased usage (though you have to decide how to split the usage among servers) or keep increasing the size of the server (essentially spin up a new server, turn off the old server and move the storage over to the new server).
There are also alternatives. For example, AWS Lambda lets you (essentially, don’t complain too much if I have the terminology wrong…) have small processes spin up in response to requests, so that you (a) only pay for the CPU time you use and not 24/7 for an EC2 server and (b) have huge capacity because AWS will spin up as many Lambda processes as needed to meet demand. I have no idea if this is compatible with our chosen tech stack or not, but just mentioning it as a possible option.
- Database Server = RDS
This is one of the places where I think AWS really shines. RDS can be configured with a number of different database servers, including PostgreSQL. You pay based on storage and server size in CPU cores, RAM, etc. (i.e., much like the pricing of EC2). However, RDS is specifically optimized for databases. It includes backup, live mirrored data (Multi-AZ - though that doubles the cost), automatic database system updates, etc. Effectively a plug 'n play database appliance but with no upfront cost. Scaling is great - start small and increase as you need to.
Note that both EC2 and RDS can be paid for by the hour (which in practice means a monthly bill), or you can pay nothing, part or all upfront on a 1- or 3-year contract to save a considerable amount once you have an idea how much long-term capacity you really need.
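To make the hourly vs. contract trade-off concrete, here is a rough sketch of the break-even arithmetic. The function names and all of the rates in the example are placeholders I made up for illustration, not real AWS prices; plug in current numbers from the AWS pricing pages before drawing any conclusions.

```python
HOURS_PER_YEAR = 24 * 365


def on_demand_cost(hourly_rate, hours_used):
    """Pay-as-you-go: you are billed only for the hours the server runs."""
    return hourly_rate * hours_used


def reserved_cost(upfront, hourly_rate, term_years):
    """Contract pricing: upfront fee plus a (lower) hourly rate that is
    charged for every hour of the term, whether the server is used or not."""
    return upfront + hourly_rate * HOURS_PER_YEAR * term_years


def break_even_hours(on_demand_rate, upfront, reserved_rate, term_years):
    """Hours of on-demand usage at which the contract becomes cheaper."""
    return reserved_cost(upfront, reserved_rate, term_years) / on_demand_rate


# Hypothetical rates, for illustration only (NOT actual AWS pricing).
hours = break_even_hours(on_demand_rate=0.10, upfront=300.0,
                         reserved_rate=0.03, term_years=1)
print(f"Contract pays off after about {hours:.0f} hours "
      f"({hours / HOURS_PER_YEAR:.0%} of the year)")
```

The point is simply that a contract only saves money if the server is running for most of the term, which is why it makes sense to start hourly and commit later.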
- File Server = S3
You can use an EC2 instance as a “normal” web file server. However, Amazon offers S3 as a high-speed file storage system. Everything is stored in “buckets”. The number of buckets is actually relatively limited, so typically you might have one for Dev. and one for Production with everything else stored hierarchically inside the bucket. Technically the buckets have everything at one level, but they use / as a separator to essentially mimic a typical file system. Security options are quite flexible - you can have files that are totally open (great for CSS, JS, system images - the client browser can get the files directly) or secured (check user privileges first and then have your EC2 process read the file and serve it to the users).
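As a rough illustration of how a flat set of keys can mimic a file system, here is a small pure-Python sketch. It does not call AWS at all; `list_level` and the example keys are hypothetical names of my own, and the logic only approximates what S3 does when you list a bucket with a prefix and a `/` delimiter.

```python
def list_level(keys, prefix="", delimiter="/"):
    """Return (objects, subfolders) found directly under `prefix`.

    `keys` is a flat collection of full object names, as stored in a bucket.
    Anything with another delimiter after the prefix is rolled up into a
    "subfolder" entry instead of being listed individually.
    """
    objects = []
    subfolders = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            # Deeper in the hierarchy: report only the next "folder" level.
            subfolders.add(prefix + head + sep)
        else:
            objects.append(key)
    return objects, sorted(subfolders)


# Hypothetical bucket contents, for illustration only.
EXAMPLE_KEYS = [
    "robots.txt",
    "static/css/site.css",
    "static/js/app.js",
    "uploads/42/avatar.png",
]

print(list_level(EXAMPLE_KEYS))
print(list_level(EXAMPLE_KEYS, prefix="static/"))
```

Nothing here is actually nested: every file is a single key in one flat list, and the "folders" exist only because of how the names are split when listing.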
Overall, I think EC2, RDS and S3 are a great system and could work very well for a Codidact production system. However, I am open to consideration of other platforms, so if anyone has any experience with others please speak up. In addition, we need to consider how EC2 or other options will relate to the chosen tech stack (I am quite confident about the suitability of RDS and S3).