MVP Image Uploading

manassehkatz · 15 November 2019 20:29

Image uploading into questions and answers is definitely MVP. Some of the features will likely not be

Process: Image uploading should include local file upload, image URL paste and (if possible/practical) pasting image data from the local clipboard.
Image Storage: We should use our own storage in order to have complete control - i.e., avoid a 3rd party service going out of business or being blocked for legal issues (e.g., if the same service hosted images deemed illegal). AWS S3 is an option, currently at $0.023/GB/month.
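To put the quoted price in perspective, here is a rough back-of-the-envelope sketch. The only figure taken from this post is $0.023/GB/month; the image count and average size are made-up inputs, and request and data-transfer charges are deliberately left out.

```python
# Storage price quoted in this post (USD per GB per month).
S3_PRICE_PER_GB_MONTH = 0.023


def monthly_storage_cost(image_count: int, avg_image_mb: float,
                         price_per_gb_month: float = S3_PRICE_PER_GB_MONTH) -> float:
    """Estimate the monthly storage bill in USD (storage only)."""
    total_gb = image_count * avg_image_mb / 1024  # treat 1 GB as 1024 MB
    return total_gb * price_per_gb_month


if __name__ == "__main__":
    # Hypothetical load: 100,000 stored images averaging 0.5 MB each.
    print(f"${monthly_storage_cost(100_000, 0.5):.2f} per month")
```

Storing edited copies alongside originals would scale the estimate by however many variants are kept per image.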
Metadata Storage: The database will need to store information such as the original source (user), filename, etc. All references to the actual image should be via the database, so that if we change storage platforms (or rearrange files within a platform), only the routine that fetches the image based on the metadata will change, but no actual links within Q & A, etc.
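A minimal sketch of that indirection, assuming posts store only an opaque reference and a single routine turns it into a URL. Every name here (`ImageRecord`, `IMAGES`, `STORAGE_BASE_URL`, the `image:` prefix, the example host) is hypothetical, and the dictionary stands in for a real database table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageRecord:
    image_id: int
    uploader: str            # original source (user)
    original_filename: str
    storage_key: str         # location inside the current storage backend


# Stand-in for a database table keyed by image id.
IMAGES = {
    1: ImageRecord(1, "example_user", "wiring.jpg", "2019/11/0001.jpg"),
}

# The only setting that changes when the storage platform changes.
STORAGE_BASE_URL = "https://images.example.invalid"


def post_reference(image_id: int) -> str:
    """What a post stores: a stable reference, never a storage URL."""
    return f"image:{image_id}"


def resolve(reference: str, base_url: str = STORAGE_BASE_URL) -> str:
    """Turn a stored reference into a fetchable URL via the metadata."""
    image_id = int(reference.split(":", 1)[1])
    record = IMAGES[image_id]
    return f"{base_url}/{record.storage_key}"
```

Moving to another platform would then mean changing `base_url` (or the stored `storage_key` values), while the text of every question and answer stays untouched.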
User Manipulation: We should not provide a full image editor. But 3 simple functions:
- Crop
- Rotate
- Resize
  can take care of the vast majority of problems with uploaded images. Many users, particularly those uploading directly from smartphones, are unable to crop/rotate/resize locally before upload. I have at times downloaded images, fixed them and uploaded the fixed versions simply because I couldn’t stand looking at an upside-down image with extraneous junk, etc. Make it easy and users can do it themselves (the original user or another user editing). These functions will require a good UI, but the end result is a handful of numbers: 4 for crop (left/right/top/bottom), 1 for rotate (angle) and 1 for resize (percentage), so they can be stored in the database with the other metadata. The adjusted images can be stored separately (more storage cost), generated on-the-fly (more processing each time) or handled with a mixture (store recent images and purge all but the ‘original’ after a certain time).
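As a rough illustration of how small that edit record could be, the sketch below holds the six numbers and works out the resulting image dimensions. The field names, the use of pixel counts for the crop values, and the crop-then-rotate-then-resize order are all assumptions; actual pixel processing would need an imaging library.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageEdit:
    crop_left: int = 0            # pixels removed from each edge
    crop_right: int = 0
    crop_top: int = 0
    crop_bottom: int = 0
    rotate_degrees: float = 0.0
    resize_percent: float = 100.0


def output_size(width: int, height: int, edit: ImageEdit) -> tuple:
    """Return (width, height) after applying crop, then rotate, then resize."""
    w = width - edit.crop_left - edit.crop_right
    h = height - edit.crop_top - edit.crop_bottom
    if w <= 0 or h <= 0:
        raise ValueError("crop removes the whole image")
    # Bounding box of the cropped rectangle after rotation.
    theta = math.radians(edit.rotate_degrees)
    c, s = abs(math.cos(theta)), abs(math.sin(theta))
    rotated_w = w * c + h * s
    rotated_h = w * s + h * c
    scale = edit.resize_percent / 100.0
    return max(1, round(rotated_w * scale)), max(1, round(rotated_h * scale))
```

For example, a 4000x3000 photo with 100 pixels cropped from the left and right, rotated 90 degrees and resized to 50% comes out at 1500x1900.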

MartijnWeterings · 15 November 2019 21:31

Is the own storage just for backup or is it also gonna be used for hosting the images?

gilles · 15 November 2019 22:25

Ability to include images in posts: yes, images in posts are very useful. That’s not strictly needed for MVP, links to images will do in a pinch. But it’s definitely better if we can show them inline.

Having our own storage for images: that’s preferable, but not easy or cheap. Not MVP. We can leverage the Wayback Machine to ensure images don’t get lost: ensure that it’s crawling our site and includes linked images.

Built-in editor: nice, but that’s absolutely not MVP. MVP means minimum viable product, not everything we’d like to have.

manassehkatz · 17 November 2019 00:40

Hosting. I am 100% against using a 3rd-party service like imgur. The cost of hosting ourselves is minimal and it solves a lot of potential problems. The issues (copyright violations, inappropriate images, etc.) that we will inevitably have to deal with will happen (and will affect us) whether we host the images or just point to them.

ArtOfCode · 17 November 2019 00:42

I tend to lean towards self-hosting images, but it absolutely needs some solid investigation into costs and likely available funding.

I also tend to agree with @gilles - an on-site image editor would be good, but is not MVP. If SE gets by without one in production, so can we until we have built a solid product and can come back for a second pass.

manassehkatz · 17 November 2019 00:45

Can’t rely on Wayback Machine to ensure images don’t get lost - even less than a paid 3rd party service. Hosting our own is not expensive, and I argue should be MVP.

One way to keep the cost down initially, which would be simpler as well, is to not store any images that are retrieved as part of transferred Q&A from SE (or other sites) and instead link those images to their original location (imgur or direct hosting on any other site). That is quite logical and eliminates the initial burden of transferring (potentially) millions of images. It may also avoid some licensing issues.

I do agree that built-in editor is not MVP. But I believe that the 3 key functions (crop, rotate, resize) can be done fairly easily and would be an example of an advantage over some competing systems.

Marc.2377 · 17 November 2019 07:46

This is indeed very likely to be true, especially if we use an existing solution (at least a starting point) as we should anyway IMO, instead of reinventing wheels, tires and entire carriages.

But yes, matter of fact is, this is not an MVP feature.

gilles · 19 November 2019 00:44

I’ve changed my mind on image hosting. We should only accept embedded content that we or some independent service hosts. Ideally we should host the images; if not, they should be on some reasonable service such as Imgur.

The reason is privacy. Stack Exchange allows posts with embedded images on a server that the author controls. This allows the author to at least collect IP addresses of visitors, and possibly do some browser fingerprinting. It’s a privacy violation. Let’s not do that.

ArtOfCode · 19 November 2019 00:52

On the other hand, the ability to hotlink an image is useful - maybe Wikimedia Commons has the exact image that I want, and I don’t want to have to bother re-uploading it.

So maybe the solution is in the middle - we (a) have a list of whitelisted domains that can be hotlinked from, including things like Imgur and Wikimedia, and we (b) make it easy to re-upload images that are not on those domains - provide a box to paste a URL into and have the image it refers to re-uploaded.
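Part (a) could look something like the sketch below. The host list is purely illustrative, and requiring HTTPS and an exact host match are my assumptions rather than anything agreed in this thread.

```python
from urllib.parse import urlsplit

# Illustrative allowlist only; the real list would be a project decision.
ALLOWED_HOTLINK_HOSTS = {"i.imgur.com", "upload.wikimedia.org"}


def may_hotlink(url: str) -> bool:
    """Return True if an image URL points at an allowlisted host over HTTPS."""
    try:
        parts = urlsplit(url)
        host = parts.hostname or ""
    except ValueError:
        return False  # malformed URL
    if parts.scheme != "https":
        return False
    # Exact match, so "i.imgur.com.evil.example" is not accepted.
    return host in ALLOWED_HOTLINK_HOSTS
```

Anything that fails the check would fall through to part (b), the re-upload path.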

On the third hand, that raises issues of licensing and copyright that could potentially land us in some hot water, so maybe we can’t offer re-uploading.

ArtOfCode · 19 November 2019 00:59

On the fourth hand, maybe we adopt a Medium-like approach to hotlinking: allowed, but any third-party hosted embedded content will be replaced by an overlay and not loaded until the user specifically clicks it to load it in. Embedded content that we host doesn’t need to have an overlay.

manassehkatz · 19 November 2019 01:00

We absolutely have to allow re-uploading. A very significant percentage of users (except the true geeks) have no clue how to properly upload a picture from their phone to their computer or from their phone straight into a browser. Many (not all, but a significant percentage) upload to some “system” that is easy from their phone - that might be imgur or facebook or something else. Once it is there, they upload to SE by copying/pasting a link to the image from a web site. There is no practical way to distinguish between those links (their own personal content that has zero license issues) and content from other sites. For that matter, we have no easy way to distinguish between a person’s own Facebook post and a grab off of someone else’s page. (No, I have no idea what the Facebook licensing is - even if that turns out to be OK, there will be other sites where it is a problem.)

So we allow “any images”. We make it easy. We give a BOLD WARNING about copyright and other legal issues (you can guess what those are with images). And then we have to be reasonably vigilant when issues or questions come up.

Corsaka · 29 November 2019 14:20

Would pasting an image in from clipboard ever be a possibility? I know that’s one of my favourite Discord features, and it would circumvent the issue of pasting in links for at least some users. Probably not MVP, but I don’t want to have to create another thread to just post a single line.

manassehkatz · 29 November 2019 16:26

Yes. I mentioned that at the beginning:

There really are a variety of users with different capabilities, devices and preferences. The ideal will be link, upload or paste.

cellio · 31 December 2019 17:45

This thread has been dormant for a while. I note that Writing needed image uploading within its first few days here, and I don’t think that will be unusual.

I think we need image upload for MVP. I don’t think we necessarily need other tools; you can do your cropping, rotating etc outside of Codidact before uploading. We do need to handle large images, presumably with a combination of scaling (for display) and limiting file size (to prevent huge uploads that’ll never display anyway).
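A minimal sketch of those two checks, assuming a byte cap at upload time and proportional down-scaling at display time. Both limits below are placeholder values, not proposals.

```python
MAX_UPLOAD_BYTES = 2 * 1024 * 1024   # hypothetical 2 MiB upload cap
MAX_DISPLAY_WIDTH = 800              # hypothetical display width in pixels


def accept_upload(size_bytes: int) -> bool:
    """Reject empty files and anything over the upload cap."""
    return 0 < size_bytes <= MAX_UPLOAD_BYTES


def display_size(width: int, height: int, max_width: int = MAX_DISPLAY_WIDTH) -> tuple:
    """Scale down to max_width, keeping the aspect ratio; never scale up."""
    if width <= max_width:
        return width, height
    return max_width, max(1, round(height * max_width / width))
```

With these numbers a 4000x3000 photo would be shown at 800x600, while a 640x480 image would be left alone.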

The question of whether we require upload or allow links to extant images remains open; do we need to resolve that for MVP?

manassehkatz · 31 December 2019 18:34

Upload is preferred. That should (but not necessarily MVP) include the ability to paste a URL and let Codidact pull the image from elsewhere. While conceptually that may seem the same as “link to extant images”, it is not. Link to extant images (i.e., a link that displays an image here) has several problems:

Links break over time due to content elsewhere changing URLs or web sites disappearing - all beyond our control.
Actually allowing user-entered <img> links raises a host of security issues.
Storing the images locally allows us to make sure the images are truly images (i.e., match a normal JPG or PNG or GIF file format).
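For the last point, a first-pass check can compare the leading bytes of the file against the well-known JPEG, PNG and GIF signatures. This is only a sketch with a hypothetical function name: a signature match is a cheap filter, and fully validating a file would still mean decoding it with an image library.

```python
# Leading "magic" bytes for the three formats mentioned above.
SIGNATURES = {
    "jpeg": (b"\xff\xd8\xff",),
    "png": (b"\x89PNG\r\n\x1a\n",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_image_type(data: bytes):
    """Return 'jpeg', 'png' or 'gif' if the data starts like one, else None."""
    for name, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):
            return name
    return None
```

An upload whose bytes return `None` here would be rejected regardless of its file extension or declared content type.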

Storing images locally (whether collected via link paste or a true upload) has the one disadvantage of leaving the site open to copyright issues. However, that is really not a big deal - as with text content, if there is a reasonable request (however that is defined) to remove, we remove it and notify the user who can challenge, etc. But it has the big advantages (over links-to-user-supplied-sites or to SE-style hosted-for-us-elsewhere) of keeping the system reasonably self-contained and not reliant on other sites for primary content.

cbrumbaugh · 31 December 2019 18:38

How is that going to work with importing the content from the existing SE sites, where all images are external?

cellio · 31 December 2019 18:56

We’ll probably need to do a one-time operation of copying images from SE’s imgur space to ours – a job that runs as a follow-on to the data-import job. I don’t think we need to block on that; those images work today and Stack’s imgur space is durable. But we’ll want to clean that up and do our own hosting, for all the reasons already listed and also so that we’re not vulnerable to SE deciding to block external deep links. (And since they’re paying for that space, they could reasonably do that.)

manassehkatz · 31 December 2019 20:56

A quick look at imgur’s info was a bit scary - they only allow reuse of images for personal/non-commercial (which arguably we won’t be, even if we don’t actually charge anyone for anything) and “fair use” (definitely not). And actually in a number of ways the newer imgur licensing is “nasty” - it grants imgur the right to do pretty much anything with the images without granting much in the way of use to people who download them - i.e., a one-way street.

However, the SE license makes it pretty clear (I’ve read a few Q&A about it) that (a) the images are licensed with the same CCBYSA as user-supplied Q&A text and (b) imgur does not automatically get their preferred license, thanks to a deal (hopefully still in place…) with imgur.

So IANAL, but it appears that we can download standard SE-imgur images within CCBYSA guidelines without any real concerns.

Helmar · 19 January 2020 18:28

Consensus summary.
If anyone disagrees please answer to this post.
Sufficient likes / one day of silence are considered agreement.

MVP functionalities are:

We have image display in posts in MVP
The images are hosted by us.
- That means we have an image upload functionality (i.e. picking an image [file] from the end device).

Not in MVP are:

Adding a trusted image hoster whose images can be linked.
Further upload mechanisms
- uploading by providing a web link
- drag’n’drop
- Other pasting mechanisms
Any kind of image editing (i.e. cropping, rotating, resizing)

Remark: Everything regarding a potential import has to be covered with the import functionality.

manassehkatz · 19 January 2020 18:43

I think that while this is not “must be MVP”, it should not be that hard to implement and would help quite a bit for many users (the ones who have trouble figuring out how to navigate to pictures on their phone to upload them but who have all their pictures magically appearing in some web site and can grab a link from there).