In many startups I’ve worked with, image uploading was a part of their web application’s workflow. From user avatars to uploadable inventory pictures, it was a common-enough feature to be present in almost every system.
Unsurprisingly, many of those startups’ upload implementations suffered from the same issues. The logic that handled image uploads was ad hoc. File processing happened inside the request itself, causing the application server’s request queue to back up.
In short, it wasn’t so much a system as a bunch of disparate workflows cobbled together (typical of most startups). The result? A lot of bugs in image handling, mysterious crashes that couldn’t be traced, and random timeouts.
I’m here to show you a better way.
Understanding the technical concerns
Image uploading can be complex, and use-cases vary depending on the system.
There are a lot of concerns with images that most people don’t think of when they dive into building an upload feature.
Images in a web application can easily touch on the following technical concerns:
- Displaying the image on the front-end
- Authorizing users to download the image
- Authorizing users to upload the image
- Uploading the image to your server in a scalable way
- Validating the image data
- Processing the image, performing cropping, optimization, and other tasks
- Creating image variants, such as banners and thumbnails
- Storing the images
- Associating the images to whatever records you’re uploading them for (such as user avatars or campaign banners).
While not all of these would be present, a solid system will be able to flex and adapt to support these use cases with minimal changes.
When thinking of a solution that can address these concerns, it quickly becomes evident that the final solution will be more complex than adding a column to your user model, adding an endpoint to upload the image, and calling it a day.
It’ll be helpful to dive into the tradeoffs to consider in a solution.
Image uploads are notorious for crashing servers or causing timeouts. If a user attempts to upload a 10-megabyte image, that’s a lot of resource usage:
- 10 megabytes of your server’s memory tied up
- A request handler being tied up for the entire amount of time it takes to upload 10 megabytes
- CPU usage to deal with the image upload
With one or ten users uploading images, this isn’t a big deal. Once the system sees real traffic, however, resource usage quickly gets out of control.
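To put rough numbers on that, here is a back-of-the-envelope sketch. The 5 Mbit/s client uplink and the pool of 8 request workers are assumptions picked purely for illustration, and the function names are mine.

```python
def handler_seconds(size_megabytes: float, uplink_megabits_per_second: float) -> float:
    """Seconds one request handler stays busy receiving a single upload."""
    # 1 megabyte = 8 megabits, so transfer time is size * 8 / bandwidth.
    return size_megabytes * 8 / uplink_megabits_per_second


def uploads_per_minute_at_saturation(workers: int, seconds_per_upload: float) -> float:
    """Upload rate at which every worker is busy doing nothing but uploads."""
    return workers * 60 / seconds_per_upload


seconds = handler_seconds(10, 5)                      # 10 MB over an assumed 5 Mbit/s uplink
rate = uploads_per_minute_at_saturation(8, seconds)   # assumed pool of 8 workers
print(f"{seconds:.0f} s per upload; {rate:.0f} uploads/minute saturate the server")
# prints "16 s per upload; 30 uploads/minute saturate the server"
```

Under those assumptions, thirty uploads a minute is enough to leave no handler free for anything else, and that is before any processing happens.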
As a result, the architecture has to reduce the time your server spends handling an image upload request to almost nothing, and offload the actual upload to another service (such as S3 or an in-house service dedicated to uploads). You don’t want image uploading taking up all of your web server’s capacity.
Image uploads and processing are a massive source of security holes. Any endpoint that lets you tie up a massive amount of resources is vulnerable to denial-of-service attacks (intentional or not).
On an even more worrisome note, image processing is a poorly understood aspect of engineering that has led to some tragic security flaws, providing random users the ability to execute arbitrary commands as a root-level user.
Our system architecture has to handle images in a secure way that promotes availability but also provides integrity of user data.
On the security front, there’s also business logic specific to our system. Not all images should be publicly available. Perhaps there are situations where users should not be able to download images uploaded by other users. Perhaps there might be rules surrounding who can upload an image.
Our system has to be able to handle these domain-specific authorization cases easily.
I’ve written about the value of consistency in the past. We don’t want a different way to upload an image for every kind of image we want to upload. Image upload use-cases like user avatars and campaign banners should all flow through the same image uploading workflow and should require minimal or no code change to support.
When an image is uploaded, there might be multiple places and ways it is used. Places like thumbnails, backgrounds, and profile images might use the same image in different ways. We might serve a lower resolution image to mobile users to save on bandwidth.
Whatever the case, we have to support the creation and usage of variants in our upload system. Creating variants can take a lot of processing power, and it’s not something we want our primary web server to do. The architecture has to offload that to a separate, asynchronous service.
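As a small illustration of what planning variants involves, the sketch below computes target dimensions for a few variants while preserving the aspect ratio. The variant names and bounding boxes are assumptions for illustration; the actual resizing would happen in the asynchronous service, with whatever imaging library you prefer.

```python
from typing import Dict, Tuple

# Hypothetical variant names mapped to (max_width, max_height) bounding boxes.
VARIANTS: Dict[str, Tuple[int, int]] = {
    "thumbnail": (150, 150),
    "mobile": (640, 640),
    "banner": (1200, 400),
}


def fit_within(width: int, height: int, max_width: int, max_height: int) -> Tuple[int, int]:
    """Scale (width, height) down to fit the box, preserving aspect ratio.

    Images that already fit are returned unchanged (never upscaled).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


def plan_variants(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Return the target size of every variant for one source image."""
    return {name: fit_within(width, height, *box) for name, box in VARIANTS.items()}
```

For a 3000x2000 source, `plan_variants(3000, 2000)` gives a 150x100 thumbnail and a 600x400 banner. Keeping the plan as data means a new variant is one more dictionary entry.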
All subsystems should be able to grow independently of the rest of the system. You never know when you’ll see an increased demand in usage. By keeping well-defined boundaries in your system, you can easily convert them into micro-services or a separate deployment that scales horizontally.
With these trade-offs in mind, let’s now take a look at an architecture for uploading images that fulfills the criteria.
What is each part of the architecture intended for?
The various little nuances and decision points in the architecture provide significant advantages to scalability, security, and flexibility.
But, what are those nuances? What is gained or lost by each decision point? Why are we choosing to use signed URLs or other elements? Let’s dive into the details.
Image Upload API
The image upload API handles the lifecycle of an image upload request. It’ll perform authentication, authorization, auditing, rate limiting, and other items, but it’ll leave the actual management of the file data to other services.
It never touches file data. I leave it as an exercise to the reader to find ways to make it RESTful.
Image Upload Service
The Image Upload Service encapsulates the logic of calling the endpoints in the Image Upload API with the right data in the right order, hiding it behind a single method interface. This temporal coupling is not something you want spread out through your entire system, so having a single source of truth for this algorithm is incredibly important.
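Here is a minimal sketch of what that single method could look like. The `UploadApi` protocol, its method names, the dictionary keys, and the status strings are all assumptions made for illustration, not a prescribed interface. The point is only that the ordering lives in one place, following the upload flow this article walks through.

```python
import time
from typing import Callable, Protocol


class UploadApi(Protocol):
    """Hypothetical client for the Image Upload API (names are assumptions)."""

    def request_upload(self, filename: str, content_type: str) -> dict: ...

    def confirm_upload(self, image_id: str, write_token: str) -> None: ...

    def check_status(self, image_id: str) -> dict: ...


class ImageUploadService:
    """Single entry point that runs the upload steps in the required order."""

    def __init__(
        self,
        api: UploadApi,
        send_bytes: Callable[[str, bytes], None],
        poll_interval: float = 1.0,
        max_polls: int = 30,
    ) -> None:
        self._api = api
        self._send_bytes = send_bytes  # transfers the data to the storage service
        self._poll_interval = poll_interval
        self._max_polls = max_polls

    def upload(self, filename: str, content_type: str, data: bytes) -> str:
        # 1. Ask the API where to upload; it never sees the file data itself.
        ticket = self._api.request_upload(filename, content_type)
        # 2. Send the bytes straight to the storage service.
        self._send_bytes(ticket["upload_url"], data)
        # 3. Tell the API the upload finished so processing can start.
        self._api.confirm_upload(ticket["image_id"], ticket["write_token"])
        # 4. Check back until processing is done or we give up.
        for _ in range(self._max_polls):
            result = self._api.check_status(ticket["image_id"])
            if result["status"] == "ready":
                return result["image_url"]
            if result["status"] == "failed":
                raise RuntimeError("image processing failed")
            time.sleep(self._poll_interval)
        raise TimeoutError("image was not processed in time")
```

Callers only ever write `service.upload("avatar.png", "image/png", data)`. Injecting `send_bytes` keeps the sketch testable without a network and lets you swap storage providers later.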
A signed URL is a URL that has authorization parameters in it. We can use signed URLs for uploading and downloading to provide protection and ensure that only authorized users perform these actions.
Signed Upload URLs
In a simple case, /images/get_upload_url could return an unsigned URL.
However, this means that anyone could just call /uploads and fill our data store with random files or use it as their personal file server. This is clearly not desirable.
Instead, /images/get_upload_url could return a signed URL.
This means that we have an opportunity to authenticate the user, authorize them, rate limit, or audit who generated the signed URL. If someone abuses our upload service, we have the means to stop that user.
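To make the idea concrete, here is a hand-rolled sketch of signing an upload URL with an HMAC. In practice your storage provider's SDK would normally generate pre-signed URLs for you, so treat this as an illustration of the mechanism only. The parameter names, the secret, and the five-minute lifetime are assumptions.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import urlencode

# Assumption: a secret known to the API and to whatever validates uploads.
SIGNING_KEY = b"replace-with-a-real-secret"


def sign_upload_url(
    base_url: str,
    image_id: str,
    user_id: str,
    ttl_seconds: int = 300,
    now: Optional[float] = None,
) -> str:
    """Return base_url plus image, user, expiry, and signature parameters."""
    issued_at = time.time() if now is None else now
    params = {
        "image_id": image_id,
        "user_id": user_id,
        "expires": str(int(issued_at + ttl_seconds)),
    }
    # Sign a canonical (sorted) encoding so the verifier can rebuild it exactly.
    canonical = urlencode(sorted(params.items()))
    signature = hmac.new(SIGNING_KEY, canonical.encode(), hashlib.sha256).hexdigest()
    return f"{base_url}?{canonical}&signature={signature}"
```

Because the user and expiry are part of the signed payload, changing either one invalidates the signature, which is what lets us tie an upload back to whoever requested it.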
Signed Download URLs
If we didn’t have signed download URLs, anyone who knew the URL could download the image.
This may be desirable in many cases, but in some cases, such as private files, it may be highly undesirable. In these cases, we wouldn’t want the URL to be publicly available.
We can gate access behind our own endpoint, /images/get_download_url, which returns a temporary token.
Just like uploads, this provides us an opportunity to authenticate, authorize, limit, and audit user access to the images.
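On the receiving end, something has to check that a signed download URL is genuine and still valid. A minimal sketch of that check follows, assuming the signature is an HMAC over the path and an `expires` timestamp; both the scheme and the parameter names are assumptions for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Assumption: the same secret that was used when the URL was issued.
SIGNING_KEY = b"replace-with-a-real-secret"


def expected_signature(path: str, expires: str) -> str:
    """HMAC over the path and expiry; issuer and verifier must agree on this."""
    message = f"{path}\n{expires}".encode()
    return hmac.new(SIGNING_KEY, message, hashlib.sha256).hexdigest()


def is_download_allowed(url: str, now: Optional[float] = None) -> bool:
    """Accept the URL only if it has not expired and its signature matches."""
    parts = urlsplit(url)
    query = parse_qs(parts.query)
    expires = query.get("expires", [""])[0]
    signature = query.get("signature", [""])[0]
    try:
        deadline = int(expires)
    except ValueError:
        return False  # missing or malformed expiry
    if (time.time() if now is None else now) > deadline:
        return False  # the temporary URL has lapsed
    # compare_digest avoids leaking how much of the signature matched.
    expected = expected_signature(parts.path, expires)
    return hmac.compare_digest(signature.encode(), expected.encode())
```

A signature issued for one image cannot be reused for another, because the path is part of the signed message.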
Cloud File Storage
We use a separate file storage service (such as S3) to reduce the burden on our web servers.
File uploads can take a long time and can be resource-intensive. By offloading such operations to a service dedicated to this, we can ensure the rest of our system continues operating smoothly.
We also never serve the files directly from our web server, for the same reasons listed above. Instead, we deliver them from the external file store. Any request to our server returns another URL or redirects to the appropriate service.
Image Processing Job
Image processing can be highly memory and CPU intensive — it’s not something you want to perform on your application or web server, which is busy handling other requests.
We turn this into an asynchronous call that is offloaded to another service so that we can keep our primary web server unblocked and performant.
Image Metadata Record
We store the metadata of the image record in our database because we have to track it. There’s no sense uploading an image without a way to retrieve or manage it later. What use is uploading a photo intended to be used as a user avatar if we have no way to associate it with the user record in question?
It’s important to note that we do not store the actual image itself in our database. We merely store the metadata, which we later use to construct the various URLs to it.
Why don’t we store the image URL? Storing the image URL is fragile, and the link is susceptible to breaking if the URL ever changes for any reason. Storing the metadata needed to construct the URL is a lot more robust — we can easily change things like the domain name, and build the appropriate URL without having any downtime.
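A small sketch of the difference: the record below holds only the pieces needed to build a URL, and the host comes from configuration. The field names, path layout, and hosts are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageMetadata:
    """What the database row holds; note there is no URL column."""

    image_id: str
    owner_type: str  # hypothetical, e.g. "user" or "campaign"
    owner_id: str
    extension: str   # e.g. "png"


# Assumption: the storage host lives in configuration, not in the database.
STORAGE_HOST = "https://files.example.com"


def build_image_url(meta: ImageMetadata, variant: str = "original", host: str = STORAGE_HOST) -> str:
    """Construct the URL at read time from the stored metadata."""
    return f"{host}/{meta.owner_type}/{meta.owner_id}/{meta.image_id}/{variant}.{meta.extension}"


avatar = ImageMetadata(image_id="img-1", owner_type="user", owner_id="42", extension="png")
print(build_image_url(avatar))
print(build_image_url(avatar, variant="thumbnail", host="https://cdn.example.net"))
```

Moving to a new domain becomes a configuration change. No rows are rewritten, and no stored links go stale.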
Let’s examine how we would use the components in our system to actually upload the image:
- Step 1: Client requests an upload URL from the server (REQUEST)
- Step 2: Client uploads the image data to the upload URL (UPLOAD)
- Step 3: Client tells the server the upload is completed (CONFIRM)
- Step 4: Server processes image in background (PROCESS)
- Step 5: Client checks image processing status (CHECK)
- Step 6: Server is done processing image, notifies client (FINALIZE)
Step 1: Client requests an upload URL from the server (REQUEST)
Why is the first step not an image upload? Remember — we don’t want the image data to ever even touch our servers. It is far too much of a resource hog and denial-of-service vulnerability.
Instead, we ask our server to give us the URL we should upload to. This URL could point to a 3rd-party cloud storage, such as S3, or another system specifically built to handle the load of image uploading.
During this step, the server can generate a random URL that is:
- time restricted
These upload URLs are pre-signed URLs. That is, the URL the server returns carries, in its query parameters, all of the authorization required to upload to the 3rd-party service.
After performing authorization checks, the server also creates a record in the database to track this individual image upload, with data on:
- the name of the file
- the type of the file
- the URL of the file
- the status (eg.
- the associations of the image (eg.
- the kind of association (eg.
- write token (a generated token that must be provided to modify the image)
- read token (a generated token that must be provided to read the image)
- any other audit data
The server returns this data to the client.
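The tracking record could be sketched roughly as follows. The field names, the "pending" status value, the content-type allowlist, and the in-memory `store` dictionary standing in for the database are all assumptions for illustration. I also keep a storage key rather than a full URL here, in line with the earlier advice about not persisting URLs.

```python
import secrets
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict
from uuid import uuid4

# Hypothetical allowlist; reject anything else before issuing an upload URL.
ALLOWED_TYPES = {"image/png", "image/jpeg", "image/gif", "image/webp"}


@dataclass
class ImageUpload:
    filename: str
    content_type: str
    owner_type: str    # kind of association (hypothetical, e.g. "user")
    owner_id: str      # the record the image belongs to
    requested_by: str  # audit data: who asked for the upload URL
    id: str = field(default_factory=lambda: uuid4().hex)
    status: str = "pending"  # hypothetical status name
    write_token: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    read_token: str = field(default_factory=lambda: secrets.token_urlsafe(32))
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    @property
    def storage_key(self) -> str:
        """Location in the file store, derived from the record rather than stored."""
        return f"{self.owner_type}/{self.owner_id}/{self.id}"


def create_upload(store: Dict[str, ImageUpload], filename: str, content_type: str,
                  owner_type: str, owner_id: str, requested_by: str) -> ImageUpload:
    """Validate the request and persist a tracking record for it."""
    if content_type not in ALLOWED_TYPES:
        raise ValueError(f"unsupported content type: {content_type}")
    record = ImageUpload(filename, content_type, owner_type, owner_id, requested_by)
    store[record.id] = record
    return record
```

The tokens come from `secrets` rather than `random` because they act as credentials. Anyone holding the write token can modify the image.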
Step 2: Client uploads the image data to the upload URL (UPLOAD)
This is a fairly straightforward step.
The client, now armed with the Upload URL from the server, simply performs a POST request to that URL. That service then accepts that data and stores it.
Step 3: Client tells the server the upload is completed (CONFIRM)
The client makes a request to the server with the token the server returned earlier, telling the server that the upload was completed.
Step 4: Server processes image in background (PROCESS)
The server verifies the token, and then checks for that upload request.
It kicks off a job that will process the image — verifying file integrity, creating variants, and performing optimizations without blocking requests to the web server.
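The shape of that hand-off can be sketched with nothing but the standard library. A real deployment would use a proper job queue and separate worker processes, and the worker would fetch the bytes from the file store. Here the bytes are passed in directly, the status names are assumptions, and the magic-byte check is only a cheap first test, not full validation.

```python
import queue
import threading
from typing import Dict

PNG_MAGIC = b"\x89PNG\r\n\x1a\n"
JPEG_MAGIC = b"\xff\xd8\xff"

statuses: Dict[str, str] = {}        # stands in for the status column in the database
jobs: queue.Queue = queue.Queue()    # stands in for a real job queue


def looks_like_image(data: bytes) -> bool:
    """Cheap integrity check: does the file start like a PNG or a JPEG?"""
    return data.startswith(PNG_MAGIC) or data.startswith(JPEG_MAGIC)


def enqueue_processing(image_id: str, data: bytes) -> None:
    """Called by the request handler; returns immediately."""
    statuses[image_id] = "processing"
    jobs.put((image_id, data))


def worker() -> None:
    """Runs outside the request path and drains the queue."""
    while True:
        job = jobs.get()
        try:
            if job is None:  # sentinel: shut the worker down
                return
            image_id, data = job
            # Variant creation and optimization would also happen here.
            statuses[image_id] = "ready" if looks_like_image(data) else "rejected"
        finally:
            jobs.task_done()


def start_worker() -> threading.Thread:
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread
```

The request handler only ever calls `enqueue_processing`, so its cost stays constant no matter how expensive the processing turns out to be.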
Step 5: Client checks image processing status (CHECK)
Processing images takes some time, and you don’t want the client blocking a request. The client should check back occasionally to see if the processing is done.
Step 6: Server is done processing image, notifies client (FINALIZE)
Eventually, the check will pass and the server will return the image URL. Now, the client is free to use the image.
The security here is provided by the image URL. If the image is protected, the URL to access the image would point to our server, and any request by the client would have to provide a read token, which the server could use to perform authorization as needed and generate a temporary signed URL for that image.
If the image is meant to be public, the image URL could be a direct reference to the image, bypassing our server altogether.
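The decision described above fits in a few lines. In this sketch the `StoredImage` fields and the `Forbidden` exception are hypothetical, and `sign` is whatever function produces a temporary signed URL in your setup.

```python
import hmac
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class StoredImage:
    storage_url: str  # hypothetical direct location in the file store
    is_public: bool
    read_token: str


class Forbidden(Exception):
    """Raised when a protected image is requested without a valid read token."""


def resolve_image_url(
    image: StoredImage,
    presented_token: Optional[str],
    sign: Callable[[str], str],
) -> str:
    """Return a URL the client may fetch, or raise Forbidden."""
    if image.is_public:
        # Public images bypass our server entirely.
        return image.storage_url
    if presented_token is None or not hmac.compare_digest(
        presented_token.encode(), image.read_token.encode()
    ):
        raise Forbidden("missing or invalid read token")
    # Further authorization checks (ownership, roles) would go here.
    return sign(image.storage_url)
```

Domain-specific rules slot in just before the final `return`, so the storage service never needs to know anything about your users.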
Once implemented, this image uploading system can scale significantly and handle a lot of potential use cases. Adding a new kind of image would be as simple as adding a few lines of code.
The complexity of the system would be hidden behind clean interfaces that have implementations that, once built, would rarely change.
Built correctly, it could also support general file uploading!