I’m continuing with my exploration in the AWS world 🙂
For the last couple of days, I have occasionally been receiving the weird error “NoSuchUpload” when I try either to upload a part to an S3 multipart upload or to complete the upload with a given UploadId.
S3 Multipart Upload is the way you upload really big files into S3.
As of today (2019 Jan 18), the docs indicate that you can upload a file of up to 5GB with a single call to their API, but for bigger files you need to split the file into parts and upload each of them using the S3 Multipart Upload API.
Here’s how the multipart upload API works:
- Call the s3.createMultipartUpload method to indicate that you will upload a file split into parts. Each part should be between 5MB and 5GB. Only the last part may be smaller than 5MB (which is useful if you don’t know exactly how big the file you’re going to export will be). The method returns an UploadId, which you must use to add parts and to complete the multipart upload.
- (N times) Upload a part using s3.uploadPart, providing the body of the file part, the UploadId, the PartNumber, and the items you pass everywhere: the Bucket and the Key. Mind that PartNumber starts from 1 (for whatever reason). This method returns an ETag for your part, which you must store.
- Call the s3.completeMultipartUpload method to indicate that you’re done with the upload. You have to provide the UploadId, all the parts in the format { ETag, PartNumber }, and the regular Bucket and Key.
After completing these steps, a file with the given name (Key) should appear in the S3 bucket.
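The three steps can be sketched roughly as follows. This is a minimal sketch, not my production code: it assumes `s3` is an AWS SDK for JavaScript (v2) S3 client, and `multipartUpload` and `parts` (an array of Buffers, already split to valid sizes) are hypothetical names used only for illustration.

```javascript
// Sketch only: `s3` is assumed to be an AWS SDK for JavaScript v2 S3 client
// (for example `new AWS.S3()`); `parts` is a hypothetical array of Buffers.
async function multipartUpload(s3, bucket, key, parts) {
  // Step 1: start the upload and remember the UploadId.
  const { UploadId } = await s3
    .createMultipartUpload({ Bucket: bucket, Key: key })
    .promise();

  // Step 2: upload each part and store its ETag; PartNumber starts at 1.
  const uploaded = [];
  for (let i = 0; i < parts.length; i++) {
    const PartNumber = i + 1;
    const { ETag } = await s3
      .uploadPart({ Bucket: bucket, Key: key, UploadId, PartNumber, Body: parts[i] })
      .promise();
    uploaded.push({ ETag, PartNumber });
  }

  // Step 3: complete the upload with the collected { ETag, PartNumber } pairs.
  return s3
    .completeMultipartUpload({
      Bucket: bucket,
      Key: key,
      UploadId,
      MultipartUpload: { Parts: uploaded },
    })
    .promise();
}
```

The client is passed in as an argument so the function can be exercised with a fake object in a unit test, without touching a real bucket. Error handling (for example aborting the upload when a part fails) is left out of the sketch.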
So my mistake here was that my DB exporter didn’t wait for all the async work to be done before calling ‘end’ on the stream. This caused writes after the stream had ended, so a number of items were skipped.
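A minimal sketch of the fix, assuming a simplified exporter: `exportRows`, `rows`, and `formatRow` are hypothetical stand-ins for the real exporter, the DB records, and the async work done per record. The point is only that ‘end’ is called after every pending promise has settled.

```javascript
// Hypothetical exporter: `rows` stands in for DB records, `formatRow` for the
// async work done per record, and `out` is any Node.js writable stream.
async function exportRows(rows, formatRow, out) {
  // Run the async work concurrently; Promise.all keeps the results in input order.
  const lines = await Promise.all(rows.map((row) => formatRow(row)));

  // Only now, with all async work finished, write everything out.
  for (const line of lines) {
    out.write(line);
  }

  // Ending the stream here guarantees no write happens after 'end'.
  out.end();
}
```

The sketch ignores backpressure (the return value of `write`), which a real exporter handling large tables would probably need to respect.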
The funny part was that instead of receiving an error about the actual problem, I got a 404 “NoSuchUpload” from the AWS API, although I could still see the UploadId when listing the active uploads afterward.
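For reference, checking whether an UploadId is still listed can be sketched like this. It assumes `s3` is an AWS SDK for JavaScript (v2) S3 client; `isUploadListed` is a hypothetical helper name, and the sketch only looks at the first page of results.

```javascript
// Sketch only: `s3` is assumed to be an AWS SDK for JavaScript v2 S3 client.
// Returns true when the UploadId appears among the in-progress uploads for the key.
// Pagination is ignored, so only the first page of results is inspected.
async function isUploadListed(s3, bucket, key, uploadId) {
  const { Uploads = [] } = await s3
    .listMultipartUploads({ Bucket: bucket, Prefix: key })
    .promise();
  return Uploads.some((u) => u.Key === key && u.UploadId === uploadId);
}
```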
Moral of the story:
Add unit tests to your code and verify that you can write the data to a file before trying to upload it to the cloud. Also, try to provide useful error messages when designing an API.