Closed Bug 1167732 Opened 9 years ago Closed 9 years ago

Investigate S3 COPY operation timings

Categories

(Infrastructure & Operations :: RelOps: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: dustin, Assigned: dustin)

Details

I've heard tell that the COPY operation is super-slow.

ubuntu@ip-10-134-48-94:~$ s3cmd -v cp s3://mozilla-releng-slowwin-test-use1/copytest s3://mozilla-releng-test/copytest1
INFO: Applying --exclude/--include
INFO: Summary: 1 remote files to copy
File s3://mozilla-releng-slowwin-test-use1/copytest copied to s3://mozilla-releng-test/copytest1
ubuntu@ip-10-134-48-94:~$ time s3cmd -v cp s3://mozilla-releng-slowwin-test-use1/copytest s3://mozilla-releng-test/copytest1
INFO: Applying --exclude/--include
INFO: Summary: 1 remote files to copy
File s3://mozilla-releng-slowwin-test-use1/copytest copied to s3://mozilla-releng-test/copytest1
real    1m7.512s

That's a 2GB file between two buckets in use1, so about 30MB/s.  Which seems a little odd since AIUI S3 is content-addressable on the backend, so I would think an in-region copy would just involve adding a link.  I'll see if I can gather some comprehensive data.
Yikes, 10 minutes coast-to-coast.
Hm, and that failed without an error indication:

ubuntu@ip-10-134-48-94:~$ time s3cmd -v cp s3://mozilla-releng-slowwin-test-use1/copytest s3://mozilla-releng-slowwin-test-usw1/copytest
INFO: Applying --exclude/--include
INFO: Summary: 1 remote files to copy
File s3://mozilla-releng-slowwin-test-use1/copytest copied to s3://mozilla-releng-slowwin-test-usw1/copytest

real    10m8.263s
user    0m0.041s
sys     0m0.016s
ubuntu@ip-10-134-48-94:~$ s3cmd cp s3://mozilla-releng-slowwin-test-usw1/copytest s3://mozilla-releng-slowwin-test-usw2/copytest
ERROR: S3 error: 404 (NoSuchKey): The specified key does not exist.

(and indeed a check with the console shows the file is not there)
Uploading from EC2/use1 to S3/usw1 only takes 5.5 minutes.  What's AWS doing here??
ubuntu@ip-10-134-48-94:~$ time s3cmd cp s3://mozilla-releng-slowwin-test-usw1/copytest s3://mozilla-releng-slowwin-test-usw2/copytest
File s3://mozilla-releng-slowwin-test-usw1/copytest copied to s3://mozilla-releng-slowwin-test-usw2/copytest

real    34m18.935s
user    0m0.106s
sys     0m0.024s

I could easily have downloaded the file on my laptop, burned it to a CD, carried it to my desk, and uploaded it from my desktop in less time.  Wow.
Thanks for CCing me. For the Mercurial bundles, I think I'll just do dual upload instead of this copy foo. At least with upload I get pretty much guaranteed 40MB/s from SCL3.
From the support case, it looks like the suggested way to do copies is to issue many parallel ranged operations.  Which is stupid.  Download and re-upload is probably the right idea.
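For reference, here is a minimal sketch of what "many parallel ranged operations" could look like on the client side. It only computes the byte ranges; the 64MiB part size and the helper name `copy_ranges` are assumptions for illustration, not something from the support case. Each range string is in the `bytes=first-last` form that S3's multipart copy takes as its copy source range (in boto3, the `CopySourceRange` argument of `upload_part_copy`), and each part would then be issued as its own request, in parallel.

```python
def copy_ranges(total_bytes, part_bytes):
    """Split an object of total_bytes into inclusive byte ranges.

    Hypothetical helper: returns strings like "bytes=0-67108863", one per
    part, suitable for passing as the copy source range of a ranged copy.
    """
    if total_bytes <= 0 or part_bytes <= 0:
        raise ValueError("sizes must be positive")
    ranges = []
    start = 0
    while start < total_bytes:
        # Range ends are inclusive, so the last byte of a part is start+size-1.
        end = min(start + part_bytes, total_bytes) - 1
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges


if __name__ == "__main__":
    two_gb = 2 * 1024 ** 3          # size of the test file in this bug
    part = 64 * 1024 ** 2           # assumed part size, tune as needed
    parts = copy_ranges(two_gb, part)
    print(len(parts), "parts; first:", parts[0], "last:", parts[-1])
```

If the rate limit really is per request, the number of parts in flight at once is what would determine the aggregate rate, which is presumably why the parallel approach is what support suggests.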
Ugh, using the correct calculations for size, my tests over the Memorial Day weekend showed averages as follows:

 src  dst avg time   rate      
usw1 usw1 61.17s     33.481MB/s
usw1 use1 2061.59s   0.993MB/s 
usw1 usw2 2052.22s   0.998MB/s 
use1 usw1 2034.54s   1.007MB/s 
use1 use1 68.00s     30.118MB/s
use1 usw2 1910.06s   1.072MB/s 
usw2 usw1 1992.22s   1.028MB/s 
usw2 use1 2024.66s   1.012MB/s 
usw2 usw2 65.77s     31.139MB/s

Support confirmed that inter-region copies are rate-limited, and it looks like that's limited to 1MB/s.  From comment 4, 34m18.9s is about 2059s, which is pretty close to one second per MB for a 2GB = 2048MB file.  It's interesting that they build in a per-request rate limit, then write a client that issues multiple parallel requests.
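The arithmetic behind those rates is easy to re-check. Below is a small sketch, assuming the test file is exactly 2048MB; the function names are made up for illustration. It converts a `real` value as printed by `time` into seconds and divides the size by it.

```python
import re


def parse_real_time(text):
    """Convert a `time`-style duration such as '34m18.935s' into seconds."""
    m = re.fullmatch(r"(?:(\d+)m)?(\d+(?:\.\d+)?)s", text.strip())
    if m is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes = int(m.group(1) or 0)
    return minutes * 60 + float(m.group(2))


def rate_mb_per_s(size_mb, seconds):
    """Average transfer rate in MB/s."""
    return size_mb / seconds


if __name__ == "__main__":
    SIZE_MB = 2048  # assumed: 2GB test file
    # Timings quoted earlier in this bug.
    for label, real in [("use1 -> use1", "1m7.512s"),
                        ("usw1 -> usw2", "34m18.935s")]:
        secs = parse_real_time(real)
        print(f"{label}: {secs:.1f}s, {rate_mb_per_s(SIZE_MB, secs):.3f}MB/s")
```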

The 30+ MB/s for intra-region copies seems pretty reasonable and matches comment 0.

So, I think we have a complete answer.  James, you satisfied?
Flags: needinfo?(jlal)
Well, we can re-open if not :)
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(jlal)
Resolution: --- → FIXED