After running into an issue while copying thousands of files at once to EFS, I came across https://aws.amazon.com/premiumsupport/knowledge-center/efs-troubleshoot-slow-performance/
Let’s look at some benchmarks. The issue was that this job was taking 30+ seconds, which would have timed out the HTTP server (it isn’t a background job yet). The job unzips a file to a temporary directory (not on EFS), does some validation, then copies the contents to EFS. The zip in question was around 3000 files, roughly 150MB extracted.
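For context, the shape of the job is roughly the following (a sketch based on the description above; the paths and archive name are hypothetical):
# rough shape of the job (hypothetical paths, illustration only)
tmpdir=$(mktemp -d)                          # scratch space on local disk, not EFS
unzip -q /uploads/archive.zip -d "$tmpdir"   # ~3000 files, ~150MB extracted
# ...validation of the extracted files happens here...
cp -R "$tmpdir"/. /efs_dir/                  # the slow step benchmarked below
rm -rf "$tmpdir"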
# find /big_dir | wc -l
3098
# time cp -R /big_dir /efs_dir
real 0m37.957s
user 0m0.052s
sys 0m0.960s
Ouch, that’s not good. We could try rsync; maybe that will help:
# time rsync -r /big_dir /efs_dir
real 1m10.210s
user 0m0.931s
sys 0m1.744s
Even longer! Why is this? It’s because:
Metadata I/O occurs if your application performs metadata-intensive operations, such as "ls," "rm," "mkdir," "rmdir," "lookup," "getattr," or "setattr," and so on. Any operation that requires the system to fetch the address of a specific block is considered to be a metadata-intensive workload.
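One rough way to see how metadata-heavy a recursive copy is would be to count the syscalls it issues with strace (illustrative only; this wasn’t part of the benchmark). On an NFS mount like EFS, many of those calls become network round trips:
# print a per-syscall count summary for the whole copy (output omitted)
strace -f -c cp -R /big_dir /efs_dir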
rsync also checks each destination file to see whether it needs to be synced, which adds another bottleneck. So plain rsync and cp aren’t an option.
The issue is that Elastic File System is not built for serial operations, that is, copying a file, waiting, and then copying the next one. EFS must replicate every file to multiple locations, so there is a delay while it does so. There is also some overhead from NFS, since each filesystem operation is a network call. What EFS is actually designed for is parallel operations. But rsync and cp can’t run in parallel on their own, so you’ll need to manually batch up your files, or use the tool referenced in the document above: fpsync (Filesystem partitioner sync).
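For the manual-batching route, a minimal sketch (hypothetical, not benchmarked) could fan out plain cp with xargs, copying fixed-size batches of files in parallel:
# copy 100 files per cp invocation, 10 invocations at a time,
# preserving the directory layout (empty directories are skipped)
cd /big_dir
find . -type f -print0 | xargs -0 -n 100 -P 10 cp --parents -t /efs_dir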
What fpsync can do is split a directory of files into chunks and then send those chunks in parallel via rsync. This is also possible with GNU Parallel, but you’d have to write your own script. fpsync was available on CentOS, and probably many other distributions. Let’s run it out of the box:
# time fpsync /big_dir /efs_dir
real 0m59.790s
user 0m1.975s
sys 0m3.925s
Not much of an improvement…but why? Because fpsync doesn’t run in parallel by default, and you have to tweak it a bit. Let’s process 100 files at a time using 10 concurrent runners:
# time fpsync -f 100 -n 10 -v /big_dir /efs_dir
1662569967 Info: Run ID: 1662569967-43986
1662569967 ===> Analyzing filesystem...
1662569968 <=== Fpart crawling finished
1662569980 <=== Parts done: 29/29 (100%), remaining: 0
1662569980 <=== Time elapsed: 13s, remaining: ~0s (~0s/job)
1662569980 <=== Fpsync completed without error in 13s.
real 0m13.467s
user 0m2.086s
sys 0m4.400s
Much better! But let’s try more concurrent runners. Since we had 3000 files, there would have been a queue in our last command (100 files × 10 runners = 1000). So let’s run 50 concurrent runners with up to 50 files each:
# time fpsync -f 50 -n 50 -v /big_dir /efs_dir
1662570120 Info: Run ID: 1662570120-51913
1662570120 ===> Analyzing filesystem...
1662570122 <=== Fpart crawling finished
1662570129 <=== Parts done: 58/58 (100%), remaining: 0
1662570129 <=== Time elapsed: 9s, remaining: ~0s (~0s/job)
1662570129 <=== Fpsync completed without error in 9s.
real 0m8.903s
user 0m2.093s
sys 0m4.868s
So, the more concurrent copy operations we can run, the better. On a regular disk this wouldn’t have any effect, since the filesystem operations are negligible and your only bottleneck is the disk speed; it might even slow things down. There may be some other options inside of fpsync that would speed it up even more. What about rsync --inplace? This eliminates a step rsync would usually take, which is to create a new temporary file and then rename it over the destination.
# time fpsync -o "--inplace" -f 100 -n 50 -v /big_dir /efs_dir
1662584365 Info: Run ID: 1662584365-126224
1662584365 ===> Analyzing filesystem...
1662584367 <=== Fpart crawling finished
1662584371 <=== Parts done: 29/29 (100%), remaining: 0
1662584371 <=== Time elapsed: 6s, remaining: ~0s (~0s/job)
1662584371 <=== Fpsync completed without error in 6s.
real 0m5.872s
user 0m1.634s
sys 0m3.438s
Running batches of 100 brought it down to under 6s. After that it started to get slower. Running a huge number of rsyncs with small batches also got slower, which is likely due to load on the system itself; after all, it’s running 250+ instances of rsync.