What’s with cp --reflink: failed to clone: Invalid argument?

Most modern copy-on-write file systems, such as Btrfs and XFS, support file cloning. (OpenZFS was long the notable exception; it gained block cloning in version 2.2.) However, the tools that expose this space-saving feature can be difficult to use. Here’s an example of how the simple copy (cp) command on Linux can make it hard to understand what’s going on.

The following commands create a file and a directory, disable copy-on-write on the directory, and then attempt to clone the file into the directory. They come from the GNU coreutils and e2fsprogs packages, and assume you’re working on a file system capable of file cloning.

touch test.file
mkdir test.dir/
chattr +C test.dir/
cp --reflink test.file test.dir/

The file cloning operation will fail and output the following unenlightening error message:

/usr/bin/cp: failed to clone 'test.dir/test.file' from 'test.file': Invalid argument

What argument is invalid? What makes it invalid? The error message isn’t actionable. In this case, we already know the cause because we set up the test ourselves: we disabled copy-on-write on the target directory. However, you wouldn’t have this knowledge if you were working on an unfamiliar system, or had simply forgotten that you’d disabled copy-on-write on your ~/Downloads directory.

Where do you even start to troubleshoot something like this? It took me some time to work out how best to approach this type of problem. Running the copy command under strace reveals where things go wrong:

ioctl(4, FICLONE, 3) = -1 EINVAL (Invalid argument)
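To capture a trace like the one above yourself, something along these lines should work. This is only a sketch: failed_clones is a hypothetical helper name, and the exact text strace prints for the request can vary between strace versions, so the pattern is deliberately loose.

```shell
#!/bin/sh
# Hypothetical helper: read strace output on stdin and keep only the
# ioctl calls that mention FICLONE (or FICLONERANGE) and returned -1.
failed_clones() {
    grep -E 'ioctl\(.*FICLONE.*= -1 '
}

# Example usage (strace writes its trace to stderr, hence 2>&1):
#   strace -f -e trace=ioctl cp --reflink test.file test.dir/ 2>&1 | failed_clones
```

Limiting the trace to ioctl calls keeps the output short enough that the failing clone request is easy to spot.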

The copy command simply bubbles this error up to the end-user. From here, you could refer to the manual page for ioctl_ficlone(2) and look up the EINVAL error:

“The filesystem does not support reflinking the ranges of the given files. This error can also appear if either file descriptor represents a device, FIFO, or socket. Disk filesystems generally require the offset and length arguments to be aligned to the fundamental block size. XFS and Btrfs do not support overlapping reflink ranges in the same file.”

The information isn’t all that helpful, though. We are indeed on a file system that supports file cloning (“reflinking”), and none of the other causes matches the situation at hand. You can double- and triple-check that your expectations are correct: your file system does support cloning, your kernel version is up to date, and no, you haven’t got any weird symbolic links crossing file system boundaries. You’d still be no closer to working out what the problem was.

There isn’t a plot twist at the end of this story. I wasn’t struck by inspiration, nor did I find a clever way to identify the root cause of the problem. It took me about six weeks before I remembered that I had disabled copy-on-write on my ~/Downloads directory. Doing so had done more than disable file checksumming and copy-on-write updates, and potentially reduce file fragmentation. It had also turned off the ability to clone files to and from that directory. To me, this was an unexpected side effect, but it makes sense in retrospect.

If you’re in the same situation, you can check if copy-on-write has been disabled on a file or directory using the lsattr command (part of the e2fsprogs package). Look for the “C” (uppercase) flag. You can remove the flag using chattr -C.
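The check described above could be scripted roughly as follows. This is a minimal sketch, assuming lsattr prints the flags as the first space-separated field of each line; flags_have_nocow and check_nocow are hypothetical helper names, not part of e2fsprogs.

```shell
#!/bin/sh
# Succeed if the flags field (the first space-separated field) of one
# line of lsattr output contains the uppercase "C" (No_COW) flag.
flags_have_nocow() {
    case "${1%% *}" in
        *C*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Report whether the C flag is set on a single file or directory.
# The -d option makes lsattr describe a directory itself, not its contents.
check_nocow() {
    line=$(lsattr -d -- "$1" 2>/dev/null)
    if [ -z "$line" ]; then
        echo "$1: could not read attributes"
        return 2
    fi
    if flags_have_nocow "$line"; then
        echo "$1: C flag set (copy-on-write disabled)"
    else
        echo "$1: C flag not set"
    fi
}

# Example usage: check both ends of the failed copy.
#   check_nocow test.file
#   check_nocow test.dir
```

Checking both the source and the destination is worthwhile, since the flag on either side may be the one you have forgotten about.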

From the perspective of the underlying system call, it has indeed been provided with an invalid argument. However, the error is completely meaningless from the perspective of an end-user of the copy command. The copy command should have detected the system call error and displayed a more contextually appropriate error message.
