The One True Way™ to handle temporary files
Problem statement
Re-generating file contents and atomically replacing the old version is a relatively straightforward task. And yet, I’ve seen multiple terrible ways to do it [1]:
# The "I don't give a flying hoot" way
$ generator.sh > final_file
# The "I care somewhat but am misunderstanding things" way
$ generator.sh > tmpfile
$ cat tmpfile > final_file
# The "close, but still no cigar" way
$ generator.sh > tmpfile
$ mv -f tmpfile final_file
# The "use mktemp, it'll be great" way
$ F=$(mktemp)
$ generator.sh > "$F"
$ mv -f "$F" finalfile
All of the ways above are subtly wrong.
Let me show you the One True Way™ [2]. Skip to the solution if you want the tl;dr.
What’s wrong with…
… writing directly to the final file?
Let’s suppose you write directly to the target file and it works just fine. The following would be typical examples:
# One shot
$ generator.sh > final_file
# Multiple shot
$ echo "<html>" > final_file
$ echo "<body><pre>$(date)</pre></body>" >> final_file
$ echo "</html>" >> final_file
The trouble with both is that if someone accesses the file contents at an inopportune time, they either get a partial file, or even an empty one.
Which is abundantly clear as soon as you strace what’s going on under the hood:
$ strace -ttt /bin/sh -c "echo foobar > abc"
[...]
1709298831.349822 openat(AT_FDCWD, "abc", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
1709298831.349886 fcntl(1, F_DUPFD, 10) = 10
1709298831.349911 close(1) = 0
1709298831.349935 fcntl(10, F_SETFD, FD_CLOEXEC) = 0
1709298831.349959 dup2(3, 1) = 1
1709298831.349982 close(3) = 0
1709298831.350009 write(1, "foobar\n", 7) = 7
1709298831.350045 dup2(10, 1) = 1
1709298831.350110 close(10) = 0
1709298831.350137 exit_group(0) = ?
1709298831.350214 +++ exited with 0 +++
First, bash opens the file (openat(... O_TRUNC, ...)) in a mode that truncates it, only to write the contents a few microseconds later (the orchestration in between is needed to redirect the outputs etc.).
And it could get even worse if the output wasn’t generated with a single atomic write, or the command took some sweet time coming up with the output:
$ strace -f -e openat,write -ttt /bin/sh -c "ruby -e 'puts :ohboy' > abc" 2>&1 \
| grep -e abc -e ohboy
1709299267.244206 openat(AT_FDCWD, "abc", O_WRONLY|O_CREAT|O_TRUNC, 0666) = 3
[pid 15201] 1709299267.364196 write(1, "ohboy\n", 6) = 6
That’s now 120ms of having the file empty, in a trivial example.
Long story short: This method is dumb, because it leaves the final file empty or partial for a stretch of time.
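You can reproduce this with a throwaway sketch (the sleep/echo pair below merely stands in for a slow generator; final_file is the example target from above):

# Terminal 1: a "generator" that needs a second to come up with its output
$ { sleep 1; echo "the real content"; } > final_file

# Terminal 2, run within that second: the reader gets an empty file
$ cat final_file
$

The redirection truncates final_file before the command group even starts running, so any reader during that second sees nothing.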
… writing to a static temp file and then cat-ing the output to the final one?
Let’s suppose a well meaning person switches to a more “advanced” way:
$ generator.sh > tmpfile
$ cat tmpfile > final_file
Security concerns aside [3], this is not much better than the previous step.
Yes, you are less likely to get a partial result on the final cat, but the point about partial (or empty) files still stands.
And it gets worse.
Unless you’re guaranteed to never run the generator twice at the ~same time, you might end up with a garbled file. The probability is low, but present.
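For the skeptical, here’s a contrived sketch of that failure mode (the echo/sleep pairs below just imitate two generator runs that happen to overlap):

# Two overlapping runs clobbering the same static temp file
$ { echo "run A, part 1"; sleep 1; echo "run A, part 2"; } > tmpfile &
$ { sleep 0.5; echo "run B"; } > tmpfile &
$ wait
$ cat tmpfile   # run B, a NUL-filled gap, then the tail of run A

Whichever run finishes last then happily publishes that mess as the final file.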
… writing to a static tempfile and then renaming it?
In other words:
$ generator.sh > tmpfile
$ mv -f tmpfile final_file
Here I would – again – invoke the security concerns and race conditions from the previous example.
But otherwise – oh so very very close, and still wrong.
… using mktemp and $TMPDIR?
The mktemp way that you can find in plenty of articles on the web:
$ F=$(mktemp)
$ generator.sh > "$F"
$ mv -f "$F" finalfile
has a lot going for it. It even looks like the right approach. But it’s a “trap for young players” in this case, I’d say.
The reason for that is… the dreaded EXDEV [4] syscall error. You might luck out and have $TMPDIR on the same filesystem as the target file, but if you don’t, you might get the same partial-file issue as before.
Check this out (the strace output is slightly post-edited for clarity):
$ strace -f /bin/sh -c \
'F=$(mktemp); echo foobar > $F; mv $F /run/user/$(id -u)/abc' 2>&1 | \
sed -r 's/\[pid [0-9]+\] //'
[...]
renameat2(AT_FDCWD, "/tmp/tmp.e5FbJKLu3V", AT_FDCWD, "/run/user/1000/abc",
RENAME_NOREPLACE) = -1 EXDEV (Invalid cross-device link)
openat(AT_FDCWD, "/run/user/1000/abc",
O_RDONLY|O_PATH|O_DIRECTORY) = -1 ENOENT (No such file or directory)
newfstatat(AT_FDCWD, "/tmp/tmp.e5FbJKLu3V",
{st_mode=S_IFREG|0600, st_size=7, ...}, AT_SYMLINK_NOFOLLOW) = 0
newfstatat(AT_FDCWD, "/run/user/1000/abc", 0x7ffcb4bf6570,
AT_SYMLINK_NOFOLLOW) = -1 ENOENT (No such file or directory)
unlinkat(AT_FDCWD, "/run/user/1000/abc", 0) = -1 ENOENT
openat(AT_FDCWD, "/tmp/tmp.e5FbJKLu3V", O_RDONLY|O_NOFOLLOW) = 3
newfstatat(3, "", {st_mode=S_IFREG|0600, st_size=7, ...}, AT_EMPTY_PATH) = 0
openat(AT_FDCWD, "/run/user/1000/abc", O_WRONLY|O_CREAT|O_EXCL, 0600) = 4
ioctl(4, BTRFS_IOC_CLONE or FICLONE, 3) = -1 EXDEV
newfstatat(4, "", {st_mode=S_IFREG|0600, st_size=0, ...}, AT_EMPTY_PATH) = 0
fadvise64(3, 0, 0, POSIX_FADV_SEQUENTIAL) = 0
copy_file_range(3, NULL, 4, NULL, 9223372035781033984, 0) = -1 EXDEV
[...]
read(3, "foobar\n", 131072) = 7
write(4, "foobar\n", 7) = 7
read(3, "", 131072) = 0
[...]
close(4) = 0
close(3) = 0
[...]
First mv tries renaming using renameat2(), then renameat(), and when both fail, it copies the file contents over [5].
So yes, this will work for you well… until it sometimes doesn’t.
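If you’re wondering whether your setup is affected, comparing device numbers is a quick sanity check (a sketch using GNU stat; the /run/user path is just the example target from above):

$ stat -c '%d' "${TMPDIR:-/tmp}" "/run/user/$(id -u)"
# Two different numbers mean a rename() between them will fail with EXDEV,
# and mv will silently fall back to copying.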
And that brings me to the One True Way™…
Solution
The one true way of handling temporary files [6] is as follows:
#!/bin/bash
# Final file
FINAL=/path/to/final/file.html
# Temp file
TEMPF="$(mktemp "$FINAL.tmp.XXXXXX")"
# Clean up after yourself (should things go south), will ya?
trap "rm -f -- '$TEMPF'" EXIT
generator.sh > "$TEMPF"
mv -f "$TEMPF" "$FINAL"
This way (the rsync way) all the important operations are atomic [7], and the temp file name isn’t static, so buh-bye partial (or empty) files and/or partial overwrites [8].
The important distinction from the previous F=$(mktemp) approach is that by using the final path with a unique suffix for the temp file, you are guaranteed to be on the same filesystem, so no more EXDEV. In other words, rename() can always do its job of atomically replacing the target file with the newly generated one.
Obviously it comes with the downside of needing to clean up if things go south (hence the trap part), but that’s a small price to pay, I’d say [9].
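If you find yourself doing this a lot, the pattern wraps up nicely into a small wrapper script. Here’s a sketch (the name atomic-install.sh and its interface are made up, not part of any standard tooling):

#!/bin/bash
# Usage: atomic-install.sh /path/to/final/file.html generator.sh [args...]
set -euo pipefail

FINAL=$1; shift

# Temp file lives next to the final file => same filesystem, so mv is a pure rename()
TEMPF="$(mktemp "$FINAL.tmp.XXXXXX")"

# Don't leave the temp file lying around if anything below fails
trap 'rm -f -- "$TEMPF"' EXIT

# Run the generator (the remaining arguments), writing into the temp file
"$@" > "$TEMPF"

# Atomically replace the old version
mv -f -- "$TEMPF" "$FINAL"

Same ingredients as the script above, just with set -euo pipefail so a failing generator aborts before the mv and the trap removes the half-written temp file.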
Closing words
You might be thinking — why all this fuss? I don’t have a webserver with hundreds of concurrent hits. And my scripts (cronjobs) will never™ overlap.
Why should I bother?
How you do anything is how you do everything
That’s why.
Life’s too short to let bad habits creep into your muscle memory. Because you never know when the truly critical case comes along [10].
PS: Also, did I get something wrong? Write me an email… I’m happy to get feedback.
1. Especially in the situation when that particular file is frequently accessed… say, by a webserver with hundreds+ of hits per second.
2. Spoiler alert: rsync does it right. Be like rsync.
3. IMO it’s a bad habit to blindly write into a statically named file from a script, unless the entire directory tree and the file itself is guaranteed to be under your control. But I declare this out of scope.
4. Invalid cross-device link, that is, the two files thrown to rename() or renameat() do not reside on the same filesystem.
5. First trying the ioctl() clone and copy_file_range fanciness, and then resorting to farmer style.
6. Assuming Linux, bash, GNU coreutils, that kind of jazz.
7. Assuming you’re also properly handling errors during generation.
8. Run it through strace if you don’t believe me. ;)
9. Obviously this way you can – worst case – get a stray temp file if the script gets murdered. Trade-offs, eh? So maybe the real lesson should be to collocate your final output and temp directory on the same filesystem, if you’re truly fussy about this event. Think Maildir? ;)
10. I’ll get off the pulpit now.