People die when they are killed; files are lost when they are deleted.
- if you rm something, it’s fucking GONE
- too sufficient permissions
- R.I.P. the old vnil.de // Chinese version
- git tragedies - let’s build a railway system
- SSH then what?
if you rm something, it’s fucking GONE
Yeah, I thought I had committed my changes to git, but I hadn't. I was trying to
delete some LaTeX build relics and I was too tired to think straight. I typed
rm paper.*
and I forgot my document source was named paper.tex. Now my 2 hours of work are
gone.
lesson learned: quit fucking relying on your hand-typed wildcards; is it too much to write a clean target in your Makefile?
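For the idea above, a minimal sketch (hypothetical file names): rehearse the cleanup in a scratch directory, deleting build relics by explicit name instead of a wildcard like `rm paper.*` that also matches the source.

```shell
# Sketch: delete LaTeX build relics by explicit name, never by wildcard.
cd "$(mktemp -d)"
touch paper.tex paper.aux paper.log paper.out   # fake a LaTeX build tree
rm -f paper.aux paper.log paper.out             # explicit list only
ls                                              # paper.tex survives
```

The same explicit list is what belongs in a Makefile `clean` rule, so you never improvise it at 2 a.m. again.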
too sufficient permissions
I had a system rootfs image, mounted onto the host FS via a loop device, and I
was trying to dump some files into it. Because it was mounted as root, I had to
use
sudo
to copy the files, but I got tired and committed a reckless
chmod -R 777 mount_point
And there is no fucking way to revert it, at least not with ext4. When I booted
into that system, the root login rejected me. Having insufficient permissions is
a pain, but having too-sufficient permissions will fuck you pretty bad.
lesson learned: sudo exists for a reason. Ride with care, ride with minimal privilege.
R.I.P. the old vnil.de // Chinese version
Most of you don't know this: in the beginning, vnil.de was a web forum instead of a Pleroma instance. Why is it gone? Why is it fucking gone?
Basically I had a carelessly written crontab script that backed up the production database daily: it keeps only the newest 3 copies and deletes the older ones. But instead of sorting the backup files by creation date, it simply deleted "all files older than 3 days". The script looked something like this:
/usr/bin/mongodump -u user -p pwd --authenticationDatabase=forum -d forum
tar -zcvf dump/${backup_date}.tar.gz dump/forum --remove-files
find dump/ -mtime +3 | xargs rm -f
Do you see where this is going? There was an error in the dumping process: the script failed to create the dumps, but loyally deleted the "old" backups. There were many layers of self-inflicted misery, but tl;dr: I somehow nuked the production database and ended up with no working backup.
Lesson learned: don’t be overconfident in your backups
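A sketch of a safer rotation than the age-based `find -mtime` above: keep the newest 3 backups by count, so a broken dump leaves the old copies untouched instead of erasing them. The fake `.tar.gz` files below stand in for real dumps.

```shell
# Sketch: rotate backups by count, not by age. If the dump step fails
# (and the script aborts via set -e), nothing gets deleted.
set -e
cd "$(mktemp -d)" && mkdir dump
for d in 01 02 03 04 05; do
    f="dump/2023-01-${d}.tar.gz"
    touch "$f"
    touch -d "2023-01-${d}" "$f"        # give each backup its own mtime
done
# delete everything except the 3 newest archives
ls -t dump/*.tar.gz | tail -n +4 | xargs -r rm -f
ls dump                                 # only the 3 newest remain
```

`set -e` is doing half the work here: in the original script, the deletion ran no matter how badly mongodump had failed.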
git tragedies - let’s build a railway system
I wrote about this in more detail {here}. Honestly, you can't really fuck up
git, since you have the reflog. But once your commits are merged into master, so
are the footguns.
lesson learned: learn how to rebase before learning how to push
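The reflog claim above, demonstrated in a throwaway repo: even after a hard reset "loses" a commit from the branch, the reflog still points at it.

```shell
# Sketch: recover a commit after a bad `git reset --hard` via the reflog.
set -e
cd "$(mktemp -d)" && git init -q repo && cd repo
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "first"
git -c user.email=a@b -c user.name=a commit -q --allow-empty -m "precious"
git reset -q --hard HEAD~1      # oops: "precious" is gone from the branch
git reflog | head -n 3          # ...but the reflog still has it
git reset -q --hard 'HEAD@{1}'  # jump back to the pre-reset state
git log -1 --format=%s          # prints: precious
```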
SSH then what?
NEVER do anything important directly from an SSH prompt. Imagine:
$ ssh admin@my.production.server
(admin@prod) $ sudo <heavy batch database operation>
(admin@prod) $ [waiting for hours]
(admin@prod) $ [...]
(admin@prod) $ [...]
# and aha, your network is down for 10 seconds.
Now your remote shell is killed, along with the processes it forked and
exec'd, and your important operation is interrupted in an undefined state. Okay,
this is a multifaceted problem, but the most obvious takeaway is this: don't
fucking attach any non-trivial task to your ssh shell!
If you don't know better, you can use
nohup <your program> &
or
setsid -f <your program>
If you want some sanity left, learn tmux or screen (well, you have to know one
of them, or … or I have great sympathy for your suffering)
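A runnable sketch of the nohup route, with `sleep 2; echo done` standing in for the heavy batch operation: detach the job from the terminal so a dropped SSH connection (and its SIGHUP) cannot kill it.

```shell
# Sketch: detach a long job from the terminal; a dropped connection
# no longer interrupts it.
cd "$(mktemp -d)"
nohup sh -c 'sleep 2; echo done' > job.log 2>&1 &
pid=$!
disown 2>/dev/null || true   # bash-only: drop it from the job table too
# (in real life you log out here and come back later)
sleep 3
grep done job.log            # the job finished without us
```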
IF YOU REALLY WANT TO KEEP YOUR SANITY: work out your routines/scripts/programs
to properly handle OS interruptions like SIGHUP or SIGINT, and make sure you
always have backups. (It's another thing for your transactions to be atomic and
fault-tolerant, but I'm not commenting since I'm no expert.)
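What "properly handle SIGHUP" can look like, as a sketch: trap the signal and write out a known state before exiting, instead of dying mid-operation. The "checkpoint" file here is a hypothetical stand-in for real cleanup or transaction rollback.

```shell
# Sketch: a worker that catches SIGHUP and leaves a known state behind.
cd "$(mktemp -d)"
sh -c 'trap "echo checkpoint-saved > state; exit 0" HUP
       sleep 10 & wait' &
pid=$!
sleep 1                        # give the worker time to install its trap
kill -HUP "$pid"               # simulate the SSH connection dropping
wait "$pid" 2>/dev/null || true
cat state                      # the cleanup ran before exit
```

The `sleep 10 & wait` shape matters: `wait` is a shell builtin, so the trap fires as soon as the signal arrives instead of after the sleep finishes.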