Showing posts with label cron. Show all posts

11 March 2014

563. High disk i/o caused by find/sort <- updatedb

High disk I/O, leading to system slowdown, has been bothering me a lot recently. Most of the time I've simply blamed it on ECCE, and while the situation gets better when ECCE isn't running, it's still occasionally very bad.

Diagnosis

iotop shows
 Total DISK READ:       3.48 M/s | Total DISK WRITE:    1193.67 K/s
  TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND                                                                                                                                                                     
25565 be/4 root        3.46 M/s    0.00 B/s  0.00 % 76.92 % find / ( -fstype nfs -o -fstype NFS -o -fstype proc -o -fstype afs -o -fstype smbfs -o -~$\)\|\(^/var/tmp$\)\|\(^/afs$\)\|\(^/amd$\)\|\(^/sfs$\)\|\(^/proc$\) ) -prune -o -print0
ps aux|grep 2556[0-9]
root     25562  0.0  0.0  18620   336 ?        S    12:33   0:00 /bin/sh /usr/bin/updatedb

root     25563 26.2  0.1  25996 12400 ?        S    12:33   1:51 /usr/bin/sort -z -f
root     25564  0.0  0.0   4216   116 ?        S    12:33   0:00 /usr/lib/locate/frcode -0
root     25565 24.2  0.0  19024   956 ?        R    12:33   1:09 /usr/bin/find / ( -fstype nfs -o -fstype NFS -o -fstype proc -o -fstype afs -o -fstype smbfs -o -fstype autofs -o -fstype iso9660 -o -fstype ncpfs -o -fstype coda -o -fstype devpts -o -fstype ftpfs -o -fstype devfs -o -fstype mfs -o -fstype sysfs -o -fstype shfs -o -type d -regex \(^/tmp$\)\|\(^/usr/tmp$\)\|\(^/var/tmp$\)\|\(^/afs$\)\|\(^/amd$\)\|\(^/sfs$\)\|\(^/proc$\) ) -prune -o -print0
Heading deeper down the rabbit hole:
me@beryllium:~$ ps -p 25565 -o ppid=
25562
me@beryllium:~$ ps -p 25562 -o ppid=
25554
me@beryllium:~$ ps -p 25554 -o ppid=
25553
me@beryllium:~$ ps -p 25553 -o ppid=
25552
me@beryllium:~$ ps -p 25552 -o ppid=
 4315
me@beryllium:~$ ps -p 4315 -o ppid=
    1
me@beryllium:~$ ps aux|grep 4315
root      4315  0.0  0.0  26124   428 ?        Ss   Mar07   0:05 /usr/sbin/cron
me@beryllium:~$ ps aux|grep 25552
root     25552  0.0  0.0  64068   844 ?        S    12:33   0:00 /USR/SBIN/CRON
me@beryllium:~$ ps aux|grep 25554
root     25554  0.0  0.0  18620   588 ?        S    12:33   0:00 /bin/sh /usr/bin/updatedb

So the find process (25565), which is what's bogging down the computer, was started by updatedb (25562, itself a child of another updatedb shell, 25554), and updatedb in turn was started as a cron job (25552, spawned by the cron daemon, 4315). updatedb is run in order to update the locate database, and locate is a powerful file search function -- whereas find searches on the fly, locate consults a database.
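
The repeated ps -p ... -o ppid= calls above can be rolled into a small loop. This is only a sketch, and the function name ancestry is made up for illustration; it prints each process from the given PID upwards until it reaches PID 1 or a process that ps can't find:

```shell
# Print PID and command line for a process and each of its ancestors.
# "ancestry" is a hypothetical helper name, not part of any package.
ancestry() {
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
        # pid= and args= give the columns empty headers, so no header line is printed
        ps -p "$pid" -o pid= -o args= || break
        if [ "$pid" -eq 1 ]; then break; fi
        # Look up the parent PID and strip the padding ps adds
        pid=$(ps -p "$pid" -o ppid= | tr -d ' ')
    done
}
```

Running ancestry 25565 while the find process is still alive would list find, both updatedb shells, the intermediate cron children and finally the cron daemon and init in one go.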

At this point it's probably a good idea to mention that I have a 4 TB system, plus four mounted NFS folders with many GB of content.

Either way, the only thing that remains is to identify which cron job is launching updatedb:

me@beryllium:~$ egrep "updatedb" /etc/cron.*/*
/etc/cron.daily/locate:# Please consult updatedb(1) and /usr/share/doc/locate/README.Debian
/etc/cron.daily/locate:[ -e /usr/bin/updatedb.findutils ] || exit 0
/etc/cron.daily/locate:# filesystems which are pruned from updatedb database
/etc/cron.daily/locate:# paths which are pruned from updatedb database
/etc/cron.daily/locate:if [ -r /etc/updatedb.findutils.cron.local ] ; then
/etc/cron.daily/locate: . /etc/updatedb.findutils.cron.local
/etc/cron.daily/locate:  cd / && nice -n ${NICE:-10} updatedb.findutils 2>/dev/null


Solution:
locate is a powerful command which I use frequently, but I'd be happy to change the frequency of updatedb to once per week instead of once per day, especially if running it takes hours.

sudo mv /etc/cron.daily/locate /etc/cron.weekly/locate

We can also work on excluding paths.
me@beryllium:~$ cat /etc/cron.weekly/locate |grep PRUNE
PRUNEFS="NFS nfs nfs4 afs binfmt_misc proc smbfs autofs iso9660 ncpfs coda devpts ftpfs devfs mfs shfs sysfs cifs lustre_lite tmpfs usbfs udf ocfs2"
PRUNEPATHS="/tmp /usr/tmp /var/tmp /afs /amd /alex /var/spool /sfs /media /var/lib/schroot/mount"
export FINDOPTIONS PRUNEFS PRUNEPATHS NETPATHS LOCALUSER

So my NFS folders are already excluded through PRUNEFS, but it might be worth throwing more paths into PRUNEPATHS. In my case I'm quite happy with a full run every week.
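
The egrep output further up shows that the cron script sources /etc/updatedb.findutils.cron.local if that file is readable, so extra paths could go there instead of being edited into the script itself. A minimal sketch, assuming the local file is sourced after the defaults have been set (check this in your own copy of the script) and using two made-up paths:

```shell
# /etc/updatedb.findutils.cron.local
# Sourced by the locate cron script if readable.
# Assumption: PRUNEPATHS already holds the defaults when this file is read.
# /srv/scratch and /home/me/Dropbox are hypothetical examples.
PRUNEPATHS="${PRUNEPATHS:-} /srv/scratch /home/me/Dropbox"
```

Keeping the change in a separate file should also mean it isn't lost if the cron script itself gets replaced.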

Update: I also discovered that I'd put an updatedb job manually in /etc/crontab which was run once every three hours. The cron.daily script was run at 6 am, and so was unlikely to cause slowdown during times when I'm actually at work. Instead it was the script I'd set up myself that was the culprit.
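
The egrep earlier only looked at /etc/cron.*/*, and that glob doesn't match /etc/crontab, which is why the extra job went unnoticed. A sketch of a broader search follows; the function name search_cron is made up, and the default list of locations assumes a Debian-style layout:

```shell
# Search the usual system-wide cron locations for a pattern.
# "search_cron" is a hypothetical helper; pass paths after the pattern to override the defaults.
search_cron() {
    pattern=$1
    shift
    if [ $# -eq 0 ]; then
        # Assumed default locations on a Debian-style system
        set -- /etc/crontab /etc/cron.d /etc/cron.hourly /etc/cron.daily \
               /etc/cron.weekly /etc/cron.monthly
    fi
    # -r recurse, -H always print the file name, -n line numbers, -s skip missing files quietly
    grep -rHns -- "$pattern" "$@"
}

# Usage:
#   search_cron updatedb
#   sudo crontab -l | grep updatedb    # root's own crontab is stored separately
```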

04 September 2013

509. Very briefly: Send remote commands via the dropbox folder

This is probably fairly obvious to most people.

I've got a reverse ssh tunnel set up so that I can access my work computer from home. However, for the past few days I've had the connection get stuffed up on a regular basis (it doesn't get dropped, but the connection gets refused), and it frustrates me a little bit.

While a proper ssh connection is unbeatable, I would at least be able to copy files back and forth via dropbox if I only had a way of sending commands to my work computer.

And an obvious way of doing that would be to use a cronjob and a tiny bit of bash scripting. So here we go:
While we don't have to (we could just have an empty script file instead) I like the idea of testing for the presence of a specific file in the Dropbox folder, and if it exists, execute it.

Let's call the file that tests for it runremote.sh, and put it in our home folder (~/). I personally suspect that making sure that execution output and error messages get properly logged is a good thing if you're going to fly blind like this, hence the 1>> and 2>> redirections, which append to the log files rather than overwrite them.

runremote.sh
if [ -e ~/Dropbox/runme.sh ]; then sh ~/Dropbox/runme.sh 1>> ~/Dropbox/runme.log 2>> ~/Dropbox/runme.error & fi

Then when you want something executed, put a file called runme.sh in ~/Dropbox:
pwd
echo 'Is it working?'
cp ~/testfile.text ~/Dropbox
date
Note that any command in runme.sh is going to be run in the ~/ folder -- not in ~/Dropbox.

And set the runremote.sh file to be executed e.g. every five minutes through cron:

crontab -e
*/5 * * * * sh ~/runremote.sh

Again, you don't need to have it test for the presence of a file, but I just instinctively like the idea.

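Anyway, any command you put in ~/Dropbox/runme.sh should be executed and logged within five minutes of being synced. Bear in mind that it gets executed again every five minutes for as long as the file stays in the folder, so remove it once it has done its job.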
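
A run-once variant of runremote.sh can be sketched as follows: it renames runme.sh before executing it, so the next cron run finds nothing to do. The function name run_once and the timestamped .done file name are made up for illustration, and unlike the one-liner this version runs the script in the foreground:

```shell
#!/bin/sh
# runremote.sh, run-once variant (a sketch, not a hardened implementation).
# "run_once" and the runme.<timestamp>.done naming are hypothetical choices.
run_once() {
    dir=$1
    script="$dir/runme.sh"
    # Nothing to do if no script has been dropped into the folder
    if [ ! -e "$script" ]; then return 0; fi
    # Rename first so a later cron run doesn't pick the same file up again
    job="$dir/runme.$(date +%Y%m%d-%H%M%S).done"
    mv "$script" "$job" || return 1
    # Append output and errors to the same log files as before
    sh "$job" >> "$dir/runme.log" 2>> "$dir/runme.error"
}

# Default to the Dropbox folder used in the post
run_once "${1:-$HOME/Dropbox}"
```

The crontab entry stays the same. The renamed copies also leave a record in the Dropbox folder of what was run and when.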

You CAN use sudo as well by providing your password in the script file (echo mypassword | sudo -S ls /root), but this is obviously not terribly safe, since the password ends up in plain text in a synced folder.