Notes about open source software, computers, other stuff.

Tag: sysadmin (Page 4 of 5)

Converting from bzr to git

I’m in the process of moving several of my projects that used Bazaar (bzr) for revision control to Git. Converting a repository from bzr to git is very easy using the fastimport package. On a Debian-based distribution, run the following command to install it (don’t be fooled by the name: the package also provides the fast-export functionality):

sudo aptitude install bzr-fastimport

Then go into the directory that contains your bzr repo and run:

git init
bzr fast-export `pwd` | git fast-import 

You can now check a few things, e.g. running git log to see whether the change log was imported correctly. This is also the moment to move the content of your .bzrignore file to a .gitignore file.

If all is well, let’s clean up:

rm -r .bzr 
git reset HEAD

Thanks to Ron DuPlain for his post here, from which I got most of this info.


Solving “RTNETLINK answers: File exists” when running ifup

On a server with multiple network cards I tried to configure the eth3 interface by editing /etc/network/interfaces (this was an Ubuntu 12.04 machine).

This was the contents of /etc/network/interfaces:

# The loopback network interface
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
        address xxx.yyy.zzz.mmm
        netmask 255.255.255.0
        gateway xxx.yyy.zzz.1
        dns-nameservers xxx.yyy.zzz.aaa xxx.yyy.zzz.bbb
        dns-search mydomain.nl

auto eth3
iface eth3 inet static
        address 192.168.4.1
        netmask 255.255.255.0
        gateway 192.168.4.1

When I tried to bring the interface up I got an error message:

$ ifup eth3
RTNETLINK answers: File exists
Failed to bring up eth3.

It took me a while to figure it out, but the problem was the gateway line in the eth3 entry. Of course, you can only have one default gateway in your setup. I missed this because I was also busy adding routes to networks behind the machine on the other end of eth3.
In the end, removing the gateway line from the eth3 entry solved the problem.
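
The conflict is visible in the kernel routing table: ifup tries to add a second default route while one already exists. A minimal sketch of the check, run here on captured sample output (the addresses are made up) since the real table is machine-specific; on a live system, pipe ip route show into the same filter:

```shell
# Count "default" entries in a routing table; a count of 1 or more means
# adding another default gateway fails with "RTNETLINK answers: File exists".
# Sample output below; on the server use: ip route show | awk ...
printf 'default via 10.0.0.1 dev eth0\n192.168.4.0/24 dev eth3  proto kernel\n' |
  awk '$1 == "default" { n++ } END { print n+0 }'
```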

My final /etc/network/interfaces looks like this:

# The loopback network interface
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet static
        address xxx.yyy.zzz.mmm
        netmask 255.255.255.0
        gateway xxx.yyy.zzz.1
        dns-nameservers xxx.yyy.zzz.aaa xxx.yyy.zzz.bbb
        dns-search mydomain.nl

auto eth3
iface eth3 inet static
        address 192.168.4.1
        netmask 255.255.255.0
        post-up /sbin/route add -net 192.168.1.0 netmask 255.255.255.0 gw 192.168.4.250
        post-up /sbin/route add -net 192.168.2.0 netmask 255.255.255.0 gw 192.168.4.250
        post-up /sbin/route add -net 192.168.3.0 netmask 255.255.255.0 gw 192.168.4.250
        post-down /sbin/route del -net 192.168.1.0 netmask 255.255.255.0
        post-down /sbin/route del -net 192.168.2.0 netmask 255.255.255.0
        post-down /sbin/route del -net 192.168.3.0 netmask 255.255.255.0

Update 2013-08-19: Removed network entries as per Ville’s suggestion.


Pairing a device with a Logitech unifying receiver in Linux

My girlfriend’s keyboard and mouse stopped working some time ago. It turned out that her Logitech unifying receiver (a small USB dongle for keyboard and mouse) was a bit broken: it would only work when twisted in a certain way. So I called Logitech, explained the situation, and they offered to send us a replacement for free. Well done, Logitech support!

Now, since we both use Linux as our main OS, the question was how to pair the mouse and keyboard with the new receiver. Logitech provides a piece of Windows software, but nothing for Linux. It turns out it’s not that difficult, and you can find various little C programs that do it for you. I tried Travis Reeder’s solution and it worked like a charm on my Ubuntu 12.04 machine.

These are the steps I took.
First I switched off the keyboard and the mouse, then ran the following:

$ git clone https://github.com/treeder/logitech_unifier.git
Cloning into 'logitech_unifier'...
remote: Counting objects: 35, done.
remote: Compressing objects: 100% (26/26), done.
remote: Total 35 (delta 11), reused 33 (delta 9)
Unpacking objects: 100% (35/35), done.
$ cd logitech_unifier/
$ ./autopair.sh 
Logitech Unified Reciever unify binary not compiled, attemping compilation
Logitech Unified Reciever unify binary was successfully compiled
Auto-discovering Logitech Unified Reciever
Logitech Unified Reciever found on /dev/hidraw0!
Turn off the device you wish to pair and then press enter
[sudo] password for lennart: 
The receiver is ready to pair a new device.
Switch your device on to pair it.

I ran the autopair.sh script twice, once for the mouse and once for the keyboard.

Thanks Travis!


Comparing rsnapshot and obnam for scheduled large backups

Introduction

The home directories of the servers I administer at work total about 6.5TB of data. The home directories are stored on a file server (using ext4 partitions) and served to the other servers over NFSv3 via a bonded 1Gbps LAN link.

As you all know, backups are a good idea, but how do you implement a backup strategy for this kind of data? We decided quite early that using tapes as the backup medium was out of the question: we simply can’t afford them given the amount of disk space we need. Moreover, tapes usually require operator involvement, and neither my colleague nor I feel like going to the data centre every week. Our idea was to back up to another server with enough disk space in a different part of the data centre. For off-site backups we can always make an annual (maybe monthly) backup, either on tape at SurfSARA/BigGrid or on a remote server.

Before implementing a given strategy several things need to be known and tested. The major questions we wanted to have an answer to were:

  1. How often do we want to back up the data? Daily snapshots? Weekly? Monthly?
  2. How many of the backups mentioned above do we want to keep? And for how long?
  3. In order to answer these questions (given a roughly fixed amount of backup space) we need to know
    • How much data changes per night/week/etc.
    • How much duplication is there in the data? How many people store the same file (or blocks, if you go for block-level deduplication)?
  4. Is NFS/network speed a limiting factor when running the backups?
  5. Can the tool preserve additional file system attributes like POSIX ACLs?

Candidates

After looking around the web and looking back at my own experiences I came up with four possible candidates. Each of them allows for backup rotation and preserves POSIX ACLs (so points 1 and 5 above have been taken care of).

  1. Bacula: enterprise-level backup application that I’ve used in combination with tapes in the past. Easily supports multiple clients, tape robots, etc. No deduplication. All metadata etc. are stored in a (MySQL) database, so restoring takes some effort (and don’t forget to make a backup of the database as well!).
  2. rsnapshot: based on rsync, makes snapshots using hard links. Easy to restore, because files are simply copied to the backup medium.
  3. rdiff-backup: similar to rsnapshot, but doesn’t allow for removal of intermediate backups after a given time interval. Consequently it was the first candidate to fall off my list.
  4. Obnam: a young tool that promises block level data deduplication. Stores backed up data in its own file format. Tools for browsing those archives are not really well developed yet.
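
rsnapshot’s hard-link approach (candidate 2 above) is worth a small illustration: a file that did not change between snapshots is a hard link to the same inode, so it costs no extra data blocks. A minimal sketch in a throw-away temp directory (file names are made up, mimicking rsnapshot’s daily.N layout):

```shell
# Unchanged files across rsnapshot snapshots are hard links: one inode,
# two directory entries, (almost) no extra disk usage.
tmp=$(mktemp -d)
echo "data" > "$tmp/daily.1"
ln "$tmp/daily.1" "$tmp/daily.0"   # what rsnapshot does for unchanged files
stat -c %h "$tmp/daily.1"          # hard-link count: 2
```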

Tests

Because I already had quite some experience with Bacula but none with the other two candidates (although I use rsync a lot) I decided to start a test run with Obnam, followed by a run with rsnapshot. These are the results:

Obnam

After backing up /home completely (which took several days!), a new run several days later took (timed with the Linux time command):

Backed up 3443706 files, uploaded 94.0 GiB in 127h48m49s at 214.2 KiB/s average speed

real    7668m56.628s
user    4767m16.132s
sys     162m48.739s

From the obnam log file:

2012-11-17 12:41:34 INFO VFS: baseurl=/home read=0 written=0
2012-11-21 23:09:36 INFO VFS: baseurl=/backups/backup_home read=2727031576964 written=150015706142
2012-11-21 23:09:36 INFO Backup performance statistics:
2012-11-21 23:09:36 INFO * files found: 3443706
2012-11-21 23:09:36 INFO * uploaded data: 100915247663 bytes (93.9846482715 GiB)
2012-11-21 23:09:36 INFO * duration: 460128.627629 s
2012-11-21 23:09:36 INFO * average speed: 214.179341663 KiB/s
2012-11-21 23:09:36 INFO Backup finished.
2012-11-21 23:09:36 INFO Obnam ends
2012-11-21 23:09:36 INFO obnam version 1.2 ends normally

So: ~5 days for backing up ~100GB of changed data… Load was not high on the machines, neither in terms of CPU nor of RAM. Disk usage in /backups/backup_home was 5.7TB, disk usage of /home was 6.6TB, so there does seem to be some deduplication.
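
The log’s figures are internally consistent; the reported average speed can be recomputed from the uploaded bytes and the duration:

```shell
# Recompute obnam's average speed from its own log figures:
# 100915247663 bytes over 460128.627629 seconds, expressed in KiB/s.
awk 'BEGIN { printf "%.1f\n", 100915247663 / 1024 / 460128.627629 }'
```

which matches the 214.2 KiB/s the log reports.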

rsnapshot

A full backup of /home took (according to the log file):

[27/Nov/2012:12:55:31] /usr/bin/rsnapshot daily: started
[27/Nov/2012:12:55:31] echo 17632 > /var/run/rsnapshot.pid
[27/Nov/2012:12:55:31] mkdir -m 0700 -p /backups/backup_home_rsnapshot/
[27/Nov/2012:12:55:31] mkdir -m 0755 -p /backups/backup_home_rsnapshot/daily.0/
[27/Nov/2012:12:55:31] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded /home /backups/backup_home_rsnapshot/daily.0/localhost/
[28/Nov/2012:23:16:16] touch /backups/backup_home_rsnapshot/daily.0/
[28/Nov/2012:23:16:16] rm -f /var/run/rsnapshot.pid
[28/Nov/2012:23:16:16] /usr/bin/rsnapshot daily: completed successfully

So: ~1.5 days for a full backup of 6.3TB. An incremental backup a day later took:

[29/Nov/2012:13:10:21] /usr/bin/rsnapshot daily: started
[29/Nov/2012:13:10:21] echo 20359 > /var/run/rsnapshot.pid
[29/Nov/2012:13:10:21] mv /backups/backup_home_rsnapshot/daily.0/ /backups/backup_home_rsnapshot/daily.1/
[29/Nov/2012:13:10:21] mkdir -m 0755 -p /backups/backup_home_rsnapshot/daily.0/
[29/Nov/2012:13:10:21] /usr/bin/rsync -a --delete --numeric-ids --relative --delete-excluded --link-dest=/backups/backup_home_rsnapshot/daily.1/localhost/ /home /backups/backup_home_rsnapshot/daily.0/localhost/
[29/Nov/2012:13:25:09] touch /backups/backup_home_rsnapshot/daily.0/
[29/Nov/2012:13:25:09] rm -f /var/run/rsnapshot.pid
[29/Nov/2012:13:25:09] /usr/bin/rsnapshot daily: completed successfully

So: 15 minutes… and the changed data amounted to 21GB.

This gave me a clear winner: rsnapshot! Not only is it very fast, but given its simple way of storing data, restoring a backup of any file is quickly done.

We now also have answers to our questions: our daily changing volume is on the order of ~100GB, and there isn’t much data that can be deduplicated. We also monitored the network usage: depending on the server load it can be limiting, but since a daily differential backup takes only 15–30 minutes, that isn’t a problem.
For a remote backup server connected via a 100Mbps line we did see that the initial backup took a very long time. We should try to get a faster connection to that machine.
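
A back-of-the-envelope check supports this: shipping ~100GB of nightly changes over a 1Gbps link takes well under half an hour even at full line rate, while the same amount over the remote server’s 100Mbps line takes ten times as long (idealized figures, ignoring protocol overhead and disk speed):

```shell
# Minutes needed to transfer 100 GB at 1 Gbps and at 100 Mbps (idealized):
awk 'BEGIN {
  gb = 100 * 8                     # gigabits to move
  printf "%.0f\n", gb / 1   / 60   # over the 1 Gbps LAN link
  printf "%.0f\n", gb / 0.1 / 60   # over the 100 Mbps remote link
}'
```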

The future

The next challenge we face is how to back up some of the large data sets we have/produce. These include aligned BAM files of next-generation sequencing data, VCF files of the same data, results from genomic imputations (both as gzip-ed text files and as binary files in DatABEL format). This also totals several TB. Luckily these files usually don’t change on a daily basis.


Booting an Ubuntu server with a degraded software RAID array

My home server runs Ubuntu 12.04 with a software RAID 5 array and since a couple of days I’ve been getting e-mails from the SMART daemon warning me of uncorrectable errors on one of the drives. Today I took the time to take the failing drive out and check it with the tools from the manufacturer.

Because I didn’t want to run the risk of unplugging the wrong drive with the system on (and thus losing the whole RAID array), I shut the server down, removed the hard drive and started it again. The idea was that it would boot right back into the OS, but with a degraded RAID array. Unfortunately the server didn’t come up… After connecting a keyboard and monitor it turned out that the system was waiting at an initramfs prompt. From there I could check that the RAID array was indeed degraded but functioning fine, as I could manually mount all partitions.

Some Googling later I found out that by default Ubuntu doesn’t boot when a software RAID array is degraded. This is to make sure you, as administrator, know something is wrong. A good idea for a laptop or PC, but not for a standalone server. The solution is the following:

  • From the initramfs prompt mount your original filesystems, for example in /mnt.
  • Use chroot /mnt to change root into your server’s hard disks.
  • In the file /etc/initramfs-tools/conf.d/mdadm add or change the line to
    BOOT_DEGRADED=true
    
  • Then run
    update-initramfs -u

    to regenerate the initial ramdisk.

  • Type exit to exit the chroot environment.
  • Unmount your file systems and reboot.

Now your server should continue booting even though it has a degraded RAID array.
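
After the reboot, the array’s state can be verified in /proc/mdstat: a degraded array shows a missing member as `_` in the status field. The filter below runs on a captured sample (device names and sizes are illustrative); on the real machine, simply cat /proc/mdstat:

```shell
# "[3/2] [UU_]" means 2 of 3 members active: the array is degraded.
# Sample output shown here; on the server: cat /proc/mdstat
grep -c '\[U*_\]' <<'EOF'
md0 : active raid5 sda1[0] sdb1[1]
      3906764800 blocks level 5, 64k chunk, algorithm 2 [3/2] [UU_]
EOF
```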



Installing Loggerhead behind Apache on Ubuntu 11.04

Introduction

Loggerhead is a web frontend for Bazaar (usually abbreviated as bzr) repositories. Bazaar is a so-called distributed version control system. So, if you have one or more bzr repositories, you can use Loggerhead to look at the files, read the change logs and see the differences between revisions from within your web browser.

The main purpose of this post is to document the steps needed to configure Loggerhead and Apache to work together to publish your bzr repos on the web. The need for this post arose when I tried to get this setup to work and found that there isn’t a lot of documentation on how to get it done, and most of it is out of date. The following steps were performed on a Linux server running Ubuntu 11.04.

Basic Loggerhead configuration

First, let’s install Loggerhead:

$ aptitude install loggerhead

Although the package is called loggerhead, the actual binary that is run is called serve-branches. The package provides start and stop scripts for the service (/etc/init.d/loggerhead), but to start successfully the file /etc/serve-branches.conf needs to exist. Older documentation I found on the web refers to the file /etc/loggerhead.conf, but that file has become obsolete.

The serve-branches.conf file contains three lines:

served_branches=/home/bzr
prefix=
port=8080

Here, the line served_branches points to the directory under which you store your bzr repositories. Each repo needs to be stored in its own directory. So in this example all the repos are in subdirectories of /home/bzr/.

You have to make sure that Loggerhead can read the files in that directory. Loggerhead runs as the loggerhead user, but I made the directories readable and accessible by all users:

$ chmod -R a+rx /home/bzr/

If you now start Loggerhead:

$ service loggerhead start

you should be able to visit http://localhost:8080 in your browser and see your repositories.
NOTE for Ubuntu 12.04 and 12.10: There seems to be a bug in Loggerhead for these Ubuntu releases (see the link to the Launchpad bug report at the end of this post). In order to start the Loggerhead daemon correctly in these Ubuntu releases the file /etc/init.d/loggerhead must be edited. The line

start-stop-daemon -p $PIDFILE -S --startas /usr/bin/serve-branches --chuid loggerhead --make-pidfile --background --chdir $served_branches -- --prefix=$prefix --port=$port --host=$host --log-folder /var/log/loggerhead 2>/dev/null

must be changed to

start-stop-daemon -p $PIDFILE -S --startas /usr/bin/serve-branches --chuid loggerhead --make-pidfile --background -- file://$served_branches --prefix=$prefix --port=$port --log-folder /var/log/loggerhead 2>/dev/null

Once this is done, restart the Loggerhead service as stated above and it should work again (if you run Loggerhead behind an Apache web server as detailed below, don’t forget to restart Apache as well).

How to publish your branch to this shared repository?

Now that our repository browser is set up, how do we publish our branches to it so that there actually is something to browse? Here is how to publish a branch to the server, assuming that you are in a directory that contains a branch and want to publish it as myTests:

$ bzr push --create-prefix sftp://username@server.yourdomain.com/home/bzr/myTests

As you probably suspected, the --create-prefix option is only necessary the first time you push your branch. Note that we are using sftp here: Loggerhead itself doesn’t allow writes to the published repos. So, every user that wants to push his/her changes to this repository needs to have sftp access to the /home/bzr directory. I solved that problem by adding all people that need to be able to push changes to a Linux group called vcs (for Version Control Systems) and then setting the group of /home/bzr/ to vcs as well as giving group write permissions on this directory:

$ ls -ld /home/bzr/
drwxrwxr-x 4 root vcs 4096 2011-08-16 23:10 /home/bzr/
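
One could go a small step further and add the setgid bit so new repositories automatically inherit the vcs group (this is my addition; the listing above shows plain 0775). A sketch on a temp directory, since the real commands need root and the vcs group:

```shell
# Group-writable plus setgid: files created below inherit the directory's
# group. Demonstrated on a temp dir; on the server the target is /home/bzr
# and you would also run: chgrp vcs /home/bzr
d=$(mktemp -d)
chmod 2775 "$d"
stat -c %a "$d"   # octal mode, setgid bit included: 2775
```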

Adding Apache to the mix

In my case I already have a web server (Apache) running on port 80. Since I’d rather not open yet another port (8080 in this case) on my router, I wanted to use Apache to hand requests for the bzr pages over to Loggerhead. For that I needed to install the following package:

$ aptitude install python-pastedeploy

Next, I needed to change the contents of the /etc/serve-branches.conf file to this:

served_branches=/home/bzr
prefix=/bzr
port=8080

The prefix indicates the location in the URL where Apache will serve the repos. In this case that will be http://server.yourdomain.com/bzr/.

And finally I needed to configure Apache. First, make sure that the proxy and proxy-http modules are loaded:

$ a2enmod proxy proxy_http

Next, create a file /etc/apache2/sites-available/loggerhead with the following contents:

# Configuration for browsing of Bazaar repos. Make sure loggerhead is running.
<Location "/bzr/">
    ProxyPass http://127.0.0.1:8080/
    ProxyPassReverse http://127.0.0.1:8080/
</Location>

Note that Loggerhead and Apache run on the same host, that’s why I set the IP to 127.0.0.1.

Finally it’s time to enable the site and restart Apache:

$ a2ensite loggerhead
$ service apache2 restart

Now it should be possible to browse your repos at http://server.yourdomain.com/bzr/. Note the final /, it’s important.

Securing access with an LDAP connection

I have stored all my Unix user and group information in an LDAP server. To make sure that only people in the Unix group vcs are allowed access to the loggerhead pages, change the Apache configuration file loggerhead to the following:

# Configuration for browsing of Bazaar repos. Make sure loggerhead is running.
<Location "/bzr/">
    ProxyPass http://127.0.0.1:8080/
    ProxyPassReverse http://127.0.0.1:8080/
 
    # LDAP authentication
    AuthType Basic
    AuthName "Karssen.org VCS users"
    AuthBasicProvider ldap
    AuthLDAPURL "ldap://ldap.yourdomain.com/ou=Users,dc=yourdomain,dc=com?uid"
    AuthLDAPGroupAttribute memberUid
    AuthLDAPGroupAttributeIsDN off
    Order Allow,Deny
    Allow From All
    Require ldap-group cn=vcs,ou=Groups,dc=yourdomain,dc=com
</Location>

The AuthLDAPGroupAttribute and AuthLDAPGroupAttributeIsDN lines are needed because the vcs group lists its members by uid (memberUid) rather than by DN. I store my Unix (POSIX) groups in a separate OU in the LDAP tree, hence the ou=Groups in the Require ldap-group line.
Don’t forget to restart Apache after making these changes.

References


Installing and configuring Puppet

Puppet is a configuration management system. In short, this means that by setting up a server (the Puppet master) you can manage many other machines (nodes) with this server by specifying which packages should be installed, which files need to be present, their permissions, etc. The nodes poll the server every 30 minutes (by default) to see if they should apply any changes to their configuration. Other packages that implement a similar idea are CfEngine and Chef.

Note that all these instructions were performed as root.

The puppet master

Gaffel will be the Puppet master. I’ve added a DNS entry for puppet.karssen.org that points to gaffel. This installs both the client and the Puppet master:

$ aptitude install puppet puppetmaster

The main configuration of server and client can be found in /etc/puppet/puppet.conf. We’ll leave it at the default for now. The file /etc/puppet/manifests/site.pp contains options that apply to the whole site. Let’s make it and add the following contents:

import "nodes"
 
# The filebucket is for backups. Originals of files that Puppet modifies
# get stored here.
filebucket { main: server => 'puppet.karssen.org' }
File { backup => main }
 
# Set the default $PATH$ for executing commands on node systems.
Exec { path => "/usr/bin:/usr/sbin:/bin:/sbin:" }

The file /etc/puppet/manifests/nodes.pp defines the nodes/clients that will be managed by puppet as well as what configuration will be applied to them, so-called roles. For now, let’s make a quick example:

node common {
	include packages
}
 
node lambik inherits common {
	include ntp::client
}

Both the ‘packages’ and the ‘ntp’ modules still need to be defined. Let’s do that now.

Modules are collections of puppet code (known as manifests) and related files that are used for client configuration. Modules are stored in /etc/puppet/modules/.
Let’s start with the ntp example. First make the necessary directory structure:

$ mkdir -p /etc/puppet/modules/ntp/{manifests,files,templates}

Every module needs an init.pp file that declares the class. It can also include other files. The files and templates directories are used to store files that need to be copied to the node, or templates used to generate such files, respectively. We’ll come across examples of both. This is the init.pp file for the ntp role (/etc/puppet/modules/ntp/manifests/init.pp):

class ntp::client {
   package { "ntp":
      ensure => installed,
   }

   service { "ntp_client":
      name       => "ntp",
      ensure     => running,
#     hasstatus  => true,
      hasrestart => true,
      require    => Package["ntp"],
   }
}

Here we indicate that the NTP service must be running and that its init script (in /etc/init.d) accepts the status and restart options. Lastly, in the require line, we note that the ntp package must be installed before this manifest can be applied. This is necessary because the order in which the two directives are executed is not necessarily the order in which they appear in the manifest.

The # in front of the hasstatus attribute is because of a bug in the Puppet version (2.6.4) shipped with Ubuntu 11.04. See http://projects.puppetlabs.com/issues/5610 for the bug report. In version 2.6.7 it is supposedly fixed.

In our nodes.pp file we also mentioned a packages class. In this class we list all the packages that we want to have installed on the node. Let’s make the packages module. First create the necessary directories:

$ mkdir -p /etc/puppet/modules/packages/{manifests,files,templates}

Add the file /etc/puppet/modules/packages/manifests/init.pp:

class packages {
	 $base_packages = [
	 "openssh-server",
	 "nfs-common",
	 "etckeeper",
	 "htop",
	 "iotop",
	 "iftop",
	 ]
 
	 $editor_packages = [
	 "emacs",
	 "emacs-goodies-el",
	 "elscreen",
	 ]
 
	 $all_packages = [
	 $base_packages,
	 $editor_packages,
	 ]
 
	 package { $all_packages:
	      ensure => installed,
	 }
}

Here I’ve defined three variables (beginning with a $ sign), one for base packages, one for editor-related packages and one called $all_packages that incorporates them both. Finally, I tell puppet to ensure they are all installed.
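
A hypothetical extension of this pattern: a new group of packages is just another array variable plus an entry in $all_packages (the package names below are examples of mine, not part of the original setup):

```puppet
# Hypothetical: a group of monitoring-related packages.
$monitoring_packages = [
  "smartmontools",
  "sysstat",
]

$all_packages = [
  $base_packages,
  $editor_packages,
  $monitoring_packages,
]
```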

Setting up a client

As a test client I’m using lambik, one of my MythTV frontends.

$ aptitude install puppet

To make sure that puppet starts by default on system startup edit the file /etc/default/puppet and set START to yes:

# Defaults for puppet - sourced by /etc/init.d/puppet
 
# Start puppet on boot?
START=yes
 
# Startup options
DAEMON_OPTS=""

Now edit /etc/puppet/puppet.conf (on the client) and add the FQDN of the puppet master server to the [main] section:

[main]
logdir=/var/log/puppet
vardir=/var/lib/puppet
ssldir=/var/lib/puppet/ssl
rundir=/var/run/puppet
factpath=$vardir/lib/facter
templatedir=$confdir/templates
prerun_command=/etc/puppet/etckeeper-commit-pre
postrun_command=/etc/puppet/etckeeper-commit-post
server = puppet.karssen.org
 
[master]
# These are needed when the puppetmaster is run by passenger
# and can safely be removed if webrick is used.
ssl_client_header = SSL_CLIENT_S_DN
ssl_client_verify_header = SSL_CLIENT_VERIFY

Setting up secure communication between master and nodes and first test run

Puppet uses SSL certificates to set up a secure connection between master and nodes. Before you can apply any changes to the client, certificates need to be exchanged and signed. First, tell the client to connect to the puppet master:

$ puppetd --test
info: Creating a new SSL key for lambik.karssen.org
warning: peer certificate won't be verified in this SSL session
info: Caching certificate for ca
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
info: Creating a new SSL certificate request for lambik.karssen.org
info: Certificate Request fingerprint (md5): 1D:A3:3A:4A:A6:DA:D6:C8:96:F4:D4:7E:52:F4:12:1D
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
warning: peer certificate won't be verified in this SSL session
Exiting; no certificate found and waitforcert is disabled

On the puppet master we can now sign the certificate:

$ puppetca -l
lambik.karssen.org
$ puppetca -s lambik.karssen.org
notice: Signed certificate request for lambik.karssen.org
notice: Removing file Puppet::SSL::CertificateRequest lambik.karssen.org at '/var/lib/puppet/ssl/ca/requests/lambik.karssen.org.pem'

On the client we can now rerun puppetd:

root@lambik:~# puppetd --test
info: Caching catalog for lambik.karssen.org
info: Applying configuration version '1311930908'
notice: /Stage[main]/Packages/Package[iotop]/ensure: ensure changed 'purged' to 'present'
notice: /Stage[main]/Packages/Package[iftop]/ensure: ensure changed 'purged' to 'present'
notice: /Stage[main]/Ntp/Package[ntp]/ensure: ensure changed 'purged' to 'present'
notice: /Stage[main]/Packages/Package[emacs-goodies-el]/ensure: ensure changed 'purged' to 'present'
notice: /Stage[main]/Packages/Package[htop]/ensure: ensure changed 'purged' to 'present'
info: Creating state file /var/lib/puppet/state/state.yaml
notice: Finished catalog run in 78.43 seconds

If all went well, we can now start the puppet client daemon to keep our system under puppet control:

$ service puppet start

Adding (configuration) files to the roles

Since I run my own NTP server (ntp.karssen.org, only accessible from inside my LAN), the NTP configuration file (/etc/ntp.conf) must be changed. Of course, we want Puppet to take care of this. The ntp.conf file I want to distribute to all nodes has the following contents (note that the only changes are the name of the server and the commented-out restrict lines):

# /etc/ntp.conf, configuration for ntpd; see ntp.conf(5) for help
 
driftfile /var/lib/ntp/ntp.drift
 
 
# Enable this if you want statistics to be logged.
#statsdir /var/log/ntpstats/
 
statistics loopstats peerstats clockstats
filegen loopstats file loopstats type day enable
filegen peerstats file peerstats type day enable
filegen clockstats file clockstats type day enable
 
# Specify one or more NTP servers.
 
# Use servers from the NTP Pool Project. Approved by Ubuntu Technical Board
# on 2011-02-08 (LP: #104525). See http://www.pool.ntp.org/join.html for
# more information.
server ntp.karssen.org
 
# Use Ubuntu's ntp server as a fallback.
server ntp.ubuntu.com
 
# Access control configuration; see /usr/share/doc/ntp-doc/html/accopt.html for
# details.  The web page <http://support.ntp.org/bin/view/Support/AccessRestrictions>
# might also be helpful.
#
# Note that "restrict" applies to both servers and clients, so a configuration
# that might be intended to block requests from certain clients could also end
# up blocking replies from your own upstream servers.
 
# By default, exchange time with everybody, but don't allow configuration.
#restrict -4 default kod notrap nomodify nopeer noquery
#restrict -6 default kod notrap nomodify nopeer noquery
 
# Local users may interrogate the ntp server more closely.
restrict 127.0.0.1
restrict ::1
 
# Clients from this (example!) subnet have unlimited access, but only if
# cryptographically authenticated.
#restrict 192.168.123.0 mask 255.255.255.0 notrust
 
 
# If you want to provide time to your local subnet, change the next line.
# (Again, the address is an example only.)
#broadcast 192.168.123.255
 
# If you want to listen to time broadcasts on your local subnet, de-comment the
# next lines.  Please do this only if you trust everybody on the network!
#disable auth
#broadcastclient

Save this file in /etc/puppet/modules/ntp/files (on the puppet master). Now edit the manifest for the ntp role (/etc/puppet/modules/ntp/manifests/init.pp) to add the file section and a subscribe command:

class ntp::client {
   package { "ntp":
      ensure => installed,
   }

   service { "ntp_client":
      name       => "ntp",
      ensure     => running,
#     hasstatus  => true,
      hasrestart => true,
      require    => Package["ntp"],
      subscribe  => File["ntp_client_config"],
   }

   file { "ntp_client_config":
      path    => "/etc/ntp.conf",
      owner   => root,
      group   => root,
      mode    => 644,
      source  => "puppet:///ntp/ntp.conf",
      require => Package["ntp"],
   }
}

The URL specified in the source line automatically looks in the right place (as mentioned just above) for the file. Because we don’t want to wait for puppet to automatically pass on this configuration, let’s run it by hand:

root@lambik:~# puppetd --test
info: Caching catalog for lambik.karssen.org
info: Applying configuration version '1311936811'
--- /etc/ntp.conf	2011-06-17 07:59:54.000000000 +0200
+++ /tmp/puppet-file20110729-12128-1h3fupz-0	2011-07-29 12:53:33.279622938 +0200
@@ -16,16 +16,14 @@
 # Use servers from the NTP Pool Project. Approved by Ubuntu Technical Board
 # on 2011-02-08 (LP: #104525). See http://www.pool.ntp.org/join.html for
 # more information.
-server 0.ubuntu.pool.ntp.org
-server 1.ubuntu.pool.ntp.org
-server 2.ubuntu.pool.ntp.org
-server 3.ubuntu.pool.ntp.org
+server ntp.karssen.org
 
 # Use Ubuntu's ntp server as a fallback.
 server ntp.ubuntu.com
 
@@ -33,8 +31,8 @@
 # up blocking replies from your own upstream servers.
 
 # By default, exchange time with everybody, but don't allow configuration.
-restrict -4 default kod notrap nomodify nopeer noquery
-restrict -6 default kod notrap nomodify nopeer noquery
+#restrict -4 default kod notrap nomodify nopeer noquery
+#restrict -6 default kod notrap nomodify nopeer noquery
 
 # Local users may interrogate the ntp server more closely.
 restrict 127.0.0.1
info: FileBucket adding /etc/ntp.conf as {md5}32280703a4ba7aa1148c48895097ed07
info: /Stage[main]/Ntp::Client/File[ntp_client_config]: Filebucketed /etc/ntp.conf to main with sum 32280703a4ba7aa1148c48895097ed07
notice: /Stage[main]/Ntp::Client/File[ntp_client_config]/content: content changed '{md5}32280703a4ba7aa1148c48895097ed07' to '{md5}0d1b81c95bab1f6b08eb27dfaeb18bb5'
info: /Stage[main]/Ntp::Client/File[ntp_client_config]: Scheduling refresh of Service[ntp_client]
notice: /Stage[main]/Ntp::Client/Service[ntp_client]: Triggered 'refresh' from 1 events
notice: Finished catalog run in 3.06 seconds

Setting NFS mounts in /etc/fstab

On my clients I want to mount several NFS shares. Let’s create the directories for the nfs_mounts module (on the puppet master of course):

$ mkdir -p /etc/puppet/modules/nfs_mounts/{manifests,files,templates}

Next, let’s edit the manifest (/etc/puppet/modules/nfs_mounts/manifests/init.pp):

class nfs_mounts {
  # Create the shared folder unless it already exists
  exec { "/bin/mkdir -p /var/sharedtmp/":
    unless => "/usr/bin/test -d /var/sharedtmp/",
  }

  mount { "/var/sharedtmp/":
    atboot  => true,
    ensure  => mounted,
    device  => "nfs.karssen.org:/var/sharedtmp",
    fstype  => "nfs",
    options => "vers=3",
    require => Package["nfs-common"],
  }
}

This should create the /var/sharedtmp directory and mount it. Note that I mention the nfs-common package in a require line; this package was defined in the packages module (in the $base_packages variable). Now let’s add this module to the nodes.pp file:

node common {
  include packages
}
 
node lambik inherits common {
	include ntp::client
	include nfs_mounts
}
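For reference, Puppet’s mount resource manages an entry in the client’s /etc/fstab; after a successful run the line for this share should look roughly like this (illustrative, with the default dump and pass fields):

```
nfs.karssen.org:/var/sharedtmp  /var/sharedtmp  nfs  vers=3  0  0
```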

Since I’ve got more than a single NFS mount, let’s extend the previous example and use a defined resource. Change the file /etc/puppet/modules/nfs_mounts/manifests/init.pp as follows:

define nfs_mount(
  $location,
  $server  = "nfs.karssen.org",
  $options = "vers=3",
  $fstype  = "nfs"
) {
  file { "$location":
    ensure => directory,
  }

  mount { "$location":
    atboot  => true,
    ensure  => mounted,
    device  => "${server}:${location}",
    fstype  => "$fstype",
    options => "$options",
    require => [ Package["nfs-common"], File["$location"] ],
  }
}
 
class nfs_mounts {

  nfs_mount { "/home":
    location => "/home",
  }

  nfs_mount { "/var/sharedtmp":
    location => "/var/sharedtmp",
  }

  nfs_mount { "/var/video":
    location => "/var/video",
  }

  nfs_mount { "/var/music":
    location => "/var/music",
  }
}

Here we first define a resource called nfs_mount, which can accept various parameters, all of which have a default value, except $location. Secondly we ensure that this location is a directory and then we define how it should be mounted. In the subsequent class definition we use this nfs_mount resource several times to mount the various NFS shares.
Note that it would have been easier if the definition of nfs_mount had started with

define nfs_mount(
  $location = $name,

because then the invocations of nfs_mount in the class would not need the location => line. Unfortunately this doesn’t work: it’s a known bug that was fixed in Puppet 2.6.5 (http://projects.puppetlabs.com/issues/5061).
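For reference, on a Puppet version that includes that fix, the shorter form would look roughly like this (an untested sketch; the resource body is unchanged from the definition above):

```puppet
define nfs_mount(
  $location = $name,
  $server   = "nfs.karssen.org",
  $options  = "vers=3",
  $fstype   = "nfs"
) {
  # ...same file and mount resources as above...
}

class nfs_mounts {
  # The resource title now doubles as the mount point.
  nfs_mount { ["/home", "/var/sharedtmp", "/var/video", "/var/music"]: }
}
```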


Making a .deb package for software that doesn’t accept the DESTDIR variable in its Makefile

Because I’ll be deploying a new server in the near future and I want to keep it as clean as possible, I decided (again) to find out how to create a .deb package (as used, for example, by Debian and Ubuntu Linux) for software that doesn’t follow the autotools way of doing things. This time I found a way that works. But first some background info.

In the Unix/Linux world many programs are compiled from source in three steps:

./configure
make
make install

Usually the necessary files for this have been created using the autotools. The goal of the first step is to create a so-called Makefile that contains instructions on how to compile and install the files (as done in the two subsequent make steps).

Some software packages, however, include a ready-made Makefile that, in addition, doesn’t accept the environment variable DESTDIR. This last point is what makes packaging the application into a .deb file a bit tricky. The reason is that the package build scripts want to install the files of your application into a temporary directory, not into system-wide directories like /usr/bin/, during the packaging process. As such, packaging does not require root privileges.

At work we use many programs and tool sets developed by ourselves and other scientists. I know from my own experience that setting up autotools for your program is not trivial. Actually, for lack of time I’ve never successfully done it, and for most of the rather simple programs that I’ve written setting up a complete autoconf/automake environment seems overkill. I usually ended up writing a simple Makefile that compiles the code and installs it (usually in /usr/local/bin).

Merlin by Abecasis et al. is a great piece of software developed at the University of Michigan. However, as you may have expected by now, its Makefile does not accept the DESTDIR variable; instead, running make tells you that in order to install in a different directory you’ll have to run

make INSTALLDIR=/some/other/directory

Therefore, all quick-and-dirty .deb recipes one finds on the Internet do not work without some adaptations. So here is what I did to make a .deb of it. This won’t be a full tutorial on packaging; see the references at the end of this post for that. I’ll assume here that you have your build environment set up (e.g. the build-essential and fakeroot packages, as well as some others).

tar -xzf merlin-1.1.2.tar.gz
cd merlin-1.1.2
dh_make --single --email youremail@address --file ../merlin-1.1.2.tar.gz

Now the basic files are ready. Apart from the untarred source files the files needed for Debian packaging have also been created (in merlin-1.1.2/debian).

Time to make the necessary changes. First, since the Makefile included with merlin does not accept the DESTDIR variable that the Debian packaging system uses we’ll patch the Makefile in such a way that it works (I tried to fix this in the debian/control file, but in the end adapting the Makefile was much easier). I do this by changing the line

INSTALLDIR=/usr/local/bin

to

# default installation directory
ifeq ($(DESTDIR),)
    INSTALLDIR=/usr/local/bin/
else
    INSTALLDIR=$(DESTDIR)/usr/bin/
endif
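To check that the conditional behaves as intended, you can exercise the same logic with a throw-away Makefile (this is just a sketch for testing the idea, not part of the merlin sources):

```shell
# Recreate the DESTDIR conditional in a scratch Makefile and print the
# resulting INSTALLDIR, with and without DESTDIR set.
tmp=$(mktemp -d)
printf '%s\n' \
    'ifeq ($(DESTDIR),)' \
    '    INSTALLDIR=/usr/local/bin/' \
    'else' \
    '    INSTALLDIR=$(DESTDIR)/usr/bin/' \
    'endif' \
    'show:' > "$tmp/Makefile"
# Recipe lines must start with a tab, hence the separate printf.
printf '\t@echo $(INSTALLDIR)\n' >> "$tmp/Makefile"
make -s --no-print-directory -C "$tmp" show                 # prints /usr/local/bin/
make -s --no-print-directory -C "$tmp" show DESTDIR=/stage  # prints /stage/usr/bin/
```

Variables given on the make command line (like DESTDIR=/stage above) override anything set inside the Makefile, which is exactly how dpkg-buildpackage’s install step passes its staging directory in.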

Let’s do some polishing of the package. I don’t want to make the perfect package, but adding a bit of text to the debian/control file makes a lot of difference. This is what it looked like after my edits:

Source: merlin
Section: science
Priority: extra
Maintainer: Lennart C. Karssen <youremail@address>
Build-Depends: debhelper (>= 7)
Standards-Version: 3.8.3
Homepage: http://www.sph.umich.edu/csg/abecasis/merlin/index.html
 
Package: merlin
Architecture: any
Depends: ${shlibs:Depends}, ${misc:Depends}
Description: Package for fast pedigree analysis
 MERLIN uses sparse trees to represent gene flow in pedigrees
 and is one of the fastest pedigree analysis packages around
 (Abecasis et al, 2002).

Also editing the file debian/changelog is a good idea, especially since I changed the source code (remember the Makefile?). This is what I wrote:

merlin (1.1.2-1) unstable; urgency=low
 
  * Initial release
  * Adjusted Makefile to make DESTDIR work.
 
 -- Lennart C. Karssen <youremail@address>  Tue, 05 Apr 2011 12:04:21 +0200

Officially you should edit the debian/copyright file as well, but since the merlin licence doesn’t allow distribution of the source or the binaries I didn’t bother.

To finally build the package run

dpkg-buildpackage -rfakeroot -us -uc

This creates a .deb file in the directory where you started. As a final touch you can check your package for errors with

lintian ../merlin_1.1.2-1_amd64.deb

References:


Using rsync to backup to a remote Synology Diskstation

An updated version of the script can be found here.

I recently bought a NAS, a Synology DiskStation DS211j and stuffed two 1TB disks in it. I configured the disks to be in RAID 1 (mirrored) in case one of them decides to die. I then brought the NAS to a family member’s house and installed it there. Now she uses it to back up her important files (and as a storage tank for music and videos).

The good thing for me is that I can now make off-site backups of my home directories. I configured the DS211j to accept SSH connections so that I can log into it (as user admin or root). I used the web interface to create a directory for my backups (which appeared to be /volume1/BackupLennart after logging in with SSH).

After making a hole in her firewall that allowed me to connect to the DS211j, I created a backup script in /etc/cron.daily with the following contents:

#!/bin/bash
#
# This script makes a backup of my home dirs to a Synology DiskStation at
# another location. I use LVM for my /home, so I make a snapshot first and
# backup from there.
#
# Time-stamp: <2011-02-06 21:30:14 (lennart)>
 
###############################
# Some settings
###############################
 
# LVM options
VG=raidvg01
LV=home
MNTDIR=/mnt/home_rsync_snapshot/
 
# rsync options
DEST=root@remote-machine.example.com:/volume1/BackupLennart/
SRC=${MNTDIR}/*
OPTIONS="-e ssh --delete --progress -azvhHS --numeric-ids --delete-excluded "
EXCLUSIONS="--exclude lost+found --exclude .thumbnails --exclude .gvfs --exclude .cache --exclude Cache"
 
 
 
###############################
# The real work
###############################
 
# Create the LVM snapshot
if [ -d $MNTDIR ]; then
    # If the snapshot directory exists, another backup process may be
    # running
    echo "$MNTDIR already exists! Another backup still running?"
    exit 1
else
    # Let's make snapshots
    mkdir -p $MNTDIR
    lvcreate -L5G -s -n snap$LV /dev/$VG/$LV
    mount /dev/$VG/snap$LV $MNTDIR
fi
 
 
# Do the actual backup
rsync $OPTIONS $EXCLUSIONS $SRC $DEST
 
# Remove the LVM snapshot
if [ -d $MNTDIR ]; then
    umount /dev/$VG/snap$LV
    lvremove -f /dev/$VG/snap$LV
    rmdir $MNTDIR
else
    echo "$MNTDIR does not exist!"
    exit 1
fi

Let’s walk through it: in the first section I configure several variables. Since I use LVM on my server, I can use it to make a snapshot of my /home partition. The LVM volume group I use is called ‘raidvg01’. Within that VG my /home partition resides in a logical volume called ‘home’. The variable MNTDIR is the place where I mount the LVM snapshot of ‘home’.

The rsync options are quite straightforward. Check the rsync man page to find out what they mean. Note that I used the --numeric-ids option because the DS211j doesn’t have the same users as my server; this way all ownerships will still be correct if I ever need to restore from this backup.

In the section called “The real work” I first create the MNTDIR directory. Subsequently I create the LVM snapshot and mount it. After this the rsync backup can be run and finally I unmount the snapshot and remove it, followed by the removal of the MNTDIR.

Because the script is placed in /etc/cron.daily it is executed every day. Since we use SSH to connect to the remote DS211j, I set up SSH key access without a password. This Debian howto will tell you how to set that up.
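Setting that up boils down to something like the following sketch. Note that the key file name and location are my own choice here, and the ssh-copy-id step of course needs the NAS to be reachable:

```shell
# Generate a key pair without a passphrase, so the cron job can use it
# non-interactively.
KEYDIR=$(mktemp -d)
ssh-keygen -q -t rsa -N '' -f "$KEYDIR/backup_key"
ls "$KEYDIR"    # the new key pair: backup_key and backup_key.pub
# Install the public half on the DiskStation (run once, interactively):
#   ssh-copy-id -i "$KEYDIR/backup_key.pub" root@remote-machine.example.com
```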

The only thing missing in this setup is that the backups are not stored in an encrypted form on the remote NAS, but for now this is good enough. I can’t wait until the network bandwidth on both sides of this backup connection gets so fast (and affordable) that I can easily sync my music as well. Right now uploads are so slow that I hardly dare to include those. I know I shouldn’t complain, since the Netherlands has one of the highest broadband penetrations in the world, but, hey, don’t you always want a little more, just like Oliver Twist?

