November 30, 2003

Tape shuffling

Changed tapes on Friday, and again on Sunday. Weekend backups didn’t quite work properly.

Something about the weekend backups didn’t pan out. I re-ran the manual backups for paco and roj this afternoon on a fresh set of tapes and used just over one tape. Pax’s backups mostly failed, and I’m not sure what to make of that. Compression seems to be working, though, since I can get 120 GB on a tape from Banner data. Need to check on pax’s backup history.

November 21, 2003

Alumni search CGI

Yesterday I did more tweaking to the alumni search CGI for Bryan.

Added the ability to search by first, middle, and last names, class year, and e-mail address. Added URL field to database. Made HTML output use the Earlham stylesheet.

See previous post

Internet connection down

The router seems to be involved with this somehow.

Problem report 2003112101

November 19, 2003

Shared memory and performance tuning in NetBackup

Shared memory and SYSV IPC settings need to be tweaked in /etc/system on Solaris, and buffer parameters need to be increased for NetBackup for performance tuning.

Also, before I forget, gigabit ethernet is absolutely required here.

See this document on Solaris kernel tuning for settings to /etc/system to make NetBackup run better.

This document on network buffer size and data buffer size and number has good information as well. The NET_BUFFER_SZ file needs to be present on all clients.
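To see why the shared memory limits matter, here is a rough worked calculation. As I understand the tuning material, the shared memory NetBackup needs on the media server is approximately data buffer size times number of data buffers times tape drives times multiplexing level. The numbers below are placeholders for illustration, not the values in use on this server.

```python
def shared_memory_bytes(size_data_buffers, number_data_buffers, drives, mpx):
    """Approximate shared memory needed for tape buffers, in bytes.

    Assumed formula: buffer size * buffer count * drives * multiplexing.
    """
    return size_data_buffers * number_data_buffers * drives * mpx


# Placeholder values: 256 KB buffers, 16 buffers, 2 drives, multiplexing of 4.
needed = shared_memory_bytes(262144, 16, 2, 4)
print("%d bytes (%.0f MB)" % (needed, needed / (1024 * 1024)))
```

If the result is bigger than the shared memory maximum set in /etc/system, that limit presumably needs to go up before the larger buffers will do any good.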

November 18, 2003

rsync backups

Playing around this morning, looking at various backup things that run off of the rsync method.

Spurred by a posting in sage-members, I was looking at

  • rdiff-backup - a system that uses basic rsync-like stuff with a Python wrapper to do backups and incrementals to a remote location. Pretty nice.
  • duplicity - built off of rdiff-backup, but uses tar to create the archives and includes optional compression (gzip) and encryption (GnuPG) to protect the archives. Claims to work with FTP, but I can’t seem to get it to. The primary transport method is scp, which insists on four authentications for the first backup (and possibly more afterwards). Some nice ideas, but not really usable.

November 17, 2003

Internet connection down

The shaper and Garibaldi seemed to make the main campus connection go down this evening.

Problem report 2003111701

Possible PAX quota solution

Found some bug reports on FreeBSD that indicate that the quota system doesn’t like files that are owned by unknown users.

This bug indicates, among other things, that quotacheck doesn’t like files that are owned by unknown users. I’ve done some searching on PAX, and it looks like I deleted the accounts of a bunch of summer 2002 students, but never deleted their home directories. I’m currently in the process of deleting these dangling files. We’ll try quotas later this week, perhaps, and see whether it works or not.
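For hunting down the dangling files, something along these lines would do. This is a minimal sketch in Python, not what I actually ran; the function names are made up for illustration. It walks a directory tree and reports anything whose owning uid has no entry in the password database.

```python
import os
import pwd


def uid_has_account(uid):
    """Return True if the uid maps to an entry in the password database."""
    try:
        pwd.getpwuid(uid)
        return True
    except KeyError:
        return False


def find_orphaned_files(root, uid_known=uid_has_account):
    """Return paths under root whose owner uid is not a known account."""
    orphans = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat so symlinks are not followed
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if not uid_known(st.st_uid):
                orphans.append(path)
    return orphans
```

Calling find_orphaned_files on the home partition gives a list to review before deleting anything. The uid_known argument is only there so the lookup can be swapped out for testing.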

NetBackup and HEIWA

Discovered at about 5 AM on Saturday morning that NetBackup kills NFS on HEIWA when it’s reading the deep directory tree of Wusage reports.

Somehow, deep into the reports tree for Wusage, NetBackup’s tar program interacts badly with NFS and makes NFS from HEIWA to PAX die, thus killing the machine. It takes about 1.5 hours to get to that point, making a backup that starts at midnight kill HEIWA at about 1:30. I got in at 2:30 and manually kicked off another backup to watch it. At about 4:30 it died and I got to see where.

Have disabled backups on HEIWA until I figure out what to do about this.

November 14, 2003

Alumni search CGI

Wrote a short little CGI perl script for Bryan and the alumni e-mail directory.

I’m not sure why there is such a disconnect between what I’m trying to suggest and what Bryan seems to be understanding. So I wrote a 30-minute Perl script to demonstrate just exactly what I’m talking about. The script merely searches for case-insensitive regular expression matches in a CSV file of first name, middle name, last name, class year, and e-mail address entries. It does a little bit of formatting for the output.
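The core of the idea is small enough to sketch. This is a Python illustration, not the Perl script itself; the field names and the sample rows are invented. It runs a case-insensitive regular expression against every field of each CSV row and keeps the rows that match.

```python
import csv
import io
import re

# Column order as described above; the names themselves are illustrative.
FIELDS = ["first", "middle", "last", "year", "email"]


def search_alumni(csv_file, pattern):
    """Return rows (as dicts) where any field matches pattern, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    matches = []
    for row in csv.reader(csv_file):
        if any(regex.search(field) for field in row):
            matches.append(dict(zip(FIELDS, row)))
    return matches


# Made-up sample data standing in for the real CSV file.
sample = io.StringIO(
    "Jane,Q,Doe,1999,jdoe@example.org\n"
    "John,,Smith,2001,jsmith@example.org\n"
)
for hit in search_alumni(sample, "doe"):
    print("%(first)s %(last)s (%(year)s) <%(email)s>" % hit)
```

Matching against every field keeps the search form down to a single box, at the cost of the occasional surprise hit in the e-mail column.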

Alumni Directory

NetBackup catalog backups

Catalog backups on NetBackup are going to disk and then getting written to tape with a semi-manual dump command.

NetBackup was having problems with the catalog backup directly to tape, so I set it to back up to disk instead. Then I modified the dbbackup_notify script to run a little shell script that takes a snapshot of the /home filesystem and dumps it to tape. This has the benefit of also backing up all of the NetBackup installation as well.

The dump script uses fssnap to take the snapshot, and then uses ufsdump to perform the dump to tape.
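The sequence is roughly: take the snapshot, dump the snapshot device to tape, delete the snapshot. Below is a sketch of that sequence in Python rather than the actual shell script. The backing store path and tape device are placeholders, and the option strings are my best reading of the fssnap and ufsdump man pages, so check them before trusting this.

```python
import subprocess


def snapshot_command(filesystem, backing_store):
    """fssnap invocation that creates a UFS snapshot and prints its raw device."""
    return ["fssnap", "-F", "ufs", "-o", "raw,bs=" + backing_store, filesystem]


def dump_command(tape_device, snapshot_device):
    """Level 0 ufsdump of the snapshot device to the given tape device."""
    return ["ufsdump", "0f", tape_device, snapshot_device]


def delete_snapshot_command(filesystem):
    """fssnap invocation that removes the snapshot for the filesystem."""
    return ["fssnap", "-d", filesystem]


def run_dump(filesystem, backing_store, tape_device):
    """Snapshot, dump, and clean up; only meaningful on a Solaris host."""
    snap = subprocess.run(
        snapshot_command(filesystem, backing_store),
        check=True, capture_output=True, text=True,
    ).stdout.strip()
    try:
        subprocess.run(dump_command(tape_device, snap), check=True)
    finally:
        # Always remove the snapshot, even if the dump failed.
        subprocess.run(delete_snapshot_command(filesystem), check=True)


# Placeholder paths, shown without running anything.
print(" ".join(snapshot_command("/home", "/var/tmp/snap")))
print(" ".join(dump_command("/dev/rmt/0", "/dev/rfssnap/0")))
```

Deleting the snapshot in a finally block matters because a leftover snapshot keeps consuming backing store space as the filesystem changes.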

Tape Drive Hitches

NetBackup is much more finicky about tape drive errors and cleaning requests than AMANDA ever was. It actually pays attention to them and suspends the jobs if they happen.

Currently working on finding a solution to this problem - fairly often when changing tapes the cleaning light will come on, suspending the current job. There are some scripts that will check for downed drive states and reset them, or up them as appropriate. I’ll be modifying one of these to do that. We’ll see if that eases the problem.

Also trying to get automatic tape cleaning going, but that may not be necessary and may be tricky.

Use vmoprcmd to get status of tape drives and set them up, down, etc. Use tpclean to get info on drive cleanings and to initiate a cleaning.

November 10, 2003

NetBackup policies

Working on a revamp of the policies in NetBackup.

Don’t be afraid of policies. There will be at most three policies for every host: business, user, and system (data types). The schedules for full and incrementals on each will be the same, but the differences will be in the file lists. This is probably the easiest way to separate out the file lists for each host in a scalable manner. It leads to a proliferation of policies, but I think that’s manageable, at least within the framework here. I’ll post a description of the policies in more detail tomorrow.

Tonight I’ve turned NetBackup off, so it won’t attempt to back up anything.

Update, 11/11/03:

As promised, policy descriptions. There are three templates, which are applicable to almost any host. Not every host will have all three of the policies, depending on its mix of user data, business data, and whether the OS is easily re-installable (Jumpstart).

  • ecs_TEMPLATE_business
    business_full_backup: to tape robot, 1 week frequency, 1 month retention
    business_differential_backup: to tape robot, daily frequency, 2 week retention
  • ecs_TEMPLATE_user
    user_full_backup: to tape robot, 1 week frequency, 1 month retention
    user_differential_backup: to tape robot, daily frequency, 2 week retention
  • ecs_TEMPLATE_system
    system_full_backup: to disk, 1 month frequency, 1 month retention
    system_cumulative_backup: 1 week frequency, 1 month retention
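The templates above can be written down as plain data, which makes it easy to see which policies a given host ends up with. This is only a sketch: the per-host naming (ecs_HOST_type) is my assumption based on the template names, and the destination for the cumulative system schedule is left empty because it isn't listed above.

```python
# Schedule definitions copied from the template list above.
TEMPLATES = {
    "business": [
        {"schedule": "business_full_backup", "destination": "tape robot",
         "frequency": "1 week", "retention": "1 month"},
        {"schedule": "business_differential_backup", "destination": "tape robot",
         "frequency": "daily", "retention": "2 weeks"},
    ],
    "user": [
        {"schedule": "user_full_backup", "destination": "tape robot",
         "frequency": "1 week", "retention": "1 month"},
        {"schedule": "user_differential_backup", "destination": "tape robot",
         "frequency": "daily", "retention": "2 weeks"},
    ],
    "system": [
        {"schedule": "system_full_backup", "destination": "disk",
         "frequency": "1 month", "retention": "1 month"},
        {"schedule": "system_cumulative_backup", "destination": None,  # not stated
         "frequency": "1 week", "retention": "1 month"},
    ],
}


def policies_for_host(host, data_types):
    """Return assumed policy names for a host, one per data type it carries."""
    return ["ecs_%s_%s" % (host, kind)
            for kind in ("business", "user", "system")
            if kind in data_types]


# Hypothetical example: a host with user data and a non-Jumpstart OS.
print(policies_for_host("PAX", {"user", "system"}))
```

Since the schedules are identical across hosts, only the file list differs per policy, which is the whole point of splitting things up this way.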

Veritas Links

Web sites with Veritas NetBackup information

November 08, 2003

Veritas NetBackup

Getting started with NetBackup.

Yesterday we discovered that the AMANDA dumps on ROJ were really not working well at all, so I went on a crash install of Veritas on the new backup server. I’ve put clients on PAX, ASHTI, and SITH so far, and I did a trial backup of /home/ldap on ASHTI last night. Nothing there at that time, so it didn’t prove much. I set the policies to have /home/db and /home/webdb on SITH backed up starting at 8 this morning, and it kicked off and finished with what looks like a happy ending. Also in the mix were most of the /home partitions on PAX, and again /home/ldap on ASHTI. PAX is currently working on /home/groups, after having done /home/classes. I won’t really know how it all goes until the whole backup finishes or fails, and hopefully that’ll work ok.

Things to note so far:

  • There’s a bunch of good scripts at www.storagemountain.com (as usual, a good place). I’ve put Curtis Preston’s (as suggested) into /usr/openv/local.
  • NetBackup uses a MySQL database as a backend…
  • Currently I have the Overland library attached to the main external SCSI bus and not the PCI card. While the PROM saw everything attached to the card, Solaris didn’t. It happily saw everything when attached to the built in bus.
  • Use the st and sg drivers. Let Veritas handle configuration of the sg.conf file. We can handle configuration of the st.conf file.
  • NetBackup has a funny concept of the files to include in a backup schedule. In particular, it doesn’t have a way of associating a file (list) with a server. So you end up having to arrange your servers’ filesystems such that the data you want put into a particular schedule is going to be found in a certain place. We already do that to a certain extent, so it’s ok. This also means, though, that if on ASHTI, I only back up /home/ldap with this policy, I get a number of failed reports for /home/classes and /home/groups, etc. which only exist on PAX. The job as a whole doesn’t fail, since it succeeds if any of the things in the file list succeed, but it’s a slightly different way of thinking about things.
  • We need some concurrency/multiplexing tuning. Right now we’re writing one job at a time to tape, and I don’t think this is either necessary or a good use of time. It’ll work for this weekend, but hopefully I’ve set the concurrency and multiplexing stuff for future jobs such that up to 4 jobs can be streamed to tape at once. Must go back and re-read that section of the manual, since it made sense a month or two ago when I first read it…
  • The Java GUI interface is surprisingly useful. The CLI interface is surprisingly well synced with the GUI, but has little to no doco. Will need to read up on that where I can.
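The shared file list behavior noted above is easier to see with a toy model. This is just the behavior as I've described it, not NetBackup's actual logic: one file list is applied to every client in the policy, missing paths are reported as failures, and the job counts as a success if at least one path was backed up.

```python
def job_status(file_list, paths_on_client):
    """Model one client's job: per-path results plus overall success."""
    results = {path: path in paths_on_client for path in file_list}
    return any(results.values()), results


# One file list shared by the whole policy.
file_list = ["/home/ldap", "/home/classes", "/home/groups"]

# ASHTI only has /home/ldap, so the other two paths are reported as failed.
ok, results = job_status(file_list, {"/home/ldap"})
print("job succeeded:", ok)
for path in file_list:
    print(path, "ok" if results[path] else "failed")
```

So a clean overall status on ASHTI still comes with two per-path failures, which is worth remembering when reading the reports.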

November 07, 2003

MT search mods

Modified the MovableType search template to include an IncludeBlogs tag to confine secondary searches to the original blog. ~MT/search_templates/default.tmpl is the file. In the search submission form area, add an input field, type hidden, named IncludeBlogs and set the value to MTBlogID.

November 06, 2003

Veritas NetBackup install

Spent most of the day installing Veritas NetBackup on EYEWI (SunFire V240) and starting to play with it. NB is installed in /home/opt/openv and linked to /usr/openv. The Java console seems to work pretty well, and it discovers attached DLT drives pretty easily. No luck on an old DDS drive, but we're not going to use that, anyway. The tape library support is there, but I didn't get a chance to test it, since the library is still attached to ROJ. Next task should be playing around with adding and removing tape drives and coming up with decent schedules.