Published: 06.08.2012 | Level: specialist | Access: paid
Lecture 12:

The Vinum Volume Manager


Vinum configuration database

Vinum stores configuration information on each drive in essentially the same form as in the configuration files. You can display it with the dumpconfig command. When reading from the configuration database, Vinum recognizes a number of keywords that are not allowed in the configuration files, because they would compromise data integrity. For example, after adding the second plex to myvol, the disk configuration would contain the following text:

vinum -> dumpconfig
Drive a:  Device /dev/da1s2h
          Created on bumble.example.org at Tue Nov 26 14:35:12 2002
          Config last updated Tue Nov 26 16:12:35 2002
          Size:   4293563904 bytes  (4094 MB)
volume myvol state up
plex name myvol.p0 state up org concat vol myvol
plex name myvol.p1 state up org concat vol myvol
sd name myvol.p0.s0 drive a plex myvol.p0 len 1048576s driveoffset 265s state up plexoffset 0s
sd name myvol.p1.s0 drive b plex myvol.p1 len 2097152s driveoffset 265s state up plexoffset 0s
sd name myvol.p0.s1 drive c plex myvol.p0 len 1048576s driveoffset 265s state up plexoffset 1048576s

Drive /dev/da1s2h: 4094 MB (4293563904 bytes)
Drive b:  Device /dev/da2s2h
          Created on bumble.example.org at Tue Nov 26 14:35:27 2002
          Config last updated Tue Nov 26 16:12:35 2002
          Size:   4293563904 bytes (4094 MB)
volume myvol state up
plex name myvol.p0 state up org concat vol myvol
plex name myvol.p1 state up org concat vol myvol
sd name myvol.p0.s0 drive a plex myvol.p0 len 1048576s driveoffset 265s state up plexoffset 0s
sd name myvol.p1.s0 drive b plex myvol.p1 len 2097152s driveoffset 265s state up plexoffset 0s
sd name myvol.p0.s1 drive c plex myvol.p0 len 1048576s driveoffset 265s state up plexoffset 1048576s

The obvious differences here are the presence of explicit location information and naming (both of which are also allowed, but discouraged, for use by the user) and the information on the states (which are not available to the user). Vinum does not store information about drives in the configuration information: it finds the drives by scanning the configured disk drives for partitions with a Vinum label. This enables Vinum to identify drives correctly even if they have been assigned different UNIX drive IDs.
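Because the stored configuration is plain keyword/value text, it is easy to inspect mechanically. The following is a minimal sketch, not part of Vinum, that parses the sd lines shown above and totals the length of each plex. It assumes that the s suffix counts 512-byte sectors; the function names are invented for illustration.

```python
from collections import defaultdict

# Subdisk lines copied from the dumpconfig output above.
DUMP = """\
sd name myvol.p0.s0 drive a plex myvol.p0 len 1048576s driveoffset 265s state up plexoffset 0s
sd name myvol.p1.s0 drive b plex myvol.p1 len 2097152s driveoffset 265s state up plexoffset 0s
sd name myvol.p0.s1 drive c plex myvol.p0 len 1048576s driveoffset 265s state up plexoffset 1048576s
"""

SECTOR_BYTES = 512  # assumption: the "s" suffix counts 512-byte sectors


def parse_sd(line):
    """Turn one 'sd' line into a dict of keyword -> value."""
    words = line.split()[1:]  # drop the leading "sd"
    fields = dict(zip(words[0::2], words[1::2]))
    for key in ("len", "driveoffset", "plexoffset"):
        fields[key] = int(fields[key].rstrip("s"))
    return fields


def plex_sizes(dump):
    """Return the total length in sectors of each plex."""
    sizes = defaultdict(int)
    for line in dump.splitlines():
        if line.startswith("sd "):
            sd = parse_sd(line)
            sizes[sd["plex"]] += sd["len"]
    return dict(sizes)


sizes = plex_sizes(DUMP)
for plex, sectors in sorted(sizes.items()):
    print(plex, sectors, "sectors,", sectors * SECTOR_BYTES // 2**20, "MB")
```

Both plexes come out at the same length, which is what you would expect of two plexes mirroring the same volume.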

When you start Vinum with the vinum start command, Vinum reads the configuration database from one of the Vinum drives. Under normal circumstances, each drive contains an identical copy of the configuration database, so it does not matter which drive is read. After a crash, however, Vinum must determine which drive was updated most recently and read the configuration from this drive. It then updates the configuration, if necessary, from progressively older drives.
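The ordering idea in the previous paragraph can be illustrated with a short sketch. This is not how Vinum itself is implemented: it only sorts drives by the "Config last updated" timestamp that dumpconfig prints. The timestamp for drive c is invented for the example.

```python
from datetime import datetime

# Timestamps in the format of the "Config last updated" lines of dumpconfig.
# Drives a and b use the values shown earlier; drive c is hypothetical.
last_updated = {
    "a": "Tue Nov 26 16:12:35 2002",
    "b": "Tue Nov 26 16:12:35 2002",
    "c": "Tue Nov 26 15:58:01 2002",  # hypothetical: missed the last update
}

FORMAT = "%a %b %d %H:%M:%S %Y"


def read_order(stamps):
    """Return drive names sorted from most to least recently updated."""
    return sorted(
        stamps,
        key=lambda drive: datetime.strptime(stamps[drive], FORMAT),
        reverse=True,
    )


order = read_order(last_updated)
print("read configuration from:", order[0])
print("then check progressively older drives:", order[1:])
```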

Installing FreeBSD on Vinum

Installing FreeBSD on Vinum is complicated by the fact that sysinstall and the loader don't support Vinum, so it is not possible to install directly on a Vinum volume. Instead, you need to install a conventional system and then convert it to Vinum. That's not as difficult as it might sound.

A typical disk installation lays out disk partitions in the following manner:

Typical partition layout without Vinum
da0s3a: / file system
da0s3c: entire disk
da0s3d: swap
da0s3e: /usr file system
da0s3f: /var file system

This layout shows three file system partitions and a swap partition, which is not the layout recommended on page 68. We'll look at the reasons for this below.

Each partition corresponds logically to a Vinum subdisk. You could enclose all these subdisks in a Vinum drive. The only problem is that Vinum stores its configuration information at the beginning of the drive, and that's where the root file system is. One way to solve this problem is to put the swap partition first and make it 265 sectors longer than needed. You can do this from sysinstall simply by creating the swap partition before any other partition. Consider installing FreeBSD on a 4 GB drive. Create, in sequence, a swap partition of 256 MB, a root file system of 256 MB, a /usr file system of 2 GB, and a /var file system to take up the rest. It's important to create the swap partition at the beginning of the disk, so you create that first. After installation, the output of bsdlabel looks like this:

8 partitions:
#        size   offset  fstype  [fsize  bsize bps/cpg]
  a:   524288   532480  4.2BSD    2048  16384   94
  b:   532215      265    swap
  c:  8386733        0  unused       0      0     #"raw" part, don't edit
  e:  4194304  1056768  4.2BSD    2048  16384   89
  f:  3135661  5251072  4.2BSD    2048  16384   89
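The arithmetic behind this label can be checked with a short sketch: the partitions should follow one another without gaps, start after the 265 sectors reserved for Vinum, and end exactly at the end of the c partition. The sizes and offsets are taken from the output above; the function name is invented for illustration.

```python
# (partition, size, offset) in sectors, from the bsdlabel output above.
LABEL = [
    ("a", 524288, 532480),
    ("b", 532215, 265),
    ("e", 4194304, 1056768),
    ("f", 3135661, 5251072),
]
WHOLE_DISK = 8386733  # size of the c partition
VINUM_HEADER = 265    # sectors reserved at the start of the drive


def check_layout(label, total, header):
    """Return a list of problems; an empty list means the layout is consistent."""
    problems = []
    expected = header
    for name, size, offset in sorted(label, key=lambda p: p[2]):
        if offset != expected:
            problems.append(f"{name}: starts at {offset}, expected {expected}")
        expected = offset + size
    if expected != total:
        problems.append(f"layout ends at {expected}, disk has {total} sectors")
    return problems


problems = check_layout(LABEL, WHOLE_DISK, VINUM_HEADER)
print(problems or "layout is contiguous and leaves 265 sectors free at the start")
```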

To convert to Vinum, use bsdlabel with the -e (edit label) option to add a partition of type vinum that maps the c partition:

  h:  8386733        0   vinum

After this, you have the following situation:

Partition layout with Vinum
da0s3c: entire disk
da0s3h: vinum drive
da0s3b: swap
da0s3a: / file system
da0s3e: /usr file system
da0s3f: /var file system

The Vinum configuration information occupies the start of the Vinum partition, so it cuts into the swap partition. To fix that, we redefine the swap partition to start after the Vinum configuration information and to be 265 sectors shorter. The file systems are relatively trivial to recreate: take the size and offset values from the bsdlabel output above and use them in a Vinum configuration file:

drive rootdev device /dev/da0s2h
volume swap
  plex org concat
#  b:     532215                  265          swap
  sd len  532215s    driveoffset  265s      drive rootdev
volume root
  plex org concat
#  a:     524288                  532480       4.2BSD  2048  16384  94
  sd len  524288s    driveoffset  532480s   drive rootdev
volume usr
  plex org concat
#  e:     4194304                 1056768      4.2BSD  2048  16384  89
  sd len  4194304s   driveoffset  1056768s   drive rootdev
volume var
  plex org concat
#  f:     3135661                 5251072      4.2BSD  2048  16384  89
  sd len  3135661s   driveoffset  5251072s   drive rootdev

The comments are the corresponding lines from the bsdlabel output; they show the values for size and offset. Run vinum create against this file, and confirm that you have the volumes swap, root, usr and var, which correspond to the swap partition and the /, /usr and /var file systems.
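Copying sizes and offsets by hand is where mistakes creep in, so it can help to generate the configuration from the label. The sketch below is one way to do that, assuming the partition-to-volume mapping used in this example; the function name and the VOLUMES table are invented for illustration, and the result should still be reviewed before it is passed to vinum create.

```python
BSDLABEL = """\
  a:   524288   532480  4.2BSD    2048  16384   94
  b:   532215      265    swap
  c:  8386733        0  unused       0      0
  e:  4194304  1056768  4.2BSD    2048  16384   89
  f:  3135661  5251072  4.2BSD    2048  16384   89
"""

# Assumed mapping from partition letter to Vinum volume name.
VOLUMES = {"b": "swap", "a": "root", "e": "usr", "f": "var"}


def vinum_config(label, volumes, drive="rootdev", device="/dev/da0s2h"):
    """Build a configuration with one single-subdisk volume per partition."""
    parts = {}
    for line in label.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0].endswith(":"):
            # fields[1] is the size, fields[2] the offset, both in sectors
            parts[fields[0][:-1]] = (int(fields[1]), int(fields[2]))
    out = [f"drive {drive} device {device}"]
    for letter, volume in volumes.items():
        size, offset = parts[letter]
        out.append(f"volume {volume}")
        out.append("  plex org concat")
        out.append(f"  sd len {size}s driveoffset {offset}s drive {drive}")
    return "\n".join(out)


config = vinum_config(BSDLABEL, VOLUMES)
print(config)
```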

Next, ensure that you are set up to start Vinum with the new method. You should have the following lines in /boot/loader.conf:

vinum_load="YES"
vinum.autostart="YES"

Then reboot to single-user mode, start Vinum and run fsck against the volumes, using the -n option to tell fsck not to correct any errors it finds. You should see something like this:

# fsck -n -t ufs /dev/vinum/usr
** /dev/vinum/usr (NO WRITE)
** Last Mounted on /usr
** Phase 1 - Check Blocks and Sizes
** Phase 2 - Check Pathnames
** Phase 3 - Check Connectivity
** Phase 4 - Check Reference Counts
** Phase 5 - Check Cyl groups
35323 files, 314115 used, 718036 free (4132 frags, 89238 blocks, 0.4% fragmentation)

If there are any errors, they will probably be because you have miscalculated size or offset. You'll see something like this:

# fsck -n -t ufs /dev/vinum/usr
** /dev/vinum/usr (NO WRITE) 
Cannot find file system superblock
/dev/vinum/usr: CANNOT FIGURE OUT FILE SYSTEM PARTITION

You need to do this in single-user mode because the volumes are shadowing file systems, and it's normal for open file systems to fail fsck, since some of the state is in buffer cache.

If all is well, remount the root file system read-write:

# mount -u /

Then edit /etc/fstab to point to the new devices. For this example, /etc/fstab might initially contain:

# $Id: fstab,v 1.3 2002/11/14 06:48:16 grog Exp $
# Device     Mountpoint  FStype  Options  Dump Pass#
/dev/da0s4a  /           ufs     rw       1    1
/dev/da0s4b  none        swap    sw       0    0
/dev/da0s4e  /usr        ufs     rw       1    1
/dev/da0s4f  /var        ufs     rw       1    1

Change it to reflect the Vinum volumes:

# $Id: fstab,v 1.3 2002/11/14 06:48:16 grog Exp $
# Device         Mountpoint  FStype  Options  Dump Pass#
/dev/vinum/swap  none        swap    sw       0    0
/dev/vinum/root  /           ufs     rw       1    1
/dev/vinum/usr   /usr        ufs     rw       1    1
/dev/vinum/var   /var        ufs     rw       1    1

Then reboot again to mount the root file system from /dev/vinum/root. You can also optionally remove all the UFS partitions except the root partition. The loader doesn't know about Vinum, so it must boot from the UFS partition.
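The /etc/fstab change described above is a one-for-one replacement of device names, which the following sketch illustrates. It works on a string rather than on the real file, and the DEVICE_MAP table simply restates the mapping used in this example. The point is that each mount point ends up with exactly one entry, so no file system is listed under both its partition and its volume.

```python
FSTAB = """\
# Device     Mountpoint  FStype  Options  Dump Pass#
/dev/da0s4a  /           ufs     rw       1    1
/dev/da0s4b  none        swap    sw       0    0
/dev/da0s4e  /usr        ufs     rw       1    1
/dev/da0s4f  /var        ufs     rw       1    1
"""

# Partition device -> Vinum volume device, following the example above.
DEVICE_MAP = {
    "/dev/da0s4a": "/dev/vinum/root",
    "/dev/da0s4b": "/dev/vinum/swap",
    "/dev/da0s4e": "/dev/vinum/usr",
    "/dev/da0s4f": "/dev/vinum/var",
}


def rewrite_fstab(text, device_map):
    """Replace the device field of each entry, leaving comments untouched."""
    result = []
    for line in text.splitlines():
        fields = line.split()
        if fields and not line.startswith("#") and fields[0] in device_map:
            fields[0] = device_map[fields[0]]
            line = "{:<16} {:<11} {:<7} {:<8} {:<4} {}".format(*fields)
        result.append(line)
    return "\n".join(result) + "\n"


new_fstab = rewrite_fstab(FSTAB, DEVICE_MAP)
print(new_fstab, end="")
```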

Once you have reached this stage, you can add additional plexes to the volumes, or you can extend the plexes (and thus the size of the file system) by adding subdisks to the plexes, as discussed on page 229.

Recovering from drive failures

One of the purposes of Vinum is to be able to recover from hardware problems. If you have chosen a redundant storage configuration, the failure of a single component will not stop the volume from working. In many cases, you can replace the components without downtime.

If a drive fails, perform the following steps:

  1. Replace the physical drive.
  2. Partition the new drive. Some restrictions apply:
    • If you have hot-plugged the drive, it must have the same ID, the Vinum drive must be on the same partition, and it must have the same size.
    • If you have had to stop the system to replace the drive, the old drive will not be associated with a device name, and you can put it anywhere. Vinum currently does not compact free space when replacing a drive, so create a Vinum partition that is at least large enough to take all the subdisks in their original positions on the drive. An easy way to ensure this is to make the new drive at least as large as the old drive.

      If you want to have this freedom with a hot-pluggable drive, you must stop Vinum and restart it.

  3. If you have restarted Vinum, create a new drive. For example, if the replacement drive data3 is on the physical partition /dev/da3s1h, create a configuration file, say configfile, with the single line:
    drive data3 device /dev/da3s1h
    

    Then enter:

    # vinum create configfile
    
  4. Start the plexes that were down. For example, vinum list might show:
    vinum -> l -r test
    V test             State: up        Plexes:       2 Size:  30 MB
    P test.p0        C State: up        Subdisks:     1 Size:  30 MB
    P test.p1        C State: faulty    Subdisks:     1 Size:  30 MB
    S test.p0.s0       State: up        PO:        0  B Size:  30 MB
    S test.p1.s0       State: obsolete  PO:        0  B Size:  30 MB
    vinum -> start test.p1.s0
    Reviving test.p1.s0 in the background
    vinum -> vinum[295]: reviving test.p1.s0        (this message appears after the prompt)
    (some time later)
    vinum[295]: test.p1.s0 is up
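The size restriction in step 2 comes down to simple arithmetic: because subdisks keep their original offsets, the replacement partition must reach at least as far as the end of the last subdisk. The sketch below illustrates the check. The subdisk offsets and lengths are hypothetical and would in practice come from the saved configuration of the failed drive.

```python
# (subdisk, driveoffset, length) in sectors for the failed drive.
# Hypothetical values: take the real ones from the saved configuration.
SUBDISKS = [
    ("test.p1.s0", 265, 61440),       # 30 MB at 512 bytes per sector
    ("data.p0.s2", 61705, 1048576),   # hypothetical second subdisk
]


def required_sectors(subdisks):
    """Smallest drive size that keeps every subdisk at its original offset."""
    return max(offset + length for _, offset, length in subdisks)


def fits(subdisks, new_partition_sectors):
    """True if the replacement partition can hold all subdisks in place."""
    return new_partition_sectors >= required_sectors(subdisks)


need = required_sectors(SUBDISKS)
print("replacement needs at least", need, "sectors")
print("partition of 8386733 sectors fits:", fits(SUBDISKS, 8386733))
```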
    
Failed boot disk

If you're running your root file system on a Vinum volume, you can survive the failure of the boot volume if it is mirrored with at least two concatenated plexes each containing only one subdisk. Under normal circumstances, you can carry on running as if nothing had happened, but obviously you will no longer be able to reboot from that disk. Instead, boot from the other disk.

The root file system also has individual UFS partitions, so you have a choice of what you mount. For example, if your root file system has UFS partitions /dev/da0s4a and /dev/da1s4a, you can mount either of these partitions or /dev/vinum/root. Never mount more than one of them, otherwise you can cause data corruption.

An even more insidious way to corrupt the root file system is to mount /dev/da0s4a or /dev/da1s4a and modify it. In this case, the two partitions are no longer the same, but there's no way for Vinum to know that. If this happens, you must mark the other subdisk as crashed with the vinum stop command.

Migrating Vinum to a new machine

Sometimes you might want to move a set of Vinum disks to a different FreeBSD machine. This is simple, as long as there are no name conflicts between the objects on the Vinum disks and any other Vinum objects you may already have on the system. Simply connect the disks and start Vinum. You don't need to put the disks in any particular location, and you don't need to run vinum create: Vinum stores the configuration on the drives themselves, and when it starts, it locates it accordingly.
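The only precondition named above is the absence of name conflicts, which is a set intersection. A minimal sketch, assuming you have collected the object names from both machines (for example from their vinum list output); all names below are hypothetical.

```python
def name_conflicts(local_objects, incoming_objects):
    """Return the object names that exist on both sets of disks."""
    return sorted(set(local_objects) & set(incoming_objects))


# Hypothetical object names on the destination machine and on the disks
# being moved to it.
local = ["rootdev", "root", "root.p0", "root.p0.s0", "usr", "var"]
incoming = ["a", "b", "myvol", "myvol.p0", "myvol.p1", "usr"]

conflicts = name_conflicts(local, incoming)
print("conflicting names:", conflicts or "none")
```

Any name reported here would have to be resolved before the disks are connected to the new machine.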

Things you shouldn't do with Vinum

The vinum command offers a large number of subcommands intended for specific purposes. It's easy to abuse them. Here are some things you should not do:

  • Do not use the resetconfig command unless you genuinely don't want to see any of your configuration again. There are other alternatives, such as rm, which removes individual objects or groups of objects.
  • Do not re-run the create command for objects that already exist. Vinum already knows about them, and the start command should find them.
  • Do not name your drives after the disk device on which they are located. The purpose of having drive names is to be device independent. For example, if you have two drives a and b, and they are located on devices /dev/da1s1h and /dev/da2s1h respectively, you can remove the drives, swap their locations and restart Vinum. Vinum will still correctly locate the drives. If you had called the drives da1 and da2, you would then see something confusing like this:
    2 drives:
    D da2  State: up  /dev/da1s1h  A: 3582/4094 MB (87%)
    D da1  State: up  /dev/da2s1h  A: 3582/4094 MB (87%)
    

    This is clearly not helpful.

  • Don't put more than one drive on a physical disk. Each drive contains two copies of the Vinum configuration, and both updating the configuration and starting Vinum slow down as a result. If you want more than one file system to occupy space on a physical drive, create subdisks, not drives.