OpenStack:Ceph-COI-Installation

Installing a ceph cluster and configuring rbd-backed cinder volumes.

First steps

  • Install your build server
  • Run puppet_modules.py to download the necessary puppet modules
  • Edit site.pp to fit your configuration.
  • You must define one MON and at least one OSD to use Ceph.
  • It is recommended that you zero the first several blocks on each disk that will be used for Ceph OSD data storage. This step is required if you're using disks that have been used in a previous Ceph deployment. The command "dd if=/dev/zero of=/dev/sdX bs=100M count=1" (with "sdX" replaced with an appropriate device name) will suffice.
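If several data disks need to be zeroed at once, a short shell loop will do it (a minimal sketch; /dev/sdb and /dev/sdc below are placeholders for your actual OSD data disks):

# Zero the first blocks of each disk that will be handed to a Ceph OSD.
# WARNING: this destroys data on the named disks; double-check the device names.
for disk in /dev/sdb /dev/sdc; do
    dd if=/dev/zero of="$disk" bs=100M count=1
done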

Choosing Your Configuration

The Cisco COI Grizzly g.1 release supports standalone Ceph nodes only; follow only those instructions. The g.2 release supports both standalone and integrated deployments. The integrated options allow you to run MON on control and compute servers, along with OSD on compute servers. You can also use standalone cinder-volume nodes as OSD servers.


For all ceph configurations, uncomment the following in site.pp and change the values as appropriate for your deployment:


$ceph_auth_type         = 'cephx'
$ceph_monitor_fsid      = 'e80afa94-a64c-486c-9e34-d55e85f26406'
$ceph_monitor_secret    = 'AQAJzNxR+PNRIRAA7yUp9hJJdWZ3PVz242Xjiw=='
$ceph_monitor_port      = '6789'
$ceph_monitor_address   = $::ipaddress
$ceph_cluster_network   = '10.0.0.0/24'
$ceph_public_network    = '10.0.0.0/24'
$ceph_release           = 'cuttlefish'
$cinder_rbd_user        = 'admin'
$cinder_rbd_pool        = 'volumes'
$cinder_rbd_secret_uuid = 'e80afa94-a64c-486c-9e34-d55e85f26406'

Exec {
  path        => '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin',
#  environment => "https_proxy=$::proxy",
}
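The example above reuses the same UUID for $ceph_monitor_fsid and $cinder_rbd_secret_uuid. If you want your own unique value rather than the sample one, you can generate a fresh UUID with uuidgen (the output below is only an illustration; yours will differ):

# Generate a random UUID to paste into site.pp
uuidgen -r
# e.g. 3f9c2a6e-1d44-4f0e-8c2b-6b9d2f1a7c55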

Ceph Standalone Node Deployment

Support for standalone Ceph nodes is available in g.1 and newer releases. Configure the cobbler node entries for your Ceph servers.

Uncomment or add the Puppet Ceph node entries as shown below. This entry is necessary on the first Mon node only:

  if !empty($::ceph_admin_key) {
    @@ceph::key { 'admin':
      secret       => $::ceph_admin_key,
      keyring_path => '/etc/ceph/keyring',
    }
  }

  class {'ceph_mon': id => 0 }

On any additional Mon nodes, you only need the following:

class {'ceph_mon': id => 1 }

YOU MUST INCREMENT THIS ID, AND IT MUST BE UNIQUE TO EACH MON!

Ceph osd nodes need the following:

  class { 'ceph::conf':
    fsid            => $::ceph_monitor_fsid,
    auth_type       => $::ceph_auth_type,
    cluster_network => $::ceph_cluster_network,
    public_network  => $::ceph_public_network,
  }
  class { 'ceph::osd':
    public_address  => '10.0.0.3',
    cluster_address => '10.0.0.3',
  }
  ceph::osd::device { '/dev/sdb': }

Change '/dev/sdb' to the disk you want to use. You may specify additional disks by duplicating the line and changing the device name.
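If you are not sure which device names to use, a quick way to list the disks present on an OSD node (an optional check, not a deployment step) is:

# Show whole disks only, with size, type, and any current mountpoint
lsblk -d -o NAME,SIZE,TYPE,MOUNTPOINT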

Ceph MON on the Controller Only and OSD on All Compute Nodes

This option is available in g.2 and newer releases. Uncomment the following in your site.pp file:

$controller_has_mon = true
$osd_on_compute = true

Uncomment the following in your control server puppet node definition:

    if !empty($::ceph_admin_key) {
    @@ceph::key { 'admin':
      secret       => $::ceph_admin_key,
      keyring_path => '/etc/ceph/keyring',
    }
    }

    # each MON needs a unique id, you can start at 0 and increment as needed.
    class {'ceph_mon': id => 0 }

Add the following to each compute server puppet node definition:

  class { 'ceph::conf':
    fsid            => $::ceph_monitor_fsid,
    auth_type       => $::ceph_auth_type,
    cluster_network => $::ceph_cluster_network,
    public_network  => $::ceph_public_network,
  }
  class { 'ceph::osd':
    public_address  => '10.0.0.3',
    cluster_address => '10.0.0.3',
  }
  # Specify the disk devices to use for OSD here.
  # Add a new entry for each device on the node that ceph should consume.
  # puppet agent will need to run four times for the device to be formatted,
  #   and for the OSD to be added to the crushmap.
  ceph::osd::device { '/dev/sdb': }

Ceph Multiple MON On Specified Controller and Compute Nodes, with OSD on Separate Compute Nodes

This option is available in g.2 and newer releases. You cannot co-locate MONs and OSDs on the same server in this use case.

Uncomment the following:

$controller_has_mon = true
$computes_have_mons = false

Uncomment the following in your control server puppet node definition:

    if !empty($::ceph_admin_key) {
    @@ceph::key { 'admin':
      secret       => $::ceph_admin_key,
      keyring_path => '/etc/ceph/keyring',
    }
    }

    # each MON needs a unique id, you can start at 0 and increment as needed.
    class {'ceph_mon': id => 0 }
 

For each additional mon on a compute node, add the following:

  # each MON needs a unique id, you can start at 0 and increment as needed.
  class {'ceph_mon': id => 0 }
  
For each compute node that does NOT contain a mon, you can specify the OSD configuration:
  class { 'ceph::conf':
    fsid            => $::ceph_monitor_fsid,
    auth_type       => $::ceph_auth_type,
    cluster_network => $::ceph_cluster_network,
    public_network  => $::ceph_public_network,
  }
  class { 'ceph::osd':
    public_address  => '10.0.0.3',
    cluster_address => '10.0.0.3',
  }
  # Specify the disk devices to use for OSD here.
  # Add a new entry for each device on the node that ceph should consume.
  # puppet agent will need to run four times for the device to be formatted,
  #   and for the OSD to be added to the crushmap.
  ceph::osd::device { '/dev/sdb': }

Ceph MON and OSD on the Same Nodes

This feature will be available in g.3 and later releases. It is not supported in g.2 or earlier.

WARNING: YOU MUST HAVE AN ODD NUMBER OF MON NODES.

You can have as many OSD nodes as you like, but the number of MON nodes must be odd to reach quorum.

First, uncomment the ceph_combo line in site.pp:

# Another alternative is to run MON and OSD on the same node. Uncomment
# $ceph_combo to enable this feature. You will NOT need to enable
# $osd_on_compute, $controller_has_mon, or $computes_have_mons for this
# feature. You will need to specify the normal MON and OSD definitions
# for each puppet node as usual.
$ceph_combo = true

Do NOT uncomment these variables:

# $controller_has_mon = true
# $osd_on_compute = true
# $computes_have_mons = false

You will need to specify the normal MON and OSD definitions for each puppet node as usual:

node 'compute-server01' inherits os_base {
  class { 'compute':
    internal_ip        => '192.168.242.21',
    #enable_dhcp_agent => true,
    #enable_l3_agent   => true,
    #enable_ovs_agent  => true,
  }

  # If you want to run ceph mon0 on your controller node, uncomment the
  # following block. Be sure to read all additional ceph-related
  # instructions in this file.
  # Only mon0 should export the admin keys.
  # This means the following if statement is not needed on the additional
  # mon nodes.
  if !empty($::ceph_admin_key) {
    @@ceph::key { 'admin':
      secret       => $::ceph_admin_key,
      keyring_path => '/etc/ceph/keyring',
    }
  }

  # Each MON needs a unique id, you can start at 0 and increment as needed.
  class {'ceph_mon': id => 0 }

  class { 'ceph::osd':
    public_address  => '192.168.242.21',
    cluster_address => '192.168.242.21',
  }

  # Specify the disk devices to use for OSD here.
  # Add a new entry for each device on the node that ceph should consume.
  # puppet agent will need to run four times for the device to be formatted,
  #   and for the OSD to be added to the crushmap.
  ceph::osd::device { '/dev/sdb': }
}

Making a standalone OSD node in a combined node environment

Add the following to your puppet OSD node in site.pp:

  class { 'ceph::conf':
    fsid => $::ceph_monitor_fsid,
  }

Deploying a Standalone Cinder Volume OSD node

This option is available in g.1 and newer releases. For each puppet node definition, add the following:


  # if you are using rbd, uncomment the following ceph classes
  class { 'ceph::conf':
    fsid            => $::ceph_monitor_fsid,
    auth_type       => $::ceph_auth_type,
    cluster_network => $::ceph_cluster_network,
    public_network  => $::ceph_public_network,
  }
  class { 'ceph::osd':
    public_address  => '192.168.242.22',
    cluster_address => '192.168.242.22',
  }
  # Specify the disk devices to use for OSD here.
  # Add a new entry for each device on the node that ceph should consume.
  ceph::osd::device { '/dev/sdd': }

Configuring Glance to use Ceph

This option is available in g.1 and newer releases. Uncomment the following:

# $glance_ceph_enabled = true
# $glance_ceph_user    = 'admin'
# $glance_ceph_pool    = 'images'

and change $glance_backend to 'rbd':

# Glance backend configuration, supports 'file', 'swift', or 'rbd'.
$glance_backend      = 'rbd'
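After the puppet agent has applied this change on the control server, you can sanity-check the result. This assumes the standard /etc/glance/glance-api.conf location used by the Ubuntu Grizzly packages:

# The rbd backend settings should now appear in the glance API config
grep -i rbd /etc/glance/glance-api.conf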

Configuring Cinder to use Ceph

This option is available in g.1 and newer releases. Uncomment the following:

# The cinder_ceph_enabled configures cinder to use rbd-backed volumes.
# $cinder_ceph_enabled           = true

Then change $cinder_storage_driver to 'rbd':

$cinder_storage_driver         = 'rbd'
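Similarly, once puppet has run on the node hosting cinder-volume, you can confirm the driver change took effect. This assumes the standard /etc/cinder/cinder.conf location:

# Look for the rbd volume driver and related rbd_* settings
grep -iE 'volume_driver|rbd_' /etc/cinder/cinder.conf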

Ceph Node Installation and Testing

If you do not set puppet to autostart in the site.pp, you will have to run the agent manually as shown here. Regardless of the start method, the agent must run at least four times on each node running any Ceph services in order for Ceph to be properly configured.

  • First bring up the mon0 node and run:
apt-get update
run 'puppet agent -t -v --no-daemonize' at least four times
  • Then bring up the OSD node(s) and run:
 apt-get update
run 'puppet agent -t -v --no-daemonize' at least four times
  • The ceph cluster will now be up. You can verify this by logging in to the mon0 node and running the 'ceph status' command. The "monmap" line should show 1 or more mons (depending on the number you configured). The "osdmap" line should show 1 or more OSDs (depending on the number you configured), and each OSD should be marked as "up". There will be one OSD per configured disk; e.g., if you have a single OSD node with three disks available for ceph, 3 OSDs will show up in 'ceph status'. Some additional verification commands are shown at the end of this section.
$ ceph status
health HEALTH_WARN 320 pgs degraded; 320 pgs stuck unclean; recovery 2/4 degraded (50.000%)
monmap e1: 1 mons at {0=192.168.2.71:6789/0}, election epoch 2, quorum 0 0
osdmap e7: 1 osds: 1 up, 1 in
pgmap v17: 320 pgs: 320 active+degraded; 138 bytes data, 4131 MB used, 926 GB / 930 GB avail; 0B/s rd, 11B/s wr, 0op/s; 2/4 degraded (50.000%)
mdsmap e1: 0/0/1 up
  • If your OSD is not marked as up, you will NOT be able to create block storage until it is.
  • Note: If you are using a disk that was previously used as an OSD device, you must write zeros to the drive. Do this by running:
dd if=/dev/zero of=/dev/DISK bs=1M count=100

If you do not zero the disk, your OSD installation will fail.

  • Installing your compute nodes will run the necessary commands to create the volumes pool and the client.volumes account.
  • Your compute nodes will be automatically configured to use ceph for block storage.
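A few extra checks can be run from mon0 once the cluster is up (optional; these are standard ceph commands rather than steps from this deployment):

# Show which monitors are in quorum
ceph quorum_status
# Show the OSD tree; every OSD should be listed as "up"
ceph osd tree
# After the compute nodes have run puppet, the 'volumes' pool and the
# client.volumes account should exist
ceph osd lspools
ceph auth list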

Testing

Testing Cinder

  • Once these steps are complete, you should be able to create an rbd-backed volume and attach it to an instance as normal.
root@control-server01:~# cinder create --display-name myvol 4
+---------------------+--------------------------------------+
|       Property      |                Value                 |
+---------------------+--------------------------------------+
|     attachments     |                  []                  |
|  availability_zone  |                 nova                 |
|       bootable      |                false                 |
|      created_at     |      2013-09-25T14:49:17.484197      |
| display_description |                 None                 |
|     display_name    |                 myvol                |
|          id         | b6d78eb1-e7d7-453d-969b-1b8a23afdd38 |
|       metadata      |                  {}                  |
|         size        |                  4                   |
|     snapshot_id     |                 None                 |
|     source_volid    |                 None                 |
|        status       |               creating               |
|     volume_type     |                 None                 |
+---------------------+--------------------------------------+
root@control-server01:~#
root@control-server01:~# cinder list
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
|                  ID                  |   Status  | Display Name | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
| b6d78eb1-e7d7-453d-969b-1b8a23afdd38 | available |    myvol     |  4   |     None    |  false   |                                      |
+--------------------------------------+-----------+--------------+------+-------------+----------+--------------------------------------+
root@control-server01:~# 

Note the volume's ID in the output above.

Check Ceph to see that the new volume exists:

root@control-server01:~# rbd --pool volumes ls
volume-b6d78eb1-e7d7-453d-969b-1b8a23afdd38
root@control-server01:~# 

This command returns a list of volume names containing UUIDs; you should see the one matching the ID from the cinder output above. This is your volume.

  • For a moment, depending on the speed of your ceph cluster, "cinder list" will show the volume status as "creating".
  • After it's created, it should mark the volume "available".
  • Failure states will either be "error" or an indefinite "creating" status. If this is the case, check /var/log/cinder/cinder-volume.log for any errors.
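For example, to look for errors on the node running cinder-volume:

# Show the most recent errors from the cinder-volume log
grep -i error /var/log/cinder/cinder-volume.log | tail -n 20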

Next, you can attach the volume to a running instance. First, use the "nova list" command to find the UUID of the instance to which you want to attach the volume. Then use the "nova volume-attach [instance id] [volume id] auto" command to attach the volume to the instance.

root@control-server01:~# nova list
+--------------------------------------+---------------+--------+------------+-------------+------------------+
| ID                                   | Name          | Status | Task State | Power State | Networks         |
+--------------------------------------+---------------+--------+------------+-------------+------------------+
| 68070cdd-8953-4307-99dc-6f346c876f65 | cirros_test1  | ACTIVE | None       | Running     | net10=10.10.10.4 |
| 7876fb8c-a202-421e-8711-c15362c9699f | precise_test1 | ACTIVE | None       | Running     | net10=10.10.10.2 |
+--------------------------------------+---------------+--------+------------+-------------+------------------+
root@control-server01:~# nova volume-attach 68070cdd-8953-4307-99dc-6f346c876f65 b6d78eb1-e7d7-453d-969b-1b8a23afdd38 auto
+----------+--------------------------------------+
| Property | Value                                |
+----------+--------------------------------------+
| device   | /dev/vdb                             |
| serverId | 68070cdd-8953-4307-99dc-6f346c876f65 |
| id       | b6d78eb1-e7d7-453d-969b-1b8a23afdd38 |
| volumeId | b6d78eb1-e7d7-453d-969b-1b8a23afdd38 |
+----------+--------------------------------------+
root@control-server01:~# 

The "cinder list" command should now show the volume's status as "in-use" and it should show the UUID of the instance in the "Attached to" field:

root@control-server01:~# cinder list
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
|                  ID                  | Status | Display Name | Size | Volume Type | Bootable |             Attached to              |
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
| b6d78eb1-e7d7-453d-969b-1b8a23afdd38 | in-use |    myvol     |  4   |     None    |  false   | 68070cdd-8953-4307-99dc-6f346c876f65 |
+--------------------------------------+--------+--------------+------+-------------+----------+--------------------------------------+
root@control-server01:~# 

You can now log in to the instance, partition the volume, and create a filesystem on it. First, we'll need to SSH into the instance. We know its IP address from the "nova list" output above. Use the "quantum router-list" command to find the namespace we'll need to use when calling SSH.

root@control-server01:~# quantum router-list
+--------------------------------------+---------+--------------------------------------------------------+
| id                                   | name    | external_gateway_info                                  |
+--------------------------------------+---------+--------------------------------------------------------+
| fc39be0f-b963-45ba-9da8-9874d2924d08 | router1 | {"network_id": "e0692be1-af70-478f-9dc9-c550f8a73231"} |
+--------------------------------------+---------+--------------------------------------------------------+
root@control-server01:~# ip netns exec qrouter-fc39be0f-b963-45ba-9da8-9874d2924d08 ssh -i ~/.ssh/id_rsa cirros@10.10.10.4
The authenticity of host '10.10.10.4 (10.10.10.4)' can't be established.
RSA key fingerprint is 70:ba:13:5e:cf:15:92:8f:30:dd:7d:aa:ac:fa:9f:48.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '10.10.10.4' (RSA) to the list of known hosts.
$ 

Note from the output of "nova volume-attach" above that the volume was attached as device "/dev/vdb". We can treat that as an ordinary hard drive by partitioning it, creating a filesystem on it, and mounting it:

$ sudo su -
# fdisk /dev/vdb
Device contains neither a valid DOS partition table, nor Sun, SGI or OSF disklabel
Building a new DOS disklabel with disk identifier 0x07505fb0.
Changes will remain in memory only, until you decide to write them.
After that, of course, the previous content won't be recoverable.

Warning: invalid flag 0x0000 of partition table 4 will be corrected by w(rite)

Command (m for help): n
Partition type:
   p   primary (0 primary, 0 extended, 4 free)
   e   extended
Select (default p): p
Partition number (1-4, default 1): 
Using default value 1
First sector (2048-8388607, default 2048): 
Using default value 2048
Last sector, +sectors or +size{K,M,G} (2048-8388607, default 8388607): 
Using default value 8388607

Command (m for help): w
The partition table has been altered!

Calling ioctl() to re-read partition table.
Syncing disks.
# mkfs.ext4 /dev/vdb1
mke2fs 1.42.2 (27-Mar-2012)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
262144 inodes, 1048320 blocks
52416 blocks (5.00%) reserved for the super user
First data block=0
Maximum filesystem blocks=1073741824
32 block groups
32768 blocks per group, 32768 fragments per group
8192 inodes per group
Superblock backups stored on blocks: 
	32768, 98304, 163840, 229376, 294912, 819200, 884736

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (16384 blocks): done
Writing superblocks and filesystem accounting information: done 

# mkdir /tmp/mount
# mount /dev/vdb1 /tmp/mount
# df -h
Filesystem                Size      Used Available Use% Mounted on
/dev                    242.4M         0    242.4M   0% /dev
/dev/vda1                23.2M     17.9M      4.1M  81% /
tmpfs                   245.9M         0    245.9M   0% /dev/shm
tmpfs                   200.0K     76.0K    124.0K  38% /run
/dev/vdb1                 3.9G     72.0M      3.7G   2% /tmp/mount
# 
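If you want to clean up after the test (optional, and not part of the original walkthrough), unmount the filesystem inside the instance, then detach and delete the volume from the control node. The UUIDs below are the ones from the example output above:

# Inside the instance
umount /tmp/mount

# On the control node: detach the volume from the instance, then delete it
nova volume-detach 68070cdd-8953-4307-99dc-6f346c876f65 b6d78eb1-e7d7-453d-969b-1b8a23afdd38
cinder delete b6d78eb1-e7d7-453d-969b-1b8a23afdd38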

Testing Glance

Download an image and add it to glance:

wget http://cloud-images.ubuntu.com/precise/current/precise-server-cloudimg-amd64-disk1.img
glance add name="precise-x86_64" is_public=true container_format=ovf disk_format=qcow2 < precise-server-cloudimg-amd64-disk1.img
wget http://download.cirros-cloud.net/0.3.1/cirros-0.3.1-x86_64-disk.img
glance add name="cirros-x86_64" is_public=true disk_format=qcow2 container_format=ovf < cirros-0.3.1-x86_64-disk.img

Check that the image is known to glance and to Ceph:

root@control-server01:~# glance image-list
+--------------------------------------+----------------+-------------+------------------+-----------+--------+
| ID                                   | Name           | Disk Format | Container Format | Size      | Status |
+--------------------------------------+----------------+-------------+------------------+-----------+--------+
| 02315102-8422-44bb-84d2-a7e28029e152 | cirros-x86_64  | qcow2       | ovf              | 13147648  | active |
| bee406a9-1570-4ed6-b0d4-d1394297ae58 | precise-x86_64 | qcow2       | ovf              | 253100032 | active |
+--------------------------------------+----------------+-------------+------------------+-----------+--------+
root@control-server01:~# rbd --pool images ls
02315102-8422-44bb-84d2-a7e28029e152
bee406a9-1570-4ed6-b0d4-d1394297ae58
root@control-server01:~# 

As with Cinder, you should see a matching UUID in the glance image-list and rbd output. This is your image stored in Ceph.
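If you want more detail on how one of the images is stored (optional), rbd info can be run against any of the UUIDs listed above:

# Inspect one of the glance images stored in the 'images' pool
rbd --pool images info 02315102-8422-44bb-84d2-a7e28029e152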
