OpenStack:Folsom-Multinode

From DocWiki

Revision as of 18:50, 17 December 2012 by Danehans (Talk | contribs)

While many changes have been checked in and validated, we are still improving and updating our baseline puppet deployment manifests to address some issues raised by early adopters and additional testing of the Cisco Edition. Please check back after December 15th for an updated release which should simplify the changes needed to the build manifests.

Overview

In the Cisco OpenStack distribution, a build server outside of the OpenStack cluster is used to manage and automate the OpenStack software deployment. This build server primarily functions as a Puppet server for software deployment and configuration management onto the OpenStack cluster, as well as a Cobbler installation server to manage the PXE boot used for rapid bootstrapping of the OpenStack cluster.

Once the build server is installed and configured, it is used as an out-of-band automation and management workstation to bring up, control, and reconfigure (if later needed) the nodes of the OpenStack cluster. It also functions as a monitoring server to collect statistics about the health and performance of the OpenStack cluster, as well as to monitor the availability of the machines and services which comprise the OpenStack cluster.

This current deployment supports:

  • Single control server
  • Multiple compute nodes
  • Quantum managed network (VLAN based environment)

The current model does not support deployment of:

  • Swift
  • Cinder

Building the environment

Assumptions

Although other configurations are supported, the following instructions target an environment with a build node, a controller node, and at least one compute node. Additional compute nodes may optionally be added, and swift nodes may also be added if desired.

When naming your nodes, make sure that

  • all compute nodes contain "compute" in their host name
  • all control nodes contain "control" in their host name
  • all swift nodes contain "swift" in their host name
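
The naming rules above can be checked mechanically before nodes are added to cobbler. The following is a minimal sketch; node_role is a hypothetical helper, not part of the Cisco manifests, and it assumes the role is decided purely by the substrings listed above:

```shell
#!/bin/sh
# node_role: hypothetical helper (not part of the Cisco manifests).
# Prints the role implied by a host name, following the naming rules above.
# Returns non-zero when the name matches none of the rules.
node_role() {
    case "$1" in
        *control*) echo control ;;
        *compute*) echo compute ;;
        *swift*)   echo swift ;;
        *)         echo unknown; return 1 ;;
    esac
}
```

For example, node_role compute01 prints "compute", while a name such as db01 prints "unknown" and returns a non-zero status.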

Also, these instructions primarily target deployment of OpenStack onto UCS servers (either blades or rack-mount form factors). Several steps in the automation leverage the UCS manager to execute system tasks. Deployment on non-UCS gear may well work, but may require additional configuration or additional manual steps to manage systems.

COE Folsom requires that you have two physically or logically (VLAN) separated IP networks. You must have an external router or layer-3 switch that provides connectivity between these two networks. This is a variation of the Quantum metadata server support requirement:

http://docs.openstack.org/folsom/openstack-network/admin/content/adv_cfg_l3_agent_metadata.html

One network is used to provide connectivity for OpenStack API endpoints, Open vSwitch (OVS) GRE endpoints, and OpenStack/UCS management. The second network is used by OVS as the physical bridge interface and by Quantum as the public network.

Creating a build server

To deploy Cisco OpenStack, first configure a build server. This server has relatively modest hardware requirements: at minimum it needs 2 GB RAM, 20 GB storage, Internet connectivity, and a network interface on the same network as the eventual management interfaces of the OpenStack cluster machines. This machine can be physical or virtual; a pre-built VM of this server is planned but is not yet available.

Install Ubuntu 12.04 LTS onto this build server. A minimal install with openssh-server is sufficient. Configure the network interface on the OpenStack cluster management segment with a static IP. Also, when partitioning the storage, choose a partitioning scheme which provides at least 15 GB free space under /var, as installation packages and ISO images used to deploy OpenStack will eventually be cached there.
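
Once Ubuntu is installed, the free-space requirement under /var can be confirmed from the shell. This is a minimal sketch; free_gb and has_free_gb are hypothetical helper names, and the check relies only on the portable output format of df:

```shell
#!/bin/sh
# free_gb [PATH]: hypothetical helper; prints the whole gigabytes available
# on the filesystem that holds PATH (default: /var).
free_gb() {
    # -P forces one line per filesystem, -k reports sizes in 1 KiB blocks.
    df -Pk "${1:-/var}" | awk 'NR == 2 { print int($4 / 1048576) }'
}

# has_free_gb PATH MIN: succeeds if PATH has at least MIN GB available.
has_free_gb() {
    [ "$(free_gb "$1")" -ge "$2" ]
}
```

A call such as has_free_gb /var 15 || echo "less than 15 GB free under /var" gives a quick pass or fail before any packages or ISO images are cached.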

When the installation finishes, log in and become root:

sudo -H bash

Optional: If you have your build server set up behind a non-transparent web proxy, you should export your proxy configuration:

export http_proxy=http://proxy.esl.cisco.com:80
export https_proxy=https://proxy.esl.cisco.com:80

Replace proxy.esl.cisco.com:80 with whatever is appropriate for your environment.

Now you have two choices. You can follow the manual steps below, or you can run a one-line script that tries to automate this process. In either case, you should end up with the puppet modules installed and a set of template site manifests in /etc/puppet/manifests.

To run the install script, copy and paste the following on your command line (as root with your proxy set if necessary as above):

curl -s -k -B https://raw.github.com/CiscoSystems/folsom-manifests/multi-node/install_os_puppet | /bin/bash

You can now jump to "Customizing the build server". Otherwise, follow along with the steps below.

You should now install any pending security updates, together with the packages used by the following steps (puppet, git, ipmitool, and debmirror):

apt-get update && apt-get dist-upgrade -y && apt-get install -y puppet git ipmitool debmirror

Note: The system may need to be restarted after applying the updates.
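
Whether a restart is pending can often be checked from the shell. The sketch below assumes the stock Ubuntu convention of a flag file at /var/run/reboot-required; a minimal install may not create this file, so its absence is not proof that no restart is needed. needs_reboot is a hypothetical helper name:

```shell
#!/bin/sh
# needs_reboot [FLAG_FILE]: succeeds if the reboot flag file exists.
# The default path is an assumption based on stock Ubuntu behaviour.
needs_reboot() {
    [ -f "${1:-/var/run/reboot-required}" ]
}
```

For example: if needs_reboot; then echo "Restart recommended before continuing"; fi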

Get the Cisco Edition example manifests. Under the folsom-manifests GitHub repository you will find different branches, so select the one that matches your topology plans most closely. In the following examples the multi-node branch will be used, which is likely the most common topology:

git clone -b multi-node https://github.com/CiscoSystems/folsom-manifests ~/cisco-folsom-manifests/

Copy the puppet manifests from ~/cisco-folsom-manifests/manifests/ to /etc/puppet/manifests/:

cp ~/cisco-folsom-manifests/manifests/* /etc/puppet/manifests

Then get the Cisco Edition puppet modules from Cisco's GitHub repository:

(cd /etc/puppet/manifests; sh /etc/puppet/manifests/puppet-modules.sh)

Optional: If your setup is in a private network and your build node will act as a proxy server and NAT gateway for your OpenStack cluster, you need to add the corresponding NAT and forwarding configuration:

iptables --table nat --append POSTROUTING --out-interface eth0 -j MASQUERADE
iptables --append FORWARD --in-interface eth1 -j ACCEPT
echo 1 > /proc/sys/net/ipv4/ip_forward

Adjust the network interface specifications as appropriate for your network topology.
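
To make the interface substitution less error-prone, the three commands above can be generated from the interface names and reviewed before they are run. This is a minimal sketch; nat_rules is a hypothetical helper, and eth2 and eth3 in the usage note are example names only:

```shell
#!/bin/sh
# nat_rules [EXT_IF] [INT_IF]: print (without running) the NAT and forwarding
# commands shown above, using the given external and internal interfaces.
# Defaults match the example above: eth0 external, eth1 internal.
nat_rules() {
    ext_if=${1:-eth0}
    int_if=${2:-eth1}
    echo "iptables --table nat --append POSTROUTING --out-interface $ext_if -j MASQUERADE"
    echo "iptables --append FORWARD --in-interface $int_if -j ACCEPT"
    echo "echo 1 > /proc/sys/net/ipv4/ip_forward"
}
```

Running nat_rules eth2 eth3 prints the commands for review, and nat_rules eth2 eth3 | sh applies them as root. Note that iptables rules and the ip_forward setting entered this way are runtime settings and do not survive a reboot.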

Customizing the build server

In the /etc/puppet/manifests directory you will find these files:

site.pp
core.pp
cobbler-node.pp
clean_node.sh
puppet-modules.sh

At a high level, cobbler-node.pp manages the deployment of cobbler to support booting of additional servers into your environment. The core.pp manifest defines the core definitions for OpenStack service deployment. The site.pp manifest captures the user-modifiable components and defines the various parameters that must be set to configure the OpenStack cluster, including the puppetmaster and cobbler setup on the build server. clean_node.sh is a shell script provided as a convenience to deployment users; it wraps several cobbler and puppet commands for ease of use when building and rebuilding the nodes of the OpenStack cluster.

IMPORTANT! You must edit the site.pp file. It is documented internally.

Then, use the ‘puppet apply’ command to activate the manifests:

puppet apply -v /etc/puppet/manifests/site.pp
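
Because a syntax error in an edited site.pp stops the run, it can help to check the manifest before applying it. The sketch below assumes the installed Puppet provides the "parser validate" subcommand (Puppet 2.7 and later); apply_manifest and the PUPPET override variable are hypothetical conveniences, not part of the Cisco manifests:

```shell
#!/bin/sh
# PUPPET can be overridden (for example PUPPET=echo) to preview the commands
# instead of running them.
PUPPET=${PUPPET:-puppet}

# apply_manifest [FILE]: syntax-check FILE, then apply it only if the check
# passes. FILE defaults to the site manifest used above.
apply_manifest() {
    manifest=${1:-/etc/puppet/manifests/site.pp}
    "$PUPPET" parser validate "$manifest" && "$PUPPET" apply -v "$manifest"
}
```

Calling apply_manifest with no arguments validates and then applies /etc/puppet/manifests/site.pp.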

And finally, if you did set up cobbler for your instances, you should see the nodes populated in cobbler:

cobbler system list

And you should be able to start the install process on all nodes with the clean_node.sh script:

for n in `cobbler system list`; do /etc/puppet/manifests/clean_node.sh $n `hostname -d` ; done

NOTE: If you use a proxy, please read the following:


If you require a proxy server to access the internet, be aware that proxy users have occasionally reported problems during the phases of the installation process that download and install software packages. A common symptom of proxy trouble is that apt will complain about hash mismatches or file corruptions when verifying downloaded files. A few known scenarios and workarounds include:


1.) The Cisco Web Security Appliance (also known as "WSA" or "Ironport" products) has a bug in software versions prior to 7.5.1 that causes the If-Range header to be dropped in certain circumstances. This bug is fixed in update 7.5.1. More information can be found here:

https://bugs.ironport.com/show_bug.cgi?id=86056


2.) If the apt-get process reports a "HASH mismatch", you may be facing an issue with a caching engine. If it's possible to do so, bypassing the caching engine may resolve the problem. If you are behind a Cisco Application and Content Networking System (ACNS), be aware of a bug that may cause data transfer toward HTTP clients to be terminated by a TCP RST when client-side persistent connections are disabled and HTTP pipelining is used. More information on this defect can be found here:

http://tools.cisco.com/Support/BugToolKit/search/getBugDetails.do?method=fetchBugDetails&bugId=CSCtf25331

You can work around this issue by disabling HTTP pipelining by adding the following to your site.pp near the top of the file:

file { '/etc/apt/apt.conf.d/00no_pipelining':
        ensure => file,
        owner => 'root',
        group => 'root',
        mode => '0644',
        content => 'Acquire::http::Pipeline-Depth "0";'
}
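
For a quick manual test of this workaround on a single machine, the same apt setting can be written by hand. This is a minimal sketch that mirrors the file path, mode, and content of the Puppet resource above; disable_apt_pipelining is a hypothetical helper name, and the optional directory argument exists only so the function can be tried outside /etc:

```shell
#!/bin/sh
# disable_apt_pipelining [DIR]: write the apt option that turns off HTTP
# pipelining into DIR/00no_pipelining (DIR defaults to /etc/apt/apt.conf.d).
disable_apt_pipelining() {
    conf_dir=${1:-/etc/apt/apt.conf.d}
    printf 'Acquire::http::Pipeline-Depth "0";\n' > "$conf_dir/00no_pipelining"
    chmod 0644 "$conf_dir/00no_pipelining"
}
```

Run disable_apt_pipelining as root, then retry the failing apt-get command. Adding the resource to site.pp remains the way to apply the setting through puppet.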



When the puppet apply command runs, the puppet client on the build server will follow the instructions in the site.pp and cobbler-node.pp manifests and will configure several programs on the build server:

  • ntpd -- a time synchronization server used on all OpenStack cluster nodes to ensure time throughout the cluster is correct
  • tftpd-hpa -- a TFTP server used as part of the PXE boot process when OpenStack nodes boot up
  • dnsmasq -- a DNS and DHCP server used as part of the PXE boot process when OpenStack nodes boot up
  • cobbler -- an installation and boot management daemon which manages the installation and booting of OpenStack nodes
  • apt-cacher-ng -- a caching proxy for package installations, used to speed up package installation on the OpenStack nodes
  • nagios -- an infrastructure monitoring application, used to monitor the servers and processes of the OpenStack cluster
  • collectd -- a statistics collection application, used to gather performance and other metrics from the components of the OpenStack cluster
  • graphite and carbon -- a real-time graphing system for parsing and displaying metrics and statistics about OpenStack
  • apache -- a web server hosting sites to implement graphite, nagios, and puppet web services

The initial puppet configuration of the build server will take several minutes to complete as it downloads, installs, and configures all the software needed for these applications.

Once the puppet apply is completed, a reboot is recommended to ensure that all installed software is started in the correct sequence.

After the build server is configured and rebooted, the systems listed in cobbler-node.pp should be defined in cobbler on the build server:

# cobbler system list
   control
   compute01
   compute02
# 

And now, you should be able to use cobbler to build your controller:

/etc/puppet/manifests/clean_node.sh {node_name} example.com

Replace node_name with the name of your controller, and example.com with your cluster's domain.

clean_node.sh is a script which does several things:

  • configures Cobbler to PXE boot the specified node with appropriate PXE options to do an automated install of Ubuntu
  • uses Cobbler to power-cycle the node
  • removes any existing client registrations for the node from Puppet, so Puppet will treat it as a new install
  • removes any existing key entries for the node from the SSH known hosts database

When the script runs, you may see errors from the Puppet and SSH clean up steps if the machine did not already exist in Puppet or SSH. This is expected, and not a cause for alarm.
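
To make the four steps above concrete, the sketch below prints commands that are roughly equivalent to them. It is not the contents of clean_node.sh: clean_node_preview is a hypothetical helper, and the exact subcommands and flags the real script uses are assumptions. Nothing is executed; the commands are only printed.

```shell
#!/bin/sh
# clean_node_preview NODE DOMAIN: print (without running) commands that are
# roughly equivalent to the steps listed above. The real clean_node.sh may
# use different commands or options.
clean_node_preview() {
    node=$1
    fqdn=$1.$2
    echo "cobbler system edit --name=$node --netboot-enabled=true"  # PXE boot on next start
    echo "cobbler system reboot --name=$node"                       # power-cycle the node
    echo "puppet cert clean $fqdn"                                  # drop the old Puppet registration
    echo "ssh-keygen -R $fqdn"                                      # drop old SSH host keys
}
```

For example, clean_node_preview control example.com lists the commands for a node named control in the example.com domain.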

You can watch the progress on the console of your controller node as cobbler completes the automated install of Ubuntu. Once the installation finishes, the controller node will reboot and then will run puppet after it boots up. Puppet will pull and apply the controller node configuration defined in the puppet manifests on the build server.

This step will take several minutes, as puppet downloads, installs, and configures the various OpenStack components and support applications needed on the control node. /var/log/syslog on the controller node will display the progress of the puppet configuration run.

Note that it may take more than one puppet run for the controller node to be set up completely. Observe the log files to verify that the controller configuration has converged completely to the configuration defined in puppet.
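
One way to see whether a puppet run has finished is to look for the agent's completion message in the log. The sketch below assumes the puppet agent logs to /var/log/syslog and writes a "Finished catalog run" line at the end of each run, as stock Puppet does; last_puppet_run is a hypothetical helper name:

```shell
#!/bin/sh
# last_puppet_run [LOGFILE]: print the most recent "Finished catalog run"
# line from LOGFILE (default: /var/log/syslog). Prints nothing if no run
# has completed yet.
last_puppet_run() {
    grep 'Finished catalog run' "${1:-/var/log/syslog}" | tail -n 1
}
```

A finished run is not necessarily an error-free one, so it is still worth reading the surrounding log lines before treating the node as converged.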

Once the puppet configuration of the controller has completed, follow the same steps to build each of the other nodes in the cluster, using clean_node.sh to initiate each install. As with the controller, the other nodes will take several minutes for puppet configuration to complete, and may require multiple runs of puppet before they are fully converged to their defined configuration state.

As a short cut, if you want to build all of the nodes defined in your cobbler-node.pp file, you can run:

for n in `cobbler system list`; do /etc/puppet/manifests/clean_node.sh $n example.com ; done

Note: Replace example.com with your nodes' proper domain name.
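
If you prefer the install of the control node to start before the other nodes, the list from cobbler can be reordered first. This is a minimal sketch that relies on the naming rule that control nodes contain "control" in their host name; order_nodes is a hypothetical helper:

```shell
#!/bin/sh
# order_nodes: read node names on stdin (one per line, leading spaces
# allowed) and print the control nodes first, followed by all other nodes.
order_nodes() {
    names=$(awk 'NF { print $1 }')
    [ -n "$names" ] || return 0
    printf '%s\n' "$names" | grep control || true
    printf '%s\n' "$names" | grep -v control || true
}

# Example use on the build server (commented out so that loading this file
# does not start any installs):
#   for n in $(cobbler system list | order_nodes); do
#       /etc/puppet/manifests/clean_node.sh "$n" example.com
#   done
```

This changes only the order in which installs are started. It does not wait for the controller's puppet configuration to converge before the other nodes begin.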

Testing OpenStack

Once the nodes are built, and once puppet runs have completed on all nodes (watch /var/log/syslog on the cobbler node), you should be able to log into the OpenStack Horizon interface:

http://ip-of-your-control-node/horizon/

user: admin, password: Cisco123 (if you didn’t change the defaults in the site.pp file)
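
Before opening a browser, you can check from the build server whether the dashboard answers at all. This is a minimal sketch using curl; horizon_url and horizon_status are hypothetical helper names, and the address in the usage note is an example only:

```shell
#!/bin/sh
# horizon_url HOST: build the dashboard URL for the given control node.
horizon_url() {
    printf 'http://%s/horizon/\n' "$1"
}

# horizon_status HOST: print the HTTP status code returned by the dashboard.
# curl prints 000 when no connection could be made.
horizon_status() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(horizon_url "$1")"
}
```

For example, horizon_status 192.168.0.10 prints the status code for a control node at that address. A 200 or a redirect code suggests the web server is responding, while 000 means the connection failed.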

You will still need to log into the console of the control node to load an image (user: localadmin, password: ubuntu). If you su to root, there is an openrc auth file in root’s home directory, and you can launch a test by running /tmp/nova_test.sh.
