OVB: TripleO on a Public OpenStack

In order to deploy an OpenStack cloud with the TripleO installer ( http://tripleo.org/ ), one needs at least 2 nodes for a basic setup.
More complex topologies require more nodes: a Highly Available (HA) setup, for example, requires 4 nodes.
Even if libvirt is well-suited for testing basic setups, its limits are easily reached as the setup topology grows in complexity.
OpenStack Virtual Baremetal, a.k.a. OVB and formerly QuintupleO, aims to solve this issue. The user can create as many Nova VMs as needed and deploy OpenStack on them.
At this point, OVB is already quite capable, but it also has some intrinsic blockers that need to be dealt with at the host OpenStack cloud level:

  • patch Nova to allow booting from the network on demand: by default, it’s not possible to deploy a VM over PXE.
  • turn off the Neutron firewall: otherwise Neutron will drop the DHCP answers coming from the virtualized undercloud.
  • disable anti-spoofing: here again, this is mandatory because traffic will leave some NICs with an IP address different from the one Neutron knows. For example, if control01 holds the VIP, its traffic will use the VIP instead of the IP address of the interface.

Ben Nemec maintains a nice Heat stack that makes the deployment easier in this environment: https://github.com/cybertron/openstack-virtual-baremetal
Of course, when we use a public cloud, it’s no longer a good idea to disable the Neutron firewall or hack Nova and we need alternative solutions.
You will face various issues if you try such a deployment on a public cloud. The following is how we deal with them.

Networking tricks

Not knowing what the undercloud expects, the host cloud instance will provide incorrect IP addresses for the undercloud nodes.
The solution we chose is to deploy a small script on the TripleO undercloud that does two things:

  • watches its network configuration and
  • pushes the changes to the host OpenStack.

A controller node can send traffic using its own IP address like any other node, but if HA is enabled, it may also use the VIP address. The host cloud Neutron won’t accept this (it looks like IP spoofing). To avoid the issue, we also need to adjust the port configuration of the host cloud VM. Neutron uses the port’s allowed_address_pairs field as a whitelist of extra IP addresses that the VM can hold.
This can be done from the command line with:

$ neutron port-update 73455460-9dc2-479e-9e7f-5f3026c75a4b --allowed-address-pairs type=dict list=true ip_address=192.0.2.51

 
Our script also takes care of this and injects the VIP addresses into the allowed_address_pairs of the baremetal VMs on the host cloud.
You can get the latest copy of this script from the python-tripleo-helper repository.
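
To make the allowed_address_pairs logic more concrete, here is a minimal sketch of how the extra addresses could be computed. This is not the python-tripleo-helper script: the function names and the input shapes are assumptions made for the example, and 192.0.2.31 is a hypothetical interface address. The result has the shape of the body sent to Neutron in a port update.

```python
import ipaddress


def build_allowed_address_pairs(fixed_ips, observed_ips):
    """Return the allowed_address_pairs entries needed for one port.

    fixed_ips: addresses Neutron already associates with the port.
    observed_ips: addresses seen on the matching NIC (may include VIPs).
    Both are iterables of strings; these names are illustrative only.
    """
    known = {ipaddress.ip_address(ip) for ip in fixed_ips}
    extra = {ipaddress.ip_address(ip) for ip in observed_ips} - known
    # Sort by (version, numeric value) to get a stable payload.
    ordered = sorted(extra, key=lambda ip: (ip.version, int(ip)))
    return [{"ip_address": str(ip)} for ip in ordered]


def port_update_body(fixed_ips, observed_ips):
    """Wrap the pairs in the body shape of a Neutron port update."""
    pairs = build_allowed_address_pairs(fixed_ips, observed_ips)
    return {"port": {"allowed_address_pairs": pairs}}


if __name__ == "__main__":
    # Hypothetical controller: 192.0.2.31 on its NIC, plus the VIP 192.0.2.51.
    print(port_update_body(["192.0.2.31"], ["192.0.2.31", "192.0.2.51"]))
```

Only the addresses Neutron does not already know are sent, so running the computation again after a VIP moves yields an updated list for each port.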

IPMI

Ironic is the OpenStack component used to deploy the physical nodes. It has a limited number of drivers and cannot directly manage a Nova VM. The idea is to bootstrap a BMC virtual machine and start a bunch of IPMI servers on it. This is exactly what Ben Nemec does in his OVB Heat stack. However, it is a bit more difficult in our environment, here again because of the anti-spoofing filter: we have to register all the potential IP addresses in the allowed_address_pairs field of the BMC node.
 
Another option is to attach a new port to the BMC for every new baremetal node. In this case, we must create a new subnet per port because Neutron won’t accept more than one NIC per subnet on the same machine. You must also ensure that traffic leaves the machine through the NIC that was used to initiate the connection. We do this with the following routing rules for every NIC:
 

$ ip rule add from 192.0.2.45 table 4
$ ip route add default via 192.0.2.1 dev eth3 table 4

192.0.2.45 is the IPMI address, 192.0.2.1 is the gateway, eth3 is the name of the associated device and 4 is the number of the routing table.
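
Since these two commands must be repeated for every NIC of the BMC, they are easy to generate. The following is a minimal sketch: the function name, the tuple layout and the 198.51.100.x addresses are assumptions made for the example, not part of openstackbmc.

```python
def policy_routing_commands(nics, first_table=1):
    """Build the 'ip rule' and 'ip route' commands for each BMC NIC.

    nics: list of (device, ipmi_address, gateway) tuples.
    Each NIC gets its own routing table, numbered from first_table.
    """
    commands = []
    for table, (device, address, gateway) in enumerate(nics, start=first_table):
        # Traffic sourced from this IPMI address uses the dedicated table...
        commands.append("ip rule add from %s table %d" % (address, table))
        # ...whose default route leaves through the matching NIC.
        commands.append("ip route add default via %s dev %s table %d"
                        % (gateway, device, table))
    return commands


if __name__ == "__main__":
    nics = [
        ("eth3", "192.0.2.45", "192.0.2.1"),
        ("eth4", "198.51.100.45", "198.51.100.1"),  # hypothetical second subnet
    ]
    for command in policy_routing_commands(nics, first_table=4):
        print(command)
```

The sketch only prints the commands; running them requires root privileges on the BMC.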
You can download openstackbmc from here:
https://github.com/cybertron/openstack-virtual-baremetal/blob/master/bin/openstackbmc

Boot on the network

Nova cannot switch from one boot medium to another the way a real baremetal node can. Unfortunately, Ironic relies on this behavior to deploy the nodes over PXE. There are already some Nova blueprints covering this topic.
There are two solutions to avoid this problem; both involve the use of an IPXE (http://www.ipxe.org) image to boot on the network.

Boot on the network with upstream IPXE ISO (first solution)

The idea here is to install IPXE on our Baremetal VM disk and let Ironic rewrite the disk.
You can get the official IPXE images directly from their website. We use the ISO one, but the USB image will also work.
 

$ wget http://boot.ipxe.org/ipxe.iso
$ glance image-create --name ipxe.iso --disk-format raw --container-format bare < ipxe.iso

 
You can boot a Nova instance from this image to validate that IPXE starts as expected. Later, if we need to redeploy our VM, we will just have to do a ‘nova rebuild’:
 

$ nova rebuild baremetal_1 ipxe.iso

 
There is an Ironic bug (https://bugs.launchpad.net/ironic-lib/+bug/1550604) here that prevents the GRUB installation from succeeding. The bug should be fixed for Newton (284347, 286283, 288062). Meanwhile, the solution is to slightly adjust the Ironic image before the deployment:
 

$ mkdir /tmp/ramdisk
$ cd /tmp/ramdisk
$ zcat ~/tmp/ironic-python-agent.initramfs | cpio -id

 
Apply this little fix:

--- ./usr/lib/python2.7/site-packages/ironic_python_agent/extensions/iscsi.py.orig      2016-03-23 15:36:53.785886733 -0400
+++ ./usr/lib/python2.7/site-packages/ironic_python_agent/extensions/iscsi.py   2016-03-23 15:38:16.209051447 -0400
@@ -143,6 +143,7 @@
         iqn = 'iqn.2008-10.org.openstack:%s' % uuidutils.generate_uuid()
 
         device = hardware.dispatch_to_managers('get_os_install_device')
+        _execute("wipefs -a %s" % device)
         LOG.debug("Starting ISCSI target with iqn %(iqn)s on device "
                   "%(device)s", {'iqn': iqn, 'device': device})

 
And you can now regenerate the image:
 

$ find . | cpio --create --format='newc' > ~/tmp/ironic-python-agent.initramfs_with_wipefs

 
You can use your new image in place of the original initramfs.

Boot on the network with our own IPXE image: the ipxe-boot element (second solution)

As explained above, Ironic bug 1550604 breaks the installation of GRUB. This is because the IPXE image, unlike the overcloud-full image, does not have a partition table, which creates a conflict during the GRUB installation. A way to avoid the issue is to use a minimal IPXE disk that already comes with the correct partition table. This is precisely what Steve Baker’s ipxe-boot element does.
 

$ git clone https://github.com/steveb/openstack-virtual-baremetal
$ cd openstack-virtual-baremetal
$ git checkout -b ipxe-image origin/ipxe-image
$ cd ipxe/
$ pip install dib-utils
$ make ipxe-boot.qcow2

A binary image is also available directly in Steve’s repository: https://github.com/steveb/openstack-virtual-baremetal/blob/ipxe-image/ipxe/ipxe-boot.qcow2
We can now upload the ipxe-boot image to Glance and continue.

TripleO and MTU < 1500

TripleO does not support an MTU below 1500. Since OpenStack encapsulates the traffic into VLANs, it’s common to see a non-standard MTU.
If we ignore this, TripleO will start the deployment, but it will probably fail, likely because of a frozen IPXE or a Ceph bootstrap failure.
 
There is no parameter in undercloud.conf to pass the MTU value. The best option at this point is to manually adjust the network template:

$ sed -i 's/"name": "br-ctlplane",/"name": "br-ctlplane",\n      "mtu": 1400,/' /usr/share/instack-undercloud/undercloud-stack-config/config.json.template
$ sed -i 's/"primary": "true"/"primary": "true",\n        "mtu": 1400/' /usr/share/instack-undercloud/undercloud-stack-config/config.json.template

 
A patch has been accepted upstream to fix that (288041).
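
For readers who prefer to script this change, the two sed substitutions can be approximated in Python. This is only a sketch: the add_mtu name and the sample fragment are assumptions, and the real config.json.template is larger than the fragment used here.

```python
import json


def add_mtu(template_text, mtu=1400):
    """Add an "mtu" entry next to the bridge name and the primary NIC.

    Unlike the sed commands, str.replace() rewrites every occurrence,
    not only the first one on each line.
    """
    text = template_text.replace(
        '"name": "br-ctlplane",',
        '"name": "br-ctlplane",\n      "mtu": %d,' % mtu)
    text = text.replace(
        '"primary": "true"',
        '"primary": "true",\n        "mtu": %d' % mtu)
    return text


if __name__ == "__main__":
    # Hypothetical, simplified fragment of a network template.
    sample = ('{"name": "br-ctlplane",\n'
              ' "members": [{"name": "eth1", "primary": "true"}]}')
    patched = add_mtu(sample)
    # The patched fragment is still valid JSON.
    print(json.dumps(json.loads(patched), indent=2))
```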

Metadata server access

Queries from the baremetal nodes to the metadata server should reach the TripleO undercloud, not the host cloud metadata server. This is easy to adjust. The first step is to inject a new host_routes entry into the subnet dedicated to TripleO:

'host_routes': [{'destination': '169.254.169.254/32', 'nexthop': '192.0.2.240'}]

 
Here is an example of a subnet configuration:

'subnets': [{
    'name': 'rdo-m-subnet',
    'cidr': '192.0.2.0/24',
    'ip_version': 4,
    'network_id': netw['id'],
    'host_routes': [{
        'destination': '169.254.169.254/32',
        'nexthop': '192.0.2.240'
    }],
    'gateway_ip': '192.0.2.1',
    'dns_nameservers': ['8.8.8.8', '8.8.4.4'],
    'allocation_pools': [{'start': '192.0.2.30', 'end': '192.0.2.199'}]}]

 
In this example, 192.0.2.240 is the IP of the undercloud. This can be done from the command line with something like:

$ neutron subnet-update 71f4da61-145d-4ad4-817b-d38db1a3e787 --host_routes type=dict list=true destination=169.254.169.254/32,nexthop=192.0.2.240

Since traffic will enter the undercloud with IP 169.254.169.254 instead of the expected 192.0.2.240, we will again be blocked by … the anti-spoofing filter. Let’s fix that by adjusting the allowed_address_pairs:

'allowed_address_pairs': [{'ip_address': '169.254.169.254/32'}]
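
The two metadata-related settings can be built together. The sketch below is an illustration under assumptions: the helper names are hypothetical, and the check that the undercloud address belongs to the subnet is a safety net added for the example, not something described above.

```python
import ipaddress

METADATA_IP = "169.254.169.254"


def metadata_host_route(cidr, undercloud_ip):
    """Return the host_routes entry sending metadata queries to the undercloud."""
    network = ipaddress.ip_network(cidr)
    if ipaddress.ip_address(undercloud_ip) not in network:
        # A next hop outside the subnet would not be reachable directly.
        raise ValueError("%s is not in %s" % (undercloud_ip, cidr))
    return {"destination": METADATA_IP + "/32", "nexthop": undercloud_ip}


def metadata_allowed_pair():
    """Return the allowed_address_pairs entry for the undercloud port."""
    return {"ip_address": METADATA_IP + "/32"}


if __name__ == "__main__":
    print({"host_routes": [metadata_host_route("192.0.2.0/24", "192.0.2.240")]})
    print({"allowed_address_pairs": [metadata_allowed_pair()]})
```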

 

IPXE boot parameters

Since we won’t directly use the undercloud PXE server, we need to ensure that the host cloud Neutron passes the correct DHCP parameters to our IPXE image. This can be done by adding the following extra parameters to the network port configuration:

'extra_dhcp_opts': [
    {'opt_name': 'bootfile-name', 'opt_value': 'http://192.0.2.240:8088/boot.ipxe', 'ip_version': 4},
    {'opt_name': 'tftp-server', 'opt_value': '192.0.2.240', 'ip_version': 4},
    {'opt_name': 'server-ip-address', 'opt_value': '192.0.2.240', 'ip_version': 4}]

Here we use boot.ipxe because we are about to do the deployment. If you want to start Ironic introspection, just replace the script name with: inspector.ipxe
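
Switching between deployment and introspection only changes the script name in the first option, so the list is easy to generate. Here is a minimal sketch, where the ipxe_dhcp_opts name and its parameters are assumptions made for the example:

```python
def ipxe_dhcp_opts(undercloud_ip, script="boot.ipxe", port=8088):
    """Build the extra_dhcp_opts list for a baremetal VM port.

    script: "boot.ipxe" for a deployment, "inspector.ipxe" for introspection.
    """
    url = "http://%s:%d/%s" % (undercloud_ip, port, script)
    return [
        {"opt_name": "bootfile-name", "opt_value": url, "ip_version": 4},
        {"opt_name": "tftp-server", "opt_value": undercloud_ip, "ip_version": 4},
        {"opt_name": "server-ip-address", "opt_value": undercloud_ip,
         "ip_version": 4},
    ]


if __name__ == "__main__":
    for opt in ipxe_dhcp_opts("192.0.2.240", script="inspector.ipxe"):
        print(opt)
```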

Integrated solutions

openstack-virtual-baremetal

At this point, Ben Nemec’s Heat stack still requires some deep changes in the host cloud configuration to work, but efforts are being made to reduce them as much as possible, for example with Steve Baker’s ipxe-boot element:
https://github.com/cybertron/openstack-virtual-baremetal

python-tripleo-helper

The python-tripleo-helper library aims to make the deployment of a TripleO OpenStack simple. This library comes with the chainsaw-ovb example script, which can do all of these tricks automatically.
 
One needs an OpenStack tenant with:

  • a network called private with a subnet called private.
  • a router called router with a public interface.
  • a Glance image of the IPXE ISO called ipxe.iso:

    $ curl -O http://boot.ipxe.org/ipxe.iso
    $ glance image-create --name ipxe.iso \
        --disk-format raw --container-format bare < ipxe.iso

Then install the library and run the script:

$ virtualenv -p /usr/bin/python3 tripleo-helper
$ source tripleo-helper/bin/activate
$ pip install git+https://github.com/redhat-openstack/python-tripleo-helper
$ source my_openrc
$ curl -O http://boot.ipxe.org/ipxe.iso
$ glance image-create --name ipxe.iso \
    --disk-format raw --container-format bare < ipxe.iso
$ chainsaw-ovb --config-file tripleo_helper_osp.conf

 
Here is an example configuration file:


logger:
    file: /tmp/chainsaw.log
rhsm:
    login: my_login
    password: my_password
repositories: &DEFAULT_REPOSITORIES
    - type: yum_repo
      content: |
          [RH7-RHOS-8.0]
          name=RH7-RHOS-8.0
          baseurl=http://192.168.1.2/rel-eng/OpenStack/8.0-RHEL-7/2016-02-25.1/RH7-RHOS-8.0/x86_64/os/
          gpgcheck=0
          enabled=1
      dest: /etc/yum.repos.d/rhos-release-8.repo
    - type: yum_repo
      content: |
          [RH7-RHOS-8.0-director]
          name=RH7-RHOS-8.0-director
          baseurl=http://192.168.1.2/rel-eng/OpenStack/8.0-RHEL-7-director/2016-02-25.3/RH7-RHOS-8.0-director/x86_64/os/
          gpgcheck=0
          enabled=1
      dest: /etc/yum.repos.d/rhos-release-8-director.repo
provisioner:
    image:
        name: RHEL 7.2 x86_64
    flavor: m1.hypervisor
    network: private
    keypair: DCI
    security-groups:
        - ssh
        - rhos-mirror-user
    repositories: *DEFAULT_REPOSITORIES
undercloud:
    repositories: *DEFAULT_REPOSITORIES
    guest_image_path: http://192.168.1.2/brewroot/packages/rhel-guest-image/7.2/20151102.0/images/rhel-guest-image-7.2-20151102.0.x86_64.qcow2
    guest_image_checksum: 486900b54f4757cb2d6b59d9bce9fe90
ssh:
    private_key: /home/somewhere/.ssh/DCI/id_rsa
