openstack – openSUSE Lizards

How we run our OpenStack cloud

bmwiedemann — Mon, 23 Jan 2017 13:12:10 +0000

This post it to document how we setup cloud.suse.de which is one of our many internal SUSE OpenStack Cloud deployments for use by R&D.

In 2016-06 we started the deployment with SOC6 on 4 nodes. 1 controller and 3 compute nodes that also served for ceph (distributed storage) with their 2nd HDD. Since the nodes are from 2012 they only have 1gbit network and spinning disks. Thus ceph only delivers ~50 MB/s which is sufficient for many use cases.

We did not deploy that cloud with HA, even though our product supports it. The two main reasons for that are

that it will use up two or three nodes instead of one for controller services, which is significant if you start out with only 4 (and grow to 6)
that it increases the complexity of setup, operations and debugging and thus might lead to decreased availability of the cloud service

Then we have a limited supply of vlans even though technically they are just numbers between 1 and 4095, in SUSE we do allocations to be able to switch together networks from further away. So we could not use vlan mode in neutron if we want to allow software defined network (=SDN) (we did not in old.cloud.suse.de and I did not hear complaints, but now I see a lot of people using SDN)
So we went with ovs+vxlan +dvr (open vSwitch + Virtual eXtensible LAN + Distributed Virtual Router) because that allows VMs to remain reachable even when the controller node reboots.
But then I found that they cannot use DNS during that time, because distributed virtual DNS was not yet implemented. And ovs has some annoying bugs are hard to debug and fix. So I built ugly workarounds that mostly hide^Wsolve the problems from our users’ point of view.
For the next cloud deployment, I will try to use linuxbridge+vlan or linuxbridge+vxlan mode.
And the uptime is pretty good. But it could be better with proper monitoring.

Because we needed to redeploy multiple times before we got all the details right and to document the setup, we scripted most of the deployment with qa_crowbarsetup (which is part of our CI) and extra files in https://github.com/SUSE-Cloud/automation/tree/production/scripts/productioncloud. The only part not in there are the passwords.

We use proper SSL certs from our internal SUSE CA.
For that we needed to install that root CA on all involved nodes.

We use kvm, because it is the most advanced and stable of the supported hypervisors. Xen might be a possible 2nd choice. We use two custom kvm patches to fix nested virt on our G3 Opteron CPUs.

Overall we use 3 vlans. One each for admin, public/floating, sdn/storage networks.
We increased the default /24 IP ranges because we needed more IPs in the fixed and public/floating networks

For authentication, we use our internal R&D LDAP server, but since it does not have information about user’s groups, I wrote a perl script to pull that information from the Novell/innerweb LDAP server and export it as json for use by the hybrid_json assignment backend I wrote.

In addition I wrote a cloud-stats.sh to email weekly reports about utilization of the cloud and another script to tell users about which instances they still have, but might have forgotten.

On the cloud user side, we and other people use one or more of

Salt-cloud
Nova boot
salt-ssh
terraform
heat

to script instance setup and administration.

Overall we are now hosting 70 instance VMs on 5 compute nodes that together have cost us below 20000€

OpenStack Infra/QA Meetup

bmwiedemann — Wed, 23 Jul 2014 13:54:38 +0000

Last week, around 30 people from around the world met in Darmstadt, Germany to discuss various things about OpenStack and its automatic testing mechanisms (CI).
The meeting was well-organized by Marc Koderer from Deutsche Telekom.
We were shown plans of what the Telekom intends to do with virtualization in general and OpenStack in particular and the most interesting one to me was to run clouds in dozens of datacenters across Germany, but have a single API for users to access.
There were some introductory sessions about the use of git review and gerrit, that mostly had things I (and I guess the majority of the others) already learned over the years. It included some new parts such as tracking “specs” – specifications (.rst files) in gerrit with proper review by the core reviewers, so that proper processes could already be applied in the design phase to ensure the project is moving in the right direction.

On the second day we learned that the infra team manages servers with puppet, about jenkins-job-builder (jjb) that creates around 4000 jobs from yaml templates. We learned about nodepool that keeps some VMs ready so that jobs in need will not have to wait for them to boot. 180-800 instances is quite an impressive number.
And then we spent three days on discussing and hacking things, the topics and outcomes of which you can find in the etherpad linked from the wiki page.
I got my first infra patch merged, and a SUSE Cloud CI account setup, so that in the future we can test devstack+tempest on openSUSE and have it comment in Gerrit. And maybe some day we can even have a test to deploy crowbar+openstack from git (including the patch from an open review) to provide useful feedback, but for that we might first want to move crowbar (which is consisting of dozens of repos – one for each module) to stackforge – which is the openstack-provided Gerrit hosting.

another way to access a cloud VM’s VNC console

bmwiedemann — Sat, 08 Feb 2014 08:14:58 +0000

If you have used a cloud, that was based on OpenStack, you will have seen the dashboard including a web-based VNC access using noVNC + WebSockets.
However, it was not possible to access this VNC directly (e.g. with my favourite gvncviewer from the gtk-vnc-tools package), because the actual compute nodes are hidden and accessing them would circumvent authentication, too.

I want this for the option to add an OpenStack-backend to openQA, my OS-autotesting framework, which emulates a user by using a few primitives: grabbing screenshots and typing keys (can be done through VNC), powering up a machine(=nova boot), inserting/ejecting an installation medium (=nova volume-attach / volume-detach).

To allow for this, I wrote a small perl script, that translates a TCP-connection into a WebSocket-connection.
It is installed like this
git clone https://github.com/bmwiedemann/connectionproxy.git sudo /sbin/OneClickInstallUI http://multiymp.zq1.de/perl-Protocol-WebSocket?base=http://download.opensuse.org/repositories/devel:languages:perl

and is used like this
nova get-vnc-console $YOURINSTANCE novnc perl wsconnectionproxy.pl --port 5942 --to http://cloud.example.com:6080/vnc_auto.html?token=73a3e035-cc28-49b4-9013-a9692671788e gvncviewer localhost:42

I hope this neat code will be useful for other people and tasks as well and wish you a lot of fun with it.

Some technical details:

The code is able to handle multiple connections in a single thread using select.
HTTPS is not supported in the code, but likely could be done with stunnel.
WebSocket-code was written in 3h.
noVNC tokens expire after a few minutes.

Hongkong OpenStack Design Summit

bmwiedemann — Wed, 13 Nov 2013 13:50:23 +0000

So last week many OpenStack (cloud software) developers met in Hongkong’s world expo halls to discuss the future development and show off what is done already.

Overall, I heard there were 3000 attendees, with 800 being developers or so. That sounds like a large number of people, but luckily everything felt well-organized and the rooms were always big enough to have seats for all interested.

The design sessions were usually pretty low-level and focused into one component, so it was not easy for me to make useful contributions in there. The session about read-only API access (e.g. for helpdesk workers and monitoring) and about HA were most useful to me.

In the breakout rooms were interesting sessions by many large OpenStack users (CERN, Ebay, Paypal, Dreamhost, Rackspace) giving valuable insights into what people expect from and do with a cloud. Many of them are using custom-built parts, because the plain OpenStack is still not complete to run a cloud. SUSE Cloud ships with some such missing parts (e.g. deployment and configuration management), but most organisations seem to run their own at the moment.

Cloudbase was there telling about their Hyper-V support that we integrated in SUSE Cloud.
Apart from the 6 SUSE Cloud developers there were several local (and one Australian) SUSE guys manning the booth.

Overall it was quite some experience to be there (in such an exotic and yet nice place) and listen and talk to so many different people from very different backgrounds.

CLI to upload image to openstack cloud

Josef Reidinger — Wed, 18 Apr 2012 11:18:39 +0000

I work on automatic testing of one of our products that creates other projects.
And because there is a lot of clouds everywhere I want to use them too. We
have internally an OpenStack cloud (still Diablo release). So I need to solve
automatic uploading of images built in the Build Service. Below I describe my working version.

At first, for other cloud related tasks we are using the nova command (which
e.g. has also image-delete, but not add). For uploading we use
glance. I found a few obstacles which I separately describe and also provide
a solution.

Authentication

The first chalenge is authentication, as glance doesn`t use NOVA_*
environment variables. But it allows to use an authentication token. So we
just need to get such a token. With help of Martin Vidner we have this script,
that returns a valid token.

# cloud_auth_token.sh
OS_AUTH_URL="http://cloud.example.com:5000/v2.0"
OS_TENANT_NAME=$NOVA_USERNAME
OS_USERNAME=$NOVA_USERNAME
OS_PASSWORD=$NOVA_API_KEY
AUTH_JSON="{\"auth\":{\"passwordCredentials\":{\"username\": \"$OS_USERNAME\", \"password\":\"$OS_PASSWORD\"}, \"tenantName\":\"$OS_TENANT_NAME\"}}"
curl -s \
    -d "$AUTH_JSON" -H "Content-type: application/json" \
    $OS_AUTH_URL/tokens \
    | python -c "import sys; import json; tok = json.loads(sys.stdin.read()); print tok['access']['token']['id'];"

What does it do? It calls OpenStack Identity API, passes credentials encoded
as JSON. The response is also JSON, so we use python that is already on the
system to parse the response and get the token.

Compressing the Image

The next challenge is compression of the image. We get a raw disk from the
build service and we extend it to have more than 15GB (we mirror there rpms
so we need this space). For resizing we use qemu-img from
virt-utils. If we simply upload this image it means that we send the whole
15GB over the network. Which is fine for one-time tasks, but for regular
testing it is a problem. Thanks to Christoph Thiel we solved it with the
conversion to qcow2. Qcow2 is also supported in OpenStack and qcow2 allows
compression. The final script for conversion:

qemu-img convert -c -f raw -O qcow2 img.raw img.qcow2
qemu-img resize img.qcow2 +10G

Using it Together

Now we have prepared an image and a helper script to get a cloud auth
token. So let’s upload the image.

cat img.qcow2 | glance -H cloud.example.com -A `sh cloud_auth_token.sh` add name='testing_img' is_public=False disk_format=qcow2 container_format=bare

Cleaning Up After Testing

We use it for testing and release new versions of the testing appliance often,
therefore we need cleaning up. It is quite simple with unix text utils:

for i in `nova image-list | grep "image_name" | sed "s/^|[[:space:]]\+\([[:xdigit:]-]\+\).*$/\1/"`; do nova image-delete $i; done

I hope that it helps you with automatic uploading of images to
OpenStack. It works for me with the Diablo release and there is no guarantee that it is the best way