Why OpenStack Failed, or How I Came to Love the Idea of a BDFL

OK, so the title of this is a bit clickbait-y, but let me explain. By some measures, OpenStack is a tremendous success, being used to power several public clouds and many well-known businesses. But it has failed to become a powerful player in the cloud space, and I believe the reason is not technical in nature, but a lack of leadership.

OpenStack began as a collaboration between Rackspace, a commercial, for-profit business, and a consulting group working for NASA. While there were several companies involved in the beginning, Rackspace dominated by sheer numbers. This dominance was a concern to many companies – why should they contribute their time and resources to a project that might only benefit Rackspace? This fear was not entirely unfounded, as the OpenStack API was initially created to match Rackspace’s legacy cloud API, and much of the early naming of things matched Rackspace’s terminology – I mean, who ever thought of referring to virtual machines as “servers”? But that matched the “Cloud Servers” branding that Rackspace used for its cloud offering, and that name, as well as the use of “flavor” for instance sizing, persist today. The early governance was democratic, but when one company has many more votes than the others…

The executives at Rackspace were aware of this concern, and quickly created the OpenStack Foundation, which would be an independent entity that would own the intellectual property, helping to guarantee that one commercial company would not control the destiny of OpenStack. More subtly, though, it also engendered a deep distrust of any sort of top-down control over the direction of the software development. Each project within OpenStack was free to pretty much do things however they wanted, as long as they remained within the bounds of the Four Opens: Open Source, Open Design, Open Development, and Open Community.

That sounds pretty good, right? I mean, who needs someone imposing their opinions on you?

Well, it turns out that OpenStack needed that. For those who don’t know the term “BDFL”, it is an acronym for “Benevolent Dictator For Life”. It means that the software created under a BDFL is opinionated, but it is also consistently opinionated. A benevolent dictator listens to the various voices asking for features, or designing an API, and makes a decision based on the overall good of the project, and not on things like favoring the corporate interests of big contributors, or deferring to strong personalities that otherwise dominate design discussions. Can you imagine what AWS would be like if each group within it could just decide how they wanted to do things? The imposition of the design from above assures AWS that each of its projects can work easily with others.

The closest thing to that in OpenStack is the Technical Committee (TC), which “is an elected group that represents the contributors to the open source project, and has oversight on all technical matters”. Despite the typical meaning of “oversight”, the TC is essentially a suggestion body, and has no real enforcement power. They can spend months agonizing over the wording of mission statements and community goals, but shy away from anything that might appear to be a directive that others must do. I don’t think the word “must” is in their vocabulary.

They also bend over backwards to avoid potentially offending anyone. Here is one example from my interactions with them: one of the things the TC does is “tag” projects, so that newcomers to OpenStack can get a better idea how mature a particular project is, or how stable, etc. One of the proposed tags was to warn potential users that a project was primarily being developed by a single company; the concern is that all it would take is one manager at that company to decide to re-assign their employees, and the project would be dead. This is a very valid concern for open source projects, and it was proposed that a tag named “team:diverse-affiliation-danger” be created to flag such projects. What followed was much back-and-forth on the review of the proposal as well as in TC meetings about how the tag name was negative and would hurt people’s feelings, how it would be seen as an attack against a project, that it was more of a stick than a carrot, etc. All of this hand-wringing over an objective measurement of who is currently contributing to a project. (Epilogue: they ended up making it a positive-sounding tag: “team:single-vendor”, and no tears were shed.)

Having ineffective leadership like the TC has ripple effects throughout all of OpenStack. Each project is an island, and calls its own shots. So when two projects need to interact, they both see it from the perspective of “how will this affect me?” instead of “how will this improve OpenStack?”. This results in protracted discussions about interfaces and who will do what thing in what order. And when I say “protracted”, I don’t just mean weeks or months; some, such as the Cyborg-Nova integration discussions, have dragged on for two years! I cannot imagine that happening in a world with an OpenStack BDFL. This inter-project friction slows down development of OpenStack as a whole, and in my opinion, contributes to developer dissatisfaction.

So what would OpenStack have been like if it had had a BDFL? Of course, that would depend entirely on the individual, but I can say this: it would have flamed out very quickly with a poor BDFL, or it would be a much better product with a much higher adoption with a good one. Back in 2013 I had predicted that OpenStack would eventually rival the commercial clouds in much the same manner that Linux now dominates the internet over proprietary operating systems. In the early days of the internet, the ability for people to download and play with free software such as the LAMP stack enabled people with big ideas but small budgets to turn those ideas into reality. OpenStack began in the early days of cloud computing, and it seemed logical that having a freely-available alternative to the commercial clouds might likewise result in new cloud-native creations becoming reality. It was a believable prediction, but I missed the effect that a lack of coordination from above would have on OpenStack achieving the potential to fill that role.

By the way, many people point to Linux and its BDFL, Linus Torvalds, as the argument against having a BDFL, as Linus has repeatedly behaved as an offensive ass towards others when he didn’t like their ideas. But ass or not, Linux succeeded because of having that single opinion consistently shaping its development. Most BDFLs, though, are not insufferable asses, and their projects are better off as a result.

OpenStack PTG, Denver 2019


Immediately following the Open Infrastructure Summit in Denver was the 3-day Project Teams Gathering (PTG). This was the first time that these two events were scheduled back-to-back. It was in response to some members of the community complaining that traveling to 4 separate events a year (2 Summits, 2 PTGs) was both too expensive and too tiring. The idea was that now you would only have to travel twice a year.

Now that I’ve experienced these back-to-back events, I think that this was a giant step backwards. Let me explain why.

First, it was exhausting! Being in rooms with lots of people for days on end is very draining for those of us who are introverts. Sure, we can be outgoing and interact with people, but it takes a toll, and downtime is necessary to recharge the psychological batteries. At several points I found myself faced with attending a session or finding an empty room to work on stuff by myself, and the latter often won out.

Second, the main idea of the PTG was to take the midcycle get-togethers that many teams had been doing, and formalize a single place for them to meet. The feeling was that having these teams in the same place would spur cross-project discussions, and that definitely was the case. But now that teams will only be getting together every 6 months, we’re back to the situation we were in before the PTGs were created: many teams will need a mid-cycle meeting to ensure that everyone is on-track to complete the goals for that release cycle.

Third, being away from home for an entire week is too long. OK, maybe I’m just getting old, but I really do like being home. One of the nice things about traveling to conferences is tacking on a few extra days to explore the area. For example, after last year’s PTG in Denver, my wife flew out to join me, and we spent a long weekend in Rocky Mountain National Park and other nearby natural areas. But after a solid week of stuff, I couldn’t wait to go home.

Fourth, many people time their return travel so that they miss the last day (or part of it). My unscientific observation was that attendance on the last day of this PTG showed a more dramatic drop than in previous PTGs. I think that’s because missing one day out of 6 doesn’t seem as severe as missing one day out of 3.

As is the tradition at PTGs, there was a feedback session at lunch on the second day, and a lot of the feedback was in line with my observations. Of course, there were a lot of people who liked the format, and for the exact opposite reasons! Goes to show you can’t please everyone.

As for the sessions, the API-SIG was scheduled in a room for Thursday morning. I hung out there, and a few people did come in, but I think we had covered all of the outstanding issues at the BoF session on Tuesday. So I got to spend a lot of the morning hacking on Neo4j, and was able to implement a lot of the functionality that is missing in Placement: nested providers, shared providers, and quotas. I put together a series of Jupyter Notebooks that demonstrated all of these things working with just a small amount of code so that I could share with other people involved in Placement.

And then there was lunch! After 3 days of either going hungry or grabbing something nearby, it was so much nicer to sit down with people while eating lunch. Unfortunately, the box lunches provided seemed to have been kept at near-freezing temperatures until just before the lunch break, and were almost too cold to eat. Still, I much preferred them to not having any lunch session at all, if for nothing else than being able to share a meal with other OpenStackers.

In the afternoon we had the Nova – Placement cross-project session, to which the Placement PTL, Chris Dent, brought some bottles of bubbly to celebrate the deletion of the Placement code base from Nova. That commit ended up getting delayed for one more day, but still, it was a milestone to celebrate.

The rest of the session was personally painful to sit through, as the topics revolved around the things that we have been fighting to implement for over 2.5 years: nested providers, shared providers, tree affinity, and other complex relationships among resources. It was painful because I just wanted to shout out “WE’RE USING THE WRONG TOOL!”, as these things naturally flowed from a graph database. I was able to get all of these things working in my spare time over the previous few days. I like to think that I’m a pretty smart guy, but I’m not THAT smart. It’s just because the tool fits the problem domain.

Nested Provider demo
Jupyter notebook showing a section of the Nested Provider demo. It’s a little hard to see, but the two results show that there are two possible solutions, both starting with the ComputeNode named ‘balanced_testnode’. Each solution shows that the requested resources both came from the same NUMA node. This is one of the things that comes naturally with a graph DB that is really, really hard in a SQL DB.
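To make the idea in that caption concrete without Neo4j, here is a rough Python sketch. It is not the notebook code and it is not Cypher: plain dicts stand in for graph nodes, and the NUMA node names and inventory amounts are invented for illustration. It only shows the shape of the question being asked: which children of a compute node can satisfy the entire request on their own?

```python
# Toy model only: nested dicts stand in for graph nodes and relationships.
# The child names and all amounts are made up for this illustration.
compute_node = {
    "name": "balanced_testnode",
    "children": [
        {"name": "numa0", "inventory": {"VCPU": 8, "MEMORY_MB": 4096}},
        {"name": "numa1", "inventory": {"VCPU": 8, "MEMORY_MB": 4096}},
    ],
}


def numa_candidates(node, request):
    """Return the names of child providers that can satisfy the whole request."""
    return [
        child["name"]
        for child in node["children"]
        if all(
            child["inventory"].get(rc, 0) >= amount
            for rc, amount in request.items()
        )
    ]


request = {"VCPU": 2, "MEMORY_MB": 1024}
solutions = numa_candidates(compute_node, request)
print(compute_node["name"], solutions)  # balanced_testnode ['numa0', 'numa1']
```

Each entry in the result corresponds to one "solution" in the sense of the caption: all of the requested resources come from the same NUMA node.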

I spent that evening working to finish up my Neo4j examples, as I had asked several key placement contributors to take a few minutes to sit down with me so that I could show them what I had done. On Friday morning I showed my graph work to several people, and while each reaction was different, there was a definite flow from skepticism to curiosity and then (for some) to agreement. One of the people to whom I especially wanted to show this was Jay Pipes, whom I had mentioned in my earlier experiments with graph DBs. He had already seen the potential after those blog posts, but he was concerned with developers having to learn some new, cryptic language in order to implement this. However, after about 10 minutes of my demos, I showed him the query I was currently working on that wasn’t quite right. He looked it over, made a suggestion, and when I ran it, it worked correctly! So I think that if he could get a working knowledge after just 10 minutes of seeing the Cypher Query Language, it won’t be hard for other devs to pick it up.

Later in the day we had a good discussion with the Ironic team about a need that they had for stand-alone use (i.e., not running under Nova). In such situations, they wanted to use the full resource amounts in placement, as opposed to the current approach used in Nova, which is to represent an Ironic node as an inventory of 1 thing. The issue with representing a baremetal server as, say, 500GB of disk and 16 CPUs is that it may occasionally be selected for a request for 250GB and 8 CPUs. Since a baremetal server cannot be shared, we needed to figure out a way to fully consume the resources on the machine when it was selected, even if the request was for a lower amount. Several ideas were floated and discussed, all with varying degrees of messiness. We finally settled on adding a new API endpoint that would accept a Resource Provider, and allocate all of its resources so that it would no longer be available to any other request.
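The "consume everything" idea can be sketched in a few lines of Python. This is only an illustration of the behavior we discussed, not the endpoint itself (which had not been designed at that point); the function names are hypothetical, and the amounts are the ones from the example above.

```python
def free_amounts(inventory, usages):
    """Amount of each resource class that is still unallocated."""
    return {rc: total - usages.get(rc, 0) for rc, total in inventory.items()}


def fits(inventory, usages, request):
    """True if every requested amount is still available on this provider."""
    free = free_amounts(inventory, usages)
    return all(free.get(rc, 0) >= amount for rc, amount in request.items())


def allocate_all(inventory, usages):
    """Hypothetical 'take the whole machine' step: allocate whatever is free."""
    allocation = free_amounts(inventory, usages)
    for rc, amount in allocation.items():
        usages[rc] = usages.get(rc, 0) + amount
    return allocation


inventory = {"DISK_GB": 500, "VCPU": 16}   # the baremetal server
usages = {}
request = {"DISK_GB": 250, "VCPU": 8}      # a smaller request

fits_before = fits(inventory, usages, request)
allocation = allocate_all(inventory, usages)
fits_after = fits(inventory, usages, request)
print(fits_before, allocation, fits_after)
# True {'DISK_GB': 500, 'VCPU': 16} False
```

Once the whole inventory is allocated, a second request of the same size no longer fits, which is the property the Ironic team needed.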


On Saturday morning we started with the Cyborg-Nova cross-project session, at which we could finally see a demonstration of Cyborg in action! I had thought that the Summit sessions would have been much more useful if the demo had been shown then, so that we could have something concrete to discuss. I was glad to see that Cyborg is working and handling accelerators after a few years of planning and design, and I look forward to making further progress integrating it with Nova and Placement.

There were a few discussions in the afternoon that had to do with representing nested resources and their relationships. Once again, it was difficult to listen to these attempts to represent complex relationships in a SQL DB, when I had just demonstrated how simple it was in a graph DB. It was indeed telling that the subject was entitled “Implementing Nested Magic” – getting this working in SQL does seem to require supernatural powers!

I had to leave around 3pm to get to the airport, so I missed anything after that. But most people seemed to have left by then anyway. It had been a long week, and I was burnt out. I also missed being home with my wife, sleeping in my own bed, working at my own desk, and eating my own food. I sincerely hope that the Foundation reconsiders this back-to-back setup. I realize that they are trying to save money wherever possible, but this just wasn’t worth it.

Open Infrastructure Summit, Denver 2019

The first ever Open Infrastructure Summit was held in the last week of April 2019 at the Colorado Convention Center in Denver, CO. It was the first event to officially carry the new name since the re-branding from OpenStack to Open Infrastructure began last year. Otherwise, it felt just like the OpenStack summits of old.

The keynotes were better than in prior summits – I think the sponsors got the feedback that no one was interested in sitting through a recap of “how they did X with OpenStack”, and instead focused more on what they intended to do with it. There was a great demo by Chris Hoge and Julia Kreger that showed a Kubernetes operator managing a bare metal infrastructure; it showed very clearly that the typical media message of “Kubernetes is replacing OpenStack” is silly. They exist in different problem spaces, and work well together. The only place Kubernetes is replacing OpenStack is in the hype cycle.

After the keynotes I went to the Nova Project Update session. It was very thorough, but felt more like someone reading release notes out loud. I had hoped for more of a discussion about the thinking that went into some of the things that were worked on or are being planned rather than just a straight recitation.

After that was lunch – sort of. For the first time since these summits began, lunch was not provided. Instead, you were supposed to go to one of the many restaurants in the area and buy your own lunch. However, since we had pretty poor weather—freezing temperatures, snow, and rain—walking around downtown Denver wasn’t what I felt like doing. Judging by how packed the restaurant in the hotel across the street was, a lot of other people felt the same way. I understand that times are not as heady as in previous years when OpenStack was the latest hotness, but this seemed like a poor place to cut back. I always enjoyed sharing a table with a bunch of other OpenStackers and learning about where they were from and what they were doing with OpenStack. Going out to lunch meant that people tended to stay with groups they already knew. The afternoon snacks were also gone, which is no big deal for me, but others mentioned to me that they missed having them. Finally, they didn’t have a signature piece of conference swag. I’m typing this wearing the OpenStack hoodie I got back in the Paris 2014 summit, and have my sweatshirt from Tokyo 2015 in my room. Well, OK, they did give out a pair of socks, but they weren’t tied to the event. It’s not a huge thing, but not having something this time really makes things feel… different. And not in a good way.

There weren’t any sessions in the afternoon that I really wanted to go to, so instead I worked on two OpenStack-related projects: etcd-compute and using Graph Databases, such as Neo4j, to hold information for the Placement service. I have previously written about my work with both of these. And since the author of etcd-compute, Chris Dent, was also here at the summit, it was a perfect time to work on it together, so I set up several VMs for us to “play with”.

Monday evening after the sessions was the “Marketplace Mixer”, which is a way to get the attendees to visit the vendor area. They provided food and beverages, and I had my badge scanned several times in exchange for some local craft beer. There wasn’t a lot offered by the vendors that would be useful to me, but I did run into a lot of people I knew. When you’re in your 10th year of working on OpenStack, you get to know quite a few people!

On Tuesday I started with a session on Nova-Cyborg integration. Or at least that was what it was advertised as. It turned out to be more of an “Introduction to Cyborg Concepts” talk, rather than focusing on where the two projects needed to integrate.

The crowd at the Cyborg-Nova integration session

Later on was the API-SIG BoF (Birds of a Feather) session that I headed up. There hadn’t been much traffic in the SIG ahead of the summit, so I was happily surprised when several people showed up. We ended up having a good discussion on a variety of API-related topics, and I got to meet several of the people who have joined in some of the more recent IRC discussions and Office Hours who previously I had only known by their IRC handles. It’s always nice to put a face to a name.

In the afternoon was a session to update everyone on the process of extracting Placement from Nova. In the past this has been a somewhat heated topic, but this time everyone seemed to understand where things were and were pretty cool with it. There weren’t any long discussions, so the session finished early. I guess that’s a very good sign that we handled that process well.

The final session of the afternoon was to discuss what the various SIGs (Special Interest Groups) and WGs (Working Groups) needed to be successful. Since the API-SIG has been around for many years, we didn’t really have any needs along these lines. Sure, it would be great to get more people involved, but it isn’t critical. Some of the newer groups explored ways of getting the word out about their existence, which is always a problem. There is so much going on in the OpenStack world that getting people to pay attention to yet another thing is always challenging.

That evening was the Open Infrastructure party, sponsored by Trilio, Mirantis, Red Hat, Open Telekom Cloud, & AVI Networks. It was held in The Church Nightclub, which is an old church that has been converted to a nightclub. There was an open bar and food available, and they had a band playing for entertainment. The location was fun, but being indoors with loud music meant that there was only so much conversation you could have. Still, it was fun!

Open Infrastructure Party
The crowd at the Open Infrastructure Party at The Church Nightclub
church niteclub
A view from higher up that shows how an old church was converted into a nightclub. You can see some of the band playing at the very bottom.

There weren’t any talks on Wednesday morning that I really wanted to attend, so I spent most of the morning in the designated hacking room working on the etcd-compute project for a while, and then on implementing many of the features that are currently lacking in Placement in my graph database code. I managed to implement passing a tree structure to represent nested resource providers so that it creates the corresponding nodes and relationships in the database. This implementation is becoming more and more complete, and I hope when I show it to others this week that they are able to get out of their MySQL comfort zone and see how much better this approach is for representing resources.

I went to lunch with some of the members of my team at IBM who were at the Summit, along with some people from Red Hat with whom we are working to ensure that their various offerings run as well on Power hardware as on x86. So while the pizza was tasty, it was definitely a working lunch. It was also great to meet some of the people I had only known online before.

The Red Hat – IBM lunch *after* the food had been eaten.

After lunch was a session focused on the gaps between Nova functionality and what has been implemented in OpenStack Client. Most of the missing functionality is concerned with supporting new microversions, and this support is several years behind. I’m not sure how effective the discussions were, since what is really needed is for people to take ownership of some of the needed tasks, and I didn’t hear a lot of that happening.

After that I went to the Cyborg Project Update. Once again, it probably would have been much more useful to anyone who hadn’t been following along with the project, so while I didn’t get much from it, there was a lot of information presented on the current state and future plans for Cyborg.

And that was it! The end of another Summit, even if it was the first. That evening I met my sister for dinner. She lives in the Denver area, and it was great to catch up with her and spend some time relaxing after 3 long days. But the relaxation will be short-lived, as the Train PTG starts first thing tomorrow morning!

Geri & Ed
Selfie with my big sister Geri

More fun with etcd-compute

Last time I ended my work getting etcd-compute running at the point where I needed to configure the virtual networking. I’ve been busy the past few days with meetings and other work-related stuff, so it’s taken me a while to continue on this experiment. But I have some time now; let’s jump back in!

The reason I thought that I needed to set up virtual networking was that when I ran ip a on my controller node, all I had was the loopback and main ethernet interfaces. The directions for etcd-compute talked about setting up the metadata server by adding the IP address it uses to a virtual bridge: sudo ip addr add 169.254.169.254 dev virbr0. As I didn’t have such a bridge on my VM, I figured I had to add it. I tried several guides on adding a bridge to an Ubuntu server, but each one ended up messing up the networking, making the VM unreachable. I ended up re-creating my etcd1 so many times that I gave up and figured I’d try without the metadata server. I started the placement and etcd servers by running docker.sh, and then just on a lark I re-ran ip a. This time it showed:

ed@etcd1:~$ ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:90:6d:d0 brd ff:ff:ff:ff:ff:ff
inet 9.114.111.201/24 brd 9.114.111.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe90:6dd0/64 scope link
valid_lft forever preferred_lft forever
3: virbr0: mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:35:c1:0d brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
4: virbr0-nic: mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:35:c1:0d brd ff:ff:ff:ff:ff:ff

I’m not sure how those entries for ‘virbr0’ and ‘virbr0-nic’ got added (maybe docker added them?), but I wasn’t going to worry about that! So I ran the following commands, and they worked without a problem:

sudo ip addr add 169.254.169.254 dev virbr0
sudo python md_server/mdserver/server.py mdserver.conf &

So now that the metadata server is running, time to try running ecompute on all the nodes. I use iTerm2, which has some sweet tools for splitting the terminal screen and running the same command in the different panes. I recorded a script of what happened:

I ran the command ecompute & on all the nodes to start the compute service in the background.

ed@etcd1:~/projects/etcd-compute(master)$ ecompute &
[1] 4661
ed@etcd1:~/projects/etcd-compute(master)$ 1556230694.3633301: PID: 4661 [None] {'uuid': '19a89e30-4bdd-49e7-b1a0-d4172bf7b289', 'placement': {'endpoint': 'http://etcd1:8080'}, 'etcd': {'host': 'etcd1'}, 'resize': False, 'bridge': 'br0'}
1556230694.364856: PID: 4661 [19a89e30-4bdd-49e7-b1a0-d4172bf7b289] {'VCPU': 4, 'DISK_GB': 77, 'MEMORY_MB': 7976}
1556230694.5012665: PID: 4661 [19a89e30-4bdd-49e7-b1a0-d4172bf7b289] Existing resource provider with gen 7 found with usages: VCPU: 0, MEMORY_MB: 0, DISK_GB: 0.

It’s interesting to see that because I had run this a few times earlier, etcd-compute recognized the UUID of the node, and noted that there was already an entry for that resource provider, with a generation of 7. If I were to stop that ecompute service and then re-start it, I would see the same as above, except this time the generation would be 8. That’s because when the service is killed, it changes the ‘reserved’ amount of its VCPU inventory to the total amount, effectively preventing that node from being provisioned. That change increments the resource provider’s generation.
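Here is a small Python model of that behavior, in case it helps to see it spelled out. It is a toy, not the Placement data model or the etcd-compute code: the class and method names are my own, and the availability calculation ignores allocation ratios and existing usage. The UUID, the VCPU total of 4, and the starting generation of 7 come from the output above.

```python
class ResourceProvider:
    """Toy stand-in for a Placement resource provider (not the real model)."""

    def __init__(self, uuid, inventory, generation=0):
        self.uuid = uuid
        # inventory looks like {resource_class: {"total": n, "reserved": n}}
        self.inventory = inventory
        self.generation = generation

    def set_reserved(self, resource_class, amount):
        """Change the reserved amount; each inventory write bumps the generation."""
        self.inventory[resource_class]["reserved"] = amount
        self.generation += 1

    def available(self, resource_class):
        """Simplified: what is left after subtracting the reserved amount."""
        inv = self.inventory[resource_class]
        return inv["total"] - inv["reserved"]


rp = ResourceProvider(
    "19a89e30-4bdd-49e7-b1a0-d4172bf7b289",
    {"VCPU": {"total": 4, "reserved": 0}},
    generation=7,
)

# What happens when the compute service is killed: reserve every VCPU.
rp.set_reserved("VCPU", rp.inventory["VCPU"]["total"])
print(rp.generation, rp.available("VCPU"))  # 8 0
```

With all 4 VCPUs reserved, nothing can be scheduled to the node, and the generation has moved from 7 to 8.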

At about the 30-second mark, I tried to create a VM by running the command eschedule 'resources=VCPU:1,DISK_GB:1,MEMORY_MB:256' on the etcd3 node. That worked, and almost immediately you can see that it was scheduled to the etcd1 node, and the build process starts. However, there were many errors output, with the main one being error: failed to get domain ‘ff77fe58-e96a-498b-a3f5-a59030987238’. This is repeated several times, along with a bunch of network errors. So at this point I stopped the experiment.
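The argument passed to eschedule is a compact string of resource classes and amounts. I haven’t checked how eschedule parses it internally, so treat the following as a sketch of how such a string could be broken down in Python; parse_resources is a hypothetical helper, not a function from etcd-compute.

```python
def parse_resources(spec):
    """Turn 'resources=VCPU:1,DISK_GB:1' into {'VCPU': 1, 'DISK_GB': 1}."""
    key, _, value = spec.partition("=")
    if key != "resources" or not value:
        raise ValueError(f"expected 'resources=...', got {spec!r}")
    resources = {}
    for item in value.split(","):
        resource_class, _, amount = item.partition(":")
        resources[resource_class] = int(amount)
    return resources


requested = parse_resources("resources=VCPU:1,DISK_GB:1,MEMORY_MB:256")
print(requested)  # {'VCPU': 1, 'DISK_GB': 1, 'MEMORY_MB': 256}
```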

There’s a lot I learned by going through all this, and I see many places where the etcd-compute project could be improved, starting with the documentation. I’d also like to get some less ethereal debugging output, so that when there are problems like I had spinning up a VM, they are recorded for later analysis. I’d also like to learn a lot more about the details of the networking required so that I can make sense of some of the networking errors.

The author of etcd-compute, Chris Dent, and I are hoping to have a mini-sprint on this project next week at the Open Infrastructure Summit in Denver, Colorado. If you will be there and want to join in the fun, drop me an email and I’ll let you know when we settle on a time and place.

Playing with etcd-compute

I’ve been interested in the etcd-compute project by Chris Dent. It’s sort of a lightweight virtual machine manager like OpenStack Nova, but without the complexity and cruft Nova has accumulated over the past 9 years. It takes advantage of technologies that simply didn’t exist in 2010 when Nova was created, using etcd’s built-in notifications instead of passing large, complex objects over a message bus to make Remote Procedure Calls (RPC).

Keep in mind that Nova does a lot of things that etcd-compute can’t, so this isn’t a potential 1:1 replacement for Nova. But it does have potential as a much lighter replacement for those applications where the full power of Nova isn’t needed.

This post is designed to be obsolete within a week or so. What I’m aiming for is to record what worked for me following Chris’s instructions. Where I run into problems shows one of three things: our systems start out differently, or Chris assumed something that wasn’t in the README.md file, or my brain is not firing on all cylinders. It is my hope that this may help improve the installation instructions, and guide others who may wish to explore etcd-compute.

I don’t have a lot of hardware—ok, any hardware—at my disposal to experiment with, so I started by creating 3 Ubuntu 18.04 VMs in the internal OpenStack cloud for my team here at IBM. Yes, you can run virtualization on top of virtualization, and it’s turtles all the way down. But it does work! I named the instances etcd1, etcd2, and etcd3, with etcd1 being the controller and the others used as standard compute nodes.

There are some requirements—docker.io, virtinst, libvirt-daemon, libvirt-clients, and libguestfs-tools—that need to be installed on all the nodes, so I updated the distro packages and installed the requirements. Unfortunately, libvirtd wouldn’t start, and well, that’s kind of an important piece. So I cleaned house and tried again:

ed@etcd1:~$ sudo aptitude purge libvirt-daemon
ed@etcd1:~$ sudo apt install -y qemu qemu-kvm libvirt-bin bridge-utils virt-manager
ed@etcd1:~$ sudo systemctl enable libvirtd.service
Synchronizing state of libvirtd.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable libvirtd
Created symlink /etc/systemd/system/libvirt-bin.service → /lib/systemd/system/libvirtd.service.
Created symlink /etc/systemd/system/sockets.target.wants/virtlockd.socket → /lib/systemd/system/virtlockd.socket.
Created symlink /etc/systemd/system/sockets.target.wants/virtlogd.socket → /lib/systemd/system/virtlogd.socket.
ed@etcd1:~$ sudo systemctl start libvirtd.service
ed@etcd1:~$ sudo systemctl status libvirtd.service
● libvirtd.service - Virtualization daemon
Loaded: loaded (/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2019-04-23 23:29:30 UTC; 5s ago
Docs: man:libvirtd(8)
https://libvirt.org
Main PID: 5289 (libvirtd)
Tasks: 19 (limit: 32768)
CGroup: /system.slice/libvirtd.service
├─4486 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_
├─4487 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_
└─5289 /usr/sbin/libvirtd
Apr 23 23:29:30 egleafe-etcdcompute-1 systemd[1]: Starting Virtualization daemon…
Apr 23 23:29:30 egleafe-etcdcompute-1 systemd[1]: Started Virtualization daemon.
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq[4486]: read /etc/hosts - 10 addresses
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq[4486]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq-dhcp[4486]: read /var/lib/libvirt/dnsmasq/default.hostsfile
lines 1-17/17 (END)

So I guess it’s working now!

It’s also a pain to always have to use sudo to run docker commands, so add your user to the docker group. The command for this is sudo usermod -a -G docker ed, which adds user ‘ed’ to the group ‘docker’. You have to log out and log back in for it to take effect, but once you do, you can run commands like docker ps -a without sudo.

Also, in my experience I’ve run into various odd problems using the distro version of Python, so I prefer to install from source to get the latest Python (3.7.3 right now).

Being a creature of habit, I like having the project code I’m working with to be under a ~/projects directory. So for each of these instances, I ran the following:

ed@etcd1:~$ mkdir projects
ed@etcd1:~$ cd projects/
ed@etcd1:~/projects$ git clone https://github.com/cdent/etcd-compute.git
Cloning into 'etcd-compute'…
remote: Enumerating objects: 169, done.
remote: Counting objects: 100% (169/169), done.
remote: Compressing objects: 100% (107/107), done.
remote: Total 205 (delta 97), reused 124 (delta 61), pack-reused 36
Receiving objects: 100% (205/205), 54.58 KiB | 119.00 KiB/s, done.
Resolving deltas: 100% (107/107), done.
ed@etcd1:~/projects$ cd etcd-compute/
ed@etcd1:~/projects/etcd-compute(master)$

As the etcd-compute code has its own dependencies, those need to be installed by running sudo python setup.py develop. When I ran that the first time, I got an error when it was trying to install libvirt-python. I tried installing some other libvirt-related libraries and binaries, but I kept getting the same error. After a while I was trying anything I could think of, even running under Python 2! (didn’t work). Maybe it was something about Python 3.7 that was problematic, so I created a venv for Python 3.6, and ran pip install libvirt-python. It installed without a problem. Hmmm. So I fired up a Python 3.7 venv, and it also installed into that. It seems that the installation using setup.py was doing something different than a straight pip install. To test that, I got rid of the venvs, and ran sudo pip install libvirt-python, and it worked just fine. I was then able to install the rest of the dependencies by running sudo python setup.py develop.

Now that the dependencies are installed, we need to create the database for placement, and then modify the dockerenv file so that the OS_PLACEMENT_DATABASE__CONNECTION setting points to that. My database is on a MariaDB server, so I needed to change the value to:

mysql+pymysql://user:secret@my_host/placement

That means, of course, that I need to install pymysql using sudo pip install pymysql before I can make a connection. Once that’s done, I started the docker containers by running ./docker.sh from the primary VM. In my case, that’s etcd1.
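If you are wondering why pymysql is needed, look at the connection value again: it is a SQLAlchemy-style URL whose scheme names both the database dialect and the driver. A quick way to see the pieces, using only the Python standard library (the user, password, and host are the placeholder values from above, not real credentials):

```python
from urllib.parse import urlsplit

url = "mysql+pymysql://user:secret@my_host/placement"
parts = urlsplit(url)

# The scheme is "<dialect>+<driver>"; the driver is the package to install.
dialect, _, driver = parts.scheme.partition("+")
database = parts.path.lstrip("/")

print(dialect, driver, parts.username, parts.hostname, database)
# mysql pymysql user my_host placement
```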

That brings up another edit that’s needed on all your “machines”: changing the location of the host in compute.yaml and schedule.yaml. These assume that the host is named ‘ds1’, which isn’t true in this case. I changed ‘ds1’ to ‘etcd1’, and then added an entry in each node’s /etc/hosts file with the IP address of the etcd1 VM.
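I made that change by hand in an editor, but with more than a couple of nodes it could be scripted. Here is one way it might look in Python. The sample text is my guess at the relevant part of compute.yaml, based on the settings etcd-compute prints at startup; the real file may be laid out differently, and retarget_host is a hypothetical helper.

```python
import re


def retarget_host(config_text, old="ds1", new="etcd1"):
    """Replace whole-word occurrences of the old host name with the new one."""
    return re.sub(rf"\b{re.escape(old)}\b", new, config_text)


# Assumed layout, for illustration only.
sample = (
    "placement:\n"
    "  endpoint: http://ds1:8080\n"
    "etcd:\n"
    "  host: ds1\n"
)

updated = retarget_host(sample)
print(updated)
```

The word-boundary match keeps it from touching a longer name that merely starts with ‘ds1’. To apply it for real, you would read each file, pass its contents through the function, and write the result back.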

We also want to create a value for the uuid in compute.yaml. One simple way is to run python -c "import uuid; print(uuid.uuid4())", and copy the output to paste into compute.yaml. Do that on every compute node you are running.

That’s enough for one day. Tomorrow we start with configuring networking!