I’ve been interested in the etcd-compute project by Chris Dent. It’s sort of a lightweight virtual machine manager like OpenStack Nova, but without the complexity and cruft Nova has accumulated over the past 9 years. It takes advantage of technologies that simply didn’t exist in 2010 when Nova was created, using etcd‘s built-in notifications instead of passing large, complex objects over a message bus to make Remote Procedure Calls (RPC).
Keep in mind that Nova does a lot of things that etcd-compute can’t, so this isn’t a potential 1:1 replacement for Nova. But it does have potential as a much lighter replacement for those applications where the full power of Nova isn’t needed.
This post is designed to be obsolete within a week or so. What I’m aiming for is to record what worked for me following Chris’s instructions. Where I run into problems shows one of three things: our systems start out differently, or Chris assumed something that wasn’t in the README.md file, or my brain is not firing on all cylinders. It is my hope that this may help improve the installation instructions, and guide others who may wish to explore etcd-compute.
I don’t have a lot of hardware—ok, any hardware—at my disposal to experiment with, so I started by creating 3 Ubuntu 18.04 VMs in the internal OpenStack cloud for my team here at IBM. Yes, you can run virtualization on top of virtualization, and it’s turtles all the way down. But it does work! I named the instances etcd1, etcd2, and etcd3, with etcd1 being the controller and the others used as standard compute nodes.
There are some requirements—docker.io, virtinst, libvirt-daemon, libvirt-clients, and libguestfs-tools—that need to be installed on all the nodes, so I updated the distro packages and installed the requirements. Unfortunately, libvirtd wouldn’t start, and well, that’s kind of an important piece. So I cleaned house and tried again:
ed@etcd1:~$sudo aptitude purge libvirt-daemon
ed@etcd1:~$sudo apt install -y qemu qemu-kvm libvirt-bin bridge-utils virt-manager
ed@etcd1:~$ sudo systemctl enable libvirtd.service
Synchronizing state of libvirtd.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable libvirtd
Created symlink /etc/systemd/system/libvirt-bin.service → /lib/systemd/system/libvirtd.service.
Created symlink /etc/systemd/system/sockets.target.wants/virtlockd.socket → /lib/systemd/system/virtlockd.socket.
Created symlink /etc/systemd/system/sockets.target.wants/virtlogd.socket → /lib/systemd/system/virtlogd.socket.
ed@etcd1:~$ sudo systemctl start libvirtd.service
ed@etcd1:~$ sudo systemctl status libvirtd.service
● libvirtd.service - Virtualization daemon
Loaded: loaded (/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2019-04-23 23:29:30 UTC; 5s ago
Main PID: 5289 (libvirtd)
Tasks: 19 (limit: 32768)
├─4486 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_
├─4487 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_
Apr 23 23:29:30 egleafe-etcdcompute-1 systemd: Starting Virtualization daemon…
Apr 23 23:29:30 egleafe-etcdcompute-1 systemd: Started Virtualization daemon.
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq: read /etc/hosts - 10 addresses
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq-dhcp: read /var/lib/libvirt/dnsmasq/default.hostsfile
lines 1-17/17 (END)
So I guess it’s working now!
It’s also a pain to always have to use sudo to run docker commands, so add your user to the docker group. The command for this is
sudo usermod -a -G docker ed, which adds user ‘ed’ to the group ‘docker’. You have to log out and log back in for it to take effect, but once you do, you can run commands like
docker ps -a without sudo.
Also, in my experience I’ve run into various odd problems using the distro version of Python, so I prefer to install from source to get the latest Python (3.7.3 right now).
Being a creature of habit, I like having the project code I’m working with to be under a ~/projects directory. So for each of these instances, I ran the following:
ed@etcd1:~$ mkdir projects
ed@etcd1:~$ cd projects/
ed@etcd1:~/projects$ git clone https://github.com/cdent/etcd-compute.git
Cloning into 'etcd-compute'…
remote: Enumerating objects: 169, done.
remote: Counting objects: 100% (169/169), done.
remote: Compressing objects: 100% (107/107), done.
remote: Total 205 (delta 97), reused 124 (delta 61), pack-reused 36
Receiving objects: 100% (205/205), 54.58 KiB | 119.00 KiB/s, done.
Resolving deltas: 100% (107/107), done.
ed@etcd1:~/projects$ cd etcd-compute/
As the etcd-compute code has its own dependencies, those need to be installed by running
sudo python setup.py develop. When I ran that the first time, I got an error when it was trying to install libvirt-python. I tried installing some other libvirt-related libraries and binaries, but I kept getting the same error. After a while I was trying anything I could think of, even running under Python 2! (didn’t work). Maybe it was something about Python 3.7 that was problematic, so I created a venv for Python 3.6, and ran
pip install libvirt-python. It installed without a problem. Hmmm. So I fired up a Python 3.7 venv, and it also installed into that. It seems that the installation using setup.py was doing something different than a straight pip install. To test that, I got rid of the venvs, and ran
sudo pip install libvirt-python, and it worked just fine. I was then able to install the rest of the dependencies by running
sudo python setup.py develop.
Now that the dependencies are installed, we need to create the database for placement, and then modify the dockerenv file so that the OS_PLACEMENT_DATABASE__CONNECTION setting points to that. My database is on a MariaDB server, so I needed to change the value to:
That means, of course, that I need to install pymysql using
sudo pip install pymysql before I can make a connection. Once that’s done, I started the docker containers by running
./docker.sh from the primary VM. In my case, that’s etcd1.
That brings up another edit that’s needed on all your “machines”: changing the location of the host in compute.yaml and schedule.yaml. These assume that the host is named ‘ds1’, which isn’t true in this case. I changed ‘ds1’ to ‘etcd1’, and then added an entry in each node’s /etc/hosts file with the IP address of the etcd1 VM.
We also want to create a value for the uuid in compute.yaml. One simple way is to run
python -c "import uuid; print(uuid.uuid4())", and copy the output to paste into compute.yaml. Do that on every compute node you are running.
That’s enough for one day. Tomorrow we start with configuring networking!