etcd-compute – Walking Contradiction

ed@etcd1:~$ ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:90:6d:d0 brd ff:ff:ff:ff:ff:ff
inet 9.114.111.201/24 brd 9.114.111.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe90:6dd0/64 scope link
valid_lft forever preferred_lft forever
3: virbr0: mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:35:c1:0d brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
4: virbr0-nic: mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:35:c1:0d brd ff:ff:ff:ff:ff:ff

ed@etcd1:~/projects/etcd-compute(master)$ ecompute &
[1] 4661
ed@etcd1:~/projects/etcd-compute(master)$ 1556230694.3633301: PID: 4661 [None] {'uuid': '19a89e30-4bdd-49e7-b1a0-d4172bf7b289', 'placement': {'endpoint': 'http://etcd1:8080'}, 'etcd': {'host': 'etcd1'}, 'resize': False, 'bridge': 'br0'}
1556230694.364856: PID: 4661 [19a89e30-4bdd-49e7-b1a0-d4172bf7b289] {'VCPU': 4, 'DISK_GB': 77, 'MEMORY_MB': 7976}
1556230694.5012665: PID: 4661 [19a89e30-4bdd-49e7-b1a0-d4172bf7b289] Existing resource provider with gen 7 found with usages: VCPU: 0, MEMORY_MB: 0, DISK_GB: 0.

I’ve been interested in the etcd-compute project by Chris Dent. It’s sort of a lightweight virtual machine manager like OpenStack Nova, but without the complexity and cruft Nova has accumulated over the past 9 years. It takes advantage of technologies that simply didn’t exist in 2010 when Nova was created, using etcd‘s built-in notifications instead of passing large, complex objects over a message bus to make Remote Procedure Calls (RPC).

Keep in mind that Nova does a lot of things that etcd-compute can’t, so this isn’t a potential 1:1 replacement for Nova. But it does have potential as a much lighter replacement for those applications where the full power of Nova isn’t needed.

This post is designed to be obsolete within a week or so. What I’m aiming for is to record what worked for me following Chris’s instructions. Where I run into problems shows one of three things: our systems start out differently, or Chris assumed something that wasn’t in the README.md file, or my brain is not firing on all cylinders. It is my hope that this may help improve the installation instructions, and guide others who may wish to explore etcd-compute.

I don’t have a lot of hardware—ok, any hardware—at my disposal to experiment with, so I started by creating 3 Ubuntu 18.04 VMs in the internal OpenStack cloud for my team here at IBM. Yes, you can run virtualization on top of virtualization, and it’s turtles all the way down. But it does work! I named the instances etcd1, etcd2, and etcd3, with etcd1 being the controller and the others used as standard compute nodes.

There are some requirements—docker.io, virtinst, libvirt-daemon, libvirt-clients, and libguestfs-tools—that need to be installed on all the nodes, so I updated the distro packages and installed the requirements. Unfortunately, libvirtd wouldn’t start, and well, that’s kind of an important piece. So I cleaned house and tried again:

ed@etcd1:~$sudo aptitude purge libvirt-daemon
ed@etcd1:~$sudo apt install -y qemu qemu-kvm libvirt-bin  bridge-utils  virt-manager
ed@etcd1:~$ sudo systemctl enable libvirtd.service
 Synchronizing state of libvirtd.service with SysV service script with /lib/systemd/systemd-sysv-install.
 Executing: /lib/systemd/systemd-sysv-install enable libvirtd
 Created symlink /etc/systemd/system/libvirt-bin.service → /lib/systemd/system/libvirtd.service.
 Created symlink /etc/systemd/system/sockets.target.wants/virtlockd.socket → /lib/systemd/system/virtlockd.socket.
 Created symlink /etc/systemd/system/sockets.target.wants/virtlogd.socket → /lib/systemd/system/virtlogd.socket.
 ed@etcd1:~$ sudo systemctl start libvirtd.service
 ed@etcd1:~$ sudo systemctl status libvirtd.service
 ● libvirtd.service - Virtualization daemon
    Loaded: loaded (/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
    Active: active (running) since Tue 2019-04-23 23:29:30 UTC; 5s ago
      Docs: man:libvirtd(8)
            https://libvirt.org
  Main PID: 5289 (libvirtd)
     Tasks: 19 (limit: 32768)
    CGroup: /system.slice/libvirtd.service
            ├─4486 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_
            ├─4487 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_
            └─5289 /usr/sbin/libvirtd
 Apr 23 23:29:30 egleafe-etcdcompute-1 systemd[1]: Starting Virtualization daemon…
 Apr 23 23:29:30 egleafe-etcdcompute-1 systemd[1]: Started Virtualization daemon.
 Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq[4486]: read /etc/hosts - 10 addresses
 Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq[4486]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
 Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq-dhcp[4486]: read /var/lib/libvirt/dnsmasq/default.hostsfile
 lines 1-17/17 (END)

So I guess it’s working now!

It’s also a pain to always have to use sudo to run docker commands, so add your user to the docker group. The command for this is sudo usermod -a -G docker ed, which adds user ‘ed’ to the group ‘docker’. You have to log out and log back in for it to take effect, but once you do, you can run commands like docker ps -a without sudo.

Also, in my experience I’ve run into various odd problems using the distro version of Python, so I prefer to install from source to get the latest Python (3.7.3 right now).

Being a creature of habit, I like having the project code I’m working with to be under a ~/projects directory. So for each of these instances, I ran the following:

ed@etcd1:~$ mkdir projects

ed@etcd1:~$ cd projects/

ed@etcd1:~/projects$ git clone https://github.com/cdent/etcd-compute.git

Cloning into 'etcd-compute'…

remote: Enumerating objects: 169, done.

remote: Counting objects: 100% (169/169), done.

remote: Compressing objects: 100% (107/107), done.

remote: Total 205 (delta 97), reused 124 (delta 61), pack-reused 36

Receiving objects: 100% (205/205), 54.58 KiB | 119.00 KiB/s, done.

Resolving deltas: 100% (107/107), done.

ed@etcd1:~/projects$ cd etcd-compute/

ed@etcd1:~/projects/etcd-compute(master)$

As the etcd-compute code has its own dependencies, those need to be installed by running sudo python setup.py develop. When I ran that the first time, I got an error when it was trying to install libvirt-python. I tried installing some other libvirt-related libraries and binaries, but I kept getting the same error. After a while I was trying anything I could think of, even running under Python 2! (didn’t work). Maybe it was something about Python 3.7 that was problematic, so I created a venv for Python 3.6, and ran pip install libvirt-python. It installed without a problem. Hmmm. So I fired up a Python 3.7 venv, and it also installed into that. It seems that the installation using setup.py was doing something different than a straight pip install. To test that, I got rid of the venvs, and ran sudo pip install libvirt-python, and it worked just fine. I was then able to install the rest of the dependencies by running sudo python setup.py develop.

Now that the dependencies are installed, we need to create the database for placement, and then modify the dockerenv file so that the OS_PLACEMENT_DATABASE__CONNECTION setting points to that. My database is on a MariaDB server, so I needed to change the value to:

mysql+pymysql://user:secret@my_host/placement

That means, of course, that I need to install pymysql using sudo pip install pymysql before I can make a connection. Once that’s done, I started the docker containers by running ./docker.sh from the primary VM. In my case, that’s etcd1.

That brings up another edit that’s needed on all your “machines”: changing the location of the host in compute.yaml and schedule.yaml. These assume that the host is named ‘ds1’, which isn’t true in this case. I changed ‘ds1’ to ‘etcd1’, and then added an entry in each node’s /etc/hosts file with the IP address of the etcd1 VM.

We also want to create a value for the uuid in compute.yaml. One simple way is to run python -c "import uuid; print(uuid.uuid4())", and copy the output to paste into compute.yaml. Do that on every compute node you are running.

That’s enough for one day. Tomorrow we start with configuring networking!

Tag: etcd-compute

More fun with etcd-compute

Playing with etcd-compute