More fun with etcd-compute

Last time I ended my work getting etcd-compute running at the point where I needed to configure the virtual networking. I’ve been busy the past few days with meetings and other work-related stuff, so it’s taken me a while to continue on this experiment. But I have some time now; let’s jump back in!

The reason I thought that I needed to set up virtual networking was that when I ran ip a on my controller node, all I had was the loopback and main ethernet interfaces. The directions for etcd-compute talked about setting up the metadata server by adding the IP address it uses to a virtual bridge: sudo ip addr add 169.254.169.254 dev virbr0. As I didn’t have such a bridge on my VM, I figured I had to add it. I tried sever guides on adding a bridge to an Ubuntu server, but each one ended up messing up the networking, making the VM unreachable. I ended up re-creating my etcd1 so many times that I gave up and figured I try without the metadata server. I started the placement and etcd servers by running docker.sh, and then just on a lark I re-ran ip a. This time it showed:

ed@etcd1:~$ ip a
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
inet 127.0.0.1/8 scope host lo
valid_lft forever preferred_lft forever
inet6 ::1/128 scope host
valid_lft forever preferred_lft forever
2: ens3: mtu 1500 qdisc fq_codel state UP group default qlen 1000
link/ether fa:16:3e:90:6d:d0 brd ff:ff:ff:ff:ff:ff
inet 9.114.111.201/24 brd 9.114.111.255 scope global ens3
valid_lft forever preferred_lft forever
inet6 fe80::f816:3eff:fe90:6dd0/64 scope link
valid_lft forever preferred_lft forever
3: virbr0: mtu 1500 qdisc noqueue state DOWN group default qlen 1000
link/ether 52:54:00:35:c1:0d brd ff:ff:ff:ff:ff:ff
inet 192.168.122.1/24 brd 192.168.122.255 scope global virbr0
valid_lft forever preferred_lft forever
4: virbr0-nic: mtu 1500 qdisc fq_codel master virbr0 state DOWN group default qlen 1000
link/ether 52:54:00:35:c1:0d brd ff:ff:ff:ff:ff:ff

I’m not sure how those entries for ‘virbr0’ and ‘virbr0-nic’ got added (maybe docker added them?), but I wasn’t going to worry about that! So I ran the following commands, and they worked without a problem:

sudo ip addr add 169.254.169.254 dev virbr
sudo python md_server/mdserver/server.py mdserver.conf &

So now that the metadata server is running, time to try running ecompute on all the nodes. I use iTerm2, which has some sweet tools for splitting the terminal screen and running the same command in the different panes. I recorded a script of what happened:

I ran the command ecompute & on all the nodes to start the compute service in the background.

ed@etcd1:~/projects/etcd-compute(master)$ ecompute &
[1] 4661
ed@etcd1:~/projects/etcd-compute(master)$ 1556230694.3633301: PID: 4661 [None] {'uuid': '19a89e30-4bdd-49e7-b1a0-d4172bf7b289', 'placement': {'endpoint': 'http://etcd1:8080'}, 'etcd': {'host': 'etcd1'}, 'resize': False, 'bridge': 'br0'}
1556230694.364856: PID: 4661 [19a89e30-4bdd-49e7-b1a0-d4172bf7b289] {'VCPU': 4, 'DISK_GB': 77, 'MEMORY_MB': 7976}
1556230694.5012665: PID: 4661 [19a89e30-4bdd-49e7-b1a0-d4172bf7b289] Existing resource provider with gen 7 found with usages: VCPU: 0, MEMORY_MB: 0, DISK_GB: 0.

It’s interesting to see that because I had run this a few times earlier, etcd-compute recognized the UUID of the node, and noted that there was already an entry for that resource provider, with a generation of 7. If I were to stop that ecompute service and then re-start it, I would see the same as above, except this time the generation would be 8. That’s because when the service is killed, it changes the ‘reserved’ amount of its VCPU inventrory to the total amount, effectively preventing that node from being provisioned. That change increments the resource provider’s generation.

At about the 30-second mark, I tried to create a VM by running the command eschedule 'resources=VCPU:1,DISK_GB:1,MEMORY_MB:256' on the etcd3 node. That worked, and almost immediately you can see that it was scheduled to the etcd1 node, and the build process starts. However, there were many errors output, with the main one being error: failed to get domain ‘ff77fe58-e96a-498b-a3f5-a59030987238’. This is repeated several times, along with a bunch of network errors. So at this point I stopped the experiment.

There’s a lot I learned by going through all this, and I see many places where the etcd-compute project could be improved, starting with the documentation. I’d also like to get some less ethereal debugging output, so that when there are problems like I had spinning up a VM, they are recorded for later analysis. I’d also like to learn a lot more about the details of the networking required so that I can make sense of some of the networking errors.

The author of etcd-compute, Chris Dent, and I are hoping to have a mini-sprint on this project next week at the Open Infrastructure Summit in Denver, Colorado. If you will be there and want to join in the fun, drop me an email and I’ll let you know when we settle on a time and place.

Playing with etcd-compute

I’ve been interested in the etcd-compute project by Chris Dent. It’s sort of a lightweight virtual machine manager like OpenStack Nova, but without the complexity and cruft Nova has accumulated over the past 9 years. It takes advantage of technologies that simply didn’t exist in 2010 when Nova was created, using etcd‘s built-in notifications instead of passing large, complex objects over a message bus to make Remote Procedure Calls (RPC).

Keep in mind that Nova does a lot of things that etcd-compute can’t, so this isn’t a potential 1:1 replacement for Nova. But it does have potential as a much lighter replacement for those applications where the full power of Nova isn’t needed.

This post is designed to be obsolete within a week or so. What I’m aiming for is to record what worked for me following Chris’s instructions. Where I run into problems shows one of three things: our systems start out differently, or Chris assumed something that wasn’t in the README.md file, or my brain is not firing on all cylinders. It is my hope that this may help improve the installation instructions, and guide others who may wish to explore etcd-compute.

I don’t have a lot of hardware—ok, any hardware—at my disposal to experiment with, so I started by creating 3 Ubuntu 18.04 VMs in the internal OpenStack cloud for my team here at IBM. Yes, you can run virtualization on top of virtualization, and it’s turtles all the way down. But it does work! I named the instances etcd1, etcd2, and etcd3, with etcd1 being the controller and the others used as standard compute nodes.

There are some requirements—docker.io, virtinst, libvirt-daemon, libvirt-clients, and libguestfs-tools—that need to be installed on all the nodes, so I updated the distro packages and installed the requirements. Unfortunately, libvirtd wouldn’t start, and well, that’s kind of an important piece. So I cleaned house and tried again:

ed@etcd1:~$sudo aptitude purge libvirt-daemon
ed@etcd1:~$sudo apt install -y qemu qemu-kvm libvirt-bin  bridge-utils  virt-manager
ed@etcd1:~$ sudo systemctl enable libvirtd.service
Synchronizing state of libvirtd.service with SysV service script with /lib/systemd/systemd-sysv-install.
Executing: /lib/systemd/systemd-sysv-install enable libvirtd
Created symlink /etc/systemd/system/libvirt-bin.service → /lib/systemd/system/libvirtd.service.
Created symlink /etc/systemd/system/sockets.target.wants/virtlockd.socket → /lib/systemd/system/virtlockd.socket.
Created symlink /etc/systemd/system/sockets.target.wants/virtlogd.socket → /lib/systemd/system/virtlogd.socket.
ed@etcd1:~$ sudo systemctl start libvirtd.service
ed@etcd1:~$ sudo systemctl status libvirtd.service
● libvirtd.service - Virtualization daemon
Loaded: loaded (/lib/systemd/system/libvirtd.service; enabled; vendor preset: enabled)
Active: active (running) since Tue 2019-04-23 23:29:30 UTC; 5s ago
Docs: man:libvirtd(8)
https://libvirt.org
Main PID: 5289 (libvirtd)
Tasks: 19 (limit: 32768)
CGroup: /system.slice/libvirtd.service
├─4486 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_
├─4487 /usr/sbin/dnsmasq --conf-file=/var/lib/libvirt/dnsmasq/default.conf --leasefile-ro --dhcp-script=/usr/lib/libvirt/libvirt_
└─5289 /usr/sbin/libvirtd
Apr 23 23:29:30 egleafe-etcdcompute-1 systemd[1]: Starting Virtualization daemon…
Apr 23 23:29:30 egleafe-etcdcompute-1 systemd[1]: Started Virtualization daemon.
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq[4486]: read /etc/hosts - 10 addresses
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq[4486]: read /var/lib/libvirt/dnsmasq/default.addnhosts - 0 addresses
Apr 23 23:29:30 egleafe-etcdcompute-1 dnsmasq-dhcp[4486]: read /var/lib/libvirt/dnsmasq/default.hostsfile
lines 1-17/17 (END)

So I guess it’s working now!

It’s also a pain to always have to use sudo to run docker commands, so add your user to the docker group. The command for this is sudo usermod -a -G docker ed, which adds user ‘ed’ to the group ‘docker’. You have to log out and log back in for it to take effect, but once you do, you can run commands like docker ps -a without sudo.

Also, in my experience I’ve run into various odd problems using the distro version of Python, so I prefer to install from source to get the latest Python (3.7.3 right now).

Being a creature of habit, I like having the project code I’m working with to be under a ~/projects directory. So for each of these instances, I ran the following:

ed@etcd1:~$ mkdir projects
ed@etcd1:~$ cd projects/
ed@etcd1:~/projects$ git clone https://github.com/cdent/etcd-compute.git
Cloning into 'etcd-compute'…
remote: Enumerating objects: 169, done.
remote: Counting objects: 100% (169/169), done.
remote: Compressing objects: 100% (107/107), done.
remote: Total 205 (delta 97), reused 124 (delta 61), pack-reused 36
Receiving objects: 100% (205/205), 54.58 KiB | 119.00 KiB/s, done.
Resolving deltas: 100% (107/107), done.
ed@etcd1:~/projects$ cd etcd-compute/
ed@etcd1:~/projects/etcd-compute(master)$

As the etcd-compute code has its own dependencies, those need to be installed by running sudo python setup.py develop. When I ran that the first time, I got an error when it was trying to install libvirt-python. I tried installing some other libvirt-related libraries and binaries, but I kept getting the same error. After a while I was trying anything I could think of, even running under Python 2! (didn’t work). Maybe it was something about Python 3.7 that was problematic, so I created a venv for Python 3.6, and ran pip install libvirt-python. It installed without a problem. Hmmm. So I fired up a Python 3.7 venv, and it also installed into that. It seems that the installation using setup.py was doing something different than a straight pip install. To test that, I got rid of the venvs, and ran sudo pip install libvirt-python, and it worked just fine. I was then able to install the rest of the dependencies by running sudo python setup.py develop.

Now that the dependencies are installed, we need to create the database for placement, and then modify the dockerenv file so that the OS_PLACEMENT_DATABASE__CONNECTION setting points to that. My database is on a MariaDB server, so I needed to change the value to:

mysql+pymysql://user:secret@my_host/placement

That means, of course, that I need to install pymysql using sudo pip install pymysql before I can make a connection. Once that’s done, I started the docker containers by running ./docker.sh from the primary VM. In my case, that’s etcd1.

That brings up another edit that’s needed on all your “machines”: changing the location of the host in compute.yaml and schedule.yaml. These assume that the host is named ‘ds1’, which isn’t true in this case. I changed ‘ds1’ to ‘etcd1’, and then added an entry in each node’s /etc/hosts file with the IP address of the etcd1 VM.

We also want to create a value for the uuid in compute.yaml. One simple way is to run python -c "import uuid; print(uuid.uuid4())", and copy the output to paste into compute.yaml. Do that on every compute node you are running.

That’s enough for one day. Tomorrow we start with configuring networking!

Running Pi-hole

A few days ago I read a tweet from someone who recounted their experience with Pi-hole, which brands itself as “A black hole for Internet advertisements”. My curiosity being piqued, I read up some more, and I liked what I read.

I’ve run ad blocking extensions in my browsers for years, and it’s made using the web bearable. But if you think about it, what those blockers are doing is simply preventing those ads from displaying; they are still fetched from the internet, using up bandwidth, which slows down your browsing.

Pi-hole takes a different approach, acting as the DNS server for your network. When a request is received, it compares it to a curated list of domains known to serve advertisements and track users, and if it is in that list, instead of forwarding the request to that domain, it forwards it to its own built-in web server, and returns a blank page. In other words, no internet traffic at all is generated!

I had an extra Raspberry Pi on hand, so I figured that I’d play around with it this past weekend. I installed the latest Raspbian operating system on it, and followed the simple instructions to git clone the Pi-hole source code and install it. There are several other pages I found detailing the steps, but as it’s pretty straightforward I didn’t feel that I had anything to add by creating yet another step-by-step guide.

Note that you don’t have to use a Raspberry Pi to run Pi-hole; you could run it on just about any Linux system. There is even a Docker image you can pull and run, so you have plenty of options if you don’t have any Raspberry Pis on hand. If you do want to pick up a Pi or three, I recommend PiShop.us. They have the raw components as well as full kits for the Pi, and also lots of other maker-oriented products.

I use the Google Wifi mesh system in my house, so I was a little concerned that getting it to work with the Pi-hole DNS might be tricky, as the Google Wifi system is rather limited in your ability to customize it. I configured the Raspberry Pi to only use the wired ethernet connection, and plugged that into the output jack of the Google Wifi unit. I set the Google Wifi app to give that Raspberry Pi a static IP address. So far, so good. I first tested it by changing my laptop’s DNS to point to the Pi-hole’s address, and loaded a few web pages. The Pi-hole comes with an admin web server that allows you to not only configure things, but also see real-time stats of the traffic that it’s handling.

Pi-hole Admin Dashboard
Pi-hole Admin Dashboard

Once I saw that it was indeed working and working well, I opened the Google Wifi app to the Settings tab, and then opened the Network and General button. That gives you a page with several options; the one you need is Advanced Networking. The top option on that screen is DNS; open that and change it from the default of Automatic to Custom. Set the Primary server to the IP address of the Raspberry Pi running Pi-hole, and the secondary address to some other server (I used Google’s default of 8.8.8.8). Save that, and you’re in business! (see photo below) Now every device on your local network will experience faster, cleaner internet browsing free of ads!

Google Wifi settings
Configuring the Google Wifi app to use the Pi-hole for DNS. Replace the blacked-out address with your device’s IP address.

If you don’t have a central device acting as a router that you can configure, you’ll have to change each device’s settings to use the Pi-hole for its DNS. A little more work, but once it’s done, it’s done.

As an added bonus, once you’re using Pi-hole, you can disable your browser’s ad blocking software, as it is pretty much redundant. Since I’ve disabled those extensions, I haven’t gotten any of those annoying “Please turn off your ad blocker” popups when I go to some sites. I’m sure that there are some techniques that might flag Pi-hole, but so far I haven’t hit any.

I’ve only been using it for a few days now, but the results were instantly noticeable. The percentage of blocked requests varies slightly, but has typically been around 12%. The Pi-hole project is entirely user-supported, and I was more than happy to donate even after using it for a short time.

A New Ubuntu VM With Your Username

While many of the VMs I create are meant to be short-lived, some are created as working environments for me. For me, it is infinitely easier to have them all have the username ‘ed‘ so that my brain doesn’t have to do too much context-switching, and so I can use muscle memory.

When creating a new VM with an Ubuntu image, though, the default user is ‘ubuntu‘. It’s rather simple to change that to ‘ed’, but to do it correctly requires a series of steps. Note that these steps are for the current 18.04 LTS release; other releases may need some small changes, but the general steps should be the same. And as I found out those steps by searching the web and finding blogs of people who had done similar things, I thought I’d write down the steps so that in the future others can find this posting and find it useful. And it’ll also be easier for me to find all of it in one place!

So start by creating a VM, using whatever tool you prefer. I use OpenStack (surprise!), which gives me the ability to upload my public key and have it available in the VM so that passwords aren’t needed. The one I’m working on today is for deploying my IRCbot using kubernetes, so I named the instance ‘kubeirc‘. I add a line to my /etc/hosts file to point that name to the IP address of the instance. Once the VM is running I can connect using:

ssh ubuntu@kubeirc

Some people feel that using a password-less sudo reduces security, but if someone has access to your user in your VM, you are probably already pretty hosed. So I change sudo to not require passwords. To do this, run sudo visudo, and change the line beginning with ‘%sudo’ to read:

%sudo   ALL=NOPASSWD: ALL

One of the problems to overcome is that you cannot rename the user you used to SSH into the instance. So I create another user, and give them sudo powers:

sudo adduser temp
sudo usermod -a -G sudo temp

My cloud doesn’t allow SSH with passwords, so the next thing to do is copy the public key from the ubuntu user to the temp user:

sudo cp -R .ssh /home/temp/
sudo chown -R temp:temp /home/temp

Once that’s done, I disconnect the SSH session, and the re-connect as the temp user:

ssh temp@kubeirc

If that doesn’t connect you, go back in as the ubuntu user and re-check the permissions on the /home/temp/.ssh directory and its contents – that’s usually the problem. Once I’m connected as temp, I can then get to work on switching the ubuntu user.

# Change the ubuntu username to 'ed', and change the home directory
sudo usermod -l ed -d /home/ed -m ubuntu
# Rename the 'ubuntu' group
sudo groupmod -n ed ubuntu

That’s it! Now to verify, disconnect the current SSH session, and re-connect with the new name. Since my username on my home system is ‘ed’ (I told you I like muscle memory), I can just run:

ssh kubeirc

…and I’m in! The only thing left is to remove the ‘temp’ user. It’s not critical that you do so, but I like to clean up after myself. To do that, run the following:

sudo deluser temp

That will delete the user and group named ‘temp’, as well as the ‘/home/temp’ directory.

Now I can continue using that VM with the same username as my other development environments. And while this isn’t a large amount of work to do each time, I’d rather not do such repetitive work if I don’t have to. So I add my standard stuff, such as git and vim configuration files, and then take a snapshot of the instance at this point. In the future, whenever I need an Ubuntu 18.04 instance, I can create it from this snapshot, and I’m ready to go.

Using a Python virtual environment

This is a quick demonstration of how to create a virtual environment in Python3. I’m starting with an empty directory in ~/projects/demo. I then run the command to create a virtual environment:

ed@imac:~/projects/demo$ ll
ed@imac:~/projects/demo$ python3 -m venv my_env
ed@imac:~/projects/demo$ ll
total 0
drwxr-xr-x 6 ed staff 192B Jan 24 18:14 my_env

Note that the command created a directory with the name I gave it: ‘my_env’. Next we have to activate it. ‘Activate’ changes Python’s internal references to look for things such as which Python version to run, and where installed modules are placed.

ed@imac:~/projects/demo$ source my_env/bin/activate (my_env)ed@imac:~/projects/demo$

I have a bash script that changes the prompt to show the current Python environment; notice that after activating the prompt now starts with ‘(myenv)’.

Installed modules are located in the ‘site-packages’ subdirectory that’s a few levels deep. Let’s see what’s in this fresh virtual env’s site-packages:

(my_env)ed@imac:~/projects/demo$ ll my_env/lib/python3.6/site-packages/
total 8
drwxr-xr-x 3 ed staff 96B Jan 24 18:14 pycache
-rw-r--r-- 1 ed staff 126B Jan 24 18:14 easy_install.py
drwxr-xr-x 23 ed staff 736B Jan 24 18:14 pip
drwxr-xr-x 10 ed staff 320B Jan 24 18:14 pip-9.0.1.dist-info
drwxr-xr-x 6 ed staff 192B Jan 24 18:14 pkg_resources
drwxr-xr-x 34 ed staff 1.1K Jan 24 18:14 setuptools
drwxr-xr-x 12 ed staff 384B Jan 24 18:14 setuptools-28.8.0.dist-info

One of my favorite development tools for Python is the pudb debugger. To show that we can install a package, let’s try importing it first (and failing):

(my_env)ed@family-imac:~/projects/demo$ python
Python 3.6.4 (v3.6.4:d48ecebad5, Dec 18 2017, 21:07:28)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pudb
Traceback (most recent call last):
File "", line 1, in
ModuleNotFoundError: No module named 'pudb'
>>> ^D
(my_env)ed@family-imac:~/projects/demo$

Now let’s install it using pip:

(my_env)ed@family-imac:~/projects/demo$ pip install pudb
Collecting pudb
Collecting urwid>=1.1.1 (from pudb)
Collecting pygments>=1.0 (from pudb)
Using cached https://files.pythonhosted.org/packages/13/e5/6d710c9cf96c31ac82657bcfb441df328b22df8564d58d0c4cd62612674c/Pygments-2.3.1-py2.py3-none-any.whl
Installing collected packages: urwid, pygments, pudb
Successfully installed pudb-2018.1 pygments-2.3.1 urwid-2.0.1
(my_env)ed@family-imac:~/projects/demo$

Let’s look at the site-packages directory after installing pudb:

(my_env)ed@family-imac:~/projects/demo$ ll my_env/lib/python3.6/site-packages/
total 8
drwxr-xr-x 10 ed staff 320B Jan 24 18:38 Pygments-2.3.1.dist-info
drwxr-xr-x 3 ed staff 96B Jan 24 18:14 pycache
-rw-r--r-- 1 ed staff 126B Jan 24 18:14 easy_install.py
drwxr-xr-x 7 ed staff 224B Jan 24 18:35 pip
drwxr-xr-x 9 ed staff 288B Jan 24 18:35 pip-19.0.1.dist-info
drwxr-xr-x 6 ed staff 192B Jan 24 18:14 pkg_resources
drwxr-xr-x 18 ed staff 576B Jan 24 18:38 pudb
drwxr-xr-x 9 ed staff 288B Jan 24 18:38 pudb-2018.1.dist-info
drwxr-xr-x 22 ed staff 704B Jan 24 18:38 pygments
drwxr-xr-x 34 ed staff 1.1K Jan 24 18:14 setuptools
drwxr-xr-x 12 ed staff 384B Jan 24 18:14 setuptools-28.8.0.dist-info
drwxr-xr-x 33 ed staff 1.0K Jan 24 18:38 urwid
drwxr-xr-x 7 ed staff 224B Jan 24 18:38 urwid-2.0.1.dist-info
(my_env)ed@family-imac:~/projects/demo$ python

Note that there are now entries for both pudb and its dependency, pygments. And to verify that it has been successfully installed:

(my_env)ed@family-imac:~/projects/demo$ python
Python 3.6.4 (v3.6.4:d48ecebad5, Dec 18 2017, 21:07:28)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pudb
>>> pudb.version
'2018.1'
>>> ^D
(my_env)ed@family-imac:~/projects/demo$

There’s a ton more to using virtual environments, but that should give you a start.