Work on the OpenStack Placement service has grown steadily more complicated over the last few months, as the service has matured to handle more sophisticated resources. At first, things were pretty basic: you had a ResourceProvider, and it had Inventory of various ResourceClasses. You told Placement how much you needed of one or more ResourceClasses, and it returned the providers that had enough to meet your request. Simple.
This was the world where the ResourceProviders were simple computers, and the resources were that computer’s RAM, disk, and VCPUs. You wanted a VM with so much RAM, disk, and VCPU, and Placement returned those compute nodes that could provide that. This was all implemented internally using common relational database techniques, and it was simple and fast.
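To make that original model concrete, here is a minimal Python sketch of it. This is not Placement's actual code: the `ResourceProvider` dataclass, the `available()` helper, and `find_providers()` are hypothetical names invented for illustration, and the sketch leaves out details such as reserved amounts and allocation ratios.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceProvider:
    """Hypothetical stand-in for a provider with flat inventory."""
    name: str
    inventory: dict = field(default_factory=dict)  # resource class -> total
    used: dict = field(default_factory=dict)       # resource class -> consumed

    def available(self, resource_class):
        total = self.inventory.get(resource_class, 0)
        return total - self.used.get(resource_class, 0)


def find_providers(providers, request):
    """Return the providers that can satisfy every amount in the request."""
    return [
        p for p in providers
        if all(p.available(rc) >= amount for rc, amount in request.items())
    ]


compute_nodes = [
    # cn1 has already handed out 6 of its 8 VCPUs.
    ResourceProvider("cn1", {"VCPU": 8, "MEMORY_MB": 16384, "DISK_GB": 500},
                     used={"VCPU": 6}),
    ResourceProvider("cn2", {"VCPU": 16, "MEMORY_MB": 32768, "DISK_GB": 1000}),
]

request = {"VCPU": 4, "MEMORY_MB": 8192, "DISK_GB": 100}
matches = [p.name for p in find_providers(compute_nodes, request)]
print(matches)
```

Every provider is independent of every other one, so a single pass with a per-provider capacity check is all that is needed. That independence is what made the relational implementation straightforward.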
Now we are starting to model more complex resources, and the complexity is growing rapidly. We have sharing providers, with the canonical example being a shared disk whose storage can be consumed by a number of compute nodes, so there needs to be a way to model that. There are also nested providers, such as a network virtual function (VF) that is provided by a NIC that is part of the computer. This means that when you ask Placement for an instance that has a VF, it has to know the relationship between the compute node (root provider), the NIC (resource provider), and the VF (resource), and return the whole structure. These nested relationships aren’t limited to just one level; when you add in things such as NUMA, there can be many such nesting levels.
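A rough Python sketch shows how much the shape of the problem changes once providers form trees and can borrow from sharing providers. Everything here is an assumption made for illustration: the `Provider` class, the `shared_with` list, and `match_tree()` are hypothetical, usage tracking is omitted, and each resource class is served by the first provider in the tree that has enough of it. Real Placement is considerably more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Hypothetical provider that may have children and sharing providers."""
    name: str
    inventory: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    shared_with: list = field(default_factory=list)  # sharing providers usable by this root


def walk(provider):
    """Yield the provider and all of its descendants, depth first."""
    yield provider
    for child in provider.children:
        yield from walk(child)


def match_tree(root, request):
    """Map each requested resource class to a provider name, or return None."""
    candidates = list(walk(root)) + root.shared_with
    allocation = {}
    for rc, amount in request.items():
        provider = next(
            (p for p in candidates if p.inventory.get(rc, 0) >= amount), None)
        if provider is None:
            return None  # the tree as a whole cannot satisfy the request
        allocation[rc] = provider.name
    return allocation


shared_disk = Provider("shared_storage", {"DISK_GB": 2000})
nic1 = Provider("nic1", {"SRIOV_NET_VF": 8})
numa0 = Provider("numa0", {"VCPU": 8, "MEMORY_MB": 16384}, children=[nic1])
cn1 = Provider("cn1", children=[numa0], shared_with=[shared_disk])
cn2 = Provider("cn2", {"VCPU": 8, "MEMORY_MB": 16384, "DISK_GB": 500})

request = {"VCPU": 2, "MEMORY_MB": 4096, "DISK_GB": 50, "SRIOV_NET_VF": 1}
result_cn1 = match_tree(cn1, request)
result_cn2 = match_tree(cn2, request)
print(result_cn1)
print(result_cn2)
```

In this example the root `cn1` has no inventory of its own; the answer comes entirely from its descendants and its sharing provider, and the result has to say which provider supplies which resource. In memory that is a short recursive walk. Expressing the same traversal in SQL over flat tables is where the difficulty comes from.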
The modeling complexity has now grown significantly. Implementing it with a relational database is certainly still possible, but it requires solutions that are harder and harder to comprehend for anyone looking at the code for the first time. I worked as a SQL DBA for several years, so I’m no stranger to SQL, but I need to read the code several times, usually with a pen and paper to diagram things, before I truly understand what each bit is doing. If you’d like to see what I mean, peruse the _get_trees_matching_all() method of nova/api/openstack/placement/objects/resource_provider.py. It’s truly amazing that that code works as well as it does, but it drives home the point that an overly complex solution indicates a poor fit between your model and the thing that you’re trying to model.
No database will ever get rid of the complexity, of course, but some are much better suited to handling relationships of these types than traditional relational databases. At the last PTG, the discussions of these modeling problems, along with the drawings that were made to illustrate the relationships, made me think about another type of database: graph databases. So I started playing around with the most popular one, Neo4j. Graph databases use a relationship-first approach to storing and accessing data, and this seemed to model the needs of Placement better than an RDBMS.
I do realize that it is too late for OpenStack Placement to change something as fundamental as its data storage model. But after playing around with Neo4j for a little while, it’s obvious to me that it fits the problem domain better than MySQL does. I’ll demonstrate this in the next post in this series. For now, the current state of Placement reminds me of the old saying “when all you have is a hammer…”.