OpenStack Focus

There has been a bit of concern expressed lately that OpenStack is somehow losing its focus, and is in danger of losing its momentum, because of the Big Tent and the resulting growth in the number of projects that call themselves OpenStack. This growth has even been blamed for the recent layoffs of OpenStack developers. Some have called on the OpenStack leadership to dismantle the Big Tent approach and focus on only a few core projects.

While it is true that the plethora of projects has diverted attention from the work needed in the heart of OpenStack (and I won’t go into how to draw the line separating the two here), I feel that the criticism is misplaced. It isn’t up to the governing bodies of OpenStack to enforce such a refocusing; rather, it is up to the contributors to make such decisions. That’s just the way that open source development works. It is silly to think that companies like HPE would take their marching orders from the OpenStack Foundation Board, or the OpenStack TC. The idea of the Big Tent, that all projects that are “one of us” shall have access to the same resources, is fine as it is.

The mistake that I believe many companies made is that they focused on beefing up numbers that are irrelevant, such as lines of code or the number of cores or PTLs they employed, as a way of demonstrating their commitment to OpenStack. Their sales teams would then use those numbers as a selling point for their OpenStack-based offerings.

Open Source is a difficult sell for most companies; they certainly understand the benefit when they use it, but have a much harder time justifying the cost of paying their employees to work on something that is used by everyone, even their competitors. So they came up with ways of selling their particular spin on OpenStack, and used these contribution numbers to impress customers. When that failed to generate the type of revenue that was expected, out came the axe.

I believe that many of these companies encouraged the development of these small peripheral projects because it would be easier for one of their employees to achieve core status, and possibly get elected PTL, achievements that their marketing departments would then use in an attempt to prove the company's OpenStack-ness.

I don’t agree that there is anything that OpenStack itself needs to do. Rather, the companies who are contributing to OpenStack need to better understand the nature of open source development, and focus on those areas that will make OpenStack as a whole richer and more reliable, instead of gaming the system to make themselves look important. So please stop saying that this is the fault of the Big Tent.

Pair Development

If you've worked on large open source projects, you know that one of the difficulties is dividing the workload. The goal, of course, is to spread it out so that every developer has enough work to keep them busy, and everyone is working in sync towards a common goal. This isn't easy in practice, as there is no top-down authority to hand out assignments and keep everyone on track, as there is in a corporate development environment. It requires a good deal of communication among the members of the team, as well as a good deal of trust.

This problem was brought to light recently in the Nova community. The issue was with the subteam working on the scheduler/placement engine, of which I’m a member. During the Newton development cycle, there was a significant bottleneck due to the fact that one person, Chris Dent, was responsible for a large chunk of work in designing and coding the Placement API and underlying engine, while the rest of us could only help by doing reviews after the code was written. And this isn’t a new thing: during Mitaka, it was Jay Pipes who was the bottleneck with the development of the Resource Providers concept, and in Liberty, it was Sylvain Bauza with the huge amount of work he did to integrate the Request Spec into Nova. Don’t get me wrong: I’m not criticizing any of these people, as they all did great work. Rather, I am expressing frustration that they bore the brunt of the load, when it didn’t have to be that way. I think that it is time to try a different approach in Ocata.

I propose that we use Pair Development. No, not Pair Programming – that's an entirely different thing. Pair Development is when each "chunk" of work is not undertaken by a single developer, but rather by two. They discuss the path they want to take ahead of time, and instead of splitting the work, they both work on the same patches at the same time. Wait, you say – won't this slow things down? I don't believe that it will, for several reasons. First, when discussing a design, having multiple sets of eyes will reduce the number of dead ends, in the same way that bugs are reduced in pair programming by having both developers review the code as it is being written. Second, when a reviewer finds an issue with a patch, either developer can make the fix. This is an even greater benefit if the two developers are in different, but overlapping, time zones.

We also have evidence from the week before the most recent Feature Freeze: the placement stuff needed to get in before FF, and so a whole group of us pulled together to make that happen. Having a diverse set of eyes uncovered several edge cases and inconsistencies in the code, and those were resolved pretty quickly. We used IRC mostly, but had a Google Hangout at least once a day to discuss any outstanding, unresolved matters, so that we would all be on the same page. So yeah, the time pressure helped instill a bit of urgency in us all, but I think that it was having all of us own the code, not just Chris, that made things happen as well as they did. I know that I was familiar with the code, having reviewed much of it before, but now that I had to change it and test it myself, my understanding grew much deeper. It's amazing how much deeper you understand something when you touch it instead of just looking at it.

Another benefit of pair development is that it provides much more continuity when one of the developers takes some time off. Instead of the progress getting put on hold, the other member of the development pair can continue along. It will also help to have more than one person know the new code intimately, so that when unexpected behavior surfaces, we aren't depending on a single person to figure out what's going on.

So for Ocata, let’s figure out the tasks, and make sure that each has two people assigned to it. I will wager that come the end of the cycle, it will help us accomplish much more than we have in previous releases.

Is Swift OpenStack?

There has been some discussion recently on the OpenStack Technical Committee about adding Golang as a “supported” language within OpenStack. This arose because the Swift project had recently run into some serious performance issues, which they solved by re-writing the bottleneck process in Golang with much success. I’m not writing here to debate the merits of making OpenStack more polyglot (it’s no secret that I oppose that), but instead, I want to address the issue of Swift not behaving like the rest of OpenStack.

Doug Hellmann summarized this feeling well, originally writing it in a pastebin and then copying it into a review comment on the TC proposal. Essentially, it says that while Swift makes some efforts to do things the "OpenStack Way", it doesn't hesitate to follow its own preferences when it chooses to.

I believe that there is good reason for this, and I think that people either don’t know or forget a lot of the history of OpenStack when they discuss Swift. Here’s some background to clarify:

Back in the late ’00s, Rackspace had a budding public cloud business (note: I worked for Rackspace from 2008-2014). It had bought Slicehost, a company with a closed-source VPS system that it used as the basis for its Cloud Servers product, and had developed a proprietary object storage system called NAST (Not Another S Three: S3, get it?). They began hitting limits with NAST fairly soon – it was simply too slow. So it was decided to write a new system with scalability in mind that would perform orders of magnitude better than NAST; this was named ‘Swift’ (for obvious reasons). Swift was developed in-house as a proprietary software project. The development team was a small, close-knit group of guys who had known each other for years. I joined the Swift development team briefly in 2009, but as I was the only team member working remotely, I was at a significant disadvantage, and found it really difficult to contribute much. When I learned that Rackspace was forming a distributed team to rewrite the Cloud Servers software, which was also beginning to hit scalability limits, I switched to that team. For a while we focused on keeping the Slicehost code running while starting to discuss the architecture of the new system. Meanwhile the Swift team continued to make strong progress, releasing Swift into production in the spring of 2010, several months before OpenStack was announced.

At roughly the same time, the other main part of OpenStack, Nova, was being started by some developers working for NASA. It worked, but it was, shall we say, a little rough in spots, and lacked some very important features. But since Nova had a lot of the things that Rackspace was looking for, we started talking with NASA about working together, which led to the creation of OpenStack. So while Rackspace was a major contributor to Nova development back then, from the beginning we had to work with people from a wide variety of companies, and it was this interaction that formed the basis of the open development process that is now the hallmark of OpenStack. Most of the projects in OpenStack today grew out of Nova (Glance, Neutron, Cinder), or are built on top of Nova (Trove, Heat, Watcher). So when we talk about the "OpenStack Way", it is more accurately thought of as the "Nova way", since Nova was only one half of the original OpenStack. These two original halves of OpenStack were built very differently, and that is reflected in their different cultures. So I don't find it surprising that Swift behaves very differently. And while many more people work on it now than just the original team from Rackspace, many of that original team are still developing Swift today.

I do find it somewhat strange that Swift is being criticized for having "resisted following so many other existing community policies related to consistency". Swift is, and always has been, distinct from Nova and from the community that sprang up around it. It feels really odd to ignore that history, and sweep Swift's contributions away, or disparage their team's intentions, because they work differently. So while I oppose the addition of languages other than Python for non-web and non-shell programming, I also feel that we should let Swift be Swift and let them continue to be a distinct part of OpenStack. Requiring Swift to behave like Nova and its offspring is as odd a thought as requiring Nova et al. to run their projects like Swift.

OpenStack Ideas

I’ve written several blog posts about my ideas for improving OpenStack, with a particular emphasis on the Nova Scheduler. This week at the OpenStack Summit in Austin, there were two other proposals put forth. So at least I’m not the only one thinking about this stuff!

At the Tuesday keynote, Intel demonstrated a version of OpenStack that was completely re-written in Go. They showed the creation of 10,000 containers and 5,000 VMs in under a minute. Pretty impressive, right? Well, yeah, except that they gave no indication of which parts of Nova were supported, and which were left out. How were all those VMs scheduled? What sort of logging was done to help operators diagnose their sites? None of this was shown or even discussed. It didn't seem to be a serious proposal for moving OpenStack forward; instead, it seemed to be a demo with a lot of sizzle, designed simply to wake up a dormant community and make people think that Intel has the keys to our future. But for me, the question was always the same one I deal with when I'm thinking about these matters: how do you get from the current OpenStack to what they were showing? Something tells me that rather than being a path forward, this represents a brand-new project, with no way for existing deployments to migrate without starting all over. So yeah, kudos on the demo, but I didn't see anything directly useful in it. Of course Go would be faster for concurrent tasks; that's what the language was designed for!

The other project was presented by a team of researchers from Inria in France who are aiming to build a massively-distributed cloud with OpenStack. Instead of starting from scratch as Intel did, they created a driver for oslo.db that mimicked SQLAlchemy, and used Redis as the datastore. It's ironic, since the first iteration of Nova used Redis, and it was felt back then that Redis wasn't up to the task, so it was replaced by MySQL. (Side note: some of my first commits were for removing Redis from Nova!) And being researchers, they meticulously measured the performance: when sites were distributed, over 80% of the queries performed better than with MySQL. This is an interesting project that I intend to follow in the future, as it actually has a chance of becoming part of OpenStack, unlike the Intel project.
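
To give a flavor of the approach, here is a minimal sketch, using the redis-py client, of what storing and reading instance records in Redis rather than in relational tables might look like. This is purely illustrative; the key layout and field names are my own assumptions, not the Inria team's actual oslo.db driver.

    # Illustrative only: a toy key/value layout for instance records in Redis.
    # The key scheme and fields are assumptions, not the actual Inria driver.
    import redis

    r = redis.Redis(host="localhost", port=6379, decode_responses=True)

    def save_instance(instance):
        """Store an instance record as a Redis hash, keyed by UUID."""
        key = "instance:%s" % instance["uuid"]
        r.hset(key, mapping=instance)
        # Maintain a simple secondary index so we can list instances per host.
        r.sadd("instances_by_host:%s" % instance["host"], instance["uuid"])

    def instances_on_host(host):
        """Return all instance records for a given host."""
        uuids = r.smembers("instances_by_host:%s" % host)
        return [r.hgetall("instance:%s" % uuid) for uuid in uuids]

    save_instance({"uuid": "c3d81f4e", "host": "compute-01",
                   "vcpus": 4, "ram_mb": 8192})
    print(instances_on_host("compute-01"))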

I still hold out hope that one day we can free ourselves of the constraints of having to fit all resources that OpenStack will ever have to deal with into a static SQL model, but until then, I’m happy with whatever incremental improvements we can make. It was obvious from this Summit that there are a lot of very smart people thinking about these issues, too, and that fills me with hope for the long-term health of OpenStack.

Distributed Data and Nova

Last year I wrote about the issues I saw with the design of the Nova Scheduler, and put forth a few proposals that I felt would address those issues. I’m not going to rehash them in depth here, but summarize instead:

  • The choice of having the state of compute nodes copied back to the scheduler over RPC was the source of the raciness observed when more than one scheduler was running. It would be better to have a database be the single source of truth.
  • The scheduler was created specifically for selecting hosts based on basic characteristics of VMs: RAM, disk, and VCPU. The growth of virtualization, though, has meant that we now need to select based on myriad other qualities of a host, and those don’t fit into the original ‘flavor’-based design. We could address that by creating Resource classes that encapsulated the knowledge of a resource’s characteristics, and which also “knew” how to both write the state of that resource to the database, and generate the query for selecting that resource from the database. (A rough sketch of what such a class might look like follows this list.)
  • Nova spends an awful lot of effort trying to move state around, and to be honest, it doesn’t do it all that well. Instead of trying to re-invent a distributed data store, it should use something that is designed to do it, and which does it better than anything we could come up with.
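
To make the second bullet more concrete, here is a rough sketch of the sort of Resource class it describes. The class and method names are hypothetical, invented for this illustration; they are not the classes that eventually landed in Nova.

    # Hypothetical sketch of a resource class that knows how to persist its
    # own state and how to build the query used to select hosts that can
    # supply it.  Names and signatures are invented for illustration; 'db' is
    # assumed to be some database helper with an upsert() method.

    class Resource(object):
        """Base class: each subclass encapsulates one kind of resource."""

        resource_type = None

        def __init__(self, host, total, used=0):
            self.host = host
            self.total = total
            self.used = used

        def write_state(self, db):
            """Persist the current state of this resource for its host."""
            db.upsert(
                table="resources",
                key=(self.host, self.resource_type),
                values={"total": self.total, "used": self.used},
            )

        @classmethod
        def selection_query(cls, amount):
            """Return a query selecting hosts with enough free capacity."""
            return ("SELECT host FROM resources "
                    "WHERE resource_type = %s AND total - used >= %s",
                    (cls.resource_type, amount))

    class RamResource(Resource):
        resource_type = "MEMORY_MB"

    class DiskResource(Resource):
        resource_type = "DISK_GB"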

But I’m pleased to report that some progress has been made, although not exactly in the manner that I believe will solve the issues long-term. True, there are now Resource classes that encapsulate the differences between different resources, but because the solution assumed that an SQL database was the only option, the classes reflect an inflexible structure that SQL demands. The process of squeezing all these different types of things into a rigid structure was brilliantly done, though, so it will most likely do just what is needed. But there is a glaring hole: the lack of a distributed data system. Until that issue is addressed, Nova developers will spend an inordinate amount of time trying to create one, and working around the limitations of an incomplete solution to this problem. Reading Chris Dent’s blog post on generic resource pools made this problem painfully apparent to me: instead of a single, distributed data store, we are now making several separate databases: one in the API layer for data that applies across the cells, and a separate cell database for data that is just in that cell. And given that design choice, Chris is thinking about having a scheduler whose design mirrors that choice. This is simply adding complexity to deal with the complexity that has been added at another layer. Tracking the state of the cloud will now require knowing what bit of data is in which database, and I can guarantee you that as we move forward, this separation will be constantly changing as we run into situations where the piece of data we need is in the wrong place.
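
To illustrate the kind of bookkeeping I mean, here is a hypothetical sketch of what fetching a single record can look like once it may live in either the API database or one of the cell databases. The database handles and method names are invented placeholders, not real Nova APIs; the point is only that every caller has to know, or discover, where the data lives.

    # Hypothetical illustration of the split-database problem: the caller has
    # to know (or search for) which database holds the record it needs.
    # api_db and cell_dbs are assumed to be handles to the API database and
    # the per-cell databases, each exposing a get_inventory() method.

    def get_node_inventory(node_uuid, api_db, cell_dbs):
        # Shared resource-provider data lives "up" in the API database...
        inventory = api_db.get_inventory(node_uuid)
        if inventory is not None:
            return inventory

        # ...but cell-local data lives in whichever cell owns the node, so we
        # have to walk the cells until we find it.
        for cell_db in cell_dbs:
            inventory = cell_db.get_inventory(node_uuid)
            if inventory is not None:
                return inventory

        raise LookupError("No inventory for node %s in any database"
                          % node_uuid)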

When I wrote last year, in the blog posts and subsequent mailing list discussions, I think the fatal mistake that I made was offering a solution instead of just outlining the problem. If I had limited it to “we need a distributed data store”, instead of “we need a distributed data store like Apache Cassandra”, I think much of the negative reaction could have been avoided. There are several such products out there, and who knows? Maybe one of them would be a much better solution than Cassandra. I only knew that I had gotten a proof-of-concept working with Cassandra, so I wanted to let everyone know that it was indeed possible. I was hoping that others would then present their preferred solution, and we could run a series of tests to evaluate them. And while several people did start discussing their ideas, the majority of the community heard ‘Cassandra’, which made them think ‘Java’, which soured the entire proposal in their minds.

So forget about Cassandra. It’s not the important thing. But please consider some distributed database for Nova instead of the current design. What does that design buy us, anyway? Failure isolation? So that if a cell goes down or is cut off from the internet, the rest can still continue? That’s exactly what distributed databases are designed to handle. Scalability? I doubt you could get much more scalable than Cassandra, which is used to run, among other things, Netflix and the Apple App Store. I’m sure that other distributed DBs scale as well or better than MySQL. And with a distributed DB, you can then drop the notion of a separate API database and separate cell databases that all have to coordinate with each other to get the information they need, and you can avoid the endless discussions about, say, whether the RequestSpec (the data representing a request to build a VM) belongs in the API layer (since it was received there) or in the cell DB (since that’s where the instance associated with it lives). The data is in the database. Write to it. Query it. Stop making things more complicated than they need to be.
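
Just to show what I mean by “write to it, query it”, here is a minimal sketch, using the DataStax cassandra-driver, of writing and reading a request spec against a single distributed store. The cluster addresses, keyspace, and table are made up for this example; it is an illustration of how simple the data access becomes, not a concrete proposal.

    # A sketch of "write to it, query it" against a single distributed store,
    # using the DataStax cassandra-driver.  The addresses, keyspace, and table
    # are assumptions invented for this illustration (it presumes a 'nova'
    # keyspace already exists).
    import json
    import uuid

    from cassandra.cluster import Cluster

    cluster = Cluster(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
    session = cluster.connect("nova")

    # One table, reachable from the API layer and from every cell alike.
    session.execute("""
        CREATE TABLE IF NOT EXISTS request_specs (
            instance_uuid uuid PRIMARY KEY,
            spec text
        )
    """)

    instance_uuid = uuid.uuid4()
    spec_json = json.dumps({"flavor": "m1.small", "num_instances": 1})

    # Write to it...
    session.execute(
        "INSERT INTO request_specs (instance_uuid, spec) VALUES (%s, %s)",
        (instance_uuid, spec_json))

    # ...and query it, with no need to know which "layer" owns the data.
    row = session.execute(
        "SELECT spec FROM request_specs WHERE instance_uuid = %s",
        (instance_uuid,)).one()
    print(row.spec)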