OpenStack Nova Mid-cycle Meetup, Day 2

The second day of the mid-cycle meetup was very different from the first (for a summary of that, please see yesterday's post). While there was a set agenda that the group as a whole went through on Day 1, today was more or less broken out into ad-hoc groups, each working on a particular issue; many of these were groups of 1. So this post will be a lot shorter than yesterday's, since I don't know just what went on in each of those groups. Many of the groups were focused on patches that were very close to being ready and that a lot of other work depended on, with the goal of giving them the final push they needed to get merged. I listened in on many of these discussions, mostly to learn more about that particular part of the codebase, since I didn't have enough familiarity to help with the coding side of things. I also spent a lot of time reviewing the changes that were being pushed, which is also an excellent way to learn, as you can not only see the code but also read the insights of the other reviewers about the changes.

In the afternoon we had several of the nova-spec cores review my spec on changing how the scheduler gets instance information. I know that some people dread having their work examined and criticized, but I happen to love it. The discussions uncovered several things that needed to be accounted for that had never come up in all the prior back-and-forth on the spec, so I spent a lot of the rest of the afternoon incorporating their suggestions into a revised version, and pushed that up before the day was done. It also shows how these in-person meetings can get so much more accomplished than our typical remote tools such as email and IRC, and why the summits and mid-cycles are critical to attend.

OpenStack Nova Mid-cycle Meetup, Day 1

I'm here in Palo Alto, California, for the mid-cycle meetup of the OpenStack Nova team. For those of you unfamiliar with the concept, the OpenStack community worldwide gets together every 6 months at a Summit to collectively celebrate what we've accomplished, and to plan what we'll be working on for the next 6 months. During the months that follow, though, it's easy for things to slide off to the side, or for other things to creep up and get in the way of continued progress. So many of the programs that make up OpenStack plan on getting together about halfway through the process so that we all get an idea of the progress we've made, and can discuss and potentially solve any of the issues that would prevent us from completing the work we set out to do for this cycle.

For the Nova team, we set out several things as the priorities that we would be focusing on: the next generation of the Cells design (cells v2); the continued development of Nova Objects; cleaning up the interface between the Scheduler and Nova so that the scheduler may eventually be split out; the v2.1 API (microversions); functional testing; nova-network migration; no-downtime upgrades; as well as working down the number of bugs we have, and improving our testing infrastructure. The meeting today started with the people heading up each of those tasks giving an update on their progress.

First up was Cells v2. It's moving along well, but not as fast as they would like. One of the big things was getting the CI testing working with cells, which currently causes most tests to fail. Progress has been made on disabling these tests for now, with the goal of fixing them so that our CI runs its tests with cells on, which will be the standard once this work is complete. Cells are now a configurable option, and the tests currently run with it off. By turning it back on, and adding the fixed tests in, we can eventually be confident that any new feature in Nova will work right away in a deployment using cells.

There has been good progress with the Objects work, but the biggest problem is that the first item to be objectified, Flavors, is a hairy mess, and required a bunch of changes to undo all the hacks that made flavors work in the past. Once completed it will bring a lot more sanity to flavors (which is a concept I believe should die in a fire, but I fought it years ago and lost, so we're stuck with it now).

On the Scheduler front, we only had one outstanding spec (mine, of course!), and lots of code up for review. The series of patches to detach Service from Compute Node is the top priority, as so many of the later patches depend on these changes.

None of the principal movers on the v2.1 API was able to make the mid-cycle, but they did fill in some of their progress information on our shared etherpad. The testing integration is nearly done, but one possible problem is support for v2.1 in novaclient.

Functional testing is aiming to get a dozen or so test patterns defined that others can use as the basis for writing future functional tests. There probably won't be much more than that in the Kilo timeframe, but the hope is that going forward these can help make functional testing more pervasive.

There is a bunch of work being done for the nova-network to neutron migration, but one thing that everyone working on this wanted to make clear is that while they will be creating some tools to help deployers who want to make the switch, there will not be a "click it and forget it" single-button migration in the near future. One other issue brought up is that while we are telling everyone who is deploying OpenStack to use Neutron and not nova-network, devstack still uses nova-network. This is poor dogfooding, so it was agreed that we will start to move devstack to use Neutron.

The zero-downtime migrations discussion was interesting: the idea is that instead of running the current SQLAlchemy migrations, which require taking the database offline, the new expand/contract approach will compare the defined structures in code with the current database, and if there is a discrepancy, create the new structures (expand), migrate the data over, and then later remove the old, unneeded structures (contract). The first code patches to accomplish this have been working, although a lot of work remains to update the tests accordingly.
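To make the expand/contract idea a bit more concrete, here is a minimal sketch of how such a comparison might work. This is only my own illustration, not the actual Nova patches: the DESIRED schema, the table and column names, and the helper functions are all hypothetical, and it uses Python's built-in sqlite3 module rather than SQLAlchemy to keep the example self-contained.

```python
import sqlite3

# Hypothetical target schema, standing in for the structures defined in code.
DESIRED = {
    "instances": {"id": "INTEGER", "hostname": "TEXT", "display_name": "TEXT"},
}


def current_columns(conn, table):
    """Return {column_name: declared_type} for a table in the live database."""
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    return {row[1]: row[2] for row in rows}


def plan(conn, desired):
    """Compare the desired schema with the database.

    Returns two lists of DDL statements: 'expand' adds the missing
    structures, 'contract' removes the ones that are no longer defined.
    """
    expand, contract = [], []
    for table, cols in desired.items():
        existing = current_columns(conn, table)
        for name, col_type in cols.items():
            if name not in existing:
                expand.append(f"ALTER TABLE {table} ADD COLUMN {name} {col_type}")
        for name in existing:
            if name not in cols:
                contract.append(f"ALTER TABLE {table} DROP COLUMN {name}")
    return expand, contract


def demo():
    conn = sqlite3.connect(":memory:")
    # The "old" database still has a column called name.
    conn.execute("CREATE TABLE instances (id INTEGER, hostname TEXT, name TEXT)")
    conn.execute("INSERT INTO instances VALUES (1, 'host-a', 'web')")

    expand, contract = plan(conn, DESIRED)

    # Expand: add the new structures while the old ones are still in use.
    for stmt in expand:
        conn.execute(stmt)

    # Migrate: copy the data from the old column to the new one.
    conn.execute("UPDATE instances SET display_name = name")

    # Contract is only planned here; it would be run later, once nothing
    # reads the old column any more.
    migrated = conn.execute("SELECT display_name FROM instances").fetchall()
    return expand, contract, migrated
```

The point of splitting the work this way is that the expand and migrate steps only add things, so the running services never see a structure disappear out from under them; the destructive contract step can wait until everything has been upgraded.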

That was just the morning! The afternoon started with a whiteboard discussion I had asked for where we could identify just what we expect the interface between Nova and the (separated) Scheduler to look like. We did get into a little bit of implementation details at times, but overall we clarified the flow of messages between the two, and defined where the responsibility for ensuring that each build request succeeds should go. A lot of the discussion focused on how we can make the overall process bulletproof, which some saw as a tangent, but I think that this is what is needed: figure out what a solid, robust scheduling solution should look like, and though we aren't going to get there in this cycle, or even the next, we can make sure that we're moving towards that design.

The remainder of the day was largely focused on discussing process: how the Nova project is run. Was enough information communicated about what the priorities were? Were the various channels of communication being used well? How can we help the few Nova core reviewers handle the huge number of reviews more effectively? Everyone seemed to have their own preference (e.g., email vs. IRC), but no one had any concrete suggestions about what needs to change. It was pointed out that while the loads are high, they haven't been getting worse, so there is some measure of stability.

I'm looking forward to Day 2, where we plan on breaking into smaller groups to focus on pushing through as many of the critical patches as we can while we're all in the same room. We'll see how that goes!

The OpenStack Big Tent and Magnum

One of the most heavily-attended design summit events at last week's OpenStack Summit in Paris was on Magnum, a proposed service for containers that would integrate into the Nova compute service. It seems that any session at any conference these days that involves Docker attracts a lot of interest, as Docker is an amazing new way of approaching how we think about virtualization and achieving efficiencies of scale.

Disclaimer: I know Adrian Otto, the leader of the Magnum project, from my days at Rackspace, and genuinely like him. I have no doubt that he would be able to put together a team that can accomplish all that he is setting out to do with this project. My thoughts and concerns about Magnum would be the same no matter who was leading the project.

The goal of the Magnum session was to present its concept and proposed architecture to the Nova team, with the hope of being designated as the official Docker project in OpenStack. However, there was a lot of pushback from many members of the Nova team. Some of it had to do with procedural issues; I learned later that Magnum had been introduced at the Nova mid-cycle meetup, and the expectations set then had not been met. I wasn't at that meetup, so I can't personally attest to that. But the overall sentiment was that it was just too premature to settle on one specific approach to something as important and fast-moving as Docker. While I support the idea of Magnum and hope that it is a wild success, I also think that the world of Docker/containers is moving so fast that what looks good today may look totally different 6 months from now. Moving such a project into OpenStack proper would only slow it down, and right now it needs to remain as nimble as possible.

I wrote a little while ago about my thoughts on the current discussions on the Big Tent vs. Layers vs. Small Core (Simplifying OpenStack), and I think that the Magnum effort is an excellent example of why we need to modify the approach to how we handle projects like this that add to OpenStack. The danger of the current Big Tent system of designating a single effort as the official OpenStack solution to a given problem is that by doing so we might be discouraging some group with a different and potentially better solution from pursuing development, and that would short-change the OpenStack ecosystem long-term. Besides, a little competition usually improves overall software quality, right?

OpenStack Paris Summit – Growing Up

I've just returned from the 5-day-long OpenStack Summit, and after a very long day of travel, my brain is still slightly crispy, but I wanted to record some impressions of the summit. Since it was held in Paris, there are a lot of non-technical experiences I may write about, but for now I'll limit my thoughts to those concerning OpenStack.

For those who don't know my history, I was one of the original OpenStack developers who began the project in 2010, and participated in all of the early summits. After two years I changed roles in my job, which meant that I was no longer actively contributing to OpenStack, so I no longer was able to attend the summits. But now that I'm back as a full-time contributor in my role at IBM, I eagerly anticipated re-acquainting myself with the community, which had evolved since I had last been an active member.

First, let me say how impressive it is to see this small project we started grow into the truly international phenomenon it has become. The sheer number of people and exhibitors who came to Paris to be involved in the world of OpenStack was amazing: the latest count I saw was over 4,600 attendees, which contrasts with around 70 at the initial summit in Austin.

Second, during my hiatus from active development on OpenStack many of the active core contributors to Nova have moved on, and a whole new group has taken their place. In the months leading up to the summit I got to know many of them via IRC and the dev email list, but had never met them in person. One thing about OpenStack development that has always been true is that it's very personal: you get to know the people involved, and have a good sense of what they know and how they work. It is this personal familiarity that forms the basis of how the core developers are selected: trust. There is no test or anything like that; once you've demonstrated that you contribute good code, that you understand the way the various parts fit together, that you can take constructive criticism of your code, and that you can offer constructive criticism on others' code, eventually one of the existing core members nominates you to become core. The other cores affirm that choice if they agree. Rarely have I seen anyone nominated for core who was rejected by the group; instead, the reaction usually is along the lines of "oh, I thought they were core already!". As one of my goals in the coming year is to once again become a core Nova developer, getting to meet much of the current core team was a great step in that direction.

And lastly, while the discussions about priorities for the Kilo cycle were lively, there were almost none of the polarizing disagreements that were part of Nova's early days. I believe that Nova has reached a maturity level where everyone involved can see where the weak points are, and agree on the need to fix them, even if opinions on just how to do that differ. A great example was the discussion on what to do about Cells: do we fix the current approach, or do we shift to a different, simpler approach that will get us most, but not all, of what the current code can do, but with a cleaner, more maintainable design? After a few minutes of discussion the latter path was chosen, and we moved on to discussing how to start making that change. While I miss the fireworks of previous summit sessions, I much prefer the more cooperative atmosphere. We really must be growing up!

Default to Respect

If you know me, you know that I have a sense of humor that can be risqué at times (ok, perhaps crude would be a better description!). I'm also known to engage in the predominantly male form of communication that involves bonding by insulting each other: put downs, the dozens, whatever you want to call it. I also hold very opinionated positions on politics and religion, and enjoy engaging in lively discussions about them.

Yet when I am in a group of people I do not know very well, I do none of these things. Why? Because I am aware of their potential for offending people, or at the very least, making them feel uncomfortable. So I default to respect.

In programming, a default value is one that is used unless specifically overridden. Setting your default to respect means that unless you are certain that everyone within earshot (or who can otherwise observe you) knows you well enough to properly interpret your words or actions, limit yourself to those words or actions that do not require special interpretation; those that show respect for the people around you. Failing to do this is one of the biggest sources of the problems in the tech community when it comes to how women and other under-represented groups are treated. At conferences, or online, guys (yes, it's a guy problem) act as they would normally do when they are within their tight-knit group of friends, and say/write/do something that is interpreted as offensive or even hostile. When their poor choices are pointed out, they get defensive, using the excuse that their intention was not to offend, so no one should take it badly. Or they attack, claiming that the person who pointed out their behavior is too "politically correct" (at best), or an over-sensitive bitch (if the reporter is a woman). These attacks all too frequently cross the line from name-calling to outright threats.

But refraining from sexual references or racial stereotypes is not being "politically correct"; it's a sensible default value. Maybe later you might get to know these people better, and more importantly, they'll get to know you better. Only then when you make a crude joke will they know that you mean no harm. But until then, the only sensible approach is to default to respect. The practice of a conference having (and enforcing!) a Code of Conduct is really a way of defining these sensible defaults for people who apparently never learned them growing up. It is encouraging to see such codes become more common than not, for that will help our communities "grow up" and become more inclusive. The days of the tech world being an old boys' club are quickly drawing to a close, although it can't happen fast enough for me.

So does that mean that you need to muzzle yourself? No, of course not. There are plenty of places where you can express yourself; hell, if you follow me on Google+, you'll see that I'm not at all shy about stating my opinions. It's totally appropriate there, because if you don't like what I'm writing or the way I write it, you don't have to follow me; there are plenty of other people you might like better. But a conference or an online forum is a community vehicle, and filling either one with potentially hostile or offensive words or actions means that we will turn away many who would have otherwise helped the community grow better. We all suffer when otherwise talented and interesting people choose not to engage in our communities because they do not feel welcome. So do us all a favor and when you are in a community situation, set your default to respect.