May 2016 – Walking Contradiction

No, I’m not talking about history – this is about my cycling ride on Saturday. I participated in the 2016 Tour de Cure San Antonio, and completed the 103-mile course. I’ve only ridden a century (a 100-mile ride) once before, and my attempts at doing another were thwarted twice: once, a year later, when the entire ride was washed out by heavy thunderstorms, and then again at last year’s Tour de Cure, when they closed the century course early due to thunderstorms.

Start of ride — Lining up for the start of the ride (at 7am)!

Well, this year’s ride had its share of thunderstorms, too, but fortunately they were at the end. The day started off overcast and threatening-looking, but nothing came of all those clouds. About 30 miles into the ride the sun burst through, and I was hoping that it would stick around for a while. However, we only got to enjoy the sunshine for an hour or so until the clouds returned. It kept looking darker and darker as the ride progressed, and then at the rest stop at mile 80 there were event officials warning that a little ways up the road it was already raining heavily. They had vehicles that would shuttle you and your bike to the finish line if you didn’t want to ride through the storm, but that wasn’t what I had set out to do. What’s a little water, anyway?

To be honest, I was feeling pretty drained after 80 miles. When you sweat while cycling, the breeze against you dries it quickly, so after a few hours it feels like a salty crust. My leg muscles also felt like they had begun to run out of energy. But I set out to continue the ride anyway, and sure enough, about a mile later the skies opened up. Within minutes I was soaked from my helmet to my shoes. Oddly enough, though, it was actually re-invigorating! And once you’re wet, more rain isn’t getting you any wetter, so I rode on. The loud cracks of thunder sounded great, like music for a film I was starring in. Yeah, it felt pretty dramatic!

So I made it to the finish. The first time I did a century I was struggling – hard. I wasn’t even running on fumes then; hell, I would have loved to have had some fumes at that point. I had to stop several times in that last 30 mile loop to regain enough strength to keep going. So completing that ride was a matter of sheer will power. This year it was different: sure, I was tired during the ride, and a bit stiff afterwards, but when I got within a few miles of the finish, I found another gear and sprinted my way in.

Crossing the finish line after 103 miles!

I think that there were several differences this year. I had trained much better this time, so my legs were better able to keep going for the distance. It was also much cooler, with temperatures in the 70s (instead of around 90F). And the rain, while making some aspects uncomfortable, certainly helped to refresh me. Finally, the course this year didn’t have very many severe hills. It had lots of climb, but nothing compared to the earlier course, which featured several killer hills.

posing with medal — Posing with my medal after finishing the ride, soaking wet!

There are three sets of people I want to thank: first, the American Diabetes Association, for organizing this event and making it run so smoothly – you’re really doing great work! Second, to the members of the ProFox online community for generously donating to support me. Together we raised $500! And finally, of course, to my wonderful wife Linda, who encouraged me every step of the way, and even drove back home to get my water bottles that I had forgotten. Hey, it was 6 in the morning, and my brain hadn’t caffeinated enough yet!

Linda and Ed — Linda and I, just before the start of the ride

With my recent posts I seem to have confused people, and instead of helping us all see a better solution, I’ve made things murkier. So mea culpa.

The confusion comes from mentioning two distinct and mostly unrelated problems in different posts: the issues with the current Nova Scheduler regarding resource modeling and scalability, and the problem with fragmented data in the Cells V2 design. Because I proposed Cassandra as a solution to the first, many assumed that I was promoting it as the cure-all for everything in Nova. That’s not the case, so let me start with the focus on the cells issue.

The design of Cells V2 has a globally-available database, and separate database instances in each cell. The rationale was that this limits the failure domain, so if a single cell’s DB (or any other local service) goes down, the rest of my cloud will still operate normally. While this is a big advantage for the message queue, it comes at a high cost for data, as it will be difficult now to get a view of, say, a user’s resources across cells. Users don’t see (and can’t specify) the cell for their instance, so it is important to keep that global view. The response to my criticism was split between “yeah, that’s a bad idea” and “look, we can add this additional dependency and layer of complexity to fix it!”. The ROME approach to replacing MySQL with Redis was an interesting approach, but further discussion on the email list pointed to a much better choice (IMO): Vitess. Vitess would provide the failure isolation without having to fragment the data. So I would prefer to see everything moved to a single database, and if failure isolation and redundancy is important for the database, add a tool like Vitess to handle that. I don’t think that Cells V2 is a bad idea; quite the opposite is true. My only concern was the data design and the implications of that design on everything else in Nova.

Now to get back to the Scheduler, my proposal for Cassandra was based on two things: fast, reliable data availability without duplication and syncing, and the difficulty of modeling very different resource types in a single, inflexible relational design. Those were the biggest problems facing the Scheduler, and as the long-term plan is to separate the Scheduler into its own service so that it can support an even greater number of resource types, it seemed like settling on a static resource model now was going to lead to huge technical debt in the future. I had hoped to spur a discussion about that, and it certainly did. But let me make clear that I don’t think those arguments apply to Nova as a whole.

So again, mea culpa. Let’s keep the discussions going, because even though there has been some negative energy released in the process, the overall impact has been quite positive. I had never heard of Vitess before, and had no idea that it allowed YouTube to be able to use MySQL to handle the data loads it does. It’s exciting to see all these incredibly smart people with different technical backgrounds work together to come up with better and better solutions.

Month: May 2016

The Second Century

Mea Culpa and Clarification