Creativity and AI Models

Since I’ve retired I’ve taken up reviving a project that I co-wrote over 20 years ago, bringing it into the modern age. The project is Dabo: a framework for writing cross-platform, database-centric apps. Nowadays things like this are done over the web, but in 2004 the state of web applications was pretty limited as far as UI flexibility and database integration. And there were many companies that ran their business on a networked application rather than over the internet, so the market for desktop apps was there.

The project was written using the tools of the day: Python 2.4, wxPython 2.x for the UI classes, along with the database drivers of that era. So bringing Dabo to modern times would take a lot of work. Updating to Python 3.10+ was straightforward enough; I had already worked on several 2->3 projects over the years. Updating the database drivers was also straightforward; most had current versions with the same interface, and the others had been replaced with new products that also shared a similar interface (thanks, Python DB-API!).

The biggest hurdle by far is updating to the current state of wxPython. This is a Python wrapper around the wxWidgets project, which is a C++ cross-platform framework for creating native GUI widgets. wxPython inherits a lot of the C++ feeling and style, and as a result is very un-Pythonic. So when we wrote Dabo, we made our UI layer a wrapper around wxPython. In other words, it’s a wrapper of a wrapper of a C++ library.

You’ve probably never heard of wxPython or wxWidgets – they are relatively obscure these days. And that’s where I think there’s a problem.

I was trying to debug a crash that happened in a part of an app: clicking a button on one tab of a pageframe ran some code, and then switched to a different page. Without fail, switching the page caused a crash with no Python traceback. I asked Claude to debug this for me. It spent a lot of time reading through my codebase, then the entire Dabo framework, and then the entire wxPython project before it could understand where the fault lay. It then confidently pronounced that it understood the issue, and coded the fix. I ran that fix, and it also crashed. Fed that crash report back into Claude, and it went through the same “thinking” steps before pronouncing the new fix. Well, you can guess where this is going. I went through this loop six more times until it finally got a fix that worked. The trouble is, it was one of the ugliest hacks I had ever seen: removing several event bindings, switching the page, and then restoring those bindings.

I’ve seen a lot of reviews of vibe coding where they say to consider Claude and others as “junior developers”, but I would never expect such an ugly hack from anyone, no matter how inexperienced; they would have come to me and said they can’t figure it out, so could I help them with it?

I had an idea as to what caused the crash, as I’ve seen similar crashes before and they almost always involved the event loop. Briefly, events fire all the time, and the framework handles them in the order they are received. So in a case like this, where the code changed the active page, under the hood it fired several events that control updating the UI. These events can sometimes conflict with others that happen around the same time, and cause a weird appearance at best, and a system crash at worst. The trick is to use the “call after” invocation, which instead of firing the events immediately, tells the app to wait until all pending events are processed before handling this one.

I told Claude that its “fix” was unnecessarily hackish and ugly, and had it revert those changes. (Tip: always have tools like Claude work on a development branch, just in case it blows up like this.) I then changed the one line that set the active page to use the “call after” invocation, and the crash went away.
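To make the ordering difference concrete, here is a minimal sketch in plain Python. It is a toy model of an event queue, not wxPython's actual implementation, and the names ToyEventLoop, post, call_after, and demo are all made up for illustration. The only thing it tries to show is that a deferred call runs after the events that were already pending, while a direct call runs in the middle of the handler that made it.

```python
from collections import deque


class ToyEventLoop:
    """Hypothetical stand-in for a GUI event loop (not wxPython's real one)."""

    def __init__(self):
        self.pending = deque()  # handlers waiting to run, in arrival order
        self.log = []           # record of what ran, and in what order

    def post(self, handler):
        # Queue an event handler behind everything already pending.
        self.pending.append(handler)

    def call_after(self, func, *args):
        # Defer func(*args) until the handlers queued so far have run.
        self.pending.append(lambda: func(*args))

    def run(self):
        # Process handlers one at a time until the queue is empty.
        while self.pending:
            handler = self.pending.popleft()
            handler()


def demo(deferred):
    """Simulate a button click that switches pages while a repaint is pending."""
    loop = ToyEventLoop()

    def set_page(number):
        loop.log.append(f"page -> {number}")

    def on_click():
        loop.log.append("click handler start")
        if deferred:
            loop.call_after(set_page, 2)  # wait for pending events first
        else:
            set_page(2)                   # switch pages mid-handler
        loop.log.append("click handler end")

    loop.post(on_click)
    loop.post(lambda: loop.log.append("repaint"))  # event already in the queue
    loop.run()
    return loop.log


if __name__ == "__main__":
    print(demo(deferred=False))
    print(demo(deferred=True))
```

In the direct version the page switch lands between the start and end of the click handler, ahead of the pending repaint; in the deferred version it lands last, after the repaint has been handled. In wxPython itself the real call is wx.CallAfter, so the one-line change would look something like wx.CallAfter(notebook.SetSelection, 1), where notebook is a hypothetical wx.Notebook standing in for my pageframe.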

Now I’m not trying to disparage tools like Claude; I’ve used them successfully in the past. What I think is different this time is unfamiliarity: there just isn’t that much code out there that uses wxPython for them to get sufficiently trained on in order to determine the correct approach. You can get great results with apps written in Python, JavaScript, Rust, Go, etc., because there are tons of repos on GitHub in those languages for LLMs to train on. There just isn’t enough training material for wxPython.

Which brings me to the main thought that resulted from this: creativity. How can we ever expect LLMs to come up with something new? They are designed to draw on what’s already been created, and if they haven’t seen something yet, they are very unlikely to ever come up with it. I’m guessing that there was nothing in the crash report that had a link to a fix using call after, or else Claude would have come up with it as a solution. So in the next few years – the era of “vibe coding”, where LLMs generate the majority of new code – how do we expect new solutions to come about? New versions of the models will be trained on increasingly greater proportions of LLM-generated code; an inbreeding process that can only make newer models more repetitive.

Given this, can we ever expect an LLM to be creative? To propose an approach that has never been done before?

Day 44: Employed Once More!

After 3 1/2 months of unemployment, during which I submitted countless job applications, became a regular on LinkedIn, learned the routines of the Texas Unemployment Benefits system, and sat through numerous interviews, I’m excited to report that I have a new job!

In a couple of weeks I will be starting at Nvidia as a Senior Python Developer, working on the tools for their GPU cloud. I’ve met the other people on my team during the video interview process, and they all seem like a bright bunch, so I can’t wait to start working with them!

It’s been difficult these last few months. It started with the pandemic and subsequent lockdown, which has affected everyone. Then came the layoff, with DataRobot letting 25% of its workforce go, including yours truly. It really wasn’t much consolation that I was only 1 of the 40 million or so in the US who lost their job in those few weeks – it still hurt.

Still, I have had it better than most. My wife still had her job, which was super-important financially. We also had some savings, so we weren’t living paycheck-to-paycheck like so many Americans have to. And it did give me some free time to work on my photoviewer software, and practice my newly-discovered sport of disc golf. It also gave me the chance to perfect my sourdough bread technique (yeah, I know – how cliché!). But there is only so much to do when largely confined to the house.

Which is why I started this daily writing exercise. Not just to fill the time, but to get down some of the thoughts that have been in my head for a while, and polish my rusty writing skills. And while it’s been difficult to always find something to write about, I have noticed that writing itself is feeling more fluid.

I will continue this daily project until I start the job on July 20. After that, I will continue to write, but just not on a daily basis. Going through this exercise has helped me enjoy writing more, and improved my ability to let a piece out into the wild without first obsessing with endless editing. That is probably the best thing I’ve gotten out of it.

Day 43: Scarcity and Value

How do you price art?

There is one aspect of economics that everyone understands: the law of Supply and Demand. It’s pretty obvious: useful/desirable things will be valued more highly than stuff that isn’t as in demand, and scarce things will cause people to offer to pay more.

I’ve found something
No one else is looking for
I’ve found something
That there’s no use for
And what’s more
I’m keeping it to myself

Wire, Single K.O.

For this discussion let’s assume that the art in question is “desirable”, so that there is a certain level of demand for it. The determinant for price will therefore be how scarce it is.

There is a fundamental difference between a painting, in which the creative effort results in a single item, and a recording of a performance, which can be duplicated and replayed an infinite number of times. The artist can only sell their painting once, but can sell as many videos as people want.

This same issue comes up with media such as photography and print making: there really is no limit to the number of copies of a single art work that can be made. In the days of negatives, the act of making a positive print was itself part of the creative process, because the printer (usually the photographer) had to have a feel for how to balance overall exposure with local dodging and burning. The great photographer Edward Weston trained his son Cole to learn his precise printing techniques, so that Cole could continue to make prints that would be as close to the artist’s vision as possible. So while in theory an infinite number of prints could be made, there is a practical limit.

But digital photography throws all of that out the window. The artist can make whatever corrections or other changes they want to the digital file, which can then be reproduced without loss forever. So how does one determine a price for something like this?

I’ve recently begun to submit my work to several galleries, and have had some success – just yesterday I got notice that one of my photos was accepted for a show! But I’ve seen several Calls for Entry for exhibits that have a requirement that any submitted work be part of a limited edition. A Limited Edition is when the artist decides that there will only ever be a certain number of prints made, and each print is “numbered” so that the buyer knows that they are one of the few owners of that piece.

I call bullshit.

Art’s value is in the piece itself. If it moves you, makes you think, or just is stimulating to look at, it has value. The fact that only a few other people can enjoy that particular piece doesn’t change the experience; it just creates an artificial scarcity to prop up prices that otherwise can’t be justified.

Paintings are scarce, by their very nature. Digital photographs are not.

I’m not playing this game. Sure, this might keep me out of some galleries, but those are probably not compatible in spirit with me. With a calibrated monitor, I can create a digital file that can be printed exactly the same anywhere in the world. If you like my work and want a print, I will sell you a print. I won’t say “sorry, but I’ve sold all the prints I can make of that image. You’ll have to find one from some art dealer or collector”.

The digital transformation calls for new ways of thinking about art. The music business learned that lesson with the advent of the .mp3 file. The photographic business will need to grow to accommodate this new digital reality.

Day 35: Chopped Candidate

Have you ever watched the TV show Chopped? If you haven’t, it’s a competition among 4 chefs. There are 3 rounds, and after each round, one of them is chopped (eliminated), until one remains. The winner gets a cash prize. This would seem like a good way to determine who is the best of the group, right?

The problem is how the competition is run: each round the chefs are given a basket of “mystery” ingredients that they can’t see until the round begins. And more often than not, the basket contains, shall we say, “odd” combinations. One such basket contained blood orange syrup, the African spice blend ras el hanout, hot cross buns, and lamb testicles. The chefs can add other staple ingredients, but those four flavors have to be featured prominently in the result.

And if that isn’t difficult enough, there is a time limit that is always ridiculously short. The chefs had 20 minutes to create an appetizer from the basket I described above: 20 minutes to create a recipe, determine what other ingredients to add, prepare and cook the food, and then plate it for a beautiful presentation.

I must confess that I find the show very entertaining, and have watched countless episodes. And I’m not alone: the show has been running for 44 seasons over the past 11 years. But let me ask you: if you were opening a restaurant, would this be the way you would select your head chef? I would hope not! Any restaurant that would spring surprises on their chefs and expect them to deliver first-rate food in impossibly short time limits wouldn’t last very long.

Which brings me to the point of all this: if you are interviewing for a programmer, do your interviews actually determine how well they would be able to work in your team? How positive their contribution will be?

Making a candidate live code a solution to a problem they’ve never seen before in a short period of time with people watching their every keystroke is the software development equivalent of being on Chopped. I certainly hope that your work environment isn’t anything like that. So why would you think that a live coding session in an interview tells you anything about their potential?

What artificial scenarios like Chopped or live coding interviews do is test a candidate’s ability to handle stress. Personally, I’ve never had a problem with live coding, but then again I’ve never had test anxiety in school, either. I’ve seen many talented developers choke under those circumstances, but that doesn’t mean that I wouldn’t want to have them on my team.

What does it say about your company as a place to work if the bar candidates have to clear is how well they can handle high levels of stress?

When I first started interviewing candidates when I was at Rackspace, the standard was to have one interviewer do a live coding challenge, and another ask one of those bizarre, abstract brainteasers (“Walk us through your thoughts…”). Once again, these practices just show how nervous someone is in what is already an inherently stressful situation. That link includes a juicy quote:

These types of questions are likely to frustrate some interviewees so watch out for those who aren’t willing to play the game. It’s an interview after all and you make the rules.

Mark Wilkinson, head of recruitment, Coburg Banks

It’s all a game to him, and if asking questions with no right answers eliminates potentially good candidates, tough. It sounds like he is more interested in seeing who can tolerate being bullied than finding the best people for his company.

After sitting through some of these types of interviews at Rackspace, I campaigned internally to change these practices, because I saw some intelligent and capable candidates get flustered and end up looking dumb. I found that there are better ways to determine if someone is a good addition to your team. Perhaps I’ll elaborate more about these in a future post…

Day 29: Death to Superman

Have you ever worked with a large team on a complex project? Usually there is a mix of experience levels, and those with more experience create the application design as well as the workflow that everyone will use. They also serve as the disseminators of information, especially when a new member joins the team. They are the resources that help everyone become more productive.

At least that’s how it’s supposed to work. In a company with a good development culture, knowledge is freely shared, and the goal of the senior developers is to help create new senior developers.

In other situations, there is a different dynamic: the overall knowledge of the project is in the brains of a select few developers, and they consider the intricacies of the application their domain. Often it isn’t a group, but rather a single individual in that role. They begin to act like Superman: swooping in to save the day in a manner that only they can do.

This is common in companies without a healthy developer culture. Typically there is a sole developer assigned to create an application, so of course they are the only one who knows how it works. Or the project could have started with a small team, and eventually everyone leaves the project except for one. Other teams that need that functionality need to go through this person, who is now the bottleneck, the gatekeeper. As new people are added to the team, this one developer keeps them dependent on him (yeah, it’s usually a man) by sharing bits of knowledge only as needed, and not educating the new members. He tends to treat the other developers as inferior, and as a result, no one else feels competent to handle the work that Superman can do.

Back in the ’90s when I was a junior developer I was placed on a team that had exactly that dynamic. Someone would have a great idea, but nobody would act on it until that one lead dev signed off on it. People were even a little afraid to say that they thought it was a good idea, because if this Superman figure didn’t like it, he wouldn’t simply explain why. Instead he’d make you feel dumb for not understanding every implication your change would have.

Of course, when he wanted to change something, he just did it without involving anyone else. It wasn’t unusual to come in one day to find the part of the code you’d been working on had been changed, or sometimes even deleted. Needless to say, there was a general unhappiness on the team.

After a few months, Superman started throwing his weight around with our boss, taking the attitude “you can’t afford to lose me”. It worked for a while, but after one particularly obnoxious outburst, our boss called his bluff, and Superman quit on the spot and stormed out. Everyone on the team was both relieved that the source of tension was gone, and also afraid of how much more work this would mean for us. We were all afraid that the project would founder, and we would have to re-hire Superman, who would then be even more insufferable.

To our surprise, it wasn’t that bad at all. Everyone started exploring the code base a bit more, now that we didn’t have Superman to supply that knowledge and make those changes. We started talking among ourselves about things we thought needed to be changed, and team members who were always quiet began to speak up more. The entire dynamic of the team changed for the better. And instead of the project falling apart without Superman to lead the way, it got better. Maybe no single person knew the entire code base like he did, but we all learned a lot more, and with people working together, got more done. We divvied up the code so that each person was responsible for learning that part well enough to be a resource to the others. Knowledge was once again being shared.

So while it’s good to have some knowledgeable people on a team to serve as guides for the newer members, it can become toxic with the wrong people and the wrong environment. If you’re on a team with such a toxic member (or members), don’t worry about what would happen if they left the project. Inevitably, the team will be better off without them. Speak with your manager if they aren’t already aware of the situation, and try to come up with a plan to spread the knowledge around better. And if you are told that they think things are fine the way they are, that’s a very strong signal that it’s time to update your resume. It’s not worth the mental toll to remain in a toxic environment.