Widgets Are Lego Bricks (and Other Things People Are Sleeping On) — with Vincent Warmerdam

It's just more coincidental than you might think. But it is consciously coincidental, in the sense of: oh, that sounds interesting, I'm consciously going to move in that direction.

So widgets are like Legos. That's really the way that I like to think about it.

The human in the loop: data science and AI

To me, it's when the human is involved a little bit more. If you have an app that people just have to use, and the innards don't really matter, then the React app thing sounds a bit more plausible. But when dealing with data, usually the whole problem with data science is that you want to understand the story that's in the dataset. And if you understand it, then you can make better decisions. That's kind of the plan that we have. But in order for you to understand the data, the human has to be involved somehow. You have to be able to read a chart, or inspect things, or whatnot. So the whole point of these widgets is to make that part easier.

One of the big questions on my mind these days, and I'm obviously interested in what you think about it, is what happens in a world where people essentially stop writing code and are just prompting. Looking at the current stack of tools, like the widgets and the notebook: I think it's good that Marimo was built in this era already, so I assume you've all been thinking hard about this. What do you think the average user workflow looks like in two or three years, when essentially nobody's writing any Python code anymore?

So part of it is the psychology of it, I feel, a little bit, right? Like, oh, we don't know, it feels kind of scary. And the whole thing with Claude, at least to me, is that sometimes there's a wave of, oh my God, it can do so much. But then there's a phase where I see it fail horribly, and I kind of feel grounded again. So that psychology is happening in the back of my mind.

I just want to be honest about it. I don't exactly know what's going to happen. But if I think about my days in consulting and ask what really went wrong most of the time, it's that people just didn't know what was in the data, and they were making all sorts of bad decisions because of it. I don't know if the LLM is going to do everything, or if that's going to mean more or fewer bad decisions, but something in my mind is basically saying: the magic of the notebook is that it's also a nice way to debug.

There is something about, let's make a chart that should be a summary. And then just staring at the chart for five minutes can immediately give you this, oh, hang on, that should not happen. Why is that line going down in December? It should be going up. So it's a debugging thing. And in that sense, you could argue all the code is not necessarily the most important thing. The most important thing is the understanding. The notebook and the code are a tool in order to get to the data understanding bits, so to say.

The ChickWeight dataset and why chickens matter

The best example I have of this is in R, my favorite dataset, the ChickWeight dataset. So one column is the diet. The other column is the time. The other column is the chicken. And the other column is the weight of the chicken. So over time, you see all these chickens gaining weight. That's the whole idea. And you can train machine learning models on it. You can predict, given this diet, given this time, how fat is the chicken or how not fat is the chicken.

But there's one thing wrong if you're going to do modeling that way. And that is the fact that some chickens actually die prematurely. And the only way that the model can be made aware of that is if you, the human, are aware of that. Because if you're going to calculate some sort of regression average on timestamp 10, how is the model going to be aware of the fact that some chickens died at timestamp five for that specific diet? Well, we could have an LLM automate this. But this is my main example of how things will go wrong if you don't understand what's in the dataset.
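A minimal sketch of that problem, using made-up numbers rather than the real ChickWeight data. The rows, the tuple layout, and the `naive_mean_weight` helper are all hypothetical. It shows how an average at a late timestamp quietly leaves out the chickens that died earlier, and how counting the chickens behind each average exposes that.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows in the spirit of ChickWeight: (chick, diet, time, weight).
# Chick "c3" on diet 1 has no rows after time 5, as if it died prematurely.
rows = [
    ("c1", 1, 0, 40), ("c1", 1, 5, 60), ("c1", 1, 10, 90),
    ("c2", 1, 0, 42), ("c2", 1, 5, 64), ("c2", 1, 10, 94),
    ("c3", 1, 0, 38), ("c3", 1, 5, 41),
    ("c4", 2, 0, 41), ("c4", 2, 5, 70), ("c4", 2, 10, 110),
    ("c5", 2, 0, 39), ("c5", 2, 5, 68), ("c5", 2, 10, 106),
]

def naive_mean_weight(rows):
    """Average weight per (diet, time), plus how many chicks back each average."""
    groups = defaultdict(list)
    for chick, diet, time, weight in rows:
        groups[(diet, time)].append(weight)
    return {key: (mean(ws), len(ws)) for key, ws in groups.items()}

summary = naive_mean_weight(rows)
for (diet, time), (avg, n) in sorted(summary.items()):
    print(f"diet={diet} time={time:>2} mean_weight={avg:6.1f} n_chicks={n}")
```

In this toy data the diet 1 average at time 10 rests on two chicks instead of three. The average alone looks fine; it is the drop in `n_chicks` that a person has to notice.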

And variants of this story happen all over the place. So for me, the notebook is basically this environment where I can do my best to prevent stuff like that from going wrong.

It's one of Hadley's quotes that I remember. The cool thing about visualization is it doesn't scale, but it can still surprise you. It's that surprising nature. It's basically this debugging tool, but for numeric stuff. That's the way that I experienced it. And even if the code is going to be maybe less valuable, the debugging numeric aspect of it definitely won't become less important.

So one thing I do a lot now is I try to just reproduce academic articles with vibe coding. Here's an article, here's an LLM. I want to understand this article as quickly as possible. And then the notebook again becomes this artifact where I can look at charts and I can try to understand if I understand the technique of the paper, so to say. That's, in my mind, the essence of what the notebook is all about. It's not so much about the code per se. It's more this artifact that helps me think. Or reason, I guess, is the better word.

Vibe-coded science and confirmation bias

Yeah, I'm definitely pretty AI-pilled these days. I definitely feel this sense, especially for the data science ecosystem, that data science is one of the domains where the most judgment and nuance is present. It isn't like building a to-do list app. Doing data science is not the same as building a CRUD app. There is a certain art and taste involved in choosing models and techniques, and in building a scientific process that tries to eliminate your predispositions or biases about what results you think the analysis should have.

And what I've been hearing from people who are doing vibe-coded science is that the models are kind of tuning into what you're trying to prove or demonstrate. And then they're subtly building data science that tries to fit itself to the assertion you're trying to make, which is essentially amplifying your personal bias. I feel like this is exactly what we don't want. We don't want the LLMs doing sycophantic science. Like, oh, what are you trying to prove here? Let me come up with an analysis that confirms your... you know, it's like a more intense version of confirmation bias, which is a classic problem in statistics and science in general. So, I don't know, in this new world where people don't want to write code anymore...

You could argue, though, that in your case, I mean, you've written pandas and a bunch of things, so you can tell yourself: okay, I've got taste, right? Like, I need this to happen in a certain way, otherwise it's BS. You can make claims like that quite comfortably, I think, right?

I mean, in my mind, there's a couple of very human habits that I tend to rely on. The thing that I taught myself, the ChickWeight example again, right? If I've learned anything from the ChickWeight dataset, it's that if I make a chart, I need to stop for five minutes and look at the damn thing. Because otherwise you're not going to notice the line that stops in the middle, where a chicken might have died. That's kind of the lesson.
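One way to back up the "stare at the chart" habit is a small check for lines that stop early. This is only a sketch under assumptions: the observations are invented, and `chicks_that_stop_early` is a hypothetical helper, not part of any library.

```python
# Hypothetical observations: chick id -> list of (time, weight) points.
series = {
    "c1": [(0, 40), (5, 60), (10, 90)],
    "c2": [(0, 42), (5, 64), (10, 94)],
    "c3": [(0, 38), (5, 41)],  # this line stops in the middle of the chart
}

def chicks_that_stop_early(series):
    """Return {chick: last_time} for chicks whose last point is before the latest time seen."""
    last_seen = {chick: max(t for t, _ in points) for chick, points in series.items()}
    study_end = max(last_seen.values())
    return {chick: t for chick, t in last_seen.items() if t < study_end}

stopped = chicks_that_stop_early(series)
print(stopped)
```

A check like this does not replace looking at the chart. It only tells you where to look, and it assumes the longest series marks the real end of the study.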

And that also means that if you're vibe coding: okay, one chart at a time, and I need to stare at the chart for a couple of minutes before I move on. If I do serious work, I'm not going to generate entire notebooks from scratch, because then you're not going to be part of the story of what's happening there. And I don't know, some of this does feel more like a human psychological attitude thing than necessarily an "LLMs can correct each other" thing. Because again, it's all about me trying to understand the story in the dataset. And if I remove myself from that equation, I'm going to be in a world of hurt.

I need to be involved somehow. And maybe the solution, at least the most plausible one to me, is: let's only generate three cells at a time, not any faster. And when you reach a milestone, make sure you understand everything that's been happening, and then you move on.
