<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
      <title>Ceejbot&#x27;s notes</title>
      <link>https://blog.ceejbot.com</link>
      <description>This internet thing seems to have taken off.</description>
      <generator>Zola</generator>
      <language>en</language>
      <managingEditor>ceejceej@gmail.com (C J Silverio)</managingEditor>
      <webMaster>ceejceej@gmail.com (C J Silverio)</webMaster>
      <copyright>© 2022-2026 C J Silverio</copyright>
      <atom:link href="https://blog.ceejbot.com/rss.xml" rel="self" type="application/rss+xml"/>
      <lastBuildDate>Wed, 08 Apr 2026 12:00:00 +0000</lastBuildDate>
      <item>
          <title>Writing design docs</title>
          <link>https://blog.ceejbot.com/posts/design-docs/</link>
          <pubDate>Wed, 08 Apr 2026 12:00:00 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/design-docs/</guid>
          <description>How to figure out what to do and tell your colleagues at the same time</description>
          <content:encoded><![CDATA[<p>A couple of years ago I <a rel="noopener external" target="_blank" href="https://blog.ceejbot.com/posts/systems-analysis-rubric/">wrote a blog post</a> on how to write a design document. This is a high-octane version of that blog post, refined for the era of short attention spans like the one I have now. In this post, I tell you what a design document is and what it <em>isn’t</em>, and give you a way that works for me to write one, only with fewer words. I’m doing this because I have a hard time getting anybody to read the previous one.</p>
<p>Maybe I should record a TikTok?</p>
<hr />
<p>First: A design document is not the deliverable you care about. The deliverable you care about is <em>shared understanding</em> of a problem and <em>alignment</em> on a solution. The act of writing the document forces you to think clearly, surfaces what you don’t know, and gives your colleagues something concrete to react to.</p>
<p>Second: If you skip the document and jump straight to code, you <em>might</em> make all the same decisions — you’ll just make them implicitly, alone, and without the benefit of other people’s knowledge. Writing the document makes the decisions explicit and your reasoning reviewable. Who knows, you might learn something crucial that changes your decisions.</p>
<h2 id="what-s-in-a-design-doc">What’s in a design doc</h2>
<p>Include all of the following, in whatever volume makes sense for the project:</p>
<ol>
<li>
<p><strong>Problem statement.</strong> What’s wrong, or what’s missing? What change in system behavior do you want? What does success look like? What problems are you explicitly <em>not</em> solving?</p>
</li>
<li>
<p><strong>Background and research.</strong> What exists today and how does it work? What did you learn by investigating? Include data: scale, growth rates, failure modes, constraints. A reader who doesn’t live in this part of your system or codebase should be able to follow this.</p>
</li>
<li>
<p><strong>Project values.</strong> What do you care about most? Resilience? Speed of delivery? Privacy? Cost? Simplicity? Write these down, because you will need them when you evaluate tradeoffs. Every interesting engineering decision involves giving up something to get something else. Without agreed-upon values, every tradeoff discussion devolves into opinion.</p>
</li>
<li>
<p><strong>Options considered.</strong> What approaches did you evaluate? You don’t need an exhaustive list — two or three real contenders, with their tradeoffs stated honestly against your project values, is enough. If you only considered one option, say so and explain why.</p>
</li>
<li>
<p><strong>Recommended solution, with rationale.</strong> What did you choose, and <em>why</em>? The rationale should trace back to the project values and the tradeoffs you surfaced. A reader should be able to disagree with a value and see exactly how that would change the conclusion.</p>
</li>
<li>
<p><strong>Open questions.</strong> What don’t you know yet? What needs more research? What decisions are you deferring, and why? Admitting what you don’t know is a sign of rigor, not weakness.</p>
</li>
</ol>
<h2 id="how-to-write-it-widen-the-circle">How to write it: widen the circle</h2>
<p>Don’t write the document in isolation and then present it to a crowd. Widen the conversation in stages:</p>
<p><strong>Start with yourself.</strong> Take notes. Do research. Sketch the problem statement. This is where you discover what you don’t know.</p>
<p><strong>Talk to one or two trusted people.</strong> Not to present a solution to them, but to <em>learn</em>. Explain the problem, listen to what they ask, take note of what you can’t answer. Their questions are more valuable than their opinions! This is the conversation that does the most for you. It’ll take that problem statement and put a fine edge on it.</p>
<p><strong>Show it to domain experts.</strong> The people who know the systems you’re touching, the data you’re handling, the users you’re affecting. They will find the gaps in your research and the flaws in your assumptions. Revise to include what you learn.</p>
<p><strong>Show it to stakeholders before any public review.</strong> Nobody likes surprises. Give the people who care about the outcome a chance to read and react privately before you put it in front of the wider group. This isn’t politics — it’s respect for their perspective and time. A rule to live by: don’t surprise people in meetings.</p>
<p><strong>Then open it up.</strong> By the time the document reaches a wider audience, it should have survived several rounds of honest questioning. The public discussion becomes productive because the obvious gaps are already filled.</p>
<h2 id="common-failure-modes">Common failure modes</h2>
<p><strong>Jumping to the solution.</strong> The document opens with “we will use Tool X” and spends all its energy on implementation details. The reader can’t tell if Tool X is the right choice because the problem was never stated clearly enough to evaluate alternatives. Or, solution Y is described in detail, but the reader doesn’t know why solution Y is the one that got all the detail.</p>
<p><strong>Missing the why.</strong> Every design choice has a reason; write down the reason. If you can’t articulate it, you haven’t finished thinking. “It seemed simpler” is a reason — write it down. “We didn’t consider alternatives” is also worth writing down, because it tells a reviewer where to push.</p>
<p><strong>Not stating your values.</strong> Without project values, tradeoff discussions become arguments about personal preference. Two engineers can look at the same tradeoff and reach opposite conclusions — both reasonably — if they’re optimizing for different things. Make the optimization targets explicit.</p>
<p><strong>Writing for approval instead of understanding.</strong> If the goal is to get the document approved with minimal friction, you’ll unconsciously hide the hard tradeoffs and open questions. If the goal is shared understanding, you’ll surface the difficult decisions. Writing for approval produces documents that look polished and collapse under scrutiny. Writing for understanding produces documents that look rough and hold up.</p>
<h2 id="the-inevitable-recap">The inevitable recap</h2>
<p>This is the part where I tell you what I just told you, only while swinging a big loud hammer.</p>
<p>The act of writing the document is more important than the document itself. Once you start implementation, the document will drift out of sync with reality; this is normal! What lasts is the shared understanding you and your team have because you worked through the problem together. You are all certain that you’re solving the right problem with the right tradeoffs. The shared understanding is what matters most.</p>
<p>And yes, this works just as well if your team is an agent. Don’t skip the part where <em>you</em>, the human, understand the problem.</p>
]]></content:encoded>
      </item>
      <item>
          <title>Future shock</title>
          <link>https://blog.ceejbot.com/posts/future-shock/</link>
          <pubDate>Tue, 03 Mar 2026 20:36:39 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/future-shock/</guid>
          <description>Software development in the age of gen AI</description>
          <content:encoded><![CDATA[<p>I wrote this talk for my employer, to follow a talk given the previous week by the product organization about how they intended to change their work to use the new generation of LLMs. Nobody asked me to. I requested the slot myself – the entire hour, speaking to the entire engineering organization of a bit over 300 people – and I was grateful they had enough trust in me to say yes and let me say what I wanted sight unseen.</p>
<p>There is of course a context in which I wrote this that deeply influenced what I chose to say: the audience is this specific group of people in this moment. This talk might not be something you need to hear or want to hear.</p>
<p>I’ll set some scene here. The industry is health-care adjacent. A software startup has recently been merged with enterprise companies that are not software startups <em>at all</em>. There is great discomfort on the non-startup side of the company about the changes they see in engineering. They just got access to Claude and were told in an earlier all-hands that it was somebody’s opinion that if they didn’t know how to use these tools they’d be unemployable.</p>
<p>Meanwhile, on the startup side, the engineers have been using LLM coding tools with open eyes for some time and are well aware of what they can do and what they can’t, well aware that the economic issues are going to be sticky, that this is not general AI, and that running inference and training models is destroying the environment. My team’s nickname for Claude is “the shoggoth” (thank you, James). We know the ride is going to be bumpy.</p>
<p>We also know that something really interesting happened in January, when Anthropic released Claude Opus 4.6, and that Claude got a <em>lot</em> better. It turned a corner in a way that is deep. If you are basing your opinion on anything you did before then, you need to retest and reconsider your opinion. Also: this technology is not going away.</p>
<p>In the middle of this mess: a talk. For me, the subtext is taking care of my people.</p>
<p><em>(There will be no video of me delivering the talk, because my slides were text-only sans branding. Sorry! I know people asked for it. This blog post remains employer-scrubbed.)</em></p>
<p>What follows is a blog version of the talk, drawn from the prose drafts I wrote, mashed up with my final slides and speaker notes. It includes material I cut for time. At one point I start using “Claude” as my word for “LLM tools”. You can fill in whichever tool you use there – we use Claude Code at my employer, so that’s the name I used.</p>
<h2 id="two-shocks">Two shocks</h2>
<p>Hello! It’s nice to talk to you again. Most of you have heard from me recently in my mode as manager of the core platform team. My title is officially “principal engineer”, however, and it’s nice to be able to wear the principal engineer’s hat now and then. I’m doing that today. When I wear this hat, I get to share things I know about this job with the people around me, which I think is important. I’ve been doing this for more than 30 years, so I’ve learned a few things along the way.</p>
<p>In particular, today I want to share with you some things I’ve been thinking recently about what’s been going on around us with our profession. I’m going to get a little philosophical, but I’m also going to give some advice, and hopefully I’ll have something useful to say to everybody in this room, not just engineers. Okay, ready?</p>
<p>Let’s talk about <em>future shock</em>. Alvin Toffler coined the term in a 1965 article for <em>Horizon</em> magazine, then wrote a book with that title in 1970, elaborating on the idea.</p>
<blockquote>
<p>“With future shock you stay in one place but <strong>your own culture changes</strong> so rapidly that it has the same disorienting effect as going to another culture”
—Alvin Toffler, 1970</p>
</blockquote>
<p>It’s no coincidence that the 1960s moved somebody to reflect on this concept, because that decade featured a lot of drastic cultural change in a short amount of time. It went from suits and ties and short hair in 1960 to beards and long hair and the counter-culture in 1970. Americans sure felt the cultural whiplash.</p>
<p>I think we’re seeing <em>two</em> kinds of culture shock happening at our company right now. We’re getting a double dose.</p>
<p>First, there’s the shock of two different <em>engineering cultures</em> meeting. We use different operating systems. We use different programming languages. We use different words for “person who writes software”: developer vs engineer.</p>
<p>These are surface details that signal deeper cultural divides underneath: Silicon Valley startup culture versus enterprise software culture. In the country I come from, engineers have status and power. We expect transparency and flat hierarchies. We move fast. We set out to change the world and we expect to do it sometimes. We don’t look at a comfortable established business and say, “looks great to me!” We push. And to people from the outside, that pushiness sometimes gets a far less polite label.</p>
<p>But even with these two foreign countries meeting, this isn’t the big culture shock. The big culture shock – the one that’s really the future shock – is hitting both sets of strangers.</p>
<p>Programming culture is changing around all of us, rapidly. We didn’t ask for it. We didn’t do anything to cause it. Our culture simply changed on us. LLM coding tools arrived, and changed our profession.</p>
<p>What I want to do today is give you all a way to cope with this shock by putting another frame around it, and give us all another way to look at it. Technological change has happened before, and is going to happen again, and we have some patterns to apply. I know one metaphor people have been using is that it’s the biggest change since compilers, and yeah, it is. But I don’t think that gets at how disruptive this is going to be.</p>
<p>I think generative AI is the Industrial Revolution of computing, with <em>all</em> the implications you can imagine from that.</p>
<ul>
<li>It’ll cause unpredictable and large <em>economic</em> change.</li>
<li>The software industry will be <em>permanently</em> changed.</li>
<li>Certain <em>categories of jobs</em> will vanish, and others appear.</li>
<li>The timescale will be vastly more <em>compressed</em>.</li>
<li>We sure are burning a lot of <em>coal</em>. (It’s environmentally destructive.)</li>
</ul>
<p>Economic upheaval comes hand-in-hand with misery, of course. The word <em>Dickensian</em> exists to remind us how bad the real Industrial Revolution got. We’ll see some of this, too.</p>
<p>Because we’re in the middle of this transformation, I can’t predict where it’s going to land. I’m not a futurist. This is one of the reasons it’s causing so much anxiety, of course.</p>
<p>So your feelings are real. The shock and anxiety are real. You’re reacting to something that is in fact shocking. If you feel an urge to slow it down, or pretend it’s not happening, I get it. I also really get the impulse to respond like it’s another silly VC hype-fest like cryptocurrency or blockchain whatever: a stupid trend you can wait out.</p>
<p>And yes, there’s fear in the air. You’ve heard that if you don’t learn these tools, you’ll be unemployable. That sounds pretty awful to me. What if none of us have jobs at the end of this?</p>
<p>Here’s what I know about history. The Industrial Revolution didn’t eliminate human labor. It changed what human labor <em>meant</em>. Human labor still mattered at the end of it. The people who adapt to major technological shifts don’t just survive – they become <em>more</em> valuable, not less.</p>
<p>This technological shift is changing what developing software means.</p>
<p>I think the genie of LLMs has escaped the bottle, just like the genie of the Internet had by 2000. The concept of the Internet – of connecting all the world’s computers – <em>survived</em> the dot-com collapse and the popping of a hundred badly-conceived business bubbles. You can get groceries delivered today, by companies who understand logistics the way Webvan did not in 2000.</p>
<p>Gen AI and large language models will survive whatever happens to any specific company or any funding bubble. I have my own bets on which players will survive and which will fail – incredibly destructively – but whatever happens, <em>the concept</em> will stay with us, more sensibly and sustainably. Forever.</p>
<p>I’m here to tell you it’ll be okay. You are smart. You can adapt. I believe in you.</p>
<p>Now I’m going to share with you what I think you should <em>do</em>.</p>
<p>So if we’re living through an <strong>industrial revolution</strong>… How do you survive one? You <em>learn to use the new tools.</em> You learn what they’re <em>good for</em> and what they’re <em>not</em> good for. And then you do things that were <em>never possible before.</em></p>
<p>(Look, it’s obvious. You go steampunk.)</p>
<p>What I’m going to do now is spin this entire discussion around. I want to change the frame that everybody puts on these tools. Claude is not a parrot for churning out thousands of lines of code and making programmers obsolete. Not even close. I’ll tell you what it is instead.</p>
<h2 id="the-reframe">The reframe</h2>
<div style="position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;">
  <iframe
    src="https://www.youtube.com/embed/NjIhmzU0Y8Y"
    style="position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;"
    allowfullscreen
    title="YouTube video"
  ></iframe>
</div>
<p>Claude is a bicycle of the mind.</p>
<p>No more, and no less, just like a computer is. It’s more <em>accessible</em>, because the interface is human language, not a programming language. Today’s computers are orders of magnitude more capable than the computers Jobs was talking about. Put those two things together, and you have the revolution. But it’s still the same thing at heart: an amplifier for the human mind.</p>
<ul>
<li>It amplifies your <em>capabilities</em>.</li>
<li>The more capable <em>you</em> are, the <em>more</em> is amplified.</li>
<li>It can amplify <em>disasters</em> as well as successes.</li>
<li>Your <em>blast radius</em> is larger, for good and for ill.</li>
</ul>
<p>It is a <em>personal</em> amplifier, not a generic one. It’s my bicycle, and I’m the one riding it, going further because of the amplification. I still have to pedal! The bicycle goes in the direction I choose! But I’m going further and more efficiently than I could on foot.</p>
<p>When Steve Jobs said we were at the very early stages of this tool, and that the enormous changes were nothing compared to what’s coming, he didn’t know what shape it would take – he couldn’t. But I think this is another one of those leaps forward. The bicycle of the mind is more powerful than it was, but that’s still what it is.</p>
<h2 id="try-some-things">Try some things</h2>
<p>Here are some of the things it can amplify your ability to do.</p>
<ul>
<li>Researching new topics.</li>
<li>Spiking on potential ideas.</li>
<li>Searching an entire corporate knowledge system.</li>
<li>Revising your product requirements doc.</li>
<li>Refactoring a large codebase.</li>
<li>Validating work through review and testing.</li>
<li>And yes, implementing features.</li>
</ul>
<p>It’s not about writing 20x as many lines of code, though you <em>can</em> do that. It’s about doing things you <em>wouldn’t have attempted</em> at all.</p>
<ul>
<li>A refactoring you’d never have had time for.</li>
<li>An analysis you’d never have bothered with.</li>
<li>A prototype you’d never have built.</li>
</ul>
<p>Yes, you go faster at existing tasks, but more importantly, the tool expands what’s possible for you. You can ride a century on a bike and be back for dinner.</p>
<p>The amplification goes beyond software, of course, and we’ll get better at using the tools to do things beyond writing code. These are large <em>language</em> models. They amplify our ability to do interesting things to whatever we can <em>represent</em> as text. Code just happens to be text, so it’s where we started.</p>
<p>Representing information as text-like-things is the underlying concept of the digital era. Your ability to manipulate any data at all has just been amplified. The implications are <em>profound</em> and I don’t think they’ve fully sunk in everywhere yet. I’m still reeling about this one.</p>
<p>If the medium is the message, the medium of the LLM is all information. Eventually.</p>
<p>Would you like to know what I think about this? Well, if you know me, you might be able to guess.</p>
<p>I think this is <em>unbelievably awesome</em>.</p>
<p>I love writing software. Claude Code is the best tool for thinking about software and writing software I’ve used in my life. The ability to write software to manipulate data is <em>power</em>. Claude extends my power unimaginably far. And know this: my skill set is critical to this extension. Everything I’ve learned over my career <em>matters</em>.</p>
<p>I’ve been doing this for nearly thirty-five years, and this is the most exciting time to write software I’ve seen. Yes, there are pleasures lost here. I love the flow state of writing code, just like I love printing black and white photographs in the darkroom. I don’t have to lose either one. I can still do both for pleasure. Black and white photography isn’t part of anybody’s journalism workflow any more, but people still do it. There’s a place for it all.</p>
<p>I’ll say one thing further, sincerely. My employer right now is a fantastic place to experience this change. We’ve had access to these tools since the beginning, when they were honestly terrible. We have access to the <em>best</em> of the tools today. We have leadership that wants us to take advantage of these tools pragmatically, in whatever way makes sense, not blindly demanding that we use them or else. We are encouraged to experiment and share the results of our experimentation. So my theory is: dive in. Take advantage of this opportunity. Learn some things. Figure it out here, because this is great.</p>
<p>(Psst! Take advantage of this. It’s a resume bonanza. Don’t tell anyone I said this.)</p>
<h2 id="the-advice">The advice</h2>
<p>Let’s move into some practical advice for people in the virtual Zoom room. I’m going to make some suggestions and predictions next, for:</p>
<ul>
<li>engineers</li>
<li>QA people</li>
<li>managers</li>
<li>product designers</li>
<li>executive types</li>
<li>the whole organization</li>
</ul>
<p>These suggestions will be my best ideas for today. If they’re bad ideas in a year, that’s fine. Things are changing fast.</p>
<p>If this all <strong>just happened</strong> to you, if you’re on a Windows/.NET stack and none of these tools feel familiar yet, that’s not a failing. The tools are <em>newer</em> in your world. The learning curve is <em>real</em>. My team has been working on ways to use these tools effectively for nearly a year now. We have a big head start.</p>
<p>Remember this: your underlying skills—understanding systems, debugging, thinking about edge cases—those transfer completely. Everything you know about how to do your job is immediately useful.</p>
<p>Here’s one more thing I want to say before I go into advice. Being <strong>told</strong> to use a tool you don’t understand, on a timeline you didn’t choose, is terrible. I know. I <em>see</em> you.</p>
<p>The way through is <em>together</em>: pair with someone who’s further along. Ask for help. That’s strength, not weakness. You will have to do the work, but you won’t do it alone.</p>
<h3 id="advice-for-everybody">Advice for <strong>everybody</strong></h3>
<p>Start using Claude Code today.</p>
<ul>
<li>You <em>do not need to ask</em> permission.</li>
<li>You <em>already have</em> permission.</li>
<li><em>Use it</em> for whatever work you’re doing.</li>
<li>If anybody tells you not to use it, they’re <em>wrong.</em></li>
</ul>
<p>Somebody saying “I think that might not work here exactly; try this instead” is saying something different from “don’t use it.” If somebody is saying not to use it, we need to talk to that person.</p>
<p><em>Go for it.</em></p>
<p>Try Claude in the terminal.</p>
<p>The experience embedded in IDEs feels limited in comparison, particularly for context management. This might change some day, but right now, the terminal shows you more and gives you customization options you don’t have in IDE chat windows. I know that the terminal is an adjustment for some of us, particularly if you are new to macOS or Linux: there is no version that works with PowerShell as I write this. However, I strongly suggest powering through this learning curve. You won’t get this capability any other way.</p>
<p>Here’s a <em>warning</em> for everybody. We’ve already learned what happens when we trust the LLM without verifying, even with good intentions. The tools make it <em>easy</em> to produce a lot of code fast. They do not make it <em>safe</em> to skip evaluating what they did. Nobody is exempt. Don’t make our security team have to file disclosure reports, or clean up our messes.</p>
<p>Two facts:</p>
<ul>
<li>We are responsible for what we ship.</li>
<li>We can’t skip evaluating the work.</li>
</ul>
<p>We all have large blast radii and we need to find a way to make our amplified selves safer than they are right now.</p>
<p>Claude is not <strong>magical.</strong> You must tell it:</p>
<ul>
<li><em>why</em> you are doing something</li>
<li><em>who</em> you’re doing it for</li>
<li>the <em>values</em> you bring to choosing trade-offs</li>
<li>the <em>business value</em> of the work</li>
<li>how <em>quality</em> is defined, or what <em>done</em> looks like</li>
</ul>
<p>This is why humans will <em>always</em> be present in the loop. Let me say it again: you will always be in the loop to tell Claude why.</p>
<h3 id="advice-to-engineers">Advice to engineers</h3>
<p>All eyes are on you, and you probably feel the pressure. So get <em>creative!</em> Writing code is not the only thing to do! It’s not even the most important thing!</p>
<p>I’m going to tell you the world’s biggest secret about writing software. Are you ready? Here it is.</p>
<p>The hard part of writing software has <em>never ever</em> been about typing the code. The world thinks this is what we spend all our time doing; LeetCode interviews spend all their time testing for this; and it’s the thing I think is least important when I’m hiring people and when I’m trying to fix software development workflows.</p>
<p>The hard parts go like this:</p>
<p><em>talking &gt; thinking &gt; typing</em></p>
<p>in order of importance, volume, and difficulty.</p>
<p>We’ve automated away the easy part—the typing—and some of the thinking. But you can never automate away the talking and decision-making. Humans will <em>always</em> have to do this. Developing software is a multidisciplinary team sport. This is where creativity and skill come in. These tools contain no machine intelligence – not yet, anyway. It’s still just a bicycle, and it doesn’t go anywhere interesting unless you’re steering it and pushing the pedals.</p>
<p>Okay, so now we’re typing code at truly scary speeds. <em>Why aren’t we shipping more software?</em></p>
<p>Ask where the bottlenecks are now that the typing part is faster. Do we just have engineers sitting idle and frustrated faster than ever? Look <em>upstream</em> and <em>downstream</em>.</p>
<ul>
<li>Upstream: are <em>product decisions</em> happening fast enough?</li>
<li>Downstream: can you <em>ship</em> what you build? Can you test it?</li>
<li>Identify obstacles. Work with the rest of your organization to remove them.</li>
</ul>
<p>This is probably where you think I’m about to tell you that writing great specs is where it’s at these days, and yeah, great specs help. Garbage in, garbage out will always be a valid maxim. But honestly, I think we’re past “write great specs” as being your number one engineering job as of Opus 4.6. We’re now at a place where agent orchestration and brainstorming skills mean that Claude can work <em>with you</em> to produce that full specification. You definitely need to provide all the information that comes from the humans-talking phase of spec-writing plus some of your thinking work. Your own hard-won skills can skip over quite a lot of the thinking and back and forth: I do quite a lot of telling Claude exactly what I already know a solution is shaped like.</p>
<p>You will work <em>with</em> Claude to write that spec, and then Claude will use its own spec to do the work.</p>
<p>Your skills as an engineer matter intensely, by the way. If you don’t know:</p>
<ul>
<li>the criteria for decomposing software into modules, you’ll have bad module boundaries.</li>
<li>how to threat-model, your software will have security flaws.</li>
<li>how important error handling is, it will be error-prone.</li>
<li>how to make software resilient, it will be fragile.</li>
<li>how to identify edge cases, they won’t be handled.</li>
</ul>
<p>This is why product managers aren’t vibe-coding production software, bless their hearts.</p>
<p>What about the less-experienced programmers among us? We have a responsibility to them, in my opinion. I mean, where do people think senior programmers come from? This is, as best I have seen through the years, a profession we learn through apprenticeship and mentoring. It takes around ten years to turn a programmer with some decent skills into an engineer you can trust to ship production systems. (I’ve seen some people learn a lot faster. I envy them.)</p>
<p>I think that we, like any specialized athlete, need to build on a base of core strength. We need to do foundational exercises before we do the specialized ones. We have to know what programming is ourselves, what good architecture is, and all the things I mentioned above, and we have to learn those by doing them. If we don’t know what those are, we cannot evaluate what our LLM has done with them, or request that our LLM do them, or catch it when it takes shortcuts on them.</p>
<p>So if you are just starting out or want to gain more experience for whatever reason, <em>pair program</em> with more experienced people. Pair-programming has the highest information transfer rate of any activity I’ve ever done. It’s monstrously good at leveling people up.</p>
<p><em>Ask Claude to teach you things.</em> You don’t have to ride a fully electrified bicycle. You can ask Claude to structure a task so that it coaches you through it, or sets you progressively more challenging tasks in a programming language, or with a concept. Have Claude walk you through one of the past years of Advent of Code, with hints when you get stuck. It’s pretty good at this! Don’t be just a passive user of this thing. Engage with it and see what you can make it give you. You might find yourself delighted.</p>
<p>And turning this around, if you have a staff-and-up title, leveling up the people around you is part of your job. Set up office hours. Say yes when somebody on your team asks to pair-program with you on a problem. This is one of the most fun parts of my job, far more fun than all the management paperwork.</p>
<p>Pair program to learn how to use these tools. Watch another Claude user use it their way. Have them talk you through what they do, which plugins they use, how they think about it. Then swap, and show them what you do! Lift each other up. One of us built a great screencast tool to share sessions: watch some! Learn from the smart people you work with! (Pro tip: I think you’re smart and I want to learn from you. This is not BS. I have learned from every single colleague I’ve ever spent time with.)</p>
<p>Above all else be an <strong>engineer</strong>!</p>
<p>Humans are <em>tool builders.</em> That’s what programming is. Trick out your bicycle.</p>
<blockquote>
<p>Are you a witch or aren’t you, Hermione? Cast the spell! —Ronald Weasley</p>
</blockquote>
<h3 id="advice-to-qa-people">Advice to QA people</h3>
<p><em>You’re engineers!</em> Everything I just said applies to you. Claude amplifies your abilities too.</p>
<p>Specifically, I think you should leverage Claude to <em>sharpen</em> your work.</p>
<ul>
<li>Code analysis, risk analysis: use Claude to target your scarce attention on areas that need testing more than others.</li>
<li>Implement testing you’ve been putting off.</li>
<li>Use Claude to turn specifications into test suites.</li>
<li>Use Claude to write missing specs from the code. The specs might be wrong in some places, but at least they exist.</li>
</ul>
<p>A call to action for QA: You have a unique viewpoint. We need that viewpoint to <em>balance out</em> product design and engineering. If they’re writing more specs and shipping more product, you had better be out there being pessimistic at equal volume. Your ability to break things needs to be amplified too, or we’ll ship bad work.</p>
<h3 id="advice-to-managers">Advice to managers</h3>
<p>Now I take off my principal engineer hat and put on my manager hat. It’s real talk time, colleagues. I think we need to be more scared than our people are, because I think the manager’s role is the one that is going to transform the most.</p>
<p>Think about the systemic effects of development teams being single engineers with groups of agents. Think about what changes when <em>one person</em> is capable of doing all of the implementation work for a complex project in a shorter time than you thought possible. This is what keeps me up at night the most. <em>What’s going to change when everybody is going this fast?</em></p>
<p>You will no longer be involved in that project in the small. You won’t be thinking about process, control, decision-making, sprints, sprint tasks.</p>
<p>You will be thinking about enabling that engineer to act in the large: career-building, delegation, trust, teaching the kinds of coordination skills that engineer will need to move fast. You’ll want your people operating as autonomously as you can enable them to act, and you’ll be coordinating things yourself at the <em>project</em> level. This is what I’m doing with my team today, and I think many other teams are not far away from this.</p>
<p>I return to this metaphor often through the years: the best role for a manager is as support. You’re a curling sweeper, clearing the ice in front of the stone for your people.</p>
<p>If you’re a tech-lead-flavored manager, you are the ideal engineer today. Every engineer today is a <em>tech lead</em> for a team that includes them + agents, and you already have the <em>project-management</em> and <em>coordination</em> skill sets that many engineers will need to learn. You might be too valuable as an engineer to stay as a manager. (I’m not sure if that’s good news or bad news for you. I guess it depends on how you feel about management right now.)</p>
<p>Learn and observe; don’t prescribe. Things are changing too rapidly for you to standardize on any processes prematurely. Avoid building process debt that will slow people down the moment anything changes, and it will change, I guarantee you, and more rapidly than you expect. Instead, learn to use the tools and identify the problems that need solving. You see things engineers don’t, because you’re above the fray and watching your whole team work. Use that.</p>
<p>Here are some creative LLM uses for managers to try:</p>
<ul>
<li>Automate your busywork? <em>Eliminate</em> your busywork!</li>
<li>Write job descriptions by having Claude <em>interview you</em> first about what you need. Then write the job description based on what you learned.</li>
<li>Do the same for your interview script. (You interview using a script to reduce bias, right?)</li>
<li>Have Claude collect your employee brag files for upcoming review cycles. Goodbye recency bias.</li>
<li>Synthesize information across teams and spot coordination gaps.</li>
<li>Have Claude fix your calendar. (I had Claude do this the other day. It lectured me about maker mode vs manager mode. It chided me. It was awful, and I had to <em>screenshot</em> my calendar to do it, but my days are a lot better now.)</li>
</ul>
<p>The theme of all of the above: Claude is a free personal assistant, the one your company will never hire for you, but whom you still need desperately.</p>
<p>A call to action for managers:</p>
<ul>
<li>Give your team the breathing room to learn.</li>
<li>Protect their time for experimenting with Claude.</li>
<li>Promote sharing tips and tricks.</li>
<li>Be encouraging.</li>
<li>This investment will pay off.</li>
</ul>
<h3 id="advice-to-product-managers">Advice to product managers</h3>
<p>Stop trying to be engineers: writing code is not the hard part. I have a 35-year head start on you, and you aren’t going to catch up. Instead, lean into what makes you <em>distinct</em> from engineering, and use Claude to extend your ability to do <em>what makes you special</em>.</p>
<p><em>Let go of the processes of the past.</em> The sprint is a thing of the past, and you can no more run engineering than managers can. Product development will happen faster than that. You will not be handing engineers sprint tasks. You will be giving them <em>product goals</em>.</p>
<p><em>Write amazing specs/PRDs/whatever you want to call them.</em> Broaden your research. Improve your product specifications. Use Claude to analyze weaknesses and improve them. You no longer have an excuse for hasty work or incomplete specs. Include those wireframes, those interaction sketches. Write that persona description. And yes, do that prototype that runs on your laptop to show off how it works with mock data.</p>
<p><em>Focus on information transfer to engineering.</em> I cannot say this one often enough. What you want to work on is aligning with the engineers working on your products so tightly that they will finish your sentences for you when you’re talking about the product you’re working on together.</p>
<p>The reality is that products are too complex for any one person to make all the decisions on. There are hundreds of edge cases you won’t think of, error conditions you don’t want to slog through, and variations on UX interactions that need to be made to work. You rely on other people making those decisions the same way you would. In the future, you will rely on a handful of people instructing LLMs to make those decisions the way you would. They need to know what’s in your head.</p>
<p>The people executing the mission must fully understand the mission. Your product specs are the specs we need to be sweating on right now – or feel free to substitute in whatever you call the document you use to transfer information to engineering. The quality of that information transfer is maybe even more critical than it’s ever been, and it has <em>always</em> been critical.</p>
<p>Align with engineering as fully as possible about <em>what</em> you’re building, <em>why</em> you’re building it, and <em>who</em> it’s for. Claude is going to make them <em>too fast</em> to align along the way. Do it up front.</p>
<h3 id="advice-for-upper-management">Advice for <strong>upper management</strong></h3>
<ul>
<li>You should be using Claude too. Yes, <em>you, person with the C title.</em></li>
<li>Learn the tools yourself, and understand why this is a revolution, not business as usual.</li>
<li>Explode the obstacles in front of your organization and let it move.</li>
<li>Promote experimentation.</li>
<li>You need the amplification it brings your company. Your competitors are already doing it.</li>
</ul>
<p>Here’s a call to action for the whole organization:</p>
<p>Connect all corporate information. All of it. Make all of it accessible to Claude. Slack, GitHub, Notion, Jira, the on-call incident system, Grafana, calendars, email, anything. Make everything accessible to Claude, so you can do everything described above. (I’d like not to have to do a few clever things to get my calendar into Claude.) Allow every workflow to be transformed. Allow all information to be fed into the transformations. Rethink everything.</p>
<p>But what about security? Yes, there are genuine security constraints, especially in healthcare. But “we need to be careful” is different from “we can’t move at all.” If security requirements are making this <em>impossible</em>, that’s a problem to <em>solve</em>, not a reason to stop.</p>
<p>If we’re not doing this, our competitors are. Assume that we’re already falling behind.</p>
<h2 id="at-long-last-a-conclusion">At long last, a conclusion</h2>
<p>The industrial revolution took decades. We won’t have that luxury. This is going to take a handful of years at best. (Disclaimer: I don’t know. My futurism qualifications are nil.) But I know that I want to be here with the tools in my hands, building whatever comes next. I’m not done yet; I want another 35 years of doing this! That is where I’ve always wanted to be, and why I’ve worked where I have through my career: at startups, on the bleeding edge.</p>
<p>This is <em>opportunity</em>. This is <em>exciting</em>.</p>
<p>None of us know where this is going to end up. I am making a few informed guesses here. It will change more and faster before it settles.</p>
<p>But know that the human is still in the loop. It’s you on that bicycle of the mind. It’s you being amplified by the tool. Make yourself worth amplifying.</p>
<p>You can choose whether you’re a power chord or a buzzing string.</p>
<hr />
<p>Many thanks to Chris Dickinson for coal and condors.</p>
]]></content:encoded>
      </item>
      <item>
          <title>Why was this fun (while work was not)?</title>
          <link>https://blog.ceejbot.com/posts/why-was-this-fun/</link>
          <pubDate>Fri, 16 Jan 2026 02:33:07 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/why-was-this-fun/</guid>
          <description>Lessons learned from a side project that was fun, and how I might use them to make work more fun for the teams I lead.</description>
          <content:encoded><![CDATA[<p>Let’s start by clarifying what I mean by the question.</p>
<hr />
<p>I’m resurrecting a years-old nearly-finished blog post draft here. What’s funny to me is the time period when I wrote it. I was working for an utter car-crash of a company, one that I am absolutely certain in retrospect existed only to raise money serially and keep its founder’s friends employed. Any time the product looked like it might ship, they pivoted. (I do not care about this. It’s on the VC who falls for it, IMO.)</p>
<p>O Silly Valley, how silly you are.</p>
<p>I was fairly miserable in the brief time I worked there. This side project was a light in the darkness at the time, and apparently I wrote about it.</p>
<hr />
<p>Last month, over the course of two weekends, I designed, implemented, and shipped a little command-line tool. What this tool does is not particularly important, except to note that it solved a problem I had repeatedly encountered while working on some internal tools for my job. I understood the problem domain well, and it was indeed an irritating problem, and solving it will make my next set of shell scripts easier to deal with. Also I was somewhat annoyed that I’d just spent a month writing tools in bash<sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">1</a></sup> and not in Rust, a language I had wanted to be using. I scratched two itches at once by writing this tool in Rust. I went some extra miles with this one: I wrote tests. I wrote docs. I wrote examples. I automated its release process. I published it on crates.io, something I rarely bother to do with my Rust projects. And then I learned how to create a <a rel="noopener external" target="_blank" href="https://docs.brew.sh/Taps">homebrew tap</a> so I could make pre-built executables available to people who don’t have Rust installed.</p>
<p>I do not expect anyone but myself to use this tool, so all of this work was “wasted work”, in some sense. If I’d done all this in my workplace and then had it all thrown away, I’d have felt discouraged. But here, in a side project, it was pure fun. It was fun to read the documentation for libraries I ended up not using. It was fun to install tools that were vaguely related and try them out.</p>
<p>It was fun to write some code, realize it was all wrong, and rewrite it again more tightly. And then to go back to it again later and tighten it some more. To write comments noting that something I’d written wasn’t great and I’d get back to it. Maybe.</p>
<p>It was fun to polish it all up and finish it. It was fun to hit a level of completeness that I am rarely able to reach in my employment, even when I’m the person running a team and encouraging polish and completeness.</p>
<p>Why was this entire process so much more fun than anything I experience during normal day jobs? <sup class="footnote-reference" id="fr-2-1"><a href="#fn-2">2</a></sup></p>
<p>I let myself blurt out some answers to that one here:</p>
<ul>
<li>It scratched my itch, not somebody else’s.</li>
<li>I knew what I was building and why.</li>
<li>I knew who I was building it for.</li>
<li>I didn’t have to deal with <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Law_of_triviality">bike-shedding</a> and what-about-ery from people second-guessing design decisions.</li>
<li>There was nobody else’s taste to consider as I designed.</li>
<li>I was free to make decisions; no need to socialize them and deal with disagreement.</li>
<li>I could explore the problem space freely. Dead ends were fine.</li>
<li>There was no time pressure.</li>
<li>My work sessions weren’t interrupted by context-switching into different work. (They were interrupted by non-work concerns, like feeding hungry cats.)</li>
<li>The requirements didn’t change unexpectedly.</li>
<li>When the project goals did shift, the entire team of one understood why and were aligned with the need to change.</li>
<li>There was no scope creep. When an acquaintance suggested an interesting possible feature, I was free to reject the feature.</li>
<li>I could pursue my own standards for quality.</li>
<li>How often do I have time in a paid job to write second and third drafts of code? But that’s when code (and prose writing) starts to get good.</li>
</ul>
<p>There’s lots of overlap in those, but I’ll let them stand without editing because the repetition helps the themes emerge. None of those themes are surprising to me, by the way, and I suspect they’re mostly not surprising to you.</p>
<p>Why, I ask myself, is work almost never this fun? Should work be this fun? Is it possible, even?</p>
<h2 id="motivations">Motivations</h2>
<p>A while ago I watched a video with some very cool illustrations for a TED-talk-ish lecture by Dan Pink: <a rel="noopener external" target="_blank" href="https://www.youtube.com/watch?v=u6XAPnuFjJc">The surprising truth about what motivates us</a>. It’s only about 10 minutes long. Don’t let the TED-talk-ness stop you; go watch it.</p>
<p>You back? Cool. Let’s talk about that lecture. Now, there are things in it I would argue with–twelve years on I’m a lot more cynical about open source than Pink was then–but that’s not the heart of his point. The heart of his point is what motivates humans to do complex tasks:</p>
<ul>
<li>autonomy</li>
<li>mastery</li>
<li>purpose</li>
</ul>
<p>I note, in retrospect, that all three of those needs show up in my blurts about reasons why this project was more fun than a lot of work I’ve done for pay. I’m going to use them to give structure to the rest of this discussion. How did my personal project provide them? How does work not? <em>Can</em> work provide them?</p>
<h2 id="autonomy">Autonomy</h2>
<p>Autonomy is somewhat at odds with the need for coordination and cooperation that dominates engineering in workplaces.</p>
<p>When I mention “what-about-ery”, what I mean is objections from somebody whose values are different from mine. Or who is thinking about edge cases I’m not. Maybe they’re right and maybe they’re not; the values misalignment means our calls about what to do are different. Maybe we both put effort into resolving it and both come out feeling okay. That’s more energy invested than I would invest in considering and then rejecting an idea in a solo project. “Maybe I should do this,” I think, then “nah, not right now.” Energy investment over.</p>
<p>Dealing with disagreements is a normal part of group work. Disagreements are conflict, even if often only mild conflict, and they must be resolved. Resolution takes energy. Sometimes our colleagues are not great at negotiating disagreement; sometimes we’re the ones who aren’t great at it. Either way, in work contexts we must spend this time and energy, because if we do not resolve these conflicts, we <em>as a team</em> don’t commit fully to the decisions the team makes. (I gesture in the direction of <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/The_Five_Dysfunctions_of_a_Team">The Five Dysfunctions of a Team</a>.)</p>
<p>Nobody else’s taste: This one is interesting. I have some fairly strong preferences about naming things and how I want to write text that users read, and how I want to present data. I like a particular clean prose style (aside from the parentheticals). I often have to suppress that taste when working in groups. It is not helpful to fuss about variable names in pull request reviews, unless those names are very misleading. I’ll fuss about them if I’m pairing with you, then decline to die on that hill if you push back. This is good manners. But when I’m by myself, I can rename that variable just to see how the code reads, editing to my satisfaction. Or I can leave in that ridiculous joke that will make me laugh when I find the code again in a couple of years.</p>
<p>Autonomy is not fully at odds with the idea of working on teams. Aligned teams that are clear about their goals and the values they’re holding as they work toward their goals are not going to feel much tension here. The team might feel autonomy as a group and work together on making decisions. Or they might delegate well so that people can work on separate projects with the freedom to make decisions. But I think working in teams does require losing some autonomy. This is a tradeoff; it buys you the satisfaction of working on a larger project than you can do on your own.</p>
<h3 id="alignment">Alignment</h3>
<p>I noticed a couple of things there about changing requirements. I <em>did</em> change requirements for the tool midway through! A friend made a suggestion–how about json output? I thought about this and realized it would fit nicely into my goal of using toml-sourced data in shell scripts. Select complex data from a toml file, emit it as json, and pipe that into <code>jq</code> for further processing. I accepted the suggestion and added the feature. Doing that work made me think harder about what “emitting values in a form bash can consume immediately” meant, and the whole thing got a little better.</p>
<p>Aligning a team of one is a lot easier than aligning a team of five. Or a company of fifty.</p>
<p>Alignment is, however, critical for any group of people working on a software project together. (Probably any project, but software is the thing I’ve spent my life doing.) Leaders have to put in a lot of work here. Dicta from above don’t produce good results.</p>
<h2 id="mastery">Mastery</h2>
<p>My blurts above call out specifically that I learned to do some new things with this project.</p>
<p>I got to practice writing Rust instead of writing bash. Now, all the bash I’ve been writing for my day job has in fact made my bash a lot better. I have quoting bugs less often. But… that wasn’t what I wanted to get better at. It was the right choice for the problem in the moment.</p>
<h2 id="purpose">Purpose</h2>
<p>This is the scratching my own itch experience. I knew why I wanted it. I giggled my head off when I used the tool as part of <a rel="noopener external" target="_blank" href="https://github.com/ceejbot/tomato/blob/latest/justfile#L36">its own release process</a>. It was doing what it was supposed to do! I had succeeded! I had the tool I wanted!</p>
<p>That feeling of delight is something that’s kept me writing software despite all the nonsense in the industry itself, despite toxic work environments, despite bad project management, despite bad product design, despite sociopathic company founders, despite failures in the market. (That’s a litany of the worst of it; my career has mostly featured environments that were good, with the occasional awful standout.)</p>
<h2 id="of-course-it-was-fun">Of course it was fun!</h2>
<p>I land on: of course it was fun! Of course revising it over the years since I wrote it was fun! I scratched my own itch. I have used it in every open-source project I’ve written since that moment. I made changes in the API and was happy about them. It’s useful to me. I don’t care if it’s useful to anybody else.</p>
<p><em>This is why I write software.</em></p>
<p>The first program I ever wrote was a D&amp;D character generator, written in pen in a spiral bound notebook because I’d just read a BASIC manual but didn’t have a computer to run programs on. I was scratching my own itch about something I was having fun with at the time. When I finally got to type that program into an actual computer—a couple of years later, when my parents bought me an Apple ][+—watching it run was a delight. Years after that, when I was a somewhat older kid at a <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/General_Magic">now-kinda-famous Silicon Valley startup</a>, exercising code for my first feature there, I felt the same absurd joy. I sat there for half an hour, making the device do the thing I’d implemented over and over.</p>
<p><em>That was fun</em>.</p>
<p>So why was writing this tool fun?</p>
<p>Because it was my work. Done for me. To my level of quality. Solving my problem. At my pace. And I got to giggle watching it work.</p>
<h2 id="lesson">Lesson</h2>
<p>Are you not having fun? Okay. I get it. Work is a total drag. Maybe you should find another job? But if you can’t find your <em>jouissance</em> in your work-work, find it in your personal projects. Remind yourself why you write software. Go do something silly and useful only to you, and have fun. Find it on the sly at work: do something that makes <em>you</em> happy in that work project. Revise the code to your satisfaction. Pause and look at the feature you just implemented and delight in how cool it is that you made something that didn’t exist before now “happen” in some virtual way.</p>
<p>Giggle at it.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-1">
<p>bash is very good at orchestrating other command line tools and connecting output together in the famed unix pipelines. That’s what shell scripting languages are for! It’s a nightmare in many other ways. When’s the last time you messed up bash quoting? Exactly. It was, however, the right choice for the project in front of me; doing the work in any other tool would have been more work with a less-maintainable result. <a href="#fr-1-1">↩</a></p>
</li>
<li id="fn-2">
<p>I’m not picking on my current day job here. I’m thinking of all of them across the years. ETA much later: lol. That day job was hilariously bad. <a href="#fr-2-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Private homebrew taps</title>
          <link>https://blog.ceejbot.com/posts/private-brew-tap/</link>
          <pubDate>Sun, 15 Jun 2025 12:01:30 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/private-brew-tap/</guid>
          <description>How to build a private homebrew tap to distribute internal tools.</description>
          <content:encoded><![CDATA[<p>I recently built a few things to make it easier for my team to distribute the tools they’ve been building to the entire engineering organization. I had the advantage of knowing that everybody was on Mac laptops running very close to the latest release. I had some idea of their general habits and blind spots after 18 months of observation. There were a handful of problems I needed to solve here, and a private tap looked like it would solve all of them.</p>
<p>The problems were:</p>
<p><strong>Discoverability.</strong> We can document things all we want using Notion or whatever the corporate information system of the moment is, but unless somebody thinks to go hunting, they won’t find the documentation. <code>brew search our-github-org</code> is easy to remember, however, because everybody is already using homebrew.</p>
<p><strong>Automated installation.</strong> It’s straightforward to install the latest GitHub release of something in a shell script, if you can get the person to run the shell script. Running <code>brew install foo</code> is even easier. It’s also easy to say this out loud to somebody if you can’t type it.</p>
<p><strong>Updates.</strong> We can mostly coax people into onboarding into new repos by making a <a rel="noopener external" target="_blank" href="https://just.systems"><code>just setup</code></a> recipe conventional in as many of them as we touch, but only mostly. People definitely don’t remember to run setup scripts again to update. <code>brew upgrade</code>, on the other hand, is something they will run occasionally.</p>
<p><strong>Signed Mac executables.</strong> We’re building some tools that are notarized Mac executables with installers. Brew knows how to install all the things.</p>
<p>The obstacles to using a private tap are mostly that distribution of not-open-source software is of zero interest to the Homebrew project; they’re a package manager for open source for macOS (and Linux too). You have to build this yourself. Fortunately, it’s not hard.</p>
<p>By Fediverse friend request, I share my approach. Here’s the outline:</p>
<ul>
<li>Use a <a rel="noopener external" target="_blank" href="https://github.com/orgs/Homebrew/discussions/574">download strategy</a> that can fetch assets from releases in private repos.</li>
<li>Publish internal tools to your tap with formulas that mark them as using this new strategy.</li>
<li>Show people the slight bit of magic needed to tap a private repo.</li>
</ul>
<p>You can do an Internet search for articles about how to write a download strategy that uses <code>curl</code>. I chose to write a download strategy that uses <code>gh</code>, <a rel="noopener external" target="_blank" href="https://cli.github.com">the GitHub CLI</a>. I’ve been using <code>gh</code> in actions for a while because it’s so easy to use for certain tasks, like, well, downloading artifacts from private repos.</p>
<p>If you’ve maintained your own tap before, skip down to <a href="https://blog.ceejbot.com/posts/private-brew-tap/#downloading-release-assets-by-strategy">the strategy section</a> and snag that. Read on for the full details if making brew taps is new to you.</p>
<h2 id="get-gh-installed">Get <code>gh</code> installed</h2>
<p>Get everybody set up with <code>gh</code>, because they’ll need it installed to use this strategy.  Using <code>gh</code> as a credentials helper for GitHub makes the next step more convenient, but is not required. <a rel="noopener external" target="_blank" href="https://github.com/cli/cli#installation">GitHub has good installation instructions.</a> You can script it with some human action needed like this:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="plain"><span class="giallo-l"><span>brew install gh</span></span>
<span class="giallo-l"><span>gh auth login</span></span>
<span class="giallo-l"><span>gh auth setup-git # optionally</span></span></code></pre><h2 id="the-tap-repo">The tap repo</h2>
<p>Now get your tap repo set up.</p>
<ul>
<li>
<p>Create a private repo in your organization named <code>your-org/homebrew-tap</code>. You can leave out the “homebrew” part of the name in most of the usage instructions, because it is implied. Naming the repo unambiguously is merely a convention.</p>
</li>
<li>
<p>Make sure everybody who needs to install tools from it has at least read access. Probably you have a GitHub team that’s “all developers” or the equivalent.</p>
</li>
<li>
<p>Make sure everybody who needs to publish tools by hand to it has at least write access.</p>
</li>
<li>
<p>Create a fine-grained GitHub token that can clone the repo and write new commits to it. Make this available as an <em>organization</em> secret, so any repo can use it in a release workflow.</p>
</li>
<li>
<p>Finally, get everybody to tap it, using one of these invocations:</p>
</li>
</ul>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="shellscript"><span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">brew</span><span style="color: light-dark(#79740E, #B8BB26);"> tap</span><span style="color: light-dark(#79740E, #B8BB26);"> your-org/tap</span><span style="color: light-dark(#79740E, #B8BB26);"> git@github.com:your-org/homebrew-tap.git</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">#</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> to use ssh to access it</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">brew</span><span style="color: light-dark(#79740E, #B8BB26);"> tap</span><span style="color: light-dark(#79740E, #B8BB26);"> your-org/tap</span><span style="color: light-dark(#79740E, #B8BB26);"> https://github.com/your-org/homebrew-tap</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">#</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> or if you&#39;re authing with the gh helper:</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">brew</span><span style="color: light-dark(#79740E, #B8BB26);"> tap</span><span style="color: light-dark(#79740E, #B8BB26);"> your-org/tap</span></span></code></pre>
<p>Now we have everybody tapping an empty cask. Let’s fill it.</p>
<h2 id="formula-files">Formula files</h2>
<p>These are predictable. Generating them is a template rendering problem that can be solved in a number of ways. <a rel="noopener external" target="_blank" href="https://github.com/ceejbot/formulaic/blob/latest/src/formula.rb">Here’s a template I use for typical formulas.</a> You can write some yourself by looking at examples.</p>
<p>Generating them by hand is, however, de trop. This is what computers are for, and especially what workflows that generate GitHub releases are for.</p>
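<p>As a minimal sketch of that template-rendering step, here is one way to do it with Ruby’s standard <code>erb</code> library. Everything specific in it is made up for illustration: the <code>render_formula</code> helper, the tool name, the URL, and the checksum are placeholders, and this is not the template linked above.</p>

```ruby
require "erb"

# Hypothetical formula template. The fields are ordinary Homebrew formula
# DSL calls (desc, homepage, url, sha256, version, bin.install).
FORMULA_TEMPLATE = <<~TEMPLATE
  class <%= class_name %> < Formula
    desc "<%= desc %>"
    homepage "<%= homepage %>"
    url "<%= url %>"
    sha256 "<%= sha256 %>"
    version "<%= version %>"

    def install
      bin.install "<%= binary %>"
    end
  end
TEMPLATE

# Render the template, exposing each keyword argument as a template variable.
def render_formula(**fields)
  ERB.new(FORMULA_TEMPLATE).result_with_hash(fields)
end

# Placeholder values; a release workflow would fill these in from the
# release it just created.
formula = render_formula(
  class_name: "Foo",
  desc: "Example internal tool",
  homepage: "https://github.com/your-org/foo",
  url: "https://github.com/your-org/foo/releases/download/v1.2.3/foo.tar.gz",
  sha256: "0" * 64,
  version: "1.2.3",
  binary: "foo"
)

puts formula
```

<p>The rendered text would be written to <code>Formula/foo.rb</code> in the tap repo and committed by the release workflow.</p>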
<p>You can find actions to do this by <a rel="noopener external" target="_blank" href="https://github.com/search?q=topic%3Aactions+homebrew&amp;type=repositories">searching on GitHub</a>. Here’s where you’ll want the API token you generated above with the ability to make commits to the tap repo. You’ll do a set of builds in a workflow, create the release, upload assets, use whatever you’ve chosen to generate a formula file, then finally commit the formula file to your tap repo.</p>
<p>Here’s an example <a rel="noopener external" target="_blank" href="https://github.com/ceejbot/tomato/blob/latest/.github/workflows/release.yml#L169">tap update step</a> from one of my repos. Note the two access tokens passed in as env vars. One is that token that can commit to your tap repo; this needs to be provided to <code>gh</code> via the env var <code>GH_TOKEN</code>. The other is the token GitHub generates for the workflow run, which has access to the tool repo. You’ll want to read release information from the GitHub API with <code>gh</code>, which I found a million times easier to do than clunking my way through workflow step inputs and outputs. (Your mileage may vary.)</p>
<p>In fact, you can probably use the <code>gh</code> built-in Go template formatting feature to make it emit a formula file, if you can cope with really long inline template strings.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="plain"><span class="giallo-l"><span>gh release view -R org/repo --json tagName,assets --template &quot;some long template string here&quot;</span></span></code></pre>
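<p>For a sense of how small the template-rendering problem is, here’s a minimal sketch in plain Ruby rather than a Go template. Everything in it is an assumption made for illustration: the <code>render_formula</code> and <code>formula_class_name</code> helpers, the field names in the release hash, and the bare-bones formula layout. The class-name conversion is a simplified approximation of what Homebrew does.</p>

```ruby
# Hypothetical sketch: render a minimal Homebrew formula from release data.
# The template layout and the hash keys are assumptions, not gh output.
FORMULA_TEMPLATE = <<~TEMPLATE
  class %{class_name} < Formula
    desc "%{desc}"
    homepage "%{homepage}"
    version "%{version}"

    url "%{url}"
    sha256 "%{sha256}"

    def install
      bin.install "%{bin}"
    end
  end
TEMPLATE

# Simplified approximation of Homebrew's naming rule:
# split on dashes and underscores, then upcase the first letter of each part.
def formula_class_name(name)
  name.split(/[-_]/).map { |part| part.sub(/\A./, &:upcase) }.join
end

# Fill the template from a hash with the keys
# :name, :desc, :homepage, :version, :url, :sha256, and :bin.
def render_formula(release)
  format(FORMULA_TEMPLATE, release.merge(class_name: formula_class_name(release[:name])))
end
```

<p>Calling <code>render_formula</code> with those fields filled in returns the formula text as a string, ready to be written to a file under <code>Formula/</code> in the tap repo.</p>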
<p>The twist here is that we want to change up the standard template by adding a custom download strategy to our tap.</p>
<h2 id="downloading-release-assets-by-strategy">Downloading release assets (by strategy)</h2>
<p>Here’s the download strategy itself. You can either embed this into each formula (the lazy way, which I chose), or put it into a file in your tap repo and then require that file in each formula.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="ruby"><span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">require</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">download_strategy</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">require</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">utils/formatter</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">require</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">utils/github</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">require</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">system_command</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">class</span><span style="color: light-dark(#B57614, #FABD2F);"> GitHubCliDownloadStrategy</span><span style="color: light-dark(#7C6F64, #A89984);"> &lt;</span><span> CurlDownloadStrategy</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">  #</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> Raised below when the download fails; defined here so the raise does not hit a NameError</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">  class</span><span style="color: light-dark(#B57614, #FABD2F);"> GitHubCliDownloadStrategyError</span><span style="color: light-dark(#7C6F64, #A89984);"> &lt;</span><span> RuntimeError</span><span style="color: light-dark(#7C6F64, #A89984);">;</span><span style="color: light-dark(#9D0006, #FB4934);"> end</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">  def</span><span style="color: light-dark(#B57614, #FABD2F);"> initialize</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">url</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#076678, #83A598);"> name</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#076678, #83A598);"> version</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#AF3A03, #FE8019);"> **</span><span style="color: light-dark(#076678, #83A598);">meta</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">    super</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">    #</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> Extract owner and repo from the URL</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">    match_data</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> %r{</span><span style="color: light-dark(#AF3A03, #FE8019);">^https?://github</span><span style="color: light-dark(#8F3F71, #D3869B);">\.</span><span style="color: light-dark(#AF3A03, #FE8019);">com/</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#AF3A03, #FE8019);">?&lt;owner&gt;</span><span style="color: light-dark(#7C6F64, #A89984);">[</span><span style="color: light-dark(#AF3A03, #FE8019);">^/</span><span style="color: light-dark(#7C6F64, #A89984);">]</span><span style="color: light-dark(#AF3A03, #FE8019);">+</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#AF3A03, #FE8019);">/</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#AF3A03, #FE8019);">?&lt;repo&gt;</span><span style="color: light-dark(#7C6F64, #A89984);">[</span><span style="color: light-dark(#AF3A03, #FE8019);">^/</span><span style="color: light-dark(#7C6F64, #A89984);">]</span><span style="color: light-dark(#AF3A03, #FE8019);">+</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#AF3A03, #FE8019);">/releases/download</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">match</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">@</span><span style="color: light-dark(#076678, #83A598);">url</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">      return</span><span style="color: light-dark(#9D0006, #FB4934);"> unless</span><span> match_data</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">      @</span><span style="color: light-dark(#076678, #83A598);">owner</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span> match_data</span><span style="color: light-dark(#7C6F64, #A89984);">[</span><span style="color: light-dark(#7C6F64, #A89984);">:</span><span style="color: light-dark(#8F3F71, #D3869B);">owner</span><span style="color: light-dark(#7C6F64, #A89984);">]</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">      @</span><span style="color: light-dark(#076678, #83A598);">repo</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span> match_data</span><span style="color: light-dark(#7C6F64, #A89984);">[</span><span style="color: light-dark(#7C6F64, #A89984);">:</span><span style="color: light-dark(#8F3F71, #D3869B);">repo</span><span style="color: light-dark(#7C6F64, #A89984);">]</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">      @</span><span style="color: light-dark(#076678, #83A598);">filename</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span> File</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">basename</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">@</span><span style="color: light-dark(#076678, #83A598);">url</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">  end</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">  def</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#8F3F71, #D3869B);">timeout</span><span style="color: light-dark(#7C6F64, #A89984);">:</span><span style="color: light-dark(#8F3F71, #D3869B);"> nil</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span>    ohai </span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#79740E, #B8BB26);">Downloading </span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#79740E, #B8BB26);">url</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#79740E, #B8BB26);"> using GitHub CLI</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">    if</span><span> cached_location</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">exist?</span></span>
<span class="giallo-l"><span>        puts</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">Already downloaded: </span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#79740E, #B8BB26);">cached_location</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">    else</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">      begin</span></span>
<span class="giallo-l"><span>          temporary_path</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">dirname</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">mkpath</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">          #</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> note path hack</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">          system_command</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#79740E, #B8BB26);">/opt/homebrew/bin/gh</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#8F3F71, #D3869B);"> args</span><span style="color: light-dark(#7C6F64, #A89984);">:</span><span style="color: light-dark(#7C6F64, #A89984);"> [</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">                &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">release</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">download</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">                &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">-R</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#7C6F64, #A89984);">@</span><span style="color: light-dark(#076678, #83A598);">owner</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#79740E, #B8BB26);">/</span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#7C6F64, #A89984);">@</span><span style="color: light-dark(#076678, #83A598);">repo</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">                &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">--pattern</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#7C6F64, #A89984);">@</span><span style="color: light-dark(#076678, #83A598);">filename</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">                &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">-D</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#79740E, #B8BB26);">temporary_path</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">             ]</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#8F3F71, #D3869B);"> print_stderr</span><span style="color: light-dark(#7C6F64, #A89984);">:</span><span style="color: light-dark(#8F3F71, #D3869B);"> true</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">      rescue</span><span style="color: light-dark(#076678, #83A598);"> ErrorDuringExecution</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">          raise</span><span style="color: light-dark(#076678, #83A598);"> GitHubCliDownloadStrategyError</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">GitHub CLI download failed for: </span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#79740E, #B8BB26);">url</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">      end</span></span>
<span class="giallo-l"><span>      cached_location</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">dirname</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">mkpath</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">      #</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> Find the downloaded file in the temporary path</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">      downloaded_file</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span> Dir</span><span style="color: light-dark(#7C6F64, #A89984);">[</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#79740E, #B8BB26);">temporary_path</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#79740E, #B8BB26);">/*</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">]</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">first</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">      if</span><span> downloaded_file</span></span>
<span class="giallo-l"><span>          FileUtils</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">mv</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span>downloaded_file</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span> cached_location</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">      else</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">          raise</span><span style="color: light-dark(#076678, #83A598);"> GitHubCliDownloadStrategyError</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">Downloaded file not found in </span><span style="color: light-dark(#7C6F64, #A89984);">#{</span><span style="color: light-dark(#79740E, #B8BB26);">temporary_path</span><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">      end</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">    end</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>    symlink_location</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">dirname</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">mkpath</span></span>
<span class="giallo-l"><span>    FileUtils</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">ln_s</span><span> cached_location</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">relative_path_from</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span>symlink_location</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">dirname</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span> symlink_location</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#8F3F71, #D3869B);"> force</span><span style="color: light-dark(#7C6F64, #A89984);">:</span><span style="color: light-dark(#8F3F71, #D3869B);"> true</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">  end</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">end</span></span></code></pre>
<p>As you can see from the horrible hack noted in the comment, I didn’t bother figuring out how paths are set up for homebrew scripts. Maybe you know more than I do.</p>
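<p>One possible way around the hard-coded path, sketched as plain Ruby so it runs outside Homebrew: check the usual Homebrew bin directories, then fall back to <code>PATH</code>. The <code>find_executable</code> helper and the list of directories are assumptions for illustration, and this has not been tested inside a download strategy.</p>

```ruby
# Hypothetical helper: locate an executable without hard-coding one prefix.
# These are the default Homebrew bin directories; adjust for custom installs.
COMMON_BREW_BIN_DIRS = [
  "/opt/homebrew/bin",              # Apple Silicon macOS
  "/usr/local/bin",                 # Intel macOS
  "/home/linuxbrew/.linuxbrew/bin", # Linux
].freeze

# Returns the full path of the first matching executable, or nil if none exists.
def find_executable(name, extra_dirs: COMMON_BREW_BIN_DIRS, path: ENV.fetch("PATH", ""))
  dirs = extra_dirs + path.split(File::PATH_SEPARATOR).reject(&:empty?)
  dirs.map { |dir| File.join(dir, name) }
      .find { |candidate| File.file?(candidate) && File.executable?(candidate) }
end
```

<p>The strategy could then call something like <code>find_executable("gh")</code> and raise a clear error when it returns <code>nil</code>. Inside a formula, Homebrew’s own <code>HOMEBREW_PREFIX</code> constant is probably the better anchor; the sketch avoids it only so it stays runnable on its own.</p>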
<p>You’ll want to edit the part of your formula file where you list available asset files by OS and architecture to mention the strategy like this:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="ruby"><span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">if</span><span> OS</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">mac?</span><span style="color: light-dark(#9D0006, #FB4934);"> &amp;&amp;</span><span> Hardware</span><span style="color: light-dark(#7C6F64, #A89984);">::</span><span>CPU</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">arm?</span></span>
<span class="giallo-l"><span>    url    </span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#79740E, #B8BB26);">https://github.com/ceejbot/codefact/releases/download/v1.0.2/codefact-aarch64-apple-darwin.tar.gz</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#8F3F71, #D3869B);"> using</span><span style="color: light-dark(#7C6F64, #A89984);">:</span><span style="color: light-dark(#076678, #83A598);"> GitHubCliDownloadStrategy</span></span>
<span class="giallo-l"><span>    sha256 </span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#79740E, #B8BB26);">0e0d03a2f787f6d875ff02ce91cf495cc95878ace96b9a9c8f3073a6a9688b44</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">end</span></span></code></pre>
<h2 id="test-it">Test it</h2>
<p>Get some people to test the whole process and make sure everything works. Verify that your docs are clear enough that even the people who break everything can manage to make it work. Test your release workflows. Test updates.</p>
<p>You’re done.</p>
<h2 id="autogenerating-formula-files-rust-bins-only">Autogenerating formula files (Rust bins only)</h2>
<p>As I advised you to do above, I looked for GitHub Actions to generate formula files for me from a template. There’s one that looks heavily used, but I immediately ran into trouble with Python’s <code>requests</code> module when I tried it. My patience for debugging Python packaging problems is about zero these days, so I banged out a Rust CLI tool to write exactly the formula file that the late <code>cargo-dist</code> used to write for me. This tool, <code>formulaic</code>, also has a flag to request the <code>gh</code> strategy explicitly instead of letting the tool figure it out automatically.</p>
<p>This tool reads the latest GitHub release and the Cargo manifest file it’s given the path to, which means it only works for Rust executables. It’s a quite predictable thing that fills out a template. <a rel="noopener external" target="_blank" href="https://github.com/ceejbot/formulaic">The source is on GitHub</a>. You can see its <a rel="noopener external" target="_blank" href="https://github.com/ceejbot/homebrew-tap/blob/latest/Formula/formulaic.rb">self-generated formula file</a> in my own Homebrew tap. Snag and edit to taste if it saves you some time, so long as you also share your changes. (See the license.)</p>
<h2 id="random-shasum-trivia">Random shasum trivia</h2>
<p>GitHub seems to have recently started adding SHA-256 sums to release asset data, so they don’t have to be calculated the way I’m doing it. Unless you’re very cautious, that is. You can use <code>gh</code> to list the name and shasum of every asset in a release:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="plain"><span class="giallo-l"><span>gh release view -R ceejbot/tomato --json assets | jq -r &#39;.assets[] | .name, .digest&#39;</span></span></code></pre>
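<p>If you’d rather do that step in a script than in <code>jq</code>, a rough Ruby equivalent might look like the sketch below. The <code>asset_digests</code> helper is hypothetical, and it assumes the JSON has the shape the command above implies: an <code>assets</code> array whose entries carry <code>name</code> and <code>digest</code> fields, with <code>digest</code> possibly missing or null.</p>

```ruby
require "json"

# Hypothetical helper: map each asset name to the digest GitHub reports for it.
# Assumes JSON shaped like the output of `gh release view --json assets`.
# Assets without a digest are skipped.
def asset_digests(json_text)
  JSON.parse(json_text).fetch("assets", []).each_with_object({}) do |asset, digests|
    digests[asset["name"]] = asset["digest"] if asset["digest"]
  end
end
```

<p>Feed it the text that <code>gh release view --json assets</code> prints and you get back a hash keyed by asset file name.</p>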
<p>Here’s something I learned along the way: for some reason there are three different commands that give you a sha256 sum of a file out of the box on macOS, and two output variations among them, because of course there are.</p>
<table><thead><tr><th>command</th><th>output style</th><th>origin</th></tr></thead><tbody>
<tr><td><code>sha256sum README.md</code></td><td>linux mode</td><td>compiled</td></tr>
<tr><td><code>sha256 README.md</code></td><td>bsd mode</td><td>compiled, same executable</td></tr>
<tr><td><code>shasum -a 256 README.md</code></td><td>linux mode</td><td>perl, CPAN</td></tr>
<tr><td><code>shasum -a 256 --tag README.md</code></td><td>bsd mode</td><td>perl, CPAN (same script)</td></tr>
</tbody></table>
<p>Note that Homebrew wants the bare hex string, without filename decoration of any kind. If you use the GitHub-generated shasum, you’ll need to trim the <code>sha256:</code> prefix.</p>
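<p>A small sketch of that trimming, assuming you’d rather have one helper that copes with every style mentioned here than remember which tool produced the string. The <code>bare_sha256</code> name is made up for illustration.</p>

```ruby
# Hypothetical helper: pull the bare 64-character hex digest out of any style:
#   linux mode:    "HEX  README.md"
#   bsd mode:      "SHA256 (README.md) = HEX"
#   GitHub digest: "sha256:HEX"
# Returns nil when the text contains no standalone 64-digit hex run.
def bare_sha256(text)
  text[/\b\h{64}\b/]&.downcase
end
```

<p>The word boundaries keep it from grabbing part of a longer hash, so a sha512 line yields <code>nil</code> instead of a truncated digest.</p>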
]]></content:encoded>
      </item>
      <item>
          <title>Modern terminal environment</title>
          <link>https://blog.ceejbot.com/posts/modern-shell/</link>
          <pubDate>Sun, 12 Jan 2025 13:17:11 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/modern-shell/</guid>
          <description>In which I reject modernity and embrace being effective.</description>
          <content:encoded><![CDATA[<p>I read Julia Evans’s blog post on <a rel="noopener external" target="_blank" href="https://jvns.ca/blog/2025/01/11/getting-a-modern-terminal-setup/">“What’s involved in getting a modern terminal environment”</a> and got very excited because there are lots of great comments in that blog post, and I have a few more of my own, and a little meta-commentary.</p>
<h2 id="the-terminal-and-the-shell">The terminal and the shell</h2>
<p>Are there good modern answers for terminal software? Sort of. There are certainly many options. I still use <a rel="noopener external" target="_blank" href="https://iterm2.com">iTerm2</a>. I try other fancy new terminal software but end up back at iTerm2 every time for the combination of Macintosh features and overall performance. The Electron ones are sluggish enough that I feel it. <a rel="noopener external" target="_blank" href="https://www.warp.dev">Warp</a> is the magical one I tinker with sometimes. I loathe the idea of online sharing features or LLM auto-completion in my terminal, so I pretend those don’t exist and that it’s free software. The existence of those features for money in Warp might either please or distress you. It is, however, truly a modern take on what a terminal experience can be.</p>
<p>Windows didn’t have any good terminal programs other than the built-in one in VSCode until recently. The default Terminal program is now just fine. I use <a rel="noopener external" target="_blank" href="https://github.com/wez/wezterm">wezterm</a> when I’m using Windows. This requires customization with Lua and is not particularly modern or magical or command-aware or anything like that, but it is zippy and therefore better than VSCode. (I note, in passing, that Windows is a pretty important environment for lots of programmers, and Rust treats Windows as a first-class target, so Rust projects <em>can</em> do nice things for Windows users if their authors wish.)</p>
<p>The shell is no contest. Use the <a rel="noopener external" target="_blank" href="https://fishshell.com">fish shell</a>, as Julia recommends. You nushell users are a special sort of human; you may continue being you.</p>
<p>If your fingers type <code>!!</code> and <code>!$</code> enough to miss those bash-isms, install <a rel="noopener external" target="_blank" href="https://github.com/oh-my-fish/oh-my-fish">oh-my-fish</a> and get the <code>bang-bang</code> package. There are some other nice things there to snag.</p>
<p>Like Julia, I install every base16 theme there is. I change my theme colors and monospaced typeface about once a year, to keep my brain thinking everything looks different. I have no idea if this is helpful to anything or not, but I like to pretend it is. Get a <a rel="noopener external" target="_blank" href="https://www.nerdfonts.com">Nerd font variation</a> to keep the prompt looking right. Some good ones to consider: Cascadia Code, Iosevka, Fira Code, Monaspace.</p>
<h2 id="oxidize-everything">Oxidize everything</h2>
<p>Rust gave us all a systems programming language with ergonomics considerably better than the hand-held table saw that is C++, and the terminal experience is better for it. Look for modern Rust variations of everything, and for a handful of neat Golang tools as well.</p>
<p>Start by aliasing <code>ls</code> to <a rel="noopener external" target="_blank" href="https://github.com/eza-community/eza?tab=readme-ov-file#eza">eza</a>. You will not look back.</p>
<p>I stopped fussing about prompt setup when I found <a rel="noopener external" target="_blank" href="https://starship.rs">Starship</a>. Well, you have to fuss, but only once and then you’re done forever. Customize with toml once, put the toml into your dotfiles repo, then you have a prompt that works with any shell you are using at the moment, in whatever terminal in whatever environment.</p>
<p>You have lots of options for reading text in the terminal that aren’t just cat with one of two pager variations we’ve had for 35 years. Why not <a rel="noopener external" target="_blank" href="https://github.com/charmbracelet/glow">read styled markdown</a>?</p>
<p>Install the fuzzy-finder <a rel="noopener external" target="_blank" href="https://junegunn.github.io/fzf/">fzf</a> and its integration with your chosen shell (which of course is <code>fish</code>, finally a shell for the 90s). This is a subtle enhancer for everything if you start thinking of it as a part of how you find and select things. Many other tools come with <code>fzf</code> integrations built in directly, or you can shell-script it on up. You can get <a rel="noopener external" target="_blank" href="https://github.com/BurntSushi/ripgrep?tab=readme-ov-file#ripgrep-rg">ripgrep</a> (which you should be using too, come to think of it) into the mix to get <a rel="noopener external" target="_blank" href="https://github.com/junegunn/fzf/blob/master/ADVANCED.md#using-fzf-as-interactive-ripgrep-launcher">find file in project</a> with fzf.</p>
<h2 id="editors">Editors</h2>
<p>I don’t edit text in the terminal. For most editing, I use <a rel="noopener external" target="_blank" href="https://zed.dev">zed</a> and <code>zed</code>’s terminal integration. I don’t have any interest in “collaborating” with stochastic parrots, but I do have a very high interest in snappy editors with excellent language server integration. I switched to <code>zed</code> from VSCode a couple of years ago and haven’t looked back.</p>
<p>I do still edit some files in the terminal, out of very old habit. When I edit dotfiles and other system configuration files, I type <code>vi</code> and open them up right there. Why <code>vi</code>? Because I am old and why type three characters when two is enough? <code>vi</code> dates back to Bill Joy, if I recall, and that’s 40 years ago.</p>
<p>Editing text is not a modern thing to do inside the terminal, I think. But is modernity really what we’re going for? After all, people still use <code>vim</code> and <code>neovim</code> to write software effectively every day.</p>
<h2 id="i-say-something-about-goals">I say something about goals</h2>
<p>“Everything affects everything else” is true and so is the fact that nothing is perfectly consistent with everything else. It was all implemented at different times over decades by cats who resisted herding, or who were working in a slightly different context than the other cats, with different libraries. And all these grumpy cats were doing the essentially stupid thing of kind-of emulating a dead hardware terminal from a dead microcomputer company that turned into ANSI eventually. That it works at all is nice; that we can get it to do decent things is surprising. I’m not sure what it means that it’s still by far the most effective way to get programming work done for many people.</p>
<p>So the terminal is a weird mess, yup. Changing your setup is disruptive if you do it rarely. You get practice in setting things up if you throw things up into the air often, though, and that’s why I do it. I try new things far more often than I choose to integrate them into my daily shell workflows. I can’t tell you how many times I’ve tried shell history things that promise to revolutionize my shell experience that ended up driving me to distraction within 15 minutes. It’s at least, um, twice? Once? Once for sure. I found all of the above things and a lot more that I am restraining myself from listing here by experimenting a little bit every so often with things I hear about.</p>
<p>Changing my setup isn’t the goal, really. Finding things that are worth integrating from the seething stew of modern software, that’s my goal. What makes them worth integrating? Well, it’s not modernity, not primarily anyway, as I say above.  Modernity doesn’t select modal editors. Modernity doesn’t reach for the terminal. Modernity abandons the VT100 and sets up a mouse and windows.</p>
<p>So we’re not trying to be modern. We’re trying to be effective. Terminals and <code>vim</code> are <em>effective</em> and, in the hands of experts, <em>powerful</em>. Aim for the set of tools that make you effective. Select new tools based on how they’ll make you more effective at whatever it is you’re doing in the shell. Try things, reject some, integrate others. Tell your friends about the good stuff. (Tell me about the good stuff, too, thanks.)</p>
]]></content:encoded>
      </item>
      <item>
          <title>Understanding Software</title>
          <link>https://blog.ceejbot.com/posts/understanding-software/</link>
          <pubDate>Fri, 29 Mar 2024 21:58:26 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/understanding-software/</guid>
          <description>A slightly sanitized version of a company presentation I made today.</description>
          <content:encoded><![CDATA[<p>Nothing I said in this presentation will be shocking to any readers of this blog, but my audience here was the entire company. I wanted to let a group of non-programmers know what we do and how everybody contributes to the work of making useful software.</p>
<p><a href="/files/understanding-software.pdf">PDF version of the rendered slides</a></p>
<p>The Markdown version follows! The <code>SN:</code> indicates my speaker notes.</p>
<hr />
<h1 id="understanding-software">Understanding <strong>software</strong></h1>
<h2 id="and-how-it-comes-to-be">and how it comes to be</h2>
<p>– @ceejbot</p>
<p>SN: A note about the slides: they’re anchor points to call out important words or to remind you where we are in the presentation. You don’t have to let them fill a whole screen if you don’t want to. There aren’t any flashing lights or animations in the presentation, either.</p>
<hr />
<p><img src="/images/twitter-design.jpg" alt="Twitter’s design" /></p>
<p>SN: How does this sketch turn into a company that had thousands of developers, millions of daily users, and an effect on the entire world? At its starting point was nothing, and then software happened, and for about 15 years we had <em>something</em>. Politics, news, culture– all happened because of this software. I’ve always found this amazing– somebody has an idea, and this THING appears out of nothingness.</p>
<hr />
<h2 id="what-is-software">What is <strong>software</strong>?</h2>
<h2 id="how-do-we-build-it">How do we <strong>build</strong> it?</h2>
<h2 id="what-happens-afterward">What happens <strong>afterward</strong>?</h2>
<p>SN: Carl Sagan said if you want to make an apple pie, first you must invent the universe. We aren’t going to go that far back, but we are going to talk about these three questions. I need to caveat all of this: My answers to these questions come from a specific perspective– me &amp; my career experiences. I am not going to talk about how it’s done at Google or Facebook or other weird gigantic companies. I’m going to talk about how software is built by small to medium sized teams in Silicon Valley, ones that happen to have a lot of ex-Apple product influence.</p>
<hr />
<h1 id="this-company-writes-software">This company writes <strong>software</strong></h1>
<ul>
<li>Everyone here contributes to this work.</li>
<li>Everyone here would benefit from understanding how we do it.</li>
</ul>
<p>SN: We write a lot of software. I counted X meaningful lines the other day.</p>
<hr />
<h1 id="what-is-programming-anyway"><strong>What</strong> is <strong>programming</strong> anyway?</h1>
<p>SN: A traditional answer is that programming is typing long text files with instructions to make a computer do things. But when I’m sitting with my feet up on my desk, or when I’m pacing around my house muttering, or when I’m scribbling in my notebook, I’m also programming. I’m going to go to one of my favorite essays of all time for another answer.</p>
<hr />
<p>“[P]rogramming properly should be regarded as an activity by which <em>the programmers form or achieve a certain kind of insight, a theory, of the matters at hand.</em> This suggestion is in contrast to what appears to be a more common notion, that programming should be regarded as a production of a program and certain other texts.”</p>
<p>— Peter Naur, <a rel="noopener external" target="_blank" href="https://gist.github.com/onlurking/fc5c81d18cfce9ff81bc968a7f342fb1">“Programming as Theory-Building”</a>, 1985</p>
<p>SN: Peter Naur is the Naur of Backus-Naur Form, which some of the programmers in the audience might remember, and one of the designers of Algol, the extremely influential programming language. This is from a 1985 essay about what he’d learned about how to write and maintain and operate software. I think this is right on target. Let’s spend a moment looking at Naur’s theory of the program.</p>
<hr />
<p>Naur says a programmer who has the “theory of the program” can:</p>
<ol>
<li>Explain how the solution relates to the affairs of the world that it helps to handle.</li>
<li>Explain why each part of the program is what it is.</li>
<li>Respond constructively to any demand for a modification of the program so as to support the affairs of the world in a new manner.</li>
</ol>
<p>SN: Naur was writing in an earlier era, so he talks about single programs here. Today, we write many programs and connect them all together into software systems. What he called “the theory of the program” is what I would call “the model of the system”, but both phrases get at the heart of the concept.</p>
<hr />
<h1 id="software-is"><strong>Software</strong> is:</h1>
<ul>
<li>a lot of text files with instructions to computers (they matter!)</li>
<li>that express the authors’ understanding of a real-world problem</li>
<li>and their solution to that problem</li>
<li>(and the same for every building block they needed along the way)</li>
</ul>
<p>Programming is how we get there.</p>
<p>SN: And this is what we have to understand to function effectively. Let’s zero in on one part of that.</p>
<hr />
<h1 id="to-write-software-effectively">To <strong>write</strong> software effectively</h1>
<h1 id="you-must-understand">you must understand:</h1>
<ul>
<li>the <em>affair of the world</em></li>
<li><em>how</em> the program goes about solving it</li>
</ul>
<p>SN: The how is mind-bogglingly complex, and very few people working on any team project understand the whole thing. Some people who’ve been involved with it for a long time might have a better understanding than others, but it’s possible that nobody understands the whole thing.</p>
<p>SN: Now, I want to back up from the theory a little bit to talk about those text files. They do matter!</p>
<hr />
<h1 id="code-is-communication-with-computers-and-humans">Code is <strong>communication</strong> with computers and humans.</h1>
<ul>
<li>Code defines data (nouns) and functions (verbs).</li>
<li>We name things carefully because the names are meaningful to humans.</li>
<li>A program becomes a language of its own. (Hat-tip to Dijkstra.)</li>
</ul>
<p>SN: In the jargon of programmers, every complex system is a domain-specific language expressing our understanding of the problem.</p>
<hr />
<p>Can you guess what this code is supposed to do?</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">fn</span><span style="color: light-dark(#B57614, #FABD2F);"> ch</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);"> -&gt;</span><span style="color: light-dark(#B57614, #FABD2F);"> Result</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">usize</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#B57614, #FABD2F);"> Error</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">{</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">    let</span><span style="color: light-dark(#076678, #83A598);"> a</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> a</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">    Ok</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">a</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">f</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#B57614, #FABD2F);">S</span><span style="color: light-dark(#427B58, #8EC07C);">::</span><span style="color: light-dark(#B57614, #FABD2F);">H6</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">len</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
<p>SN: The programmers in the audience all guess that it’s getting the length of something, but they have no idea what that’s the length of, or what any of the other stuff does.</p>
<hr />
<p>Can you guess what this code is supposed to do?</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">///</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> Count how many hedgies are in our zoo.</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">fn</span><span style="color: light-dark(#B57614, #FABD2F);"> count_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);"> -&gt;</span><span style="color: light-dark(#B57614, #FABD2F);"> Result</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">usize</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#B57614, #FABD2F);"> ZooInventoryError</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">{</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">    let</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">    let</span><span style="color: light-dark(#076678, #83A598);"> hedgie_list</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#B57614, #FABD2F);">Species</span><span style="color: light-dark(#427B58, #8EC07C);">::</span><span style="color: light-dark(#B57614, #FABD2F);">Hedgehog</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">    Ok</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">hedgie_list</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">len</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
<p>SN: You probably have a good guess about what this means, even if you don’t know the specific programming language I’m using or any programming language at all. This code communicates to humans as well as computers. This might do the exact same thing as the previous code when run, but this version has an additional layer of useful meaning, and supports Naur’s theory-building better. (It could lie, and be about counting numbats, but we try not to do that.)</p>
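<p>For the programmers reading along: here is a minimal sketch of the scaffolding that would let the second version compile and run. Everything except <code>count_hedgehogs</code> itself is an assumption made for illustration. The <code>Animal</code> struct, the <code>Species</code> variants, the <code>Zoo</code> wrapper, the stubbed <code>fetch_all_animals</code>, and the shape of <code>ZooInventoryError</code> are all hypothetical and come from no real codebase.</p>

```rust
/// Hypothetical species list; only `Hedgehog` appears on the slide.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Species {
    Hedgehog,
    Numbat,
}

/// Hypothetical record for one animal in the zoo.
#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Animal {
    name: String,
    species: Species,
}

/// Hypothetical error type; a real one would say what went wrong.
#[allow(dead_code)]
#[derive(Debug)]
enum ZooInventoryError {
    InventoryUnavailable,
}

/// Hypothetical wrapper so that `filter_for` can be a method.
struct Zoo(Vec<Animal>);

impl Zoo {
    /// Keep only the animals of the requested species.
    fn filter_for(&self, species: Species) -> Vec<&Animal> {
        self.0
            .iter()
            .filter(|animal| animal.species == species)
            .collect()
    }
}

/// Stand-in for a database or API call; here it returns fixed data.
fn fetch_all_animals() -> Result<Zoo, ZooInventoryError> {
    Ok(Zoo(vec![
        Animal { name: "Bramble".to_string(), species: Species::Hedgehog },
        Animal { name: "Thistle".to_string(), species: Species::Hedgehog },
        Animal { name: "Stripe".to_string(), species: Species::Numbat },
    ]))
}

/// Count how many hedgies are in our zoo.
fn count_hedgehogs() -> Result<usize, ZooInventoryError> {
    let animals = fetch_all_animals()?;
    let hedgie_list = animals.filter_for(Species::Hedgehog);
    Ok(hedgie_list.len())
}

fn main() {
    match count_hedgehogs() {
        Ok(count) => println!("hedgehogs in the zoo: {count}"),
        Err(err) => eprintln!("could not count hedgehogs: {err:?}"),
    }
}
```

<p>Notice that the supporting names carry the same load as the function name: a reader can guess what <code>Species::Numbat</code> and <code>ZooInventoryError</code> are for without reading any of the logic.</p>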
<hr />
<h1 id="how-do-we-invent-that-specific-language-to-express-a-problem">How do we <strong>invent</strong> that specific language to express a problem?</h1>
<p>SN: One thing that I have learned is that no two software solutions to a problem ever look alike. I know what little I know about sudoku solving from the talk $colleague gave at a lunch and learn a couple of weeks ago. But if you gave me and $colleague the task of writing a sudoku solver, we’d write <em>completely</em> different programs. If you gave us the task of writing a solver together, we’d write something different again. This, btw, is very cool, because it says something about human minds that fascinates me. BUT despite the differences in end result, both of us would use a similar heuristic to get there.</p>
<hr />
<ol>
<li><strong>Understand</strong> the real-world problem.</li>
<li><strong>Analyze</strong> it from a software point of view.</li>
<li><strong>Imagine</strong> a solution.</li>
<li><strong>Align</strong> a team on the problem, the solution, and the values that shape the solution.</li>
<li><strong>Coordinate</strong> to express that understanding in code.</li>
<li>Get <strong>feedback</strong> and iterate.</li>
<li><strong>SHIP IT.</strong></li>
</ol>
<p>SN: There are no secrets here. It works this way for all problems in software, whether small or large. Some things are easier when you’re a team of one– it’s easy to align with yourself. That might be hard with a team of 20, and very hard indeed when your team is larger than Dunbar’s number. But this is how it works. Let’s look at a simple example.</p>
<hr />
<p><img src="/images/Visicalc.png" alt="Visicalc.png" /></p>
<p>SN: This is Visicalc, the first spreadsheet software anybody remembers. 1979. (The first one was LANPAR in 1969.) This one invention sold personal computers to millions of small businesses and is a huge part of Microsoft’s revenue even today. Spreadsheets ate the world and run many businesses and are part of critical workflows everywhere. But somebody had to make the first one.</p>
<hr />
<ul>
<li><strong>Understand:</strong> the workflow of accountants.</li>
<li><strong>Analyze:</strong> These numbers and dates are data a computer can store; doing arithmetic on columns of data is something a computer can do.</li>
<li><strong>Imagine:</strong> What if we let people type numbers into boxes and the computer automatically did the math?</li>
<li><strong>Coordinate:</strong> 2 people in a room!</li>
<li><strong>Ship:</strong> LANPAR was 1969. It didn’t ship as we understand it; but Visicalc did.</li>
</ul>
<p>SN: The word “spreadsheet” comes directly from accounting. Let’s go broader, and apply the process to our shared endeavor.</p>
<hr />
<h1 id="step-1-understand-the-real-world-problem">Step 1: <strong>Understand</strong> the real-world problem</h1>
<ul>
<li>Who are our customers? What are they trying to do?</li>
<li>This is difficult! Our industry is complex!</li>
<li>This is why every company needs its subject-matter experts.</li>
<li>Everybody involved in designing and implementing the software does better the more they understand the people who’ll use that software and what they’re trying to do.</li>
</ul>
<p>SN: Our experts and our customer contact people keep programmers like me in touch with who we’re making tools for. I believe I speak for every person on the engineering team when I say that we all desperately want more understanding of our customers. Please! Talk to us!</p>
<hr />
<h1 id="we-share-what-we-understand">We <strong>share</strong> what we understand.</h1>
<ul>
<li>Writing and reading documents.</li>
<li>Talking to each other.</li>
</ul>
<p>SN: Once we understand something, we don’t leap to writing code. Instead we share that understanding.</p>
<hr />
<h1 id="step-2-analyze-the-problem">Step 2: <strong>Analyze</strong> the problem</h1>
<p>“To a person with a pencil, everything looks like a sentence. To a person with a TV camera, everything looks like an image. To a person with a computer, everything looks like data.”</p>
<p>—Neil Postman, “Five Things We Need to Know About Technological Change”</p>
<p>SN: Or more succinctly, the medium is the message, and the medium of software is data.</p>
<hr />
<h1 id="study-the-data">Study the <strong>data</strong></h1>
<p>The medium of software is information, or data. Software collects or generates data, then transforms that data via rules. The process of describing the data and writing the rules is what occupies us all day.</p>
<p>SN: Call out some of the nouns we track in data.</p>
<hr />
<h1 id="study-what-people-do-with-that-data">Study what people <strong>do</strong> with that data</h1>
<p>Data by itself is uninteresting. People are using it to do something. What?</p>
<p>SN: Talk about how our customers use their data.</p>
<hr />
<h1 id="step-3-ask-how-automating-that-with-software-would-help">Step 3: Ask how <strong>automating</strong> that with software would help.</h1>
<p>What if… we took a process that takes weeks right now, and made it take minutes instead because software does the correlation for you?</p>
<p>SN: Marc Andreessen described this as “software eating the world”, and he should know. He invented the image tag, and that was enough for the web to eat the world.</p>
<hr />
<h1 id="deepen-that-computer-focused-analysis"><strong>Deepen</strong> that computer-focused analysis</h1>
<ul>
<li>What data would the software need to have available?</li>
<li>How will we get that data in a form we can use?</li>
<li>What would we need to do with that data to present useful information to humans?</li>
</ul>
<hr />
<h1 id="nouns-how-we-structure-our-data"><strong>Nouns</strong>: how we <strong>structure</strong> our data</h1>
<p><em>long list of nouns</em>: so much data!</p>
<p>SN: Talk about how subject-matter experts help us identify the data.</p>
<hr />
<h1 id="verbs-how-we-transform-that-data"><strong>Verbs</strong>: how we <strong>transform</strong> that data</h1>
<ul>
<li>we receive a lot of data, transform it, and run some truly complex analyses on it</li>
<li>we present that information to human beings in a form designed to help them make important decisions</li>
<li>server engineers, UI engineers, UX designers, data engineers, and data scientists are all involved in doing this</li>
</ul>
<p>SN: This is most of the work, right here. This is what the software <em>does</em>, its verbs.</p>
<hr />
<h1 id="step-4-align-a-team">Step 4: <strong>Align</strong> a team</h1>
<ul>
<li>on how you understand the problem</li>
<li>on the shape of your solution</li>
<li>on the values you bring to your solution</li>
</ul>
<p>SN: This is what our company meeting does. Every week, we talk about what our customers are trying to do and how well we’re solving their problems.</p>
<hr />
<h1 id="align-technically-on-the-details-of-our-solution">Align <strong>technically</strong> on the details of our solution</h1>
<ul>
<li>technical design choices</li>
<li>the details of how we represent our data</li>
<li>the building blocks of our software</li>
<li>what our architecture is</li>
<li>the values we use to decide among our options</li>
</ul>
<p>SN: What programming languages are we using? How are we storing our data? Of the countless ways we might write this, which way are we picking?</p>
<hr />
<h1 id="technical-alignment-comes-from"><strong>Technical</strong> alignment comes from:</h1>
<ul>
<li>Writing and reading documents.</li>
<li>Talking to each other.</li>
<li>Over and over (you don’t stop).</li>
</ul>
<p>SN: Alignment is an ongoing task. We must constantly communicate in person and via design documents to make sure we all understand the direction we’re going.</p>
<hr />
<h1 id="no-one-person-ever-understands-the-whole-thing">No one person ever understands the <strong>whole thing</strong></h1>
<p>Each one of us makes decisions that push the system in the right direction.</p>
<p>We must be in alignment, or those decisions might be at cross-purposes.</p>
<p>SN: Alignment is critical, because complex software is too big for any one person.</p>
<hr />
<h1 id="step-5-coordinate-to-write-all-those-text-files">Step 5: <strong>Coordinate</strong> to write all those text files.</h1>
<p>SN: DEEP SIGH. This is where all the trouble is. I could give an entire presentation on what we know about this part of it, from books people have written about their face-plants through the years. Today I’ll stick to sharing a couple of insights I hope will be useful.</p>
<hr />
<h1 id="software-development-methodologies-are-under-studied">Software development methodologies are <strong>under-studied</strong>.</h1>
<p>agile, scrum, kanban, waterfall, extreme programming, spiral, chaos, shape up, behavior-driven, lean, that weird UML-based thing, slow programming…</p>
<p>SN: All of those are real names for methodologies. Which ones result in measurable, repeatable productivity improvements? No idea. Nobody has studied this. There are a few things we do know, from looking at past projects. We do know it’s a team sport, and that communication is the core.</p>
<hr />
<blockquote>
<p>“Adding [human] power to a late software project makes it <strong>later</strong>.”
— Fred Brooks, <em>The Mythical Man-Month</em>, 1975.</p>
</blockquote>
<p>SN: Why? Because communication is, as we nerds like to say, an order N squared problem. Adding the 10th person to a project team adds 9 new lines of communication to worry about. This is a great book with a lot of great project insight, including the nugget that if it takes one woman nine months to deliver a baby, it does not follow that it would take 9 women one month to do it. And yet this is something the software industry keeps trying to do…</p>
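<p>The arithmetic behind that claim fits in a few lines. This is a small sketch and not something from Brooks’s book: the function name <code>communication_lines</code> is made up for illustration, and it assumes every pair of people on the team needs its own line of communication.</p>

```rust
/// Number of distinct pairs among `people` team members: n * (n - 1) / 2.
/// Assumes (for illustration) that every pair needs its own line of communication.
fn communication_lines(people: u64) -> u64 {
    // saturating_sub keeps an empty team at zero instead of underflowing.
    people * people.saturating_sub(1) / 2
}

fn main() {
    for team_size in [2, 5, 9, 10, 20] {
        println!("{team_size} people: {} lines", communication_lines(team_size));
    }

    // Growing the team from 9 to 10 people adds one new line per existing member.
    let added = communication_lines(10) - communication_lines(9);
    println!("the 10th person adds {added} lines");
}
```

<p>Under that assumption a team of 10 has 45 possible conversations and a team of 20 has 190: doubling the headcount roughly quadruples the talking.</p>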
<hr />
<h1 id="we-know-some-things-are-bad">We know some things are <strong>bad</strong></h1>
<ul>
<li>micromanagement is awful</li>
<li>long periods of crunch are actively destructive (and we have research here)</li>
<li>projects that never end wear people out</li>
</ul>
<p>SN: These things fall into the category of yeah, people are people.</p>
<hr />
<h1 id="and-some-things-are-good">… and some things are <strong>good</strong></h1>
<ul>
<li>Do <strong>write</strong> things down.</li>
<li>Do give people and teams appropriate <strong>autonomy.</strong></li>
<li>Do <strong>collaborate</strong> on the hardest work.</li>
<li>Do treat each other with <strong>kindness</strong> and <strong>respect.</strong></li>
<li>Do create <strong>emotional safety</strong>, so people can experiment and learn.</li>
</ul>
<p>SN: Huh, none of those things are about process meetings. All of these things are about enabling smart people to do their best work. Strange. Okay, let’s talk process for two more slides.</p>
<hr />
<h1 id="most-healthy-projects-do-something-agile-ish">Most healthy projects do something <strong>agile-ish</strong>.</h1>
<ul>
<li>Teams do best when they understand what they’re building, why they’re building it, and who they’re building it for.</li>
<li>Self-organization and autonomy are good.</li>
<li>Delivering working software frequently turns out to be good.</li>
<li>Communicating with the customer a lot is also good.</li>
<li>The details don’t matter much, so long as you’re talking to each other.</li>
</ul>
<p>SN: The Agile Manifesto is actually good.</p>
<hr />
<blockquote>
<p>“There is no <strong>silver bullet.</strong>”
— Fred Brooks again</p>
</blockquote>
<p>SN: There is no single solution that works for every team in every moment.</p>
<hr />
<h1 id="step-6-get-feedback">Step 6. Get <strong>feedback</strong>.</h1>
<p>Feedback tells us if we’re on target or not. Spoiler: You’re almost never perfectly on target.</p>
<p>SN: Feedback loops are pretty important. We need to check on how we’re doing. We run retrospectives on incidents and on projects to see how we’re doing with our processes, and learn from our experiences. Do more of this? Less of that? Feedback loops are how learning happens.</p>
<hr />
<h1 id="can-t-we-just-get-it-right-the-first-time">Can’t we just get it right the <strong>first time?</strong></h1>
<p>Nope.</p>
<p>SN: And there’s a reason why we can’t.</p>
<hr />
<blockquote>
<p>“The <strong>map</strong> is not the <strong>territory.</strong>”
— Alfred Korzybski</p>
</blockquote>
<p>SN: Your mental model is not reality. The map is a model of the real world– the mountain and the terrain, and the trails across it. The map tells you a trail is there, but it does not tell you that the trail was washed out in a mudslide three days ago. We make our plans with the information we have, and then we learn from feedback how we’re wrong.</p>
<hr />
<h1 id="ways-our-map-is-wrong">Ways our map is <strong>wrong</strong></h1>
<ul>
<li>We didn’t understand the customer’s workflow.</li>
<li>We got our data models wrong.</li>
<li>We’re transforming our data incorrectly (or inefficiently).</li>
<li>We figured out a new approach along the way.</li>
<li>Teams didn’t align with each other, and their software doesn’t work together.</li>
<li>Software we rely on behaves unexpectedly.</li>
<li>We made mistakes while building things.</li>
</ul>
<p>SN: All of these things are guaranteed to happen, mostly at a small level, but sometimes with very big concepts. So we need feedback, and we take active steps to get it.</p>
<hr />
<h1 id="feedback-from-testing">Feedback from <strong>testing</strong></h1>
<p>We test for many reasons!</p>
<ul>
<li>Does this one piece do what we want it to do?</li>
<li>Are all the complex pieces working together?</li>
<li>Does the system do what we expected?</li>
<li>(Did we get lost despite following our map?)</li>
</ul>
<p>SN: This is why we have QA.</p>
<hr />
<h1 id="feedback-from-our-customers">Feedback from our <strong>customers</strong></h1>
<ul>
<li>Is our system doing what our customers need?</li>
<li>(Did we reach our planned destination or did our map lie?)</li>
</ul>
<p>SN: The people who regularly talk to our customers are invaluable.</p>
<hr />
<h1 id="step-7-ship-it">Step 7. <strong>Ship it</strong>.</h1>
<p>Get it into the hands of customers as soon as it would be useful to them. Get revenue as soon as you’re able.</p>
<p>SN: The reality of Silicon Valley style software companies is that we all go into debt immediately to be able to pay salaries and AWS bills. We want to get out of that situation as soon as possible, so the company can keep doing its thing.</p>
<hr />
<blockquote>
<p>“Ship or <strong>die.</strong>” — Danger, Inc, internal motto, 2002</p>
</blockquote>
<p>SN: Before the team shipped the first Sidekick in 2002, we said this often to each other. This over-dramatic motto came from a maniacal focus on shipping, getting our product done and out there into people’s hands. But the catch is that you’re not done when you ship.</p>
<hr />
<h1 id="what-happens-after-you-ship">What happens <strong>after</strong> you ship?</h1>
<p>Staying alive with more software.</p>
<p>SN: So it’s great we shipped instead of dying, but now we gotta keep the software alive too. Software is never finished! We continue to modify it after we release it to the world.</p>
<hr />
<h1 id="most-of-the-cost-of-software-is-maintaining-it">Most of the <strong>cost</strong> of software is <strong>maintaining</strong> it</h1>
<p>Every line of code we write has a maintenance cost: people, time, thinking.</p>
<p>SN: Those half-million lines of code represent complexity that has to be understood.</p>
<hr />
<h1 id="living-software-systems-must-be-operated">Living software systems must be <strong>operated</strong>.</h1>
<ul>
<li>Software must be run to have meaning!</li>
<li>Keeping software running is an entire area of expertise.</li>
<li>Operations teams tend the software that runs the software to run the… oh no.</li>
</ul>
<p>SN: Text files on GitHub don’t do much by themselves.</p>
<hr />
<h1 id="living-software-systems-must-be-changed">Living software systems must be <strong>changed</strong>.</h1>
<ul>
<li>the world around us changes</li>
<li>new laws &amp; regulations, new practices from our customers</li>
<li>the context in which the software runs changes</li>
<li>the team maintaining the software changes over time</li>
</ul>
<p>The software must change in response.</p>
<hr />
<h1 id="changing-software-requires-understanding-it">Changing software requires <strong>understanding</strong> it</h1>
<p>Naur’s third point: A programmer with the theory of the system can “respond constructively to any demand for a modification of the [system] so as to support the affairs of the world in a new manner.”</p>
<p>SN: Let’s call back to Naur again– changing software requires understanding it. The more complex and voluminous the software, the more there is to understand.</p>
<hr />
<h1 id="success-can-be-a-catastrophe">Success can be a <strong>catastrophe</strong>.</h1>
<ul>
<li>we need to scale up from a few customers to many</li>
<li>we learn where we need to be flexible</li>
<li>we learn where our models were incomplete</li>
</ul>
<p>SN: A friend who was at Twitter during its early years describes implementing things that would get them through the next six months, by which time they’d have the replacements ready to go.</p>
<hr />
<h1 id="all-software-has-a-lifespan">All software has a <strong>lifespan</strong></h1>
<ul>
<li>the changes made to it slowly build up like plaque in arteries</li>
<li>the software in a big system usually gets replaced in pieces to keep the system itself working</li>
<li>the system of software itself lives a long time</li>
</ul>
<hr />
<h1 id="congratulations">Congratulations.</h1>
<h1 id="now-do-it-all-over-again-for-the-next-product">Now do it <strong>all over again</strong> for the next product.</h1>
<p>SN: You figured out how to eat this thing with software. You shipped. Your customers grumble sometimes, but they’re mostly happy. PHEW. Let’s do a fast recap.</p>
<hr />
<h1 id="recap-what-is-software">recap: what is <strong>software</strong>?</h1>
<ul>
<li>software is, yes, text files with instructions to computers</li>
<li>it’s also an expression of our understanding of a real-world problem</li>
<li>and an expression of our analysis from a computing perspective</li>
</ul>
<hr />
<h1 id="recap-how-do-we-build-it">recap: how do we <strong>build</strong> it?</h1>
<ul>
<li>there’s no perfect answer to this</li>
<li>building software requires a team to
<ul>
<li>align on their understanding</li>
<li>plan an approach</li>
<li>coordinate with each other</li>
<li>iterate in response to feedback</li>
</ul>
</li>
</ul>
<hr />
<h1 id="recap-what-happens-after-we-ship">recap: what happens after we <strong>ship</strong>?</h1>
<ul>
<li>software lives on long after we build it</li>
<li>most of its cost is maintenance</li>
<li>you have to understand it to maintain it</li>
<li>eventually we need to replace it</li>
</ul>
<hr />
<h1 id="and-that-s-how-we-turn-a-napkin-sketch-into-something-that-affects-the-physical-world">And that’s how we turn a <strong>napkin sketch</strong> into something that affects the physical world.</h1>
<hr />
<h1 id="questions"><strong>Questions?</strong></h1>
<p>SN: Stop sharing screen now.</p>
]]></content:encoded>
      </item>
      <item>
          <title>Accepting Work</title>
          <link>https://blog.ceejbot.com/posts/accepting-work/</link>
          <pubDate>Tue, 19 Dec 2023 14:10:00 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/accepting-work/</guid>
          <description>In which I drop some anvils from the sky about agile methodologies, when you should accept work on your task list, and when you shouldn&#x27;t.</description>
          <content:encoded><![CDATA[<p>For “you” in this document, read “you and your team”.</p>
<p>I link to some interesting reading on some of these anvils, but mostly I don’t. These are things I generally take as facts about the world, with the usual squishy “it depends sometimes” about some of them. I use agile methodology language, mostly, even though I like to say I really hate agile processes. Do I hate agile? Really? Let the anvils commence!</p>
<hr />
<p>Don’t let people outside the team assign work to the team. They may propose work, but you decide if you accept that work.</p>
<p>The rate at which you accept work must be less than the rate at which you finish work, or you will have infinite work.</p>
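<p>To make the arithmetic concrete, here’s a minimal sketch in Python. The sprint counts and item numbers are made up for illustration; the only point is the direction the backlog moves.</p>
<pre><code class="language-python">def backlog_after(sprints, accepted_per_sprint, finished_per_sprint, start=0):
    """Return the backlog size after each sprint, never dropping below zero."""
    backlog = start
    history = []
    for _ in range(sprints):
        backlog = max(0, backlog + accepted_per_sprint - finished_per_sprint)
        history.append(backlog)
    return history


# Accepting 12 items per sprint while finishing 10: the backlog keeps growing.
growing = backlog_after(sprints=5, accepted_per_sprint=12, finished_per_sprint=10)
# Accepting 8 while finishing 10: an existing backlog of 6 drains to zero.
draining = backlog_after(sprints=5, accepted_per_sprint=8, finished_per_sprint=10, start=6)

print(growing)   # [2, 4, 6, 8, 10]
print(draining)  # [4, 2, 0, 0, 0]
</code></pre>
<p>Swap in your own numbers from the last few retros. If the first list looks like your team’s, something has to give.</p>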
<p>Operational incidents and meetings count as work.</p>
<p>Bug-fixing counts as work.</p>
<p>Don’t accept work that you don’t understand. “Figure out this project well enough to estimate it” is acceptable work, as is “cooperate with a product designer to get design documents into a state where they describe acceptable work”.</p>
<p>Only rarely should you say no outright to work. If it’s not well-defined, push to define the work better. (Unclear requirements make for misery on both sides.) If your team has too much work already, push for prioritization. (Something has to give. It will always give in reality, whether people admit that in advance or not.)</p>
<p>Sometimes you need to communicate the consequences of your team taking on disruptive work and let your customer decide if the cost is worth it.</p>
<p>Technical design and research counts as work.</p>
<p>Estimation counts as work. The more time you spend on accurate estimation, the less time you spend on other work, such as implementation. This is often worth the time anyway, because sometimes the business needs it.</p>
<p>Tools are not a substitute for communication.</p>
<p>One point of the retro is to figure out what your true rate of finishing work is. If you finished less than you took on, then next time take on less work. If you finished more, cautiously take on a little more.</p>
<p>You probably do not spend enough time doing retros and planning for your next sprint. One hour every two weeks isn’t enough.</p>
<p>The more you understand the work, the better you do at estimating it.</p>
<p>Corollary: You do best estimating work very similar to work you’ve done before. <sup class="footnote-reference" id="fr-similar-1"><a href="#fn-similar">1</a></sup></p>
<p>Another corollary: Estimates you make at the start of a project, when you know the least about it, are the most likely to be wrong. Build in feedback loops for estimates! Communicate with your customers as estimates change.</p>
<p>Don’t let your early estimates get turned into deadlines.</p>
<p>Sometimes the business itself has deadlines. Frequent delivery of working software is a survival tactic for deadlines.</p>
<p>Sometimes you get it wrong. Use the retro to figure out what you can learn from the mistake. Remember, <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Map%E2%80%93territory_relation">the map is not the territory</a>. Sometimes that clearly-marked trail turns out to have been destroyed by a mudslide.</p>
<p><a rel="noopener external" target="_blank" href="https://erikbern.com/2019/04/15/why-software-projects-take-longer-than-you-think-a-statistical-model.html">High-uncertainty projects dominate software schedules.</a> The thing that’s late because the trail was washed out ends up making everything late. Maybe it was worth an advance scout? <sup class="footnote-reference" id="fr-metaphor-1"><a href="#fn-metaphor">2</a></sup></p>
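<p>Here’s a rough sketch of why, under an assumption I’m making purely for illustration: each task’s ratio of actual time to estimated time is log-normal with a median of 1, so you beat your estimate half the time. The mean of that ratio is <code>exp(sigma² / 2)</code>, which climbs quickly as the uncertainty <code>sigma</code> grows. The task names, estimates, and sigma values below are invented.</p>
<pre><code class="language-python">import math


def expected_blowup(sigma):
    """Mean of a log-normal ratio with median 1 (mu = 0): exp(sigma^2 / 2)."""
    return math.exp(sigma ** 2 / 2)


# Hypothetical plan: (task name, estimate in days, uncertainty sigma).
plan = [
    ("familiar CRUD endpoint", 5, 0.5),
    ("routine migration", 5, 0.5),
    ("integration nobody has tried", 5, 2.0),
]

for name, estimate, sigma in plan:
    expected = estimate * expected_blowup(sigma)
    print(f"{name}: estimated {estimate} days, expected {expected:.1f} days")

# Totals across the whole plan: what was promised versus what to expect on average.
total_estimate = sum(estimate for _, estimate, _ in plan)
total_expected = sum(estimate * expected_blowup(sigma) for _, estimate, sigma in plan)
print(f"total: estimated {total_estimate} days, expected {total_expected:.1f} days")
</code></pre>
<p>With these invented numbers, 15 days of estimates turn into about 48 expected days, and nearly all of the overrun comes from the one task nobody has done before.</p>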
<p>If hitting a date you provide matters, invest time in lowering uncertainty.</p>
<p><a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/The_Mythical_Man-Month">Fred Brooks</a> spoke truth about how communication overhead dominates work. Most complex software work can’t be parallelized or sped up the way businesses want to speed it up.</p>
<p>The fastest way to get projects done is to have an aligned team take on chunks at their own pace, without doing any planning other than technical planning. Nobody likes hearing this, but it’s a consequence of the overhead of estimating and bookkeeping.</p>
<h2 id="agiletm">Agile™</h2>
<p>Agile™ as practiced has little to do with the original principles of the movement. Those original principles might be summarized roughly as:</p>
<ul>
<li>You are building things for a customer. Talk to your customer.</li>
<li>Deliver frequently.</li>
<li>Build in feedback loops so you can figure out what you’re doing that’s working and what’s not.</li>
<li>Let teams self-organize. Trust them.</li>
<li>Change is inevitable, so plan for it.</li>
</ul>
<p>I wrote those points off the top of my head, so I went to the original to see how well I did at capturing its spirit. Not bad! <a rel="noopener external" target="_blank" href="http://agilemanifesto.org/principles.html">This is what the Agile Manifesto says.</a> Go read it! It’s short! Then weep at how far we have strayed from it. Also note what isn’t there: any rigidity about sprint lengths, planning poker, burndown charts, anybody other than the team itself deciding how to do things. It sounds pretty sensible to me, to be honest. (Maybe it’s only Agile™ that I dislike?)</p>
<p>What I also like about that original manifesto is the focus on sustainability. “Sprinting” isn’t mentioned. <em>Communication</em> sure is, though, and I’m 100% aligned with that. Talk to people involved in the project. Frequently. Even about bad news. <sup class="footnote-reference" id="fr-plants-1"><a href="#fn-plants">3</a></sup> It’s all about the communication.</p>
<p>And as you know, communication has its own overhead. I point back at Brooks, who says there’s no silver bullet.</p>
<h2 id="did-i-have-a-thesis">Did I have a thesis?</h2>
<p>Mostly I wanted to write down some things I take as fact about planning and estimation that are often at odds with how software organizations behave. I’ve been itching whenever I hear about teams “doing agile” or “getting scrum training”. My theory is that processes are never one size fits all. You can’t be dogmatic about them. Teams vary, and so do projects. Some teams write a lot; some teams talk a lot; some teams demo a lot. Some teams pair; some teams mob program. What’s more, teams vary over time even when their membership is mostly stable, because people change and learn.</p>
<p>The best process for any project is probably one you design in the moment for the team. You never have to do this in a vacuum, because there are lots of good processes to steal from, and your team is probably doing some set of things already that are effective for them.</p>
<p>Dogmatic adherence to a half-understood Agile methodology probably ain’t it. So go back to the original! It’s pretty good.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-similar">
<p>The good news is that later in your career, after you’ve seen a lot and accumulated amusing war stories, you have many past projects to compare the current one to. It gets easier. <a href="#fr-similar-1">↩</a></p>
</li>
<li id="fn-metaphor">
<p>Okay, okay, I’ll stop abusing this poor metaphor. But if it’s high-uncertainty and important, it’s probably worth a code spike or a couple of weeks spent on research. <a href="#fr-metaphor-1">↩</a></p>
</li>
<li id="fn-plants">
<p>There’s a Michael Pollan “mostly plants” joke lurking here. <a href="#fr-plants-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>A systems analysis rubric</title>
          <link>https://blog.ceejbot.com/posts/systems-analysis-rubric/</link>
          <pubDate>Sun, 10 Dec 2023 11:30:56 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/systems-analysis-rubric/</guid>
          <description>A very fancy name for &quot;how to write a useful design document&quot;.</description>
          <content:encoded><![CDATA[<p>This is a systems analysis document rubric I’ve written several variations on in recent years. I’ve genericized it a bit and updated it with my current thinking. The form of this document is something a team would have in their official processes library somewhere, as a guide to how to do analysis of a fresh problem. I’ve had this blog post sitting 90% finished for a year now, so hey, here it is!</p>
<p><strong>NB:</strong> I have come to believe that there is no one process that works for every team. The process that makes a team most effective is a process designed for that team, for their current project. Don’t be dogmatic about anything! Think about the true goal, which is to write good software that does what it needs to do, making its users happy while its authors have chill weekends. Take the ideas here and adapt them to what your team needs.</p>
<p>I no longer call this document an RFC, because I think this term comes with the implication of a slow-moving process, which has to solicit a lot of feedback because of its importance. This is perfect when you’re designing the fundamental protocols of the Internet; it is not quite what I find myself wanting my colleagues to do. I am using the term “systems analysis rubric” as I think about this task right now, because systems analysis is where my head is, and what I see missing from a lot of problem-solving.</p>
<p>“Problem statement” might also be a good name for this document, although I think it’s good to explore possible solutions in it as well as problems. Coming to a clear problem statement is possibly the most important task you have when you’re thinking about changing something or making something new.</p>
<h1 id="design-documents-a-systems-analysis-rubric">Design documents: a systems analysis rubric</h1>
<p>A design document is a structured way to have and record a conversation about a problem. It is not appropriate for all problems you might be solving. The formality and length of the conversation depend on the scope and complexity of the problem. For a bug fix, you might need a short conversation with a single colleague, plus commentary in a commit message. For a major project, this process might take weeks to complete and you might write several of these documents.</p>
<p>While the process <em>does</em> produce a document, the document is not the most important result. The important result of the design process is the <em>exploration</em> of the problem that writing the document encourages. The conversation that accompanies the exploration aligns you and your team on an understanding of the problem. Yes, a design doc might describe a proposed solution, but this proposal is secondary to a team’s collective understanding of the problem to be solved.</p>
<p>I’m going to hammer on this point as I go here. The document exists to promote exploration and shared understanding of the problem. The document is a tool in service of a more important goal.</p>
<h2 id="the-widening-conversation">The widening conversation</h2>
<p>My design documents start as notes to myself. I attempt to structure my own thoughts about a problem by writing down what I’m thinking. The stakes are low; the document is so informal that it’s likely nothing more than bulleted lists of things that come to mind. As you go, your writing should tighten up and be more complete, but remember: the document is not the point. Don’t stress about sentence perfection. <sup class="footnote-reference" id="fr-grammar-1"><a href="#fn-grammar">1</a></sup></p>
<p>The audience for the design document changes as it matures. When you are writing your first notes about a problem, you might share them only with a pairing partner to get immediate feedback. As you gain confidence in your understanding of the problem, widen the audience for your document. Seek out feedback from domain experts and from your team as a whole.</p>
<p>Show your design document to its stakeholders in advance of any public discussion, to give them a chance to think and give you feedback. Follow the principle of least surprise. People can react badly to surprises even if they agree with the proposal in the main. If you can, avoid introducing complex technical topics in meetings. Meetings are best used to solidify alignment or discuss specific known open questions.</p>
<p>When you reach the step of sharing your proposal with the entire engineering organization, it will be a solid document that you feel confident about.</p>
<h2 id="the-process-of-exploration">The process of exploration</h2>
<p>Step one: Research.</p>
<ul>
<li>Investigate the background of the problem &amp; document the current solutions, if they exist.</li>
<li>Document why the current solutions are inadequate, if relevant.</li>
<li>Gather relevant product documentation, if it exists. A product requirements document is ideal, and this phase might be focused on collaborating on requirements with a product team.</li>
</ul>
<p>Step two: Write a clear problem statement.</p>
<ul>
<li>What change would you like to effect upon the system?</li>
<li>What is happening today that you’d like to be different after the work you’re considering?</li>
<li>What are the properties of a successful solution? How will you know it’s successful?</li>
<li>Identify constraints on the solution space. Development time? Budget? Performance? A fixed point of integration?</li>
<li>Why is this the right problem to solve now?</li>
<li>What problems are you choosing <em>not</em> to solve right now?</li>
<li>Refine your problem statement until the team aligns on it.</li>
</ul>
<p>Step three: Explore possible solutions.</p>
<ul>
<li>Identify and consider possible solutions.</li>
<li>Discuss tradeoffs inherent in the solutions. Evaluate them against the constraints.</li>
<li>Estimate costs of the solutions, in time / effort / complexity / maintenance / hiring.</li>
<li>If necessary, do spike implementations to test the validity of assumptions or the viability of a specific approach.</li>
</ul>
<p>Step four: Reach consensus on a solution that solves the stated problem while making acceptable tradeoffs.</p>
<p>Sometimes step four does <em>not</em> end in consensus on a solution, but instead ends in a decision to do further research. This is a good result and should not be treated as a negative by the team.</p>
<p>The design document should now be a document describing the problem, the research, and the possible solutions, and conclude with a plan of action. Congratulations! Archive the final version in the corporate wiki or in a docs folder for the resulting project. Its next audience is the person working on its replacement, who you’ve just given a good head start.</p>
<p>Now let’s review the parts again, in more detail.</p>
<h2 id="the-problem-statement">The problem statement</h2>
<p>You’ll start with something you think is a good problem statement, but you will <em>often</em> find that it doesn’t go into enough detail to support a good technical decision. Constraints might be missing. Stakeholders might disagree on what success looks like. Important implicit requirements might need to be unearthed.</p>
<p>The initial problem statement informs your research, but expect to change it. Push on it and iterate until you have something the team agrees on.</p>
<p>Among the constraints you implicitly take on for any project are your team’s <em>shared values</em>. If your team hasn’t discussed those values, now is a good time to do so. Your shared values are partly a reflection of your team’s personality and culture, and partly a reflection of where your business is. A team at a new startup trying to ship something quickly for survival might value a minimal solution that can be produced rapidly. The same team following up after a successful first ship might value flexibility instead. Make the implicit explicit and state any values that might affect this project.</p>
<h2 id="detail-on-the-research-step">Detail on the research step</h2>
<p>Do not short-change this step! This is critical to understanding the problem. Do the background research if there is extant code. Summarize that research, with relevant links, so your readers can also understand the context.</p>
<p>Answer scaling questions if they’re relevant. Gather numbers for today, a year from now, and as far in advance as a reasonable guess can be made. Does your solution to the problem have a lifespan? Don’t look beyond that lifespan if so.</p>
<p>For data being stored and manipulated, you might ask questions like these:</p>
<ul>
<li>How much data is being discussed? Is it large in total size or in quantity?</li>
<li>What actions are taken on this data? How often does it change? In what quantity?</li>
<li>Who is changing this data?</li>
<li>What are the constraints on data changes? Are there any conflict resolution requirements? Do operations need to be serializable (expensive) or will idempotency suffice (cheap)?</li>
<li>What happens if data mutations are lost?</li>
<li>How is this data expected to grow over time? Is it shardable if massive growth is expected?</li>
<li>If the data is very very large, the questions become more specialized. If you are not a data engineer, you might want to consult one.</li>
</ul>
<p>For APIs, the questions might look like this:</p>
<ul>
<li>What other systems are expected to call this API? To do what tasks?</li>
<li>What are the latency requirements?</li>
<li>Is this operation write heavy or read heavy?</li>
<li>How many requests/sec do we experience at peak? How will this number change over time in relation to business growth?</li>
<li>Does peak load differ from steady state load? When is the load heaviest? Does this correlate with other usage patterns in the system?</li>
<li>If you’re caching expensive work product, identify how you’ll be invalidating that cache. What fails if the cache is stale? (Do you really need a cache? Really?)</li>
</ul>
<p>Failure analysis is next. This topic can be where engineers shine, because we love discussing how things fall over.</p>
<ul>
<li>How might this system fail?</li>
<li>What are the consequences of failure for this system?</li>
<li>Should any of these failures be visible to or actionable by the end-user? If so, how should they be presented?</li>
<li>How should we handle the most important or unusual <em>invisible-to-users</em> errors? Retry? Escalate to human beings? Log and move on?</li>
</ul>
<p>What are the security concerns? Do a threat modeling exercise with security experts early, particularly if you’re doing something new or not handled by existing tools.</p>
<ul>
<li>Are you accepting untrusted user input? How do you need to handle it?</li>
<li>Who is allowed to perform these operations or see this data?</li>
<li>Are you managing data that needs to be protected or encrypted?</li>
<li>What would an attacker gain if one got access to your data or your API?</li>
<li>What would a person with bad motives do if they had normal access to this new functionality?</li>
</ul>
<p>The appropriate questions to ask depend on what your area of work is and what “affair of the world” it addresses. These questions are intended to get you started.</p>
<h2 id="problem-statement-slight-return">Problem statement (slight return)</h2>
<p>Come back to your initial problem statement. Can you sharpen it? Can you clearly define what a successful solution might look like now? If you’ve done the research, you probably can.</p>
<p>Don’t move forward until you have consensus that the problem statement is good.</p>
<h2 id="solutioneering">Solutioneering</h2>
<p>This is where programmers love to be. We are problem-solvers and we want to jump right to solving problems, especially if we can write code to do it. <em>Resist this urge.</em> Your solutions have a better chance of success if they are informed by a solid grasp of the problem you need to solve. Your second and third refinements of a solution are likely to be better than your first.</p>
<p>This step is often focused on navigating tradeoffs. The problem statement, if it’s sharp enough, gives you a good <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Philosophical_razor">razor</a> to use to evaluate solutions against your success criteria.</p>
<p>What are the costs of a possible solution? How complex is it?</p>
<p>Give the solution a t-shirt size. Does it match the time budget the project has?</p>
<p>What are the risks in the solution? How might it fail to solve the problem or otherwise fail as a project?</p>
<p>What’s the solution’s blast radius? That is, how many other systems would be affected by the work? How many teams?</p>
<p>Does the solution introduce new technologies to the overall system, or does it leverage tools your team understands well? If it spends novelty points, do they buy you something worth the expense?</p>
<p>Does the solution align with the team’s values?</p>
<p>In many cases the right solution will feel good to the team discussing it, and you’ll reach consensus smoothly. When information, values, and understanding of the problem are shared, alignment is easy. If consensus is not happening, make an attempt to figure out why the team is not aligned. Is there a disagreement about values? An information disparity? Is more research needed? Bring in senior staff to help break stalemates. Bring in somebody from another team who has relevant experience. Remember that the project might need to move forward anyway because of business needs, and a half-good solution might be better than no solution in the short term.</p>
<h2 id="the-document-s-final-home">The document’s final home</h2>
<p>I end up making a <code>design</code> or <code>docs</code> subfolder in the code repo for these documents. Your organization might have an official home for documents that isn’t the repo. I suggest that you at least store a copy next to the code, where it will survive as long as the code does. The document will drift out of sync with reality the instant anybody starts implementing the plan, but that is fine. The document exists to help future maintainers understand what their predecessors were thinking at the time.</p>
<p>Remember: the act of writing the document is more important than the document. The sharp problem statement and shared understanding of the solution were the goals of the exercise. If it got you there, it was good enough.</p>
<h2 id="additional-reading">Additional reading</h2>
<ul>
<li>The <a rel="noopener external" target="_blank" href="https://github.com/rust-lang/rfcs">Rust RFC process</a> discusses the importance of the conversations.</li>
<li><a rel="noopener external" target="_blank" href="https://adr.github.io">Architectural Decision Records</a></li>
<li><a rel="noopener external" target="_blank" href="https://philcalcado.com/2018/11/19/a_structured_rfc_process.html">A Structured RFC Process</a> by Phil Calçado talks about the benefits of widening the circle of review.</li>
</ul>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-grammar">
<p>Correctness in these details can make an unconscious impression on readers that matters, so if you have the time, hey, spell-check yourself. The opposite side of this is that you as a reader of design documents need to set aside your own fussiness about spelling and grammar, should you have any, especially if the author is not a native speaker of the language they’re writing in. These things are to the side of the problem. <a href="#fr-grammar-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Multi-factor panacea</title>
          <link>https://blog.ceejbot.com/posts/multi-factor-panacea/</link>
          <pubDate>Mon, 10 Oct 2022 10:00:00 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/multi-factor-panacea/</guid>
          <description>Multi-factor auth does not save you from having to vet your dependencies.</description>
          <content:encoded><![CDATA[<p>Context: <a rel="noopener external" target="_blank" href="https://twitter.com/substack">@substack</a> <sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">1</a></sup> deleted his GitHub account, which includes a lot of foundational source code from the early days of node. The speculation (and it is only speculation as far as I know, though with some foundation in his tweets) is that he did so because of the MFA requirements being imposed on some NPM package maintainers.</p>
<p>Here’s my take. It’s not very hot and is probably marginally more informed than many, but it’s also probably worth what you paid for it. I started writing it as a series of tweets, which is why there are some extremely terse phrasings here.</p>
<h2 id="proxies">Proxies</h2>
<p>Okay, I’ll weigh in on this one, because I have spent time thinking about it, and because <a rel="noopener external" target="_blank" href="https://twitter.com/isntitvacant">@isntitvacant</a>, <a rel="noopener external" target="_blank" href="https://twitter.com/i_a_r_n_a">@i_a_r_n_a</a>, and I made MFA happen for NPM originally.</p>
<p>Companies freeloading off of open source are worried about intentional security compromises in the software they’re benefitting from. Let’s walk through their threat model: Somebody gets access to the account of somebody who works on a package that company X uses and uses that access to publish a deliberate compromise. The update gets taken automatically by the downstream consumer, and then they are shipping their environment variables out to a third party, or running a cryptocurrency miner, or allowing an attacker to get shell access.</p>
<p>Does forcing maintainers of “important” packages to enable MFA and never turn it off help protect companies from this threat?</p>
<p>Kinda. Restrictions on package authors protect against one category of supply chain attacks. They protect against account hijacking. The state of the world used to be that some NPM users with critical spots in the dependency graph had passwords like “password”. No, I’m not joking. Taking away that easy attack vector seems helpful, so I’m glad we shipped what we did when we did. One good design choice we made was to <em>not</em> implement MFA via SMS, which would have been no protection against these threats, because social engineering makes SMS insecure.</p>
<p>Requiring that a maintainer enable MFA for their account does not, however, protect the source you use from all supply-chain attacks. Legit maintainers have been responsible for some of the worst. Case in point: the infamous left-pad deletion was done by the package maintainer and MFA would not have helped one bit.</p>
<p>MFA is <em>still a good thing to do</em>, but it’s not protection against what the companies freeloading on open source maintainers are worried about.</p>
<p>Why not? Because the account level is only a proxy for the level you care about. You’re far more interested in audit trails that are at the package level and then at the source level. What changed in this release? Did maintainers change? <em>What source changed?</em></p>
<h2 id="an-historical-aside">An historical aside</h2>
<p><a rel="noopener external" target="_blank" href="https://twitter.com/isntitvacant">@isntitvacant</a> did think about MFA from the package perspective when he designed the back end support for this! He also thought about restricting tokens to CIDR ranges, though I don’t know if that has ever been exposed. We missed the chance to go even finer-grained on access token permissions than we did. And we definitely should have done audit logs on package ownership changes.</p>
<p>My only excuse is that at the time it felt like a triumph to be able to get the feature shipped at all.</p>
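<p>For the curious, the CIDR idea is simple to sketch: a token carries a list of network ranges, and a request using that token is honored only if it comes from an address inside one of them. This is an illustration using Python’s standard <code>ipaddress</code> module, not the registry’s actual code; the function name and the addresses are hypothetical.</p>
<pre><code class="language-python">import ipaddress


def token_allowed_from(client_ip, allowed_cidrs):
    """Return True if client_ip falls inside any of the token's allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(cidr) for cidr in allowed_cidrs)


# Hypothetical token that may only be used from a CI network and one office address.
allowed = ["10.20.0.0/16", "203.0.113.7/32"]

print(token_allowed_from("10.20.31.4", allowed))    # True
print(token_allowed_from("203.0.113.7", allowed))   # True
print(token_allowed_from("198.51.100.9", allowed))  # False
</code></pre>
<p>A stolen token restricted this way is useless from the attacker’s own network, which narrows the account-hijacking threat without asking the maintainer to do anything at publish time.</p>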
<p>A side comment about the mess we’re in: NPM was designed to maximize engagement from publishers, not to be a good package manager at the scale that it reached. It was designed deliberately to be viral, not to be secure or auditable.</p>
<p>For example: The default is to take updates without thinking about them, to the point where bots do all that work of dependency updating for you. Downloads number gotta go up.</p>
<p>For example: The tarball as unit of deploy: huge, contains weird stuff that you don’t care about plus whatever silly things the package maintainer put in, hides the deltas. But it was very easy to implement and good enough.<sup class="footnote-reference" id="fr-2-1"><a href="#fn-2">2</a></sup></p>
<p>NPM pushes you into not thinking about your software supply chain, through design decisions made when winning a war among competing node package managers was important to somebody.</p>
<p>Stop being pushed. Stop taking updates by default. Think about your supply chain differently.</p>
<h2 id="problems-not-proxies">Problems not proxies</h2>
<p>What <em>are</em> you interested in when thinking about this threat? The source itself. The software you’re relying on.</p>
<p>Inspectable audit trails for changes are far more interesting, and this is <em>not something requiring MFA for package maintainers gets you</em>. Looking at maintainers is looking at a proxy for the threat, not the threat itself.</p>
<p>Protecting against the proxy does not give you a free pass on looking at the source you’re depending on and deciding it’s okay. It does not give you a free pass to take every update there is without thinking.</p>
<p>Questions it helps to know the answers to:</p>
<ul>
<li>Who published this?</li>
<li>How do you know they were that person? (Same as controller of repo? Controller of other accounts? What’s the web of identity?)</li>
<li>What was the chain of control of the source?</li>
<li>Is the source that was published the same as the source in the advertised repo?</li>
<li>What was the source delta from the last publication?</li>
<li>What does the source do?</li>
</ul>
<p>And given the possibilities of bugs <em>and</em> of some maintainer with bad goals playing the long game, only the questions about the source are on target. The rest are proxies.</p>
<p>The tech industry relies on software they do not take the time to inspect, written by strangers they mostly choose not to pay. Sometimes the industry pays people to work on very critical projects, such as Linux itself! But the web dev world rarely stops to pay the people who were around the node scene at the beginning, writing tiny modules because that was their philosophy, which then got bricked together without their participation into the foundations of modern web development.</p>
<p>Because tech industry companies still don’t want to pay for the work they build on top of–with either their time or their money–they impose requirements on those strangers to attempt to protect themselves from a proxy for the threat, with zero cost to themselves. Those strangers have every right not to participate; it wasn’t what they signed up for back then. Any access to their work you had was a gift.</p>
<p>tl;dr Use Feross’s <a rel="noopener external" target="_blank" href="https://socket.dev">Socket</a> to scan the source itself; you won’t catch them all; pay people to write any software you truly rely on.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-1">
<p>substack the good human, not substack the objectionable paid newsletter company. <a href="#fr-1-1">↩</a></p>
</li>
<li id="fn-2">
<p>The tarball is a case of worse is better. And yes, that’s a complex statement itself. It was good enough and easy enough to implement that it satisfied the true requirements of the problem in the moment. Knowing the problem space as well as I do now, and in the current package manager landscape, I would design it quite differently were I to take on the project myself, today. <a href="#fr-2-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Goodbye Cloudflare; hello Fastly!</title>
          <link>https://blog.ceejbot.com/posts/goodbye-cloudflare/</link>
          <pubDate>Sat, 27 Aug 2022 16:53:01 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/goodbye-cloudflare/</guid>
          <description>How I moved my blog from Cloudflare to Fastly, and why.</description>
          <content:encoded><![CDATA[<p>KiwiFarms is a harassment website, sort of like a terrorism-only variation on the *chan sites. It specializes in harassing trans people. It doxxes them, SWATs them and their families, and does its best to drive its victims off the internet. It also has a bodycount. They are a <a rel="noopener external" target="_blank" href="https://twitter.com/lizthegrey/status/1563380697922162689?s=20&amp;t=FaD73GtQbbBNEoIEsdPKjg">troll farm</a>.</p>
<p>KiwiFarms gets to do this and stay on the internet because they’re being <a rel="noopener external" target="_blank" href="https://time.com/6208828/cloudflare-misinformation-internet-research/">protected by Cloudflare</a>. Cloudflare has a long history of protecting incredibly vile content: they were recently infamous for hosting <a rel="noopener external" target="_blank" href="https://www.wired.com/story/cloudflare-daily-stormer/">Daily Stormer</a>.</p>
<p>Cloudflare is exceptional in its position. From the Time article:</p>
<blockquote>
<p>“We find anecdotally that sites prefer Cloudflare because of its lax acceptable use policies and its free DDoS protection services that help protect against vigilante attacks,” the researchers write. They note that AmmoLand, a popular guns rights blog, has praised the company “for its self-described ‘content-neutral’ stance.”</p>
</blockquote>
<p>Cloudflare takes a freezepeach position on free speech: they do not acknowledge the reality that in order to protect the free speech of the many, we cannot tolerate the abusive behavior of the few. Cloudflare protects the abusers instead.</p>
<p>Liz Fong-Jones has been leading the current pressure campaign against Cloudflare most effectively.</p>
<h2 id="why-this-matters-to-me">Why this matters to me</h2>
<p>When I set up my blog, I hosted it in an S3 bucket behind Cloudflare, using their free plan because I have very simple needs for it. I do not want to lend them even that little support, so today I moved my blog to Fastly.</p>
<p>I moved my last two employers to Cloudflare from other CDNs. I won’t be repeating that mistake until they shape up and start removing Nazis and troll sites <em>without</em> needing pressure campaigns to move them. I treasure my friends and it is unacceptable to me that some of them go through their lives afraid for their personal safety because of sites like KiwiFarms.</p>
<p>You might decide that freeloading off of Cloudflare is fine, because you’re siphoning resources from them. You might also be unable to pay for another CDN. Only you know your circumstances. I have the disposable income to spend a little more than I spend now on my AWS hosting bill on a CDN provider who doesn’t have to be pressured over and over again to boot sites like Daily Stormer and KiwiFarms.</p>
<p>Here is the very short version of how I did it:</p>
<ul>
<li>set everything up in Fastly</li>
<li>tell Fastly about my certs</li>
<li>verify that their test URL worked</li>
<li>duplicate all my DNS setup in Route 53</li>
<li>cut over name servers with my registrar to Route 53</li>
</ul>
<p>The rest of this blog post goes into what I did in more detail, in the hope that I can reassure you it’s very do-able.</p>
<h2 id="by-hand-in-more-detail">By hand, in more detail</h2>
<ol>
<li>Create a <a rel="noopener external" target="_blank" href="https://www.fastly.com/">Fastly account</a>. (Set up 2FA!)</li>
<li>Scan through Fastly’s <a rel="noopener external" target="_blank" href="https://docs.fastly.com/en/guides/start-here">getting started guide</a>. The concepts here are different from Cloudflare’s concepts. Fastly is (oversimplifying a bit) a nice front end to <a rel="noopener external" target="_blank" href="https://www.varnish-software.com/developers/tutorials/varnish-configuration-language-vcl/">Varnish &amp; VCL</a> plus a lot of POPs around the world to reduce latency to your users. VCL can do a lot. You end up with a lot more control over how things get routed, but the cost is more complexity to cope with.</li>
<li>Give Fastly a credit card so you can enable TLS.</li>
</ol>
<p>Now let’s do the switch:</p>
<ol start="4">
<li>Find a new home for all of your DNS records. I used <a rel="noopener external" target="_blank" href="https://aws.amazon.com/route53/">AWS’s Route 53</a> as my name server because I am very comfortable with it. Your domain registrar might provide name service; all the major cloud providers also do.</li>
<li>Duplicate all of the DNS you’ve set up in Cloudflare over in your new DNS provider. Your goal is to avoid downtime when you cut over from Cloudflare to your new nameservers.</li>
<li>Now set up <em>a delivery service</em> in Fastly. This is a backend – the place the data comes from – plus a domain that is the face of the service – the hostname people type into their browsers. You’re setting up a mapping from domain to data source. For me, the back end is the AWS S3 bucket that holds my blog assets, and the domain name is what you see in your browser right now.</li>
<li>Make the service active. Fastly now gives you a test domain name, like <code>blog.ceejbot.com.global.prod.fastly.net</code>, to verify that your content is available as you expect.</li>
</ol>
<p>Now the slowest part: do something about your TLS certs. Because I am old-fashioned and I haven’t automated all this yet, I buy certs from my name registrar. You can also use AWS ACM, which automates things pretty well. Fastly will help you set up <a rel="noopener external" target="_blank" href="https://letsencrypt.org">Let’s Encrypt</a>, which is probably the best option for most people.</p>
<p>Once Fastly is aware of your cert material somehow, you are ready to cut over. You can do this in two phases. The first phase is a double-CDN phase:</p>
<ol start="8">
<li>Update your domain’s DNS record in Cloudflare to point to the TLS hostname Fastly gave you when you set up TLS.</li>
<li>Turn off proxying in Cloudflare. Make the orange cloud gray.</li>
</ol>
<p>Now your content should be served by Fastly instead of Cloudflare. You should see the headers change to something like this:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="shellsession"><span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">$</span><span> http HEAD https://blog.ceejbot.com</span></span>
<span class="giallo-l"><span>HTTP/1.1 200 OK</span></span>
<span class="giallo-l"><span>Accept-Ranges: bytes</span></span>
<span class="giallo-l"><span>Age: 518</span></span>
<span class="giallo-l"><span>Connection: keep-alive</span></span>
<span class="giallo-l"><span>Content-Length: 7149</span></span>
<span class="giallo-l"><span>Content-Type: text/html</span></span>
<span class="giallo-l"><span>Date: Sat, 27 Aug 2022 23:22:43 GMT</span></span>
<span class="giallo-l"><span>ETag: &quot;464e0930f8616e07530366dfa7ba0567&quot;</span></span>
<span class="giallo-l"><span>Last-Modified: Sat, 23 Jul 2022 20:01:40 GMT</span></span>
<span class="giallo-l"><span>Server: AmazonS3</span></span>
<span class="giallo-l"><span>Via: 1.1 varnish</span></span>
<span class="giallo-l"><span>X-Cache: HIT</span></span>
<span class="giallo-l"><span>X-Cache-Hits: 1</span></span>
<span class="giallo-l"><span>X-Served-By: cache-pao17472-PAO</span></span>
<span class="giallo-l"><span>X-Timer: S1661642564.606369,VS0,VE35</span></span>
<span class="giallo-l"><span>x-amz-id-2: E2eXc0YfBq4rX2rGhwOWZMbU26NYxzGcaAzlQ7+E/zHhcp19RIpct8WwFIaDQEy6TWuhluNf1ng=</span></span>
<span class="giallo-l"><span>x-amz-meta-md5chksum: 464e0930f8616e07530366dfa7ba0567</span></span>
<span class="giallo-l"><span>x-amz-request-id: 9V54SNWKBG95XYY3</span></span></code></pre>
<p>Varnish is serving my content! It’s working! Now you can safely take the last step: switch your domain’s name servers over to something other than Cloudflare. It might take a day or so for the global cache of caches that is DNS to update itself. If all went well, you switched without downtime!</p>
<h2 id="automation">Automation</h2>
<p>I did not use the websites to do this: I used <a rel="noopener external" target="_blank" href="https://www.terraform.io">Terraform</a> because I automate all of my personal infrastructure. It’s good practice, I tell myself. To use terraform you need to make an API key on the Fastly dashboard. Save it in your favorite password manager, then export it in the environment variable <code>FASTLY_API_KEY</code>.</p>
<p>Set up the official terraform provider. Here’s my <code>providers.tf</code> file:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="hcl"><span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">terraform</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">  required_providers</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	aws</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	  source</span><span style="color: light-dark(#427B58, #8EC07C);">  =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">hashicorp/aws</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	  version</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">~&gt; 4.0</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">	}</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	fastly</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	  source</span><span style="color: light-dark(#427B58, #8EC07C);">  =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">fastly/fastly</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	  version</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">&gt;= 2.2.1</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">	}</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">  }</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
<p>Here’s the important part of my blog service in Terraform:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="hcl"><span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">resource</span><span style="color: light-dark(#076678, #83A598);"> &quot;fastly_service_vcl&quot;</span><span style="color: light-dark(#076678, #83A598);"> &quot;blog&quot;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  name</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">blog.ceejbot.com</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  activate</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#8F3F71, #D3869B);"> true</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">  domain</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	name</span><span style="color: light-dark(#427B58, #8EC07C);">    =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">blog.ceejbot.com</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	comment</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">the blog</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">  }</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">  backend</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	address</span><span style="color: light-dark(#427B58, #8EC07C);">       =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">blog.ceejbot.com.s3-website-us-west-2.amazonaws.com</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	name</span><span style="color: light-dark(#427B58, #8EC07C);">          =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">the s3 bucket</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	port</span><span style="color: light-dark(#427B58, #8EC07C);">          =</span><span style="color: light-dark(#8F3F71, #D3869B);"> 80</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">	shield</span><span style="color: light-dark(#427B58, #8EC07C);">        =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">pdx-or-us</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">  }</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
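<p>The service above says nothing about TLS. For the Let’s Encrypt route mentioned earlier, a minimal sketch might look like the following. It is an illustration under assumptions, not the configuration behind this blog: it uses the Fastly provider’s <code>fastly_tls_subscription</code> resource, and the resource label <code>blog</code> is chosen only to match the service. Fastly still has to verify that you control the domain before it issues the cert. The provider documents DNS challenge records and a <code>fastly_tls_subscription_validation</code> resource for that step, which this sketch leaves out.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="hcl"><span class="giallo-l"><span># Sketch only: the resource label is an illustrative choice.</span></span>
<span class="giallo-l"><span>resource &quot;fastly_tls_subscription&quot; &quot;blog&quot; {</span></span>
<span class="giallo-l"><span>  # Ask for a certificate covering every domain attached to the service.</span></span>
<span class="giallo-l"><span>  domains               = [for domain in fastly_service_vcl.blog.domain : domain.name]</span></span>
<span class="giallo-l"><span>  certificate_authority = &quot;lets-encrypt&quot;</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>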
<p>This terraform fragment sets up DNS so Fastly handles requests to my content:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="hcl"><span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">resource</span><span style="color: light-dark(#076678, #83A598);"> &quot;aws_route53_zone&quot;</span><span style="color: light-dark(#076678, #83A598);"> &quot;ceejbot-com&quot;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  name</span><span style="color: light-dark(#427B58, #8EC07C);">         =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">ceejbot.com</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">resource</span><span style="color: light-dark(#076678, #83A598);"> &quot;aws_route53_record&quot;</span><span style="color: light-dark(#076678, #83A598);"> &quot;blog&quot;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  zone_id</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span> aws_route53_zone</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#076678, #83A598);">ceejbot-com</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#076678, #83A598);">zone_id</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  name</span><span style="color: light-dark(#427B58, #8EC07C);">    =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">blog.ceejbot.com</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  type</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">CNAME</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  records</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#7C6F64, #A89984);"> [</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">	&quot;</span><span style="color: light-dark(#79740E, #B8BB26);">n.sni.global.fastly.net</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">  ]</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  ttl</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#8F3F71, #D3869B);"> 3600</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
<p>This isn’t all of the terraform in my setup. I also have a policy on the S3 bucket restricting access to it to <a rel="noopener external" target="_blank" href="https://developer.fastly.com/reference/api/utils/public-ip-list/">Fastly’s public IP list</a>. That’s a reasonable practice to prevent an accidental gigantic AWS egress cost.</p>
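<p>A policy like that could be sketched as follows. This is an illustration under assumptions rather than the exact policy used for this blog: the bucket name is read off the S3 website endpoint shown earlier, the resource labels are made up, and the address list comes from the Fastly provider’s <code>fastly_ip_ranges</code> data source instead of a hand-maintained copy of the public list.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="hcl"><span class="giallo-l"><span># Sketch only: bucket name and resource labels are assumptions.</span></span>
<span class="giallo-l"><span>data &quot;fastly_ip_ranges&quot; &quot;fastly&quot; {}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>data &quot;aws_iam_policy_document&quot; &quot;blog_fastly_only&quot; {</span></span>
<span class="giallo-l"><span>  statement {</span></span>
<span class="giallo-l"><span>    sid       = &quot;AllowReadFromFastlyOnly&quot;</span></span>
<span class="giallo-l"><span>    effect    = &quot;Allow&quot;</span></span>
<span class="giallo-l"><span>    actions   = [&quot;s3:GetObject&quot;]</span></span>
<span class="giallo-l"><span>    resources = [&quot;arn:aws:s3:::blog.ceejbot.com/*&quot;]</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>    # Anonymous reads, which an S3 website endpoint requires...</span></span>
<span class="giallo-l"><span>    principals {</span></span>
<span class="giallo-l"><span>      type        = &quot;*&quot;</span></span>
<span class="giallo-l"><span>      identifiers = [&quot;*&quot;]</span></span>
<span class="giallo-l"><span>    }</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>    # ...but only when the request arrives from a Fastly IPv4 range.</span></span>
<span class="giallo-l"><span>    condition {</span></span>
<span class="giallo-l"><span>      test     = &quot;IpAddress&quot;</span></span>
<span class="giallo-l"><span>      variable = &quot;aws:SourceIp&quot;</span></span>
<span class="giallo-l"><span>      values   = data.fastly_ip_ranges.fastly.cidr_blocks</span></span>
<span class="giallo-l"><span>    }</span></span>
<span class="giallo-l"><span>  }</span></span>
<span class="giallo-l"><span>}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span>resource &quot;aws_s3_bucket_policy&quot; &quot;blog&quot; {</span></span>
<span class="giallo-l"><span>  bucket = &quot;blog.ceejbot.com&quot;</span></span>
<span class="giallo-l"><span>  policy = data.aws_iam_policy_document.blog_fastly_only.json</span></span>
<span class="giallo-l"><span>}</span></span></code></pre>
<p>Pulling the ranges from the data source means a later <code>terraform apply</code> picks up changes to Fastly’s list, at the cost of needing to re-run it when the list changes.</p>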
<p>You can do a lot more with VCL and Varnish if you feel inclined. For a while through the mid teens, all of NPM’s registry traffic was proxied through Fastly, with a carefully maintained custom varnish file routing things as much as possible at the edge. Most of us won’t need that power, but it’s available if you do.</p>
]]></content:encoded>
      </item>
      <item>
          <title>Reduce Friction</title>
          <link>https://blog.ceejbot.com/posts/reduce-friction/</link>
          <pubDate>Sat, 23 Jul 2022 12:54:39 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/reduce-friction/</guid>
          <description>Why a relentless focus on reducing developer friction pays off in team productivity, and some ways to do this.</description>
          <content:encoded><![CDATA[<p>The topic of reducing friction exhausts me: Do people still need to be persuaded to help their developers go faster? Really? In this, the year 2022? But yes, in this, the year 2022, many teams require persuasion on this topic. Or rather, their leaders require persuasion that they have to do more than give lip service to this principle, and that they must invest resources in making it so, and that those resources will not be “wasted” resources, not even for <em>that</em> person, you know the one, the official VP of Feature Factory.</p>
<p>Some leaders are not worried about wasting time, but are instead worried that devoting brains to this work will <em>slow teams down</em>. They admit that current processes are full of friction, but claim that they have to finish whatever they’re in the middle of before they should try to fix things. They think that reducing friction is a distraction from the <em>real</em> work. This approach is short-sighted. The best time to reduce friction for your team was the moment it came into being, and the second best time is now.</p>
<p>I’m going to cover three topics in this post. First, I’ll define what we mean by “developer friction”. Then I’ll make the case about why reducing friction is beneficial to engineering organizations, including benefits in areas I didn’t expect. And then I’ll go into concrete suggestions about how to do it, and the mindset that you need to bring to thinking about it. As is true with many other posts in this blog series, its audience is people who are technical leaders in their organization, but I hope anybody who wants to help their engineering org do better work can get something out of this.</p>
<h2 id="defining-our-terms">Defining our terms</h2>
<p>Let’s start by defining “process”. Process is <em>the way you habitually do things</em>. Do not confuse process with ceremony or formality, or any other term you’d like to use to describe overhead added to the core of the thing you want to get done. <em>You always have process.</em> You might not have thoughtfully-designed, intentional process.</p>
<p>“Ceremony” is a thing you do every time, ritualistically, usually involving other people. Regular meetings are a kind of ceremony. “Formality” refers to how prescribed and enforced a process is. When people react to “process” as a bad thing, they’re usually thinking of processes with heavy formality or more ceremony than they’re worth.</p>
<p>An example of a team process: “We prefer to have code PRs reviewed before we land them in main. It’s okay if docs or other non-functional changes don’t get reviewed and go directly into main.”</p>
<p>Adding ceremony: “All changes need to go through PRs, though we don’t require review.”</p>
<p>Adding more ceremony: “All changes must go through PRs with review, but we are okay if reviews are a rubber stamp.”</p>
<p>Adding formality: “We require that all PRs be reviewed &amp; all CI tests pass before they can land in main, and we enforce this with settings in our source code repo that only administrators can change.”</p>
<p>Here’s a non-tech example of ceremony that might help you recognize it: <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Pointing_and_calling">pointing and calling</a>. This is a ceremony that helps operators of dangerous equipment (most often trains) confirm to each other what the status of important indicators is. Station guards will point at an indicator showing which side of the train to open the doors on, and call out as they do so, making sure the train conductor knows which set of doors to open. Adding a ceremony to the process helps the operators avoid opening the wrong set of doors. Another example of this would be <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Lockout%E2%80%93tagout">lockout-tagout</a>. This formal ceremony ensures that people know when dangerous equipment is deactivated and can be worked on safely.</p>
<p>Let’s talk about “friction”, the main thing this post is worried about. Each of the examples above adds <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Friction">friction</a> to a process. “Friction” is a useful metaphor here because each of those examples <em>opposes motion</em>: each demands more energy be invested in moving the project than would be required if it weren’t there. This might be a good idea! Lockout-tagout makes equipment safer to maintain. The lowest possible friction version of the PR example above is “we don’t care if code gets reviewed; merge right into that production branch.” You can see why adding friction in requiring PRs might be good for that team.</p>
<p><em>Adding friction is just fine when it buys you something worthwhile.</em></p>
<p>Teams with high levels of trust don’t need more than that first version of the PR process. Teams that don’t trust each other–or are perhaps required not to trust each other because of mandated security processes–need something more like the fully-formal version. A team that needs that fully-formal version will move more slowly than the first team. Is this worth the cost? It depends on the situation! Your goal is to identify your team’s work habits and work environment and identify things that are slowing everybody down <em>without buying you something worthwhile</em>.</p>
<p>Sometimes process is… well, ludicrous and obviously causing harm. This Twitter thread is full of pure, wasteful friction. Merely reading it raises my stress levels.</p>
<blockquote>
<p>Let’s share tech stack horror stories: what’s the worst workflow or most absurd limitation you’ve hit with a codebase?</p>
</blockquote>
<blockquote>
<p>I’ll start: while working as a subcontractor, I wasn’t able to submit code directly for review. I had to attach the updated files to an email. 🥲</p>
</blockquote>
<blockquote>
<p>What’s yours?</p>
</blockquote>
<blockquote>
<p>— Jason Lengstorf (@jlengstorf) July 21, 2022</p>
</blockquote>
<p>Process isn’t the only source of optional friction, and it might not be the most painful source. Instead, the work environment is often the worst source. The tools. The platform. CI workflows. Automation or, more likely, the <em>absence</em> of automation. Things that break and require human intervention. Buggy tools. Slow tools. Things people need to do often that are flaky. Builds that take forever and slow down develop-test loops. Continuous integration testing that takes a long time to run and slows down landing all work. Slow deploy processes that make the cost of pushing changes live high, and therefore make pushing changes dangerous.</p>
<p>The other term we need to define is “toil”. The English word means “labor that tires you out”. In the context of tech world jargon, we use it to mean work that’s draining or time-consuming that doesn’t seem to be related to the core of what we need to get done. Repeated work. Predictable routine work. A process that is predictable and time-consuming but has to be done by hand is <em>toil</em>. Resolving Dependabot PRs to your repos is <em>toil</em>: it feels like work but accomplishes nothing worthwhile.</p>
<p>You shouldn’t tolerate either toil or tools misery. They are entirely avoidable, and they’re killing your team’s velocity and making everybody unhappy. Take stock of problems in this category, prioritize them, and eliminate them.</p>
<h2 id="making-the-case">Making the case</h2>
<p>You might think it would be easy to point to these sources of slow-down and say, “let’s fix things”. In practice, you might get pushback. Why? What can we, as technical leaders, do about the resistance to making things better?</p>
<p>First we must acknowledge that changing any system is difficult: systems are self-reinforcing for many reasons. People within the system see the <em>cost</em> of change clearly, but they often don’t have good ways to measure the <em>rewards</em> of change. Also (and let’s be honest here) all of us have lived through having change promoted to us as unalloyed good, then seen it turn out to be not so great. Or actively awful. People proposing change have a higher bar to jump over than people who want the status quo. So if you want change to happen, you have to invest energy yourself. You’ll need to make the case for action.</p>
<p>Why hasn’t anyone else made the case? Why is your team stuck here? Good questions! Remember that the people next to you in this situation probably hate the friction just as much as you do. If they could stop it, they would. Once again, we have to go to the system they’re in and look at what it reinforces. You, as an analyst of that system, have an easier time popping out of it and changing it.</p>
<p>Let’s look at some reasons why people around you might resist the push to make things go faster.</p>
<h3 id="it-didn-t-happen-overnight">It didn’t happen overnight</h3>
<p>The team might be unaware of how bad the problem truly is. They might not have noticed it was happening, because it probably didn’t get bad all at once; the slowdowns and the trouble got worse slowly over time.</p>
<p>To show how bad it is and break people out of denial, you might go to the data. How costly is the friction? Measure it! Count the number of times tool <em>X</em> explodes and the team wastes a day on cleanup. Graph how much time people spend waiting for slow builds. The data will help you prioritize, so it is not a waste. (I think gathering metrics on internal tools is a good habit for teams even when everybody’s happy.)</p>
<h3 id="ownership">Ownership</h3>
<p>The resistance to change might come from a far more human and emotional place. People might be attached to the things they built in the past, and reluctant to retire them. Don’t be a jerk about the software past versions of the team wrote. People do the best they can given the circumstances they’re in. Solutions that solved the problems of the past might no longer be good at solving the problems of the present. Honor the work done earlier, and let people feel good about it even as you’re coaxing them into replacing it. If you can, <em>let them own the work</em> of making their thing better. If that’s not possible, at least seek out their feedback and ask them what they’d do differently this time around. They probably have good ideas.</p>
<p>Sometimes people will block whatever work happens. They might want to retain control. They might be unable to admit they were wrong about something. The worst case I’ve seen was somebody who simply resented all authority telling them what to do about anything. Toxic orgs probably feature several people like that. Do I have to tell you what to do here? You don’t want to do it, because you’re a human being with empathy, but sometimes you have to fire people.</p>
<h3 id="stress">Stress</h3>
<p>Organizations with a lot of friction might have people stressed by the work of pushing things forward despite the friction. Your most dedicated and motivated colleagues might be working the hardest to do this, and suffering the worst stress as a result. Stressed people can’t imagine adding to their workload by revamping existing systems that work, however poorly. They will resist change to protect themselves from their burdens getting worse.</p>
<p>This is an own-goal on the part of the organization. Leaders can prevent this, and indeed must. Stressed people don’t do their best work. Full stop.</p>
<p>Stressed people need to have their immediate needs honored and work shifted away from them. You must not listen to their opinions about what can and cannot happen until you’ve fixed their immediate emergency. Indeed, removing friction might give them the space to imagine a better world.</p>
<p>Don’t ask them to do the work of fixing their desperate situation. Fix it for them. This one’s on management, and maybe on you, o fellow technical leader.</p>
<h3 id="learned-helplessness">Learned helplessness</h3>
<p>The most depressing resistance to change comes from people who say that this is how bad it always is. They can’t imagine things being better.</p>
<p>Anecdote time! I once worked for a moderately successful but not quite successful enough startup that made a hardware thingie you might even have heard of. Eventually it was acquired by ConHugeCo Software, Inc, a very very very large company indeed that you’ve definitely heard of. The new corporate owners wanted their newly-acquired software team to work on project Foobar, already in motion. Foobar had a lot of existing process and tooling and a team that was already pushing it forward. They were behind. They were engaged in weird political machinations to create excuses, they were so behind. Surely this acquihired team could help!</p>
<p>Um.</p>
<p>Eventually I joined project Foobar, and I learned why it was behind. Getting a single commit into the source repo for project Foobar took at least half a day and sometimes an entire day. You had to get into line to check in. When you were head of the line, you had to resolve any merge conflicts that were caused by the people who merged in since you got into the line. (And no, this was <em>not</em> git.) You then had to build the full thing, and that was slow. Hours slow. Then you had to test. Then you could merge. Heaven help you if you broke the build: there were people who would get mad at you about that and penalties for it were discussed.</p>
<p>“Why,” I asked somebody, “do we not have a build team making this faster and better?”</p>
<p>The answer stayed with me. It was: “Nobody wants to be on a build team. They get laid off when their work is finished.”</p>
<p>Laid off. Their work. Finished. Uh. What?</p>
<p>The culture gap was epic and unbridgeable. The project turned out to be a famous disaster. Are you surprised? No? None of us at $acquiredCompany were surprised, either. The acquiring team could not imagine healthier processes. The cudgel was their only tool. They did not fix anything because that’s the way things were.</p>
<p>This is learned helplessness. Reject it. Things can be better than that. It is not only possible but <em>normal</em> for things to be better. I know that. You know that. Stand up for it.</p>
<p>If you can’t, leave.</p>
<h3 id="the-positive-argument">The positive argument</h3>
<p>Let’s make the case with more positive arguments. What will you get by relentlessly reducing developer friction? The obvious benefit: the whole team will go faster. I have to call this out explicitly, because a lot of the pushback to the idea of reducing friction comes from not thinking about what this means.</p>
<p>Everybody. Goes. Faster.</p>
<p>Reducing the amount of time it takes to do something by a couple orders of magnitude can have radical effects not just in kind but in category. When it took many minutes to download a single MP3 file, nobody was streaming movies. Now that gigabit fiber is an option for many homes, we’re streaming high-definition movies on a whim. Things you couldn’t imagine happening before become normal. You can probably think of more examples like this.</p>
<p>Here’s a modern example I’ve lived a couple of times now:</p>
<p>Deploys become fast: the cost of making changes is now low. <br>
The cost of making changes is low: people become less fearful of making changes. <br>
Less fear: changes get smaller and more frequent. <br>
Small, frequent changes: less dangerous inherently, so failures happen less often. <br>
Failures happen less often: the team becomes more confident.<br>
A confident team experiments and pushes themselves into trying new things. <br>
Everything gets better.</p>
<p>This is a virtuous cycle. This particular virtuous cycle can be promoted in lots of ways–great CI for instance–but hey, even CI benefits from running fast. And frequently. And easily from a developer’s laptop and not just a remote process if you can wrangle that one. A barrier to doing something is a kind of friction too!</p>
<p>Friction is <em>frustrating</em>. It generates stress. Nobody enjoys slogging through a ceremony they can’t see the benefits of. Nobody enjoys watching a deploy fail <em>again</em> in the same way as the previous five times this week. Friction without payoff makes people unhappy. To my mind, this is reason enough for fixing it. Content people who are comfortable and talking regularly with their colleagues do great work; unhappy teams spend their time fretting about their unhappiness. The world is stressful. Don’t add to it. This is ethically good as well as pragmatic for whatever your shared venture is.</p>
<p>Let’s make a more banal, money-based argument next.</p>
<p>Salary is, for most companies, the single biggest cost they have. Stop wasting that money! Why are you spending money making your programmers do things by hand that could be done by a small shell script? This is overall a complex topic, and a lot of things factor into your decision to build, buy, or do nothing. Here, we’re most likely talking about build OR buy vs doing nothing at all. A fast calculation of salary hours vs payoff is useful for deciding when to act as well as when <em>not</em> to act. Make a rough estimate of how much time your team wastes waiting for builds (or fixing something, or pushing a repeated process by hand, etc.) <em>for the entire year</em>, then compare that to what you’d invest into a single push into making that faster.</p>
<p>Once again, measurements help to inform your decisions. If you don’t have data, do something lightweight to get it.</p>
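<p>That rough estimate can be sketched in a few lines of Python. Every number and name here is a hypothetical placeholder (team size, minutes lost per day, loaded hourly cost, working days per year), so substitute whatever your own lightweight measurements tell you.</p>
<pre><code data-lang="python"># Back-of-the-envelope comparison: yearly cost of friction vs. a one-time fix.
# All figures used below are hypothetical placeholders, not real team data.


def yearly_friction_cost(engineers, minutes_lost_per_day, hourly_cost, workdays=230):
    """Salary spent per year on waiting, given minutes lost per engineer per day."""
    hours_lost = engineers * (minutes_lost_per_day / 60) * workdays
    return hours_lost * hourly_cost


def fix_cost(engineers_on_fix, weeks, hourly_cost, hours_per_week=40):
    """Salary spent on a single focused push to remove the friction."""
    return engineers_on_fix * weeks * hours_per_week * hourly_cost


if __name__ == "__main__":
    waste = yearly_friction_cost(engineers=10, minutes_lost_per_day=30, hourly_cost=100)
    fix = fix_cost(engineers_on_fix=2, weeks=3, hourly_cost=100)
    payback_months = fix / (waste / 12)
    print(f"friction costs about {waste:,.0f} per year")
    print(f"the fix costs about {fix:,.0f} once")
    print(f"the fix pays for itself in about {payback_months:.1f} months")
</code></pre>
<p>With these made-up inputs the fix pays for itself in under three months. If your own numbers put the payback period at several years, that is a useful signal too: it might be a case for <em>not</em> acting.</p>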
<h2 id="things-to-try">Things to try</h2>
<p>You are convinced! You have convinced others! You are able to act to reduce your team’s friction! How do you do it?</p>
<p>Start by asking your team what is slowing them down. They will straight-up tell you what’s wrong. Listen to reports of irritation; if the irritation rises to the level of frustration, pay special attention. You might not take your team’s proposed <em>solutions</em> at face value. Here your team is like any software user, who will tell you all about the solution they’ve imagined, not the best solution you might provide. Listen to what people are trying to do and why they’re being prevented. Pay attention to the reality of their stories. Question everybody’s assumptions about the way things have to be, including your own.</p>
<p>Imagine what you would do in the ideal case, if you were designing the thing from scratch today. Take a step toward that ideal from where you are now. This <em>is</em> possible.</p>
<h3 id="if-you-re-using-bad-software-stop">If you’re using bad software, stop.</h3>
<p>Is your system configuration software driving you nuts? Switch to something else. (It will drive you nuts too, but perhaps less nuts.)</p>
<p>Is <em>X</em> famous SAAS thing that was super-cheap to buy driving your team nuts? (I’m looking at you, ubiquitous but relentlessly mediocre famous suite of tools.) Switch to something else.</p>
<p>Has your team staged a revolt and started using something that isn’t the official choice? Listen to the pain of your team. Honor the pain. Switch to their choice. This isn’t about allowing chaos to reign, but about paying attention to existing signals, and paying <em>especial</em> attention to strong signals.</p>
<p>Make team software changes definitively and without half-measures. Commit to the change. Retire the old stuff. Plan a cutover if necessary so you don’t leave mess behind: do any required data migrations. Get feedback on the results. You shouldn’t make changes like this on a whim unless the cost of change is pretty low, but doing it on the worst offenders can be a huge morale boost.</p>
<h3 id="treat-internal-tools-as-important-software">Treat internal tools as important software.</h3>
<p>Work on internal tools is highly-leveraged: every one of your developers will write better software when their tools are good. It is <em>worth</em> devoting senior engineering brains to them. It is worth devoting <em>your</em> brain to them if there is nobody else. Your job, o fellow technical leader, is to make your team successful at building the widgets your organization wants to build. We must do the things nobody else can do.</p>
<p>If using an off-the-shelf tool isn’t possible, then the tool you’re building is critical to your product. Treat it like that. Take the work seriously. Design it thoughtfully. Do your usual requirements analysis! Who’s using this tool? What are they trying to do? What are the performance and latency requirements? How should errors be handled or reported?</p>
<p>Sweat the output of internal tools. Don’t bury important results of CI in a rubbish heap of uninteresting compiler output. <a rel="noopener external" target="_blank" href="https://www.edwardtufte.com/tufte/books_vdqi">Tufte’s design principles</a> apply here too.<sup class="footnote-reference" id="fr-3-1"><a href="#fn-3">1</a></sup></p>
<p>Doing this analysis on testing system output was super-fulfilling and helpful for the consumers of the test output.</p>
<p>Common tool areas for you to think about:</p>
<ul>
<li>Chat and video conferencing software: is it reliable and high-quality?</li>
<li>Bug/issue/task trackers: help or administrative burden?</li>
<li>Source control software and tooling around it.</li>
<li>Development environments: setup of any common software that your team needs to use. Examples would be specific versions of a language runtime or compiler needed to develop software.</li>
<li>Internal tools that solve problems specific to your internal workflows.</li>
<li>Build systems, both for the develop/test loop and for release processes.</li>
<li>Deploying software. Is it fast? Is it reliable?</li>
<li>The substrate upon which software gets deployed.</li>
<li>Automated testing, particularly integration testing.</li>
</ul>
<p>Distribute internal tools in compiled, packaged form. Don’t make people build/install them every time they need to use them. Have enough release process for these tools to ensure they work. Consult <em>user</em> convenience, not developer convenience here. (The needs of the many, etc etc.)</p>
<h3 id="treat-your-processes-as-worthy-of-thoughtful-design">Treat your processes as worthy of thoughtful design.</h3>
<p>I mentioned earlier that you always have process, because process is the way you usually do things. <em>Think about your processes</em> and tweak them as needed to remove unnecessary friction from them.</p>
<p>Water runs downhill. People always do the thing that’s easiest to do. Your goal is therefore to make the right thing to do the easiest thing to do. If people are regularly doing an end-run around a process to get work done (say, regularly asking for rubber-stamp PRs so they can be unblocked), you have a process that’s not earning back its energy cost. Fix it.</p>
<p>What are the goals you want a habitual-way-of-doing-things in an area to achieve? What values do you want to express? Be clear about them. Be clear about the priorities of your values. You might need to honor high priorities and let lower priorities go unfulfilled.</p>
<p>Make sure you have a <em>feedback loop</em> somewhere helping you evaluate your new processes. Designing processes without feedback from the lived reality is possibly worse than not designing them, because you’ll have people held accountable for doing things that turn out to be bad ideas. Iterate. Improve. Nothing need be set in stone. It’s okay to change! It’s okay to look at where people are walking right now and pave those paths. It’s a decent starting point.</p>
<p>Jump out of the system and examine its assumptions. One way of reframing the “I’m blocked by no PR reviewer here” problem is to notice that the person who’s blocked did the work alone and has no team or buddy who shares context about the work. If they paired, they would have an instant PR review, and a pretty high quality one.<sup class="footnote-reference" id="fr-2-1"><a href="#fn-2">2</a></sup> If the work was planned work and review was blocked, perhaps time for reviews should be budgeted into your team’s plans.</p>
<p>The best process is one that your team doesn’t even think of as a process because it’s been automated into invisibility.</p>
<h3 id="automate">Automate.</h3>
<p>Obliterate toil: automate it.</p>
<p>Automate ruthlessly. This is where I have seen the most <em>surprising</em> pushback. We’re programmers. Automating processes is what we do! People will flinch about this, afraid of time spent automating things that won’t pay off. Yes, we’ve all been there. So <em>don’t do that.</em> Don’t automate things that are really one-offs. If there’s any chance you have to do the same thing more than five times<sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">3</a></sup>, automate it. If it’s complex and difficult for a human to do, automate it. If the blast radius of the explosion caused by a human doing it wrong is large, automate it. If the end results need to be the same every time, automate it.</p>
<p>Infrastructure should be automated as far as you can push it.</p>
<p>The upside of automation is that the software that does the work for you can be instrumented.</p>
<h3 id="measure-and-observe">Measure and observe.</h3>
<p>This is a corollary of deciding to treat your tools as important software, but it’s worth calling out.</p>
<p>Measure everything, and <em>make the results of the measurement visible.</em> Measure how long a process takes. Measure how long PRs sit unreviewed. How long each step of a deploy takes and how many deploys fail. Make all of this data easy to look at.</p>
<p>Instrument your tools so you know how often people are using them, how long the runs take, and whether they succeed or fail. (Don’t instrument so heavy-handedly that you slow them down.)</p>
<p>My favorite way to do this is to use <a rel="noopener external" target="_blank" href="https://www.honeycomb.io">Honeycomb</a> to trace everything, not just our production software. At a recent job we instrumented builds, deploys, and CI runs this way. The output of those runs prominently included links to Honeycomb’s visualizations of the traces. Every build and deploy report included a link to a view like this about how long it took:</p>
<p><img src="/images/honeycomb-build-trace.png" alt="a screenshot of a Honeycomb visualization of trace data from a build and deploy, showing timings for each step" title="A Honeycomb trace of a build and deploy flow" /></p>
<p>Is this deep? No. Did it take a long time to do? Also no. Is it helpful? Definitely <em>yes</em>. Imagine this, for everything. Imagine this, telling you about timings for every single internal tool you run, including the exit code returned and who ran it. Imagine how much better you can make every single tool your team uses with data like this.</p>
<p>You might have another tool you like to use here, which is great! Please tell me about it on Twitter!</p>
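<p>If you have no tracing tool wired up yet, here is a minimal sketch of the same idea in Python: a wrapper that runs any command and records how long it took, its exit code, and who ran it. This is a stand-in for illustration, not how the Honeycomb setup above worked. The function name, the log file name <code>tool-runs.jsonl</code>, and the field names are all assumptions.</p>
<pre><code data-lang="python"># A tiny wrapper that runs a command and appends one JSON line per run.
# The log location and the field names are hypothetical; adapt them freely.
import json
import os
import subprocess
import time


def run_instrumented(argv, log_path="tool-runs.jsonl"):
    """Run argv, record timing, exit code, and user, then return the exit code."""
    started_at = time.time()
    clock = time.monotonic()
    exit_code = subprocess.run(argv).returncode
    record = {
        "tool": argv[0],
        "args": argv[1:],
        # Environment lookup keeps this cheap; falls back when neither is set.
        "user": os.environ.get("USER") or os.environ.get("USERNAME") or "unknown",
        "started_at": started_at,
        "duration_seconds": round(time.monotonic() - clock, 3),
        "exit_code": exit_code,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return exit_code
</code></pre>
<p>Calling <code>run_instrumented(["make", "build"])</code> runs the build as usual and leaves behind one line of data you can count and graph later. Writing a single line to a local file keeps the overhead small, in the spirit of not slowing the tool down.</p>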
<h2 id="the-deer-they-are-teal">The deer, they are teal</h2>
<p>Here’s what I’d like you to take away from this blog post.</p>
<ul>
<li>Friction is slowing down your team.</li>
<li>The energy cost of overcoming friction needs to buy you something worthwhile, or it needs to be reduced.</li>
<li>Investigate friction by talking to your team. Frustration is an important signal.</li>
<li>Observability isn’t just for your production software: measure everything. Use data to inform your decisions.</li>
<li>Order of magnitude changes in cost result in entirely new behaviors.</li>
<li>Design your processes.</li>
<li>Design your tools.</li>
<li>Automate ruthlessly.</li>
<li>Set up feedback loops so you learn what’s working and what’s not.</li>
</ul>
<p>Most importantly, you <em>can</em> fix it. Every little bit you fix gives you more energy back so you can fix the next thing. It <em>will</em> be worth the investment.</p>
<hr />
<p>My thanks to <a rel="noopener external" target="_blank" href="https://twitter.com/isntitvacant">Chris Dickinson</a> for the lockout-tagout and pointing-and-calling examples! Also my thanks to David Zink for editing my prose into a tighter form.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-3">
<p>Tufte’s design principles, recapped because they are so good:</p>
<ol>
<li>Above all else show the data.</li>
<li>Maximize the data-ink ratio.</li>
<li>Erase non-data-ink.</li>
<li>Erase redundant data-ink.</li>
<li>Revise and edit.</li>
</ol>
<p>He’s talking about visual design, but this works for writing as well. <a href="#fr-3-1">↩</a></p>
</li>
<li id="fn-2">
<p>To repeat myself: PRs are best used to socialize work that’s already in a good state, not to find bugs in work somebody has already decided is finished. In other words, the useful review and tightening should happen <em>before</em> the PR process, in some earlier phase. Pairing is good. Strong testing is good. Team discussion about ways of solving a problem is good, so the approach taken in a PR doesn’t need to be debated. The PR is to say to a wider audience: hey, this thing happened. An exception to my own approach: small, uncontroversial bug fixes are perfect for review in PRs. <a href="#fr-2-1">↩</a></p>
</li>
<li id="fn-1">
<p>I kinda want to say “three times” here instead of five, but you know, use your judgement. Do a little basic arithmetic on how long a thing takes and how often it’ll need to happen. Think how important getting it done consistently is. Prioritize to match. <a href="#fr-1-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Against dogmatism</title>
          <link>https://blog.ceejbot.com/posts/against-dogmatism/</link>
          <pubDate>Sun, 29 May 2022 13:34:29 -0700</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/against-dogmatism/</guid>
          <description>Dogmatism is an enemy because it makes bad decisions.</description>
          <content:encoded><![CDATA[<p>Sometimes I think that my next conference talk ought to be nothing more than a live read-through of Tef’s blog post, <a rel="noopener external" target="_blank" href="https://programmingisterrible.com/post/176657481103/repeat-yourself-do-more-than-one-thing-and">“Repeat yourself, do more than one thing, and rewrite everything”</a>. This is a bad idea because Tef should do that, in some post-pandemic future when international travel is safe again and I can attend and buy him a drink. So I’m going to let Tef’s blog post push me off into my own direction instead, and attempt to add something useful to his wisdom bombs.</p>
<p>Tef’s main point—worked through via examples of common advice given to programmers that is sometimes bad advice—is that all advice <em>has a context</em>.</p>
<blockquote>
<p>When you hear a piece of advice, you need to understand the structure and environment in place that made it true, because they can just as often make it false. Things like “Don’t Repeat Yourself” are about making a tradeoff, usually one that’s good in the small or for beginners to copy at first, but hazardous to invoke without question on larger systems. – Tef</p>
</blockquote>
<p>“Don’t repeat yourself” is the advice I rail against in <a rel="noopener external" target="_blank" href="https://blog.ceejbot.com/posts/legacy-you-hate/#fnref:6">a recent post in this series</a> because I saw how application of the advice damaged a particular code base. Any two code paths that looked at all similar were collapsed into single methods with long parameter lists, with flags and null checks to determine mid-flow which one of the five different entry points was in use this time. This made the code difficult to understand, debug, and change, because any change had to be verified as appropriate to make for many different entry points. Every bug we worked on required careful documentation of the many ways a specific code path could be invoked and careful mental simulation of execution for each.</p>
<p>Was the maintenance cost worth whatever was saved by not duplicating some smaller sections of code? No. But probably it didn’t start out that way: it started out with somebody adding an entry point and <em>not</em> copying code, because, well, don’t repeat yourself. And then do that a few more times, each time adding a parameter while scrupulously not repeating code, until the programmers who understood each path through were all gone.</p>
<p>I grind an axe here, of course. My point is that <em>following the DRY advice dogmatically was a bad idea</em>. It’s the dogmatism that gets you.</p>
<p>Dogmatism says: Don’t repeat yourself means don’t repeat any code, ever.</p>
<p>Dogmatism says: This particular one project management methodology is the one true methodology! Every team at this company will do agile/scrums and always-pair-program/never-pair-program while fibonacci-pointing/playing-planning-poker.</p>
<p>Dogmatism says: Object orientation is the only way people should structure code and therefore this programming language only has classes.</p>
<p>Dogmatism says: All software must follow one of the named design patterns in the Gang of Four book/some other book and if you can’t name the pattern you’re doing something wrong.</p>
<p>Put that way it sounds silly, right? So why do we keep doing it?</p>
<p>Because we don’t like the reality that we must <em>always do the work</em> to find the right solution to the specific problem in front of us. It’s much easier to fall back on a set of rules that we don’t have to think about or make hard decisions about. But this compromises our solutions.</p>
<p>There’s a blog post in me about how making tradeoffs well requires understanding clearly the values you’re using to select among possibilities. Every value is a razor you can use to make decisions. Dogmatism is a value! It makes decisions for you.</p>
<p>I suspect dogmatism is a value we often hold without self-reflection. That is, we can hold it as a value without being aware that it’s a value and that it is influencing our decision-making. I think it makes bad decisions. Dogmatism doesn’t let you weigh tradeoffs. And friend, it’s tradeoffs all the way down.</p>
]]></content:encoded>
      </item>
      <item>
          <title>One year for a one-line fix</title>
          <link>https://blog.ceejbot.com/posts/one-year-for-one-liner/</link>
          <pubDate>Tue, 24 May 2022 14:59:59 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/one-year-for-one-liner/</guid>
          <description>Why it took me a year to arrive at a one-line fix for a massive performance problem, and how I hope to shorten that time should I encounter a similar situation again.</description>
          <content:encoded><![CDATA[<p>This blog post is harder to write than you might think, because it goes right into a number of people problems and the behavioral patterns of toxic organizations. I wish to preface all of this by noting that toxic organizations warp the behavior of everyone in them. People who might behave in healthy ways in healthy orgs find themselves behaving badly inside toxic systems. The only thing to do is fix the organization first. So I have sympathy for everybody involved in this story, both my unknown predecessors and the people who were right next to the problem the whole time. I am most interested in <em>what I will do differently</em> next time.</p>
<p>With that preface in mind, let’s tell a story.</p>
<h2 id="let-me-tell-you-a-war-story">Let me tell you a war story.</h2>
<p>Once upon a time there was a dot-net monolith, one that had been poorly maintained for a long time, hacked upon by rushed people who were evaluated only by how fast they pumped out the next feature the CEO wanted. This dot-net monolith was in a poor state and everybody around it knew that. It was expensive to run (its AWS costs were enormous), expensive to work on (making changes was time-consuming and dangerous), expensive to deploy (deploys often broke and took hours to resolve), difficult to test (the test suite was a mechanical turk service that ran overnight), and difficult to understand (re-entrant side-effect-heavy functions and in-memory caches on top of external Redis caches made for some fun race condition factories). The team around it knew it had trouble.</p>
<p>Enter me, somebody who didn’t know a dot-net from a dot-product. I was brought in to scale out the system, which had a lot of new code written in JavaScript around that dot-net thing. I was fairly confident in my ability to make node jump through hoops. I knew that C# was Microsoft’s proprietary version of Oracle’s proprietary Java, so at least I could read the code. Mostly. At my request, I started out fixing bugs on the team that touched the most varied parts of the system, so I could get my hands dirty first, learn how things fit together in reality, and earn credibility with the overall team before I had to start making changes.</p>
<p>Two weeks into my new job, the entire system fell over on a regular weeknight, under regular load. And by “falling over”, I mean it became non-functional. All API endpoints began to fail to respond. The site was down. Nobody could purchase widgets and have them delivered.</p>
<p>Why? Nobody could say. It looked like it was Redis. At least, the CPU on the Redis cache instance was hitting 100% and when it did, everything stopped.</p>
<p>Now, like many of us, I was very familiar with Redis. I trusted Redis. It is often the most reliable piece of software in my stack. I’d pumped a lot of traffic through Redis at the world’s JavaScript registry, a lot more than this single-state retail outfit could possibly be sending through it. What was this system doing to Redis that was making it thrash so badly? Nobody knew. What were we putting into Redis? Nobody knew. How many objects were in it? Nobody knew. How big were they? Nobody knew.</p>
<p>“Where are your metrics?” I asked. There was an expensive hosted Graphite service, but nobody was looking at it. There was an expensive APM product wired up to the monolith, but nobody knew how to interpret it. There were Cloudwatch graphs! Only infra had access to these or the ability to make dashboards.</p>
<p>At this point I knew what my job needed to be first. I went on an observability tear over the next months, among other tears inspired by this outage.</p>
<p>We got through that initial outage by upgrading to AWS’s largest Elasticache, which was ruinously expensive but seemed to hold up under the load. We then mitigated the problem around the edges by taming some problems with the website hitting endpoints more than it needed to, and at the core (most meaningfully) by splitting up the cache into several different cache instances. (Two very thoughtful engineers had already browbeaten their way into being given time to refactor the code enough to make this split happen, because they knew this was a problem area before the outage happened. The implications of this sentence are entirely intentional, and we’ll come back to them.)</p>
<p>We limped through the weeks remaining until the day that was the big sales day for the industry, the one that was going to be the biggest day ever with $X of revenue, for some record-breaking value of X. The entire company prepped for months for this event, with marketing and incentives and ordering stock to be sold.</p>
<p>Three hours after opening, the system went down. Adding more instances of the monolith brought the system down harder. In the end, we had 3 hours of downtime in the middle of the hottest business day of the year, the equivalent of Black Friday, and this downtime ruined the work of everybody at the company who’d prepared for that day. It was bad. Very bad. Company-harming bad.</p>
<p>One year later, my colleague Chris and I identified the problem and fixed it with a one-liner.</p>
<h2 id="that-s-a-heck-of-a-war-story">That’s a heck of a war story.</h2>
<p>Right? The one-line mistake that nearly killed a company, and the one-line fix that saved it. Except, well, it’s more complicated than that.</p>
<p>In the end, none of the observability instrumentation I added mattered.<sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">1</a></sup> It was the default Cloudwatch Redis graphs that identified the problem. When we stood up an idle cluster of the service using our new deploy system (deploy times down from 30 minutes minimum to less than 3 minutes tops)– ahem–</p>
<p><em>Hold on, you rewrote the deploy system?</em></p>
<p>Yeah. When the entire infra team was laid off we were finally able to fix probably the worst cause of daily development friction–</p>
<p><em>Hold on, the entire infra team was laid off? And this let you fix things?</em></p>
<p>Yes, as I was saying, we finally had access to everything and freedom to fix what had been obviously broken for a long time.</p>
<p><em>Hold on.</em></p>
<p>I know. There’s a lot to unpack here, and I’ve been struggling for some time to find a way to unpack it that remains kind to the people who were trapped in this toxic organization alongside me, doing bad things because that’s what the organization wanted them to do. For some of <em>that</em> story, read <a rel="noopener external" target="_blank" href="https://blog.ceejbot.com/posts/dysfunction-junction/">“Dysfunction junction”</a> first.</p>
<p>I’ll talk about those things in a minute.</p>
<h2 id="the-punchline-to-the-story">The punchline to the story.</h2>
<p>Back to the bug. Redis. AWS’s largest Redis. CPU hitting 100%. That bug.</p>
<p>As I was saying, testing our new deployment system was what made us look at this problem again with fresh eyes. We used the new deployment system to stand up an <em>idle production cluster</em>, ready to be swapped in for the older prod cluster that used the old deploy system. This was something we’d done for every microservice in the system, so we had a lot of practice doing it, and at last we were doing the hard one, the dot-net one.</p>
<p>The moment we brought the new, idle cluster into existence with Terraform, we saw the Redis instance’s CPU spike and cause trouble for the production system. That was surprising! The new prod cluster wasn’t live yet! We looked at the full set of Redis graphs and noticed an oddity. The rate of new connections per minute was <em>absurdly</em> high even in normal operation, and the idle cluster had just spiked it higher.</p>
<p>First, no way should any process be generating new Redis connections except on restart. That graph should be sitting at zero. Second, idle clusters shouldn’t be creating any new load on Redis except at process start.</p>
<p>“Huh,” we said. “Could pings from the load balancer be creating new connections? Because pings are the only traffic it’s taking. That would be fundamentally broken but it would explain this.”</p>
<p>So we fired up a video chat, shared a screen with the code that injected Redis into the dot-net frammistans, and set about understanding what it was doing. We learned the word “lifestyle” and read the docs on the various kinds of lifestyles: transient, scoped, and singleton. Nearly all of the Redis connection managers were singleton lifestyle, which is for the lifespan of the application. Seems good! Then we noticed one line that didn’t look like the others, injecting connections for the general-use cache:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="csharp"><span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">container</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#79740E, #B8BB26);">Register</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">ICacheConnectionManager</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#B57614, #FABD2F);"> RedisConnectionService</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">Lifestyle</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#076678, #83A598);">Scoped</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
<p>“Scoped” means to create an instance of the thingie <em>once per request lifecycle</em>. Once per request.</p>
<p>Every. Single. Endpoint. Invocation. Created. A. New. Redis. Connection. Pool.</p>
<p>All requests, not just requests that needed to use a Redis. Requests like the health check endpoint, the one that should be near-zero cost because load balancers hit it frequently, requests like that. This is why the Redis CPU graphs looked like some kind of exponential function on the “active thingies in the system” count, because it literally was. The system had been DoSing itself into downtime for years.</p>
<p>We changed that one line to give it a singleton lifestyle and deployed the change to our (new, shiny, one of many cattle) integration environment. We observed that the new connections graph began behaving as we expected, and everything kept working. So we deployed it to production.</p>
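<p>A minimal sketch of why the lifestyle mattered, written without any real dependency-injection container or Redis client. <code>FakeConnectionManager</code> and <code>LifestyleDemo</code> are hypothetical names invented for this illustration, and the loop stands in for load-balancer health checks. It assumes, as the graphs suggested, that a scoped registration builds a fresh pool for every request while a singleton builds one for the life of the process.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="csharp">using System;

// Hypothetical stand-in for a Redis connection manager; it only counts
// how many times a "connection pool" gets constructed.
public sealed class FakeConnectionManager
{
    public static int PoolsCreated;
    public FakeConnectionManager() { PoolsCreated++; }
}

public static class LifestyleDemo
{
    // Scoped-style: a new manager is built for every request.
    public static int SimulateScoped(int requests)
    {
        FakeConnectionManager.PoolsCreated = 0;
        for (var i = 0; i &lt; requests; i++)
        {
            var manager = new FakeConnectionManager();
            HandleHealthCheck(manager);
        }
        return FakeConnectionManager.PoolsCreated;
    }

    // Singleton-style: one manager is built lazily and then shared.
    public static int SimulateSingleton(int requests)
    {
        FakeConnectionManager.PoolsCreated = 0;
        var shared = new Lazy&lt;FakeConnectionManager&gt;(() =&gt; new FakeConnectionManager());
        for (var i = 0; i &lt; requests; i++)
        {
            HandleHealthCheck(shared.Value);
        }
        return FakeConnectionManager.PoolsCreated;
    }

    private static void HandleHealthCheck(FakeConnectionManager manager)
    {
        // A health check never touches the cache, yet in the scoped case
        // it has already paid for a brand-new pool.
    }

    public static void Main()
    {
        Console.WriteLine($"scoped:    {SimulateScoped(1000)} pools");
        Console.WriteLine($"singleton: {SimulateSingleton(1000)} pools");
    }
}
</code></pre>
<p>Run as a console program, this reports 1000 pools for the scoped case and 1 for the singleton case. Multiply the scoped number by every instance behind the load balancer and the shape of that new-connections graph is less mysterious.</p>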
<p>Really easy fix. It let us stop running the largest ElastiCache AWS sells, collapse all the split-out caches into the new much more modest not-clustered cache, and made everything go faster. Scaling horizontally no longer caused the system to punch itself in the face. That plus the new fully-terraformed ALBs made dealing with big days completely routine, and engineering commenced a very quiet two years of rebuilding with an AWS bill that was a fraction of what it was before, and I mean holy heck we cut that bill down to something that was [Ceej’s editor has deleted a lot of ranting here].</p>
<p>However. I remain unhappy about this fix.</p>
<p>I should have spotted this a year before, during the original Redis-caused outages. If I had seen that graph– and I should have demanded to look at all those graphs– I would have known immediately that something was very wrong, because nothing should be creating new connections like that. But I didn’t. Why not?</p>
<h2 id="why-did-it-take-a-year">Why did it take a year?</h2>
<p>Expertise, ownership, and trust. Each of these concepts is a two-edged sword and each cut me with its second edge.</p>
<h3 id="expertise">Expertise.</h3>
<p>I knew I did not have C# or dot-net expertise. I had to rely on the people who had it. I was also <em>not familiar with the code</em> that did this work, especially at two weeks in. I had to trust the people who knew dot-net and knew the code to assure me that there were no obvious howling bugs in it.</p>
<p>Where expertise is assumed but is not present, bad code goes unchecked. People get angry when you review their work and ask for changes, or even when you only ask questions about the work. Defensiveness can arise in low-trust environments, but it can also mask situations where people don’t have the expertise you need them to. Or situations where people have the expertise but are so pressured, stressed, and burned out that they’re not operating at full capacity.</p>
<p>Here the second edge cut me because I assumed without pushing that the people with dot-net expertise had already investigated the obvious possibilities. But also! I lacked this expertise myself. When we finally hired people who were expert with dot-net and comfortable with it, they laughed at this bug, because it was familiar territory for them. They’d have looked for and found it immediately.</p>
<h3 id="ownership">Ownership.</h3>
<p>When one person or a team owns something, I feel I need to let them own it and trust their expertise. Meddling in their work can destroy their self-confidence or make them feel undermined. A feeling of ownership is good! It means you feel responsibility for that thing, and know that the burden of maintaining it rests on you.</p>
<p>The other edge of ownership is gatekeeping. The deploy system was obviously a block to all development by all teams. The team had a Slack channel where they negotiated who was going to merge which code for the <em>single deploy window</em> available four days a week, with no deploys allowed on Fridays. Deploys were flaky and could take up to three hours to resolve. A colleague with a technical leadership role was in fact working on a better deploy system, but the infra team manager instructed their team to ignore the work.<sup class="footnote-reference" id="fr-2-1"><a href="#fn-2">2</a></sup></p>
<p>The infra team also jealously guarded access to things they thought belonged to them, such as access to an Athena search setup for production logs. At one point one of them locked down commits to the main monolith’s repo, announcing to the team that they no longer got to merge into “my repo”. To be clear, this was a human being who’d been burned to an absolute crisp by overwork; the blame flows upward.</p>
<p>Management can of course be the worst gatekeeper of all, and it was in this case. I mentioned briefly before that Redis had been identified as a problem area by some informed engineers, and they had to push hard to be allowed the time to work on it. They might have been more successful if <em>supported</em> by management instead of being treated as if they were wasting time that would be better spent on cranking out this month’s pet feature for the CEO.</p>
<p>From a distance, I can say with some confidence that gatekeeping was the worst block to diagnosing this Redis bug.</p>
<h3 id="trust">Trust.</h3>
<p>I said the word “trust” in each of the two preceding sections, because I had to extend trust to my colleagues. You earn trust by granting trust. People live up or down to your expectations of them, and I prefer to expect the best.</p>
<p>You can see the downside of all of these. Where expertise is assumed but not present, bad things happen. Where ownership turns into gatekeeping, other people are blocked from fixing things even if they could help. When trust is not warranted, things get into bad states and stay that way.</p>
<blockquote>
<p>“Trust, but verify.” – a Russian proverb, popularized in English by Ronald Reagan</p>
</blockquote>
<h2 id="then-the-layoffs-happened">Then the layoffs happened.</h2>
<p>The gatekeepers were all gone. The experts were also (mostly) gone. The ownership and responsibility were all on me and a much smaller but motivated team.</p>
<p>When the ownership fell to me, I felt both responsibility and empowerment. I was no longer politely taking people at their word, because those people weren’t there any more. I was investigating and experimenting on my own, and ruthlessly testing all of my own hypotheses. I knew I didn’t have expertise, and even when I <em>do</em> have expertise I have learned the hard way to double-check all my own work.</p>
<p>I was also not bound by the past. I did not care if something had always been that way. I was okay with doing things differently. I don’t much trust myself, but I did trust the people working alongside me in that moment. And most especially, I trusted the work we did together, because we verified it together.</p>
<p>The sad thing is that the ownership-turned-into-gatekeeping problem was the difficult one to surmount, the one that in retrospect I’m not sure I could have solved in any other manner than parting ways with the gatekeeping team. I am going to tentatively state a thesis: operations/infra teams as teams separate from engineering always turn into walled-off defensive gatekeepers. You cannot allow them to exist in healthy orgs. You must practice some variation on devops by embedding people with this expertise into project teams.</p>
<p>Maybe there’s a way to do it if you frame their goal as <em>developer experience</em> not as “operations” or making AWS go brrrrrrr. The goal has to be to keep people focused on their customers– the engineers building the project– and not on defense against their colleagues. The same goes for security teams: embed those experts where they can have sympathy for the problems their colleagues are trying to solve and improve their solutions early.</p>
<p>But I digress. Expertise, ownership, and trust are a big part of this story, but they’re not everything.</p>
<h2 id="the-context-also-mattered">The context also mattered.</h2>
<p>Years later I learned that one of the two engineers who’d started working on Redis before the outages had some suspicions that there was a lifestyle problem, but he was afraid to change code that had been that way the entire time he’d worked there. I had no such fears, because we had <em>removed reasons to fear experimentation</em> by completely rewriting the infrastructure and deployment environment to make experimentation low-cost. We’d also invested in full tracing via <a rel="noopener external" target="_blank" href="https://www.honeycomb.io">Honeycomb</a>. We knew what was going on with the system in ways that we didn’t in that first outage.</p>
<p>The highly-contended integration environment had <a rel="noopener external" target="_blank" href="https://www.neversaw.us/2020/12/19/deploying-at-eaze/">become many environments</a>. (The link goes into detail about that project <em>and</em> tells the story of this bug from another perspective.) Deploys had been made fast and reliable. Access to information and metrics was available to everybody. Full access to AWS was available to all engineers. If a change broke production, the fix was three minutes away.</p>
<p>We weren’t scared to make changes any more.</p>
<p>I also want to call out that the team had progressed past trusting the word of people in the past about how things worked and whether or not things were feasible. We had the space and the support to read code to see if it genuinely behaved as described or if it worked differently, and experiment with changing things. Management was no longer blocking people from investigating or fixing technical debt.</p>
<h2 id="what-can-we-learn-from-this-story">What can we learn from this story?</h2>
<p>This is what I’d like you and my future self to take away from this war story:</p>
<ul>
<li>Assume nothing. The people around you might be wrong! You might be wrong too!</li>
<li>Test all hypotheses. Each test gives you more information.</li>
<li>Eliminate gatekeeping. No team can afford to cope with the damage done by people who want to keep information or access away from their colleagues.</li>
<li>Observability, even humble standard metrics, is invaluable.</li>
<li>You (o fellow technical leader) own everything. You must always feel the responsibility of that ownership. You can share it, but it’s always partly yours.</li>
<li>Trust but verify. Especially team superstitions.</li>
<li>Ruthlessly eliminating developer friction pays unexpected dividends.</li>
</ul>
<p>Also, it was totally not Redis’s fault.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-1">
<p>For this specific problem. It was and remains invaluable for other reasons. <a href="#fr-1-1">↩</a></p>
</li>
<li id="fn-2">
<p>Most toxic behavior is driven by toxic organizations, but some toxic behavior is individual and <em>creates</em> that toxic organization. <a href="#fr-2-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Legacy you hate</title>
          <link>https://blog.ceejbot.com/posts/legacy-you-hate/</link>
          <pubDate>Tue, 24 May 2022 13:17:00 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/legacy-you-hate/</guid>
          <description>What to do with a legacy monolith implemented with a language and framework you don’t know and&#x2F;or dislike.</description>
          <content:encoded><![CDATA[<p>What should you do with a pile of legacy code you hate?</p>
<p>This was the central challenge of my last job. I was partially successful at solving it, and unsuccessful in ways that I want to share with you so you can do better than I did.</p>
<p>Let’s start by clarifying the problem.</p>
<p>“Hate” is a spongy word and we can be more descriptive about why you dislike the code base you’re presented with. Maybe it’s bad code: tangled, hard to maintain, failure-prone. Maybe it was written long ago by people who’ve long since left the company (burned out by having to maintain it) and nobody left understands it. Maybe only a few people are able to change how it behaves, and it takes those people far longer than anybody likes. Maybe it doesn’t scale in the ways you need it to. Maybe it’s <em>also</em> written in a language you don’t like or don’t know, or maybe it’s written on top of a framework you don’t like or don’t know.<sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">1</a></sup></p>
<p>Whatever the reason, you’re very done with this pile of code and so is everybody around you. It needs to be replaced and you all know it. <em>And yet</em>, it’s the money engine for your company.</p>
<p>What to do?</p>
<h2 id="don-t-rewrite-immediately">Don’t rewrite immediately.</h2>
<p>The temptation will be to rewrite the whole thing. You already know that you shouldn’t.</p>
<p>We all know that <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Second-system_effect">second system syndrome</a> is a thing, and we all know that big-bang rewrites are notoriously difficult to pull off. As Gall famously said:</p>
<blockquote>
<p>“A complex system that works is invariably found to have evolved from a simple system that worked. The inverse proposition also appears to be true: A complex system designed from scratch never works and cannot be made to work. You have to start over, beginning with a working simple system.” – John Gall</p>
</blockquote>
<p>The other reality is that companies rarely have the time and resources to devote to rewrites, even if they have run themselves deep into tech debt. They never want to pay down that debt, and they never like the idea of giving up new feature work for a rewrite project that doesn’t move them forward.</p>
<p>And yet, this code needs to end up being rewritten <em>somehow</em>, because it’s a disaster that is costing the organization dearly, and perhaps even driving it to the brink of failure.<sup class="footnote-reference" id="fr-2-1"><a href="#fn-2">2</a></sup> This is true at the same time that you can’t dive into a big bang rewrite.</p>
<p>Reframe the problem and shift the goal: you want to be able to rewrite <em>in useful pieces</em>. Small pieces allow you to make incremental progress that can be seen to be “delivering business value” or at least measurable progress toward the end goal. Small pieces are <em>also</em> small systems on their own, each of which is simple enough to be kept working.</p>
<p>Now you’ve shifted your task to identifying useful pieces and rewriting those, and that task is more achievable. How do you identify useful pieces to rewrite? Well, this is both bad news (because you hate the code) and good news (because understanding complex systems is fun): you need to spend a lot of time with the system you have. You need to invest in it.</p>
<h2 id="you-need-to-understand-it-even-if-you-hate-it">You need to understand it, even if you hate it.</h2>
<p>It’s your money engine. It has to keep working.</p>
<p>You cannot hope to replace what you do not understand.</p>
<p>Understanding it deeply will allow you to find the cracks you can hammer a wedge into.<sup class="footnote-reference" id="fr-3-1"><a href="#fn-3">3</a></sup></p>
<p>Understanding it deeply will allow you to know when you’ve finished replacing it.</p>
<p>So how do you understand it?</p>
<p>This isn’t the same as <a rel="noopener external" target="_blank" href="https://blog.ceejbot.com/posts/programming-as-theory-building/">the theory of the program</a>, which is about how the code is constructed. You do need to know what problems the code solves for the system around it, what “affair of the world” it exists to model. More important for this task is understanding <em>the details of its current behavior</em> as part of a larger working system. This entire system is <em>not just the code</em> in the thing you want to replace. It is all the systems around that code as well: the web site, the analytics pipeline downstream from it, the internal admin workflows, the profusion of microservices that we all persist in writing around everything.</p>
<p>This whole system is an evolved, complex working system. It probably doesn’t have detailed specifications. It probably also does not have comprehensive tests. (If it did, you might not be in this mess.)</p>
<h2 id="write-tests">Write tests.</h2>
<p>If the system does not have comprehensive automated tests, invest in writing those tests before doing anything else. It’s hard to talk people into writing specs for features that have existed unspecified for years, but everybody understands why tests are useful. (If somebody doesn’t, then there are many excellent books you can drop on their head to enlighten them.)</p>
<p>Not kicking off a testing project the moment I was in charge of this problem is my number one regret from my last job. We eventually did it and it was so valuable I was angry with myself. There were organizational reasons why it was difficult for the team to commence that work earlier, including work that was genuinely urgent, but we could have started writing tests sooner! My advice to you would be to prioritize testing higher than I did, and defer what work you can until afterward.</p>
<p>Start by investing time in the test framework and tooling. Your goal is to make it easy for everyone on the team to write tests and to understand their results. People do what is easiest to do, so you must make the right thing easy. The importance of this work deserves a blog post all its own.<sup class="footnote-reference" id="fr-4-1"><a href="#fn-4">4</a></sup> However, anything is better than nothing.</p>
<p>Don’t negotiate on these tasks:</p>
<ul>
<li>Write integration tests. Test how the Hated Code™ calls out to everything around it. Test that the expectations of the code around the Hated Code™ are being met.</li>
<li>Don’t accidentally pour glue over implementation details that should be hidden. If unit tests don’t exist at all, you might want some, but they’re not as important as integration tests that validate overall system behavior.</li>
<li>Involve the whole organization in the test-writing effort. Prioritize this work alongside feature work and make on-going test-writing part of regular maintenance.</li>
<li>Automate running the tests. Do not rely on humans doing anything by hand. Run them continuously against an integration environment, or in whatever context is sensible for your setup. The important thing is to have the tests run against every change intended to land in the production environment.</li>
</ul>
<p>You’ll find bugs in the overall system while doing this. It’s a judgement call whether you should invest time in fixing them. Some bugs might be difficult to fix because of the problems that led you to want to replace the mess; don’t waste your time. Some bugs are load-bearing because the system will have grown around them, like a tree growing around a bicycle. You can cut the bike out, but at what cost to the tree? Fixing bugs that are easy to fix gives everybody dopamine cookies and shows people around the project that the investment in testing has started to pay off, so let yourself do some of that.</p>
<p>If you and your team didn’t understand your system going into the testing effort, you will afterward. The tests will support any refactoring or replacement work by verifying that the entire system continues to work. They are the scaffolding around your new construction project.</p>
<h2 id="identify-and-exploit-wedge-points">Identify and exploit wedge points.</h2>
<p>Now you can start thinking about changing the system.</p>
<p>Your goal here is to split up your monolithic code base by identifying good points to hammer in wedges to use to split off chunks.</p>
<p>Where are you going to hammer in your wedge first? Have you identified a modular boundary you can exploit to split off a chunk of functionality for a rewrite? Look for clean lines of separation: data, access methods, business logic all must come out in one piece. The common approach is to put a proxy in front of the monolith-ish thing you want to start replacing and redirect traffic from it to your rewrite. One popular term for this is <a rel="noopener external" target="_blank" href="https://martinfowler.com/bliki/StranglerFigApplication.html">“the strangler fig pattern”</a>. I often call it “divide and conquer”.</p>
<p>The advantage of this approach is that it keeps the pressure on the system to remain working at all times, allowing you to pay full respects to Gall. The tests are your latch on this working state: they validate that your replacement is behaving properly in context. You might find yourself writing even more tests at this point to support the validation; this is fine!</p>
<p>The disadvantage of this approach is that you need to have good split points, and <em>you probably don’t.</em> Good division points indicate where good modularity already exists and if you had that you’d probably be less unhappy with the mess.</p>
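<p>The proxy-in-front approach described above comes down to a routing decision, which might be sketched like this. Everything named here is hypothetical: the upstream URLs, the <code>/orders</code> prefix, and the <code>StranglerRouter</code> class are illustrations only, and a real setup would more likely express the same rule in load balancer or reverse proxy configuration than in application code.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="csharp">using System;
using System.Collections.Generic;

public static class StranglerRouter
{
    // Hypothetical upstreams: the legacy monolith and one rewritten chunk.
    private const string Monolith = "http://monolith.internal";
    private const string Rewrite = "http://orders-service.internal";

    // Path prefixes whose data, access methods, and business logic have
    // all moved out of the monolith in one piece.
    private static readonly List&lt;string&gt; MigratedPrefixes = new List&lt;string&gt; { "/orders" };

    // Decide which upstream should serve a request path.
    public static string UpstreamFor(string path)
    {
        foreach (var prefix in MigratedPrefixes)
        {
            // Match the prefix itself or anything beneath it, but not
            // unrelated paths that merely share leading characters.
            if (path == prefix || path.StartsWith(prefix + "/", StringComparison.Ordinal))
            {
                return Rewrite;
            }
        }
        return Monolith;
    }

    public static void Main()
    {
        Console.WriteLine(UpstreamFor("/orders/42")); // goes to the rewrite
        Console.WriteLine(UpstreamFor("/ordersfoo")); // stays on the monolith
        Console.WriteLine(UpstreamFor("/menu"));      // stays on the monolith
    }
}
</code></pre>
<p>Each new chunk you split off adds one entry to the migrated list, and the integration tests run against the proxy rather than against either upstream directly.</p>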
<h2 id="create-split-points-if-they-don-t-exist">Create split points if they don’t exist.</h2>
<p>This is important: don’t rewrite anything yet.</p>
<p>Don’t proceed until you can find a good location to drive that wedge in and split off a chunk. Don’t take half measures. Some of the worst tech debt I encountered recently was in functionality that was half implemented inside the Hated Code™ monolith and half outside. The implementation details were spewed out everywhere. Changing functionality was extra difficult because it needed to be changed in two places, and one of the places was a code base that was very hard to work within. Also, once we’d fixed the primary performance bottlenecks, the secondary ones were all in how the Hated Code™ treated these satellite services as databases that it owned. Important working data vital to the operation of the system was a mashup of data from other microservices plus the monolith.</p>
<p>Don’t do this to yourself.</p>
<p>No, really, modularity is important. Parnas’s 1972 paper, <a rel="noopener external" target="_blank" href="http://sunnyday.mit.edu/16.355/parnas-criteria.html">“On the Criteria To Be Used in Decomposing Systems into Modules”</a> points right at the important thing, which is that hiding information and implementation details allows you to change both. Modularity allows change.</p>
<p>Premature modularity is a form of premature optimization, and it hurts, but I’ve more often seen no modularity at all. Gotta go fast and break things, right? Side effects everywhere, code that has been DRYed to disastrous levels, the details of specific data structures in one place used to make decisions somewhere else, extreme cleverness that relies on implementation details in distant locations in the system. Rushed people make short-term decisions, and their hacks pile up into tangles of code.</p>
<p>Whatever the cause, you might have to start by refactoring internally to bring order and modularity to a <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Big_ball_of_mud">ball of mud.</a><sup class="footnote-reference" id="fr-5-1"><a href="#fn-5">5</a></sup> Start hiding details behind interfaces.</p>
<p>An aside: “Don’t Repeat Yourself” aka DRY has been misunderstood and misapplied to disaster so often I would like to stop saying it to newer programmers. Often much better advice is to <a rel="noopener external" target="_blank" href="https://programmingisterrible.com/post/176657481103/repeat-yourself-do-more-than-one-thing-and">repeat yourself to find patterns</a>.</p>
<p>If you find yourself with a function or method that has an enormous parameter list to distinguish the six different ways it might be called, you have a case of DRY madness that has broken modularity. One technique that might help if you’re in this situation is to do the least DRY thing possible: refactor to expand each code flow into one large function for each, replacing each call out to an overused long-parameter list function with the same code, inline. Simplify as you write. Strive for branchless programming as an antidote!<sup class="footnote-reference" id="fr-6-1"><a href="#fn-6">6</a></sup> The real patterns that support a better split-up of responsibility will emerge as you do this work.</p>
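<p>Here is a small, invented illustration of that un-DRYing step. The receipt-formatting domain, the <code>Receipts</code> class, and all of its method names are hypothetical and chosen only to keep the example short. The “before” function uses flags to select between flows; the “after” functions write each flow out on its own, which in this sketch also removes every branch.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="csharp">using System;

public static class Receipts
{
    // Before: one "DRY" function whose flags select between unrelated flows.
    public static string Line(string item, int cents, bool isRefund, bool isGift, bool showTax, int taxCents)
    {
        var total = cents + (showTax ? taxCents : 0);
        if (isRefund) total = -total;
        var label = isGift ? "GIFT " + item : item;
        return label + ": " + total;
    }

    // After: each caller's flow is written out in full, with no flags.
    public static string SaleLine(string item, int cents, int taxCents)
    {
        return item + ": " + (cents + taxCents);
    }

    public static string RefundLine(string item, int cents)
    {
        return item + ": " + (-cents);
    }

    public static void Main()
    {
        // Each pair prints the same text, once via flags and once via a named flow.
        Console.WriteLine(Line("tea", 500, false, false, true, 40));
        Console.WriteLine(SaleLine("tea", 500, 40));
        Console.WriteLine(Line("tea", 500, true, false, false, 0));
        Console.WriteLine(RefundLine("tea", 500));
    }
}
</code></pre>
<p>The duplication between <code>SaleLine</code> and <code>RefundLine</code> is deliberate. Once every flow is spelled out, whatever they truly share becomes visible, and that is the part worth extracting.</p>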
<p>Once again, your tests are going to have your back as you go. You’ll know if that flow stays working or not. You might find that your Hated Code™ is less hate-worthy after you’ve cleaned it up. Maybe you’re more in sympathy with it now? Or maybe not.</p>
<h2 id="time-to-drive-those-wedges-in-with-a-sledgehammer">Time to drive those wedges in with a sledgehammer.</h2>
<p>Now you can strangler-fig/divide-and-conquer/split those rocks as you go. You’ll probably get the modularity boundaries closer to right than your predecessors, because you have a lot more information than they did: you have a far more developed system to study!</p>
<p>If you’re tight on resources, you might choose to do <em>nothing</em> about any specific modular chunk of code. Leave it where it is, and make incremental improvements opportunistically. If this segment is not performing well, or is doing the wrong thing, or is hard to maintain, or if the team is far more comfortable with working in some other language ecosystem, then replace it. Prioritize potential rewrites by how much you hate the current implementation; that is, how many ways they’re failing to do what good code does.</p>
<p>Here’s where I remind you that modularity in your system does not require splitting its components into separate microservices. Microservice APIs are strong module boundaries; these API boundaries resist change unless you plan carefully. On the other hand, these boundaries <em>do</em> resist attempts at clever end-runs around that modularity.<sup class="footnote-reference" id="fr-7-1"><a href="#fn-7">7</a></sup> I like to bundle together data that is of roughly similar size and changes at similar rates or in a similar style. CRUD data that is infrequently destructively updated and all lives in the same kind of database might all belong together. Geographical data that all uses PostGIS belongs with other data like that. This is itself a gigantic topic, so I won’t go further other than to remind you that microservices have tradeoffs. The important goal is to leave a system that can be <em>more easily rewritten</em> behind yourself.</p>
<h2 id="plan-to-rewrite-next-time">Plan to rewrite next time.</h2>
<p>All code has a lifespan.</p>
<p>Your designs make tradeoffs (always) that suit the context you’re working in:</p>
<ul>
<li>What language ecosystem is the current team comfortable using?</li>
<li>Do you need to get this project done rapidly, so some shortcuts are okay?</li>
<li>What performance characteristics are acceptable today?</li>
<li>What task does this component have to perform today?</li>
</ul>
<p>The context <em>around</em> working code changes over time. The business context the code exists in is guaranteed to change. Product requirements change. The tools your team is happy with today might make the team unhappy three years from now. Other parts of the system will change around it.</p>
<p>Make it easier for your future self or your successors to rewrite any given component of a system. If you know the lifespan of a decision, or when a scaling shift will make a component a good candidate for a rewrite, record that information right next to the code.</p>
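<p>One possible way to record that information next to the code, sketched under assumptions: <code>RevisitWhenAttribute</code>, <code>InProcessJobQueue</code>, and the condition text are all hypothetical. A plain comment does the same job; an attribute only adds the option of listing these notes with tooling later.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="csharp">using System;

// Hypothetical attribute recording when a design decision should be revisited.
[AttributeUsage(AttributeTargets.Class)]
public sealed class RevisitWhenAttribute : Attribute
{
    public string Condition { get; }
    public string DecidedOn { get; }

    public RevisitWhenAttribute(string condition, string decidedOn)
    {
        Condition = condition;
        DecidedOn = decidedOn;
    }
}

// The note lives right next to the code it describes.
[RevisitWhen("traffic outgrows a single process", "2022-05")]
public sealed class InProcessJobQueue
{
}

public static class Program
{
    public static void Main()
    {
        // Read the note back via reflection, as a reporting tool might.
        var note = (RevisitWhenAttribute)Attribute.GetCustomAttribute(
            typeof(InProcessJobQueue), typeof(RevisitWhenAttribute));
        Console.WriteLine($"Decided {note.DecidedOn}; revisit when {note.Condition}.");
    }
}
</code></pre>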
<h2 id="the-tl-dr">The tl;dr.</h2>
<p>It’s okay to hate that code base. It is hate-able. It’s okay to want to replace it. You can replace it! But you have to put in the work first. The work I’ve had to do in this situation looks like this:</p>
<ul>
<li>Understand it even if you dislike it. ⬅️ <em>treat it like a puzzle</em></li>
<li>Write tests. For the system. Mostly integration. ⬅️ <em>helps everything</em></li>
<li>Identify or create wedge points. ⬅️ <em>most of the time will go here</em></li>
<li>Split off chunks and rewrite. ⬅️ <em>the fun part</em></li>
<li>Shrink the mess until it’s tolerable. ⬅️ <em>satisfying!</em></li>
<li>Plan so rewriting the new chunks is easier next time. ⬅️ <em>pay it forward</em></li>
</ul>
<p>Anyway, this is what I’ve learned from trying to do this work with limited resources. It’s best not to be in this situation: instead devote time to maintaining the system as a system and every bit of code in it. But most of us don’t have time machines to prevent past technical leaders from making these mistakes.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-1">
<p>Being written in a language ecosystem you don’t like is not enough of a reason to rewrite something all by itself. If you’ve landed in a team that doesn’t know the language ecosystem that the company’s money engine is written in, your first task is to correct the hiring mistake of the past. You might have to become an expert in the thing you don’t know; you might (like me) discover that you dislike the thing you had to become an expert in. Probably the real takeaway is to do better due diligence than I did, and discover in advance what flavor of mess you’re expected to clean up. But sometimes, the ecosystem mismatch is the last misery on top of a pile of miseries. <a href="#fr-1-1">↩</a></p>
</li>
<li id="fn-2">
<p>This was literally true in my case. <a href="#fr-2-1">↩</a></p>
</li>
<li id="fn-3">
<p>If you have never seen rocks split by hand with the wedge and feather technique, check out this <a rel="noopener external" target="_blank" href="https://www.youtube.com/watch?v=I5QgueNooRs">video showing somebody breaking up a big boulder</a>. <a href="#fr-3-1">↩</a></p>
</li>
<li id="fn-4">
<p>I am nudging Chris Dickinson into blogging about how he approached this project at our mutual former employer, but until he does here’s a link to a <a rel="noopener external" target="_blank" href="https://twitter.com/isntitvacant/status/1493670128323497985">tweet about his approach to the work</a>. <a href="#fr-4-1">↩</a></p>
</li>
<li id="fn-5">
<p>This is the part of the rock-splitting video where you haul out the drill and bore a hole to stick a wedge into. The metaphor is now out of control because you can’t drill holes in big balls of mud, but, uh, let’s pretend the mud has been pressurized into rock over many thousands of years? <a href="#fr-5-1">↩</a></p>
</li>
<li id="fn-6">
<p>DRY is misunderstood, IMO. The original principle is “Every piece of knowledge must have a single, unambiguous, authoritative representation within a system.” This is a good principle! It does not mean that you need to collapse any two bits of code that look mostly the same. As with everything, advice has contexts. Everything in moderation. <a rel="noopener external" target="_blank" href="https://programmingisterrible.com/post/176657481103/repeat-yourself-do-more-than-one-thing-and">Tef is right</a>. <a href="#fr-6-1">↩</a></p>
</li>
<li id="fn-7">
<p>Though I have seen people manage to do that. E.g., replicating an entire db to get at a subset of its data rather than using the API that was put in front of the db specifically to hide the implementation details of the db schema. Sigh. But even this is a case of people doing what feels easiest: if the replication tools are right there and calling an API feels harder, they’ll reach for replication. The right solution is to make doing the right thing the easiest thing for everybody. This is more work for you, which you needed, right? <a href="#fr-7-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Why Rust&#x27;s postfix await syntax is good</title>
          <link>https://blog.ceejbot.com/posts/postfix-await/</link>
          <pubDate>Fri, 13 May 2022 16:00:31 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/postfix-await/</guid>
          <description>This is a blog post for non-Rustaceans about why Rust&#x27;s await syntax is good.</description>
          <content:encoded><![CDATA[<p>The other day on Twitter Kat Marchán said this:</p>
<blockquote>
<p>my strongest opinion on programming languages is that postfix .await is the single greatest innovation in the past 70+ years of programming language theory and history and you can’t convince me otherwise.
— Kat Marchán has permanently left this site (@zkat__) May 12, 2022</p>
</blockquote>
<p>And Jan Lehnardt asked:</p>
<blockquote>
<p>I’ve seen a few folks say this. Do you know of a “here is how this compares to async/await keywords” for someone who barely rusts?
— Jan Lehnardt is on Mastodon: @janl@narrativ.es (@janl) May 12, 2022</p>
</blockquote>
<p>I didn’t know of any, and a little searching didn’t turn one up. So here’s one? I hope? If you are not programming Rust a lot, and want to know why not-language-designers like me think that Rust’s <a rel="noopener external" target="_blank" href="https://rust-lang.github.io/async-book/03_async_await/01_chapter.html">await syntax</a> is good, this is the blog post for you.</p>
<p>While looking around for somebody explaining why this is nice syntax, I found <a rel="noopener external" target="_blank" href="https://github.com/rust-lang/rust/issues/57640">one of the discussions</a> about possibilities before it was selected. That’s a pretty long conversation, and I enjoyed skimming it. <a rel="noopener external" target="_blank" href="https://github.com/rust-lang/rust/issues/57640#issuecomment-455361619">This comment</a> examining what Rust might look like with a number of the syntax possibilities was particularly neat. It immediately jumped out at me that the one they landed on (postfix field) and its close relation (postfix method) felt more Rust-y. But why? You might have to be a Rust user already to feel that.</p>
<p>So in order to explain why Rust’s <code>.await</code> is a nice bit of syntax, I will start by explaining two other things: how chaining calls is idiomatic Rust, and how error propagation with another nice bit of syntax, <code>?</code>, supports this.</p>
<h2 id="hoo-hah-back-on-the-chain-gang">Hoo hah back on the chain gang</h2>
<p>Chaining is a very common idiom for taking one collection and transforming it into another, perhaps even one of a different type. This snippet takes a collection of id-having-things (any collection type, so long as it is iterable), iterates through them, plucks out the ids, and re-collects them into a Vec:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> ids</span><span style="color: light-dark(#427B58, #8EC07C);">:</span><span style="color: light-dark(#B57614, #FABD2F);"> Vec</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">usize</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#076678, #83A598);"> things_with_ids</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">iter</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">map</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#9D0006, #FB4934);">|</span><span style="color: light-dark(#076678, #83A598);">xs</span><span style="color: light-dark(#9D0006, #FB4934);">|</span><span style="color: light-dark(#076678, #83A598);"> xs</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span>id</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">collect</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
<p>Here’s a slightly-edited real-world example, which chains some stuff to end up with a string:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> malformed_kinds</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#076678, #83A598);"> requested_kinds</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">iter</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">filter</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#9D0006, #FB4934);">|</span><span style="color: light-dark(#076678, #83A598);">xs</span><span style="color: light-dark(#9D0006, #FB4934);">|</span><span style="color: light-dark(#9D0006, #FB4934);"> !</span><span style="color: light-dark(#B57614, #FABD2F);">is_valid</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">xs</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">cloned</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">collect</span><span style="color: light-dark(#427B58, #8EC07C);">::</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">Vec</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#076678, #83A598);">_</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">join</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#79740E, #B8BB26);">,</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
<p>And the idiom is seen in other areas of API design. Here’s how my <a rel="noopener external" target="_blank" href="https://github.com/ceejbot/modcache">little Skyrim mod tool</a> sends posts (with some editing to make it a useful example):</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> agent</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> ureq</span><span style="color: light-dark(#427B58, #8EC07C);">::</span><span style="color: light-dark(#B57614, #FABD2F);">AgentBuilder</span><span style="color: light-dark(#427B58, #8EC07C);">::</span><span style="color: light-dark(#B57614, #FABD2F);">new</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">timeout_read</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#B57614, #FABD2F);">Duration</span><span style="color: light-dark(#427B58, #8EC07C);">::</span><span style="color: light-dark(#B57614, #FABD2F);">from_secs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#8F3F71, #D3869B);">50</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">timeout_write</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#B57614, #FABD2F);">Duration</span><span style="color: light-dark(#427B58, #8EC07C);">::</span><span style="color: light-dark(#B57614, #FABD2F);">from_secs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#8F3F71, #D3869B);">5</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">build</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> maybe_response</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#076678, #83A598);"> agent</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">post</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">uri</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">set</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#79740E, #B8BB26);">apikey</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#427B58, #8EC07C);"> &amp;</span><span style="color: light-dark(#076678, #83A598);">self</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span>apikey</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">set</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#79740E, #B8BB26);">user-agent</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#7C6F64, #A89984);"> &quot;</span><span style="color: light-dark(#79740E, #B8BB26);">modcache: github.com/ceejbot/modcache</span><span style="color: light-dark(#7C6F64, #A89984);">&quot;</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#427B58, #8EC07C);">    .</span><span style="color: light-dark(#B57614, #FABD2F);">send_form</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">body</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
<p>All of this is to say: chaining like this is common in Rust.</p>
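<p>If you want something you can paste into a file and run, here is a minimal sketch of the same two chaining shapes. The <code>Thing</code> struct and the <code>sample_things</code>, <code>pluck_ids</code>, and <code>long_names</code> helpers are made up for illustration; they are not from modcache or any real codebase.</p>

```rust
// Hypothetical type for illustration only.
struct Thing {
    id: usize,
    name: String,
}

// Hypothetical sample data so the example runs on its own.
fn sample_things() -> Vec<Thing> {
    vec![
        Thing { id: 3, name: "hedgehog".to_string() },
        Thing { id: 7, name: "otter".to_string() },
        Thing { id: 12, name: "capybara".to_string() },
    ]
}

// Iterate, pluck out the ids, and re-collect them into a Vec.
fn pluck_ids(things: &[Thing]) -> Vec<usize> {
    things.iter().map(|xs| xs.id).collect()
}

// Filter, transform, collect, then join: the chain ends in a String.
fn long_names(things: &[Thing]) -> String {
    things
        .iter()
        .filter(|xs| xs.name.len() > 5)
        .map(|xs| xs.name.clone())
        .collect::<Vec<_>>()
        .join(",")
}

fn main() {
    let things = sample_things();
    println!("{:?}", pluck_ids(&things));
    println!("{}", long_names(&things));
}
```

<p>Each step hands its result to the next one, so the code reads top to bottom in the order the work happens.</p>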
<h2 id="don-t-break-the-chain">Don’t break the chain</h2>
<p>Now, there isn’t any error handling visible in the above code. What does error handling look like with chaining? Does it break the chains? It used to! The <code>?</code> error propagation symbol is new to Rust since I first started using it, and its introduction has made writing error handling a lot nicer.</p>
<p>Rust allows you to express that an operation might fail by returning a <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/result/"><code>Result</code></a>. This is a sum type:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">enum</span><span style="color: light-dark(#B57614, #FABD2F);"> Result</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">T</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#B57614, #FABD2F);"> E</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">   Ok</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#B57614, #FABD2F);">T</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">,</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">   Err</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#B57614, #FABD2F);">E</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">,</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
<p>If all went well, you get the <code>Ok</code> variant with your data in it. If it did not, you get the <code>Err</code> variant with your error type. The Rust compiler makes you handle both variants in your code.</p>
<p>Here’s a faked example of getting some data from a function that might fail, and doing something with that if we can.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">//</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> Our fetch talks to a db so it might fail for reasons</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">//</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> beyond our control, so we return a result type.</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">fn</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);"> -&gt;</span><span style="color: light-dark(#B57614, #FABD2F);"> Result</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">Vec</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">Animal</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#B57614, #FABD2F);"> SomeErrorType</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">    //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> blocking call to a db here</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span>
<span class="giallo-l"></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">//</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> We depend on a fallible function, so we are fallible too.</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">fn</span><span style="color: light-dark(#B57614, #FABD2F);"> count_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);"> -&gt;</span><span style="color: light-dark(#B57614, #FABD2F);"> Result</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">usize</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#B57614, #FABD2F);"> SomeErrorType</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">    //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> this is a Result</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">    let</span><span style="color: light-dark(#076678, #83A598);"> maybe_animals</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">    //</span><span style="color: light-dark(#928374, #928374);font-style: italic;">... so we match on it to see if we succeeded or not</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">    match</span><span style="color: light-dark(#076678, #83A598);"> maybe_animals</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">        Ok</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">animals</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);"> =&gt;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">            //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> we got some animals! let&#39;s find the hedgies</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">            let</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">len</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">            Ok</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">count</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">        }</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">        Err</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">e</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);"> =&gt;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">            //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> We failed to get animals. We handle the error in whatever</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">            //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> way makes sense for the program. Here we just propagate</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">            //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> the error on up to the caller.</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">            Err</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">e</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">        }</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">    }</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
<p>This error handling pattern was everywhere in my Rust code, being verbose all over the place. It’s also predictable! This makes it a good candidate for sugar. So the <a rel="noopener external" target="_blank" href="https://m4rw3r.github.io/rust-questionmark-operator"><code>?</code> syntax</a> for this was added in <a rel="noopener external" target="_blank" href="https://blog.rust-lang.org/2016/11/10/Rust-1.13.html#the--operator">Rust v1.13 at the end of 2016</a>. If all you want to do is return immediately when you have an error and carry on when you get an OK result, use <code>?</code>.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">fn</span><span style="color: light-dark(#B57614, #FABD2F);"> count_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);"> -&gt;</span><span style="color: light-dark(#B57614, #FABD2F);"> Result</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">usize</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#B57614, #FABD2F);"> SomeErrorType</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">{</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">    let</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#7C6F64, #A89984);">;</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> &lt;-- note the ?</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">    //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> if the fallible function failed, we have bopped that</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">    //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> error on out &amp; can proceed</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">    let</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">len</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#B57614, #FABD2F);">    Ok</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#076678, #83A598);">count</span><span style="color: light-dark(#7C6F64, #A89984);">)</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
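<p>The snippets so far leave out the db and the hedgehog filter, so they won’t compile on their own. Here is a runnable sketch of the hand-written <code>match</code> and the <code>?</code> version side by side. Everything in it is an assumption made for illustration: the <code>Animal</code> enum, the shape of <code>SomeErrorType</code>, the <code>db_is_up</code> flag that fakes a failure, and the plain iterator filter standing in for <code>filter_for_hedgehogs</code>.</p>

```rust
// All names here are hypothetical stand-ins for illustration.
#[derive(Debug, PartialEq)]
enum Animal {
    Hedgehog,
    Otter,
}

#[derive(Debug, PartialEq)]
struct SomeErrorType(String);

// Stand-in for the blocking db call. `db_is_up` lets us force a failure.
fn fetch_all_animals(db_is_up: bool) -> Result<Vec<Animal>, SomeErrorType> {
    if db_is_up {
        Ok(vec![Animal::Hedgehog, Animal::Otter, Animal::Hedgehog])
    } else {
        Err(SomeErrorType("db unreachable".to_string()))
    }
}

// The verbose version: match on the Result by hand.
fn count_hedgehogs_with_match(db_is_up: bool) -> Result<usize, SomeErrorType> {
    match fetch_all_animals(db_is_up) {
        Ok(animals) => Ok(animals.iter().filter(|a| **a == Animal::Hedgehog).count()),
        Err(e) => Err(e),
    }
}

// The sugared version: `?` returns early on Err and unwraps on Ok.
// (It also converts the error with `From`, which is a no-op here
// because both functions use the same error type.)
fn count_hedgehogs(db_is_up: bool) -> Result<usize, SomeErrorType> {
    let animals = fetch_all_animals(db_is_up)?;
    Ok(animals.iter().filter(|a| **a == Animal::Hedgehog).count())
}

fn main() {
    // Both versions agree whether the fetch succeeds or fails.
    for db_is_up in [true, false] {
        let sugared = count_hedgehogs(db_is_up);
        assert_eq!(sugared, count_hedgehogs_with_match(db_is_up));
        println!("{:?}", sugared);
    }
}
```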
<p>You can see that error handling is a lot less verbose when it can fit into this pattern. In fact, the idiomatic Rust way to implement the above function is to chain it all together:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">len</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
<p>Which is super-compact and might not need its own function at all. This stays super-compact if our hedgehog filter is fallible as well, though I’m not sure why it would be fallible. It would look like this:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">len</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre><h2 id="finally-we-get-to-async-and-await">Finally we get to <code>async</code> and <code>await</code></h2>
<p>Now! Let’s suppose we have moved to the magic land of async Rust programming and have a non-blocking db fetch for our animals.</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">//</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> we must say the magic word</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">async</span><span style="color: light-dark(#9D0006, #FB4934);"> fn</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);"> -&gt;</span><span style="color: light-dark(#B57614, #FABD2F);"> Result</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">Vec</span><span style="color: light-dark(#7C6F64, #A89984);">&lt;</span><span style="color: light-dark(#B57614, #FABD2F);">Animal</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);">,</span><span style="color: light-dark(#B57614, #FABD2F);"> SomeErrorType</span><span style="color: light-dark(#7C6F64, #A89984);">&gt;</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">    //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> we do all the same work as before</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">    //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> and maybe call some async functions here too</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
<p>Now when we call that function, what we get back is actually a <a rel="noopener external" target="_blank" href="https://doc.rust-lang.org/std/future/trait.Future.html"><code>Future</code></a>. To use it, we have to call <code>poll</code> on it, or more idiomatically, we <code>await</code> it to resolve it to a value. (There’s a link in the further reading section if you want to learn more.) This is a lot like what happens in JavaScript when we get a promise back from an async function:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="javascript"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">const</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#9D0006, #FB4934);"> await</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span>(</span><span>)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
<p>But Rust’s chosen syntax uses a field-like postfix on a Future, and this is the specific thing I think is neat:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#9D0006, #FB4934);">await</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
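<p>If you want to see the “what we get back is a <code>Future</code>” part for yourself, here is a minimal sketch using only the standard library. The <code>Animal</code> struct, the <code>FETCH_RAN</code> flag, the <code>NoopWaker</code> type, and the <code>poll_once</code> helper are all names I made up for illustration, and this <code>fetch_all_animals</code> is infallible to keep things small. It shows that nothing in the function body runs until something polls the future.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical stand-in for the Animal type used in the examples.
#[derive(Debug, Clone, PartialEq)]
pub struct Animal {
    pub species: String,
}

// Flipped to true only when the body of the async fn actually executes.
pub static FETCH_RAN: AtomicBool = AtomicBool::new(false);

// Simplified and infallible; a real fetch would talk to a database.
pub async fn fetch_all_animals() -> Vec<Animal> {
    FETCH_RAN.store(true, Ordering::SeqCst);
    vec![Animal { species: "hedgehog".to_string() }]
}

// A waker that does nothing. That is enough here because this future
// never has to wait on anything.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Polls a future exactly once by hand, the way an executor would.
pub fn poll_once<F: Future>(fut: F) -> Poll<F::Output> {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    fut.as_mut().poll(&mut cx)
}
```

<p>Calling <code>fetch_all_animals()</code> leaves <code>FETCH_RAN</code> untouched; only <code>poll_once</code> (or an <code>.await</code> inside some other async function) makes the body run. In real code you would let a runtime do the polling rather than writing this by hand.</p>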
<p>Look at what happens if we’re calling fallible functions and want our error handling in-line! We stick <code>?</code> on the <code>.await</code> to propagate any errors and unwrap a result in-line:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="rust"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#9D0006, #FB4934);">await</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">len</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">//</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> and if our hedgehog filter were both async and fallible....</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#9D0006, #FB4934);">await</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#9D0006, #FB4934);">await</span><span style="color: light-dark(#427B58, #8EC07C);">?</span><span style="color: light-dark(#427B58, #8EC07C);">.</span><span style="color: light-dark(#B57614, #FABD2F);">len</span><span style="color: light-dark(#7C6F64, #A89984);">(</span><span style="color: light-dark(#7C6F64, #A89984);">)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
<p>That is the use case that shows why I think this specific syntax choice is brilliant. Precedence is clear. We don’t have to wrap things in parens for human readability or to control precedence. If we read a chain, the operations are mentioned in the order that they happen. It works with the existing idioms rather than against them.<sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">1</a></sup></p>
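<p>Here is a self-contained sketch of that chain that you can compile with nothing but the standard library. Everything beyond the names already used in this post is an assumption of mine: the <code>HedgehogFilter</code> extension trait, the <code>db_up</code> parameter that lets us force a failure, the <code>DatabaseUnavailable</code> variant, and the tiny busy-polling <code>block_on</code>, which stands in for a real async runtime.</p>

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

#[derive(Debug, Clone, PartialEq)]
pub struct Animal {
    pub species: String,
}

#[derive(Debug, PartialEq)]
pub enum SomeErrorType {
    DatabaseUnavailable, // hypothetical failure mode
}

// Extension trait (an assumption) so the filter can be chained on a Vec.
pub trait HedgehogFilter {
    fn filter_for_hedgehogs(self) -> Vec<Animal>;
}

impl HedgehogFilter for Vec<Animal> {
    fn filter_for_hedgehogs(self) -> Vec<Animal> {
        self.into_iter().filter(|a| a.species == "hedgehog").collect()
    }
}

// `db_up` is a made-up switch so the error path can be exercised.
pub async fn fetch_all_animals(db_up: bool) -> Result<Vec<Animal>, SomeErrorType> {
    if !db_up {
        return Err(SomeErrorType::DatabaseUnavailable);
    }
    Ok(["hedgehog", "cat", "hedgehog"]
        .iter()
        .map(|s| Animal { species: s.to_string() })
        .collect())
}

// The chain from the post: await, propagate the error, filter, count.
pub async fn count_hedgehogs(db_up: bool) -> Result<usize, SomeErrorType> {
    let count = fetch_all_animals(db_up).await?.filter_for_hedgehogs().len();
    Ok(count)
}

struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Toy executor: polls in a loop until the future is ready.
/// Fine for a demo; a real program would use an async runtime.
pub fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

<p>With this in place, <code>block_on(count_hedgehogs(true))</code> walks the whole chain left to right, and <code>block_on(count_hedgehogs(false))</code> stops at the <code>?</code> and hands back the error without ever reaching the filter.</p>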
<p>Another thing that’s interesting to me here is that this is <em>not</em> the choice most modern languages made for their syntax. Lots of them use a prefixed <code>await</code> keyword. In JavaScript, if we were chaining, it would look like:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="javascript"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">const</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span> (</span><span style="color: light-dark(#9D0006, #FB4934);">await</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span>(</span><span>)</span><span>)</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span>(</span><span>)</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#8F3F71, #D3869B);">length</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">//</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> and errors will throw exceptions that we&#39;re letting bubble up</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">//</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> and if we&#39;re chaining more than one async thing...</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">const</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span> (</span><span style="color: light-dark(#9D0006, #FB4934);">await</span><span> (</span><span style="color: light-dark(#9D0006, #FB4934);">await</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span>(</span><span>)</span><span>)</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span>(</span><span>)</span><span>)</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#8F3F71, #D3869B);">length</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span></code></pre>
<p>But I’d probably never write either of those and definitely never the second. I am far more likely to write:</p>
<pre class="giallo" style="color-scheme: light dark; color: light-dark(#3C3836, #EBDBB2); background-color: light-dark(#F9F5D7, #1D2021);"><code data-lang="javascript"><span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">let</span><span style="color: light-dark(#076678, #83A598);"> count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#8F3F71, #D3869B);"> 0</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#9D0006, #FB4934);">try</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#AF3A03, #FE8019);">  const</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#9D0006, #FB4934);"> await</span><span style="color: light-dark(#B57614, #FABD2F);"> fetch_all_animals</span><span>(</span><span>)</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#076678, #83A598);">  count</span><span style="color: light-dark(#427B58, #8EC07C);"> =</span><span style="color: light-dark(#076678, #83A598);"> animals</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#B57614, #FABD2F);">filter_for_hedgehogs</span><span>(</span><span>)</span><span style="color: light-dark(#7C6F64, #A89984);">.</span><span style="color: light-dark(#8F3F71, #D3869B);">length</span><span style="color: light-dark(#7C6F64, #A89984);">;</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span><span style="color: light-dark(#9D0006, #FB4934);"> catch</span><span> (</span><span style="color: light-dark(#076678, #83A598);">ex</span><span>)</span><span style="color: light-dark(#7C6F64, #A89984);"> {</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">  //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> handle the error at this level</span></span>
<span class="giallo-l"><span style="color: light-dark(#928374, #928374);font-style: italic;">  //</span><span style="color: light-dark(#928374, #928374);font-style: italic;"> I&#39;d omit the try/catch if I wanted the error to propagate</span></span>
<span class="giallo-l"><span style="color: light-dark(#7C6F64, #A89984);">}</span></span></code></pre>
<p>My aversion to chaining partly comes from the fact that I must use the parens to express my intent. The syntax of any programming language shapes what code feels idiomatic and most readable and what code feels like patting a cat tail to head. There’s nothing right or wrong about any of it, because all of them have found a way to express the concept.</p>
<p>Is postfix await a small bit of syntax? Yes. Is it thoughtfully chosen out of many possibilities? Yes. Is it very much in tune with the Rust syntax around it? Also yes. This is what I appreciate most about the Rust project: its concern for the experience of the human beings using the language.</p>
<h2 id="further-reading">Further reading</h2>
<p>If you would really like to understand Rust futures, you should read <a rel="noopener external" target="_blank" href="https://twitter.com/fasterthanlime">@fasterthanlime</a>’s article <a rel="noopener external" target="_blank" href="https://fasterthanli.me/articles/understanding-rust-futures-by-going-way-too-deep">“Understanding Rust futures by going way too deep”</a>.</p>
<p>If you are comfortable reading Rust, and want to know more about async executors and how they work, check out <a rel="noopener external" target="_blank" href="https://github.com/mgattozzi/whorl">whorl</a>. This repo walks you through the implementation of an async executor and shows you what <code>await</code> desugars to. (Hat tip to Chris Dickinson for telling me about this!)</p>
<p>If you have any pointers to other posts about why this syntax is neat, please send them to me and I will link! Also, if you have your own reasons about why this syntax is nice, please do write them up and I will link those too!</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-1">
<p>You might say that the people who chose the await syntax had a strong hold on <a rel="noopener external" target="_blank" href="https://blog.ceejbot.com/posts/programming-as-theory-building/">the theory of Rust</a>. <a href="#fr-1-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Programming as Theory-Building</title>
          <link>https://blog.ceejbot.com/posts/programming-as-theory-building/</link>
          <pubDate>Thu, 12 May 2022 10:00:00 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/programming-as-theory-building/</guid>
          <description>How reading &quot;Programming as Theory-Building&quot; changed my priorities as a technical leader and a programmer.</description>
          <content:encoded><![CDATA[<p>A couple of years ago now I read Peter Naur’s <a rel="noopener external" target="_blank" href="https://gist.github.com/onlurking/fc5c81d18cfce9ff81bc968a7f342fb1">“Programming as Theory-Building”</a> (alternative <a rel="noopener external" target="_blank" href="https://pages.cs.wisc.edu/~remzi/Naur.pdf">PDF link</a>) and it was a mind-blower. Yes, this Naur is the same Naur you know from <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form">Backus-Naur Form aka BNF</a>. Anyway, you should go and read this essay now. It’s not very long. I’ll wait while you read it.</p>
<p>Back? Okay! Let’s do a bit of a crawl through some main points.</p>
<h2 id="highlights-of-the-essay">Highlights of the essay</h2>
<p>Naur opens with his thesis statement, the argument he’s about to make:</p>
<blockquote>
<p>[P]rogramming properly should be regarded as an activity by which the programmers form or achieve a certain kind of insight, a theory, of the matters at hand. This suggestion is in contrast to what appears to be a more common notion, that programming should be regarded as a production of a program and certain other texts.</p>
</blockquote>
<p>Programming isn’t about writing the code; it’s about understanding the problem and expressing that understanding through code. That understanding is what allows us to modify the code without harming its design. Naur discusses three real-world cases of existing programs being modified over time by a team closely connected to the original team, a team that had only documentation to go on, and the same team. The team with the closest connection was most successful at making additions that worked with the existing design, and not against it:</p>
<blockquote>
<p>The conclusion seems inescapable that at least with certain kinds of large programs, the continued adaption, modification, and correction of errors in them, is essentially dependent on a certain kind of knowledge possessed by a group of programmers who are closely and continuously connected with them.</p>
</blockquote>
<p>Naur then goes into what “theory” means, philosophically and in practice. Here are his three points about what a programmer having the theory can do, lightly edited:</p>
<blockquote>
<ol>
<li>Explain how the solution relates to the affairs of the world<sup class="footnote-reference" id="fr-3-1"><a href="#fn-3">1</a></sup> that it helps to handle.</li>
<li>Explain why each part of the program is what it is, in other words is able to support the actual program text with a justification of some sort.</li>
<li>Respond constructively to any demand for a modification of the program so as to support the affairs of the world in a new manner.</li>
</ol>
</blockquote>
<p>Naur is talking about “programs” here, and today we add on top of that the systems in which many programs interoperate to do complex things. So the demands are a little higher: we need to have theories of the systems we build and maintain, as well as theories of the pieces of that system.</p>
<p>The term I’m more likely to use myself for what Naur calls “the theory of the program” would be “a mental model of the system”. Some accumulation of facts in my head has allowed me to build a map of the territory– where things are implemented or “happen” in the system– and to predict behaviors given inputs. My mental model might be shallow in places where I haven’t had to make changes and very deep and detailed where I have recently worked. I need to keep that model constantly refreshed through review and rehearsal. While I’m model-building I find myself making small changes to the code, like renaming variables for clarity once I understand them, tweaking log lines, or adding comments.</p>
<p>The ability to fluidly <em>make functional changes</em> to the system I’ve modeled requires even deeper understanding of how it’s implemented, because there are patterns and structures in the code that I have to work with rather than against. This gets closer to what Naur means when he talks about the theory of a program. It’s not just the what, but the how and the why. Why does having a theory of the program matter? Because this enables rapid and effective modification of the program to respond to changing requirements without piling up technical debt or hacks.</p>
<blockquote>
<p>It must be obvious that built–in program flexibility is no answer to the general demand for adapting programs to the changing circumstances of the world.</p>
</blockquote>
<p>Alas, it is not obvious, we say, looking at all the premature generalization happening around us.</p>
<p>I’m going to quote this entire paragraph because of how important it feels to me, with some commentary.</p>
<blockquote>
<p>On the basis of the Theory Building View the decay of a program text as a result of modifications made by programmers without a proper grasp of the underlying theory becomes understandable. As a matter of fact, if viewed merely as a change of the program text and of the external behaviour of the execution, a given desired modification may usually be realized in many different ways, all correct. At the same time, if viewed in relation to the theory of the program these ways may look very different, some of them perhaps conforming to that theory or extending it in a natural way, while others may be wholly inconsistent with that theory, perhaps having the character of unintegrated patches on the main part of the program.</p>
</blockquote>
<p>We might call these “unintegrated patches” technical debt or hacks, but either way, we know it’s a problem when we see it. Somebody has worked against the grain of the wood when carving a new feature into the system, and it feels wrong. The hacks get in the way when you need to make the next change. They might not work well because they’re at odds with other design choices. They might be sitting right next to an already-existing affordance to add that new behavior!</p>
<p>Continuing on in this paragraph:</p>
<blockquote>
<p>This difference of character of various changes is one that can only make sense to the programmer who possesses the theory of the program. At the same time the character of changes made in a program text is vital to the longer term viability of the program. For a program to retain its quality it is mandatory that each modification is firmly grounded in the theory of it. Indeed, the very notion of qualities such as simplicity and good structure can only be understood in terms of the theory of the program, since they characterize the actual program text in relation to such program texts that might have been written to achieve the same execution behaviour, but which exist only as possibilities in the programmer’s understanding.</p>
</blockquote>
<p>People who do have the theory of the program can make changes that work with what’s there already. They know where the affordances are. Naur says that simplicity and quality only make sense in the context of that code to begin with, and this point is a good one. Let’s try another metaphor: Writing a program is like finding a domain-specific language to express the problem and its solution, a language that expresses your understanding of the domain. It’s a truism that code is communication. It is primarily communication with other humans, not with a compiler, with a set of verbs and nouns chosen by you as the best expression of your understanding of the problem. Other people reading your code must learn to read your new language, and to make changes they need to write it.</p>
<p>How do programmers learn a theory of the system? Naur says documentation and the source code are not enough, and someone new to a system needs hands-on mentoring:</p>
<blockquote>
<p>What is required is that the new programmer has the opportunity to work in close contact with the programmers who already possess the theory, so as to be able to become familiar with the place of the program in the wider context of the relevant real world situations and so as to acquire the knowledge of how the program works and how unusual program reactions and program modifications are handled within the program theory.</p>
</blockquote>
<p>Humans learn through guided practice with others. Spend time working with people who understand a system, and you’ll begin to understand it too.</p>
<p>This is great if the people who have the theory of a system are still around to talk to.</p>
<h2 id="nobody-s-around-any-more">Nobody’s around any more</h2>
<p>The real world is often more like Naur’s “group B” case, where further development on software happened without the benefit of close contact with theory-holders. I’ve described this a couple of times as being a software archaeologist, digging out bits of architecture and sorting through refuse pits to figure out how a past software team lived and what the heck they were thinking about. Given reality, let’s ask two practical questions in response to Naur’s article:</p>
<ol>
<li>How can you rebuild the theory of a program or system if its original authors aren’t around to teach you?</li>
<li>How can you leave useful information behind yourself to help future maintainers rebuild the theory you have?</li>
</ol>
<p>In my experience, I had to spend a lot of time reading code and building my own mental model of the software, how it was constructed, and how the pieces worked together– the archaeologist metaphor I mentioned above. I had to reconstruct the theory by looking at the textual artifact and what it was doing in practice.</p>
<p>My colleague <a rel="noopener external" target="_blank" href="https://neversaw.us">Chris Dickinson</a> does some interesting things while doing code spelunks. He generates other artifacts as he goes, such as textual call diagrams that he can turn into graphs with graphviz. He’ll do this as a vertical slice for a specific code path as well, ending up with a detailed call flow chart showing every network traversal or call made to build a single web page, for instance. There are cognitive reasons why making drawings or notes like this is helpful– you improve your understanding of a concept by expressing it in a different form than you’re receiving it. (This is related to why taking notes in a lecture is helpful. Hear -&gt; write -&gt; read.)</p>
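<p>I don’t know the exact format Chris uses for his notes, so treat this as one possible sketch of the idea rather than his method: jot down caller/callee pairs as you read, then render them as Graphviz DOT text. The <code>to_dot</code> function and the call names in the usage note are invented for illustration.</p>

```rust
/// Renders (caller, callee) pairs as a Graphviz DOT digraph.
/// `graph_name` must be a plain identifier (letters, digits, underscores).
/// Node names are wrapped in double quotes, so they may contain `::` or
/// dots, but must not themselves contain a double quote.
pub fn to_dot(graph_name: &str, calls: &[(&str, &str)]) -> String {
    let mut out = format!("digraph {} {{\n", graph_name);
    for (caller, callee) in calls {
        // One edge per observed call.
        out.push_str(&format!("    \"{}\" -> \"{}\";\n", caller, callee));
    }
    out.push_str("}\n");
    out
}
```

<p>Feeding it something like <code>[("handle_request", "fetch_user"), ("fetch_user", "db_query")]</code>, saving the result as <code>calls.dot</code>, and running <code>dot -Tsvg calls.dot -o calls.svg</code> gives you a picture of the slice you just read.</p>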
<p>I often tried to use commit logs to figure out why specific changes were made, but was more often frustrated than enlightened. The past generations of programmers at this company did not often write helpful commit messages.<sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">2</a></sup> One specific commit that introduced an incredibly expensive bug had a commit message like “scope fixes”. Why was the change made? What did the programmer intend? Nobody knows.</p>
<p>I had to supplement by talking with people who weren’t familiar with the source code but could tell me what it did and why that was desirable or not. These people might not be the programmer-operators that Naur discusses, but they are expert user-operators. They have a theory of the program too! The one remaining programmer on staff who really understood a particular piece of software was priceless, and I’m grateful they were as amiable about explaining things as they were.</p>
<h2 id="an-aside-about-retention">An aside about retention</h2>
<p>I wish now to point out what might be obvious to you about the cost of team turnover. When there’s one human being left on a team who understands how that pile of legacy code works, you’re in trouble. If there are none, you’re in worse trouble. You have to hire people who can walk into messes cold and figure them out without help, and those people don’t come cheap.</p>
<p>Keep people around. Give them raises rather than making them leave to get more money. Keep them feeling good in their daily work.</p>
<h2 id="since-we-haven-t-prevented-let-s-try-curing">Since we haven’t prevented, let’s try curing</h2>
<p>Given this experience, and given the lightbulb moment that reading Naur gave me, I changed my development practice. I started thinking about ways I could help other programmers– maybe programmers I’d never meet– build useful theories about the software they inherited to maintain. If I found good ways to do this, I could then socialize those approaches and turn them into team practices.</p>
<p>Here are some of the things I’ve started doing.</p>
<p>I deliberately distinguished <em>maintainer documentation</em> from <em>user documentation</em>. The people who need to consume a service’s API are a completely different audience from the people who need to maintain that service. They might have different skillsets and programming languages: a person working on a website is probably writing in Typescript, while the service might be in Rust, C#, or anything at all. They have completely different concerns as well. The maintainer needs to dip into the source and needs to read details about the internals. Making the consumer of an API dip into the internals of its implementation would be a waste of their time.</p>
<p>Maintainers are the people who need the theory, so I invested time writing documentation for them. That documentation belongs as close to the source code as possible, and at least partly inside it in the form of comments. Don’t waste time documenting what can be seen through simple reading. Document <em>why</em> that function exists and what purpose it serves in the software. When might I call it? Does it have side effects? Is there anything important about the inputs and outputs that I might not be able to deduce by reading the source of the function? All of those things are clues about the thinking of the original author of the function that can help their successor figure out what that author’s theory of the program was.</p>
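<p>As a sketch of what that can look like in practice, here is a small Rust example. The <code>SessionCache</code> type and the rationale in its comments are entirely made up; the point is the shape of the doc comment, which answers the “why”, “when”, and “side effects” questions instead of restating the code.</p>

```rust
use std::collections::HashMap;

/// In-memory session store (hypothetical; invented for this example).
pub struct SessionCache {
    // session id -> expiry time, in seconds since the Unix epoch
    expires_at: HashMap<String, u64>,
}

impl SessionCache {
    pub fn new() -> Self {
        SessionCache { expires_at: HashMap::new() }
    }

    pub fn insert(&mut self, session_id: &str, expires_at: u64) {
        self.expires_at.insert(session_id.to_string(), expires_at);
    }

    pub fn len(&self) -> usize {
        self.expires_at.len()
    }

    /// Drops every session whose expiry time is at or before `now` and
    /// returns how many were dropped.
    ///
    /// Why this exists: (made-up rationale) the cache has no background
    /// sweeper, so memory only comes back when a caller asks for it.
    ///
    /// When to call it: before a lookup on the request path, or from a
    /// periodic job if lookups are rare.
    ///
    /// Side effects: mutates the cache in place. Dropped sessions are
    /// gone, and those users will have to log in again.
    ///
    /// Not obvious from the body: `now` is passed in rather than read
    /// from the system clock so that tests can control time.
    pub fn evict_expired(&mut self, now: u64) -> usize {
        let before = self.expires_at.len();
        self.expires_at.retain(|_, expiry| *expiry > now);
        before - self.expires_at.len()
    }
}
```

<p>None of the doc comment repeats what <code>retain</code> does; a maintainer can read that. What they can’t read off the page is the reasoning, and that is what the comment records.</p>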
<p>Chunk up a level: writing about that function might help a maintainer fix a bug with it, but that isn’t sufficient for getting across the theory of the program. There are structural choices you make as you put together the program, as well as major decisions you make that inform the design. More specifically, the program exists to solve a problem, some “affair of the world” that Naur refers to. What was that problem? Is there a concise statement of that problem anywhere? What approach did you take to solving that problem statement? What tradeoffs did you make and why? What values did you hold as you made those tradeoffs? Why did you organize the source code in that particular way? What belongs where?</p>
<p>I started putting design documents, decision records, notes about spikes, and so on into the same repo as the source code. If you’re doing code deep-dives and generating artifacts about what you learn, check those artifacts into the source repo too! Put the videos somewhere durable and link to them. I duplicated some of these documents in the company’s documentation platform of choice, but the duplicates were not as important to me. <em>I can’t guarantee a future maintainer will discover those artifacts.</em> In fact, I can’t predict that any documentation platform <em>other</em> than the source code repo will exist in the future.</p>
<p>Toward the end of our joint tenure at our most recent job, my colleague Chris and I recorded videos of us doing code-centric deep dives through specific interesting, important, or especially difficult-to-understand aspects of the system. My hope is that these more conversational artifacts substitute in some way for having human mentors present to give people personal guided tours.</p>
<p>Oh yeah, and I continued writing tomes in my PR/commit messages, and being fussy about what actually lands into the mainline branch from my PRs.</p>
<h2 id="a-theory-of-theory-building">A theory of theory-building</h2>
<p>Here are Naur’s points, rephrased:</p>
<ul>
<li>The original authors of a system develop a theory of that system as they work, which comes from their understanding of the problem they’re solving and the decisions they made designing the solution.</li>
<li>Programmers need to share that theory in order to make changes to what the system does successfully, without degrading its quality.</li>
<li>People learn how complex systems work by being taught about them by other humans.</li>
<li>Documentation isn’t enough, even if it a) exists, b) is truthful, and c) is discovered and read.</li>
</ul>
<p>And here are my reactions:</p>
<ul>
<li>Do as much teaching of other programmers as possible while you’re in a job.</li>
<li>Since turnover is a fact of life, and you won’t stay at any one job forever, do your best to leave artifacts behind that help your successors theory-build.</li>
<li>Bargain with Naur’s position about documentation not being enough by investing time into writing documentation <em>aimed at describing the theory of the program</em>.</li>
<li>Put that documentation as close to the source code as possible, because the source code is the only artifact guaranteed to survive.</li>
<li>Write good commit messages.</li>
</ul>
<p>The next people to come along will still have to put in the work, but you’ll have at least tried to make it easier on them.</p>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-3">
<p>I love this “affairs of the world” phrasing for “the thing the program is supposed to do”. It points right at the fact that the requirements come from external context. <a href="#fr-3-1">↩</a></p>
</li>
<li id="fn-1">
<p>I think that <code>git commit -m</code> is a culprit here. It encourages people to type very short messages. The GitHub concept of the “pull request” improves on this by giving people a big text box to type in, so they aren’t pushed to keep their messages to 50 characters tops. This isn’t enough, though, if teams don’t have a culture of encouraging each other to write good PR descriptions, or of squashing branches down into single commits with a good message. <a href="#fr-1-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Dysfunction junction</title>
          <link>https://blog.ceejbot.com/posts/dysfunction-junction/</link>
          <pubDate>Tue, 10 May 2022 10:00:00 +0000</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/dysfunction-junction/</guid>
          <description>The connection between technical dysfunction and organizational dysfunction, and why you have to deal with the organization first.</description>
          <content:encoded><![CDATA[<p>You know <a rel="noopener external" target="_blank" href="https://en.wikipedia.org/wiki/Conway&#x27;s_law">Conway’s Law</a>, of course. I’ll recap it here, just in case:</p>
<blockquote>
<p>Any organization that designs a system (defined broadly) will produce a design whose structure is a copy of the organization’s communication structure.
— Melvin E. Conway</p>
</blockquote>
<p>This law is a deep one, because communication drives everything that humans do. We are deeply social; we cooperate with each other on large projects by communicating with each other. Communication is a reflection of our social structures because it defines our social structures! Of course the things we make reflect those structures. (Put like that it sounds obvious, but Conway’s statement is so pithy.)</p>
<p>Conway’s Law has some fun expressions, like:</p>
<ul>
<li>If two teams are communicating well with leaders who get along, their software will work well together. Conversely, if two managers <em>don’t</em> get along, the software written by their teams won’t work together well.</li>
<li>Bugs form along organizational fracture lines.</li>
<li>Organizational silos become software silos.</li>
<li>The engineers in the org right now don’t like the software written by the engineering team two generations ago. <sup class="footnote-reference" id="fr-1-1"><a href="#fn-1">1</a></sup></li>
<li>Broken organizations produce broken software.</li>
</ul>
<h2 id="broken-human-systems-make-broken-software-systems">Broken human systems make broken software systems</h2>
<p>The darkest expression of Conway’s Law is that organizational dysfunction is expressed as software dysfunction. If the leader of an organization is failing, the software it builds starts to fail as well. If you as a technical leader are tasked with fixing a broken software system, you might need to start by fixing the broken organization that produced it.</p>
<p>Is the site falling down on the regular? Is shipping new features impossible? Is the tech debt at meme levels? Look for:</p>
<ul>
<li>A toxic culture.</li>
<li>Teams that don’t communicate or cooperate.</li>
<li>A rapid feature-shipping pace that devalues software quality.</li>
<li>Managers who aren’t caring for their reports.</li>
<li>Managers who don’t accept organizational priorities.</li>
<li>Leadership that incentivizes bad behavior or unsustainable behavior.</li>
</ul>
<p>And by “leadership”, I mean all the way up to the CEO.</p>
<p>I’ve been struck by the influence of company leadership on overall work quality over and over again. Each company has a culture that it promotes, consciously or unconsciously, suffusing its personality through everybody there. The culture starts with the CEO and cascades downward. Intelligent, thoughtful CEOs inspire the people around them to be the same. Bullying CEOs drive good people away. Outright stupid CEOs (and I’ve seen several of these in recent years) make the teams around them stupid and inspire bad work.</p>
<p>The management that’s right next to your engineering team has an even more direct effect than an exec team. Management can turn an engineering team into a 0.1x team. I’ve seen leadership that inspired what people told me was the worst work of their careers. This leadership actively made their teams worse. Depressing, yes, but flip it: leadership can turn a team into a 10x team through inspiration, support, good incentives, emotional safety, and all the other things that good management can do. <em>You</em> can provide this leadership.<sup class="footnote-reference" id="fr-2-1"><a href="#fn-2">2</a></sup></p>
<p>I maintain that you <em>must</em> if you want to be effective as a technical leader.</p>
<h2 id="systems-are-self-reinforcing">Systems are self-reinforcing</h2>
<p>For a long time, I resisted the conclusion that I had to think at the organizational level to fix technical problems. Surely, I thought, <em>surely</em> the organization will welcome solid, sensible suggestions aimed at fixing the trouble everybody agrees we’re in. The reality was far messier than this. Sensible suggestions might be nodded at as sensible, but their implementation will be resisted overtly and covertly.</p>
<p>The reasons why are squishy and human. People respond to incentives, and they do more of what the organization around them rewards them for doing. They get practice at whatever that is. They unconsciously build structure around themselves to support whatever they’re being rewarded for doing. The human organization that produces the software is a system all on its own, with feedback loops and incentive structures. When the org is healthy, those feedback loops promote a virtuous cycle. When the org is unhealthy, the feedback loops promote a doom spiral.</p>
<p>If the organization remains dysfunctional and you try to fix its output, it will resist your technical fixes, subvert improvement projects, and continue to drift back to the status quo. Managers will refuse to accept org-wide priorities, refuse to move staff to critical projects, refuse to cooperate with each other. Teams will attempt to maintain their silos. People will dig in rather than change their practices. Even more toxic things can happen in extremely broken organizations.</p>
<p>Systems are self-sustaining and self-reinforcing. To repair a software system, you have to repair the human system that built it, and you do that by breaking those self-reinforcing loops<sup class="footnote-reference" id="fr-3-1"><a href="#fn-3">3</a></sup>.</p>
<h2 id="people-will-tell-you-what-s-wrong">People will tell you what’s wrong</h2>
<p>The first step, therefore, is to discover the incentive loops. Investigate and diagnose organizational dysfunction methodically, the way you’d diagnose software.</p>
<p>One exhausting but effective way to figure out what the worst problems are is to sit down in a one-on-one with every single member of the engineering team <em>and</em> with adjacent teams. Your team is smart and they’ll tell you what’s wrong if they feel safe with you. Adjacent teams will have insights that people directly in the mess might not have, so include them too.</p>
<p>Ask everybody the same questions. I let everybody know in advance that I am doing this and to expect a calendar invite. I give everybody the questions in advance, and message people with personal reminders that they aren’t in trouble and I am meeting with everybody. Not everybody will need that reminder, but the ones who do will appreciate it keenly.</p>
<p>What questions should you ask? It depends on the circumstances, but you might try asking people if they feel they can do their jobs effectively, and then ask them to expand on their answer. Keep the questions open-ended and focused on the experiences of the person you’re talking to.</p>
<p>Take notes as you talk with people. (I ask for permission as I do this.) Themes will emerge. After talking to everyone in your organization, you’ll know what’s not working and what needs to be repaired.</p>
<h2 id="be-kind-to-people">Be kind to people</h2>
<p>If you’re like me, you might find yourself angry about some of the things you learn in this exploration of the organization. Anger is an important response to learning that your values have been violated. It’s a signal you should pay attention to. That doesn’t mean you vent it at anybody else. Instead, use it to identify urgent work.</p>
<p>Remember also that you’re not the only person who has figured out that things are broken. People in the org know it and have responded in their own ways. People will work very hard to fix things in their own specific corners, with what control they have. These efforts aren’t going to be coordinated and might end up at cross-purposes with each other, unintentionally making things worse. Honor the intent. You are here to coordinate.</p>
<h2 id="you-must-exercise-power">You must exercise power</h2>
<p>You have diagnosed the problem. Now you want to fix it, because you are a software engineer and fixing things is what you do.</p>
<p>If you have reporting authority and can start making organizational changes, this is the time to use that authority. If you’re a technical leader without that authority, it’s harder. You will need an ally who <em>does</em> have it, and you will need to have the respect and trust of that person. Perhaps this person is an exec who agrees that the company isn’t getting what it needs from engineering. Maybe this person is a new leader brought in at your request.</p>
<p>Spend time with this ally getting aligned with them on the values you’re bringing to the problem. What matters most? What does a healthy organization look like?</p>
<p>You will also need to invest time to gain the respect and trust of the team around you. How do you gain trust? By behaving in a trustworthy way. Demonstrate that you know your trade. Show leadership in small ways. Handle incidents well (there might be lots of opportunity to do this). You might not have hard power, but you definitely have soft power. Influence. Persuasive skills. Moral authority. The ability to inspire people. Learn to use these as best you can.</p>
<p>If all else fails, get that promotion to a pure leadership position you’ve been avoiding so you can stay technical. You can always escape it later (I say, as a person who escaped it).</p>
<h2 id="change-is-about-alignment">Change is about alignment</h2>
<p>Let’s assume the happy path: you have an ally with organizational power, and you’re aligned with that person on values and goals. What happens next is alignment and reinvention for the broader organization. Now you make sure everyone in the org, starting with line managers, shares leadership’s values and understands the end goal. Now you clean up or change how the org accepts work and executes on it, giving all your processes a good shakeout.</p>
<p>Entire books could be written about this part and have been, of course, because what you’re doing is setting up a healthy engineering organization.</p>
<p>This is what happens next:</p>
<ul>
<li>The company around you has to understand that the engineering org is not going to ship features for a bit.</li>
<li>Managers need to get into alignment first, or leave.</li>
<li>If managers have burned their relationships with their reports too badly, they might have to leave anyway.</li>
<li>People stuck in toxic patterns must leave.<sup class="footnote-reference" id="fr-4-1"><a href="#fn-4">4</a></sup></li>
<li>You need to design a new way for the org to accept work and execute on it.</li>
<li>Leadership must rebuild trust by offering trust.</li>
<li>Leadership must rebuild respect by offering respect.</li>
</ul>
<p>Now give the team a project they can succeed with that will improve something noticeably. Give everybody a win. Reduce that technical mess just a tiny bit. The team will be behind you, helping. (Finally! You get to fix some of the technical problems you’ve been itching to fix!)</p>
<p>Remember that lots of the people in that org want change too. They’re good engineers who are doing bad work, and they know it. They’ll be relieved to execute on a plan to make things better. They will shine if you let them.</p>
<h2 id="looking-back-at-doing-this">Looking back at doing this</h2>
<p>Change is difficult to coax into being. You know the saying about how people have to want to change? This is true of organizations as well, as a collective decision from everyone in that org. The organization needs to feel that the cost of changing is less than the cost of continuing to operate as it has. Change <em>is</em> costly and risky; local maxima are attractive even if they’re not very maximal. Also note that moving from a local maximum to a higher spot requires going downhill first. That’s something that looks very dangerous to an embattled exec who’s already under pressure because their org is underperforming.</p>
<p>Doing this at an organization was the hardest thing I’ve done in my career. I feel I was only partially successful. I have notes from early in the process about things that needed to change that were still applicable two years later. You might not be willing to take this work on, which is okay. If you can’t gain traction, it’s okay to leave. Organizations deserve to fail sometimes.</p>
<p>It was incredibly satisfying to hear from my colleagues that the changes made their work lives better, though, so the rewards are deep.</p>
<h2 id="acknowlegements">Acknowledgements</h2>
<p>Thanks to <a rel="noopener external" target="_blank" href="https://twitter.com/isntitvacant">Chris Dickinson</a> who read early drafts of this post. <a rel="noopener external" target="_blank" href="https://www.neversaw.us">Read his blog!</a></p>
<blockquote>
<p>I keep thinking about that insight @isntitvacant had about Conway’s Law being true over time as well, and the software that felt good in the hands of a past time being uncomfortable to a team that has entirely turned over.
— Ceej “oh well” Silverio (@ceejbot) February 8, 2021</p>
</blockquote>
<section class="footnotes">
<ol class="footnotes-list">
<li id="fn-1">
<p>This is Conway’s Law over time. Teams are immutable: adding or removing a person to a team produces a different team. After enough change, the team is different enough that it no longer recognizes itself in the software system it produces. The result is people being vaguely unhappy about software that might be working perfectly well. This probably deserves its own short blog post. <a href="#fr-1-1">↩</a></p>
</li>
<li id="fn-2">
<p>It’s always a leadership problem. <a href="#fr-2-1">↩</a></p>
</li>
<li id="fn-3">
<p>My thinking here is influenced by <a rel="noopener external" target="_blank" href="https://donellameadows.org">Donella Meadows</a>’s <em>Thinking in Systems</em>. You should read the book, but <a rel="noopener external" target="_blank" href="https://niklashemmer.medium.com/summary-thinking-in-systems-by-donella-h-meadows-cfcffa0c3109">this blog post has a great summary</a> if you’d like a teaser. <a href="#fr-3-1">↩</a></p>
</li>
<li id="fn-4">
<p>Orgs in failure modes can burn people so badly that they can’t get past that emotionally. Burned people will often be perfectly fine and healthy if they get to press a big reset button and go somewhere else. <a href="#fr-4-1">↩</a></p>
</li>
</ol>
</section>
]]></content:encoded>
      </item>
      <item>
          <title>Problem statement</title>
          <link>https://blog.ceejbot.com/posts/problem-statement/</link>
          <pubDate>Sun, 08 May 2022 10:09:21 -0700</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/posts/problem-statement/</guid>
          <description>It&#x27;s good to start with a clear statement of the problem you&#x27;re trying to solve. Here&#x27;s the problem statement for this blog.</description>
          <content:encoded><![CDATA[<p>The first problem with blogging is deciding where and how to do it. I’ve written static site generators myself in python, ruby, and javascript. I considered writing another one in Rust, my current language of choice, but I decided that this would be too much of a distraction. I have a month between jobs at best, so I need to focus.</p>
<p>If you’re curious, this one is generated with <a rel="noopener external" target="_blank" href="https://gohugo.io">Hugo</a>, which I chose because of how many themes are available without me having to fuss and write one. The theme is a lightly-modified <a rel="noopener external" target="_blank" href="https://themes.gohugo.io/themes/risotto/">Risotto</a>. The static files are hosted on AWS S3 behind Cloudflare’s free personal plan, because I don’t yet need any of their paid features. I manage the AWS and Cloudflare settings using terraform. I don’t yet have all my personal cloud infrastructure terraformed, but I might take the time to do that over my month+ between jobs. I have flinched from doing that in the past because it felt like too much work, but it was less work than I feared, and I now have reproducible results.</p>
<p>It’s been a long time since I blogged regularly. I still have a <a rel="noopener external" target="_blank" href="https://ceejbot.tumblr.com">presence on tumblr</a> but I’ve let it lag behind in the last couple of years. Work became intense and draining, and burnout meant I dropped a number of hobbies. I think it’s time to resume this one, though, with a focus on the kinds of topics I would have pitched as conference talks before the COVID pandemic.</p>
<p>Here are some topics I’d like to visit:</p>
<ul>
<li>The connection between technical dysfunction and organizational dysfunction, and why you have to deal with the organization first.</li>
<li>How reading Peter Naur’s <a rel="noopener external" target="_blank" href="https://gist.github.com/onlurking/fc5c81d18cfce9ff81bc968a7f342fb1">“Programming as Theory-Building”</a> changed my priorities as a technical leader.</li>
<li>Why technical correctness is the least useful kind of correctness in the real world (and how none of us ever make decisions based on this anyway).</li>
<li>Problem statements + values statements: how to arrive at decent solutions to problems if technical correctness isn’t useful.</li>
<li>What to do with a legacy monolith implemented with a language and framework you don’t know and/or dislike.</li>
<li>Where “performance” comes from in the messy real world, and where it doesn’t come from.</li>
<li>Why it took me a year to arrive at a one-line fix for a massive performance problem, and how I hope to shorten that time should I encounter a similar situation again.</li>
<li>Why a relentless focus on reducing developer friction pays off in team productivity, and some ways to do this.</li>
<li>The cost of breaking tech hiring so badly that productive, product-shipping engineers feel they have to “grind leetcode problems” to pass an interview.</li>
<li>How dogmatism is an enemy: let’s not be dogmatic about methodologies, “best practices”, “design patterns”, or anything, really.</li>
<li>When microservices are appropriate and when they’re not, and some varied approaches to slicing systems up.</li>
<li>Data-centric analysis for systems design and how to coax people into doing it.</li>
</ul>
<p>I am not a fan of single right answers for any of these topics. Most real-world tasks are complex enough that many approaches to them are viable, and the “right” approach will depend on the context. What’s your starting point? What does the system around you do right now? What are the people working on it comfortable with? What do they do well and what do they struggle with? Which tools fit their hands best? So in my theory, the best advice I can give anybody is about how to ask and answer questions about that context, and tell some stories about what worked and didn’t work for me.</p>
<p>I also hope to learn enough that five years from now I’ll be convinced that the Ceej writing these posts in 2022 was a prime chump who got much of this wrong. I’ll write another set of blog posts, and keep the blogging economy chugging.</p>
]]></content:encoded>
      </item>
      <item>
          <title>About</title>
          <link>https://blog.ceejbot.com/about/</link>
          <pubDate>Sat, 07 May 2022 11:38:39 -0700</pubDate>
          <author>ceejceej@gmail.com (C J Silverio)</author>
          <guid>https://blog.ceejbot.com/about/</guid>
          <description>About the author</description>
          <content:encoded><![CDATA[<p>I’ve been on the internet since 1987, which places me in the early generation of people being idiots online. Every mistake it’s possible to make, I’ve made. Blogging is one of those mistakes, and it’s one I have repeated since back when they were called “web journals”.</p>
<p>I decided to start blogging again during an interlude between jobs. I’d like to write about what I learned through my last decade-plus of work in a changed Silicon Valley, in a startup scene that’s quite different from the scene I entered in the early 90s. The technical problems have changed in the cloud era, which is fairly easy to observe, and many people have written about this.</p>
<p>I’d like to try writing about the human factors that you need to consider when approaching technical problems. After thirty years in this profession, I seem to have accidentally learned some things, mostly through making mistakes. Maybe I can help you avoid those mistakes!</p>
<p>Anyway, here’s another blog.</p>
]]></content:encoded>
      </item>
    </channel>
</rss>
