Alexis Monville (en)

Navigating the Evolution of Cloud Infrastructure: Insights from Michael Galloway

In this enlightening episode of “Le Podcast on Emerging Leadership,” we dive into the dynamic world of cloud infrastructure with Michael, a seasoned tech industry expert. With over 20 years of experience, including significant roles at Yahoo, Netflix, and now HashiCorp, Michael offers a rich perspective on the evolution of technology and leadership.

Michael’s journey started at Yahoo, right before the Google IPO, setting the stage for a career among some of the brightest minds in tech. His path led him to a startup venture, then to Netflix, where he developed his passion for platform engineering. At Netflix, he contributed to the Spinnaker project, enhancing delivery engineering and pioneering DevOps approaches.

Today, at HashiCorp, Michael is a leader in infrastructure, navigating the complexities and innovations of the field. He shares insightful anecdotes, like his early experiences requesting hardware at Yahoo, illustrating the dramatic shift from physical hardware to the virtualization and cloud infrastructure we see today.

The conversation takes a deeper turn as we explore the principles underlying cloud infrastructure. Michael emphasizes the importance of understanding the foundational elements beneath interfaces and abstractions. This knowledge is critical for solving complex problems and ensuring operational efficiency in a DevOps environment.

We also delve into the challenges and strategies for leading in cloud infrastructure. Michael recounts a crisis at HashiCorp, where he took ownership of a significant problem, demonstrating the importance of accountability and clear communication in leadership. His approach to resolving the crisis and building a trustworthy platform offers valuable lessons for emerging leaders.

Michael’s reflections extend beyond technical solutions, touching on organizational dynamics and the essence of change management. He shares his experiences from Netflix and other roles, highlighting the importance of creating urgency, aligning with executives, and setting achievable targets. These insights are crucial for leaders striving to make a meaningful impact in their organizations.

In conclusion, Michael offers advice to emerging leaders: understand your stakeholders, deliver early wins, and define a clear purpose for your team. His journey is a testament to the power of leadership, adaptability, and strategic thinking in the ever-evolving world of technology.

Join us in this episode as we unpack these themes and more, gaining invaluable insights from a leader who has navigated the frontlines of tech industry transformation.

Here is the transcript:

Alexis: [00:00:00] Welcome to Le Podcast on Emerging Leadership. I’m your host, Alexis Monville. In this episode, we are excited to welcome Michael Galloway, a visionary leader in the tech industry with over two decades of experience. Currently shaping the future of cloud infrastructure at HashiCorp, Michael brings a wealth of knowledge from his dynamic roles at companies like Yahoo and Netflix.

Today, he shares his journey, insights on platform engineering and the evolving landscape of technology leadership.

Welcome to the podcast on Emerging Leadership. Michael, how do you typically introduce yourself to someone you just met?

Michael: Yes, thank you for inviting me, Alexis. The way I typically introduce myself is: I live in California. I’ve been working in the tech industry for about 20 years. I’m a father of two rambunctious girls and husband to a wife of [00:01:00] almost 20 years now.

Alexis: Wow, wow, wow. I would love to unpack all those things, but maybe we’ll have time for some of it. Let’s look at your journey in the tech industry. A fascinating journey. I’ve heard about your experiences both at Netflix and now at HashiCorp. Could you give us a snapshot of your trajectory and what drew you to the field of cloud infrastructure?

Michael: Sure. Well, like I mentioned, I’ve been in the industry for more than 20 years. I was actually part of the early-2000s crew at Yahoo, just before the Google IPO. So that was an interesting experience to start off my career. Yeah, you know, everything was possible.

And some of the most brilliant minds that I have had the opportunity to work with many years later in my career started there. In fact, my current boss at HashiCorp was also part of that crew back at Yahoo. And, you know, [00:02:00] the Valley is ultimately very small. From Yahoo, I went through a range of different companies.

So I actually did a startup in the enterprise software space, which I was fortunate to sell. I would say it was more of an acqui-hire, but it was a great experience to go through what startup life is like in Silicon Valley. Eventually, I landed at Netflix around 2016 and moved into the platform engineering organization. From there I led a bunch of teams in Delivery Engineering. I think the most famous part of Netflix that people may know of is the Spinnaker product, which was developed for the most part between Netflix and Google. And that’s what we evolved and worked on. That was really where I fell in love with platform engineering as a concept.

The whole concept of full-cycle development and DevOps as we were pioneering it at Netflix was just fascinating, and I got to work with some of the greatest minds I’ve had the opportunity to work with in that space. I eventually moved to leading platform organizations [00:03:00] at mid-tier companies, and now I’m over at HashiCorp, running the infrastructure part of the organization. You asked about infrastructure. Infrastructure specifically is a fascinating and evolving space. You know, I actually have experience going in front of David Filo, one of the founders of Yahoo, and making physical hardware requests.

I remember standing in line, a little anecdote here. We all queued up at these hardware request committee meetings, and David Filo was one of several members. I was right behind this gentleman. I had just started my job; maybe I was a month or two in. The person in front of me was from Yahoo Photos, and he goes up to David Filo and he’s making requests for several multimillion-dollar filer machines that we needed for the Yahoo Photos footprint. And they discussed and, you know, okay, we will ultimately approve. And then my number’s called, and I get up and I say, I’m looking for $300 to buy a hard drive for one of our [00:04:00] machines. And, yeah, Filo had this look on his face of, like, maybe we can do some efficiency improvements for this meeting.

Might not be the best use of everybody’s time. And I’d say I really appreciated that he saw it that way. But, you know, I think a lot of us have experience with the actual tin. The introduction of virtualization, which really came many years later, unlocked all kinds of capabilities: immutable deployment patterns and real ephemeral infrastructure started to become a thing. So what we’re seeing now is the outcome of those innovations, and this idea that you can allocate virtual global infrastructure in minutes. But I truly think that that’s actually just the beginning of where we’re headed as an industry. So it’s an exciting space to be in.

Alexis: Oh, whoa, wait, that’s very [00:05:00] interesting because, yeah, with the introduction of virtualization, basically a lot of people thought, oh yeah, you don’t really need to care about cloud infrastructure anymore, and...

Michael: Right.

Alexis: Anyway, everything will be fine, that’s just infrastructure as code, and let’s do everything.

But that’s not really what happened. DevOps, I don’t remember exactly, it’s what, 2009? Something like that. We are still not completely there yet. During your talk at Plato Elevate, you mentioned that cloud infra was not about hiding complexities, but setting the right defaults.

I would like you to discuss that a little bit more, because that will maybe tell us what is coming, what the future looks like.

Michael: Yeah. This is a very fascinating conversation. It’s something where we can quickly get into modern applications and lose a sense of principles. So I [00:06:00] like to come at this from a principles-first approach. The common conversation that I hear in many platform organizations or many companies is: should we present a Heroku environment? And I think that’s missing some grounding. You’re talking about how to use it, as opposed to the philosophies of behavior that you want to encourage or support in an organization. So, like everything else in software development, the answer is maybe. The answer is nuanced, right?

I’ll give you an early story that really grounded my thinking on this. It goes back to my Yahoo days, actually. I was a software engineer there and I worked on the Konfabulator product. It was a desktop widgets product that worked on both Mac and Windows. Actually, the modern Apple widgets experience on the iPhone, as well as Netflix’s video capabilities on TV, and all the modern TV [00:07:00] widgets, were actually born from some of the very same humans that worked on Konfabulator. I actually worked with some of those guys.

One of ’em I actually hired into Yahoo. So, just a little short history there: all things are connected. But I was working on this and I was around much smarter minds than mine. And one of the lead engineers in the group emphasized to me, he said, don’t just use the interfaces to these libraries that are, you know, available to us from the TVs or from the operating systems that we’re trying to operate on.

Don’t just use the interfaces. He said, you need to understand what they do underneath. You need to understand those in order to be able to solve the real problems, the hard problems. And he was right. How often do we end up grabbing a library and just using it without a thought for how it is actually performing these actions?

And when you do that, and we’ve seen this in software development all the time, where you have higher- and higher-level frameworks, [00:08:00] the understanding of the magic underneath is ultimately limited to the few that actually care to try to introspect, and some of those frameworks actively try to encapsulate and block your ability to really understand what’s under the covers. Why does that matter? Because if it fails to do the thing I need it to do, if my application calls into an interface and for whatever reason that interface has an unexpected side effect, I now have no way to solve that problem other than to just abandon that interface. And that becomes a really limiting factor.

So if you take it from that perspective, and you view software platforms, infrastructure platforms, or platform engineering platforms, they’re all the same concept, right? They’re encapsulation, they’re abstraction. They’re the same software principles. You start to get to the point where you ask: where do you want to put the responsibility for resolving and solving problems? In a true [00:09:00] DevOps world, you ideally want to enable application teams to ultimately have the ability to understand and operate their products in production. And if you don’t enable them to see below the surface into the details of how something is being done, they have no ability to perform that task. They have to rely on a central team to do it. It’s just like if I am the provider of a framework, but they can never see into the code of that framework: if that framework fails to do the thing they need it to do, they’re going to abandon it or do something different, which will create heterogeneity in the environment and more complexity.

So when I think about the right experience, what I look at is not about hiding the complexity per se. I think you can offer an abstraction or present an interface facade if you want to simplify the zero-to-one problem. Most of the time, this is what they’re talking about: I just wanna get my application out.

I just wanna get a database. That’s a zero [00:10:00] to one problem. Provide a simple facade. That’s where the abstraction actually can have value. But it should be an abstraction that you can drill further down into if you want to. You can go further and you can see what actually was done. How does this machine perform the actual instantiation of that database? What is the instance size of that database that was set up, if it was, say, on Amazon? I should be able to introspect these things, because those can lead me to understand why a failure occurred in my production system, or how better to architect it.

A good example of this is a situation that we just recently encountered in my current universe at HashiCorp, where one of our products wants to perform in a very stateful way. Well, it is a stateful application. And stateful is a particularly tricky monster from an infrastructure standpoint, right? We very much, [00:11:00] on the infrastructure side, wanna see the world as cattle, as they say, not pets, right? That’s a common euphemism. And so the idea is that I can truly lose or blow away my infrastructure if I need to, and that the resiliency is actually supported both at the application tier and in other parts of the infrastructure, to support the idea that any virtual thing can fail. And the truth is, whether anybody likes to think of it or not, and I have a lot of experience with this, virtual things fail because physical things fail. So you very much need to have that. A stateful application doesn’t like to operate that way. It likes to believe that there is a permanence to the thing that it’s in. This is a really tricky problem with infrastructure systems to date.

If we have a full abstraction of what is actually happening on the infrastructure tier, especially when we need to version the infrastructure underneath the covers, it [00:12:00] can be a real problem for that application team, because they don’t understand why systems are periodically being disconnected or broken, and they don’t have any predictability around it. So as a result, they have to offload all of the operational problems, all of the ops that are specific to their application universe, to the central infrastructure team. And that is the anti-pattern that we all wanna avoid. The whole point of DevOps was to move away from a central team operating applications in production as much as possible. So that was a long-winded answer. The short nugget here, I would say, is: predictability is more valuable than velocity.
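
To make the idea of a facade you can still drill into a little more concrete, here is a minimal Python sketch. It is not anything HashiCorp, Netflix, or Heroku ships: `DatabaseSpec`, `ProvisionedDatabase`, `get_database`, the default values, and the endpoint format are all hypothetical names invented for illustration, and nothing is actually provisioned. The point is only that the one-call "zero to one" path and the introspection path can live on the same object.

```python
from dataclasses import asdict, dataclass, replace


@dataclass(frozen=True)
class DatabaseSpec:
    """Defaults the platform picks on the caller's behalf (all hypothetical)."""
    engine: str = "postgres"
    instance_size: str = "small"
    region: str = "region-a"
    replicas: int = 1


@dataclass(frozen=True)
class ProvisionedDatabase:
    name: str
    endpoint: str
    spec: DatabaseSpec

    def explain(self) -> dict:
        """Drill down: show every choice that was made behind the facade."""
        return {"name": self.name, "endpoint": self.endpoint, **asdict(self.spec)}


def get_database(name: str, **overrides) -> ProvisionedDatabase:
    """Zero-to-one facade: one call with sensible defaults.

    Any default can be overridden by keyword; an unknown keyword raises
    TypeError. Nothing is really provisioned here, and the endpoint is a
    made-up placeholder.
    """
    spec = replace(DatabaseSpec(), **overrides)
    endpoint = f"{name}.{spec.region}.db.example.internal"
    return ProvisionedDatabase(name=name, endpoint=endpoint, spec=spec)


if __name__ == "__main__":
    db = get_database("orders")   # the simple path: no decisions required
    print(db.endpoint)
    print(db.explain())           # the introspection path: see the defaults
```

A team that only wants a database calls `get_database("orders")` and moves on; a team debugging a production failure can call `explain()` to see the instance size and region that were chosen for it, and override them explicitly if the defaults turn out to be wrong for a stateful workload.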

Alexis: Mm. Yeah, I guess that summary really helps to understand the whole thing. Could you tell us about a particular challenge you faced working in that realm of cloud platforms at HashiCorp, and how you approached it? [00:13:00]

Michael: Yeah, I’ll give you a different challenge, ’cause life’s full of those. When I joined HashiCorp, let’s see, I joined in December of 2022, so December last year. So I’m almost at my one-year anniversary, actually. About a month and a half in, I would say, so sometime in January, all these alarms started going off. It was not my fault. I had just started. That’s okay, I don’t mind if it is, but it was not. Alarms are going off, all these, you know, 3:00 AM things blowing up. And so the issue was that a big portion of our system relies on a workflow engine that a lot of our use cases require to be operating effectively.

The engine is Cadence, which was pioneered by Uber. Temporal is maybe a more well-known modern name; it’s a next iteration of that workflow engine. Anyway, [00:14:00] so this thing started to blow up, and the reason it started to blow up was that it was backed by a single, very large database instance. And that database instance was struggling to keep up with an unanticipated load. This was not necessarily a new issue. In fact, and this is nothing against Cadence, the Cadence service is a perfectly fine workflow engine, but the design was just not well designed to be very scalable. And so, as a result, over the last several years people had kind of wanted to avoid this system, ’cause it was known to be problematic and it had burned people out trying to support it. But it had finally tipped over, and by tipped over, I mean it actually stopped keeping up with all of the workflows coming in.

So it started building a history list, I think something on the order of maybe a million runs behind, and it was continuing to fall behind. So, you know, when you see that, [00:15:00] it’s a downward spiral, right? And so we brought in AWS people and we formed a quick crew to set up basically a war-room situation, to try to triage and stop the internal bleeding.

And so what’s the first thing you do? You say, okay, if we can’t horizontally scale because we hadn’t sharded this system, let’s scale up, right? And whenever I hear scale up, I think all of us, especially in the infrastructure space, kind of cringe, ’cause, you know, there is a finite limit to scaling up, and scaling up doesn’t actually solve the underlying problem. Ultimately, it just delays the problem. But again, our first focus was to stop the bleeding. We scaled up. It helped. Still, some things were not quite as stable as we wanted.

And this is where I think the more interesting part of the story comes in. With all these kinds of technical problems, in my whole 20 years of experience, I’ve very rarely been [00:16:00] on what I would consider, you know, a Mars-landing kind of problem, where you’re maybe doing something fairly novel, and even that maybe isn’t as novel anymore because we’ve done it before. In other words, most problems are not insurmountable technical problems where there just is no answer. Generally, I’ve found that 99% of the problems that I’ve had to deal with are more about organizational problems. And, you know, you might even go so far as to say leadership problems, in the sense of: how do you think about approaching this kind of crisis? What are the right things to do when a crisis like this happens?

And so, the steps we took. The very first thing is to recognize that upper leadership, partners, and customers who are relying on this thing all want somebody to raise their hand and say, I’ll take ownership of this problem. That’s the very first thing everybody needs. They need to hear you...

Alexis: Mm-Hmm.

Michael: And so [00:17:00] we did. I basically said, okay, we recognize this as a problem. I’m not gonna make up stories about this. It’s a problem and it needs to be resolved, so we’re gonna take ownership of it. And what we did was form a permanent team around this. And that sent a very clear signal: we’re gonna own this problem.

We’re going to move it to a place where you can trust it. And that was actually a really important thing, not just for the ownership aspect: there was a real lack of trust among teams in building these workflows because of the history of instability. As a result, teams had started to look for alternative approaches, and that would’ve led to a much more complicated universe to manage. So it was very important that they knew somebody was going to own solving it. Once we did that, we defined some specific outcomes for stability and scalability that we needed to achieve. It needs to be horizontally scalable, not vertically. I think that was one of the most important things that we emphasized: the thing we did today [00:18:00] to band-aid this is not a solution. It’s a band-aid. What we need is not to put all our cargo on one ship. We need multiple ships. And these are not complicated concepts, but they are complicated to execute on, because having multiple ships means a whole lot of additional complexity and logistics up front for figuring out what goes on those ships, and so on.

I don’t know, I just suddenly jumped into a nautical analogy. But, you know, establishing early that these are our success criteria and this is our strategy was critical. What are the outcomes, not the physical deliverables. The next thing we had to do was deliver short-term wins. And by that I mean, in the short term, what anybody ever cared about was stability, as well as enabling products to launch. So for the products that were afraid to write on this, we [00:19:00] immediately engaged them and prioritized making sure that they were stable, that they had the resources within the system to be reliable.

And so we enabled those product launches. And then, every week, we pumped out what the reliability status was, whether there were any issues, and any updates or communication on progress towards those outcomes. This was critical. Those two things were vital for us to establish credibility and for people to actually feel like the wind had changed and that this ship was actually going to turn. That built confidence and trust, and it gave us momentum.

And as we continued to execute, this team has completely revamped the architecture of the system. They’ve migrated a bunch of the critical systems so that they are starting to have better resource isolation. Being able to isolate workloads and manage resource consumption by each of those workloads is a fundamental thing in an infrastructure universe, and we didn’t have some of these fundamental abilities before. Now we’re in a state where we’ve moved away from RDS and we’re bringing in a scalable [00:20:00] backend, a Cassandra backend, which will allow us to horizontally scale. So we’re in a much different space, to the point where a leader said to me about a week ago, “Not only do I no longer worry about Cadence, I’ve basically entirely forgotten that it was ever a problem.” Which is great, except that I said, just make sure that we don’t think we can remove people from this team right now. I’m glad you are confident.
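
The "multiple ships" idea, spreading workflows across several backends instead of one ever-larger database instance, can be sketched in a few lines of Python. This is only an illustration of hash-based sharding in general; it is not how Cadence, Temporal, or Cassandra partition data, and `shard_for` and `ShardedBacklog` are hypothetical names.

```python
import hashlib
from typing import List


def shard_for(workflow_id: str, num_shards: int) -> int:
    """Map a workflow ID to a shard index with a stable hash.

    hashlib is used instead of the built-in hash() because hash() on
    strings is randomized per process, so routing would not be stable.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    digest = hashlib.sha256(workflow_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


class ShardedBacklog:
    """Toy stand-in for several independent backends (the 'ships')."""

    def __init__(self, num_shards: int) -> None:
        if num_shards < 1:
            raise ValueError("num_shards must be at least 1")
        self.shards: List[List[str]] = [[] for _ in range(num_shards)]

    def enqueue(self, workflow_id: str) -> int:
        """Route one workflow to its shard and return the shard index."""
        index = shard_for(workflow_id, len(self.shards))
        self.shards[index].append(workflow_id)
        return index

    def depths(self) -> List[int]:
        """Backlog depth per shard, useful for spotting a hot shard."""
        return [len(shard) for shard in self.shards]


if __name__ == "__main__":
    backlog = ShardedBacklog(num_shards=4)
    for n in range(1000):
        backlog.enqueue(f"workflow-{n}")
    print(backlog.depths())
```

The extra logistics Michael mentions show up even in this toy: with simple modulo routing, changing the number of shards remaps most existing IDs, which is why real systems tend to use consistent hashing or an explicit placement table instead.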

Alexis: Yeah, exactly. But yeah, I believe that’s very interesting, what you offer as a solution. If I put aside the technical solution, we could apply that to basically a lot of different problems that we have. Having a team that is able to say, okay, we are the owners of that thing. And now we own that problem and we will solve it.

Being really clear about what the outcomes are, where we are today, and how we measure ourselves against those outcomes. That’s very, very critical. And [00:21:00] knowing that you will not win the trust of people by announcing a 24-month plan. You will win the trust of people because you are delivering something now.

Michael: Yes.

Alexis: Getting into that mindset is also critical. So I love what you’re saying about all that. Have I missed anything in what you propose?

Michael: No, I think you summarized it exceptionally well. I will say, generally, I fully agree with you. This is not a unique situation. This is a pattern and a strategy for approaching something that comes up fairly often. In every job I’ve taken, there is always a crisis. And I’m going to misquote the person, but I think the famous saying is: never let a good crisis go to waste.

Alexis: Yeah.

Michael: These are hugely valuable opportunities to actually have a tangible impact on the business. [00:22:00] And, you know, where others may be afraid to tread, these are the opportunities that really enable you to shine as a leader.

Alexis: I really like that. Are there other pivotal moments in your career when you really learned something significant about change and leadership?

Michael: Oh my gosh, yes. Well, first, if anything I’m saying here sounds at all polished, please understand it comes from the many battle scars that I have from my history of making mistakes, reading and learning from the wisdom of others, and then having the opportunity to apply it again. But yes, let me answer your question more directly.

So at Netflix, we in Delivery Engineering embarked on this initiative called Managed Delivery. It was a very ambitious project that is still very near and dear to my heart. Fundamentally, delivery in [00:23:00] Spinnaker is done by articulating pipelines. And here is what we found from the way that we were operating, where every team was defining their own pipelines. In Spinnaker, I think we had about 16,000 pipelines at that time, across about 4,000 applications and about 400 teams. That was about the size we were at. Platform engineering at that size has some challenges. One specific challenge: we were still very VM-based, and as we released new base OS AMIs, which might include security improvements, patches, and other things that needed to be there, it took on the order of months to years for certain patches or updates to be rolled out. That was really problematic for us. Sure, if you declared a security sev-one incident, they could broadcast across the company and people might take action, but that’s a pretty disruptive thing to do. [00:24:00] What you want is a design that helps enable the bottom tier to be as evergreen as possible.

Right. But we had a problem. With all the teams owning their own pipelines, Spinnaker had no intelligence about those pipelines. It just knew: run this. It was a workflow engine in many ways, right? Run this step; if that step gives me a green light, go to the next step, and the next, with maybe some conditional logic. But what do those steps represent? And what is the confidence after, you know, step two as to whether this new update is safe to roll out? All of that was opaque to the engineering system. So what we needed was a way that we could evolve our infrastructure, evolve our AMIs, and evolve our strategies under the covers, and do so without having to get all the teams involved. That was one of the motivations.

Another motivation was that we thought it would make it easier for [00:25:00] teams to not need to articulate or come up with strategies in their pipelines for safe delivery, right? Teams would deliver applications to multiple regions. What’s the right sequence of steps that would enable you to catch a problem and roll back the change if a failure happened in, say, the second region you rolled out to? That is a very complicated problem, right? First region successful, second region fails: most of the time, pipelines would just die. And now you have this very confusing universe where you have different versions of your software running, and problems can surface. So we thought, hey, let’s take that problem away from teams too. Let’s create a declarative form of delivery that basically enables people to define the criteria for success that would enable promotion from a lower environment to higher environments.
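
As a rough illustration of what "declare the criteria, let the platform work out the steps" can mean, here is a minimal Python sketch. It is not Spinnaker’s Managed Delivery format or API; `Environment`, `roll_out`, and the check names are hypothetical. A team declares its environments and the checks that must pass in each; the engine owns the sequencing and, if a later environment fails, restores the versions that were running before.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass(frozen=True)
class Environment:
    """What a team declares: a name and the checks required to promote."""
    name: str
    required_checks: Tuple[str, ...]


# (environment name, check name, version) -> did the check pass?
CheckRunner = Callable[[str, str, str], bool]


def roll_out(
    version: str,
    environments: Sequence[Environment],
    run_check: CheckRunner,
    deployed: Dict[str, str],
) -> bool:
    """Promote `version` through the environments in order.

    `deployed` maps environment name to the version currently running and
    is updated in place. If any required check fails, every environment
    touched so far is restored to its previous version, so the fleet never
    stays in a mixed state. Returns True only if all environments passed.
    """
    previous = dict(deployed)
    touched: List[str] = []
    for env in environments:
        deployed[env.name] = version
        touched.append(env.name)
        passed = all(run_check(env.name, check, version) for check in env.required_checks)
        if not passed:
            for name in touched:
                if name in previous:
                    deployed[name] = previous[name]
                else:
                    del deployed[name]
            return False
    return True


if __name__ == "__main__":
    envs = [
        Environment("staging", ("smoke-test",)),
        Environment("region-1", ("canary",)),
        Environment("region-2", ("canary",)),
    ]
    state = {"staging": "v1", "region-1": "v1", "region-2": "v1"}
    # Simulate the "second region fails" case from the conversation.
    ok = roll_out("v2", envs, lambda env, check, ver: env != "region-2", state)
    print(ok, state)
```

Because the team only declares environments and checks, the platform is free to change how the rollout is sequenced or verified under the covers without every team rewriting a pipeline.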

That was essentially the goal of Managed Delivery: move them towards describing what needed to happen, as opposed to defining how it should happen. [00:26:00] It was very ambitious at the size that I was mentioning, especially because Netflix culture very much operated with a freedom-and-responsibility concept, and that meant that teams were never really obligated to use a service or a new system. So imagine operating in an environment where you have lots of very smart and talented people from all around the world that are working on their problems, their projects, and you need to engage them on something that they honestly would prefer to not really have to think about.

Right? Like, the water company doesn’t reach out to me to talk about repiping, you know, the pipes to my house. I have no interest in that conversation. If you need to do it, sure, go ahead. Right? It’s the same way in Delivery Engineering, reaching out to these teams: I don’t know, my software always continues to deliver. It’s fine. Why do I need to care about this?

This is a very common problem in [00:27:00] platform engineering, but it also comes up for library producers, API producers, anybody that’s producing something that others are consuming. You almost always have more interest in making that happen than they do, especially when the value proposition may be more on one side than the other.

And that was the key mistake I made. At that time, you know, we very much wanted to take the approach of: if we built something really valuable and very interesting for folks, they would adopt it. And I think there was merit to that. And so we spent a lot of time thinking about, you know, the early adopters.

We got some early successes. We got some people to enjoy it. But then we hit that classic crossing-the-chasm problem, where we couldn’t get past the early innovators to the early adopters. And we struggled with that. Was it some combination of features? Was it some combination of capabilities, something that this could do that other things couldn’t? What I miscalculated personally was that the actual value to the business was on the platform engineering side of the [00:28:00] equation. Platform engineering needed to see this adopted across the fleet for there to be real value. And given that, the strategy may not necessarily be one of slow adoption; rather, it may be more important to take a little bit stronger of an approach. John Kotter talks about this. He has an article in HBR called Leading Change, and he has a book called Why Transformations Fail. And I will say, I read that book during that time, and I failed on probably at least the top three even after I read it. So I will tell you, I learned how big the gap is between knowledge and wisdom. And that gap is experience, right? And how much of that you actually need.

Long story short, [00:29:00] you know, Managed Delivery’s value proposition is very much alive; it is moving forward. But that was an experience where I realized that our adoption was very slow because we did not take as aggressive an approach. Specifically, by aggressive, I mean we didn’t establish a sense of urgency. So teams were necessarily complacent in the adoption. And it’s no fault of theirs; that’s the way the culture was designed to operate. But as a result, getting adoption, getting that change to actually happen, was much harder. Now, I know that they are doing amazing stuff over there in terms of growing it.

They’ve learned a lot of those lessons, and the impact of that approach is really being felt. In fact, years later, I landed at HashiCorp. My peer came from Samsung SmartThings. He recognized me and said, oh, you know, Managed Delivery. And they apparently have a larger footprint than Netflix, much higher traffic than Netflix.

All the iot devices right, call into their[00:30:00] and they. Overnight, basically. Maybe it’s not quite overnight, but they, they they fully adopted it and saw some of the benefits of that adoption as a result. And, and and so it was, it was a cathartic to hear or comforting rather to hear. But yes, it was a, it was a good experience in the challenges of change.

Alexis: Yeah, it’s very interesting that we are going back to that idea of a team that owns a problem and tries to solve it. Unfortunately, it’s really a problem for the business, but it’s not necessarily a problem for the other teams that are consuming something from that team. So how do you create a sense of urgency for the other teams when they are not even aware that it’s really a problem for the business, and you cannot count on them to investigate that part?

So maybe it takes another nudge.

Michael: Well, I have a story about creating the urgency, because [00:31:00] that’s one of the things I learned. There are actually two pieces to that that I learned, and I applied them at the next job after I left Netflix. It was a mid-tier company. The entire fleet was on Heroku, actually.

We were hitting problems with that platform. Going back to the ability to introspect and understand how things work, Heroku was too abstract, too high-level for us to be able to operate it effectively for the things that we wanted to do. It got us from zero to one, but that hard abstraction made it impossible for us to get past that one. Long story short, though, we needed to migrate; the business decided we needed to migrate off. But even with that, “we wanna migrate off” is like all things that happen in a business: they are good goals. They’re set, like you said, as the 24-month goal.

“Oh yes, we should be...” But how important is that? How urgent is that? [00:32:00] This is what I learned from my experience with managed delivery. Okay, so two things. One, you need a sense of urgency. So how do we create that urgency? You need to get a date set, and that date needs to have consequences. So we talked specifically about setting a nine-month target from the point that I had started that job. And the reason for nine months is that nine months feels close enough that it will happen, but far enough away that virtually no engineering team says no. Right? And I mean this very much affectionately: we all believe that the world is possible in nine months. Not three months, but nine months.

Yes, nine months; for sure we’ll have time. So we got alignment that in nine months we would hit this target. The other aspect of this was that we were going to shut off Heroku. We were going to actually disable it and tear up the contract. So that was the cliff date. [00:33:00] It’s great to have that date, and there’s a lot to unpack on the importance of setting dates. But the other bit that was vital was that we needed to get executive alignment with that. It needed to be something that the executives would back. And by that I mean (the term leadership or executives is nebulous) just someone in a position of authority at the right level who can basically say, once you get to three months away from landing this, that this is a date that will not move. And we were able to get that. And those two things ensured that. That project was very ambitious: we moved the entire fleet out and over to Azure, and we had zero service disruption. It was a remarkable feat. The team did an amazing job, but I truly believe having both of those factors enabled us to do that Herculean task, because the last three months, you can imagine, were brutal and stressful. [00:34:00] We bought lots of DoorDash for people and supported them as they were executing on all of this stuff. But once we landed that, the entire crew could look back, and they did, and say this was an amazing thing we were able to accomplish. There was real pride in being able to do it. So, very good lessons learned.

Alexis: I love it. I love it. And once again, it’s really interesting to unpack the learnings about that. You need a date, and when people hear that, they can hear, “Yeah, that’s a date, but maybe we can be late.” And no, that’s really a cliff; there’s nothing behind it. And you need that support, that alignment.

So nobody will dare to change the date. There’s no option around that, and that’s absolutely clear for everybody. So now they can make plans. They have the time. Nine months is a good one. We think, yeah, it’s feasible. And thinking about it, I realized when you [00:35:00] were saying it that if you had said three months, I would have said, “Oh, no.”

I would have started to think about why it was not possible. But with nine months, I was comfortable saying, “Yeah, okay.” And I know nothing about the reality of the challenge. That’s funny. So yeah, you can start making plans.

Michael: That’s right.

Alexis: What would be your advice to emerging leaders who want to make a meaningful impact?

Michael: The first thing I would say is you need to understand your stakeholders. I have learned the enormous value in developing those relationships and deeply understanding who your customers are, who your peers are, and what leadership is expecting of your organization. A lot of people, especially emerging leaders, focus on their team and down. I have a lot of experience in [00:36:00] doing that and failing beautifully, because I misunderstood what was expected, what was not spoken but expected, by my peers and by upper leadership. So you really need to understand not just the surface statements of “here are our goals, here are our outcomes.” What you want to ask is, “What keeps you up at night?” You want to ask where things have failed in the past. You want to hear the reactions more than you want to hear the thoughtful process of desires, right? It’s those emotional reactions, those small perceptions of your team and of what is expected of your organization, that will affect whether or not you are successful, because those are the micro-perceptions that actually determine whether they are going to think of your team as a team to rely upon for those next strategic steps that they want [00:37:00] to take. So understand them very well. That takes a lot of time, and there are great books on this. But this is where it truly is about the psychological approach far more than it is about technical execution or delivery. The next one is you need to deliver wins within the first 90 days of starting a new job. There is a great book, The First 90 Days. I think it’s a fantastic book on this topic. I’ve applied it successfully a few times now. It very much is correct. Get that win. You have to have credibility when you go into a room.
You have to be believed when you say we should do X or Y. Otherwise, you’re gonna stay at the tactical level always, because you haven’t established that you can actually solve bigger problems. The key thing with getting that credibility in the first 90 days is you don’t need a big win. You just need something meaningful, something that addresses a concern. Peers of mine had actually mentioned this to me years before, too. Don’t [00:38:00] try to run after the biggest thing you can run after, especially when you first start. Start with something you can own and influence, so it’s something within your control. Don’t do something that’s gonna require a bunch of other folks to be aligned, especially when you first start. It’s challenging to do that, so it should be something that, for the most part, you can control. Second part: it’s gotta be something that matters to other people.

It doesn’t really matter what it is. It doesn’t have to be a technical solution. It could be an organizational solution. It could be an information solution. It could be a communication solution. It could be any of these things, but it needs to be something that actually addresses a fear or concern. A great example of this is just starting a monthly newsletter for your organization and ensuring the rest of the business understands what your team or your group even does. The awareness factor is a surprisingly big problem in many places, and doing that suddenly puts you on the radar of a lot of people, and it can really move things forward.

That’s not a technical problem at all,

Alexis: Yeah.

Michael: but it is a problem, and it can establish you. [00:39:00] The third thing that emerging leaders need to take a look at to have real, meaningful impact: define the purpose for your team. And by that I mean you need to bring your team into that. Defining a purpose is one of the most fundamentally powerful actions that I have ever learned to take with my team.

And purpose is different from mission and vision. A purpose lives for the lifetime of that team or that group that you are managing. It’s sometimes referred to as a North Star. I don’t think that’s quite the right way of seeing it.

A purpose establishes a philosophy that everything stems from. One of my favorite examples of this: the name is gonna slip out of my mind, but he was a French designer, actually, I think, who helped establish the purpose for Disneyland, and that purpose was to create happiness in the visitors. Now, if you think about that, it sounds very simple, [00:40:00] but it’s a very powerful fulcrum. Because when you have that, everything follows from it: how you name the parking lots (you name them after Mickey and Goofy, not A and B and C), the design of the trash cans, the uniforms, the decision to have very pleasing flower beds. Each of these things is millions and millions of dollars of investment. Why do you do that? Because each of these pieces maybe makes somebody smile a little bit more. Establishing a purpose for your organization enables you to prioritize. It gives your teams freedom to execute and to think more broadly, and it enables you to align with what your next strategic steps need to be. You can think of it as a guiding principle. I’ve written articles on this, but there are much better, smarter minds than mine that have spoken on this.

Alexis: Ah, I will link to that, and we will let people [00:41:00] decide about that. So what’s next for you? Any exciting projects or initiatives you want to share?

Michael: Yeah, so with HashiCorp, I think one of the exciting things that we have coming up next from the platform engineering organization is really trying to crack this self-service nut. HashiCorp is an organization that builds tools for infrastructure management, right?

I mean, we build tools for platform engineering. How do we leverage all of the tools that we have, and the patterns and behaviors that we wanna encourage, to enable self-service within our organization? So, a team being able to go from zero to one. I know this is a nut that a lot of people have cracked, in the sense that they’ve created IDPs, right?

Internal developer platforms. But I think that’s more of a how, and I wanna get back, again, to the principles. What are we caring about enabling? The actual day-one [00:42:00] problem of “give me a service” is not a hard problem to solve. It’s been solved a lot. The day-two problem of “now I wanna add a database to my service” is a harder problem. And that’s one of the ones I’m excited to see get moved forward. So I’m looking forward to that next.

Alexis: That’s very cool. So let’s talk again. Thank you very much, Michael, for joining. Have fun solving that.

Michael: Thank you, Alexis. 

