Being new to professional work, I didn't realize what a caustic and unhealthy environment I was in through most of my early years. I remember being verbally abused for not being good at object-oriented programming. I was unlucky and didn't get a mentor until about 10 years into my career, so I didn't know what to expect. The pay was so much better than at my previous jobs that I just put up with it.
After about 7 years, I was really frustrated. I kept seeing open conflict between team members, delayed projects, and panic during service outages.
I assumed it was me: I didn't know how to build software. I went back to school and earned a master's degree at night while continuing to work full time building a music subscription service (now called Napster).
Three years later, I graduated with the conclusion... actually, it isn't just me. Our industry still doesn't know how to build software, and that is okay. We just need to improve incrementally.
By 2008 I had shipped a lot of code, personally caused a lot of production outages, and jumped in to help others with theirs. Incidents were often terrible experiences, with a VP or CTO standing in your cubicle saying "go faster!" but offering no real plan.
That all changed when I was hired at Amazon.com. They had "Tier-1 resolver" training and an incident management framework. I was able to observe world-class call leaders "stopping the bleeding" in a high-stakes environment. Over the years I learned to drive incidents to resolution quickly and calmly.
Eventually, I knew I wanted to become an entrepreneur. I began interviewing engineering managers to find a problem that I could solve. I was surprised to find that some managers wished their teams were more effective during incident management.
For my last 6 months at AWS, I coached four teams and ran practice drills to help them improve their skills and knowledge. I also studied Incident Command Systems outside of Amazon to understand what alternative methods were in use.
My passion is sharing what I've learned: helping folks have calmer on-call shifts and be more effective at driving incidents to resolution.