Kyndryl’s Post

396,058 followers

1mo

What happens when you ask a software engineer to design an operations team? You get site reliability engineering (SRE), a practice designed to keep IT systems and applications humming as efficiently and predictably as possible. Rod Anami and Gregory Pruett share their thoughts on how deploying SRE at scale can help boost application dependability, reduce downtime and support business growth. They also provide steps for you to start your own SRE program. Read their article here: https://lnkd.in/e5RxE5ds #Devops #SRE #Monitoring

4 Comments

Gregory Pruett

VP, Distinguished Engineer, Infrastructure/Cloud Architecture

1mo

SRE is a hot technical career path in Kyndryl! Both for helping our clients transform, and also for our own continuous improvement.

1 Reaction

Rod Anami ☸

1mo

Thanks Kyndryl for the opportunity!

1 Reaction

Don Osborn

1mo

Very helpful!

Shankar Balasubramanian

Director - Software Engineering - Architecture at Kyndryl

1mo

Great advice!


More Relevant Posts

Arie-jan Snel

Marketing Expert in Lead Generation, Marketing Automation, Account Based Marketing (ITSMA certified) and Events.
4w
What happens when you ask a software engineer to design an operations team? You get site reliability engineering (SRE), a practice designed to keep IT systems and applications humming as efficiently and predictably as possible. Rod Anami and Gregory Pruett share their thoughts on how deploying SRE at scale can help boost application dependability, reduce downtime and support business growth. They also provide steps for you to start your own SRE program. Read their article here: https://lnkd.in/e5RxE5ds #Devops #SRE #Monitoring
Fabrizio Biscotti

Managing Vice President at Gartner
3mo
Software engineering leaders often struggle to assess and prioritize product reliability, leading to a lack of focus on Site Reliability Engineering (SRE). The recommendations include using system-level indicators and service-level objectives for data-driven decisions, making reliability a shared priority, and integrating SRE practices. The strategic planning assumption predicts that by 2027, 75% of enterprises will adopt SRE practices organization-wide. The research emphasizes the importance of the partnership between software engineering leaders and site reliability engineers in improving product reliability. Read the full note for more insights. #SoftwareEngineering #Reliability #SRE #CustomerExperience #ContinuousQuality https://bit.ly/3JJSHZA

How Software Engineering Teams Should Work With Site Reliability Engineers
Certo Modo

57 followers
4mo Edited
Is your software engineering team bogged down with incidents, slow and painful release cycles, and tons of manual tasks? Are customers beginning to notice and getting frustrated? You might be curious about Site Reliability Engineering as a way to address these issues, but perhaps you don’t have the budget to hire an entire team or department to implement it like larger tech companies. Don’t let that stop you! Join us on Thursday, April 11th, 2024 at 2PM ET (11AM PT) to attend a FREE webinar ‘Lean SRE’ where we discuss implementing Site Reliability Engineering practices quickly and effectively without a huge staffing investment! Sign up here: https://lnkd.in/ekryP7MD #devops #sre #softwareengineering #saas #productmanagement
Yogesh Gupta

Associate Vice President - R&D
4mo
If you think you are Agile with a capital A and are proud of your velocity, but have not put any effort into implementing Site Reliability Engineering in your systems (automated monitoring, fault tolerance, scalability, performance adjustments, etc.), then you are Agile only in principle, not in practice. It is all right to take a short-term hit on velocity if you do not want it to come to a complete halt when there is an outage in production. #SRE #DevOps #DevSecOps

2 Comments
Shay Pletcher

Site reliability goblin, engineering leader
10mo
Lately I've been noticing how little of my work as an SRE is around actually hardening our product infrastructure. I almost never personally add redundancy to a service or eliminate a source of downtime.

Ultimately, the primary source of system instability is process and lack of information: people aren't aware of when they're introducing unstable or unsafe architectures, they don't have the tools to verify whether their service is failing, or your application is set up so that designing stable features is difficult and time consuming. None of these are things an SRE can fix by personally making code or cloud configs more reliable.

The only way to sustainably improve site reliability at an organization is to build systems and processes that make it easy to see when services are failing and easy to write services that don't. That looks like monitoring, training, internal tooling, and patterns to follow that make it difficult to build services that aren't reliable. It looks like getting buy-in from product so they understand their product's performance and are working with you and their team to keep it high.

I've worked at organizations where one brilliant engineer stepped in to clean up each service's stability personally. Those organizations ground to a halt when that engineer wasn't available, and that engineer was actively making their team less efficient.

Software is about building levers. If you always insist on personally being the lever, eventually you're going to snap in half.

#sre #devops #engineering #process #leadership

1 Comment
Craig Munson
8mo
The latest blog from Resolve Systems is out, covering site reliability engineering (SRE) with a fresh perspective. Some might say SRE is the hottest new topic that’s gaining momentum and growing in importance by the minute. But really, Google made SRE a thing back in 2003. So, you might ask, what’s the most relevant info on SRE that IT teams need to know today and as we approach 2024? From properly interpreting the relationship between DevOps, SRE, and Ops to fully understanding what they truly have to do with each other, our engineering expert John Gorham takes you deep into the world of SRE as it applies to IT teams. https://lnkd.in/gGFUGKuu #sre #sitereliabilityengineering
1 Comment
Jesse McGee

NetSuite Technical Trainer - Korn Ferry (Attached to Kennicott Brothers)
7mo
The latest blog from Resolve Systems is out, covering site reliability engineering (SRE) with a fresh perspective. Some might say SRE is the hottest new topic that’s gaining momentum and growing in importance by the minute. But really, Google made SRE a thing back in 2003. So, you might ask, what’s the most relevant info on SRE that IT teams need to know today and as we approach 2024? From properly interpreting the relationship between DevOps, SRE, and Ops to fully understanding what they truly have to do with each other, our engineering expert John Gorham takes you deep into the world of SRE as it applies to IT teams. https://lnkd.in/gGFUGKuu #sre #sitereliabilityengineering
Squadcast

7,046 followers
2mo
SRE combines software engineering with infrastructure and operations to build scalable, highly reliable systems. Discover the principles and practices of Site Reliability Engineering (SRE) in our latest blog post. https://bit.ly/3Vb5NED #SiteReliabilityEngineering #SRE #squadcast

What is Site Reliability Engineering and How it Transforms IT Operations? | Squadcast

squadcast.com
Venkatesh Dhanapalraj

DevOps Engineer II | Cloud Services, Cloud Infrastructure | Mentor
8mo
🚀 Excited to share my latest Medium blog, delving into the fascinating realms of #DevOps vs. #SRE! 🌐 Discover key insights on efficient software delivery and reliability engineering. 👉 Follow me for more in-depth content and join the conversation! Read the blog here: https://lnkd.in/g__tciTc Connect with me on LinkedIn for more tech discussions: https://lnkd.in/ghVcv7rT #TechInsights #FollowForUpdates #TechInnovation ✨

DevOps vs SRE Understand the differences

link.medium.com

