The Day 2 Operations Workshop covers the following:
The Day 2 Operations Workshop builds on your Kubernetes and container expertise and equips you to manage and operate environments at scale. By the end of the workshop, you’ll be ready to handle common Day 2 scenarios, from managing a growing number of nodes to keeping critical production workloads operational, using the right tools and strategies.
This workshop will empower you to scale and replicate environments seamlessly, regardless of their purpose or complexity. You’ll also learn how to extend Kubernetes security to services and applications outside of the Kubernetes ecosystem, creating a unified, secure environment through a single pane of glass.
Our journey begins with automation, transitioning from manual fine-tuning to streamlined, industrial-grade operations. Next, we’ll address securing non-Kubernetes nodes and critical services that your microservices depend on. Finally, we’ll cover the essential metrics and monitoring practices needed to ensure your systems operate optimally.
Through a combination of theoretical insights, clear diagrams, and hands-on labs, you’ll gain practical experience in real-world scenarios, including troubleshooting exercises. By the end, you’ll feel confident and prepared to manage complex, scalable operations with ease.
Scope
Value
Delivery
Module 01 - Automation
- Introductions and agenda review
- Gather specific requirements and areas of interest from participants
- Overview of GitOps principles in cluster management
- Detailed discussion of cluster architecture
- Structure and best practices for repository management
- Workflow walkthrough and upgrade management
Module 02 - Security Architecture and Troubleshooting
- What it is and how it fits into the overall architecture
- Overview of Calico policy architecture
- Understanding how policies are processed
- Key tools for troubleshooting
- Methodologies for diagnosing and resolving common issues
Module 03 - Monitoring
- Key differences and importance of both in cluster operations
- Typha and Felix - Their roles and integration
- Licensing considerations
- Elasticsearch and its role in monitoring and policy enforcement
- Use calico-node metrics to troubleshoot denied traffic