Customer Success Center:

Day 2 Operations - Automate, Consolidate, Extend, and Monitor

Master practical strategies for seamless environment scaling, policy enforcement, and real-world troubleshooting to confidently manage Day 2 operations.

The Day 2 Operations Workshop builds on your Kubernetes and container expertise, equipping you to manage and operate environments at scale. By the end of the workshop, you’ll be ready to handle scenarios ranging from managing a growing number of nodes to keeping critical production workloads operational, using the right tools and strategies.

This workshop will empower you to scale and replicate environments seamlessly, regardless of their purpose or complexity. You’ll also learn how to extend Kubernetes security to services and applications outside of the Kubernetes ecosystem, creating a unified, secure environment through a single pane of glass.

Our journey begins with automation, transitioning from manual fine-tuning to streamlined, industrial-grade operations. Next, we’ll address securing non-Kubernetes nodes and critical services that your microservices depend on. Finally, we’ll cover the essential metrics and monitoring practices needed to ensure your systems operate optimally.

Through a combination of theoretical insights, clear diagrams, and hands-on labs, you’ll gain practical experience in real-world scenarios, including troubleshooting exercises. By the end, you’ll feel confident and prepared to manage complex, scalable operations with ease.

Scope

The Day 2 Operations Workshop covers the following:

Automating your Calico deployment using GitOps practices, with practical examples of Calico installation and security policy configuration with ArgoCD
Enhancing and consolidating your security posture by leveraging Calico's non-cluster host security policies
Identifying key metrics to monitor, defining appropriate threshold values, and outlining necessary remediation steps when alerts are triggered by Calico components

Value

Deepen your understanding of how to secure environments so that you can scale operations effectively
Gain hands-on expertise in automating the deployment and configuration of Calico, and seamlessly integrating it into GitOps workflows
Build a solid foundation in Kubernetes cluster and network security, extending protection to workloads outside the cluster to consolidate tools and simplify management
Acquire practical experience in operating and maintaining critical cluster components, enabling swift remediation when issues arise under pressure
Gain comprehensive insights into recommended security best practices, architectural principles, and their integration into Day 2 operations
Hone your troubleshooting skills through hands-on labs, featuring guided walkthroughs and “fix-it-yourself” scenarios for common challenges

Delivery

Module 01 - Automation

Workshop Overview:
  • Introductions and agenda review
  • Gather specific requirements and areas of interest from participants
GitOps and Cluster Architecture:
  • Overview of GitOps principles in cluster management
  • Detailed discussion of cluster architecture
Git Repository Architecture and Workflow:
  • Structure and best practices for repository management
  • Workflow walkthrough and upgrade management
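
As a rough illustration of the GitOps workflow outlined in this module, the sketch below builds an ArgoCD Application manifest that installs Calico from its Helm chart. It is a minimal sketch, not the workshop's own material: the application name, the `argocd` namespace, the `default` project, the chart repository URL, and the `tigera-operator` destination namespace are all assumptions to check against your environment, and the chart version is deliberately left as a placeholder.

```python
import json


def calico_argocd_application(chart_version, repo_url="https://docs.tigera.io/calico/charts"):
    """Build an Argo CD Application that installs Calico via the tigera-operator chart.

    Every name here (application name, namespaces, project, repo URL) is an
    assumption made for illustration; adjust to your own repository layout.
    """
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        "metadata": {"name": "calico", "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "chart": "tigera-operator",
                # Pinning the chart version in Git makes upgrades a reviewed commit.
                "targetRevision": chart_version,
            },
            "destination": {
                "server": "https://kubernetes.default.svc",
                "namespace": "tigera-operator",
            },
            "syncPolicy": {
                # Automated sync reverts manual drift and prunes removed resources.
                "automated": {"prune": True, "selfHeal": True},
                "syncOptions": ["CreateNamespace=true"],
            },
        },
    }


if __name__ == "__main__":
    # "<chart-version>" is a placeholder; pin the release you have validated.
    # JSON is valid YAML, so the output can be committed to the Git repository as-is.
    print(json.dumps(calico_argocd_application("<chart-version>"), indent=2))
```

With a manifest like this stored in Git, an upgrade becomes a change to `targetRevision` in a pull request rather than a manual command against the cluster.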

Module 02 - Security Architecture and Troubleshooting

Universal Microsegmentation:
  • What it is and how it fits into the overall architecture
Policy Architecture and Processing:
  • Overview of Calico policy architecture
  • Understanding how policies are processed
Tools and Methodology:
  • Key tools for troubleshooting
  • Methodologies for diagnosing and resolving common issues
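
To make the idea of extending policy to non-cluster hosts more concrete, the sketch below builds two Calico resources: a HostEndpoint that represents an interface on a machine outside the cluster, and a GlobalNetworkPolicy that allows selected workloads to reach it on one TCP port. This is a sketch under stated assumptions: the host name, interface, IP address, labels, selectors, and port are hypothetical, and the exact resource fields should be confirmed against the reference for your Calico version.

```python
import json


def host_endpoint(name, node, interface, ips, labels):
    """Describe one interface on a non-cluster host as a Calico HostEndpoint."""
    return {
        "apiVersion": "projectcalico.org/v3",
        "kind": "HostEndpoint",
        "metadata": {"name": name, "labels": labels},
        "spec": {"node": node, "interfaceName": interface, "expectedIPs": ips},
    }


def allow_tcp_policy(name, target_selector, source_selector, port):
    """Allow ingress on one TCP port from selected sources to selected endpoints."""
    return {
        "apiVersion": "projectcalico.org/v3",
        "kind": "GlobalNetworkPolicy",
        "metadata": {"name": name},
        "spec": {
            "selector": target_selector,
            "types": ["Ingress"],
            "ingress": [
                {
                    "action": "Allow",
                    "protocol": "TCP",
                    "source": {"selector": source_selector},
                    "destination": {"ports": [port]},
                }
            ],
        },
    }


if __name__ == "__main__":
    # All values below are hypothetical examples, not recommendations.
    hep = host_endpoint("db-1-eth0", "db-1", "eth0", ["192.0.2.10"], {"role": "database"})
    pol = allow_tcp_policy("allow-app-to-db", "role == 'database'", "app == 'orders'", 5432)
    for resource in (hep, pol):
        print(json.dumps(resource, indent=2))
```

Because the policy selects endpoints by label, the same rule applies whether the matching endpoint is a pod or a host outside the cluster. Test host endpoint policies on a non-critical machine first, since traffic that is not explicitly allowed may be dropped.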

Module 03 - Monitoring

Understanding Observability vs Monitoring:
  • Key differences and importance of both in cluster operations
Core Components:
  • Typha and Felix: their roles and integration
  • Licensing considerations
  • Elasticsearch and its role in monitoring and policy enforcement
Denied Traffic Metric:
  • Use calico-node metrics to troubleshoot denied traffic
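
One way to work with a denied-traffic metric is to read the Prometheus text output of calico-node and list the sources whose denied-packet count exceeds a threshold. The sketch below does this with the standard library only. The metric name `calico_denied_packets` and its `policy` and `srcIP` labels are assumptions here; confirm the names exposed by your Calico version, and treat the threshold as a value to tune.

```python
import re

# One sample line looks like: metric_name{label="value",...} 12
_SAMPLE = re.compile(
    r'^(?P<name>[A-Za-z_:][A-Za-z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)'
)
_LABEL = re.compile(r'(\w+)="((?:[^"\\]|\\.)*)"')


def denied_sources(metrics_text, metric="calico_denied_packets", threshold=0.0):
    """Return (labels, value) pairs for samples of `metric` above `threshold`.

    The default metric name is an assumption; pass the name your deployment exposes.
    Results are sorted with the highest count first.
    """
    hits = []
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE.match(line)
        if not match or match.group("name") != metric:
            continue
        value = float(match.group("value"))
        if value > threshold:
            labels = dict(_LABEL.findall(match.group("labels") or ""))
            hits.append((labels, value))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)
```

In practice the same check is usually expressed as a Prometheus alert rule; the function is only meant to show which labels point to the policy and source behind a deny.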

Deliverables

Automated Calico deployment and security policy management using ArgoCD
Comprehensive guides for creating and deploying policies in both cluster and non-cluster environments
Insights into metrics scraping with Prometheus, including key metrics interpretation, threshold definitions, and remediation strategies