How Kubernetes Works from Start to Finish
-
You set up a Kubernetes cluster: it has a control plane (the brain) and multiple worker nodes (the muscle).
-
You write a YAML file that tells K8s what you want: like “run this container, make 3 copies, expose it on this port”.
-
You send this YAML to Kubernetes using the kubectl apply command.
-
The API Server receives the request and stores the desired state in etcd (a key-value store that acts like memory for K8s).
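As a sketch, the kind of YAML described above might be a minimal Deployment (the name my-app and the image nginx:1.25 are placeholder assumptions):

```yaml
# deployment.yaml -- apply with: kubectl apply -f deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app              # placeholder name
spec:
  replicas: 3               # "make 3 copies"
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: web
          image: nginx:1.25   # placeholder image
          ports:
            - containerPort: 80
```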
-
The Scheduler picks the best worker node to run your container, based on CPU, memory, etc.
-
The Controller Manager makes sure what you asked for actually happens: if you want 3 pods and only 2 are running, it creates the missing one.
-
A Pod is the smallest unit: it runs your container(s). Usually, 1 container per pod.
-
The pod is launched on a worker node by kubelet (an agent running on every worker node).
-
The container inside the pod is pulled from a container registry like Docker Hub or your private repo.
-
Every pod gets its own IP address, so pods can reach each other directly; a ClusterIP Service puts a stable virtual IP in front of a set of pods.
-
If you want to expose your service to the outside world, you use a Service of type LoadBalancer or NodePort.
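A sketch of such a Service (the name and selector labels are placeholders; they would match the pods you want to expose):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app-svc
spec:
  type: LoadBalancer        # or NodePort for a static port on every node
  selector:
    app: my-app             # routes traffic to pods with this label
  ports:
    - port: 80              # port the Service listens on
      targetPort: 80        # port the container listens on
```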
-
You can add an Ingress for smarter routing: like URLs, SSL, etc. (reverse proxy-style).
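An Ingress for the URL-based routing and TLS mentioned above might look like this (the host, Service name, and TLS Secret name are assumptions for illustration):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-app-ingress
spec:
  rules:
    - host: app.example.com       # placeholder hostname
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: my-app-svc  # placeholder Service name
                port:
                  number: 80
  tls:
    - hosts: [app.example.com]
      secretName: app-tls         # assumes a TLS Secret with this name exists
```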
-
Kubernetes watches everything: if a pod crashes, it restarts it automatically (self-healing).
-
If you update your app, you can do a rolling update: no downtime, one pod at a time gets replaced.
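The rolling-update behavior can be sketched in the Deployment's update strategy (the values shown are illustrative, not required):

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1   # at most one pod down at a time
      maxSurge: 1         # at most one extra pod during the rollout
```

Triggering a rollout can be as simple as changing the image, e.g. kubectl set image deployment/my-app web=nginx:1.26, then watching it with kubectl rollout status.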
-
If a node dies, Kubernetes automatically shifts pods to healthy nodes: your app keeps running.
-
You can set resource limits: like “don’t let this container use more than 500Mi memory.”
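In a pod spec, the "don't exceed 500Mi" example from above would look roughly like this (container name and image are placeholders):

```yaml
spec:
  containers:
    - name: web
      image: nginx:1.25        # placeholder image
      resources:
        requests:
          cpu: 100m            # guaranteed minimum
          memory: 128Mi
        limits:
          cpu: 500m            # hard ceiling
          memory: 500Mi        # "don't let this container use more than 500Mi"
```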
-
You can mount volumes: persistent storage for stateful workloads like databases.
-
You can inject config and secrets into containers using ConfigMaps and Secrets.
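A sketch of both injection styles (the ConfigMap below is defined inline; the Secret db-secret is an assumption for illustration):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: info
---
apiVersion: v1
kind: Pod
metadata:
  name: demo
spec:
  containers:
    - name: web
      image: nginx:1.25           # placeholder image
      envFrom:
        - configMapRef:
            name: app-config      # every key becomes an env var
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-secret     # assumes this Secret exists
              key: password
```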
-
For scaling, you can enable Horizontal Pod Autoscaler: pods go up/down based on CPU or custom metrics.
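A Horizontal Pod Autoscaler targeting CPU, as a sketch (the Deployment name and thresholds are placeholders):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                 # placeholder Deployment name
  minReplicas: 3
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out above ~70% average CPU
```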
-
You can schedule cron jobs or one-time jobs easily inside the cluster.
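For example, a CronJob that runs every night at 02:00 (the name and command are illustrative):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-report          # placeholder name
spec:
  schedule: "0 2 * * *"         # standard cron syntax: daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: report
              image: busybox:1.36
              command: ["sh", "-c", "echo generating report"]
```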
-
Logging and monitoring are handled by tools like Prometheus, Grafana, and the ELK stack.
-
Everything is declarative: you describe the desired state, and Kubernetes works constantly to match it.
-
You can deploy, roll back, scale, and manage any app (stateless, stateful, or batch) in a consistent, cloud-agnostic way.
Kubernetes Architecture Deep Dive
Control Plane Components
The Kubernetes control plane consists of several critical components:
-
API Server (kube-apiserver):
- The front-end for the Kubernetes control plane
- Exposes the Kubernetes API
- Processes RESTful requests and validates them
- The only component that talks directly to etcd; all other components go through it
- Uses RBAC for authorization and TLS for secure communication
-
etcd:
- Distributed key-value store that stores all cluster data
- The source of truth for the cluster state
- Usually deployed as a high-availability cluster
- Uses the Raft consensus algorithm
- Stores data in a structured hierarchical key space
-
Scheduler (kube-scheduler):
- Watches for newly created Pods with no assigned node
- Selects an optimal node for them to run on
- Filters candidate nodes, then scores the survivors, taking into account:
- Hardware/software/policy constraints
- Data locality
- Resource requirements
- Inter-workload interference
- Deadlines
-
Controller Manager (kube-controller-manager):
- Runs controller processes that regulate the state of the system
- Node Controller: Notices when nodes go down
- Replication Controller: Maintains the correct number of pods
- Endpoints Controller: Populates the Endpoints object
- Service Account & Token Controllers: Create default accounts and API access tokens
-
Cloud Controller Manager:
- Embeds cloud-specific control logic
- Links cluster to cloud provider’s API
- Manages cloud-specific components like load balancers and storage
Node Components
Each worker node runs these components:
-
Kubelet:
- Agent that runs on each node
- Ensures containers are running in a Pod
- Takes PodSpecs from the API server
- Uses Container Runtime Interface (CRI) to talk to container runtime
- Reports node and pod status back to the API server
-
Container Runtime:
- Software responsible for running containers
- Options include containerd and CRI-O (Docker Engine works via the cri-dockerd adapter since dockershim's removal)
- Pulls images and runs containers
-
Kube-proxy:
- Network proxy that runs on each node
- Implements part of the Kubernetes Service concept
- Maintains network rules on nodes
- Performs connection forwarding or load balancing for service IPs
Networking
Kubernetes networking addresses four concerns:
-
Pod-to-Pod Communication:
- Every Pod gets its own IP address
- All containers within a Pod share the network namespace
- Implemented using Container Network Interface (CNI) plugins
- Popular implementations: Calico, Flannel, Cilium, Weave Net
-
Pod-to-Service Communication:
- Services abstract pod IPs behind a stable virtual IP
- Implemented through kube-proxy, which sets up iptables or IPVS rules
- ClusterIP is the default service type (internal only)
-
External-to-Service Communication:
- NodePort: Exposes the Service on each Node’s IP at a static port
- LoadBalancer: Exposes the Service externally using a cloud provider’s load balancer
- ExternalName: Maps a Service to a DNS name
-
DNS Resolution:
- CoreDNS serves as the cluster DNS server
- Service discovery via DNS names
- Each Service gets an internal DNS entry:
<service-name>.<namespace>.svc.cluster.local
Storage System
Kubernetes handles storage with these abstractions:
-
Persistent Volumes (PV):
- Cluster resource abstracting physical storage
- Lifecycle independent of any Pod that uses it
- Provisioned statically by cluster admin or dynamically via Storage Classes
-
Persistent Volume Claims (PVC):
- Request for storage by a user
- Claims can request specific size and access modes
- Acts as a storage consumption mechanism
-
Storage Classes:
- Define different classes of storage
- Allow dynamic volume provisioning
- Specify provisioner, parameters, reclaim policy
-
Volume Types:
- Block storage: AWS EBS, GCE PD, Azure Disk
- File storage: NFS, Azure File, AWS EFS
- Object storage: via Custom Storage Integrations
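Putting the claim side of this together, a PersistentVolumeClaim sketch (the StorageClass name standard is an assumption; it depends on the cluster):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes: [ReadWriteOnce]   # single-node read-write
  storageClassName: standard     # assumes this StorageClass exists
  resources:
    requests:
      storage: 10Gi
---
# A pod then consumes the claim via its volumes section:
# volumes:
#   - name: data
#     persistentVolumeClaim:
#       claimName: data-pvc
```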
RBAC and Security
Kubernetes security model includes:
-
Role-Based Access Control (RBAC):
- Regulates access to resources based on roles
- Roles: Permissions within a namespace
- ClusterRoles: Cluster-wide permissions
- RoleBindings: Bind roles to users in a namespace
- ClusterRoleBindings: Bind cluster roles cluster-wide
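A minimal Role plus RoleBinding sketch (the namespace dev and the user jane are placeholders):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  namespace: dev               # placeholder namespace
  name: pod-reader
rules:
  - apiGroups: [""]            # "" means the core API group
    resources: [pods]
    verbs: [get, list, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  namespace: dev
  name: read-pods
subjects:
  - kind: User
    name: jane                 # placeholder user
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role
  name: pod-reader
  apiGroup: rbac.authorization.k8s.io
```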
-
Pod Security:
- Pod Security Standards define different security levels
- SecurityContext: Configure security settings at Pod or Container level
- PodSecurityPolicies (deprecated) / Pod Security Admission: Enforce security standards
-
Network Policies:
- Firewall-like rules for Pod networking
- Specify how Pods communicate with each other
- Based on labels, namespaces, and IP blocks
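For instance, a policy that only lets frontend pods reach backend pods (the labels and port are illustrative):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend
spec:
  podSelector:
    matchLabels:
      app: backend             # the pods this policy protects
  policyTypes: [Ingress]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend    # only pods with this label may connect
      ports:
        - protocol: TCP
          port: 8080
```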
-
Secrets Management:
- Store sensitive information like passwords, tokens, keys
- Mounted as files or environment variables
- Base64 encoded by default (not encrypted)
- Integration with external secret managers possible
Advanced Topics
-
Custom Resource Definitions (CRDs):
- Extend Kubernetes API with custom resources
- Define new object types
- Operators use CRDs to encode domain-specific knowledge
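A bare-bones CRD sketch introducing a hypothetical Backup resource (the group example.com and the schedule field are invented for illustration):

```yaml
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: backups.example.com        # must be <plural>.<group>
spec:
  group: example.com               # placeholder API group
  scope: Namespaced
  names:
    plural: backups
    singular: backup
    kind: Backup
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:
          type: object
          properties:
            spec:
              type: object
              properties:
                schedule:
                  type: string     # hypothetical field
```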
-
Service Mesh Integration:
- Istio, Linkerd, Consul provide advanced networking features
- Traffic management, security, observability
- Implemented with sidecar pattern
-
Admission Controllers:
- Intercept requests to the API server
- Modify or validate object configurations
- Examples: ResourceQuota, LimitRanger, Pod Security Admission
-
Stateful Applications:
- StatefulSets provide ordered, stable network identifiers
- Stable, persistent storage
- Ordered, graceful deployment and scaling
- Great for databases, distributed systems
Performance Optimization
-
Resource Management:
- Requests: Minimum resources guaranteed
- Limits: Maximum resources allowed
- Quality of Service (QoS) classes:
- Guaranteed: requests=limits
- Burstable: requests < limits
- BestEffort: no requests or limits
-
Pod Affinity/Anti-Affinity:
- Control pod placement relative to other pods
- Ensure related pods run on the same node (affinity)
- Keep competing pods on different nodes (anti-affinity)
-
Node Affinity:
- Place pods on nodes with specific attributes
- Hard requirements: requiredDuringSchedulingIgnoredDuringExecution
- Soft preferences: preferredDuringSchedulingIgnoredDuringExecution
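As a sketch, a hard node-affinity rule pinning pods to SSD-backed nodes (the disktype=ssd label is an assumption about how the nodes are labeled):

```yaml
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:   # hard requirement
        nodeSelectorTerms:
          - matchExpressions:
              - key: disktype
                operator: In
                values: [ssd]      # only nodes labeled disktype=ssd qualify
```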
-
Taints and Tolerations:
- Taints: Mark nodes to repel certain pods
- Tolerations: Allow pods to schedule onto tainted nodes
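The two halves fit together like this (the taint key dedicated=gpu and node name node1 are placeholders):

```yaml
# Taint a node so it repels pods without a matching toleration:
#   kubectl taint nodes node1 dedicated=gpu:NoSchedule
# A pod that tolerates the taint and may therefore land on node1:
spec:
  tolerations:
    - key: dedicated
      operator: Equal
      value: gpu
      effect: NoSchedule
```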