Why IOPS, Cluster Sizes, and Filesystems Matter in Kubernetes

Deep dive into storage performance considerations for Kubernetes workloads, including IOPS optimization, cluster sizing strategies, and filesystem selection for optimal performance.


When designing Kubernetes clusters for production workloads, three critical factors often determine the success or failure of your deployment: IOPS (Input/Output Operations Per Second), cluster sizing, and filesystem selection. These foundational elements directly impact application performance, scalability, and reliability.

Understanding IOPS in Kubernetes Context

What Are IOPS?

IOPS measure how many read and write operations your storage system can handle per second. In Kubernetes environments, this translates to how quickly your pods can:

  • Start up and load container images
  • Read configuration files and secrets
  • Write logs and application data
  • Handle database operations
  • Process file-based workloads
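
To see what a node's disks can actually deliver before you schedule IOPS-sensitive pods on them, a short fio run against the kubelet data directory is a quick sanity check (a minimal sketch; the test directory, file size, and queue depth are assumptions to adapt to your environment):

# Random 4k read/write benchmark on a node (run as root, then clean up the test files)
mkdir -p /var/lib/kubelet/fio-test
fio --name=k8s-iops-check \
    --directory=/var/lib/kubelet/fio-test \
    --rw=randrw --bs=4k --size=1g \
    --ioengine=libaio --direct=1 \
    --numjobs=4 --iodepth=32 \
    --runtime=60 --time_based --group_reporting

The read and write IOPS that fio reports are the numbers to compare against the per-workload figures below.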

IOPS Requirements by Workload Type

Different workloads have vastly different IOPS requirements:

# Example: Database workload requiring high IOPS
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: database-storage
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  # Provisioned IOPS belong on the StorageClass (see the StorageClass
  # example in the best-practices section below), not on the claim itself
  storageClassName: fast-ssd

Workload Categories:

  • Databases: 1000-10000+ IOPS
  • Log aggregation: 500-2000 IOPS
  • Web applications: 100-500 IOPS
  • Batch processing: 50-200 IOPS
  • Static content: 10-50 IOPS
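
If you want to check whether a volume can sustain a given category before committing to it, a fio job file can approximate the access patterns (a rough sketch; the directory, block sizes, and read/write mix are assumptions, not measurements of any particular application):

# iops-profiles.fio  (run with: fio iops-profiles.fio)
[global]
directory=/mnt/test-volume
size=2g
runtime=60
time_based
direct=1
ioengine=libaio

[database-like]
rw=randrw
rwmixread=70
bs=8k
iodepth=32

[batch-like]
stonewall
rw=read
bs=1m
iodepth=4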

Cluster Sizing: The Foundation of Performance

Node Sizing Strategies

Cluster sizing isn’t just about CPU and memory—storage performance scales with your infrastructure choices:

Small Clusters (1-10 nodes)

# Optimal for development and small production workloads
Node Specs:
  CPU: 4-8 cores
  Memory: 16-32 GB
  Storage: 100-500 GB SSD
  Network: 1-10 Gbps
  Expected IOPS: 1000-3000 per node

Medium Clusters (10-50 nodes)

# Production workloads with moderate scaling
Node Specs:
  CPU: 8-16 cores
  Memory: 32-64 GB
  Storage: 500-1000 GB SSD
  Network: 10-25 Gbps
  Expected IOPS: 3000-8000 per node

Large Clusters (50+ nodes)

# High-scale production environments
Node Specs:
  CPU: 16-32+ cores
  Memory: 64-128+ GB
  Storage: 1000+ GB NVMe SSD
  Network: 25+ Gbps
  Expected IOPS: 8000-20000+ per node

Storage Distribution Patterns

# Example: Distributing storage across availability zones
kubectl get nodes -o custom-columns=NAME:.metadata.name,ZONE:.metadata.labels.'topology\.kubernetes\.io/zone',STORAGE:.status.allocatable.ephemeral-storage
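
Beyond checking where capacity currently lives, you can make provisioning zone-aware as well: WaitForFirstConsumer delays volume creation until a pod is scheduled, and allowedTopologies restricts which zones volumes may be created in. A sketch, assuming the AWS EBS CSI driver and placeholder zone names:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: zonal-ssd              # example name
provisioner: ebs.csi.aws.com   # assumes the AWS EBS CSI driver is installed
parameters:
  type: gp3
volumeBindingMode: WaitForFirstConsumer
allowedTopologies:
- matchLabelExpressions:
  - key: topology.kubernetes.io/zone
    values:
    - us-east-1a
    - us-east-1b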

Filesystem Selection: The Hidden Performance Factor

Filesystem Comparison for Kubernetes

Filesystem | Use Case               | IOPS Performance | Pros                        | Cons
-----------|------------------------|------------------|-----------------------------|--------------------
ext4       | General purpose        | Good             | Stable, widely supported    | Limited scalability
XFS        | Large files, databases | Excellent        | High performance, scalable  | Complex tuning
Btrfs      | Advanced features      | Good             | Snapshots, compression      | Less mature
ZFS        | Enterprise storage     | Excellent        | Data integrity, rich feature set | Resource intensive
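
Before tuning anything, confirm which filesystem a node is actually using for its container and volume directories (paths can differ by distribution and container runtime):

# Filesystem type per block device
lsblk -f

# Filesystem backing the kubelet data directory
df -T /var/lib/kubelet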

Filesystem Tuning for Kubernetes

XFS Optimization Example

# Mount options for high-performance XFS in Kubernetes nodes
/dev/sdb1 /var/lib/kubelet xfs defaults,noatime,largeio,inode64,allocsize=16m 0 2
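
To prepare a disk with these options, format it and verify the resulting geometry after mounting (a sketch; /dev/sdb1 matches the device in the fstab line above, and the command destroys any existing data on it):

# Create the XFS filesystem, then inspect block size, allocation groups, and log layout
mkfs.xfs -f /dev/sdb1
xfs_info /var/lib/kubelet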

ext4 Tuning

# High-performance ext4 configuration (data=writeback and barrier=0 trade
# crash consistency for speed; nobh is ignored on modern kernels)
/dev/sdb1 /var/lib/kubelet ext4 defaults,noatime,data=writeback,barrier=0 0 2

Real-World Performance Impact

Case Study: E-commerce Platform

Before Optimization:

  • 20-node cluster with spinning disks
  • 150 IOPS per node average
  • Pod startup time: 45-60 seconds
  • Database query latency: 500-1000ms

After Optimization:

  • Same cluster with SSD + XFS + proper sizing
  • 5000 IOPS per node average
  • Pod startup time: 5-10 seconds
  • Database query latency: 50-100ms

Monitoring IOPS in Kubernetes

# Prometheus monitoring for storage performance
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: node-exporter-storage
spec:
  selector:
    matchLabels:
      app: node-exporter
  endpoints:
  - port: metrics
    path: /metrics
    interval: 30s

Key metrics to monitor:

# IOPS utilization
rate(node_disk_reads_completed_total[5m]) + rate(node_disk_writes_completed_total[5m])

# Disk latency
rate(node_disk_read_time_seconds_total[5m]) / rate(node_disk_reads_completed_total[5m])

# Queue depth
node_disk_io_now
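
Once these metrics are scraped, alert on them rather than relying on dashboards. A hedged example of a PrometheusRule that fires when average read latency stays high (the 50ms threshold, names, and labels are assumptions to tune for your disks):

apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: storage-latency-alerts   # example name
spec:
  groups:
  - name: storage
    rules:
    - alert: HighDiskReadLatency
      expr: |
        rate(node_disk_read_time_seconds_total[5m])
          / rate(node_disk_reads_completed_total[5m]) > 0.05
      for: 10m
      labels:
        severity: warning
      annotations:
        summary: "Disk read latency above 50ms on {{ $labels.instance }}"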

Best Practices for Production

1. Storage Class Design

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: high-iops-ssd
# io2 volumes with an explicit IOPS value require the AWS EBS CSI driver
provisioner: ebs.csi.aws.com
parameters:
  type: io2
  iops: "3000"
  csi.storage.k8s.io/fstype: xfs
reclaimPolicy: Retain
allowVolumeExpansion: true
volumeBindingMode: WaitForFirstConsumer
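
A claim then only needs to reference the class; with WaitForFirstConsumer, the io2 volume is created in the zone where the consuming pod lands:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: orders-db-storage   # example name
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Gi
  storageClassName: high-iops-ssd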

2. Resource Limits and Requests

apiVersion: v1
kind: Pod
metadata:
  name: database           # example name
spec:
  containers:
  - name: database
    image: postgres:16     # placeholder image
    resources:
      requests:
        memory: "2Gi"
        cpu: "500m"
        ephemeral-storage: "10Gi"
      limits:
        memory: "4Gi"
        cpu: "1000m"
        ephemeral-storage: "20Gi"
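
To avoid repeating these values in every manifest, a LimitRange can supply namespace-wide defaults for ephemeral storage (a sketch; the values are placeholders):

apiVersion: v1
kind: LimitRange
metadata:
  name: ephemeral-storage-defaults   # example name
spec:
  limits:
  - type: Container
    defaultRequest:
      ephemeral-storage: "1Gi"
    default:
      ephemeral-storage: "4Gi"

The defaults apply only to containers that do not set their own requests or limits.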

3. Node Affinity for Storage

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nvme-bound-app      # example name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nvme-bound-app
  template:
    metadata:
      labels:
        app: nvme-bound-app
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: node.kubernetes.io/instance-type
                operator: In
                values: ["i3.xlarge", "i3.2xlarge"]  # NVMe-backed instance types
      containers:
      - name: app
        image: nginx:1.27   # placeholder image

Conclusion

The intersection of IOPS, cluster sizing, and filesystem selection creates a performance triangle that determines your Kubernetes cluster’s capabilities. Ignoring any one of these factors can create bottlenecks that no amount of CPU or memory can overcome.

Key Takeaways:

  1. Match IOPS to workload requirements - Don’t over-provision, but ensure sufficient headroom
  2. Size clusters based on storage patterns - Consider both compute and storage scaling together
  3. Choose filesystems deliberately - XFS for high-performance, ext4 for stability
  4. Monitor continuously - Storage performance degrades over time without proper monitoring
  5. Test under load - Storage performance characteristics change dramatically under pressure

By treating storage as a first-class citizen in your Kubernetes architecture decisions, you’ll build more resilient, performant, and cost-effective clusters.


Want to discuss storage optimization strategies for your Kubernetes infrastructure? Connect with me on LinkedIn or check out more articles on cloud-native architecture.
