Smaller Containers, Fewer Vulnerabilities: Optimizing ITL Providers from 1.2GB to 308MB
Reduced ITL provider containers from 1.2GB to 308MB (74% smaller) with zero functionality loss. Learn the three optimizations that improve security, speed up deployments, and reduce attack surface.
Last week I realized we were shipping 1.2GB containers when the application needed maybe 150MB. That’s 87% waste—and more importantly, a security liability. Every package manager, compiler, and build tool is an attack surface that doesn’t need to exist in production.
So I optimized aggressively. The result: 74% smaller with measurably better security.
How: Three Simple Optimizations
1. Alpine base image (~27% reduction)
FROM python:3.11-alpine
RUN apk add --no-cache gcc musl-dev curl postgresql-dev
Alpine is 130MB smaller than Debian slim and includes only essential tools.
2. Multi-stage builds (~30% reduction)
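FROM python:3.11-alpine AS builder
# Compilers are installed only in this stage
RUN apk add --no-cache gcc musl-dev
# requirements.txt must be copied in before pip can read it
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
FROM python:3.11-alpine
RUN apk add --no-cache curl # Only runtime deps
COPY --from=builder /usr/local/lib/python3.11/site-packages /usr/local/lib/python3.11/site-packages
# Console scripts, if any, are installed to /usr/local/bin and need their own COPY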
Compilers stay in stage 1. Runtime stage has only what’s essential.
3. Aggressive cleanup (~17% reduction)
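# Remove SDK source (redundant after pip install).
# Do this in the builder stage or in the same RUN as the install:
# deleting files in a later layer does not shrink the image.
RUN rm -rf ./ITL.ControlPanel.SDK ./build *.egg-info pyproject.toml
# Clean Python packages (-prune stops find from descending into directories it is about to delete).
# Caution: "*docs*" and "*example*" are broad globs; check that no importable package matches them.
RUN find /usr/local/lib/python3.11/site-packages -type d -name "__pycache__" -prune -exec rm -rf {} + && \
find /usr/local/lib/python3.11/site-packages -type d \( -name "tests" -o -name "*docs*" -o -name "*example*" \) -prune -exec rm -rf {} + && \
find /usr/local/lib/python3.11/site-packages -type f -name "*.pyc" -delete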
Result: 1.2GB → 308MB (74% smaller)
Why Smaller = Safer
Every component in a container is a potential vulnerability. The original 1.2GB image included gcc, git, build tools, and the entire SDK source code—all with CVEs you don’t need.
Removing them is straightforward:
- No compiler in runtime = can’t compile malware if someone gains access
- No SDK source duplication = removed 100MB of redundant code
- Fewer packages = fewer CVEs: the original scan found 87 vulnerabilities, the optimized scan 12 (only runtime-critical)
- Principle of least privilege = include only what you actually use
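A before/after comparison like the one in the list above is easier to repeat when it is scripted. The sketch below assumes the scanner is Trivy (the post does not name one) and that each report was produced with `trivy image --format json`; the report layout it reads (`Results`, `Vulnerabilities`, `VulnerabilityID`, `Severity`) follows that assumption.

```python
from collections import Counter


def severity_counts(report: dict) -> Counter:
    """Count findings per severity in a Trivy-style JSON report (assumed layout)."""
    counts: Counter = Counter()
    for result in report.get("Results") or []:
        # Targets with no findings may omit the key or set it to null
        for vuln in result.get("Vulnerabilities") or []:
            counts[vuln.get("Severity", "UNKNOWN")] += 1
    return counts


def removed_ids(before: dict, after: dict) -> set:
    """Vulnerability IDs present in the first report but absent from the second."""
    def ids(report: dict) -> set:
        return {
            vuln["VulnerabilityID"]
            for result in report.get("Results") or []
            for vuln in result.get("Vulnerabilities") or []
        }
    return ids(before) - ids(after)
```

Each report can be loaded with `json.load` and passed in as a dict; `removed_ids(before, after)` then lists what the optimization actually eliminated, which is more informative than the raw totals.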
The Cleanup: Remove What You’ve Already Installed
Here’s the surprising part: after pip install, your source code is duplicated. Once the SDK is installed to /usr/local/lib/python3.11/site-packages/, the source directory is just dead weight.
# After pip install, remove redundant files
RUN pip install . --no-deps && \
rm -rf ./ITL.ControlPanel.SDK ./build *.egg-info && \
rm -f pyproject.toml
This cleanup removed an extra 24MB from the runtime image—files that were never going to be used after installation.
Similarly: tests, documentation, examples? All gone. Package metadata that only matters during build? Deleted. The image now contains exactly what’s needed to run the application.
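Before deleting anything by name pattern, it can help to list what the patterns would match. The following is a minimal dry-run sketch, not the cleanup the providers actually run: it mirrors the directory globs from the cleanup step in Python so the matches can be reviewed first. The function name is hypothetical.

```python
import fnmatch
import os
from pathlib import Path

# Same directory-name patterns as the find-based cleanup
PATTERNS = ("__pycache__", "tests", "*docs*", "*example*")


def find_prunable_dirs(root, patterns=PATTERNS):
    """Return directories under root whose name matches a pattern (nothing is deleted)."""
    matches = []
    for dirpath, dirnames, _ in os.walk(root):
        keep = []
        for name in dirnames:
            if any(fnmatch.fnmatchcase(name, pat) for pat in patterns):
                matches.append(Path(dirpath) / name)
            else:
                keep.append(name)
        # Do not descend into directories that would be removed anyway
        dirnames[:] = keep
    return sorted(matches)
```

Run against a site-packages directory, this makes over-broad globs visible: `*docs*` also matches an importable package directory such as `docstring_parser`, which would break at import time if removed.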
The Payoff
Faster deployments (60% quicker pulls from registry). Quicker startup (~3-5s faster). Less disk space per node. Cleaner vulnerability scanning. All real, all measurable.
Every provider went through validation: built successfully, started without errors, health checks responded, database migrations ran correctly. Status: ✅ All tested, running in production.
Trade-offs
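Debugging is hard. The Alpine runtime stage above still has BusyBox’s /bin/sh and curl, but the more you strip the less you can inspect, and a fully shell-less image means no docker exec <container> /bin/sh, no curl, no cat, no debugging tools whatsoever. For production that’s ideal. For troubleshooting, it’s painful.
In a shell-less image, applications must handle startup entirely in code, with no shell script shortcuts. This is cleaner architecturally but requires discipline.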
Not suitable for: Development (need debugging). Legacy apps with shell scripts. Anything needing runtime shell access.
Best for: Production services that are mature and tested. Hardened microservices.
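To make "startup entirely in code" concrete, here is a minimal sketch of what can replace an entrypoint shell script: an ordered list of named steps that stops at the first failure. The step shown and the `DATABASE_URL` variable are hypothetical placeholders, not the providers' real startup sequence.

```python
import logging
import os
from typing import Callable, Sequence, Tuple

log = logging.getLogger("startup")


def run_startup(steps: Sequence[Tuple[str, Callable[[], None]]]) -> int:
    """Run named startup steps in order and return a process exit code."""
    for name, step in steps:
        log.info("startup step: %s", name)
        try:
            step()
        except Exception:
            # Log the traceback and stop: later steps depend on earlier ones
            log.exception("startup step failed: %s", name)
            return 1
    return 0


def check_database_url() -> None:
    """Hypothetical step: fail fast when required configuration is missing."""
    if not os.environ.get("DATABASE_URL"):
        raise RuntimeError("DATABASE_URL is not set")
```

An application's main module could pass its own steps (configuration checks, migrations, server start) to `run_startup` and hand the result to `sys.exit`, so a failed step produces a non-zero exit code the orchestrator can act on.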
Could We Go Further?
Google Distroless would get us to ~150-200MB (no shell, no package manager, minimal libc only). The trade-off: zero debugging capability. Can’t run any tools inside the container. For hardened production, it’s worth it. For anything with operational needs, Alpine with tools is a better balance.
Comparison: Debian Slim vs Alpine vs Distroless
| Factor | Debian Slim | Alpine | Distroless |
|---|---|---|---|
| Base Size | 80MB | 7MB | 2-5MB |
| Final Image | 1.2GB+ | 308MB | 150-200MB |
| Vulnerabilities | 87 | 12 | 2-4 |
| Shell Access | Yes | Yes | No |
| Package Manager | Yes (apt) | Yes (apk) | No |
| Build Tools | Yes | Yes (with --no-cache) | No |
| Python Packages | Full | Full | Full |
| Debugging | Easy | Easy | Extremely hard |
| Security | Moderate | Good | Excellent |
| Dev vs Prod | Both | Both | Prod only |
Distroless Implementation
Moving to distroless requires a different multi-stage approach:
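# Build on a glibc-based image: Distroless is built from Debian, so C extensions
# compiled against Alpine's musl libc would fail to load in the final stage.
FROM python:3.11-slim-bookworm AS builder
RUN apt-get update && apt-get install -y --no-install-recommends gcc libpq-dev && \
    rm -rf /var/lib/apt/lists/*
WORKDIR /build
COPY requirements.txt .
RUN pip install --no-cache-dir --target site-packages -r requirements.txt
# Assumption: the Debian 12 Distroless Python image, which ships Python 3.11
FROM gcr.io/distroless/python3-debian12:nonroot
# Copy Python packages from builder
COPY --from=builder /build/site-packages /usr/local/lib/python3.11/site-packages
# Make sure the copied packages are importable regardless of the default sys.path
ENV PYTHONPATH=/usr/local/lib/python3.11/site-packages
# Copy application code only
COPY --chown=nonroot:nonroot ./app /app
WORKDIR /app
# The :nonroot tag runs as the nonroot user (65532)
# Exec form (JSON array) is required: shell-form ENTRYPOINT/CMD needs /bin/sh
ENTRYPOINT ["/usr/bin/python3", "/app/main.py"]
# Shared libraries that compiled packages link against (e.g. libpq) must also exist in this stage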
Key differences:
- No shell interpreter (/bin/sh doesn’t exist)
- No package manager (can’t run apt or apk)
- Exec-form ENTRYPOINT or CMD only (there is no shell to parse the shell form)
- Must use --chown for file ownership (no RUN chown later)
- Built-in nonroot user for security
Production Results (Distroless)
- Final Size: 165MB (86% smaller than original Debian, 46% smaller than Alpine)
- CVE Count: 2 (only critical libc vulnerabilities, no application deps)
- Build Time: 22 seconds (faster, fewer layers)
- Startup Time: 1.2 seconds (fastest of all three)
- Disk Space per 1000 nodes: 165GB (vs 1.2TB for Debian, 308GB for Alpine)
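The size figures can be checked with a few lines of arithmetic. This sketch takes "1.2GB" as 1200MB and uses decimal units (1GB = 1000MB); both are assumptions about how the post rounds its numbers.

```python
def percent_smaller(before_mb: float, after_mb: float) -> float:
    """Size reduction from before to after, as a percentage of before."""
    return (1 - after_mb / before_mb) * 100


def fleet_disk_gb(image_mb: float, nodes: int = 1000) -> float:
    """Disk used when every node holds one copy of the image (1GB = 1000MB)."""
    return image_mb * nodes / 1000


DEBIAN_MB, ALPINE_MB, DISTROLESS_MB = 1200, 308, 165

print(f"Debian -> Alpine:     {percent_smaller(DEBIAN_MB, ALPINE_MB):.0f}%")      # 74%
print(f"Debian -> Distroless: {percent_smaller(DEBIAN_MB, DISTROLESS_MB):.0f}%")  # 86%
print(f"Alpine -> Distroless: {percent_smaller(ALPINE_MB, DISTROLESS_MB):.0f}%")  # 46%
print(f"Distroless on 1000 nodes: {fleet_disk_gb(DISTROLESS_MB):.0f}GB")          # 165GB
```

Under these assumptions the Alpine-to-Distroless step works out to about 46%, and the per-fleet figures are simply the image size multiplied by the node count.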
Should You Use Distroless?
Use Distroless if:
- ✅ Service is production-hardened and tested
- ✅ Logs/monitoring streams to external system (not parsed in-container)
- ✅ Security & minimal attack surface is highest priority
- ✅ Debugging happens via live logs, not container shell
- ✅ You have CI/CD automation for rollbacks
Don’t use Distroless if:
- ❌ Debugging in production containers is required
- ❌ Health checks use in-container shell scripts
- ❌ Config is managed via in-container tools
- ❌ Team isn’t comfortable with no shell access
- ❌ Legacy startup scripts expect /bin/sh
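If a shell-based health check is the only blocker, the probe can be written in Python, since the interpreter is the one tool a Distroless Python image does have. This is a minimal sketch using only the standard library; the function name and any URL passed to it are hypothetical.

```python
import http.client
import urllib.error
import urllib.request


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True when url answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, non-2xx response, or malformed URL
        return False
```

A small script that calls `sys.exit(0 if is_healthy(url) else 1)` can then be wired in as an exec-form health check command, so no curl or /bin/sh is needed inside the container.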
Our Production Setup
We’re using a hybrid approach:
- Development: Debian slim (easy debugging)
- Staging: Alpine (production-like, but debuggable)
- Production: Distroless (maximum hardening)
Same codebase, different Dockerfile stages selected at build time. This gives us:
- Development velocity with full debugging
- Production security without compromise
- Easy promotion through environments
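Selecting a stage at build time comes down to passing `--target` to `docker build`. The sketch below shows one way a CI script might assemble that command; the stage names, the image name, and the tag scheme are hypothetical, not the ones used in the ITL pipeline.

```python
import shlex
from typing import List

# Hypothetical mapping from environment to a named stage in one Dockerfile
STAGE_FOR_ENV = {
    "development": "dev-debian",
    "staging": "staging-alpine",
    "production": "prod-distroless",
}


def build_command(env: str, image: str, context: str = ".") -> List[str]:
    """Return the docker build argv for one environment."""
    stage = STAGE_FOR_ENV[env]  # an unknown environment raises KeyError on purpose
    return ["docker", "build", "--target", stage, "-t", f"{image}:{env}", context]


print(shlex.join(build_command("production", "itl-provider")))
# docker build --target prod-distroless -t itl-provider:production .
```

Returning an argument list rather than a string lets the caller pass it straight to `subprocess.run(cmd, check=True)` without involving a shell.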
Infrastructure: Our CI/CD automatically builds these optimized images. From 1.2GB down to 308MB, every provider benefits—faster deployments, smaller registry footprint, fewer CVEs to track.
The Bottom Line
Smaller containers are faster, cheaper, and more secure. Not magic—just the principle of removing everything that doesn’t have to be there.
Apply it beyond containers: remove unnecessary code, unnecessary dependencies, unnecessary access. Build systems that are as simple as they need to be, not as complex as they can be.
That’s the philosophy behind the ITL Control Plane. Abstract first, optimize second.
If you’re shipping containers, follow this checklist:
- Switch to Alpine base (or Distroless for maximum hardening)
- Multi-stage builds (builder stage + runtime stage)
- Remove SDK/source directories after pip install
- Remove build artifacts (build/, *.egg-info)
- Strip test files and documentation
- Remove __pycache__ and .pyc files
- Test end-to-end before production
- Run vulnerability scans pre/post optimization
Part of the ITL Control Plane series. Previously: Building Your Own Cloud: ITL Control Plane Alpha