
DevOps, Cloud & Production Industrialization

CI/CD pipelines, monitoring, and continuity at Pichet, Smile, and ACCENSEO. AWS plus OVH VPS Docker on the current infrastructure. Industrialising delivery, observability, and incident response on production SaaS systems.

Personal confidence: Expert (5/5), on a scale of Foundational / Developing / Proficient / Advanced / Expert.
How this competency evolved over time

Each segment is a period (journey or achievement) where the competency was applied. The colour and size of the end dot reflect the level reached during that period.

My definition

DevOps and cloud production, in my definition, is the practice that turns a piece of code into a reliable, observable, recoverable production system. It covers CI/CD, infrastructure-as-code, monitoring, continuity, testing strategy, and advanced Git workflows. Without mature DevOps, the team pays in on-call what it gains in velocity - and observability debt is never paid back at a reasonable cost.

I run this competency across three scopes in parallel:

  • Local dev: Docker Compose, pnpm/Turborepo, reproducible environments via Vagrant or devcontainers.
  • CI/CD: GitHub Actions / Bitbucket Pipelines / GitLab CI depending on the customer context, with Terraform plans validated before any apply.
  • Cloud production: AWS (EC2, RDS, S3, Lambda, EKS, VPC) plus OVH VPS Docker, with observability on ELK or SOFT Monitor depending on the legacy.

Eleven years of progression, from manual deployment at Zend (2014) to multi-tenant Terraform AWS IaC at ACCENSEO (2025-2026), with 15 DevOps, 7 cloud, 7 monitoring, and 7 deployment references in the portfolio.
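As a minimal sketch of the local-dev scope - one application container plus a PostgreSQL service in Docker Compose, where image, port, and credential values are illustrative assumptions, not a copy of any customer setup:

```yaml
# docker-compose.yml - illustrative local-dev stack (names and ports are assumptions)
services:
  app:
    build: .
    ports:
      - "3000:3000"
    environment:
      DATABASE_URL: postgres://app:app@db:5432/app
    depends_on:
      db:
        condition: service_healthy   # app only starts once the database is ready
  db:
    image: postgres:16
    environment:
      POSTGRES_USER: app
      POSTGRES_PASSWORD: app
      POSTGRES_DB: app
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U app"]
      interval: 5s
      retries: 10
    volumes:
      - db-data:/var/lib/postgresql/data   # data survives `docker compose down`
volumes:
  db-data:
```

One `docker compose up` reproduces the same stack on any machine, which is the whole point of the local-dev scope.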

In 2026, the observability stack is consolidating around OpenTelemetry, now CNCF-graduated and natively integrated across Google Cloud, AWS X-Ray, Azure Monitor, Datadog, New Relic, and Honeycomb. The CNCF documents the move from proprietary agents to an open pipeline in How to build a cost-effective observability platform with OpenTelemetry, reporting a 50% observability cost reduction and a measurable MTTR improvement. For a CTO starting a platform today, OpenTelemetry plus explicit FinOps (Infracost) have become the non-negotiable baseline.
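To make that baseline concrete, here is a minimal OpenTelemetry Collector configuration - a sketch assuming an OTLP-speaking backend; the exporter endpoint is a placeholder, not a real vendor URL:

```yaml
# otel-collector.yaml - minimal traces pipeline (endpoint is an assumption)
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
processors:
  batch: {}          # batch spans before export to cut network overhead
exporters:
  otlphttp:
    endpoint: https://otlp.example.com   # hypothetical backend (Datadog, Honeycomb, ...)
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```

Swapping vendors then means changing one exporter block, not re-instrumenting the application - the cost lever the CNCF article describes.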

My evidence

Achievement

Anecdote 1: Codifying every ACCENSEO environment in Terraform on AWS

When I founded ACCENSEO in 2024, I set a non-negotiable rule from the very first customer: no manual configuration anywhere. Customer engagements touched healthcare, institutional real estate, and finance, meaning production databases (PostgreSQL, MongoDB) on machines with several hundred GB of RAM, regular audits, and a need for full reproducibility across dev, staging, and production. Without IaC from day one, drift would have set in within months.

I codified the entire infrastructure in Terraform: EC2 (application servers), RDS PostgreSQL (managed databases), S3 (object storage and backups), CloudFront (CDN), Lambda (serverless), API Gateway (REST exposure), EKS (container orchestration), VPC + Security Groups + IAM (network and security). Every customer environment has its own Terraform workspace with plans validated in GitHub Actions / Bitbucket Pipelines CI before any apply, Infracost plugged into the pipeline for explicit FinOps discipline (cost-delta review on every merge), and SSH tunnels for secured database access. Deployments are zero-downtime, backups automated, and disaster-recovery plans tested quarterly.
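As an illustration of that gate, a hedged GitHub Actions sketch of the plan-plus-Infracost flow described above - workflow, workspace, and secret names are assumptions, not the actual ACCENSEO pipeline:

```yaml
# .github/workflows/terraform-plan.yml - plan validation + FinOps gate (sketch)
name: terraform-plan
on: pull_request

jobs:
  plan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform init -input=false
      # One Terraform workspace per customer environment (hypothetical name)
      - run: terraform workspace select customer-a
      - run: terraform plan -input=false
      # FinOps gate: every PR shows its cost delta before approval
      - uses: infracost/actions/setup@v3
        with:
          api-key: ${{ secrets.INFRACOST_API_KEY }}
      - run: infracost diff --path=. --format=json --out-file=/tmp/infracost.json
      - run: >
          infracost comment github
          --path=/tmp/infracost.json
          --repo=$GITHUB_REPOSITORY
          --pull-request=${{ github.event.pull_request.number }}
          --github-token=${{ secrets.GITHUB_TOKEN }}
          --behavior=update
```

The `apply` step lives in a separate workflow that only runs after this one is green, so no change reaches an environment without a reviewed plan and a visible cost delta.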

Zero manual configuration across the customer fleet, environments rebuildable in minutes after an incident, and explicit FinOps present in every PR - any infra change displays its cost delta before being approved.

That discipline reshaped my commercial posture: I can promise a customer a reproducible environment and a transparent infra budget right from the quote, which sets me apart from consultants stacking ad-hoc servers. It is also the baseline I will replay in the next CTO scale-up role - treat infrastructure as a product deliverable, not as an ops chore.

Achievement

Anecdote 2: Wiring observability into the Pichet PSR platform

The Pichet PSR platform (partner lead ingestion) ingested up to one lead every 2 seconds at peak, from a dozen external partners (SeLoger, Myopla, Cooper Advertising...) under strict SLAs. Each lost lead represented potentially tens of thousands of euros in missed real-estate revenue. Without per-partner observability we were flying blind - an outage on a partner API could go unnoticed for hours.

I built observability partner by partner: dedicated SOFT Monitor dashboards (volume, error rate, latency) with one tab per connected API, real-time email alerts on every critical threshold, and native APIM observability (analytics, throttling, OAuth) on the Microsoft API Manager. I carried the API through 5 consecutive versions, each documented on Confluence, with a progressive migration strategy for legacy partners. On infrastructure, I deployed on AWS EKS with Kubernetes + Docker + GitLab CI, and passed a formal 2023 security audit that hardened access controls and firewall rules.
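SOFT Monitor is proprietary, so its configuration cannot be shown here; as an illustrative equivalent of the per-partner thresholds described, this is what one such alert would look like as a Prometheus rule (metric and label names are hypothetical):

```yaml
# alert-rules.yaml - Prometheus-style equivalent of a per-partner threshold alert
groups:
  - name: partner-leads
    rules:
      - alert: PartnerLeadErrorRateHigh
        # Error ratio per partner over the last 5 minutes, above 5%
        expr: |
          sum by (partner) (rate(leads_ingest_errors_total[5m]))
            / sum by (partner) (rate(leads_ingest_total[5m])) > 0.05
        for: 10m                       # must persist 10 minutes before firing
        labels:
          severity: critical
        annotations:
          summary: "Lead error rate above 5% for partner {{ $labels.partner }}"
```

The key design choice is the `by (partner)` grouping: one rule covers the whole fleet, yet each partner API fires its own alert, which is exactly what kills the silent-outage scenario.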

Zero major lead-loss incidents across 3 years, diagnostic time on cross-system anomalies cut from hours to minutes, SLAs met for every partner, and partner integration lead time down from several weeks to a few days thanks to pipeline industrialisation.

That project locked in a reflex: invest in monitoring tooling on day 1 of a critical platform, because observability debt is never paid back at a reasonable cost. On every ACCENSEO engagement, that is the first deliverable I now lay down on any customer infrastructure I take over.

Achievement

Anecdote 3: Industrialising the Pichet ESB pipeline over 4 years

The Groupe Pichet ESB scope spanned more than 100 production integration flows across 20 business applications, 18K euros per month of Docker/Kubernetes hosting OPEX at Claranet, and 24/7 critical traffic on the accounting and financial flows. When I joined, every flow deployment relied on scattered manual operations, and the SOFT Monitor system was firing 2,377 notifications per month with no triage capability.

I industrialised the pipeline brick by brick. On the CI/CD side, I rolled out a complete GitLab CI chain with explicit kill criteria on every deployment (tests, lint, Terraform plans). On operational quality, I imposed blameless post-mortems on every critical incident, formalised 7 types of technical documentation (DAA application architecture, DAT technical architecture, DAU automation, DEX exploitation, DFX flows, DIN installation, DMI migration), and kept one runbook per flow up to date. For observability, I ran the ELK Stack evaluation (Elasticsearch + Logstash + Kibana) to replace SOFT Monitor, and scoped the move to MongoDB Atlas for non-relational flows.
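A condensed sketch of what such a kill-criteria chain looks like in GitLab CI - stage layout and commands are assumptions, not the actual Pichet pipeline:

```yaml
# .gitlab-ci.yml - every red gate kills the pipeline before anything deploys
stages: [lint, test, plan]

lint:
  stage: lint
  script:
    - pnpm lint          # hypothetical lint entry point; non-zero exit stops everything

test:
  stage: test
  script:
    - pnpm test

terraform-plan:
  stage: plan
  image:
    name: hashicorp/terraform:1.7
    entrypoint: [""]     # override the image entrypoint so GitLab can run the script
  script:
    - terraform init -input=false
    # -detailed-exitcode: 0 = no change, 2 = changes pending (fine), 1 = error (fails the job)
    - terraform plan -input=false -detailed-exitcode || [ $? -eq 2 ]
```

The discipline is in the structure, not the tools: a deploy stage can only ever sit after these gates, so "scattered manual operations" has no pipeline path to come back through.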

A single-digit incident rate maintained across 4 consecutive CIO changes (2021 to 2024), a fact often called out at the COPIL (steering committee) because it was unprecedented in the department. The post-mortem framework I shipped became the department-wide standard for every critical incident.

On this project I understood that DevOps maturity is not a tooling question but a discipline question: a simple system run with lightweight SRE discipline always beats a complex one abandoned after purchase. That is the philosophy I lay down on every ACCENSEO engagement and that I will impose on the next scale-up platform.

My self-critique

Senior level, built on 11 years of progression from manual deployment at Zend (2014) to multi-tenant Terraform AWS IaC at ACCENSEO (2025-2026). Coverage is complete: GitHub Actions CI/CD, Terraform infrastructure-as-code, Docker containerisation, observability (SOFT Monitor + dashboards), continuity (cross-region backups, tested rollback), and Git workflows tuned to context. 15 DevOps + 7 cloud + 7 monitoring + 7 deployment portfolio references. What still needs strengthening: Kubernetes in production beyond EKS-via-Terraform, large-scale OpenTelemetry, and advanced FinOps.

Core to a CTO scale-up role. Without mature DevOps, the team pays in on-call what it gains in velocity. It is what makes the other competencies shippable: an architecture without a pipeline stays theoretical, a strategy without observability is unmeasurable. For a CTO position in regulated industries, it is also what unlocks audits and certifications.

First significant use: BTS IG (IT Management). Progression up to CTO · Founder · technical director, now at 5/5 (Expert). The continuity across these contexts signals robust acquisition, battle-tested through repetition and diversity.

My operating principles

  • treat the deployment pipeline like product code (PR review, tests, ADR)
  • automate early and idempotently, especially the operations done in panic (rollback, restore, credential rotation) - see the sketch after this list
  • measure one indicator (a DORA metric such as cycle time, for example) before stacking ten
  • prefer *a simple system run with lightweight SRE discipline to a complex one abandoned after purchase*
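The sketch referenced in the second principle: a manually triggered rollback workflow whose re-run always converges to the same state - host, path, and service names are hypothetical, and the SSH deploy key is assumed to be provisioned beforehand:

```yaml
# .github/workflows/rollback.yml - a panic operation automated in calm times
name: rollback
on:
  workflow_dispatch:
    inputs:
      tag:
        description: "Known-good image tag to roll back to"
        required: true

jobs:
  rollback:
    runs-on: ubuntu-latest
    steps:
      - name: Repoint the service to the previous image
        # Idempotent: running twice with the same tag yields the same state,
        # so a shaking hand at 3 a.m. cannot make things worse
        run: |
          ssh deploy@vps.example.com \
            "APP_TAG=${{ inputs.tag }} docker compose -f /srv/app/docker-compose.yml up -d app"
```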

My evolution in this skill

DevOps and cloud are what make my CTO decisions measurable. In the 24-month plan, they let me run a production system without unmanageable on-call, defend an infra budget in front of a board with explicit FinOps, and pass a security or compliance audit without surprises. Without them, perceived customer value silently degrades as the base grows.

The observable goal is to run a multi-environment EKS cluster with a transparent budget, low-noise alerts, and automated rollback tested every quarter. The main areas of effort are Kubernetes in production (beyond EKS-via-Terraform), large-scale OpenTelemetry, and advanced FinOps.

Daily hands-on Terraform on ACCENSEO projects, OVH VPS Docker migration in progress (2026) with Traefik as reverse proxy, GitHub Actions for monorepo CI/CD. Master in Software Engineering active until 2026.
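A minimal sketch of the Traefik-in-front-of-Docker layout that migration implies - domain, e-mail, and image names are placeholders:

```yaml
# docker-compose.yml - Traefik as reverse proxy with automatic TLS (sketch)
services:
  traefik:
    image: traefik:v3.0
    command:
      - --providers.docker=true
      - --providers.docker.exposedbydefault=false   # only labeled services are routed
      - --entrypoints.websecure.address=:443
      - --certificatesresolvers.le.acme.email=admin@example.com
      - --certificatesresolvers.le.acme.storage=/letsencrypt/acme.json
      - --certificatesresolvers.le.acme.tlschallenge=true
    ports:
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro  # Traefik discovers containers here
      - letsencrypt:/letsencrypt
  app:
    image: registry.example.com/app:latest   # hypothetical application image
    labels:
      - traefik.enable=true
      - traefik.http.routers.app.rule=Host(`app.example.com`)
      - traefik.http.routers.app.entrypoints=websecure
      - traefik.http.routers.app.tls.certresolver=le
volumes:
  letsencrypt:
```

Routing lives in container labels rather than a central config file, so adding a service to the VPS never means touching the proxy itself.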

AWS Solutions Architect Associate (SAA) certification planned 2026, AWS DevOps Engineer Professional or Kubernetes CKA targeted 2027. Possible intensive SRE cohort (Google SRE workbook + cohort) triggered upon landing the CTO scale-up role.

My operational guardrails

Annual reread of *Site Reliability Engineering* (Google), continuous tracking of OpenTelemetry release notes and Bret Mullinix's FinOps writing. Weekly intake of the Cloudflare blog, AWS Architecture, and GitHub Engineering. Active homelab on Docker + Tailscale to experiment with new tooling without production risk.
