---
title: "Reverse Engineering & Algorithms - José DA COSTA"
description: "Reverse engineering and algorithms is, in my profile, **the high-leverage card to play on pointed technical challenges**: custom search engines, combinatorial optimisations, binary audits, undocumente"
locale: "en"
canonical: "https://portfolio.josedacosta.info/en/skills/reverse-engineering-algorithms"
source: "https://portfolio.josedacosta.info/en/skills/reverse-engineering-algorithms.md"
html_source: "https://portfolio.josedacosta.info/en/skills/reverse-engineering-algorithms"
author: "José DA COSTA"
type: "skill"
slug: "reverse-engineering-algorithms"
generated_at: "2026-04-26T21:12:15.885Z"
---

# Reverse Engineering & Algorithms

Icon: 🔬

## My definition

Reverse engineering and algorithms is, in my profile, **the high-leverage card to play on pointed technical challenges**: custom search engines, combinatorial optimisations, binary audits, undocumented APIs, and alternatives when a vendor closes its API. It is not a daily skill; it is a **force multiplier** when the team is stuck and no off-the-shelf solution fits.

### Context

I activate it on **two specific triggers**. **Customer requirement without an off-the-shelf solution**: SOL's combinatorial pricing engine (15,000 variations), multilingual TF-IDF inverted index for B2B (7 languages), Babel AST + PostCSS static analysis for Tailwind v4. **Vendor lock-in to break**: rebuilding an endpoint to break free from an API that is closing or starting to charge. My theoretical roots come from the **Master ESIEA Expert in Software Engineering** (CLRS, complexity, data structures); my production exposure comes from **Celiane** (reverse engineering of the Google, MSN and Voilà.fr ranking algorithms, TOP 10 Carpediem affiliates) and **European Sourcing** (B2B catalogue search engine).

### Relevance

In 2026, reverse engineering is regaining value through the combined effect of **three trends**: closed APIs that change their T&Cs or pricing, opaque AI models that get deprecated (DALL-E 3 retired in May 2026), and **agentic vendor lock-in** becoming a quantified operational risk - **89% of enterprises** went multi-cloud precisely for this reason. Kai Waehner maps the topic in [Enterprise Agentic AI Landscape 2026: Trust, Flexibility, and Vendor Lock-in](https://www.kai-waehner.de/blog/2026/04/06/enterprise-agentic-ai-landscape-2026-trust-flexibility-and-vendor-lock-in/). A team that can reverse-engineer an API when the vendor closes the door gains a head start of several months over its competitors.

## My evidence

### Computing SOL's combinatorial pricing in real time

**Context:** On the European Sourcing extranet, some suppliers such as **SOL's** reached a level of product complexity I had never seen before: **up to 15,000 variations** for a single product (a t-shirt available in multiple sizes and colors, V-neck or round neck, with or without sleeves, various finishings...), **more than 50 tiered (volume-discount) pricing grids per product**, **32 currencies** indexed against the ECB, **36 fields per marking option**. At the time, no PIM on the market could compute that pricing in real time.

**Action:** I designed an in-house **combinatorial pricing engine** capable of producing the price of a given variant on the fly, from a rule matrix and a chain of surcharges (size × color × marking × quantity × country × currency). I chose **pre-computed indexed data structures** built in the background rather than a brute-force calculation per request, added a **Memcache caching layer** to absorb catalogue spikes, and normalised combinatorial attributes so that the same rule could apply equally to a SOL's t-shirt or a BIC pen. Keeping algorithmic complexity under control was the real differentiator.
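
As an illustration only (this is not the production code), the "pre-compute in the background, look up per request" idea can be sketched in Python. Every name here (`PriceGrid`, `build_index`, `price`) and every figure in the usage note is hypothetical, and the sketch assumes additive surcharges, a single tiered grid and a flat exchange-rate table, which is far simpler than the real rule matrix.

```python
from bisect import bisect_right
from dataclasses import dataclass
from decimal import Decimal
from itertools import product


@dataclass(frozen=True)
class PriceGrid:
    """Tiered grid: ascending minimum quantities, one unit price per tier."""
    min_quantities: tuple
    unit_prices: tuple

    def unit_price(self, quantity: int) -> Decimal:
        # Binary search for the last tier whose threshold is <= quantity.
        tier = bisect_right(self.min_quantities, quantity) - 1
        if tier < 0:
            raise ValueError("quantity is below the minimum order quantity")
        return self.unit_prices[tier]


def build_index(attributes: dict, surcharges: dict) -> dict:
    """Background step: pre-compute the summed surcharge of every variant.

    attributes: {"size": ["M", "XXL"], ...}
    surcharges: {("size", "XXL"): Decimal("0.50"), ...}; missing keys cost 0.
    """
    names = list(attributes)
    index = {}
    # Cartesian product of all attribute values = every sellable variant.
    for values in product(*(attributes[name] for name in names)):
        total = Decimal("0")
        for name, value in zip(names, values):
            total += surcharges.get((name, value), Decimal("0"))
        index[values] = total
    return index


def price(index, grid, rates, variant, quantity, currency) -> Decimal:
    """Request step: one dict lookup plus one binary search, no recomputation."""
    unit_eur = grid.unit_price(quantity) + index[variant]
    return (unit_eur * quantity * rates[currency]).quantize(Decimal("0.01"))
```

With a three-tier grid and two surcharged attribute values, a call such as `price(index, grid, rates, ("XXL", "navy", "embroidery"), 100, "GBP")` reduces to a dictionary lookup and a binary search. The cost of enumerating the Cartesian product is paid once, when the index is built.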

**Result:** Extranet catalogue served in **real time across 7 European languages** (FR, EN, DE, ES, IT, NL, PT), acceptable latency on the heaviest suppliers, and the same mechanism held up **for more than 5 years** in production without an algorithmic rewrite.

**Value added:** This is the raw algorithmic discipline I forged in my early Celiane days (Google reverse engineering) and at the Master ESIEA (CLRS, complexity). It remains rare on the 2026 market, where most CTOs no longer dive into data structures. That is exactly the card I want to be able to play as a scale-up CTO when a domain falls outside the usual cases.

### Building an in-house TF-IDF search engine for the European B2B catalogue

**Context:** European Sourcing was the **B2B search engine** for promotional products at European scale - a precursor of today's marketplaces, **multi-vendor** (Midocean, PF Concept, BIC, SOL's, TopTex, Makito...), **multilingual** (7 languages), with **up to 15,000 variations** per product. No SaaS solution tested at the time could handle a B2B catalogue this atypical - engines like Algolia or stock Elasticsearch were calibrated for consumer-grade B2C e-commerce.

**Action:** I built an in-house **weighted TF-IDF inverted index** tuned to the domain. On the analysis side, it provided **multilingual scoring** across the 7 European languages, **normalisation of combinatorial attributes** (size, color, marking, MOQ, country), and B2B synonym disambiguation (promotional product, goodies, business gifts...). On the infrastructure side, it relied on a dedicated **caching layer** to absorb catalogue spikes and an **index rebuilt in the background** during supplier imports (26+ automated connectors). I later extended the engine with an **Elasticsearch** layer for facet aggregations and typo-tolerant matching, but the scoring algorithm stayed in-house.
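
For readers unfamiliar with the technique, here is a minimal Python sketch of a field-weighted TF-IDF inverted index. It is an assumption-laden illustration, not the engine described above: the field weights, the token-level synonym table and the `InvertedIndex` class are hypothetical, and per-language analysis (stemming, stop words) is left out entirely.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical tuning: a hit in the title counts three times a description hit.
# Weights are assumed to be >= 1 so that 1 + log(tf) stays positive.
FIELD_WEIGHTS = {"title": 3.0, "description": 1.0}
# Hypothetical synonym table mapping domain terms to one canonical token.
SYNONYMS = {"goodies": "giveaway", "gift": "giveaway", "gifts": "giveaway"}
TOKEN = re.compile(r"\w+")


def tokenize(text: str) -> list:
    """Lowercase, split on non-word characters, then fold synonyms."""
    return [SYNONYMS.get(tok, tok) for tok in TOKEN.findall(text.lower())]


class InvertedIndex:
    def __init__(self) -> None:
        # term -> {doc_id: field-weighted term frequency}
        self.postings = defaultdict(dict)
        self.doc_count = 0

    def add(self, doc_id: str, fields: dict) -> None:
        """Index one document; each doc_id is assumed to be added only once."""
        self.doc_count += 1
        weighted = Counter()
        for field, text in fields.items():
            for term in tokenize(text):
                weighted[term] += FIELD_WEIGHTS.get(field, 1.0)
        for term, tf in weighted.items():
            self.postings[term][doc_id] = tf

    def search(self, query: str) -> list:
        """Return (doc_id, score) pairs, best score first."""
        scores = defaultdict(float)
        for term in set(tokenize(query)):
            docs = self.postings.get(term)
            if not docs:
                continue
            # Rare terms weigh more: smoothed inverse document frequency.
            idf = math.log(1 + self.doc_count / len(docs))
            for doc_id, tf in docs.items():
                # Sub-linear TF damping so repeated terms do not dominate.
                scores[doc_id] += (1 + math.log(tf)) * idf
        return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
```

Because synonyms are folded at both index time and query time, a query for `goodies` in this sketch also matches a document that only says `gift`. A query only touches the postings of its own terms, which is what keeps lookups cheap as the catalogue grows.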

**Result:** **Search relevance superior** to the tested SaaS engines on the promotional-product domain, **sub-second latency** across the 7 languages, and the engine was **reused by downstream apps** (MyEasyWeb reseller sites, PhoneGap/Cordova mobile apps) without a rebuild.

**Value added:** This project taught me for good that **raw algorithmics remains a product differentiator** the moment we step outside the standard case. It is a niche but high-leverage competency in my profile: when a future ACCENSEO customer or a future scale-up faces an atypical search domain (regulated catalogue, demanding multilingual, combinatorial structures), I can ship an in-house engine instead of a SaaS that will not go the distance - and that gap counts in months of delay for those who cannot build it.

## My self-critique

### Mastery level

Level **Confirmed**, with theoretical roots at the Master ESIEA (algorithmics, complexity, data structures) and production exposure on three challenge types: **TF-IDF + inverted index** search engine for a 7-language B2B catalogue, real-time **combinatorial pricing engine** on 15,000 product variations, and more recently **Babel AST + PostCSS static analysis** for tailwindcss-obfuscator. What still needs strengthening: advanced binary reverse engineering (Ghidra / IDA Pro at scale) and competitive CTF.

### Importance in my profile

**Niche but high-leverage.** It is the card I play when no off-the-shelf solution fits: vendor lock-in, undocumented API, combinatorial constraint. The need is rare, but when it lands, the gap between a team that knows and one that does not counts in months of delay. For a scale-up CTO, it is also a force multiplier when a vendor tries to capture value (license change, API closure).

### Advice (for myself and others)

*Document every reverse-engineering session* with a decision journal (method tried, signal observed, hypothesis kept). Without documentation, the competency is lost between two uses. To others: do not confuse reverse engineering with tinkering, always validate the legality of the use case (licenses, T&C, copyright) before the session, and invest in algorithmic depth (CLRS, Skiena) rather than kata quantity.

## My evolution in this skill

### Role in my professional project

Reverse engineering and algorithmics are the **niche competency that secures ACCENSEO's technical autonomy** and any future scale-up CTO role. In the 24-month plan, they let me unblock a customer project in a domain without a standard solution (search, scoring, combinatorial optimization) or respond to a vendor license change by building an internal alternative. They are not a primary axis but a **decisive safety net**.

### Mid-term target level

Maintain the level and **ship at least one OSS or customer project per year** mobilizing this competency. Keep the CLRS / Skiena algorithmic reading current enough to step in on senior code reviews - that is the observable indicator, not a score.

### Current training

Algorithmic katas on rotation ([LeetCode](https://leetcode.com/) hard + Project Euler), practice on ACCENSEO codebases (tailwindcss-obfuscator static analysis). Master in Software Engineering active until 2026.

### Future training

Optional CTF (Capture The Flag) participation triggered by target context (security industry). Possible deep-dive on binary reverse engineering (advanced Ghidra, NoStarch or OpenSecurityTraining course) in 2027.

## Progression across journey

This skill was developed across 2 different journey items.

- **1999** - [CTO · Founder · technical director](https://portfolio.josedacosta.info/en/journey/celiane-founder.md) (entrepreneurship) - Confidence: 4/5
- **2023** - [Master Expert in Software Engineering](https://portfolio.josedacosta.info/en/journey/master-software-engineering.md) (education) - Confidence: 5/5

## Related achievements

- [European B2B Search Engine for Promotional Products (European Sourcing)](https://portfolio.josedacosta.info/en/achievements/moteur-de-recherche-europeen-b2b-objets-publicitaires.md) - Implemented combinatorial pricing algorithms (Cartesian product generating thousands of combinations per product: sizes × colors × quantities × marking types × marking zones × finishes) and search-engine indexing theory (inverted index, analyzers, TF-IDF scoring)
- [PIM Extranet for B2B Promotional Products Search Engine (European Sourcing)](https://portfolio.josedacosta.info/en/achievements/extranet-pim-b2b-objets-publicitaires.md) - Cartesian-product pricing engine: thousands of combinations per product (sizes × colors × markings × zones × finishings)

