
Flux - Supplier Data Synchronization
An internal ETL platform automating the daily synchronization of product data (stock, prices, catalogs) from 30+ European suppliers via FTP, HTTP and REST APIs into the European Sourcing marketplace.
Integrated Suppliers
30+
8+ European countries
Connector Classes
37
Strategy pattern (Lib/*.php)
Data Formats
7
CSV, XML, JSON, XLS, TSV, TXT, GZ
Database Size
~15 GB
Full SQL dump (March 2019)
Presentation
The nerve center of European supplier data integration
Flux (v1) and FluxV2 are internal web applications developed for European Sourcing, a company specialized in sourcing promotional and advertising products across Europe. These applications form the backbone of product data synchronization between suppliers and the European Sourcing platform.
The platform operates in the B2B e-commerce / promotional products marketplace domain. European Sourcing acts as a catalog aggregator for promotional product suppliers (goodies, textiles, accessories, office supplies, etc.) targeting European resellers. The company collects, normalizes and redistributes product data (stock levels, pricing, docs, technical specifications) from dozens of suppliers to its online platform, serving approximately 60 reseller websites.
Multi-protocol data retrieval
FTP, HTTP, REST API with varied authentication (API keys, login/password, tokens, hash)
Heterogeneous data normalization
30+ proprietary formats transformed into a unified internal schema
Near real-time stock & price updates
Daily automated synchronization via cron with degressive pricing grids
Automatic search re-indexing
Triggers Elasticsearch sync after each supplier update
Detailed reporting & traceability
Timestamped execution reports with downloadable logs, CSV files and ZIP archives
Email failure alerts
Automatic email notification to the team on synchronization failures
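The feature list above mentions retrieval over FTP, HTTP and REST with varied authentication schemes. As a hypothetical Python sketch (the actual connectors are PHP classes under Lib/; the names `FeedSource` and `build_request_headers` are illustrative, not from the codebase), mapping each supplier's auth type onto request parameters might look like:

```python
from dataclasses import dataclass, field

# Illustrative only: each real connector carries its supplier's
# protocol, endpoint and credentials (hardcoded in v1, configurable in v2).
@dataclass
class FeedSource:
    protocol: str            # "ftp", "http" or "rest"
    url: str
    auth_type: str           # "api_key", "token" or "login"
    credentials: dict = field(default_factory=dict)

def build_request_headers(source: FeedSource) -> dict:
    """Translate the supplier's auth scheme into request headers/params."""
    if source.auth_type == "api_key":
        return {"X-Api-Key": source.credentials["key"]}
    if source.auth_type == "token":
        return {"Authorization": f"Bearer {source.credentials['token']}"}
    if source.auth_type == "login":
        # FTP/basic-style login: credentials travel with the connection
        return {"user": source.credentials["user"],
                "password": source.credentials["password"]}
    raise ValueError(f"unsupported auth type: {source.auth_type}")

src = FeedSource("rest", "https://api.example-supplier.test/stock",
                 "token", {"token": "abc123"})
print(build_request_headers(src))  # {'Authorization': 'Bearer abc123'}
```

Centralizing this mapping is what makes a new supplier mostly a matter of configuration rather than new transport code.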
Objectives, Context, Stakes & Risks
Understanding the strategic vision behind the data pipeline
- Automate 100% of supplier stock and price synchronization (daily cron execution)
- Guarantee data freshness: stocks and prices updated daily for each active supplier
- Normalize 30+ heterogeneous proprietary formats into a unified internal schema
- Ensure traceability: each execution generates a timestamped report with detailed logs
- Enable fine-grained control: per-supplier and per-data-type activation
- Part of a 20+ application ecosystem under *.europeansourcing.com (extranet, API, search engine, export, statistics, translation, etc.)
- Development under the medialeads GitHub organization (~8 developers) based in Bordeaux, France
- Shared MySQL master/slave database (~15 GB) with all other platform projects
- Multiple environments: developer (local), staging (es-recette.com), production (OVH)
- Major commercial impact: outdated stock or prices generate unfulfillable orders, directly impacting credibility and revenue
- International coverage: suppliers across 8+ European countries with country-specific variations
- Protocol multiplicity: FTP, HTTP, REST API with varied authentication methods
Supplier format dependency
Any format change on the supplier side requires a code adaptation, with a risk of silent breakage
SQL Injection vulnerability
Settings POST handler builds SQL clauses directly from user data without sanitization
Hardcoded credentials
FTP credentials and API keys are hardcoded in PHP classes (improved in v2)
No automated testing
The only test file is an untouched Symfony boilerplate (~0% coverage)
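The SQL-injection risk above comes from concatenating POSTed values straight into a WHERE clause. A minimal Python/sqlite3 sketch (hypothetical table and values; the real handler is PHP) contrasts the vulnerable pattern with a parameterized query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (supplier TEXT, sync_stock INTEGER)")
conn.execute("INSERT INTO settings VALUES ('anda', 1)")

supplier = "anda' OR '1'='1"  # hostile POST value

# Vulnerable pattern (as in the v1 settings handler): user data is
# concatenated directly into the SQL clause.
unsafe_rows = conn.execute(
    f"SELECT * FROM settings WHERE supplier = '{supplier}'"
).fetchall()
# The injected OR '1'='1' matches every row -> [('anda', 1)]

# Safe pattern: a parameterized query; the driver escapes the value.
safe_rows = conn.execute(
    "SELECT * FROM settings WHERE supplier = ?", (supplier,)
).fetchall()
print(safe_rows)  # [] -- the hostile string matches no real supplier
```

In the PHP codebase the equivalent fix would be PDO prepared statements with bound parameters.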
The Steps - What I Did
A concrete, phase-by-phase journey through the build
- Developed 30+ supplier connectors individually (Lib/*.php classes)
- Built the complete ETL pipeline: download → parsing → matching → DB update → report generation
- Implemented web interface with dashboard, per-supplier configuration, report viewing & file download
- Created CLI execution via Symfony command (php app/console lanceramaj)
- My contributions: BIC France connector, PF Concept v2 connector with fixes, Midocean connector (new FTP), initial commit & SVN import
- Modernized interface: migrated from Bootstrap 3 to Bootstrap 4
- Improved architecture: AppBundle\Flow\ namespace replacing the legacy bundle
- Externalized configuration: URL and authentication type configurable via web interface (vs hardcoded in v1)
- Added real-time "Status" column to the dashboard for execution monitoring
- Developer notification system on any flow modification
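The pipeline described above (download → parsing → matching → DB update → report generation) follows a Base/child pattern in which the base class owns the fixed stages and each supplier class implements only the format-specific transform, `transformerEnProduits()` in the PHP code. A hypothetical Python sketch of the same shape (class and method names other than `transformerEnProduits` are illustrative):

```python
from abc import ABC, abstractmethod

class BaseConnector(ABC):
    """Owns the fixed sync pipeline; subclasses supply the parsing step."""

    def run(self, raw_feed: str) -> dict:
        products = self.transformer_en_produits(raw_feed)  # supplier-specific
        updated = self.update_database(products)           # shared logic
        return self.build_report(updated)                  # shared logic

    @abstractmethod
    def transformer_en_produits(self, raw: str) -> list:
        """Parse the supplier's proprietary format into unified products."""

    def update_database(self, products: list) -> int:
        return len(products)  # stand-in for the shared bulk UPDATE step

    def build_report(self, updated: int) -> dict:
        return {"updated": updated, "status": "ok"}

class CsvSupplier(BaseConnector):
    """Example child: parses a 'ref;stock;price' line per product."""

    def transformer_en_produits(self, raw: str) -> list:
        return [dict(zip(("ref", "stock", "price"), line.split(";")))
                for line in raw.strip().splitlines()]

report = CsvSupplier().run("A100;12;3.50\nA200;0;4.10")
print(report)  # {'updated': 2, 'status': 'ok'}
```

Adding a supplier then means writing one parsing method while inheriting download, matching, update and reporting for free.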
The Actors - Interactions
A small, focused team within a larger ecosystem
The Flux project was primarily developed by 2 developers within the medialeads organization (~8 people total). Thomas C. was the lead developer, implementing the majority of supplier connectors (28 commits, 65% of the codebase). I contributed 15 commits (35%) including the initial commit, SVN migration, and key connectors for BIC France, PF Concept and Midocean.
Marina Lalague
Catalog Manager / Product Owner. Business decisions on data sync priorities (stock-only vs full sync)
Jennifer
Data/Catalog Manager. Alert recipient and API credential manager for suppliers
The project interacted directly with 30+ European supplier APIs and data feeds - each with their own protocols, formats, authentication methods and technical contacts. This required constant adaptation and communication with external technical teams.
Results
Measurable impact for the business and personal growth
- Daily automated synchronization guarantees up-to-date product data for resellers on europeansourcing.com → improved conversion, reduced order errors
- 6,120 product variants updated in a single 45-second execution (Anda supplier, August 2019)
- ~100 daily executions across all suppliers, ensuring continuous data freshness
- Massive catalog coverage: database of ~15 GB serving 60+ reseller websites
- Pan-European geographic reach: suppliers from 8+ countries integrated seamlessly
- Deep expertise in heterogeneous data integration (ETL) across multiple protocols and formats
- Mastery of the Strategy design pattern applied to real-world supplier connector architecture
- Understanding of B2B e-commerce catalog management at European scale
- Experience with MySQL master/slave architecture and large-scale data operations
- Practical knowledge of Symfony 2 CLI commands, services and event system
Project Aftermath
Beyond delivery: lifecycle and evolution
- FluxV2 was operational and functional in August 2019, with logs showing successful daily executions
- The system ran continuously in production for at least 3 years, processing hundreds of daily synchronizations
- The evolution from v1 to v2 demonstrated the team's ability to iterate and improve: externalized configuration, modernized UI, better monitoring
- Complete backups were archived in August 2019 (code, database dumps, screenshots), suggesting a transition or decommissioning period
- The project is now archived; European Sourcing has likely evolved its data synchronization infrastructure since then
Critical Reflection
Honest retrospective on strengths, weaknesses and lessons learned
- Impressive functional coverage: 30+ suppliers integrated with very diverse protocols and formats, each with specific parsing requirements - demonstrating deep understanding of each supplier's data
- Robust pipeline: the Base/child pattern allows adding a new supplier by only implementing transformerEnProduits(), while inheriting the entire pipeline
- Exemplary traceability: each execution produces a timestamped report with detailed logs and result files, enabling precise diagnosis
- Continuous evolution: creating FluxV2 shows capacity to question and improve the existing system (modernized UI, externalized configuration)
- Pragmatic architecture: raw SQL over Doctrine ORM was justified by the nature of operations (bulk updates, batch queries)
- Security: SQL injection in the settings POST handler, hardcoded FTP/API credentials, basic shared session authentication
- No tests: zero unit or integration tests - the only test file is an untouched Symfony boilerplate
- No CI/CD: probably manual deployment via git pull on the server, no continuous integration pipeline
- Tight coupling: connector classes contain download, parsing, matching and DB update logic mixed together
- Missing documentation: the doc/index.rst is an empty placeholder, no custom README
- Heterogeneous data integration is a normalization problem, not a coding problem - understanding each supplier's format is the real work before writing any code
- The Strategy pattern is ideal for variations of the same process - Base defines the "what" (sync pipeline), each child defines the "how" (specific parsing)
- Traceability is not a luxury - in an automated daily system, logs and reports are indispensable for rapid problem diagnosis
- Externalizing URLs and authentication from code to a configurable interface is a major maintainability win (v2 improvement)
Related journey
Professional experience linked to this achievement
Skills applied
Technical and soft skills applied
Image gallery
Project screenshots and visuals