
Flux - Supplier Data Synchronization
An internal ETL platform automating the daily synchronization of product data (stock, prices, catalogs) from 30+ European suppliers via FTP, HTTP and REST APIs into the European Sourcing marketplace.
Integrated Suppliers
30+
8+ European countries
Connector Classes
37
Strategy pattern (Lib/*.php)
Data Formats
7
CSV, XML, JSON, XLS, TSV, TXT, GZ
Database Size
~15 GB
Full SQL dump (March 2019)
Presentation
The nerve center of European supplier data integration
Flux (v1) and FluxV2 are internal web applications developed for European Sourcing, a company specialized in sourcing promotional and advertising products across Europe. These applications form the backbone of product data synchronization between suppliers and the European Sourcing platform.
The platform operates in the B2B e-commerce / promotional products marketplace domain. European Sourcing acts as a catalog aggregator for promotional product suppliers (goodies, textiles, accessories, office supplies, etc.) targeting European resellers. The company collects, normalizes and redistributes product data (stock levels, pricing, docs, technical specifications) from dozens of suppliers to its online platform, serving approximately 60 reseller websites.
Multi-protocol data retrieval
FTP, HTTP, REST API with varied authentication (API keys, login/password, tokens, hash)
Heterogeneous data normalization
30+ proprietary formats transformed into a unified internal schema
Near real-time stock & price updates
Daily automated synchronization via cron with degressive pricing grids
Automatic search re-indexing
Triggers Elasticsearch sync after each supplier update
Detailed reporting & traceability
Timestamped execution reports with downloadable logs, CSV files and ZIP archives
Email failure alerts
Automatic email notification to the team on synchronization failures
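The feature list above mentions retrieval over FTP, HTTP and REST with varied authentication schemes. As a hypothetical Python sketch (the actual connectors are PHP classes under Lib/; the names `FeedSource` and `build_request_headers` are illustrative, not from the codebase), mapping each supplier's auth type onto request parameters might look like:

```python
from dataclasses import dataclass, field

# Illustrative only: each real connector carries its supplier's
# protocol, endpoint and credentials (hardcoded in v1, configurable in v2).
@dataclass
class FeedSource:
    protocol: str            # "ftp", "http" or "rest"
    url: str
    auth_type: str           # "api_key", "token" or "login"
    credentials: dict = field(default_factory=dict)

def build_request_headers(source: FeedSource) -> dict:
    """Translate the supplier's auth scheme into request headers/params."""
    if source.auth_type == "api_key":
        return {"X-Api-Key": source.credentials["key"]}
    if source.auth_type == "token":
        return {"Authorization": f"Bearer {source.credentials['token']}"}
    if source.auth_type == "login":
        # FTP/basic-style login: credentials travel with the connection
        return {"user": source.credentials["user"],
                "password": source.credentials["password"]}
    raise ValueError(f"unsupported auth type: {source.auth_type}")

src = FeedSource("rest", "https://api.example-supplier.test/stock",
                 "token", {"token": "abc123"})
print(build_request_headers(src))  # {'Authorization': 'Bearer abc123'}
```

Centralizing this mapping is what makes a new supplier mostly a matter of configuration rather than new transport code.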
Objectives, Context, Stakes & Risks
Understanding the strategic vision behind the data pipeline
- Automate 100% of supplier stock and price synchronization (daily cron execution)
- Guarantee data freshness: stocks and prices updated daily for each active supplier
- Normalize 30+ heterogeneous proprietary formats into a unified internal schema
- Ensure traceability: each execution generates a timestamped report with detailed logs
- Enable fine-grained control: per-supplier and per-data-type activation
- Part of a 20+ application ecosystem under *.europeansourcing.com (extranet, API, search engine, export, statistics, translation, etc.)
- Development under the medialeads GitHub organization (~8 developers) based in Bordeaux, France
- Shared MySQL master/slave database (~15 GB) with all other platform projects
- Multiple environments: developer (local), staging (es-recette.com), production (OVH)
- Major commercial impact: outdated stock or prices generate unfulfillable orders, directly impacting credibility and revenue
- International coverage: suppliers across 8+ European countries with country-specific variations
- Protocol multiplicity: FTP, HTTP, REST API with varied authentication methods
Supplier format dependency
Any format change on the supplier side requires a code adaptation, with a risk of silent breakage
SQL Injection vulnerability
Settings POST handler builds SQL clauses directly from user data without sanitization
Hardcoded credentials
FTP credentials and API keys are hardcoded in PHP classes (improved in v2)
No automated testing
The only test file is an untouched Symfony boilerplate (~0% coverage)
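The SQL-injection risk above comes from concatenating POSTed values straight into a WHERE clause. A minimal Python/sqlite3 sketch (hypothetical table and values; the real handler is PHP) contrasts the vulnerable pattern with a parameterized query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE settings (supplier TEXT, sync_stock INTEGER)")
conn.execute("INSERT INTO settings VALUES ('anda', 1)")

supplier = "anda' OR '1'='1"  # hostile POST value

# Vulnerable pattern (as in the v1 settings handler): user data is
# concatenated directly into the SQL clause.
unsafe_rows = conn.execute(
    f"SELECT * FROM settings WHERE supplier = '{supplier}'"
).fetchall()
# The injected OR '1'='1' matches every row -> [('anda', 1)]

# Safe pattern: a parameterized query; the driver escapes the value.
safe_rows = conn.execute(
    "SELECT * FROM settings WHERE supplier = ?", (supplier,)
).fetchall()
print(safe_rows)  # [] -- the hostile string matches no real supplier
```

In the PHP codebase the equivalent fix would be PDO prepared statements with bound parameters.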
The Steps - What I Did
A concrete, phase-by-phase journey through the build
- Developed 30+ supplier connectors individually (Lib/*.php classes)
- Built the complete ETL pipeline: download → parsing → matching → DB update → report generation
- Implemented web interface with dashboard, per-supplier configuration, report viewing & file download
- Created CLI execution via Symfony command (php app/console lanceramaj)
- My contributions: BIC France connector, PF Concept v2 connector with fixes, Midocean connector (new FTP), initial commit & SVN import
- Modernized interface: migrated from Bootstrap 3 to Bootstrap 4
- Improved architecture: AppBundle\Flow\ namespace replacing the legacy bundle
- Externalized configuration: URL and authentication type configurable via web interface (vs hardcoded in v1)
- Added real-time "Status" column to the dashboard for execution monitoring
- Developer notification system on any flow modification
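The pipeline described above (download → parsing → matching → DB update → report generation) follows a Base/child pattern in which the base class owns the fixed stages and each supplier class implements only the format-specific transform, `transformerEnProduits()` in the PHP code. A hypothetical Python sketch of the same shape (class and method names other than `transformerEnProduits` are illustrative):

```python
from abc import ABC, abstractmethod

class BaseConnector(ABC):
    """Owns the fixed sync pipeline; subclasses supply the parsing step."""

    def run(self, raw_feed: str) -> dict:
        products = self.transformer_en_produits(raw_feed)  # supplier-specific
        updated = self.update_database(products)           # shared logic
        return self.build_report(updated)                  # shared logic

    @abstractmethod
    def transformer_en_produits(self, raw: str) -> list:
        """Parse the supplier's proprietary format into unified products."""

    def update_database(self, products: list) -> int:
        return len(products)  # stand-in for the shared bulk UPDATE step

    def build_report(self, updated: int) -> dict:
        return {"updated": updated, "status": "ok"}

class CsvSupplier(BaseConnector):
    """Example child: parses a 'ref;stock;price' line per product."""

    def transformer_en_produits(self, raw: str) -> list:
        return [dict(zip(("ref", "stock", "price"), line.split(";")))
                for line in raw.strip().splitlines()]

report = CsvSupplier().run("A100;12;3.50\nA200;0;4.10")
print(report)  # {'updated': 2, 'status': 'ok'}
```

Adding a supplier then means writing one parsing method while inheriting download, matching, update and reporting for free.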
The Actors - Interactions
A small, focused team within a larger ecosystem
The Flux project was primarily developed by 2 developers within the medialeads organization (~8 people total). Thomas C. was the lead developer, implementing the majority of supplier connectors (28 commits, 65% of the codebase). I contributed 15 commits (35%) including the initial commit, SVN migration, and key connectors for BIC France, PF Concept and Midocean.
Marina Lalague
Catalog Manager / Product Owner. Business decisions on data sync priorities (stock-only vs full sync)
Jennifer
Data/Catalog Manager. Alert recipient and API credential manager for suppliers
The project interacted directly with 30+ European supplier APIs and data feeds - each with their own protocols, formats, authentication methods and technical contacts. This required constant adaptation and communication with external technical teams.
Results
Measurable impact for the business and personal growth
- Daily automated synchronization guarantees up-to-date product data for resellers on europeansourcing.com → improved conversion, reduced order errors
- 6,120 product variants updated in a single 45-second execution (Anda supplier, August 2019)
- ~100 daily executions across all suppliers, ensuring continuous data freshness
- Massive catalog coverage: database of ~15 GB serving 60+ reseller websites
- Pan-European geographic reach: suppliers from 8+ countries integrated seamlessly
- Deep expertise in heterogeneous data integration (ETL) across multiple protocols and formats
- Mastery of the Strategy design pattern applied to real-world supplier connector architecture
- Understanding of B2B e-commerce catalog management at European scale
- Experience with MySQL master/slave architecture and large-scale data operations
- Practical knowledge of Symfony 2 CLI commands, services and event system
Project Aftermath
Beyond delivery: lifecycle and evolution
- FluxV2 was operational and functional in August 2019, with logs showing successful daily executions
- The system ran continuously in production for at least 3 years, processing hundreds of daily synchronizations
- The evolution from v1 to v2 demonstrated the team's ability to iterate and improve: externalized configuration, modernized UI, better monitoring
- Complete backups were archived in August 2019 (code, database dumps, screenshots), suggesting a transition or decommissioning period
- The project is now archived; European Sourcing has likely evolved its data synchronization infrastructure since then
Critical Reflection
Honest retrospective on strengths, weaknesses and lessons learned
- Impressive functional coverage: 30+ suppliers integrated with very diverse protocols and formats, each with specific parsing requirements - demonstrating deep understanding of each supplier's data
- Robust pipeline: the Base/child pattern allows adding a new supplier by only implementing transformerEnProduits(), while inheriting the entire pipeline
- Exemplary traceability: each execution produces a timestamped report with detailed logs and result files, enabling precise diagnosis
- Continuous evolution: creating FluxV2 shows capacity to question and improve the existing system (modernized UI, externalized configuration)
- Pragmatic architecture: raw SQL over Doctrine ORM was justified by the nature of operations (bulk updates, batch queries)
- Security: SQL injection in the settings POST handler, hardcoded FTP/API credentials, basic shared session authentication
- No tests: zero unit or integration tests - the only test file is an untouched Symfony boilerplate
- No CI/CD: probably manual deployment via git pull on the server, no continuous integration pipeline
- Tight coupling: connector classes contain download, parsing, matching and DB update logic mixed together
- Missing documentation: the doc/index.rst is an empty placeholder, no custom README
- Heterogeneous data integration is a normalization problem, not a coding problem - understanding each supplier's format is the real work before writing any code
- The Strategy pattern is ideal for variations of the same process - Base defines the "what" (sync pipeline), each child defines the "how" (specific parsing)
- Traceability is not a luxury - in an automated daily system, logs and reports are indispensable for rapid problem diagnosis
- Externalizing URLs and authentication from code to a configurable interface is a major maintainability win (v2 improvement)
Related journey
Professional experience linked to this achievement
Skills applied
Technical and soft skills applied
Image gallery
Project screenshots and visuals