The Non-Wizard’s Guide to Supercharging Data Pipelines
Workshop
Cloud/Data Integration
Data Engineering
Data pipeline
Open Source
25/09/2023 | 12:30 - 13:00 | Room 5
Description
As data increasingly becomes the lifeblood of businesses and organizations, optimizing data pipelines is becoming more and more crucial for engineering organizations. How does one know where to start?
We walk through our journey of pushing the performance boundaries of Airbyte’s pipelines to achieve a 4x speedup.
We debunk the myth that performance optimization is solely the realm of engineering wizards, concocting magical algorithms and techniques behind closed doors. Instead, we showcase how understanding the system as a whole and employing iterative strategies can incrementally unlock significant performance gains with minimal engineering complexity.
We illustrate some hidden pitfalls we stumbled into around pipes and backpressure, and identify some practical lessons and techniques we hope all devs can benefit from.