Designing Scalable Data Engineering Pipelines for Machine Learning in Cloud-Native Ecosystems
Abstract
Modern machine learning systems depend on robust, scalable data engineering pipelines that can handle high-volume, high-velocity, and heterogeneous data sources. This research explores architectural patterns and engineering best practices for building cloud-native data pipelines that support machine learning workloads at scale. The study evaluates distributed ingestion frameworks, schema evolution strategies, data validation layers, and orchestration mechanisms in real-world enterprise scenarios. Experimental results show that adopting modular, event-driven architectures improves pipeline reliability, data freshness, and downstream model performance.
Cite this article
(2022). Designing Scalable Data Engineering Pipelines for Machine Learning in Cloud-Native Ecosystems. Research Explorations in Global Knowledge & Technology (REGKT), 1(1). Retrieved from https://regkt.com/article.php?id=746&slug=designing-scalable-data-engineering-pipelines-cloud-native-ml