Data Versioning Strategies for Reproducible Machine Learning Experiments
Abstract
Reproducibility is a major challenge in machine learning research and production systems. This paper evaluates data versioning strategies that enable consistent experimentation across distributed teams. The findings highlight best practices for integrating version control into large-scale data engineering workflows.
Cite this article
(2024). Data Versioning Strategies for Reproducible Machine Learning Experiments. Research Explorations in Global Knowledge & Technology (REGKT), 3 (3). Retrieved from https://regkt.com/article.php?id=752&slug=data-versioning-strategies-reproducible-ml-experiments