International Conference on Machine Learning, Artificial Intelligence and Data Science

Sessions

Data Engineering and Big Data Infrastructure

This session focuses on the architecture and technologies required to support large-scale AI and ML systems. Topics include data pipelines, distributed data processing and event streaming frameworks (e.g., Apache Spark, Apache Kafka), the trade-offs between data lakes and data warehouses, and data versioning. Participants will gain a practical understanding of how data engineers prepare, store, and manage massive volumes of data, including data that arrives in real time. The session will also address emerging trends in cloud-based platforms (AWS, GCP, Azure) and in containerized environments orchestrated with Kubernetes for scalable deployment.