International Conference on Machine Learning, Artificial Intelligence and Data Science


Aladin Djuhera


Biography

Aladin Djuhera received his B.Sc., M.Sc., and Ph.D. degrees from the Technical University of Munich (TUM). His research focuses on developing practical solutions for the scalable, efficient, and safe deployment of AI models, particularly in edge computing. He has worked on federated learning, distributed inference, and AI workload orchestration for large language models (LLMs). His current interests include LLM post-training, alignment, and reasoning.

Research Interest

Abstract

Advances in LLM Post-Training: The Role of High-Quality Data Curation Recipes

Post-training is central to aligning large language models (LLMs) with human preferences, spanning supervised fine-tuning (SFT) and preference optimization. Recent studies show that beyond algorithms, the composition and quality of datasets determine how well models generalize across reasoning, math, coding, and instruction-following tasks. In this talk, I will highlight systematic approaches to dataset analysis and curation, including quality annotation frameworks, reward-based filtering, and task-aware balancing. I will discuss how principled curation recipes enabled us to produce leaner data mixtures that outperform state-of-the-art open mixtures while reducing compute cost. The talk concludes with open challenges and opportunities for transparent, reproducible, and data-centric LLM alignment.