AI and Digital Science Research Centre
The Artificial Intelligence Cross-Center Unit is the machine learning powerhouse of TII. We work in close collaboration with our other research centers to harness the full benefits of AI across our projects and to drive innovation from new computing paradigms. We design and deliver new AI methodologies, technologies, solutions, and systems that address challenging issues across multiple sectors of the economy, from technology to healthcare, cybersecurity, and government, among others.
We incorporate core elements of intelligence (perception, sensing, planning, and language) in the ideation, design, and prototyping of next-generation systems with human-like intelligence. We build advanced AI computing and scalable AI-based software stacks and hardware systems to deliver significant enhancements in systems infrastructure. Our AI researchers, scientists, and engineers collaborate to ensure innovative outcomes, from AI theory through to AI technologies, in pursuit of better intelligence.
Job description
We are looking for a highly skilled and experienced Senior Engineer. In this role, you will be responsible for architecting, designing, and implementing state-of-the-art training pipelines for our cutting-edge models. You will work closely with our research and engineering teams to develop and train models at scale, ensuring our organization stays ahead in the field.
- Model Training: Lead and manage the end-to-end process of training large-scale deep learning models, from data collection and preprocessing to model development and optimization.
- Architect and Design Training Pipelines: Design efficient and scalable training pipelines, incorporating best practices, distributed computing techniques, and the latest model training methodologies.
- Research and Innovation: Stay current with the latest developments in model training and implement innovative techniques to enhance model performance and efficiency.
- Performance Optimization: Identify and resolve performance bottlenecks in model training processes, optimizing for speed and resource utilization.
- Documentation: Maintain thorough documentation of model training processes, making it accessible to the broader team.
- Mentoring and Knowledge Sharing: Mentor junior engineers and actively participate in knowledge-sharing initiatives to foster a culture of learning and growth.
Skills Required
- Proven experience in training large-scale machine learning and deep learning models
- Strong proficiency in popular deep learning frameworks such as TensorFlow, PyTorch, or similar
- Solid understanding of distributed computing, GPU acceleration, and parallel processing
- Familiarity with cloud computing platforms (e.g., AWS, GCP, Azure) and containerization technologies (e.g., Docker, Kubernetes)
- Excellent problem-solving skills and the ability to troubleshoot complex issues in model training pipelines
- Strong communication and teamwork skills
- Experience with productionizing machine learning models is a plus
Qualifications Required
To qualify for this position, you will need to meet the following requirements:
- Bachelor's, Master's, or Ph.D. in Computer Science, Machine Learning, or a related field