Overview:
Work within our client’s machine learning team to deploy and optimize models for applications such as low-latency speech recognition and large language models (LLMs). The initial focus will be on improving the training pipeline for our client’s speech recognition model on multi-GPU systems to boost performance and quality.
Responsibilities:
-Train and deploy state-of-the-art ML models.
-Apply optimization techniques (distillation, pruning, quantization).
-Enhance speech models with features such as diarization, multilingual support, and keyword boosting.
-Optimize models for low-latency inference on accelerators.
-Improve training workflows and GPU utilization.
-Use data augmentation to improve model performance.
-Stay updated on ML research to guide strategy.
Requirements:
-Master’s or PhD in a relevant field with strong ML foundations.
-Experience training ML models for production use.
-Proficiency in PyTorch or TensorFlow.
-Experience handling large (multi-terabyte) datasets.
-Familiarity with Linux, version control, and CI/CD systems.
-Knowledge of model compression (e.g., reduced precision).