Role Description :
The engineer will be responsible for the design and development of optimization tools for neural networks, transformers, and large language models (LLMs). This role involves applying post-training, training-aware, and other advanced optimization techniques to enhance model efficiency and performance.
Key responsibilities :
- Develop optimization toolchains for computer vision models and large language models (LLMs).
- Perform hardware-aware model optimization and porting for Ambarella platforms.
- Research and evaluate emerging technologies, including pruning, quantization, and fine-tuning techniques for convolutional neural networks (CNNs), transformers, and LLMs.
- Provide technical support and solutions to customers regarding model optimization and deployment.
Requirements :
- Education background : Master's degree or degree
- Minimum experience : At least one year of relevant work or academic experience
- Similar or other experiences :
  - Experience in model deployment would be a plus
  - Experience in model optimization, such as pruning and quantization, would be a plus
  - Experience in LLM fine-tuning would be a plus
  - Experience in model porting would be a plus
- Skills :
  - Background in machine-learning-based computer vision or LLMs
  - Familiarity with machine learning frameworks such as PyTorch, TensorFlow, or Hugging Face
  - Proficiency in Python and C/C++ programming