Company Overview: JR Software Solutions is looking for an experienced and innovative Azure Open AI and Gen AI Architect to lead the development of our AI capabilities. Join us in shaping cutting-edge AI infrastructure and collaborating with our team of researchers and engineers.
Responsibilities:
- AI Infrastructure Development: Lead initiatives in building large-scale distributed training clusters, deploying LLMs on GPU instances, supporting AI research, and enhancing decisioning systems in our public cloud infrastructure.
- MLOps and Azure Services Integration: Drive MLOps practices by leveraging Azure ML, Databricks ML, Cognitive Services, and other Azure-native tools to streamline AI development workflows, ensuring scalability, reliability, and efficiency.
- Collaboration and Implementation: Work closely with cloud and container infrastructure teams, alongside AI researchers, to design and implement advanced AI capabilities.
Project Examples:
- Deploy a thousand-node training cluster optimizing storage and networking in the public cloud.
- Design fault-tolerant infrastructure for large-scale training tasks using containers and checkpointing libraries.
- Develop run-time infrastructure for serving large ML models like LLMs and FMs in our public cloud.
- Create infrastructure for deploying search indexes and embeddings in vector databases, integrated with our existing capabilities.
Basic Qualifications:
- Bachelor's degree in Computer Science, Computer Engineering, or a technical field.
- Minimum of 8 years of experience designing and building data-intensive solutions using distributed computing.
- At least 4 years of experience with HPC systems, vector embeddings, or semantic search technologies.
- Minimum of 4 years of programming experience with Python, Go, Scala, or Java.
- At least 3 years of experience building, scaling, and optimizing training and inference systems for deep neural networks.
Preferred Qualifications:
- Master's or Doctoral degree in Computer Science, Computer Engineering, Electrical Engineering, Mathematics, or a similar field.
- Expertise in MLOps practices, leveraging tools like Azure ML, Databricks ML, and Cognitive Services for AI development.
- Proficiency in machine learning frameworks like TensorFlow, PyTorch, Lightning, or Mosaic ML.
- Ability to navigate ambiguous environments, iterate rapidly with researchers and engineers, and prioritize effectively in a fast-paced, tech-driven atmosphere.
- Experience deploying large neural network models in demanding production environments.
- Knowledge of building GPU clusters in the public cloud with tightly coupled storage and networking.
We are proud to be an equal opportunity and affirmative action employer, committed to providing equal employment opportunities to all applicants and employees regardless of race, color, religion, age, sex, sexual orientation, gender identity/expression, national origin, protected veteran status, disability status, or any other legally protected basis, in accordance with applicable law. We value diversity and strive to create a workplace where all individuals feel valued and respected.
For more information, or to apply now, please visit the website below. Please DO NOT email your resume to us, as we only accept applications through our website.
https://jrssinc.isolvedhire.com/jobs/1101020-373858.html