Staff Engineer, Machine Learning Operations job opportunity at NextGen Healthcare.



NextGen Healthcare Staff Engineer, Machine Learning Operations
Experience: 8 years
Pattern: full-time

Machine Learning Operations

Degree: General
Location: Work From Anywhere-Bangalore, India

Job Description: The Staff Engineer, Machine Learning Operations will provide technical leadership for our AI platform and define the architecture and standards for training, evaluation, and high-scale, low-latency inference of models and AI agents. This role is responsible for developing and implementing the strategy for CI/CD, governance, and reliability across multiple AI models, partnering with security, compliance, and leadership to deliver resilient, cost-effective AI. Aside from the core responsibilities, Machine Learning Operations Engineers will also have responsibilities shared with other engineering functions.

- Establish the technical vision for end-to-end ML-AIOps (from data to model/agent to product integration).
- Design and evolve multi-region, multi-tenant inference/training platforms with strong isolation.
- Design and implement CI/CD strategy for models/agents/data pipelines (policy gates, canary/rollbacks, approvals).
- Institutionalize model/agent monitoring (quality, safety, drift) and business KPIs; sponsor continuous evaluations.
- Lead major reliability programs (capacity planning, disaster recovery, chaos testing, incident management).
- Establish and implement governance methodologies for datasets, prompts, models, and agents (lineage, approvals, etc.).
- Collaborate on security architecture with security teams (zero-trust, key management, vaults, secrets rotation, audit).
- Evaluate and integrate platforms/vendors; influence build-vs-buy; manage technical debt and roadmap.
- Mentor/prioritize other engineers; build a culture of documentation, runbooks, and post-incident learning.
- Perform other duties that support the overall objective of the position.

Education Required: Bachelor’s degree in Computer Science, Information Technology, Electronics/Electrical Engineering, or a related field, or any combination of education and experience that would provide the required qualifications for the position.
Experience Required:
- 5-8 years of hands-on experience in MLOps, DevOps, or related roles involving operation of an AI/ML platform at scale, with 10-12+ years of overall IT experience.
- IaC with Terraform at an organizational scale and strong experience in Unix-based environments.
- Expert with containerization and orchestration (Docker/Kubernetes) and cloud, including networking, security, and autoscaling. Strong AWS experience is expected.
- Experience in building CI/CD pipelines using tools like Bitbucket Pipelines, AWS CodePipeline, or similar.
- Experience with mature observability stacks (e.g. Datadog/Dynatrace). Experience with LLM observability frameworks is a plus.
- Deep experience with operationalizing ML/AI models. Experience with LLMs or AI agents is a plus.

Knowledge, Skills & Abilities:

Knowledge of:
- Familiarity with database technologies and data pipelines (Data Lakes, Lakehouse, Warehouse, NoSQL, ETL/ELT processes).
- Solid understanding of model monitoring, logging, and debugging tools.
- Strong command of platform SRE practices and cost governance.
- Familiarity with feature stores, lakehouse patterns, distributed computing systems (Spark), and model versioning systems (MLflow).

Skill in:
- Strong problem-solving skills and a detail-oriented mindset.
- Excellent communication skills.

Ability to:
- Excellent collaboration ability.
- Ability to have a clear view of complete systems and to understand and work on different components as and when required.

The company has reviewed this job description to ensure that essential functions and basic duties have been included. It is intended to provide guidelines for job expectations and the employee's ability to perform the position described. It is not intended to be construed as an exhaustive list of all functions, responsibilities, skills and abilities. Additional functions and requirements may be assigned by supervisors as deemed appropriate.
This document does not represent a contract of employment, and the company reserves the right to change this job description and/or assign tasks for the employee to perform, as the company may deem appropriate. NextGen Healthcare is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees.
