Flo Health LTU, UAB
SITE RELIABILITY ENGINEER - MACHINE LEARNING

Additional information

The Site Reliability Engineer is responsible for keeping all our production ML infrastructure running smoothly. In this role, you establish Service Level Objectives (SLOs) that define what reliability looks like in our production machine learning environments, and engineer observable solutions so that SLOs are measured by Service Level Indicators (SLIs).

Responsibilities:

  • Establish SLOs and SLIs for data and machine learning pipelines in collaboration with data and ML engineers.
  • Design and implement observability for the production ML systems, making them easy to instrument, monitor, alert on and troubleshoot, and making incidents easy to resolve.
  • Monitor model prediction performance.
  • Maximize the utilisation of CPU/GPU clusters across data science teams.
  • Improve and advance DataOps and MLOps infrastructure and operational processes.

Basic Qualifications:

  • Deep understanding of SRE fundamentals.
  • 4+ years of industry experience in applied ML.
  • Experience in designing, building and operating distributed systems at scale.
  • Strong proficiency with Python, Scala and/or Go.
  • Experience with AWS or other cloud providers.
  • Experience with Kubernetes.
  • Experience with ML Inference servers such as Triton, KFServing, or Seldon Core.
  • Experience with observability tools such as Prometheus, Thanos, Cortex or Sensu.

Preferred Qualifications:

  • Experience with infrastructure-as-code tools such as Terraform, CloudFormation, Ansible, Puppet or Chef.
  • Experience at a tier 1 product company, or related experience working within a product organization.

Monthly gross salary: € 5000

Location

    Vilnius, Vilnius County, Lithuania

Working hours

  • Full-time

Skills

 Python, AWS, Kubernetes, MLflow, Triton, Seldon Core, KFServing

Languages

  •  English

Contact person
Aliaksandra Ozal-Varabyova
+375298827527

Flo has a world-changing mission: to improve the health and wellbeing of every girl and woman worldwide. Our mission fuels our everyday work; we’re proud of the impact we’re making.

Chosen by 170 million people in 100+ countries, with over 38 million monthly active users, Flo topped the list of the most downloaded apps worldwide in the App Store's Health & Fitness category in 2020, according to App Annie and Sensor Tower.

At Flo, we constantly search for opportunities to provide unique value to our users. Driven by gut feeling and expertise, we push the limits of user value creation. Our innovation is fueled by our willingness to step into the unknown. We know that to grow value for our users, we have to grow ourselves. Through challenging goals, experimentation, radical candor, and empowering management, your journey at Flo means continuous improvement.

You’ll work as part of an international team of product builders, engineers, data scientists, growth hackers, user researchers, and medical experts. We expect each new teammate to raise the bar and rise with us to face challenges together.



Company website: https://flo.health/