Staff Backend Engineer - Adaptive Telemetry
100000GBP - 120000GBP
New York, United States
Grafana
Prometheus
Microservices
Kubernetes
Python

Grafana Cloud is our composable observability platform that integrates metrics, logs, traces, and profiles with Grafana. It allows our customers to leverage the best open source observability software, including Prometheus, Mimir, Loki, Tempo, and Pyroscope, without the overhead of installing, maintaining, and scaling their own observability stack.

The Databases department owns and operates the telemetry databases that are Mimir, Loki, Tempo, and Pyroscope. We offer our databases as a Cloud service supporting Grafana Cloud.

The Adaptive Telemetry group, part of the Databases department, has the mission of ensuring that all telemetry stored in our databases is worthy of attention. Under that mission, the group is responsible for the development of the Adaptive Telemetry products, including Adaptive Profiles.

Our Adaptive Telemetry solutions give users the ability to control and optimize their telemetry data. These solutions ensure that data storage is optimized based on individual usage patterns, so only the most valuable data is retained.


Responsibilities

  • Drive technical strategy and roadmap. Proactively define the architectural vision, prioritize work that unlocks major product or platform improvements, and influence product and engineering decisions.
  • Lead end-to-end delivery of large, cross-functional projects. Own planning, design, execution, rollout and long-term operation of large initiatives.
  • Own architecture, reliability, performance and cost for critical systems. Make pragmatic architecture choices that balance scalability, availability, latency and cost while ensuring systems remain maintainable and evolvable.
  • Define SLOs/SLIs and lead incident response. Establish measurable reliability targets, run high-severity incident response, lead blameless post-mortems, and drive systemic fixes and automation to prevent recurrence.
  • Improve observability, automation and operational readiness. Champion telemetry, alerting, runbooks, capacity planning and automation efforts that reduce toil, speed debugging and lower MTTR.
  • Align stakeholders and remove blockers. Coordinate across Product, Design and other teams to align priorities, negotiate tradeoffs, and unblock delivery for large initiatives.
  • Mentor and grow engineering talent. Coach senior and mid-level engineers, lead design reviews, raise engineering standards, and help teammates make sound technical tradeoffs.
  • Represent engineering internally and externally. Communicate technical strategy clearly to non-engineering stakeholders and represent the team in cross-team planning.

What makes you a great fit

You are a motivated self-starter with a bias towards action. You are customer focused: we build everything with our users in mind, and you have a passion for creating intuitive products that fit customers' needs.

  • Proven delivery of large distributed systems. Experience shipping and operating complex systems that span multiple teams, with clear evidence of technical leadership and impact.
  • Strong systems-design instincts. Deep understanding of tradeoffs around latency, consistency, availability, scaling and cost.
  • Hands-on cloud and platform experience. Solid experience with cloud-native architectures (microservices, containers/Kubernetes, IaC) and the operational practices that keep them healthy.
  • Reliability and performance ownership. Comfortable defining SLOs/SLIs, doing capacity planning, tuning performance, and driving reliability work end-to-end.
  • Excellent coding and design skills. You write clear, maintainable, well-tested code and can lead technical designs. We use Go, but Python/C/C++/Rust or similar translate well.
  • Comfort with AI-assisted development. We embrace AI and agentic development, so we expect you to be curious and comfortable using AI-powered developer tools and ideally have practical experience folding them into a team's workflow.
  • Experience with messaging and telemetry. Familiarity with streaming/messaging systems (e.g., Kafka) and observability tooling (Prometheus/Grafana or equivalents).
  • Influence without authority. Ability to align cross-functional stakeholders, set priorities and drive outcomes in a remote-first environment.
  • Strong communicator. Clear written and verbal communication that works across engineers and non-technical stakeholders.