Sr. Site Reliability Engineer @ Robinhood

Location: New York City, NY

Job Posted: 3 years ago

Robinhood was founded on a simple idea: that our financial markets should be accessible to all. With customers at the heart of our decisions, Robinhood is lowering barriers, removing fees, and providing greater access to financial information. Together, we are building products and services that help create a financial system everyone can participate in.

Just as we focus on our customers, we also strive to create an inclusive environment where our employees can thrive and do impactful work. We are proud of the world-class products and company culture we continue to build, and we have been recognized as:

  • A Great Place to Work
  • A CNBC Disruptor 50 in 2019 and 2020
  • A LinkedIn Top Startup in 2017, 2018, 2019 and 2020

Robinhood is backed by leading investors that include DST Global, Index Ventures, NEA, Ribbit Capital, Thrive Capital, and Sequoia.

Check out life at Robinhood on The Muse!

About the role

We’re a rapidly growing team serving a highly results-oriented engineering organization. The Site Reliability Engineering (SRE) team provides a specialization within engineering focused on designing, engineering, evolving, and safely making changes to large-scale distributed systems; these systems are often composed of disparate components which are each individually complex. In the process of that work, they are also responsible for analyzing, repairing, and preventing unexpected issues that emerge from such systems.

The SRE team has three goals:

1) Set high standards for reliable products at Robinhood.

2) Architect products and infrastructure that encourage and enforce high reliability.

3) Inspire change across the broader organization to embrace product reliability best practices and high reliability infrastructure.

We are seeking an experienced SRE to work with our SRE leadership team to help build our newly formed SRE organization and define its vision and roadmap. Initially, you will embed with a product engineering team to understand their work and look for opportunities to bring SRE wisdom to bear.

Day to day, you will:

  • Combine software and systems knowledge to engineer high-volume distributed systems in a reliable, scalable, and fault-tolerant manner.
  • Continually optimize systems and workflows by improving architecture, infrastructure, automation, CI/CD, and observability.
  • Act as an owner and leader of Robinhood's infrastructure by ensuring project infrastructure needs are met and working proactively with customer teams to help them improve reliability and set best practices.
  • Provide mentorship, both formally and informally, to engineers at Robinhood; define and formalize the architecture design process; and guide the overall architectural direction.
  • Represent Robinhood in the technology community (e.g., conferences, technical blog posts).

About you:

  • Fluent in one or more programming languages (e.g. Go, Python, Java).
  • Experience authoring and operating high-scale services.
  • Experience with scalable distributed systems, either built from scratch or on public Cloud (e.g. AWS) primitives.
  • A focus on software engineering best practices such as testing, static analysis, continuous integration, delivery, and deployment.
  • Willingness to learn and use new technologies, and to learn the ins-and-outs of the financial system.
  • Very data-driven.
  • A minimum of 5 years of industry experience.
  • Ability to debug complex systems.

Bonus points:

  • Familiarity with Python/Django or Go
  • Experience with high-growth startups
  • Strong open source contributions

Technologies we use:

  • Python, Django/DRF
  • Go
  • CI/CD and test automation frameworks
  • Container and container orchestration technologies (e.g. Docker, Kubernetes)
  • Microservice-oriented architectures and related OSS technologies (e.g. Kafka, Celery/RabbitMQ, nginx, Redis, Postgres, Airflow, Consul)
  • Cloud-native infrastructure (AWS, GCP)
  • Linux internals and network configuration and protocols
  • Infrastructure as Code and configuration management (Terraform, SaltStack)