Senior DevOps Engineer - Network (100% Remote, Work From Home)

Bitcoin Devs Company

3 months ago
Exp: 5-7 Years

India

Job Description

Overview

The Senior DevOps Engineer - Network is a critical role responsible for overseeing the network infrastructure and ensuring the successful integration of network operations with DevOps processes. This position plays a key role in the design, implementation, and maintenance of network solutions in alignment with the organization's DevOps objectives.

As a member of the Platform Engineering team, you will be responsible for managing and supporting the infrastructure that drives our platform. The reliability and scalability of our technology are key to our success, and you will work with our development and security teams to help design highly available, fault-tolerant systems.

In particular, you will focus on monitoring and optimizing our network performance to support the low-latency, high-throughput operation of our trading exchange.

Key responsibilities

Continuously improve the resiliency, throughput and latency profiles of our trading systems by working hand-in-hand with our trading technology teams
Manage and support our AWS cloud infrastructure, EC2 instances and physical servers
Develop and manage IaC to ensure consistency of our infrastructure
Ensure security hardening of our OS builds and configurations
Manage and maintain config management tooling to ensure consistency
Integrate our stack with Kubernetes
Ensure SRE best practices for design and operation of the stack
Design, implement and test disaster recovery capabilities to ensure our business can continue to operate in the event of a technology failure
Participate in an on-call rota for escalations

Required Qualifications

Theoretical and practical networking knowledge, including but not limited to unicast and multicast routing protocols, the Linux kernel's TCP stack implementation, congestion avoidance/control (e.g. BBR), traffic control, network simulation, AWS VPC / TGW and Kubernetes VPC CNI. DPDK experience is a plus.
Professional experience with kernel troubleshooting: strace, bpftrace, perf profiling/tracing, navigating / reading / building the relevant kernel code.
Professional experience with userland monitoring (e.g. Thanos/Prometheus/Alertmanager), logging (e.g. Splunk/Loki), alerting, troubleshooting, profiling/tracing, etc.
Strong practical AWS knowledge, with a minimum of 5 years of SRE / DevOps experience supporting and managing Linux-based systems. A computer science or engineering degree is preferred; a strong understanding of fundamental computer science principles is required.
Familiarity with Kubernetes / Ansible / Chef, and with one or more programming languages: Python, Golang, C, Node.js.

Skills: DevOps, automation, cloud, security, networking, routing protocols