Reliability Engineer - Storage
New York, New York, United States
Two Sigma is a financial sciences company, combining data analysis, invention, and rigorous inquiry to help solve the toughest challenges in investment management, insurance technology, securities, private equity, and venture capital. Our team of scientists, technologists, and academics looks beyond the traditional to develop creative solutions to some of the world’s most complex economic problems.
Reliability Engineering is a versatile group of full-stack engineers on the front line of maintaining and expanding the capabilities of Two Sigma’s many and varied systems. The team operates in the space between traditional systems administration and development, and seeks to merge the capabilities of both disciplines.
You will take on the following responsibilities:
- Providing primary engineering and operational support for multiple large distributed software applications and storage infrastructure systems for the company
- Improving all aspects of system, infrastructure and software reliability, including better monitoring, alerting, and documentation
- Engaging with our software engineering teams on support issues and improvements to our tools, processes, software and infrastructure
- Leading the evaluation and implementation of both homegrown and commercial storage services for use within Two Sigma. Storage services can include NAS appliances, object storage, block storage environments, Ceph, and other technologies
- Developing tools and automation to support our storage infrastructure environments
- Gathering and analyzing metrics from operating systems, applications, and storage infrastructure to assist in performance tuning and fault finding
You should possess the following qualifications:
- A bachelor’s degree in computer science or another highly technical, scientific discipline
- In-depth knowledge of and experience in at least one of: host-based networking, Linux/UNIX administration, storage technologies, systems programming, distributed systems, databases, or cloud computing, along with a strong desire to learn more
- The ability to program (structured and OO) in one or more high-level languages (such as Python, Java, C/C++, Go)
- The ability to leverage off-the-shelf and open-source systems and utilities to provision production systems in a variety of domains, especially for multi-tenant use
- A proven track record of automation and a systematic approach to solving problems
- The ability to translate high-level software and infrastructure requirements from idea to a high-quality implementation
- A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
- The ability to understand the inherent trade-offs between various software architectures as they relate to performance, resiliency/fault tolerance, load balancing, and data consistency
- Experience with distributed storage technologies such as NFS, HDFS, Ceph, and S3, as well as dynamic resource management frameworks such as Mesos, Kubernetes, or YARN
- Experience managing storage appliances from vendors such as EMC, NetApp, Hitachi, and Pure Storage, including building automation in shell and Python to ease the support burden of these environments
- Experience with automated configuration management tools such as Ansible, Chef, Puppet, SaltStack
- Experience with observability and monitoring tools (e.g., New Relic, Datadog, Prometheus, Nagios, VictorOps, Splunk)
- Experience with public cloud technologies (e.g., EC2, GCS)
- Experience with data visualization tools (e.g., Kibana, Grafana, Tableau)
You will enjoy the following benefits:
- Core Benefits: Fully paid medical and dental insurance premiums for employees and dependents, 401k match, employer-paid life & disability insurance
- Perks: Onsite gyms with laundry service, wellness activities, casual dress, snacks, game rooms
- Learning: Tuition reimbursement, conference and training sponsorship
- Time Off: Generous vacation, sick days, and paid caregiver leave
We are proud to be an equal opportunity workplace. We do not discriminate based upon race, religion, color, national origin, sex, sexual orientation, gender identity/expression, age, status as a protected veteran, status as an individual with a disability, or any other applicable legally protected characteristics.