Software Engineer: Time Series Storage
New York, New York, United States
Two Sigma is a financial sciences company, combining data analysis, invention, and rigorous inquiry to help solve the toughest challenges in investment management, insurance technology, securities, private equity, and venture capital.
Our team of scientists, technologists, and academics looks beyond the traditional to develop creative solutions to some of the world’s most complex economic problems.
We are seeking a software engineer for the Time Series Storage team! The team is responsible for the architecture, design, development and support of a petabyte-scale time-series storage solution that powers Two Sigma’s workflows across research, simulation and live trading. We are evolving our time-series data access layer towards a modern hybrid-cloud architecture to sustain the continuous growth of our business. As a software engineer on the team, you will have the opportunity to tackle exciting technical problems at massive scale (e.g. hundreds of PBs of data and billions of requests) and to use modern technologies like Parquet, Arrow and gRPC. You will also have the opportunity to contribute to the related open source projects.
You will take on the following responsibilities:
- Develop the next-generation time-series storage solution on top of cloud-native primitives and standard data formats
- Evaluate, build on and contribute to the relevant open source projects that are integrated with the storage solution
- Support the time-series storage infrastructure, diagnose and resolve issues in the production environment, and operate the services to meet their SLOs/SLAs
- Collaborate with engineering partners to integrate the time-series storage solution with various downstream applications
You should possess the following qualifications:
- A bachelor’s degree in a technical subject area with 3+ years of full-time work experience in a relevant domain
- Strong software engineering skills in developing, testing and troubleshooting code in one or more of the following programming languages: Java, C++, Python
- Familiarity with common RPC frameworks like gRPC and REST. Experience building data services with these frameworks, and an understanding of how HTTP/TLS, DNS and load balancing work
- Familiarity with the use of different data stores such as relational databases, key-value stores, message queues and time-series databases
- Familiarity with common columnar data formats like Parquet and in-memory formats like Arrow is a plus
- Familiarity with the Pandas ecosystem for data science and data analytics in Python is a plus
- Experience with data warehouses, database system internals, and large-scale data processing solutions like Hadoop and/or Spark is a plus
You will enjoy the following benefits:
- Core Benefits: Fully paid medical and dental insurance premiums for employees and dependents, 401k match, employer-paid life & disability insurance
- Perks: Onsite gyms with laundry service, wellness activities, casual dress, snacks, game rooms
- Learning: Tuition reimbursement, conference and training sponsorship
- Time Off: Generous vacation, sick days, and paid caregiver leave
We are proud to be an equal opportunity workplace. We do not discriminate based upon race, religion, color, national origin, sex, sexual orientation, gender identity/expression, age, status as a protected veteran, status as an individual with a disability, or any other applicable legally protected characteristics.