We are looking for a highly motivated Senior Query Optimizer Software Engineer to join Voltron Data’s team. On the team, you’ll have the opportunity to help support and grow the Voltron Data and Apache Arrow ecosystems. You will work closely with Voltron Data development teams to build and maintain a SQL parser and query optimizer for large-scale single-node and distributed query execution engines.
Why work at Voltron Data?
- We are Going for Impact: We are a Series A, venture-backed startup assembling a global team to build a new foundation for data analytics with Apache Arrow. This foundation will usher in a wave of innovation in data processing that can take full advantage of the speed and efficiency offered by modern hardware.
- We are Committed to Bridging Open Source Communities: We are a collection of open source maintainers who have been driving open source ecosystems over the last 15 years, particularly in the C++, Python, and R programming ecosystems.
- We are Building a Diverse, Inclusive Company: We are creating a representative, equitable, and respectful workplace that prioritizes employee growth. Everyone at Voltron Data is bought into the company’s success; all voices are critical to shaping the organization’s future.
Below is a rough timeline of where you can expect to be at different points during your career path starting in this position.
- Spending time learning about the Apache Arrow compute primitives, compute intermediate representation, compute engine, and other foundational components.
- Familiarizing yourself with the different partners for compute kernels and the query execution engine on Apache Arrow.
- Learning and embracing the Apache development process.
Within a month:
- Becoming familiar with our SQL parser and query optimizer.
- Benchmarking queries and exploring the effects of different query optimization techniques using our query optimizer.
- Making changes and improvements to the existing query optimization rules and to how the optimizer creates physical execution plans for the execution engines under development.
Within 6 months:
- Adding new query optimization rules.
- Improving decision-making in cost-based optimization according to the metadata available for the tables being queried.
- Integrating non-SQL operations into the optimization framework.
- Working with client-facing engineers to understand performance bottlenecks in customer queries.
Within 12 months:
- Proposing and implementing improvements to the query parsing and optimization framework.
- Integrating with a stateful, inter-query engine context so that compute stages can be reused across queries that access the same data in similar ways.
Previous experience that could be helpful:
- Building and/or using open source query optimization frameworks like Apache Calcite, Apache Spark Catalyst, Postgres Query Optimizer, and/or others.
- Developing in C++, especially using modern C++.
- Utilizing serialization libraries like FlatBuffers, Protobuf, Thrift, MessagePack, and/or others.
- Working on non-SQL systems and non-SQL computational abstractions.
US Compensation - The salary range for this role is $175,000 to $210,000. We have a global market-based pay structure which varies by location. Please note that the base pay range is a guideline; for candidates who receive an offer, the exact base pay will vary based on factors such as actual work location and the candidate's skills and experience. This position is also eligible for additional incentives such as equity awards.