Remote, San Francisco
$160,000 to $180,000 per year
Numerai
Numerai is a new kind of hedge fund powered by a decentralized network of machine learning models. Every week, thousands of data scientists from around the world compete in the https://numer.ai data science tournament to model our dataset and predict the global stock market, earning staking rewards in the NMR cryptocurrency based on their performance. Collectively, these staked predictions form the Meta Model which controls the portfolio of the Numerai hedge fund.

In the short two years since we started trading, our novel approach to crowdsourcing machine learning intelligence has generated industry-leading performance for the Numerai hedge fund, shown at https://numer.ai/fund. The most exciting thing about our performance is that it is constantly getting better as we improve the dataset and as the community of data scientists grows. And since our alpha is highly portable, we will be able to apply this playbook to create industry-leading investment products in any asset class and in any market condition.

All of this is part of Numerai's mission to monopolize intelligence, monopolize data, monopolize money, and to decentralize the monopoly. We believe that the capital allocation industry is ripe for disruption. Recent advances in machine learning and blockchain technologies have given us a unique opportunity to redesign the entire system from first principles and to re-build it from scratch.
The Stack
Python (Pandas, NumPy, Scikit-Learn)
AWS (S3, EC2, Batch)
Airflow
Terraform, Docker
Postgres, MySQL

The Role
Your role as a quantitative data engineer is to work on all the systems that ingest, store, transform, and serve data at Numerai. You will be asked to source interesting new datasets, build models to determine their value, and research how best to incorporate them into our strategy and our data science tournament dataset. You will work closely with our ML engineers and researchers to design and build datasets, with our product and backend engineers to integrate data pipelines and services, and with our infrastructure team to ensure all systems are running smoothly.

As a senior engineer, you will be expected to own and lead major projects while actively raising the engineering bar across the entire organization. You will work with other senior engineers to architect robust and performant systems and processes. You will work closely with management to translate business objectives into technical requirements, align projects and goals with key metrics, and grow the engineering team.

Example Projects
Research, trial, purchase, and integrate new datasets to further enrich our equities data collection and enable better prediction of stock movement.
Take ownership of existing tournament and hedge fund data pipelines and ensure timely delivery of all critical datasets. Build tools and processes to improve the scalability, reliability, and observability of these pipelines without sacrificing developer velocity.
Research, implement, and test transformations to the dataset to improve machine learning performance.
Design and build a canonical data platform for all datasets and data tables. Define and enforce SLAs for delivery. Define and enforce data quality standards by writing data integrity tests. Create a data staging environment to support end-to-end integration testing.
Design and build metrics to measure the performance, cost, usage, and efficiency of all data systems. Use these metrics to drive tactical and strategic improvements to the overall system.

Requirements
Experience building machine learning models in production
Experience with quantitative finance data
Experience building data platforms and APIs
Excellent written communication (design docs, specs, documentation, code reviews, post-mortems)
Extreme ownership
Good general systems knowledge and debugging skills
Willing to work extended hours and on weekends if necessary
Strong interest in machine learning, quantitative finance, and decentralization