Senior Machine Learning Engineer, Tumblr

AUTOMATTIC


Worldwide

Tumblr launched in 2007 with the belief that people need a place to say what they want, be who they want, and connect over their interests. We continue to build Tumblr as a platform for free expression, individuality, and human connection.

We are looking for an experienced candidate to join our Feeds Experience team, which builds the backend systems that power Tumblr’s core content feeds, search, personalization and discovery experiences, user-interest profiling, content understanding, and notifications. Your goal will be to design, develop, and maintain large-scale data pipelines and backend services that connect users with the content they love. The team plays a critical role in driving daily active users by improving engagement and retention on Tumblr.

We build on top of open-source big-data frameworks, such as Apache Spark (batch processing) and Apache Flink (real-time processing), orchestrated by Kubernetes and Apache Airflow, with a PHP backend layer.

Responsibilities:

  • Collaborate with the team to enhance engagement on feeds, notifications, and content discovery, and to improve the relevance of search results. This will involve developing algorithms, data pipelines, and backend services to match millions of users with the most relevant and engaging content, detect trends, improve search retrieval and relevance, and strike a balance between driving engagement to established and new content creators.

  • Research and develop new features to improve user engagement and the reactivation of lapsed users.

  • Define success metrics, launch A/B tests, perform analysis to validate hypotheses, and build tools to enable continuous experimentation.

  • With stakeholders from Engineering, Product, and User Research, contribute to the team’s strategy and roadmap. This includes identifying short- and long-term opportunities for business impact on Feeds, Search, and Notifications; discussing alternatives; driving architectural decisions and implementation; and, finally, quantifying the impact of implemented solutions and distilling learnings.

Requirements:

  • Good understanding of statistics, machine learning, and mining of massive datasets.
  • 3+ years of professional experience developing large-scale data pipelines and machine learning approaches for content retrieval, ranking, relevance, and personalization.
  • 5+ years of professional experience in software engineering, with expertise in at least one of the following programming languages: Python, Scala, Java. You will encounter all of them in this role, as well as PHP; the idea of using them on a regular basis should not be a blocker for you.
  • Hands-on experience with data processing frameworks such as Apache Spark and Apache Flink at scale.
  • You have excellent written English and can effectively communicate with stakeholders and colleagues from a cross-functional organization. Communication is our oxygen and the basis of everything we do.
  • You are goal driven, humble, and have equal willingness to learn and teach.
  • Excitement to join a globally distributed team. Familiarity with remote and async work is welcome.

What will make you stand out:

  • MS/PhD in Computer Science, ML, or related fields.
  • Experience in large-scale notification systems for driving user growth and engagement loops.
  • Deep expertise in search ranking and relevance at scale, in particular with Elasticsearch.
  • Experience in embedding-based recommendation and retrieval.
  • Experience in real-time stream processing frameworks.

Salary range: $100,000-$200,000 USD. Please note that salary ranges are global, regardless of location, and we pay in local currency.

Read more about our compensation philosophy and benefits.

This isn’t your typical work-from-home job: we are a fully remote company with an open vacation policy. To see a full list of benefits by country, consult our Benefits Page. You can also check out these links to learn more about How We Hire and What We Expect from Ourselves.