When official Android support for Kotlin was announced in May 2017, I got really excited. Don’t get me wrong, I love Java: it was the first language I used professionally, and it has a very strong community, a myriad of libraries to use, and some of the best tooling out there. However… it also has its problems: it’s verbose, until recent versions it didn’t have a nice way to deal with optional or nullable values, and a lot of its progress gets slowed down by backwards compatibility with decisions made two decades ago. Kotlin came as a breath of fresh air.
Data Science is often labeled as one of the sexiest jobs of the 21st century. But it is really hard to find the right sexy data science job.
More and more companies are collecting tons of data on pretty much everything they do and hiring data scientists in the hope of creating value from that data. But companies are at different stages of data science capability and have different expectations of what data scientists should do, so it is very hard for data scientists to choose the right company for their ideal job.
Over the 4 years I’ve been at Thumbtack, our engineering infrastructure has changed a lot. We’ve completely transitioned our cloud provider from SoftLayer to Amazon Web Services (AWS) and Google Cloud Platform (GCP), built our data infrastructure from the ground up, made big steps in migrating to backend services, built a model serving infrastructure, built our own push notification delivery service, migrated more than 90% of our iOS codebase to Swift, built two Android apps, and, oh, completely overhauled how our marketplace works.
While we work hard to create systems that are simple,
Thumbtack currently runs about 30 A/B tests per month, ranging in duration from a week to six months. We experiment on virtually every area of our product: customer signup, pro signup, the algorithm that matches customers with pros, messaging features, reviews features, SEO traffic, and many more. Our experiment analysis computes 600 different metrics, defined by both analysts and engineers.
This blog post is in seven parts:
- Experiment Design
- Experiment Configuration
- Experiment Assignment Service
- A brief overview of our Data Platform
- Experiment Result Computation
- Metric Definitions
- Experiment Result Visualization Service
Here’s how these parts are connected:
About half of Thumbtack engineers will own at least one A/B test within their first 6 months.
At Thumbtack we use InfluxDB to store monitoring metrics collected from all of our systems. It currently ingests more than 200,000 data points per second, and with our current retention policies this adds up to 7.5TB of stored data.
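To get a feel for that ingestion rate, here is a back-of-the-envelope sketch in Python. The 200,000 points per second and 7.5TB figures come from the text above; the function names, the decimal reading of TB, and the retention window passed to `implied_bytes_per_point` are assumptions made only for illustration, since the actual retention policies are not stated here.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def points_per_day(points_per_second: int) -> int:
    """Data points ingested in one day at a steady rate."""
    return points_per_second * SECONDS_PER_DAY


def implied_bytes_per_point(total_bytes: float,
                            points_per_second: int,
                            retention_days: float) -> float:
    """Rough average on-disk bytes per point.

    Assumes every point is kept at full resolution for `retention_days`,
    which is a simplification: real retention policies often downsample
    or expire data at different rates.
    """
    return total_bytes / (points_per_day(points_per_second) * retention_days)


if __name__ == "__main__":
    rate = 200_000            # points per second, from the post
    stored = 7.5e12           # 7.5TB, read as decimal terabytes (assumption)
    hypothetical_days = 30    # NOT from the post; purely illustrative

    print(f"{points_per_day(rate):,} points per day")
    print(f"{implied_bytes_per_point(stored, rate, hypothetical_days):.1f} "
          f"bytes per point if everything were kept for {hypothetical_days} days")
```

At a steady 200,000 points per second, that is a little over 17 billion points per day, which is why storage and availability choices matter at this scale.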
We prefer to run open-source software, which is one of the reasons we were drawn to InfluxDB. However, the open-source core does not provide high availability features.
As we grew, rolled out new features, and collected more and more metrics (the amount of which has almost doubled over the past 12 months),