Data Platform Team | June 30, 2022 | 6 min read
Powering Zomato’s data analytics using Trino

Engineers don’t search for things; they raise queries! 

Much like how we use Google for everyday things, techies use an internal repository (aka SQL search engine) to discover data that is important to them. At Zomato, different teams have different requirements. For example, the City Lead team might be interested in knowing the number of orders placed in NCR last week, Customer Support might want to know how many queries were solved in a minute, Product Team would like to know the number of customers participating in Zomato Premier League on the day of KKR vs CSK, Social Media might like to know the most-searched dishes on New Year’s Eve, and more. In a nutshell, you get how broad our query types are. To maintain and facilitate such a variety, we need a reliable SQL (Structured Query Language) search engine that can provide better and more relevant results to one and all. 

At Zomato, our Data Platform team is responsible for building and maintaining a reliable platform with the core aim of making data an essential part of Zomato’s DNA. To achieve this, our platform should be able to minimise the wait time, i.e., the period between raising a query and getting the results. At our scale, it is a challenge to maintain this without proper infrastructure and efficient data storage. 

We currently use Trino as the query engine to power Zomans to build analytical systems to extract actionable insights from our data lake, run a series of experiments, and build ML models. Trino is a huge part of our data stack due to its industry-standard SQL interface and ability to join and aggregate data across heterogeneous data sources like Apache Hive, MySQL, Apache Druid, Apache Pinot, MongoDB, etc.

Zomans consume data using query and visualisation tools such as Apache Superset, Redash, Jupyter notebooks, and our custom python SDK – PyJumbo.

Talk numbers to me – query scale and workloads

We observed that the query workloads can be broadly divided into the following categories:

  • Scheduled ETLs/ reports: Long ETL(Extract, Transform, Load) queries, with runtime up to 3 hours. Typically, these are materialised views to speed-up dashboard queries run through an internal platform built on top of Apache Airflow with an in-built full query retry mechanism.
  • Dashboard queries: Short queries, with runtime up to 3 minutes. These are primarily used to build and share dashboards for various business process monitoring.
  • Adhoc analysis: Medium duration queries, with a maximum runtime of 20 minutes. Used for debugging and other adhoc use cases.

Based on the usage pattern, we can individually tune the cluster by segregating them for performance and robustness.

Ready? Set? Deploy! 

We are currently running Trino on Amazon Elastic Container Service (Amazon ECS) using our custom Trino image, wherein we have integrated the following components:

  • Zomato Trino query-log plugin: A query logging plugin based on the Trino Event listener framework1. We log all query completed events and use them to create metrics that help us power the golden-signal2 Trino dashboard. We regularly analyse queries to tune our hive tables data layout by tuning data file size and using column-level indexing(Bloom filters). We also compute column-level statistics to enable Cost-Based Optimizer3
  • Trino User Defined Functions (UDFs): Custom UDFs based on the needs of Zomans, like custom geo functions.
  • Verifier and Benchmark: Before rolling out a new Trino version, we benchmark the new version to measure the performance of queries. In addition to this, we also use Trino Verifier to compare the query results produced by two different clusters. 

This is incredibly useful to identify any backward-incompatible changes present in the new version. For instance, Trino version 3414 had a backward-incompatible change for the timestamp column. This tool helped us identify and isolate the change before rolling out the new version.

Setting up alerts to solve bottlenecks and debug issues

Since Trino executes queries from various sources by distributing them across different workers, it becomes difficult to identify bottlenecks and debug service issues. Through Trino JMX and docker container metrics, we can create real-time monitoring dashboards to help debug issues with our clusters. Some of the  PagerDuty alerts we have:

  • P95 query execution/queued time of cluster
  • Number of active workers 
  • Rate of type of query failures (user, internal, external, or insufficient resource)
  • Number of queued and running queries

Apart from real-time monitoring, we have also set alerts on our Trino query logs to 

  1. identify bad queries choking the clusters 
  2. regularly look for query resource usage (bytes scan, total stages, total runtime, etc) to fine-tune our cluster configurations

Going Biiiiiiig – Scaling Trino

As Zomato grew, the need for data-driven analysis also increased, and so did our cost for the Trino setup. 

After deploying Trino, we soon understood that only time-based autoscaling would not work because of two reasons: 

  1. We need to scale our cluster according to the traffic
  2. We want to gracefully shutdown unnecessary extra nodes during the off-peak times

A mould that fits all needs – Custom Autoscaler

We aimed to increase the cluster’s effective utilisation, all while offering a reliable customer experience. 

While looking for the auto-scaling solution, it became clear that Trino cannot be scaled based on CPU/Memory metrics like any other service. We experimented with multiple scaling conditions but finally settled on scaling the cluster on the ratio of running queries and active workers. We experimented with queued queries metric in the scaling equation but soon removed it as we realised it was more biased towards upscaling the cluster and was not going well with our resource groups.

Upscale if runningQueries/activeWorkers > THRESHOLD

Downscale if runningQueries/activeWorkers < THRESHOLD

We calculated the value of THRESHOLD based on the maximum number of nodes, maximum concurrent queries, and query max memory per node.

With this custom auto-scaler, we managed our workload more efficiently. In addition, we also reduced the cluster cost by 30-40%, with 0 downtimes during downscaling.

Getting to the next step – Presto Gateway

As the scale increased, we faced frequent cluster slowdowns, even with our custom auto-scaler. We realised that a single cluster will not be enough to scale indefinitely and we need a multi-cluster concept to manage our workloads. Hence, we decided to use a load balancer/ proxy/ gateway for prestoDB/ Trino called presto-gateway5 open-sourced by Lyft.

In Lyft’s own words – ‘Presto-Gateway is a stateful load-balancer, proxy and router for multiple presto clusters, it provides transparent access to underlying presto-backend without changing the protocol.’

Since then, we’ve added functionalities on top of it. This service makes visualisation tools/ SDKs agnostic of specific Trino versions and enables the following usages. Some of these features are in active development and we are slowly moving all our visualisation tools/ SDKs from specific clusters to Presto Gateway. 

  1. Support for converting X-Presto Headers to X-Trino headers to make it backwards compatible for our custom visualisation tools/ SDKs
  2. Overall Presto cluster’s health visibility
  3. Query cost/thinking estimation routing

With Trino, Zomans say Bingo!

Trino has become an integral part of our day-to-day life. Be it – analytics, reporting, data quality, or lineage tracking. With Trino’s architecture, we can empower Zomans to make data-driven decisions. In the future blogs, we will talk about

  • How we decided the best instances for our different workloads?
  • Migration of Trino to Amazon ECS from Amazon Elastic MapReduce (EMR)
  • Running Trino on AWS Graviton instances

This is a data platform article to share how we scaled Trino to smoothly facilitate our growing number of queries. If you are interested in solving similar problems, connect with Sumanshu Dwivedi on LinkedIn. We’re always looking for cool Data Platform Engineers at Zomato.

This blog was written by Ayush Chauhan and Palash Goel in collaboration with Ankit Atreja, Anmol Virmani, and Rajat Taya under the guidance of Sumanshu Dwivedi and Himanshu Rathore

––––––

Sources –

  1. Trino – Event Listener,trino.io
  2. Monitoring distributing systems, sre.google
  3. Trino – Cost-Based Optimizer, trino.io
  4. Fix timestamp semantics, github.com 
  5. lyft/presto-Gateway, github.com 

-x-

All images/ videos are designed in-house using excalidraw.com and figma.com

All content provided in this blog is for informational and educational purposes only. It is not professional advice and should be treated as such. The writer of this blog makes no representations as to the accuracy or completeness of any content or information contained here or found by following any link on this blog.

facebooklinkedintwitter

More for you to read

Technology

apache-flink-journey-zomato-from-inception-to-innovation
Data Platform Team | November 18, 2024 | 10 min read
Apache Flink Journey @Zomato: From Inception to Innovation

How we built a self-serve stream processing platform to empower real-time analytics

Technology

introducing-pos-developer-platform-simplifying-integration-with-easy-to-use-tools
Sumit Taneja | September 10, 2024 | 2 min read
Introducing POS Developer Platform: Simplifying integration with easy-to-use tools

Read more about how Zomato is enabling restaurants to deliver best-in-class customer experience by working with POS partners

Technology

migrating-to-victoriametrics-a-complete-overhaul-for-enhanced-observability
SRE Team | August 12, 2024 | 11 min read
Migrating to VictoriaMetrics: A Complete Overhaul for Enhanced Observability

Discover how we migrated our observability metrics platform from Thanos and Prometheus to VictoriaMetrics for cost reduction, enhanced reliability and scalability.

Technology

go-beyond-building-performant-and-reliable-golang-applications
Sakib Malik | July 25, 2024 | 6 min read
Go Beyond: Building Performant and Reliable Golang Applications

Read more about how we used GOMEMLIMIT in 250+ microservices to tackle OOM issues and high CPU usage in Go applications, significantly enhancing performance and reliability.