Much like how we use Google for everyday questions, techies use an internal data platform, queried through SQL, to discover the data that matters to them. At Zomato, different teams have different requirements. For example, the City Lead team might want to know the number of orders placed in NCR last week, Customer Support might want to know how many queries were resolved within a minute, the Product team might want to know the number of customers participating in the Zomato Premier League on the day of KKR vs CSK, the Social Media team might want to know the most-searched dishes on New Year’s Eve, and more. In a nutshell, our query types are remarkably broad. To support such variety, we need a reliable SQL (Structured Query Language) query engine that can serve relevant results quickly to one and all.
At Zomato, our Data Platform team is responsible for building and maintaining a reliable platform with the core aim of making data an essential part of Zomato’s DNA. To achieve this, our platform should be able to minimise the wait time, i.e., the period between raising a query and getting the results. At our scale, it is a challenge to maintain this without proper infrastructure and efficient data storage.
We currently use Trino as our query engine, enabling Zomans to build analytical systems that extract actionable insights from our data lake, run experiments, and build ML models. Trino is a huge part of our data stack thanks to its industry-standard SQL interface and its ability to join and aggregate data across heterogeneous data sources like Apache Hive, MySQL, Apache Druid, Apache Pinot, MongoDB, and more.
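To make the federation point concrete, here is a minimal sketch using the open-source trino Python client to join data across two catalogs in a single query. The host, catalog, schema, and table names are hypothetical placeholders, not our actual setup.

```python
# pip install trino
import trino

# Connect to a Trino coordinator (hostname and user are hypothetical).
conn = trino.dbapi.connect(
    host="trino.internal.example.com",
    port=8080,
    user="analyst",
)
cur = conn.cursor()

# A single SQL statement can join tables living in different catalogs:
# here, orders in the Hive data lake and restaurant metadata in MySQL
# (all catalog/schema/table names are illustrative).
cur.execute("""
    SELECT r.city, count(*) AS orders
    FROM hive.analytics.orders o
    JOIN mysql.catalog_db.restaurants r ON o.restaurant_id = r.id
    WHERE o.order_date >= date '2022-01-01'
    GROUP BY r.city
    ORDER BY orders DESC
    LIMIT 10
""")
for row in cur.fetchall():
    print(row)
```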
Zomans consume data using query and visualisation tools such as Apache Superset, Redash, Jupyter notebooks, and our custom Python SDK – PyJumbo.
We observed that the query workloads can be broadly divided into the following categories:
Based on these usage patterns, we can segregate workloads onto separate clusters and tune each one individually for performance and robustness.
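Within a single cluster, one way to express this kind of segregation is Trino’s resource groups, configured via etc/resource-groups.json. The sketch below is purely illustrative, with made-up group names, limits, and selector patterns rather than our production configuration.

```json
{
  "rootGroups": [
    {
      "name": "adhoc",
      "softMemoryLimit": "40%",
      "hardConcurrencyLimit": 20,
      "maxQueued": 100
    },
    {
      "name": "scheduled",
      "softMemoryLimit": "60%",
      "hardConcurrencyLimit": 10,
      "maxQueued": 200
    }
  ],
  "selectors": [
    { "source": ".*superset.*", "group": "adhoc" },
    { "source": ".*airflow.*", "group": "scheduled" }
  ]
}
```

Selectors match on attributes of the incoming query (such as its source), so interactive dashboard traffic and scheduled pipelines land in groups with different concurrency and memory limits.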
We are currently running Trino on Amazon Elastic Container Service (Amazon ECS) using our custom Trino image, wherein we have integrated the following components:
This is incredibly useful for identifying any backward-incompatible changes present in a new version. For instance, Trino version 341 had a backward-incompatible change to the timestamp column type. This tooling helped us identify and isolate the change before rolling out the new version.
Since Trino executes queries from various sources by distributing work across different workers, it becomes difficult to identify bottlenecks and debug service issues. Using Trino JMX and Docker container metrics, we have built real-time monitoring dashboards that help us debug issues with our clusters. Some of the PagerDuty alerts we have:
Apart from real-time monitoring, we have also set alerts on our Trino query logs to
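As a concrete illustration of the JMX-based monitoring mentioned above, Trino ships a JMX connector that exposes MBeans as queryable tables. The sketch below uses the java.lang:type=OperatingSystem bean from Trino’s documentation as an example; the host is hypothetical, and the beans we actually alert on differ.

```python
import trino

conn = trino.dbapi.connect(
    host="trino.internal.example.com",  # hypothetical coordinator address
    port=8080,
    user="monitoring",
    catalog="jmx",
    schema="current",
)
cur = conn.cursor()

# Each JMX MBean appears as a table; this one reports per-node
# file descriptor usage, one row per node in the cluster.
cur.execute("""
    SELECT node, openfiledescriptorcount, maxfiledescriptorcount
    FROM "java.lang:type=operatingsystem"
""")
for row in cur.fetchall():
    print(row)
```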
As Zomato grew, the need for data-driven analysis also increased, and so did our cost for the Trino setup.
After deploying Trino, we soon understood that time-based autoscaling alone would not work, for two reasons:
We aimed to increase the cluster’s effective utilisation, all while offering a reliable customer experience.
While looking for an auto-scaling solution, it became clear that Trino cannot be scaled on CPU/memory metrics like any other service. We experimented with multiple scaling conditions and finally settled on scaling the cluster on the ratio of running queries to active workers. We also experimented with the queued-queries metric in the scaling equation, but soon removed it: it biased the system towards upscaling and did not play well with our resource groups.
Upscale if runningQueries/activeWorkers > THRESHOLD
Downscale if runningQueries/activeWorkers < THRESHOLD
We calculated the value of THRESHOLD based on the maximum number of nodes, maximum concurrent queries, and query max memory per node.
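A minimal sketch of such an auto-scaler loop is shown below. It assumes the coordinator’s REST endpoint /v1/cluster (which reports runningQueries and activeWorkers) and adjusts an ECS service’s desired count via boto3. The URL, service names, threshold, and step size are all hypothetical, and our production logic (cool-downs, separate up/down thresholds, graceful worker shutdown) is more involved.

```python
import time

import boto3
import requests

# Hypothetical endpoints and values; not our production configuration.
TRINO_STATS_URL = "http://trino.internal.example.com:8080/v1/cluster"
THRESHOLD = 3.0  # illustrative; derived from cluster limits as described above
STEP = 2         # workers added or removed per scaling action

ecs = boto3.client("ecs")


def desired_worker_count() -> int:
    """Read the current desired count of the (hypothetical) worker service."""
    service = ecs.describe_services(
        cluster="trino-cluster", services=["trino-worker"]
    )["services"][0]
    return service["desiredCount"]


while True:
    # Trino's coordinator reports cluster-wide stats at /v1/cluster,
    # including runningQueries and activeWorkers.
    stats = requests.get(TRINO_STATS_URL).json()
    ratio = stats["runningQueries"] / max(stats["activeWorkers"], 1)

    current = desired_worker_count()
    target = current
    if ratio > THRESHOLD:
        target = current + STEP          # overloaded: add workers
    elif ratio < THRESHOLD:
        target = max(current - STEP, 1)  # underutilised: remove workers

    if target != current:
        ecs.update_service(
            cluster="trino-cluster", service="trino-worker", desiredCount=target
        )
    time.sleep(60)  # evaluate once a minute; production adds cool-downs
```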
With this custom auto-scaler, we manage our workloads more efficiently. We also reduced the cluster cost by 30-40%, with zero downtime during downscaling.
As our scale increased, we faced frequent cluster slowdowns even with our custom auto-scaler. We realised that a single cluster would not scale indefinitely and that we needed a multi-cluster setup to manage our workloads. Hence, we decided to use presto-gateway, a load balancer/proxy/gateway for PrestoDB/Trino open-sourced by Lyft.
In Lyft’s own words – ‘Presto-Gateway is a stateful load-balancer, proxy and router for multiple presto clusters, it provides transparent access to underlying presto-backend without changing the protocol.’
Since then, we’ve added functionality on top of it. This service makes our visualisation tools/SDKs agnostic of specific Trino clusters and versions, and enables the following usages. Some of these features are in active development, and we are gradually moving all our visualisation tools/SDKs from specific clusters to Presto Gateway.
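For a flavour of how the gateway is operated, presto-gateway exposes a REST API for registering and listing the backend clusters it routes to. A minimal sketch with hypothetical URLs and names follows; the exact payload may vary across gateway versions.

```python
import requests

GATEWAY_URL = "http://presto-gateway.internal.example.com:8080"  # hypothetical

# Register a Trino cluster as a gateway backend. Queries tagged with
# the matching routing group get proxied to this cluster.
backend = {
    "name": "trino-adhoc-1",
    "proxyTo": "http://trino-adhoc-1.internal.example.com:8080",
    "routingGroup": "adhoc",
    "active": True,
}
resp = requests.post(
    f"{GATEWAY_URL}/entity?entityType=GATEWAY_BACKEND", json=backend
)
resp.raise_for_status()

# List the currently registered backends.
print(requests.get(f"{GATEWAY_URL}/entity/GATEWAY_BACKEND").json())
```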
Trino has become an integral part of our day-to-day life, be it analytics, reporting, data quality, or lineage tracking. With Trino’s architecture, we can empower Zomans to make data-driven decisions. In future blogs, we will talk about
This is a Data Platform article sharing how we scaled Trino to smoothly serve our growing query volume. If you are interested in solving similar problems, connect with Sumanshu Dwivedi on LinkedIn. We’re always looking for cool Data Platform Engineers at Zomato.
This blog was written by Ayush Chauhan and Palash Goel in collaboration with Ankit Atreja, Anmol Virmani, and Rajat Taya under the guidance of Sumanshu Dwivedi and Himanshu Rathore.
-x-
All images/ videos are designed in-house using excalidraw.com and figma.com.
All content provided in this blog is for informational and educational purposes only. It is not professional advice and should not be treated as such. The writer of this blog makes no representations as to the accuracy or completeness of any content or information contained here or found by following any link on this blog.