“PagerDuty has been the standard tool in DevOps for a long time, the same discipline we are utilizing in DataOps where all these teams send signals into one single tool, and we are able to correlate across multiple signals.”
Long gone are the days when data was batch loaded into a data warehouse for business intelligence reports that were reviewed periodically, and when a broken load meant only that a few internal people had to wait. Today, data pipelines are “infinitely more complicated”: they draw on more sources, from cloud services to on-premises systems, and they support data applications that are critical parts of a business’s ecosystem.
In this episode, Dormain Drewitz sits down with Manu Raj, Senior Director of Analytics and Data Engineering at PagerDuty, and James Zhao, Senior Product Manager at Snowflake, to discuss how DataOps has evolved and how it will be essential to support large language models (LLMs) in production.
Resources mentioned:
– Blog: Unlocking the Potential of Snowflake Alerts & PagerDuty Operations Cloud: Enhancing Data Operations, by Ravi Kumar (Medium, June 2023)
– Andreessen Horowitz: Emerging LLM Architectures
Summary created with help from ChatGPT
In the opening segment of The Unplanned Show, the host introduces James Zhao from Snowflake and Manu Raj from PagerDuty to discuss the challenges of unplanned work and its impact on businesses. The conversation touches on the core capabilities of large language models (LLMs) driving generative AI features and sets the stage for a deep dive into DataOps. Notably, Manu and James provide insights into their roles and discuss the increasing complexity of data operations in the age of cloud and on-prem systems, mixed with the integration of generative AI tools.
“The complexity of the data operations has now transformed into a mixture of all these tools coming together, which is where DataOps is now in a very challenging situation.”
Next, the discussion turns to the increasing complexity of LLM architectures and its impact on data operations. The host mentions the Andreessen Horowitz article on emerging LLM architectures and highlights the challenges presented by the multitude of components and dependencies. The conversation touches on how much businesses depend on complex data systems, leading to a discussion between James and Manu about disruptions in the data pipeline and what customers experience when individual components break down. James emphasizes the growing need for resilient data pipelines and mentions customer demand for features such as streaming data ingestion and enhanced observability. The conversation then shifts to Snowflake’s evolving role beyond a data warehouse, with a focus on data observability. James discusses Snowflake’s efforts to give customers more visibility into their Snowflake accounts, introducing features like an event table and native alerts for proactive monitoring. Manu expresses excitement about the integration of Snowflake’s observability capabilities with PagerDuty, emphasizing the importance of these notifications in modern digital operations.
“We were excited for […] those notifications coming from Snowflake, and our team was the first to jump in with happiness when we first heard that Snowflake released […] observability on top of all the telemetry data event notifications on top of Snowflake.”
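For readers who want a concrete picture of a data alert becoming a PagerDuty notification, here is a minimal Python sketch. It is not the integration described in the episode or in the linked blog post. It assumes that some process on your side has already detected a fired alert and knows its name and a short description; the function names, the alert name, the `snowflake` source label, and the placeholder integration key are all hypothetical. The only external interface it relies on is the PagerDuty Events API v2 endpoint and its event format.

```python
import json
import urllib.request

# PagerDuty Events API v2 endpoint
EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"

# Severity values accepted by the Events API v2
ALLOWED_SEVERITIES = {"critical", "error", "warning", "info"}


def build_event(routing_key, alert_name, summary, severity="error", source="snowflake"):
    """Build a 'trigger' event for a data pipeline alert (hypothetical helper)."""
    if severity not in ALLOWED_SEVERITIES:
        raise ValueError(f"unsupported severity: {severity}")
    return {
        "routing_key": routing_key,  # integration key of the PagerDuty service
        "event_action": "trigger",
        # Reusing one dedup_key per alert groups repeat firings into one incident
        "dedup_key": f"{source}:{alert_name}",
        "payload": {
            "summary": summary,
            "source": source,
            "severity": severity,
        },
    }


def send_event(event):
    """POST the event to PagerDuty and return the HTTP status code."""
    request = urllib.request.Request(
        EVENTS_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Assumed example values; replace the key before calling send_event(event)
event = build_event(
    "YOUR_INTEGRATION_KEY",
    "orders_load_stale",
    "No rows loaded into ORDERS in the last hour",
)
```

Calling `send_event(event)` with a real integration key would open or update an incident on the matching service. Keeping the `dedup_key` stable per alert is one simple way to keep repeated signals from the same pipeline problem together, in the spirit of the signal correlation Manu describes.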
The discussion then moves to the growing adoption of LLMs and the importance of DataOps maturity as organizations take on more advanced data processes. The host emphasizes the significance of having a robust DataOps foundation for building resilient LLM architectures. James discusses Snowflake’s recent announcements and expresses excitement about features such as developing ML models, Snowpark Container Services, and Native Apps on the Snowflake Data Cloud. He highlights the enthusiasm among customers for exploring unstructured data and extracting value from it. The conversation delves into specific use cases, including customer data sharing through containerized data services. Manu shares PagerDuty’s interest in customer data sharing and discusses the potential for embedding complex logic in container models.
“We [are looking] at using the containerized data service to share data with other [Snowflake] customers. We are able to insert some of the […] complex logic that is in PagerDuty, so we are able to embed inside the data models and share that with our customers.”
The interview concludes with reflections on the evolving landscape of data operations (DataOps) in the context of generative AI and large language models (LLMs). The discussion emphasizes the significance of having a solid foundation in data governance, quality, and observability as organizations venture into more advanced AI architectures. The participants highlight the need for controls and tools to manage changing assumptions about data, ensuring its accuracy and relevance over time. The conversation also touches upon the challenges and urgency associated with maintaining data applications powered by real-time AI and the importance of vigilance and modern tools in this rapidly evolving space.
“It is absolutely essential that the foundational elements of maintaining that data quality and data observability—those foundational systems around data governance and security—[those] foundations have to be correct in your ecosystem.”
Watch the interview: