Kasmo

Snowflake Summit 2024: Important Announcements and Key Takeaways


Exploring Snowflake Summit 2024: A Preview 

It is inspiring to see the pace at which Snowflake keeps evolving to help customers and organizations globally. Snowflake’s Data Cloud Summit is organized for developers and organizations so that they can learn how Snowflake optimizes and streamlines their operations. This year was no exception. Let us dive into the key announcements at Snowflake Summit 2024.

Inside Look at Snowflake Summit 2024: What’s in Store 

Here are some of the highlights announced at Snowflake Summit 2024:

In Enterprise AI and ML:

  • Document AI: A serverless, LLM-based feature that extracts structured data from unstructured documents such as invoices and contracts.

  • Copilot: An LLM-based, ad hoc text-to-SQL generator built into the Snowflake UI.

  • Cortex Analyst: A serverless LLM service that gives business users accurate answers to their questions and can be used by chatbots.

  • Cortex Playground: UI-driven LLM comparison and test tool. 

  • Cortex Cross Region Support: Routes LLM inference to the nearest Snowflake region that has the necessary models and GPUs, so workloads are not limited by what is available in the account’s own cloud or region.

  • Cortex Guard: A safety feature developed with Meta’s Llama Guard, providing enterprises with advanced filtering capabilities to prevent unwanted outputs from their applications. 

  • Serverless LLM Fine Tuning Service (Public Preview): Enables UI-based fine-tuning of foundation LLMs, producing smaller, more precise models that need shorter prompts and therefore cost less to run.

  • New LLM Embedding Models: Introduces innovative embedding models for LLMs. 

  • Notebooks: Integrated within the Snowflake UI, these notebooks support building and scheduling ML models and data pipelines using Python and SQL, complete with Git integration.

  • Snowsight AI & ML Studio: A user-friendly wizard interface for constructing various ML and AI models and pipelines. 

  • Snowpark Pandas: Lets users write ML code with familiar pandas DataFrames that run in parallel on Snowflake compute, overcoming the data volume and performance limits typical of pandas.

  • Cortex ML Functions: Provides a suite of ready-to-use ML functions for forecasting, anomaly detection, and classification, designed for users without a data science background. 

  • ML FeatureStore: A tool for creating, managing, and serving ML features with automated refresh capabilities on batch or streaming data.
     
  • ML Model Registry: A comprehensive solution for managing, tracking, versioning, and sharing AI/ML models within Snowflake. 

  • Git Integration: Facilitates collaboration and management of Snowflake assets linked to a Git repository. 

  • Automated & Declarative CI/CD Pipelines: Simplifies the management and execution of Snowflake objects via Python APIs and CLI tools. 

  • Observability with Snowflake Trail: Offers a new logging service and capabilities for developers to monitor and troubleshoot their code and compute utilization. 

In Data Engineering, Governance, Security & General Features:

  • Dynamic Tables (GA): A SQL-based method for creating incremental transformations and simple data pipelines. 

  • Iceberg Tables (GA): An open-source table format that stores data in Parquet format within customer-owned object stores. 

  • Polaris Iceberg Catalog: An open-source catalog for Iceberg tables, supporting read/write operations and user access controls. 

  • Horizon Access: A suite of tools for data discovery, including Universal Search, Internal Marketplace, Organizational Listings & Profiles, and Object Insights Interface. 

  • Horizon Privacy: Implements a differential privacy policy to protect sensitive data during aggregate queries. 

  • Compliance: Features Data Quality Monitoring and Lineage Visualization Interfaces for both data and ML assets. 

  • Horizon Security: The Trust Center provides a UI for identifying and mitigating security risks. 
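To make the Horizon Privacy item above more concrete, here is a minimal sketch of the general idea behind differential privacy for an aggregate query: calibrated random noise is added to the result so that no single row can be inferred from it. This is the textbook Laplace mechanism, not Snowflake’s implementation, whose details this article does not cover. The function names, the `epsilon` parameter, and the sample data are all assumptions made for illustration.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with mean `scale`
    # follows a Laplace distribution centred on 0 with that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(rows, predicate, epsilon, rng):
    """Return a noisy count of the rows matching `predicate` (illustrative helper)."""
    true_count = sum(1 for row in rows if predicate(row))
    # Adding or removing one row changes a count by at most 1 (sensitivity 1),
    # so the noise scale is 1 / epsilon. Smaller epsilon means more noise.
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Hypothetical sample data: salaries for a small team.
salaries = [52_000, 61_000, 75_000, 88_000, 93_000, 120_000, 135_000]
rng = random.Random(42)  # seeded only so the sketch is repeatable

noisy = private_count(salaries, lambda s: s > 90_000, epsilon=0.5, rng=rng)
print(f"true count = 3, noisy count = {noisy:.2f}")
```

Any single answer is slightly off, but averages over many queries stay close to the truth. That trade-off is why this kind of protection suits aggregate queries rather than row-level lookups.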

Snowflake Notebooks 

Snowflake Notebooks is a recent Snowflake platform enhancement, introduced on 6th June 2024. It is an interactive environment with an easy-to-use interface that allows for an effortless blend of Python, SQL, and Markdown. Key Snowflake offerings such as Snowpark ML, Streamlit, Cortex, and Iceberg Tables can be integrated with Snowflake Notebooks effortlessly.

Snowflake Notebooks simplifies and enhances an organization’s data engineering, analytics, and ML workflows.

Snowpark Pandas API 

The Snowpark pandas API runs pandas code on data inside Snowflake; users only need to change the import statement and a few lines of code. This means users can enjoy the familiar pandas experience with the added benefits of Snowflake’s scalability and security.

Let’s break down the features of the Snowpark pandas API:

  1. Use pandas in Snowflake: Users can keep using the pandas commands they already know while working on large datasets stored in Snowflake.

  2. Work with big data easily: There’s no need to learn new tools to work with larger datasets. This avoids the time and expense of porting pandas pipelines to other big data frameworks.

  3. Turns pandas into SQL: Snowpark pandas translates pandas code into SQL, the language Snowflake understands. As a result, the work stays inside Snowflake and benefits from its data governance and security.
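The third point can be illustrated with a toy sketch. It is not Snowpark pandas itself and not how Snowflake implements it. The `LazyFrame` class, its methods, and the `orders` table are hypothetical, and SQLite from the Python standard library stands in for Snowflake so the example runs anywhere. It only shows the general pattern: DataFrame-style calls are recorded, turned into a SQL statement, and executed where the data lives.

```python
import sqlite3


class LazyFrame:
    """Toy DataFrame stand-in that records operations instead of running them."""

    def __init__(self, table, where=None, params=()):
        self.table = table
        self.where = where
        self.params = tuple(params)

    def filter_gt(self, column, value):
        # Record the filter; nothing is executed yet.
        clause = f"{column} > ?"
        where = clause if self.where is None else f"{self.where} AND {clause}"
        return LazyFrame(self.table, where, self.params + (value,))

    def groupby_sum(self, key, value):
        # Compile the recorded operations into a single SQL statement.
        # Column and table names are trusted here because this is a toy.
        where = f" WHERE {self.where}" if self.where else ""
        sql = (
            f"SELECT {key}, SUM({value}) FROM {self.table}{where} "
            f"GROUP BY {key} ORDER BY {key}"
        )
        return sql, self.params


# An in-memory SQLite database plays the role of the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [("EU", 120.0), ("EU", 30.0), ("US", 200.0), ("US", 80.0)],
)

sql, params = LazyFrame("orders").filter_gt("amount", 50).groupby_sum("region", "amount")
rows = conn.execute(sql, params).fetchall()
print(sql)
print(rows)  # [('EU', 120.0), ('US', 280.0)]
```

The data never leaves the database: only the generated SQL goes in and the aggregated rows come back. In actual Snowpark pandas code, the import change mentioned above is, per Snowflake’s documentation, `import modin.pandas as pd` together with `import snowflake.snowpark.modin.plugin`, run against an active Snowpark session.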

Introducing Snowflake Horizon: Latest Announcements 

Snowflake Horizon is Snowflake’s suite of features covering data governance, security, discoverability, and privacy. There were some noteworthy announcements related to Horizon:

  1. Universal Search: Universal Search is now Generally Available. Users can search across everything in Snowflake, from internal data to the Marketplace. It is powered by search engine technology from Neeva, which Snowflake acquired in 2023.

  2. Table Governance Tab: A new Governance tab on the table page shows how a table is being used, including its top queries. It is available in Public Preview.

  3. Table Lineage: Available in Private Preview on the table page, it displays upstream and downstream dependencies for views and tables. Lineage is a popular capability in the data catalog and observability tools available to Snowflake customers.

Conclusion 

To sum up, Snowflake Summit 2024 brought some exciting announcements and updates. Snowflake is constantly adding and improving features that make the platform easier to use. These features save users the time and cost of buying third-party tools to operate their data warehouse. Snowflake’s investment in AI and ML will undoubtedly have an impact on its revenue in the coming years.

As a proud partner of Snowflake, we have helped multiple businesses streamline their architecture and eliminate data silos. If you want to know more about our services, contact us!
