Goodbye, ETL chaos! Meet on-demand GenAI-ready data products

By embracing a semantic, federated approach, organisations can create Generative AI-ready data products that are both agile and aligned with modern data demands.

Thursday December 12, 2024 , 6 min Read

Over the past decade, enterprises have significantly advanced their data management architectures to keep up with the three V's of big data: Velocity, Variety, and Volume. Moving from traditional Data Warehouses (DWH) to Data Lakes and, more recently, to Lakehouses, organisations have attempted to create flexible, scalable solutions to meet the needs of modern data-driven decision-making. Yet, as data repositories grow in complexity and volume, companies increasingly find themselves dealing with inefficiencies that turn these powerful repositories into chaotic "data swamps".

To address these issues, many companies have turned to Data Catalogs to organise and create consumable data products. However, limitations in that approach, such as poor language comprehensibility, limited analytical capabilities, and insufficient business context, have prevented catalogs from fully meeting enterprise expectations. How, then, can semantic model-driven federated query generation and AI-ready data products address these challenges in data management and integration? First, let us analyse why data catalogs fall short of expectations:

  1. Limited business metadata integration: Most data catalogs index only technical and operational metadata, missing critical business context. This forces data stewards to annotate metadata manually, a time-intensive process, and leaves users struggling to find relevant data without deep domain expertise, ultimately limiting business value.
  2. No intuitive “data shopping” experience: Data catalogs lack user-friendly interfaces that allow seamless browsing, personalised selection, and instant access to data products. Without such a “Data Product as a Service” model, enterprises remain dependent on complex, multi-stage transformations such as the medallion architecture.
  3. Rigid data transformation frameworks: Pre-defined schemas in traditional models increase redundancy and limit flexibility, making it hard to adapt to changing business needs. Agile, adaptable solutions are essential for today’s fast-paced environments.

As data demands evolve, enterprises need scalable, user-friendly solutions that simplify access to data products without the constraints of traditional frameworks.

The rise of semantic model-driven federated query generation

Semantic model-driven query generation is gaining traction in data management, with tools such as dbt (from dbt Labs) enabling data teams to build, document, and manage semantic models for greater accessibility and usability across businesses.

However, these advances still face challenges, particularly the heavy reliance on skilled technical talent and manual efforts. Data engineers spend significant time designing and maintaining semantic models to accurately represent complex relationships, slowing down the data-to-insight pipeline and limiting scalability and agility in dynamic business environments.

LegoAI’s approach: Reducing dependency on manual effort with AI-driven semantic models

LegoAI’s transformative solution mitigates these challenges by leveraging Machine Learning (ML), Large Language Models (LLMs), and Knowledge Graphs to automate large parts of the semantic modelling process. By allowing AI to take on the bulk of the initial groundwork, LegoAI drastically reduces the manual labour traditionally required for semantic modelling. Here’s how it works:

  1. AI-generated semantic models: LegoAI generates a foundational semantic model directly from raw data. This model is enriched with AI-generated business glossaries and mappings to business-use-case ontologies. This step minimises manual intervention and provides a robust starting point that accelerates the entire semantic modelling process.
  2. Decoupling structural data changes from business knowledge: Rather than adhering to a rigid, pre-defined model, this approach decouples data assets from their connected business knowledge, so that changes to the underlying data structures do not invalidate that knowledge. By creating physical versions only on demand, LegoAI minimises redundancies and maximises flexibility.
  3. Validation rather than creation: While the AI-generated semantic model still requires validation by a domain-experienced data modeller, it eliminates much of the groundwork that would otherwise demand technical expertise. This shift allows data teams to focus on fine-tuning and validating the model rather than building it from scratch, enabling a more efficient, scalable, and agile approach.
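
The three steps above can be illustrated with a minimal sketch. It is not LegoAI's implementation; the class names, fields, and sample mappings are all hypothetical. It only shows the idea that business terms stay stable while their physical locations may change, and that AI-proposed mappings remain drafts until a data modeller validates them.

```python
from dataclasses import dataclass, field


@dataclass
class TermMapping:
    """Links one business term to a physical column (all names hypothetical)."""
    business_term: str
    source: str               # logical name of the source system
    table: str
    column: str
    validated: bool = False   # set by a human data modeller, not by the AI


@dataclass
class SemanticModel:
    mappings: dict[str, TermMapping] = field(default_factory=dict)

    def propose(self, mapping: TermMapping) -> None:
        """Record an AI-proposed mapping; it stays a draft until validated."""
        self.mappings[mapping.business_term] = mapping

    def validate(self, business_term: str) -> None:
        """A domain expert approves an existing draft mapping."""
        self.mappings[business_term].validated = True

    def remap(self, business_term: str, table: str, column: str) -> None:
        """Absorb a structural change: the business term is unchanged, only its
        physical location moves, and the mapping is queued for re-validation."""
        m = self.mappings[business_term]
        m.table, m.column, m.validated = table, column, False

    def usable_terms(self) -> list[str]:
        """Only validated mappings are exposed to data consumers."""
        return sorted(t for t, m in self.mappings.items() if m.validated)


model = SemanticModel()
model.propose(TermMapping("customer name", "crm", "cust_master", "cust_nm"))
model.propose(TermMapping("order value", "erp", "ord_hdr", "tot_amt"))
model.validate("customer name")
print(model.usable_terms())
```

In this sketch the modeller's work is limited to calling `validate`, which mirrors the shift from creating the model to reviewing it.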

By addressing the manual and expertise-intensive aspects of semantic modelling, LegoAI is pushing semantic model-driven federated query generation closer to a future where data management is accessible, flexible, and ready to meet the needs of Generative AI applications.

Federated query generation for on-demand GenAI-ready data products

One of the most transformative applications of LegoAI’s semantic model-driven approach is its ability to facilitate federated query generation, reshaping how enterprises access and use their data. By employing federated queries, organisations can seamlessly pull data from multiple sources in real time, without the need for complex data migrations or transformations. This approach is particularly advantageous for GenAI-ready data products, where highly connected, curated, up-to-date, relevant data is essential for model accuracy and performance.

LegoAI’s system leverages AI-driven semantic models that, once validated and integrated into a Knowledge Graph, serve as the foundation for data retrieval. Using a proprietary query generation algorithm, LegoAI dynamically generates federated queries compatible with Spark SQL. This streamlines data access across sources, because users do not need to know the intricacies of each source system.
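
LegoAI's query generation algorithm is proprietary and not described here, so the following is only a rough sketch of the general idea: resolve the requested business terms against a semantic model, then emit a single SQL statement that joins tables living in different source systems. The catalog names (`crm`, `erp`), table and column names, and the join metadata are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    table: str    # fully qualified table name, e.g. catalog.schema.table
    column: str


# Hypothetical semantic model: business term -> physical column.
SEMANTIC_MODEL = {
    "customer_name": Column("crm.sales.customers", "full_name"),
    "order_total": Column("erp.finance.orders", "total_amount"),
}

# Hypothetical join knowledge: (table A, table B) -> (key in A, key in B).
JOINS = {
    ("crm.sales.customers", "erp.finance.orders"): ("id", "customer_id"),
}


def _join_keys(left: str, right: str):
    """Look up join keys in either direction; return (left_key, right_key)."""
    if (left, right) in JOINS:
        return JOINS[(left, right)]
    if (right, left) in JOINS:
        right_key, left_key = JOINS[(right, left)]
        return left_key, right_key
    return None


def generate_query(terms: list[str]) -> str:
    """Build one SELECT that spans every source the requested terms touch."""
    cols = [SEMANTIC_MODEL[t] for t in terms]           # KeyError on unknown terms
    tables = list(dict.fromkeys(c.table for c in cols))  # unique, order preserved
    alias = {tbl: f"t{i}" for i, tbl in enumerate(tables)}
    select = ", ".join(
        f"{alias[c.table]}.{c.column} AS {t}" for t, c in zip(terms, cols)
    )
    sql = f"SELECT {select} FROM {tables[0]} AS {alias[tables[0]]}"
    joined = [tables[0]]
    for table in tables[1:]:
        for left in joined:
            keys = _join_keys(left, table)
            if keys:
                sql += (f" JOIN {table} AS {alias[table]}"
                        f" ON {alias[left]}.{keys[0]} = {alias[table]}.{keys[1]}")
                break
        else:
            raise ValueError(f"no known join path to {table}")
        joined.append(table)
    return sql


print(generate_query(["customer_name", "order_total"]))
```

The caller names only business terms; which system each column lives in, and how the tables join, comes entirely from the model. A real generator would also need filters, aggregations, and multi-hop join paths, which this sketch leaves out.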

Benefits for on-demand Generative AI-ready data products

  • No-code, instant data products: Federated query generation enables data products to be created instantly based on user requests, with no technical expertise required. Users can retrieve tailored data products through a simple, no-code interface, making data accessible to all stakeholders.
  • Real-time data access and flexibility: With federated queries, data is accessed directly from source systems in real time, eliminating the need for data transfers. This approach ensures AI applications operate with the freshest data and allows businesses to adapt quickly to new insights without the delays of traditional data pipelines.
  • Reduced data transformation time: LegoAI’s platform pulls only relevant data directly from source systems, minimising time-consuming transformations. This on-demand access streamlines data operations, reducing pipeline bottlenecks and enabling faster AI model iterations and deployments.
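
To make the "query at the source, on request" idea concrete, here is a small runnable stand-in. It uses Python's built-in `sqlite3` module with two attached in-memory databases in place of real source systems and a Spark SQL engine, so it only imitates federation. The database names, tables, and rows are invented for the example.

```python
import sqlite3

# One connection with two separate in-memory databases attached,
# standing in for two independent source systems.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS crm")
conn.execute("ATTACH DATABASE ':memory:' AS erp")

conn.execute("CREATE TABLE crm.customers (id INTEGER, full_name TEXT)")
conn.execute("CREATE TABLE erp.orders (customer_id INTEGER, total_amount REAL)")
conn.executemany("INSERT INTO crm.customers VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ravi")])
conn.executemany("INSERT INTO erp.orders VALUES (?, ?)",
                 [(1, 120.0), (1, 80.0), (2, 50.0)])


def data_product(connection: sqlite3.Connection) -> list[tuple]:
    """Compute the product at request time; nothing is copied or stored."""
    return connection.execute(
        """
        SELECT c.full_name, SUM(o.total_amount)
        FROM crm.customers AS c
        JOIN erp.orders AS o ON o.customer_id = c.id
        GROUP BY c.full_name
        ORDER BY c.full_name
        """
    ).fetchall()


print(data_product(conn))

# A new order lands in the "ERP" source; the next request reflects it
# immediately because no intermediate copy has to be refreshed.
conn.execute("INSERT INTO erp.orders VALUES (2, 25.0)")
print(data_product(conn))
```

Because the result is derived from the sources on each call, there is no pipeline to rerun when the underlying data changes. The trade-off, which this sketch does not address, is that query latency then depends on the source systems.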

Impact on data management

  • Reduced maintenance overhead: By minimising the need for manual data curation and transformation, LegoAI’s approach frees up resources for more strategic tasks.
  • Improved responsiveness to business needs: The ability to generate federated queries directly from semantic models enables faster, more relevant data retrieval aligned with real-time business requirements.
  • Enhanced flexibility and agility: With on-demand data access, organisations can pivot and respond to new business questions and needs without being constrained by the limitations of traditional ETL/ELT pipelines.

Empowering vs. renovating enterprise data kitchens with LegoAI

Think of enterprise data platforms as bustling "Data Kitchens," where data teams are master "chefs" creating Data Products for their "guests"—the Business Teams. These guests expect not just data but a tailored, on-demand experience, whether it’s on the menu or a custom creation. The chefs’ mission? To deliver data products that delight, instantly and seamlessly.

LegoAI powers these Data Kitchens with flexibility at its core. For enterprises heavily invested in existing kitchens, LegoAI enhances them with AI-driven capabilities like automated semantic model creation, business glossary generation, and conversational AI, making their chefs superhuman. For organisations ready to reimagine their kitchens entirely, LegoAI accelerates a full renovation journey, delivering a hyper-modern, AI-first foundation.

Whether it’s empowering existing kitchens or building state-of-the-art ones, LegoAI ensures every chef can serve a world-class experience. After all, it’s not just about cooking data products that help you run your business; it’s about creating unforgettable experiences with contextual and prescriptive insights that change the way you do business.

Cohort 13 of NetApp Excellerator

The NetApp Excellerator program has been an invaluable platform, providing a testbed for our disruptive technology. With NetApp’s support and mentorship, LegoAI demonstrated the power of our product through real enterprise use cases. LegoAI continues to explore synergies across various applications within NetApp and other enterprise clients.