Data Management Requirements for Analytic Data in the Service of AI: It’s Way Trickier than You Thought

Multiple recent surveys show that a high percentage of enterprises need to develop solutions that incorporate artificial intelligence, yet a low percentage feel that they are ready for AI. The same studies identify data as an area of low readiness. In addition, I suspect that organizations and their data are likewise unprepared for techniques closely related to AI, such as GenAI, ML, LLMs, RAG, and other forms of advanced analytics (predictive analytics, statistics, single views).

Hence, organizations should not make a beeline for AI and GenAI. They must first (or simultaneously) determine the data requirements for the kinds of AI-driven analytics applications they hope to build and deploy. And that’s the rub. Managing data specifically for AI and other advanced analytics is way trickier than you thought.

This presentation helps you sort out:

• The highest-priority requirements for data and its management in the context of AI

• When you truly need some form of AI versus simpler and cheaper types of analytics

• Why data requirements for GenAI and predictive AI are different

• Types of cleansing and transformation that are good for analytic data quality vs. other types that reduce analytic impact

• Building appropriate bias into analytic datasets, yet prohibiting inappropriate bias from profiles and models

• A detailed look at the lifecycle stages of machine learning, and why each has different data requirements

• Best practices for LLMs, especially large models vs. small models

• The role of RAG in getting just the right collection of data in just the right condition

• Collecting production data to help monitor model drift

• Approaches to cloud storage, data architecture, and data modeling that help to enable the above

To catch the replay, click here for the full analysis.

Author: Philip Russom
