Why Robust Data Strategy Is the Lifeline of Gen AI

Mai-Lan of AWS on Data Challenges in Gen AI Implementation
Mai-Lan Tomsen Bukovec, VP of technology, AWS

Deploying generative AI necessitates a meticulous and well-implemented data strategy. The success of gen AI hinges on large sets of clean, readily accessible, and well-organized data, making cloud infrastructure indispensable for achieving scalability, flexibility and agility.

In an exclusive interview with ISMG, Mai-Lan Tomsen Bukovec, VP of technology at AWS, decoded CIO insights and strategies for navigating the data challenges in gen AI.

Why is data strategy critical for the success of gen AI? What should be an appropriate approach?

My discussions with CIOs revolve around data strategy because I run the Amazon S3 business, which hosts the world's largest data lakes. Digital transformation, including gen AI, relies on large sets of clean, easily retrievable and structured data. An on-premises infrastructure lacks the scalability, flexibility and agility necessary for this. Therefore, cloud infrastructure becomes crucial. Applying machine learning to analyze large datasets requires high compute power, which only a cloud infrastructure can provide. Most gen AI use cases, such as predictive analytics or financial fraud detection, demand this level of compute power.

What insights are you gaining from CIOs about gen AI, and how are they implementing a data strategy for it?

CIOs invariably express discomfort with conducting gen AI experiments outside their established data boundaries. They prioritize adherence to existing business protocols over venturing into more consumer-centric approaches. This caution means data access stays restricted to the available dataset, irrespective of the application's potential.

The top three initiatives as part of CIOs' data strategy for gen AI are:

  • Utilizing Internal Data Sets: CIOs are customizing gen AI models with their own high-quality datasets rather than relying solely on generalized models trained on internet data. They focus on leveraging existing high-quality data on platforms such as Amazon S3, known for its reliability. Moreover, they aim to repurpose data already used for business applications such as analytics, fraud detection and machine learning for gen AI. The intention is not to generate new data but to optimize and reuse established, high-quality business data from their data lakes for AI customization.
  • No New Architectures: Rather than creating entirely new frameworks for gen AI, CIOs prefer extending their current architecture. While certain components, such as the vector data store and foundation models, may be new, the emphasis is on keeping workloads where they already run, such as Amazon EKS (Elastic Kubernetes Service) or other Kubernetes clusters. CIOs aim to leverage familiar technologies and extend their current infrastructure rather than adopt new systems, primarily due to skillset considerations.
  • Leveraging Existing Expertise: Developers prefer to integrate new gen AI capabilities into their known data architecture while capitalizing on their existing expertise. Hyperscalers such as AWS are sought after to augment familiar data architecture with new gen AI capabilities. AWS, in response, integrates vector capabilities into familiar services such as OpenSearch and Aurora PostgreSQL, allowing semantic and full-text searches combined with vector embeddings within the same database. The goal is a customer-centric approach, offering choices through Amazon Bedrock, such as foundation models from Cohere, Meta, Anthropic and Amazon's Titan family, so customers can seamlessly integrate existing resources into gen AI applications.
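The semantic search described above ranks documents by the similarity of their vector embeddings to a query embedding. A minimal sketch of that ranking step in plain Python follows; the document names and three-dimensional embedding values are hypothetical (real embedding models produce hundreds or thousands of dimensions), and in practice the vectors would live in a service such as OpenSearch or Aurora PostgreSQL rather than a Python list.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity: dot(a, b) / (|a| * |b|)
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def semantic_search(query_vec, corpus, top_k=2):
    # Rank (document, embedding) pairs by similarity to the query vector
    # and return the top_k document names.
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]

# Hypothetical corpus with made-up 3-dimensional embeddings.
corpus = [
    ("fraud detection runbook", [0.9, 0.1, 0.0]),
    ("quarterly sales summary", [0.1, 0.8, 0.2]),
    ("suspicious transaction alert guide", [0.7, 0.3, 0.2]),
]

# Hypothetical embedding of a query like "how do I flag fraud?"
query = [0.88, 0.15, 0.05]
print(semantic_search(query, corpus))
```

A production system would combine this semantic ranking with conventional full-text filters in the same query, which is the appeal of adding vector support to an existing database rather than standing up a separate store.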

Regarding the early uses of gen AI in enterprises, where do you see the most traction among the CIOs?

The traction of gen AI in enterprises is primarily observed in two distinctive categories, both gaining considerable interest and adoption among CIOs:

  • Back-Office Gen AI Applications: CIOs are witnessing significant traction in employing gen AI for back-office applications, focusing on enhancing productivity. These applications aim to improve efficiency and effectiveness across different business functions. For instance, gen AI is utilized to assist office workers, customer support representatives, engineers and other professionals in performing their tasks more efficiently. Common applications include generating text summaries, automating document creation and facilitating various aspects of workflow management. These tools help in streamlining processes, saving time and improving overall productivity within the organization.
  • Complex Code Generation: Another area gaining notable traction among CIOs is the application of gen AI for complex code generation. An exemplary tool in this domain is Amazon Q, which enables rapid code creation. Using a generalized understanding of coding syntax, these tools generate code in minutes. What distinguishes this application is its adaptability: businesses can tailor generated code to their specific requirements. This significantly expedites software development, improves code quality and fosters innovation in software engineering.

You mentioned that CIOs tend to be conservative about altering data architecture. Does it potentially limit the capability of AI?

I was particularly referring to architectures that are already on the cloud. If a customer lacks cloud-based data architectures, they swiftly move in that direction. This entails a data lake, warehouse and production databases, which serve as essential repositories for gen AI applications.

Bukovec has been an engineering and product leader for AWS storage and compute services since 2010. Prior to joining Amazon, she spent nine years in engineering and product leadership roles at Microsoft, as well as three years at early-stage startups.


About the Author

Rahul Neel Mani

Founding Director of Grey Head Media and Vice President of Community Engagement and Editorial, ISMG

Neel Mani is responsible for building and nurturing communities in both technology and security domains for various ISMG brands. He has more than 25 years of experience in B2B technology and telecom journalism and has worked in various leadership editorial roles in the past, including incubating and successfully running Grey Head Media for 11 years. Prior to starting Grey Head Media, he worked with 9.9 Media, IDG India and Indian Express.
