Google Cloud customers are increasingly moving their generative AI workloads from proof-of-concept into production, and are now seeing real-world business impact from their AI investments. Many of these customers have worked with Google Cloud Consulting to apply AI in important and helpful ways. For example, Bristol Myers Squibb developed a new AI-powered interface to help its clinical study teams more easily find important information and generate documents. Palo Alto Networks launched several new AI tools that use Gemini to streamline and enhance the user experience in its copilots, improving the productivity of security practitioners.

Moving these workloads into production requires a deep understanding of generative AI systems design, large language model architectures, prompt engineering, evaluation, and much more. Now, we’re bringing Google’s expertise in these areas to our customers at scale, with the launch of a new service offering: Generative AI Ops. This new offering, delivered either by Google Cloud Consulting or through our comprehensive partner ecosystem, will help organizations mature their gen AI prototypes into production-grade solutions and provide support in important areas like security, model tuning and feedback, and optimization.

With the launch of Generative AI Ops, Google Cloud now offers customers both an open and optimized technology stack for building AI, and a comprehensive set of services to support customers at every stage of their AI transformations — from exploration to production.


The new Generative AI Ops services offering moves customers through the steps required to make AI applications production-ready. These include:

  • Prompt engineering, design, and optimization: Designing well-optimized prompts is important to ensure models can provide high-quality outputs and build user trust. Using best practices for prompt engineering, and techniques such as ReAct, retrieval augmented generation (RAG), and chain of thought, Google Cloud Consulting can help customers build solutions to improve the performance of their gen AI applications and the outputs of models. Importantly, different models are often suited to different use cases, and each of these models may require different prompting structures. Our expert teams will also help customers apply the right model to the right use case, and the right prompting technique to the right model.
  • Performance and system evaluation: Successfully putting AI into production requires constant evaluation and feedback to improve the performance of models and applications. This services offering helps customers design and deploy an evaluation framework tailored to their applications, and build mechanisms for automated evaluation using tools like AutoSxS and GenAI Eval, for human evaluation, and for hybrid approaches that combine the two.
  • Model optimization and continuous tuning: Once a framework for performance and system evaluation is in place, gen AI applications and models still require continuous tuning and optimization. Gen AI Ops provides solutions and managed services for optimizing and tuning models based on human feedback and benchmarking. This includes improving system architecture and model selection, reducing latency and costs, and incorporating the latest APIs and available tools, such as LangChain or DIY orchestrators for building and orchestrating AI agents, to ensure applications run optimally.
  • Monitoring and observability: Having a robust monitoring solution in place is critical to ensuring AI applications are production-ready. Google Cloud Consulting can help customers build observability solutions to constantly monitor the operations and performance of their gen AI applications on a wide variety of factors, like model accuracy and hallucination, latency, throughput, hardware utilization, model drift, traffic, and costs.
  • Business integration and testing: It is critical that a customer’s applications and models perform well in real-world scenarios and integrate well with their business processes. Google Cloud Consulting can help customers through the careful planning required to achieve this, including setting up a scalable and secure environment on Google Cloud, designing APIs to efficiently manage interactions with various models, and implementing rigorous unit, integration, and load testing to evaluate their models’ performance under various conditions.
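
To make the evaluation and monitoring steps above more concrete, here is a minimal sketch of what an automated evaluation loop might look like. It is an illustration only, not part of the Generative AI Ops offering and not how AutoSxS or GenAI Eval work: the names `EvalCase`, `keyword_score`, `evaluate`, and `fake_model` are hypothetical, the keyword-match metric is a simple stand-in for a real quality metric, and `fake_model` is a placeholder for an actual model call.

```python
import statistics
import time
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    """One evaluation example (hypothetical structure for this sketch)."""
    prompt: str
    required_terms: List[str]  # terms a good answer is expected to mention


def keyword_score(answer: str, required_terms: List[str]) -> float:
    """Return the fraction of required terms found in the answer (case-insensitive)."""
    if not required_terms:
        return 1.0
    text = answer.lower()
    hits = sum(1 for term in required_terms if term.lower() in text)
    return hits / len(required_terms)


def evaluate(model: Callable[[str], str], cases: List[EvalCase]) -> dict:
    """Run every case through the model, recording a quality score and latency."""
    scores, latencies = [], []
    for case in cases:
        start = time.perf_counter()
        answer = model(case.prompt)
        latencies.append(time.perf_counter() - start)
        scores.append(keyword_score(answer, case.required_terms))
    return {
        "num_cases": len(cases),
        "mean_score": statistics.mean(scores),
        "max_latency_s": max(latencies),
    }


def fake_model(prompt: str) -> str:
    """Placeholder for a real model call; always returns the same answer."""
    return "Paris is the capital of France."


if __name__ == "__main__":
    cases = [
        EvalCase("What is the capital of France?", ["Paris"]),
        EvalCase("What is the capital of Spain?", ["Madrid", "Spain"]),
    ]
    print(evaluate(fake_model, cases))
```

In practice, the scoring function would be swapped for a model-based or human-in-the-loop metric, and the resulting report would feed a monitoring dashboard that tracks quality, latency, and cost over time.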

Train and enable customer teams

On top of the business planning and technical steps required to bring AI applications into production, training and team enablement are also critical priorities for customers who want their cloud deployments to succeed. Through the Google Cloud Skills Boost platform, Google Cloud offers a broad range of training courses, hands-on labs, bootcamps, and coursework to upskill teams on generative AI, so that customer teams can build, deploy, use, and manage new AI applications.

Get started

Ready to learn more? Discover how Google Cloud Consulting can help you learn, build, operate and succeed.