Introduction to Building Real-World AI Applications with Gemini and Imagen
1. BUILD REAL WORLD AI APPLICATIONS WITH GEMINI AND IMAGEN
Complete the introductory Build Real World AI Applications with Gemini and Imagen skill badge to demonstrate skills in the following: image recognition, natural language processing, image generation using Google’s powerful Gemini and Imagen models, and deploying applications on the Vertex AI platform.
Objective
Generative AI on Vertex AI (also known as genAI or gen AI) gives you access to Google’s large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications. In this lab, you will:
- Connect to Vertex AI (Google Cloud AI platform): Learn how to establish a connection to Google’s AI services using the Vertex AI SDK.
- Load a pre-trained generative AI model (Gemini): Discover how to use a powerful, pre-trained AI model without building one from scratch.
- Send image + text questions to the AI model: Understand how to provide input for the AI to process.
- Extract text-based answers from the AI: Learn to handle and interpret the text responses generated by the AI model.
- Understand the basics of building AI applications: Gain insights into the core concepts of integrating AI into software projects.
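The first objective, connecting to Vertex AI, comes down to passing a project ID and a region to vertexai.init. The sketch below is one way to collect those two values before calling it. The environment variable names and the helper name resolve_settings are assumptions made for illustration, not something the lab prescribes. The helper rejects the unedited placeholders (project-id, your-project-id, REGION) that appear in the lab snippets.

```python
import os
from typing import Mapping, Optional, Tuple

# Placeholder values that appear in the lab snippets and must be replaced.
PLACEHOLDERS = {"project-id", "your-project-id", "REGION"}


def resolve_settings(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (project_id, location) read from environment-style settings.

    The variable names GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_REGION are
    assumptions made for this sketch.
    """
    env = os.environ if env is None else env
    # Strip whitespace and any stray quote characters left over from copy/paste.
    project_id = env.get("GOOGLE_CLOUD_PROJECT", "").strip().strip("'\"")
    location = env.get("GOOGLE_CLOUD_REGION", "").strip().strip("'\"")
    for name, value in (("project", project_id), ("region", location)):
        if not value or value in PLACEHOLDERS:
            raise ValueError(f"Set a real {name} before calling vertexai.init")
    return project_id, location


if __name__ == "__main__":
    # Example values only; substitute the ones shown in your lab panel.
    settings = {
        "GOOGLE_CLOUD_PROJECT": "my-lab-project",
        "GOOGLE_CLOUD_REGION": "us-central1",
    }
    project_id, location = resolve_settings(settings)
    print(project_id, location)
```

The returned pair can then be passed on as vertexai.init(project=project_id, location=location).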
Working with Vertex AI Python SDK
After starting the lab, you will get a split pane view consisting of the Code Editor on the left side and the lab instructions on the right side. Follow these steps to interact with the Generative AI APIs using Vertex AI Python SDK.
- Click File → New File to open a new file within the Code Editor.
- Copy and paste the provided code snippet into your file.
import vertexai
from vertexai.generative_models import GenerativeModel, Part


def generate_text(project_id: str, location: str) -> str:
    # Initialize Vertex AI
    vertexai.init(project=project_id, location=location)
    # Load the model
    multimodal_model = GenerativeModel("gemini-1.0-pro-vision")
    # Query the model
    response = multimodal_model.generate_content(
        [
            # Add an example image
            Part.from_uri(
                "gs://generativeai-downloads/images/scones.jpg", mime_type="image/jpeg"
            ),
            # Add an example query
            "what is shown in this image?",
        ]
    )
    return response.text


# -------- Important: Variable declaration --------
project_id = "project-id"
location = "REGION"

# -------- Call the Function --------
response = generate_text(project_id, location)
print(response)
- Click File → Save, enter genai.py in the Name field, and click Save.
- Execute the Python file by clicking the triangle icon on the Code Editor or by running the command below in the terminal within the Code Editor pane to view the output.
/usr/bin/python3 /home/student/genai.py
The image shows a table with a cup of coffee, a bowl of blueberries, and a plate of scones with blueberries on it. There are also pink flowers on the table.
Note: If you encounter a 401 error, try re-running the code.
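The note above suggests re-running the script when a 401 appears. A minimal sketch of automating that re-run follows. It assumes the failure is transient and, because the lab does not say which exception type the SDK raises for a 401, it catches Exception broadly. The helper name call_with_retries and the default delay are illustrative choices.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retries(
    fn: Callable[[], T], attempts: int = 3, delay_seconds: float = 5.0
) -> T:
    """Call fn, pausing and retrying if it raises; re-raise on the last attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return fn()
        except Exception as exc:  # Assumption: the exact SDK error type is unknown here.
            if attempt == attempts:
                raise
            print(f"Attempt {attempt} failed ({exc}); retrying in {delay_seconds}s")
            time.sleep(delay_seconds)
    raise RuntimeError("unreachable")  # Keeps type checkers satisfied.
```

In genai.py the call at the bottom could then be written as call_with_retries(lambda: generate_text(project_id, location)).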
Code Explanation
- The code snippet is loading a pre-trained AI model called Gemini (gemini-1.0-pro-vision) on Vertex AI.
- The code calls the generate_content method of the loaded Gemini model.
- The input to the method is an image URI and a prompt containing a question about the image.
- The code uses Gemini’s ability to understand images and text together. It uses the text provided in the prompt to describe the contents of the image.
Try it yourself! Experiment with different image URIs and prompt questions to explore Gemini’s capabilities.
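For the experiment suggested above, the two things that change are the image URI and the question. The sketch below wraps them in a small helper. ask_about_image and guess_image_mime_type are hypothetical names, the extension table is a small assumption rather than a full list of supported formats, and the SDK import is deferred so the MIME helper can be used where the SDK is not installed.

```python
import posixpath

# Assumed, non-exhaustive mapping from file extension to MIME type.
_MIME_BY_EXTENSION = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
}


def guess_image_mime_type(uri: str) -> str:
    """Pick a MIME type from the URI's file extension."""
    extension = posixpath.splitext(uri)[1].lower()
    try:
        return _MIME_BY_EXTENSION[extension]
    except KeyError:
        raise ValueError(f"Unrecognized image extension in {uri!r}") from None


def ask_about_image(model, image_uri: str, question: str) -> str:
    """Send one image plus one question to an already loaded GenerativeModel."""
    # Imported here so this module loads even where the Vertex AI SDK is absent.
    from vertexai.generative_models import Part

    image_part = Part.from_uri(image_uri, mime_type=guess_image_mime_type(image_uri))
    response = model.generate_content([image_part, question])
    return response.text
```

With vertexai.init already called and multimodal_model loaded as in genai.py, a call such as ask_about_image(multimodal_model, "gs://generativeai-downloads/images/scones.jpg", "List the foods in this image.") would return the model's text answer.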
2. BUILD AN AI IMAGE GENERATOR APP USING IMAGEN ON VERTEX AI
Objective
Generative AI on Vertex AI (also known as genAI or gen AI) gives you access to Google’s large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications. In this lab, you will:
- Connect to Vertex AI (Google Cloud AI platform): Learn how to establish a connection to Google’s AI services using the Vertex AI SDK.
- Load a pre-trained image generation model: Discover how to use a powerful, pre-trained AI model without building one from scratch.
- Send text to the AI model: Understand how to provide input for the AI to process.
- Extract image-based answers from the AI: Learn to handle and interpret the image responses generated by the AI model.
- Understand the basics of building AI applications: Gain insights into the core concepts of integrating AI into software projects.
Working with Generative AI
After starting the lab, you will get a split pane view consisting of the Code Editor on the left side and the lab instructions on the right side. Follow these steps to interact with the Generative AI APIs using Vertex AI Python SDK.
- Click File → New File to open a new file within the Code Editor.
- Copy and paste the provided code snippet into your file.
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel


def generate_image(
    project_id: str, location: str, output_file: str, prompt: str
) -> vertexai.preview.vision_models.ImageGenerationResponse:
    """Generate an image using a text prompt.

    Args:
        project_id: Google Cloud project ID, used to initialize Vertex AI.
        location: Google Cloud region, used to initialize Vertex AI.
        output_file: Local path to the output image file.
        prompt: The text prompt describing what you want to see.
    """
    vertexai.init(project=project_id, location=location)
    model = ImageGenerationModel.from_pretrained("imagegeneration@002")
    images = model.generate_images(
        prompt=prompt,
        # Optional parameters
        number_of_images=1,
        seed=1,
        add_watermark=False,
    )
    # Save the first (and only) generated image to the local path
    images[0].save(location=output_file)
    return images


generate_image(
    project_id='project-id',
    location='REGION',
    output_file='image.jpeg',
    prompt='Create an image of a cricket ground in the heart of Los Angeles',
)
- Click File → Save, enter GenerateImage.py in the Name field, and click Save.
- Execute the Python file by clicking the triangle icon on the Code Editor or by running the command below in the terminal within the Code Editor pane. This generates an image file named image.jpeg.
/usr/bin/python3 /home/student/GenerateImage.py
- To view the generated image, click EXPLORER > image.jpeg.
Code Explanation
- The code snippet loads a pre-trained Imagen model (imagegeneration@002) on Vertex AI through the ImageGenerationModel class.
- The code calls the generate_images method of the loaded model.
- The input to the method is a text prompt.
- The code uses the model’s ability to interpret the text prompt and generate an image from it.
Note: By default, a SynthID watermark is added to images, but you can disable it by specifying the optional parameter add_watermark=False. You can’t use a seed value and watermark at the same time.
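The constraint in the note above can be checked before the request is sent. The following is a rough sketch, assuming the rule is exactly as the note states it (a seed cannot be combined with the watermark). build_image_options is a hypothetical helper, and its keyword names mirror those passed to generate_images in GenerateImage.py.

```python
from typing import Any, Dict, Optional


def build_image_options(
    number_of_images: int = 1,
    seed: Optional[int] = None,
    add_watermark: bool = True,
) -> Dict[str, Any]:
    """Build keyword arguments for generate_images, rejecting seed + watermark."""
    if seed is not None and add_watermark:
        raise ValueError("A seed cannot be combined with add_watermark=True")
    options: Dict[str, Any] = {
        "number_of_images": number_of_images,
        "add_watermark": add_watermark,
    }
    # Only include the seed when the caller asked for reproducible output.
    if seed is not None:
        options["seed"] = seed
    return options


print(build_image_options(seed=1, add_watermark=False))
```

The result can be unpacked into the existing call, for example model.generate_images(prompt=prompt, **build_image_options(seed=1, add_watermark=False)).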
3. BUILD AN APPLICATION TO SEND CHAT PROMPTS USING THE GEMINI MODEL
Objective
Generative AI on Vertex AI (also known as genAI or gen AI) gives you access to Google’s large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications. In this lab, you will:
- Connect to Vertex AI (Google Cloud AI platform): Learn how to establish a connection to Google’s AI services using the Vertex AI SDK.
- Load a pre-trained generative AI model (Gemini): Discover how to use a powerful, pre-trained AI model without building one from scratch.
- Send text to the AI model: Understand how to provide input for the AI to process.
- Extract chat responses from the AI: Learn how to handle and interpret the chat responses generated by the AI model.
- Understand the basics of building AI applications: Gain insights into the core concepts of integrating AI into software projects.
Working with Generative AI
After starting the lab, you will get a split pane view consisting of the Code Editor on the left side and the lab instructions on the right side. Follow these steps to interact with the Generative AI APIs using Vertex AI Python SDK.
Chat responses without using stream:
Streaming involves receiving responses to prompts as they are generated. That is, as soon as the model generates output tokens, the output tokens are sent. A non-streaming response to prompts is sent only after all of the output tokens are generated.
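The difference can be illustrated without calling the model at all. In the sketch below, fake_token_stream is a stand-in generator (not part of the Vertex AI SDK) that yields pieces of a reply one at a time. The streaming consumer handles each piece as it arrives, while the non-streaming consumer hands back the reply only once everything has been produced.

```python
from typing import Iterable, Iterator, List


def fake_token_stream(text: str) -> Iterator[str]:
    """Stand-in for a model: yield the reply one word at a time."""
    for word in text.split(" "):
        yield word + " "


def consume_streaming(chunks: Iterable[str]) -> List[str]:
    """Handle each chunk as soon as it is available."""
    seen = []
    for chunk in chunks:
        print(chunk, end="", flush=True)  # Shown to the user immediately.
        seen.append(chunk)
    print()
    return seen


def consume_non_streaming(chunks: Iterable[str]) -> str:
    """Wait for every chunk, then hand back the complete reply once."""
    return "".join(chunks)


reply = "Red Orange Yellow"
streamed = consume_streaming(fake_token_stream(reply))
complete = consume_non_streaming(fake_token_stream(reply))
print(complete)
```

Both consumers end up with the same text; what differs is when the caller first gets something to display.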
First we’ll explore the chat responses without using stream.
Create a new file to get the chat responses without using stream:
- Click File → New File to open a new file within the Code Editor.
- Copy and paste the provided code snippet into your file.
import vertexai
from vertexai.generative_models import GenerativeModel, ChatSession
import logging
from google.cloud import logging as gcp_logging

# ------ Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
gcp_logging_client = gcp_logging.Client()
gcp_logging_client.setup_logging()

project_id = "your-project-id"
location = "REGION"

vertexai.init(project=project_id, location=location)
model = GenerativeModel("gemini-1.0-pro")
chat = model.start_chat()


def get_chat_response(chat: ChatSession, prompt: str) -> str:
    logging.info(f'Sending prompt: {prompt}')
    # Without stream=True, the full response arrives in one piece
    response = chat.send_message(prompt)
    logging.info(f'Received response: {response.text}')
    return response.text


prompt = "Hello."
print(get_chat_response(chat, prompt))

prompt = "What are all the colors in a rainbow?"
print(get_chat_response(chat, prompt))

prompt = "Why does it appear when it rains?"
print(get_chat_response(chat, prompt))
- Click File → Save, enter SendChatwithoutStream.py in the Name field, and click Save.
- Execute the Python file by clicking the triangle icon on the top-right corner of the Code Editor or by running the command below in the terminal within the Code Editor pane to view the output.
/usr/bin/python3 /home/student/SendChatwithoutStream.py
Hello there! 👋 How can I help you today? 🙂
A rainbow is a beautiful and colorful sight in the sky, formed by the reflection and refraction of sunlight through raindrops. The colors of a rainbow are always in the same order, and there are seven of them:
**Red, Orange, Yellow, Green, Blue, Indigo, Violet**
This mnemonic to remember the order of these colors is **ROY G. BIV.**
Each color in a rainbow has a different wavelength of light, which is what our eyes perceive as different colors. Red light has the longest wavelength, while violet light has the shortest.
Did you know that there are even more colors in a rainbow than the seven we can see? These colors are beyond the visible spectrum and include infrared and ultraviolet light.
Thanks for asking! Is there anything else I can help you with?
You're absolutely right! Rainbows are indeed frequently seen after a rain shower. Here's the science behind it:
**Sunlight & Rain:**
1. **Sunlight:** The sun emits white light, which is a combination of all colors in the visible spectrum.
2. **Raindrops:** When sunlight enters a raindrop, it bends or refracts due to the water's different density. This bending separates the white light into its individual colors, just like a prism.
3. **Reflection & Refraction:** The separated colors then reflect off the back of the raindrop and refract again as they exit. This second refraction causes the colors to spread out even more.
**Observer's Perspective:**
1. **Angle & Position:** The observer needs to be positioned at a specific angle relative to the sun and the rain. This ensures that the refracted and reflected light reaches their eyes.
2. **Full Spectrum:** The observer sees a full spectrum of colors in a circular arc, forming the beautiful rainbow.
**Additional Factors:**
* **Dark Background:** A dark background, like rain clouds, enhances the contrast and makes the rainbow more visible.
* **Smaller Droplets:** Smaller raindrops create a brighter and more defined rainbow.
* **Double Rainbows:** Sometimes, a secondary rainbow forms with the colors reversed. This occurs due to a second internal reflection within the raindrops.
So, the next time you see a rainbow after a rain shower, remember this fascinating interplay of light, water, and your position! 🌈
Waiting up to 5 seconds.
Sent all pending logs.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1739466638.582176 314 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.
Code Explanation
- The code snippet is loading a pre-trained AI model called Gemini (gemini-1.0-pro) on Vertex AI.
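- The code defines a helper function, get_chat_response, which calls the send_message method of the chat session started from the loaded Gemini model.
- The input to the function is the chat session and a text prompt.
- The chat session keeps the conversation history, so each prompt is answered in the context of the earlier turns. This is why the third prompt can refer to the rainbow simply as “it”.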
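The third prompt in the script ("Why does it appear when it rains?") only makes sense because the chat session remembers the earlier turns. The sketch below mimics that bookkeeping with EchoChat, a made-up stand-in class rather than the SDK's ChatSession, so it can run offline. The reply text is canned; only the history handling is the point.

```python
from typing import List, Tuple


class EchoChat:
    """Offline stand-in for a chat session that records every turn."""

    def __init__(self) -> None:
        self.history: List[Tuple[str, str]] = []  # (role, text) pairs

    def send_message(self, prompt: str) -> str:
        # A real model would read the whole history; here we only report its size.
        reply = f"(reply to {prompt!r} with {len(self.history)} earlier messages as context)"
        self.history.append(("user", prompt))
        self.history.append(("model", reply))
        return reply


chat = EchoChat()
for prompt in [
    "Hello.",
    "What are all the colors in a rainbow?",
    "Why does it appear when it rains?",
]:
    print(chat.send_message(prompt))
```

Each call sees two more stored messages than the one before it, which is the context a real session would supply to the model when resolving a pronoun like "it".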
Chat responses using stream:
Now we’ll explore the chat responses using stream.
Create a new file to get the chat responses using stream:
- Click File → New File to open a new file within the Code Editor.
- Copy and paste the provided code snippet into your file.
import vertexai
from vertexai.generative_models import GenerativeModel, ChatSession
import logging
from google.cloud import logging as gcp_logging

# ------ Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
# Initialize GCP logging
gcp_logging_client = gcp_logging.Client()
gcp_logging_client.setup_logging()

project_id = "your-project-id"
location = "REGION"

vertexai.init(project=project_id, location=location)
model = GenerativeModel("gemini-1.0-pro")
chat = model.start_chat()


def get_chat_response(chat: ChatSession, prompt: str) -> str:
    text_response = []
    logging.info(f'Sending prompt: {prompt}')
    # stream=True returns an iterable of partial responses
    responses = chat.send_message(prompt, stream=True)
    for chunk in responses:
        text_response.append(chunk.text)
    full_response = "".join(text_response)
    # Log once the stream is fully consumed, then return the joined text
    logging.info(f'Received response: {full_response}')
    return full_response


prompt = "Hello."
print(get_chat_response(chat, prompt))

prompt = "What are all the colors in a rainbow?"
print(get_chat_response(chat, prompt))

prompt = "Why does it appear when it rains?"
print(get_chat_response(chat, prompt))
- Click File → Save, enter SendChatwithStream.py in the Name field, and click Save.
- Execute the Python file by clicking the triangle icon on the top-right corner of the Code Editor or by running the command below in the terminal within the Code Editor pane to view the output.
/usr/bin/python3 /home/student/SendChatwithStream.py
Hello! How can I help you today?
A rainbow typically displays a spectrum of seven colors:
1. Red
2. Orange
3. Yellow
4. Green
5. Blue
6. Indigo
7. Violet
Rainbows appear when sunlight interacts with water droplets in the air. The sunlight refracts, or bends, as it enters the water droplets. The different colors of sunlight refract at slightly different angles, causing them to separate. This separation of colors is what we see as a rainbow.
For a rainbow to appear, the sun must be behind the observer and the air must contain water droplets. This is why rainbows are often seen after rainstorms. The water droplets in the air act as prisms, refracting the sunlight and creating the rainbow.
The position of the rainbow in the sky depends on the position of the sun. When the sun is low in the sky, the rainbow will be high in the sky. When the sun is high in the sky, the rainbow will be lower in the sky.
Rainbows are a beautiful and fascinating phenomenon. They are a reminder of the power of light and the beauty of nature.
Waiting up to 5 seconds.
Sent all pending logs.
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
E0000 00:00:1739466966.191826 375 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.
Code Explanation
- The code snippet is loading a pre-trained AI model called Gemini (gemini-1.0-pro) on Vertex AI.
- The code defines a helper function, get_chat_response, which calls the send_message method of the chat session started from the loaded Gemini model.
- The code passes stream=True when sending each message. The stream=True argument indicates that the response should be streamed back in chunks as it is generated, allowing for real-time processing.
- The code uses Gemini’s ability to understand prompts and have a stateful chat conversation.
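To see the chunk handling in isolation, the sketch below runs the same join-the-chunks logic against FakeStreamingChat, a hypothetical stand-in whose send_message(prompt, stream=True) yields objects with a text attribute, which is the only part of the streamed response the lab code relies on. The optional on_chunk callback is an addition made for illustration, so each piece can be displayed as it arrives.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, Optional


@dataclass
class FakeChunk:
    text: str


class FakeStreamingChat:
    """Hypothetical stand-in that streams a canned reply in three chunks."""

    def send_message(self, prompt: str, stream: bool = False) -> Iterator[FakeChunk]:
        if not stream:
            raise ValueError("this stand-in only demonstrates stream=True")
        for piece in ("Hello! ", "How can I ", "help you today?"):
            yield FakeChunk(piece)


def get_chat_response(
    chat, prompt: str, on_chunk: Optional[Callable[[str], None]] = None
) -> str:
    """Collect streamed chunks into one string, optionally reporting each one."""
    parts = []
    for chunk in chat.send_message(prompt, stream=True):
        if on_chunk is not None:
            on_chunk(chunk.text)  # e.g. print it right away
        parts.append(chunk.text)
    return "".join(parts)


received = []
full_text = get_chat_response(FakeStreamingChat(), "Hello.", on_chunk=received.append)
print(full_text)
```

Passing on_chunk=lambda text: print(text, end="") instead would show the reply growing piece by piece, which is the practical benefit of streaming.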
4. BUILD A MULTI-MODAL GEN AI APPLICATION: CHALLENGE LAB
Overview
In a challenge lab you’re given a scenario and a set of tasks. Instead of following step-by-step instructions, you will use the skills learned from the labs in the course to figure out how to complete the tasks on your own! An automated scoring system (shown on this page) will provide feedback on whether you have completed your tasks correctly.
When you take a challenge lab, you will not be taught new Google Cloud concepts. You are expected to extend your learned skills, like changing default values and reading and researching error messages to fix your own mistakes.
To score 100% you must successfully complete all tasks within the time period! Are you ready for the challenge?
- Labs are timed and cannot be paused. The timer starts when you click Start Lab.
- The included cloud terminal is preconfigured with the gcloud SDK.
- Use the terminal to execute commands and then click Check my progress to verify your work.
Challenge scenario
Scenario: You’re a developer at an AI-powered bouquet design company. Your clients can describe their dream bouquet, and your system generates realistic images for their approval. To further enhance the experience, you’re integrating cutting-edge image analysis to provide descriptive summaries of the generated bouquets. Your main application will invoke the relevant methods based on the users’ interactions. To facilitate that, you need to finish the tasks below:
Task 1: Develop a Python function named generate_bouquet_image(prompt). This function should invoke the imagegeneration@002 model using the supplied prompt, generate the image, and store it locally. For this challenge, use the prompt: “Create an image containing a bouquet of 2 sunflowers and 3 roses”.
import vertexai
from vertexai.preview.vision_models import ImageGenerationModel


def generate_bouquet_image(
    project_id: str, location: str, output_file: str, prompt: str
) -> vertexai.preview.vision_models.ImageGenerationResponse:
    """Generate an image using a text prompt.

    Args:
        project_id: Google Cloud project ID, used to initialize Vertex AI.
        location: Google Cloud region, used to initialize Vertex AI.
        output_file: Local path to the output image file.
        prompt: The text prompt describing what you want to see.
    """
    vertexai.init(project=project_id, location=location)
    model = ImageGenerationModel.from_pretrained("imagegeneration@002")
    images = model.generate_images(
        prompt=prompt,
        # Optional parameters
        number_of_images=1,
        seed=1,
        add_watermark=False,
    )
    images[0].save(location=output_file)
    return images


generate_bouquet_image(
    project_id='qwiklabs-gcp-04-562a2998af75',
    location='us-east4',
    output_file='image.jpeg',
    prompt='Create an image containing a bouquet of 2 sunflowers and 3 roses',
)
Task 2: Develop a second Python function called analyze_bouquet_image(image_path). This function will take the image path as input along with a text prompt to generate birthday wishes based on the image passed and send it to the gemini-pro-vision model. To ensure responses can be obtained as and when they are generated, enable streaming on the prompt requests.
import vertexai
from vertexai.generative_models import GenerativeModel, Part


def generate_text(project_id: str, location: str) -> str:
    # Initialize Vertex AI
    vertexai.init(project=project_id, location=location)
    # Load the model
    multimodal_model = GenerativeModel("gemini-1.0-pro-vision")
    # Query the model
    response = multimodal_model.generate_content(
        [
            # Add an example image
            Part.from_uri(
                "gs://generativeai-downloads/images/scones.jpg", mime_type="image/jpeg"
            ),
            # Add an example query
            "what is shown in this image?",
        ]
    )
    return response.text


# -------- Important: Variable declaration --------
project_id = "qwiklabs-gcp-03-2e6a5fcb4022"
location = "us-central1"

# -------- Call the Function --------
response = generate_text(project_id, location)
print(response)
/usr/bin/python3 /home/student/genai.py
sleep 30
/usr/bin/python3 /home/student/genai.py
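The Task 2 description asks for a function named analyze_bouquet_image that sends a local image and a birthday-wishes prompt to the model with streaming enabled. One possible shape for it is sketched below, under several assumptions: the model is created by the caller, the image is passed as raw bytes through Part.from_data, the MIME type is image/jpeg, and the prompt wording is illustrative. The make_part parameter exists only so the function can be exercised without the SDK. None of this is the lab's reference solution.

```python
from typing import Any, Callable, Optional


def _default_make_part(data: bytes, mime_type: str) -> Any:
    # Deferred import: only needed when talking to the real service.
    from vertexai.generative_models import Part

    return Part.from_data(data=data, mime_type=mime_type)


def analyze_bouquet_image(
    image_path: str,
    model: Any,
    prompt: str = "Generate birthday wishes based on this image.",
    make_part: Optional[Callable[[bytes, str], Any]] = None,
) -> str:
    """Stream the model's answer about a local image and return the full text."""
    make_part = make_part or _default_make_part
    with open(image_path, "rb") as image_file:
        image_part = make_part(image_file.read(), "image/jpeg")  # Assumed MIME type.
    pieces = []
    # stream=True yields partial responses as they are generated.
    for chunk in model.generate_content([image_part, prompt], stream=True):
        print(chunk.text, end="")
        pieces.append(chunk.text)
    print()
    return "".join(pieces)
```

With the real SDK, and after vertexai.init has been called, a call might look like analyze_bouquet_image("image.jpeg", GenerativeModel("gemini-pro-vision")), where image.jpeg is the file written in Task 1.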
Congratulations! You have completed the lab!