1. BUILD REAL WORLD AI APPLICATIONS WITH GEMINI AND IMAGEN

Complete the introductory Build Real World AI Applications with Gemini and Imagen skill badge to demonstrate skills in the following: image recognition, natural language processing, and image generation using Google’s powerful Gemini and Imagen models, as well as deploying applications on the Vertex AI platform.

Objective

Generative AI on Vertex AI (also known as genAI or gen AI) gives you access to Google’s large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications. In this lab, you will:

  • Connect to Vertex AI (Google Cloud AI platform): Learn how to establish a connection to Google’s AI services using the Vertex AI SDK.
  • Load a pre-trained generative AI model (Gemini): Discover how to use a powerful, pre-trained AI model without building one from scratch.
  • Send image + text questions to the AI model: Understand how to provide input for the AI to process.
  • Extract text-based answers from the AI: Learn to handle and interpret the text responses generated by the AI model.
  • Understand the basics of building AI applications: Gain insights into the core concepts of integrating AI into software projects.

Working with Vertex AI Python SDK

After starting the lab, you will get a split-pane view with the Code Editor on the left side and the lab instructions on the right side. Follow these steps to interact with the Generative AI APIs using the Vertex AI Python SDK.

  1. Click File > New File to open a new file within the Code Editor.
  2. Copy and paste the provided code snippet into your file.
     import vertexai
     from vertexai.generative_models import GenerativeModel, Part
    
    
     def generate_text(project_id: str, location: str) -> str:
         # Initialize Vertex AI
         vertexai.init(project=project_id, location=location)
         # Load the model
         multimodal_model = GenerativeModel("gemini-1.0-pro-vision")
         # Query the model
         response = multimodal_model.generate_content(
             [
                 # Add an example image
                 Part.from_uri(
                     "gs://generativeai-downloads/images/scones.jpg", mime_type="image/jpeg"
                 ),
                 # Add an example query
                 "what is shown in this image?",
             ]
         )
    
         return response.text
    
      # --------  Important: Variable declaration  --------
     
      project_id = "project-id"  # Replace with your project ID
      location = "REGION"        # Replace with the lab's region
    
     #  --------   Call the Function  --------
    
     response = generate_text(project_id, location)
     print(response)
    
  3. Click File > Save, enter genai.py in the Name field, and click Save.
  4. Execute the Python file by clicking the triangle icon in the Code Editor, or by running the command below in the terminal within the Code Editor pane, to view the output.
     /usr/bin/python3 /home/student/genai.py 
    
     The image shows a table with a cup of coffee, a bowl of blueberries, and a plate of scones with blueberries on it. There are also pink flowers on the table.
    

Note: If you encounter a 401 error, try re-running the code.

Code Explanation

  • The code snippet is loading a pre-trained AI model called Gemini (gemini-1.0-pro-vision) on Vertex AI.
  • The code calls the generate_content method of the loaded Gemini model.
  • The input to the method is an image URI and a prompt containing a question about the image.
  • The code uses Gemini’s ability to understand images and text together. It uses the text provided in the prompt to describe the contents of the image.

Try it yourself! Experiment with different image URIs and prompt questions to explore Gemini’s capabilities.
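
As one illustration, the sketch below reuses the same pattern with a different question. The image URI shown (gs://your-bucket/your-image.jpg) and the helper name describe_image are placeholders introduced here, not part of the lab, so substitute an image you can actually access.

import vertexai
from vertexai.generative_models import GenerativeModel, Part

def describe_image(project_id: str, location: str, image_uri: str, prompt: str) -> str:
    # Initialize Vertex AI and load the multimodal Gemini model
    vertexai.init(project=project_id, location=location)
    multimodal_model = GenerativeModel("gemini-1.0-pro-vision")
    # Ask the model a question about the referenced image
    response = multimodal_model.generate_content(
        [Part.from_uri(image_uri, mime_type="image/jpeg"), prompt]
    )
    return response.text

# Placeholder values - replace with your project ID, region, and an image you own
print(describe_image(
    "project-id",
    "REGION",
    "gs://your-bucket/your-image.jpg",   # hypothetical image URI
    "List the objects you can see in this image.",
))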


2. BUILD AN AI IMAGE GENERATOR APP USING IMAGEN ON VERTEX AI

Objective

Generative AI on Vertex AI (also known as genAI or gen AI) gives you access to Google’s large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications. In this lab, you will:

  • Connect to Vertex AI (Google Cloud AI platform): Learn how to establish a connection to Google’s AI services using the Vertex AI SDK.
  • Load a pre-trained image generation model (Imagen): Discover how to use a powerful, pre-trained AI model without building one from scratch.
  • Send text to the AI model: Understand how to provide input for the AI to process.
  • Extract image-based answers from the AI: Learn to handle and interpret the image responses generated by the AI model.
  • Understand the basics of building AI applications: Gain insights into the core concepts of integrating AI into software projects.

Working with Generative AI

After starting the lab, you will get a split-pane view with the Code Editor on the left side and the lab instructions on the right side. Follow these steps to interact with the Generative AI APIs using the Vertex AI Python SDK.

  1. Click File > New File to open a new file within the Code Editor.
  2. Copy and paste the provided code snippet into your file.
      import vertexai
      from vertexai.preview.vision_models import ImageGenerationModel
     
      def generate_image(
          project_id: str, location: str, output_file: str, prompt: str
      ) -> vertexai.preview.vision_models.ImageGenerationResponse:
          """Generate an image using a text prompt.
          Args:
            project_id: Google Cloud project ID, used to initialize Vertex AI.
            location: Google Cloud region, used to initialize Vertex AI.
            output_file: Local path to the output image file.
            prompt: The text prompt describing what you want to see."""
     
          # Initialize Vertex AI
          vertexai.init(project=project_id, location=location)
     
          # Load the pre-trained Imagen model
          model = ImageGenerationModel.from_pretrained("imagegeneration@002")
     
          # Generate the image
          images = model.generate_images(
              prompt=prompt,
              # Optional parameters
              number_of_images=1,
              seed=1,
              add_watermark=False,
          )
     
          # Save the first generated image to a local file
          images[0].save(location=output_file)
     
          return images
     
      generate_image(
          project_id='project-id',   # Replace with your project ID
          location='REGION',         # Replace with the lab's region
          output_file='image.jpeg',
          prompt='Create an image of a cricket ground in the heart of Los Angeles',
      )
    
  3. Click File > Save, enter GenerateImage.py in the Name field, and click Save.
  4. Execute the Python file by clicking the triangle icon in the Code Editor, or by running the command below in the terminal within the Code Editor pane. This will generate an image file named image.jpeg.
     /usr/bin/python3 /home/student/GenerateImage.py
    
  5. To view the generated image, click EXPLORER > image.jpeg.

Code Explanation

  • The code snippet loads a pre-trained image generation model, Imagen (imagegeneration@002), on Vertex AI via the ImageGenerationModel class.
  • The code calls the generate_images method of the loaded Imagen model.
  • The input to the method is a text prompt.
  • The code uses Imagen’s ability to interpret the text prompt and generate an image from it.

Note: By default, a SynthID watermark is added to images, but you can disable it by specifying the optional parameter add_watermark=False. You can’t use a seed value and a watermark at the same time. See the Vertex AI documentation to learn more about SynthID watermarking.
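
As a quick variation on the snippet above, the sketch below keeps the default SynthID watermark by omitting the seed parameter, since a fixed seed and the watermark cannot be combined. The prompt and output file name here are only illustrative.

import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

vertexai.init(project="project-id", location="REGION")  # placeholder values
model = ImageGenerationModel.from_pretrained("imagegeneration@002")

# Keep the default SynthID watermark: omit seed, since a fixed seed and the
# watermark cannot be used together
images = model.generate_images(
    prompt="Create an image of a lighthouse at sunset",  # illustrative prompt
    number_of_images=1,
    add_watermark=True,
)
images[0].save(location="watermarked.jpeg")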


3. BUILD AN APPLICATION TO SEND CHAT PROMPTS USING THE GEMINI MODEL

Objective

Generative AI on Vertex AI (also known as genAI or gen AI) gives you access to Google’s large generative AI models so you can test, tune, and deploy them for use in your AI-powered applications. In this lab, you will:

  • Connect to Vertex AI (Google Cloud AI platform): Learn how to establish a connection to Google’s AI services using the Vertex AI SDK.
  • Load a pre-trained generative AI model (Gemini): Discover how to use a powerful, pre-trained AI model without building one from scratch.
  • Send text to the AI model: Understand how to provide input for the AI to process.
  • Extract chat responses from the AI: Learn how to handle and interpret the chat responses generated by the AI model.
  • Understand the basics of building AI applications: Gain insights into the core concepts of integrating AI into software projects.

Working with Generative AI

After starting the lab, you will get a split-pane view with the Code Editor on the left side and the lab instructions on the right side. Follow these steps to interact with the Generative AI APIs using the Vertex AI Python SDK.

Chat responses without streaming:

Streaming involves receiving responses to prompts as they are generated. That is, as soon as the model generates output tokens, the output tokens are sent. A non-streaming response to prompts is sent only after all of the output tokens are generated.
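
To make the difference concrete, here is a minimal sketch (with placeholder project values) contrasting the two modes; the lab files that follow show the same calls with logging added.

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="project-id", location="REGION")  # placeholder values
chat = GenerativeModel("gemini-1.0-pro").start_chat()

# Non-streaming: send_message blocks until the complete answer is available
response = chat.send_message("Hello.")
print(response.text)

# Streaming: chunks are yielded as soon as the model produces them
for chunk in chat.send_message("What are all the colors in a rainbow?", stream=True):
    print(chunk.text, end="")
print()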

First we’ll explore chat responses without streaming.

Create a new file to get the chat responses without streaming:

  1. Click File > New File to open a new file within the Code Editor.
  2. Copy and paste the provided code snippet into your file.
     import vertexai
     from vertexai.generative_models import GenerativeModel, ChatSession
    
     import logging
     from google.cloud import logging as gcp_logging
    
     # ------  Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
     # Initialize GCP logging
     gcp_logging_client = gcp_logging.Client()
     gcp_logging_client.setup_logging()
    
      project_id = "your-project-id"  # Replace with your project ID
      location = "REGION"             # Replace with the lab's region
    
     vertexai.init(project=project_id, location=location)
     model = GenerativeModel("gemini-1.0-pro")
     chat = model.start_chat()
    
     def get_chat_response(chat: ChatSession, prompt: str) -> str:
         logging.info(f'Sending prompt: {prompt}')
         response = chat.send_message(prompt)
         logging.info(f'Received response: {response.text}')
         return response.text
    
     prompt = "Hello."
     print(get_chat_response(chat, prompt))
    
     prompt = "What are all the colors in a rainbow?"
     print(get_chat_response(chat, prompt))
    
     prompt = "Why does it appear when it rains?"
     print(get_chat_response(chat, prompt))
    
    
  3. Click File > Save, enter SendChatwithoutStream.py in the Name field, and click Save.

  4. Execute the Python file by clicking the triangle icon in the top-right corner of the Code Editor, or by running the command below in the terminal within the Code Editor pane, to view the output.
     /usr/bin/python3 /home/student/SendChatwithoutStream.py
    
     Hello there! 👋 
    
     How can I help you today? 🙂
     A rainbow is a beautiful and colorful sight in the sky, formed by the reflection and refraction of sunlight through raindrops. The colors of a rainbow are always in the same order, and there are seven of them:
        
     **Red, Orange, Yellow, Green, Blue, Indigo, Violet** 
    
     This mnemonic to remember the order of these colors is **ROY G. BIV.**
    
     Each color in a rainbow has a different wavelength of light, which is what our eyes perceive as different colors. Red light has the longest wavelength, while violet light has the shortest.
    
     Did you know that there are even more colors in a rainbow than the seven we can see? These colors are beyond the visible spectrum and include infrared and ultraviolet light. 
    
     Thanks for asking! Is there anything else I can help you with?
     You're absolutely right! Rainbows are indeed frequently seen after a rain shower. Here's the science behind it:
    
     **Sunlight & Rain:**
    
     1. **Sunlight:** The sun emits white light, which is a combination of all colors in the visible spectrum.
     2. **Raindrops:** When sunlight enters a raindrop, it bends or refracts due to the water's different density. This bending separates the white light into its individual colors, just like a prism.
     3. **Reflection & Refraction:** The separated colors then reflect off the back of the raindrop and refract again as they exit. This second refraction causes the colors to spread out even more.
    
     **Observer's Perspective:**
    
     1. **Angle & Position:** The observer needs to be positioned at a specific angle relative to the sun and the rain. This ensures that the refracted and reflected light reaches their eyes.
     2. **Full Spectrum:** The observer sees a full spectrum of colors in a circular arc, forming the beautiful rainbow.
    
     **Additional Factors:**
    
     * **Dark Background:** A dark background, like rain clouds, enhances the contrast and makes the rainbow more visible.
     * **Smaller Droplets:** Smaller raindrops create a brighter and more defined rainbow.
     * **Double Rainbows:** Sometimes, a secondary rainbow forms with the colors reversed. This occurs due to a second internal reflection within the raindrops.
    
     So, the next time you see a rainbow after a rain shower, remember this fascinating interplay of light, water, and your position! 🌈 
    
     Waiting up to 5 seconds.
     Sent all pending logs.
     WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
     E0000 00:00:1739466638.582176     314 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.
    

Code Explanation

  • The code snippet is loading a pre-trained AI model called Gemini (gemini-1.0-pro) on Vertex AI.
  • The code defines a helper function, get_chat_response, which sends each prompt to the Gemini chat session and returns the text response.
  • The input to the method is a text prompt.
  • The code uses Gemini’s chat capability: because every prompt goes to the same chat session, the model retains the conversation context (see the sketch after this list).
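
Because the same ChatSession object is reused for every call, earlier turns remain available as context. The sketch below, with placeholder project values and an illustrative prompt, shows the session answering a follow-up question from its own history.

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="project-id", location="REGION")  # placeholder values
chat = GenerativeModel("gemini-1.0-pro").start_chat()

# The session stores each turn, so later prompts can rely on earlier ones
chat.send_message("My favorite color is teal.")
follow_up = chat.send_message("What is my favorite color?")
print(follow_up.text)        # Answered from the conversation context
print(len(chat.history))     # Number of stored conversation turns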

Chat responses with streaming:

Now we’ll explore chat responses with streaming.

Create a new file to get the chat responses with streaming:

  1. Click File > New File to open a new file within the Code Editor.
  2. Copy and paste the provided code snippet into your file.
     import vertexai
     from vertexai.generative_models import GenerativeModel, ChatSession
    
     import logging
     from google.cloud import logging as gcp_logging
    
     # ------  Below cloud logging code is for Qwiklab's internal use, do not edit/remove it. --------
     # Initialize GCP logging
     gcp_logging_client = gcp_logging.Client()
     gcp_logging_client.setup_logging()
    
      project_id = "your-project-id"  # Replace with your project ID
      location = "REGION"             # Replace with the lab's region
    
     vertexai.init(project=project_id, location=location)
     model = GenerativeModel("gemini-1.0-pro")
     chat = model.start_chat()
    
      def get_chat_response(chat: ChatSession, prompt: str) -> str:
          text_response = []
          logging.info(f'Sending prompt: {prompt}')
          responses = chat.send_message(prompt, stream=True)
          for chunk in responses:
              text_response.append(chunk.text)
          full_response = "".join(text_response)
          logging.info(f'Received response: {full_response}')
          return full_response
    
     prompt = "Hello."
     print(get_chat_response(chat, prompt))
    
     prompt = "What are all the colors in a rainbow?"
     print(get_chat_response(chat, prompt))
    
     prompt = "Why does it appear when it rains?"
     print(get_chat_response(chat, prompt))
    
    
  3. Click File > Save, enter SendChatwithStream.py in the Name field, and click Save.

  4. Execute the Python file by clicking the triangle icon in the top-right corner of the Code Editor, or by running the command below in the terminal within the Code Editor pane, to view the output.
     /usr/bin/python3 /home/student/SendChatwithStream.py
    
     Hello! How can I help you today?
     A rainbow typically displays a spectrum of seven colors:
    
     1. Red
     2. Orange
     3. Yellow
     4. Green
     5. Blue
     6. Indigo
     7. Violet
     Rainbows appear when sunlight interacts with water droplets in the air. The sunlight refracts, or bends, as it enters the water droplets. The different colors of sunlight refract at slightly different angles, causing them to separate. This separation of colors is what we see as a rainbow.
    
     For a rainbow to appear, the sun must be behind the observer and the air must contain water droplets. This is why rainbows are often seen after rainstorms. The water droplets in the air act as prisms, refracting the sunlight and creating the rainbow.
    
     The position of the rainbow in the sky depends on the position of the sun. When the sun is low in the sky, the rainbow will be high in the sky. When the sun is high in the sky, the rainbow will be lower in the sky.
    
     Rainbows are a beautiful and fascinating phenomenon. They are a reminder of the power of light and the beauty of nature.
     Waiting up to 5 seconds.
     Sent all pending logs.
     WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
     E0000 00:00:1739466966.191826     375 init.cc:232] grpc_wait_for_shutdown_with_timeout() timed out.
    

Code Explanation

  • The code snippet is loading a pre-trained AI model called Gemini (gemini-1.0-pro) on Vertex AI.
  • The code defines a helper function, get_chat_response, which sends each prompt to the Gemini chat session with streaming enabled.
  • The code is using stream=True while sending the messages. The stream=True argument indicates that the responses should be streamed back, allowing for real-time processing.
  • The code uses Gemini’s ability to understand prompts and hold a stateful chat conversation; a small sketch of consuming the stream in real time follows this list.
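
Since the lab file joins all chunks before printing, a small variation (sketched below with placeholder values) prints each chunk as it arrives, which is where streaming pays off for interactive applications.

import vertexai
from vertexai.generative_models import GenerativeModel

vertexai.init(project="project-id", location="REGION")  # placeholder values
chat = GenerativeModel("gemini-1.0-pro").start_chat()

# Print each chunk immediately instead of collecting them into one string
for chunk in chat.send_message("Explain why the sky is blue.", stream=True):
    print(chunk.text, end="", flush=True)
print()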

4. BUILD A MULTI-MODAL GEN AI APPLICATION: CHALLENGE LAB

Overview

In a challenge lab you’re given a scenario and a set of tasks. Instead of following step-by-step instructions, you will use the skills learned from the labs in the course to figure out how to complete the tasks on your own! An automated scoring system (shown on this page) will provide feedback on whether you have completed your tasks correctly.

When you take a challenge lab, you will not be taught new Google Cloud concepts. You are expected to extend your learned skills, like changing default values and reading and researching error messages to fix your own mistakes.

To score 100% you must successfully complete all tasks within the time period! Are you ready for the challenge?

  • Labs are timed and cannot be paused. The timer starts when you click Start Lab.
  • The included cloud terminal is preconfigured with the gcloud SDK.
  • Use the terminal to execute commands and then click Check my progress to verify your work.

Challenge scenario

Scenario: You’re a developer at an AI-powered bouquet design company. Your clients can describe their dream bouquet, and your system generates realistic images for their approval. To further enhance the experience, you’re integrating cutting-edge image analysis to provide descriptive summaries of the generated bouquets. Your main application will invoke the relevant methods based on the users’ interactions, and to facilitate that you need to finish the tasks below:

Task 1: Develop a Python function named generate_bouquet_image(prompt). This function should invoke the imagegeneration@002 model using the supplied prompt, generate the image, and store it locally. For this challenge, use the prompt: “Create an image containing a bouquet of 2 sunflowers and 3 roses”.

import vertexai
from vertexai.preview.vision_models import ImageGenerationModel

def generate_bouquet_image(
    project_id: str, location: str, output_file: str, prompt: str
) -> vertexai.preview.vision_models.ImageGenerationResponse:
    """Generate an image using a text prompt.
    Args:
      project_id: Google Cloud project ID, used to initialize Vertex AI.
      location: Google Cloud region, used to initialize Vertex AI.
      output_file: Local path to the output image file.
      prompt: The text prompt describing what you want to see."""   
    vertexai.init(project=project_id, location=location) 
    model = ImageGenerationModel.from_pretrained("imagegeneration@002")  

    # Generate a single image from the text prompt
    images = model.generate_images(
        prompt=prompt,
        # Optional parameters
        number_of_images=1,
        seed=1,
        add_watermark=False,
    )
    images[0].save(location=output_file)

    return images

generate_bouquet_image(
    project_id='qwiklabs-gcp-04-562a2998af75',
    location='us-east4',
    output_file='image.jpeg',
    prompt='Create an image containing a bouquet of 2 sunflowers and 3 roses',
)

Task 2: Develop a second Python function called analyze_bouquet_image(image_path). This function should take the image path as input, along with a text prompt to generate birthday wishes based on the image, and send both to the gemini-pro-vision model. To ensure responses can be obtained as and when they are generated, enable streaming on the prompt requests.

import vertexai
from vertexai.generative_models import GenerativeModel, Image, Part


def analyze_bouquet_image(project_id: str, location: str, image_path: str) -> str:
    # Initialize Vertex AI
    vertexai.init(project=project_id, location=location)
    # Load the multimodal Gemini model
    multimodal_model = GenerativeModel("gemini-1.0-pro-vision")
    # Send the locally generated bouquet image plus a text prompt,
    # streaming the response chunks as they are produced
    responses = multimodal_model.generate_content(
        [
            Part.from_image(Image.load_from_file(image_path)),
            "Generate a birthday wish based on this image.",
        ],
        stream=True,
    )
    text_response = []
    for chunk in responses:
        text_response.append(chunk.text)
    return "".join(text_response)

# --------  Important: Variable declaration  --------

project_id = "qwiklabs-gcp-03-2e6a5fcb4022"
location = "us-central1"

#  --------   Call the Function  --------

# image.jpeg is the bouquet image generated in Task 1
response = analyze_bouquet_image(project_id, location, "image.jpeg")
print(response)
/usr/bin/python3 /home/student/genai.py
sleep 30
/usr/bin/python3 /home/student/genai.py

Congratulations! You have completed the lab!
