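Generate content with the Gemini Enterprise API

Use generateContent or streamGenerateContent to generate content with Gemini.

The Gemini model family includes models that work with multimodal prompt requests. The term multimodal indicates that you can use more than one modality, or type of input, in a prompt. Models that aren't multimodal accept prompts only with text. Modalities can include text, audio, video, and more.

Create a Google Cloud account to get started

To start using the Vertex AI API for Gemini, create a Google Cloud account.

After creating your account, use this document to review the Gemini model request body, model parameters, response body, and some sample requests.

When you're ready, see the Vertex AI API for Gemini quickstart to learn how to send a request to the Vertex AI Gemini API using a programming language SDK or the REST API.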

Supported models

Model	Version
Gemini 1.5 Flash	`gemini-1.5-flash-001` `gemini-1.5-flash-002`
Gemini 1.5 Pro	`gemini-1.5-pro-001` `gemini-1.5-pro-002`
Gemini 1.0 Pro Vision	`gemini-1.0-pro-001` `gemini-1.0-pro-vision-001`
Gemini 1.0 Pro	`gemini-1.0-pro` `gemini-1.0-pro-001` `gemini-1.0-pro-002`

Example syntax

Syntax to generate a model response.

Non-streaming

curl

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:generateContent \
-d '{
  "contents": [{
    ...
  }],
  "generationConfig": {
    ...
  },
  "safetySettings": {
    ...
  }
  ...
}'

Python

from vertexai.generative_models import GenerationConfig, GenerativeModel

gemini_model = GenerativeModel(MODEL_ID)
generation_config = GenerationConfig(...)

model_response = gemini_model.generate_content([...], generation_config=generation_config, safety_settings={...})

Streaming

curl

curl -X POST \
  -H "Authorization: Bearer $(gcloud auth print-access-token)" \
  -H "Content-Type: application/json" \
  https://${LOCATION}-aiplatform.googleapis.com/v1/projects/${PROJECT_ID}/locations/${LOCATION}/publishers/google/models/${MODEL_ID}:streamGenerateContent \
  -d '{
    "contents": [{
      ...
    }],
    "generationConfig": {
      ...
    },
    "safetySettings": {
      ...
    }
    ...
  }'

Python

gemini_model = GenerativeModel(MODEL_ID)
model_response = gemini_model.generate_content([...], generation_config=generation_config, safety_settings={...}, stream=True)

Parameter list

See examples for implementation details.

Request body

{
  "cachedContent": string,
  "contents": [
    {
      "role": string,
      "parts": [
        {
          // Union field data can be only one of the following:
          "text": string,
          "inlineData": {
            "mimeType": string,
            "data": string
          },
          "fileData": {
            "mimeType": string,
            "fileUri": string
          },
          // End of list of possible types for union field data.

          "videoMetadata": {
            "startOffset": {
              "seconds": integer,
              "nanos": integer
            },
            "endOffset": {
              "seconds": integer,
              "nanos": integer
            }
          }
        }
      ]
    }
  ],
  "systemInstruction": {
    "role": string,
    "parts": [
      {
        "text": string
      }
    ]
  },
  "tools": [
    {
      "functionDeclarations": [
        {
          "name": string,
          "description": string,
          "parameters": {
            object (OpenAPI Object Schema)
          }
        }
      ]
    }
  ],
  "safetySettings": [
    {
      "category": enum (HarmCategory),
      "threshold": enum (HarmBlockThreshold)
    }
  ],
  "generationConfig": {
    "temperature": number,
    "topP": number,
    "topK": number,
    "candidateCount": integer,
    "maxOutputTokens": integer,
    "presencePenalty": float,
    "frequencyPenalty": float,
    "stopSequences": [
      string
    ],
    "responseMimeType": string,
    "responseSchema": schema,
    "seed": integer,
    "responseLogprobs": boolean,
    "logprobs": integer,
    "audioTimestamp": boolean
  },
  "labels": {
    string: string
  }
}

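The request body contains data with the following parameters:

Parameters
`cachedContent`	Optional: `string` The name of the cached content used as context to serve the prediction. Format: `projects/{project}/locations/{location}/cachedContents/{cachedContent}`
`contents`	Required: `Content` The content of the current conversation with the model. For single-turn queries, this is a single instance. For multi-turn queries, this is a repeated field that contains the conversation history and the latest request.
`systemInstruction`	Optional: `Content`. Available for `gemini-1.5-flash`, `gemini-1.5-pro`, and `gemini-1.0-pro-002`. Instructions that steer the model toward better performance. For example, "Answer as concisely as possible" or "Don't use technical terms in your response". The `text` strings count toward the token limit. The `role` field of `systemInstruction` is ignored and doesn't affect the performance of the model. Note: Only `text` should be used in `parts`, and the content in each `part` should be in a separate paragraph.
`tools`	Optional. A piece of code that enables the system to interact with external systems to perform an action, or set of actions, outside of the knowledge and scope of the model. See Function calling.
`toolConfig`	Optional. See Function calling.
`safetySettings`	Optional: `SafetySetting`. Per-request settings for blocking unsafe content. Enforced on `GenerateContentResponse.candidates`.
`generationConfig`	Optional: `GenerationConfig`. Generation configuration settings.
`labels`	Optional: `string`. Metadata that you can add to the API call in the format of key-value pairs.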
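
As a rough illustration of how these top-level fields fit together, the following sketch assembles a request body as a plain Python dictionary. The helper name `build_request_body` and the label values are hypothetical and exist only for this example; only the field names come from the request body above.

```python
import json


def build_request_body(prompt, system_text=None, temperature=None, labels=None):
    """Assemble a generateContent request body as a plain dict (illustrative helper)."""
    # `contents` is the only required field; a single-turn query has one entry.
    body = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    if system_text is not None:
        # Only `text` parts are used here; the `role` of systemInstruction is ignored.
        body["systemInstruction"] = {"parts": [{"text": system_text}]}
    if temperature is not None:
        body["generationConfig"] = {"temperature": temperature}
    if labels:
        # Labels are free-form key-value metadata attached to the call.
        body["labels"] = dict(labels)
    return body


request_body = build_request_body(
    "Why is the sky blue?",
    system_text="Answer as concisely as possible.",
    temperature=0.2,
    labels={"team": "docs"},  # hypothetical label
)
print(json.dumps(request_body, indent=2))
```

Optional fields are simply left out when they are not set, which keeps the serialized body minimal.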

`contents`

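The base structured data type containing multi-part content of a message.

This class consists of two main properties: role and parts. The role property denotes the producer of the content, while the parts property contains multiple elements, each representing a segment of data within a message.

Parameters
`role`	Optional: `string`. The identity of the entity that creates the message. The following values are supported: `user`: Indicates that the message is sent by a real person, typically a user-generated message. `model`: Indicates that the message is generated by the model. The `model` value is used to insert messages from the model into the conversation during multi-turn conversations. For non-multi-turn conversations, this field can be left blank or unset.
`parts`	`Part` A list of ordered parts that make up a single message. Different parts may have different IANA MIME types. For limits on the inputs, such as the maximum number of tokens or the number of images, see the model specifications on the Google models page. To compute the number of tokens in your request, see Get token count.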
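
The role and parts structure for a multi-turn conversation might be built up as in the sketch below. The helper `append_turn` is a hypothetical name used only for illustration, and the conversation text is made up.

```python
def append_turn(history, role, text):
    """Return a new history list with one more text-only message appended."""
    if role not in ("user", "model"):
        raise ValueError(f"unsupported role: {role!r}")
    return history + [{"role": role, "parts": [{"text": text}]}]


# Conversation history followed by the latest user request.
contents = []
contents = append_turn(contents, "user", "Hello")
contents = append_turn(contents, "model", "Hi! How can I help?")
contents = append_turn(contents, "user", "Suggest a name for a dried-flower shop.")

for message in contents:
    print(message["role"], "->", message["parts"][0]["text"])
```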

`parts`

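A data type containing media that is part of a multi-part Content message.

Parameters
`text`	Optional: `string`. A text prompt or code snippet.
`inlineData`	Optional: `Blob`. Inline data in raw bytes. For `gemini-1.0-pro-vision`, you can specify at most 1 image by using `inlineData`. To specify up to 16 images, use `fileData`.
`fileData`	Optional: `fileData`. Data stored in a file.
`functionCall`	Optional: `FunctionCall`. Contains a string representing the `FunctionDeclaration.name` field and a structured JSON object containing any parameters for the function call predicted by the model. See Function calling.
`functionResponse`	Optional: `FunctionResponse`. The result output of a `FunctionCall`, containing a string representing the `FunctionDeclaration.name` field and a structured JSON object containing any output from the function call. It is used as context to the model. See Function calling.
`videoMetadata`	Optional: `VideoMetadata`. For video input, the start and end offset of the video in Duration format. For example, to specify a 10 second clip starting at 1:00, set `"startOffset": { "seconds": 60 }` and `"endOffset": { "seconds": 70 }`. The metadata should only be specified while the video data is presented in `inlineData` or `fileData`.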
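
The clip example in the `videoMetadata` row can be reproduced with a small helper. This is only a sketch: `clip_offsets` is a hypothetical function, and the Cloud Storage URI is a sample path borrowed from the examples elsewhere on this page.

```python
def clip_offsets(start_mmss, duration_seconds):
    """Convert an 'M:SS' start time and a clip length into videoMetadata offsets."""
    minutes, seconds = (int(piece) for piece in start_mmss.split(":"))
    start = minutes * 60 + seconds
    return {
        "startOffset": {"seconds": start},
        "endOffset": {"seconds": start + duration_seconds},
    }


# A part that pairs file-based video data with its clip metadata.
video_part = {
    "fileData": {
        "mimeType": "video/mp4",
        "fileUri": "gs://cloud-samples-data/video/animals.mp4",
    },
    "videoMetadata": clip_offsets("1:00", 10),
}
print(video_part["videoMetadata"])
```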

`blob`

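Content blob. If possible, send this as text rather than raw bytes.

Parameters
`mimeType`	`string` The media type of the file specified in the `data` or `fileUri` fields. Acceptable values include the following: `application/pdf` `audio/mpeg` `audio/mp3` `audio/wav` `image/png` `image/jpeg` `image/webp` `text/plain` `video/mov` `video/mpeg` `video/mp4` `video/mpg` `video/avi` `video/wmv` `video/mpegps` `video/flv` For `gemini-1.0-pro-vision`, the maximum video length is 2 minutes. For Gemini 1.5 Pro and Gemini 1.5 Flash, the maximum length of an audio file is 8.4 hours and the maximum length of a video file (without audio) is one hour. For more information, see Gemini 1.5 Pro media requirements. Text files must be UTF-8 encoded. The contents of the text file count toward the token limit. There is no limit on image resolution.
`data`	`bytes` The base64 encoding of the image, PDF, or video to include inline in the prompt. When including media inline, you must also specify the media type (`mimeType`) of the data. Size limit: 20MB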
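
A minimal sketch of preparing an `inlineData` part from raw bytes follows. The helper name `inline_part` is hypothetical, and treating the 20MB limit as 20 * 1024 * 1024 raw bytes checked before encoding is an assumption made for illustration; the page does not say how the limit is measured.

```python
import base64

# Assumption: the 20MB limit is interpreted as 20 MiB of raw bytes.
MAX_INLINE_BYTES = 20 * 1024 * 1024


def inline_part(raw, mime_type):
    """Base64-encode raw bytes into an inlineData part, rejecting oversized input."""
    if len(raw) > MAX_INLINE_BYTES:
        raise ValueError("payload exceeds the inline size limit; consider fileData")
    encoded = base64.b64encode(raw).decode("ascii")
    # mimeType must always accompany inline data.
    return {"inlineData": {"mimeType": mime_type, "data": encoded}}


part = inline_part(b"hello", "text/plain")
print(part)
```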

FileData

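URI or web URL data.

Parameters
`mimeType`	`string` IANA MIME type of the data.
`fileUri`	`string` The URI or URL of the file to include in the prompt. Acceptable values include the following: Cloud Storage bucket URI: The object must either be publicly readable or reside in the same Google Cloud project that's sending the request. For `gemini-1.5-pro` and `gemini-1.5-flash`, the size limit is 2 GB. For `gemini-1.0-pro-vision`, the size limit is 20 MB. HTTP URL: The file URL must be publicly readable. You can specify one video file, one audio file, and up to 10 image files per request. Audio files, video files, and documents can't exceed 15 MB. YouTube video URL: The YouTube video must either be owned by the account that you used to sign in to the Google Cloud console or be public. Only one YouTube video URL is supported per request. When specifying a `fileURI`, you must also specify the media type (`mimeType`) of the file.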

`functionCall`

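A predicted functionCall returned from the model that contains a string representing the functionDeclaration.name and a structured JSON object containing the parameters and their values.

Parameters
`name`	`string` The name of the function to call.
`args`	`Struct` The function parameters and values in JSON object format. See Function calling for parameter details.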

`functionResponse`

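The resulting output from a FunctionCall that contains a string representing the FunctionDeclaration.name. It also contains a structured JSON object with the output from the function, which is used as context for the model. This should contain the result of a FunctionCall made based on the model prediction.

Parameters
`name`	`string` The name of the function to call.
`response`	`Struct` The function response in JSON object format.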
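
To make the relationship between `functionCall` and `functionResponse` concrete, the sketch below takes a predicted call, runs a matching local function, and wraps the result in a `functionResponse` part. The `get_weather` function, the registry, and the sample arguments are all hypothetical and only serve as an illustration.

```python
def get_weather(location):
    """Hypothetical local function that the model asked to be called."""
    return {"location": location, "forecast": "sunny"}


# Maps FunctionDeclaration names to local callables (illustrative only).
REGISTRY = {"get_weather": get_weather}


def answer_function_call(function_call):
    """Execute a predicted functionCall and build the functionResponse part."""
    name = function_call["name"]
    result = REGISTRY[name](**function_call.get("args", {}))
    # The response reuses the same name so the model can match it to its call.
    return {"functionResponse": {"name": name, "response": result}}


predicted = {"name": "get_weather", "args": {"location": "Boston"}}
reply_part = answer_function_call(predicted)
print(reply_part)
```

The resulting part would be sent back to the model in a later turn so that it can use the function output as context.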

`videoMetadata`

Metadata describing the input video content.

Parameters
`startOffset`	Optional: `google.protobuf.Duration`. The start offset of the video.
`endOffset`	Optional: `google.protobuf.Duration`. The end offset of the video.

`safetySetting`

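Safety settings.

Parameters
`category`	Optional: `HarmCategory`. The safety category to configure a threshold for. Acceptable values include the following: `HARM_CATEGORY_SEXUALLY_EXPLICIT` `HARM_CATEGORY_HATE_SPEECH` `HARM_CATEGORY_HARASSMENT` `HARM_CATEGORY_DANGEROUS_CONTENT`
`threshold`	Optional: `HarmBlockThreshold`. The threshold for blocking responses that could belong to the specified safety category based on probability. `OFF` `BLOCK_NONE` `BLOCK_LOW_AND_ABOVE` `BLOCK_MEDIUM_AND_ABOVE` `BLOCK_ONLY_HIGH`
`method`	Optional: `HarmBlockMethod`. Specifies whether the threshold is used for the probability score or the severity score. If not specified, the threshold is used for the probability score.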
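
Since `safetySettings` is a list with one entry per category, a small helper can apply one threshold to every category. This is a sketch only; the helper name `safety_settings` is hypothetical, while the enum strings are the ones listed in the table above.

```python
CATEGORIES = (
    "HARM_CATEGORY_SEXUALLY_EXPLICIT",
    "HARM_CATEGORY_HATE_SPEECH",
    "HARM_CATEGORY_HARASSMENT",
    "HARM_CATEGORY_DANGEROUS_CONTENT",
)
THRESHOLDS = (
    "OFF",
    "BLOCK_NONE",
    "BLOCK_LOW_AND_ABOVE",
    "BLOCK_MEDIUM_AND_ABOVE",
    "BLOCK_ONLY_HIGH",
)


def safety_settings(threshold, categories=CATEGORIES):
    """Build a safetySettings list that applies one threshold to each category."""
    if threshold not in THRESHOLDS:
        raise ValueError(f"unknown threshold: {threshold!r}")
    return [{"category": c, "threshold": threshold} for c in categories]


settings = safety_settings("BLOCK_ONLY_HIGH")
for entry in settings:
    print(entry)
```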

`harmCategory`

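Harm categories that block content.

Parameters
`HARM_CATEGORY_UNSPECIFIED`	The harm category is unspecified.
`HARM_CATEGORY_HATE_SPEECH`	The harm category is hate speech.
`HARM_CATEGORY_DANGEROUS_CONTENT`	The harm category is dangerous content.
`HARM_CATEGORY_HARASSMENT`	The harm category is harassment.
`HARM_CATEGORY_SEXUALLY_EXPLICIT`	The harm category is sexually explicit content.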

`harmBlockThreshold`

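Probability threshold levels used to block a response.

Parameters
`HARM_BLOCK_THRESHOLD_UNSPECIFIED`	Unspecified harm block threshold.
`BLOCK_LOW_AND_ABOVE`	Block content at the low threshold and higher (that is, block more).
`BLOCK_MEDIUM_AND_ABOVE`	Block content at the medium threshold and higher.
`BLOCK_ONLY_HIGH`	Block only content at the high threshold (that is, block less).
`BLOCK_NONE`	Block none.
`OFF`	Switches off safety if all categories are turned OFF.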

`harmBlockMethod`

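A probability threshold that blocks a response based on a combination of probability and severity.

Parameters
`HARM_BLOCK_METHOD_UNSPECIFIED`	The harm block method is unspecified.
`SEVERITY`	The harm block method uses both probability and severity scores.
`PROBABILITY`	The harm block method uses the probability score.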

`generationConfig`

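Configuration settings used when generating the prompt.

Parameters
`temperature`	Optional: `float`. The temperature is used for sampling during response generation, which occurs when `topP` and `topK` are applied. Temperature controls the degree of randomness in token selection. Lower temperatures are good for prompts that require a less open-ended or creative response, while higher temperatures can lead to more diverse or creative results. A temperature of `0` means that the highest probability tokens are always selected. In this case, responses for a given prompt are mostly deterministic, but a small amount of variation is still possible. If the model returns a response that's too generic, too short, or the model gives a fallback response, try increasing the temperature. Range for `gemini-1.5-flash`: `0.0 - 2.0` (default: `1.0`) Range for `gemini-1.5-pro`: `0.0 - 2.0` (default: `1.0`) Range for `gemini-1.0-pro-vision`: `0.0 - 1.0` (default: `0.4`) Range for `gemini-1.0-pro-002`: `0.0 - 2.0` (default: `1.0`) Range for `gemini-1.0-pro-001`: `0.0 - 1.0` (default: `0.9`) For more information, see Content generation parameters.
`topP`	Optional: `float`. If specified, nucleus sampling is used. Top-P changes how the model selects tokens for output. Tokens are selected from the most probable (see top-K) to the least probable until the sum of their probabilities equals the top-P value. For example, if tokens A, B, and C have a probability of 0.3, 0.2, and 0.1 and the top-P value is `0.5`, then the model selects either A or B as the next token by using temperature and excludes C as a candidate. Specify a lower value for less random responses and a higher value for more random responses. Range: `0.0 - 1.0` Default for `gemini-1.5-flash`: `0.95` Default for `gemini-1.5-pro`: `0.95` Default for `gemini-1.0-pro`: `1.0` Default for `gemini-1.0-pro-vision`: `1.0`
`topK`	Optional: Top-K changes how the model selects tokens for output. A top-K of `1` means the selected token is the most probable among all tokens in the model's vocabulary (also called greedy decoding), while a top-K of `3` means that the next token is selected from among the 3 most probable tokens by using temperature. For each token selection step, the top-K tokens with the highest probabilities are sampled. Then tokens are further filtered based on top-P, with the final token selected using temperature sampling. Specify a lower value for less random responses and a higher value for more random responses. Range: `1-40` Supported by `gemini-1.0-pro-vision` only. Default for `gemini-1.0-pro-vision`: `32`
`candidateCount`	Optional: `int`. The number of response variations to return. For each request, you're charged for the output tokens of all candidates, but are charged only once for the input tokens. Specifying multiple candidates is a Preview feature that works with `generateContent` (`streamGenerateContent` is not supported). The following models are supported: Gemini 1.5 Flash: `1`-`8`, default: `1` Gemini 1.5 Pro: `1`-`8`, default: `1` Gemini 1.0 Pro: `1`-`8`, default: `1`
`maxOutputTokens`	Optional: int The maximum number of tokens that can be generated in the response. A token is approximately four characters. 100 tokens correspond to roughly 60-80 words. Specify a lower value for shorter responses and a higher value for potentially longer responses. For more information, see Content generation parameters.
`stopSequences`	Optional: `List[string]`. Specifies a list of strings that tells the model to stop generating text if one of the strings is encountered in the response. If a string appears multiple times in the response, then the response truncates where it's first encountered. The strings are case-sensitive. For example, if the following is the returned response when `stopSequences` isn't specified: `public static string reverse(string myString)` Then the returned response with `stopSequences` set to `["Str", "reverse"]` is: `public static string` The list can contain a maximum of 5 items. For more information, see Content generation parameters.
`presencePenalty`	Optional: `float`. Positive penalties. Positive values penalize tokens that already appear in the generated text, increasing the probability of generating more diverse content. The maximum value for `presencePenalty` is up to, but not including, `2.0`. Its minimum value is `-2.0`. Supported by `gemini-1.5-pro` and `gemini-1.5-flash`.
`frequencyPenalty`	Optional: `float`. Positive values penalize tokens that repeatedly appear in the generated text, decreasing the probability of repeating content. The maximum value for `frequencyPenalty` is up to, but not including, `2.0`. Its minimum value is `-2.0`. Supported by `gemini-1.5-pro` and `gemini-1.5-flash`.
`responseMimeType`	Optional: `string (enum)`. Available for the following models: `gemini-1.5-pro` `gemini-1.5-flash` The output response MIME type of the generated candidate text. The following MIME types are supported: `application/json`: JSON response in the candidates. `text/plain` (default): Plain text output. `text/x.enum`: For classification tasks, output an enum value as defined in the response schema. Specify the appropriate response type to avoid unintended behaviors. For example, if you require a JSON-formatted response, specify `application/json` and not `text/plain`.
`responseSchema`	Optional: schema The schema that the generated candidate text must follow. For more information, see Control generated output. You must specify the `responseMimeType` parameter to use this parameter. Available for the following models: `gemini-1.5-pro` `gemini-1.5-flash`
`seed`	Optional: `int`. When the seed is fixed to a specific value, the model makes a best effort to provide the same response for repeated requests. Deterministic output isn't guaranteed. Also, changing the model or parameter settings, such as the temperature, can cause variations in the response even when you use the same seed value. By default, a random seed value is used. Available for the following models: `gemini-1.5-pro` `gemini-1.5-flash` `gemini-1.0-pro-002` This is a preview feature.
`responseLogprobs`	Optional: `boolean`. If true, returns the log probabilities of the tokens that were chosen by the model at each step. By default, this parameter is set to `false`. Available for the following models: `gemini-1.5-flash` This is a preview feature.
`logprobs`	Optional: `int`. Returns the log probabilities of the top candidate tokens at each generation step. The model's chosen token might not be the same as the top candidate token at each step. Specify the number of candidates to return by using an integer value in the range of `1`-`5`. You must enable `responseLogprobs` to use this parameter. This is a preview feature.
`audioTimestamp`	Optional: `boolean`. Available for the following models: `gemini-1.5-pro-002` `gemini-1.5-flash-002` Enables timestamp understanding for audio-only files. This is a preview feature.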
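
The `topK`, `topP`, and `stopSequences` descriptions above can be illustrated with a toy sketch. This is not how the service is implemented; it only mirrors the worded rules (keep the K most probable tokens, then keep tokens until their cumulative probability reaches top-P, and cut a response at the earliest stop string). The function names are hypothetical.

```python
def filter_candidates(probs, top_k=None, top_p=1.0):
    """Return the tokens that remain eligible after top-K and then top-P filtering."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]  # keep only the K most probable tokens
    kept, cumulative = [], 0.0
    for token, probability in ranked:
        kept.append(token)
        cumulative += probability
        # Small tolerance guards against floating-point rounding in the sum.
        if cumulative >= top_p - 1e-9:
            break
    return kept


def apply_stop_sequences(text, stop_sequences):
    """Truncate text at the earliest, case-sensitive occurrence of any stop string."""
    cut = len(text)
    for stop in stop_sequences:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]


eligible = filter_candidates({"A": 0.3, "B": 0.2, "C": 0.1}, top_p=0.5)
truncated = apply_stop_sequences(
    "public static string reverse(string myString)", ["Str", "reverse"]
)
print(eligible)
print(repr(truncated))
```

With the probabilities from the `topP` row, only A and B remain eligible, matching the example in the table. The final token would then be drawn from the remaining candidates using temperature sampling, which this sketch leaves out.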

Response body

{
  "candidates": [
    {
      "content": {
        "parts": [
          {
            "text": string
          }
        ]
      },
      "finishReason": enum (FinishReason),
      "safetyRatings": [
        {
          "category": enum (HarmCategory),
          "probability": enum (HarmProbability),
          "blocked": boolean
        }
      ],
      "citationMetadata": {
        "citations": [
          {
            "startIndex": integer,
            "endIndex": integer,
            "uri": string,
            "title": string,
            "license": string,
            "publicationDate": {
              "year": integer,
              "month": integer,
              "day": integer
            }
          }
        ]
      },
      "avgLogprobs": double,
      "logprobsResult": {
        "topCandidates": [
          {
            "candidates": [
              {
                "token": string,
                "logProbability": float
              }
            ]
          }
        ],
        "chosenCandidates": [
          {
            "token": string,
            "logProbability": float
          }
        ]
      }
    }
  ],
  "usageMetadata": {
    "promptTokenCount": integer,
    "candidatesTokenCount": integer,
    "totalTokenCount": integer
  },
  "modelVersion": string
}

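Response element	Description
`modelVersion`	The model and version used for generation. For example: `gemini-1.5-flash-002`.
`text`	The generated text.
`finishReason`	The reason why the model stopped generating tokens. If empty, the model has not stopped generating tokens. Because the response uses the prompt for context, it's not possible to change the behavior of how the model stops generating tokens. `FINISH_REASON_STOP`: Natural stop point of the model or provided stop sequence. `FINISH_REASON_MAX_TOKENS`: The maximum number of tokens specified in the request was reached. `FINISH_REASON_SAFETY`: Token generation was stopped because the response was flagged for safety reasons. Note that `Candidate.content` is empty if content filters block the output. `FINISH_REASON_RECITATION`: Token generation was stopped because the response was flagged for unauthorized citations. `FINISH_REASON_BLOCKLIST`: Token generation was stopped because the response includes blocked terms. `FINISH_REASON_PROHIBITED_CONTENT`: Token generation was stopped because the response was flagged for prohibited content, such as child sexual abuse material (CSAM). `FINISH_REASON_SPII`: Token generation was stopped because the response was flagged for sensitive personally identifiable information (SPII). `FINISH_REASON_MALFORMED_FUNCTION_CALL`: Candidates were blocked because of a malformed and unparsable function call. `FINISH_REASON_OTHER`: All other reasons that stopped token generation. `FINISH_REASON_UNSPECIFIED`: The finish reason is unspecified.
`category`	The safety category to configure a threshold for. Acceptable values include the following: `HARM_CATEGORY_SEXUALLY_EXPLICIT` `HARM_CATEGORY_HATE_SPEECH` `HARM_CATEGORY_HARASSMENT` `HARM_CATEGORY_DANGEROUS_CONTENT`
`probability`	The harm probability levels in the content. `HARM_PROBABILITY_UNSPECIFIED` `NEGLIGIBLE` `LOW` `MEDIUM` `HIGH`
`blocked`	A boolean flag associated with a safety attribute that indicates whether the model's input or output was blocked.
`startIndex`	An integer that specifies where a citation starts in the `content`.
`endIndex`	An integer that specifies where a citation ends in the `content`.
`uri`	The URL of a citation source. Examples of a URL source might be a news website or a GitHub repository.
`title`	The title of a citation source. Examples of source titles might be that of a news article or a book.
`license`	The license associated with a citation.
`publicationDate`	The date a citation was published. Its valid formats are `YYYY`, `YYYY-MM`, and `YYYY-MM-DD`.
`avgLogprobs`	The average log probability of the candidate.
`logprobsResult`	Returns the top candidate tokens (`topCandidates`) and the actual chosen tokens (`chosenCandidates`) at each step.
`token`	Generative AI models break down text data into tokens for processing, which can be characters, words, or phrases.
`logProbability`	A log probability value that indicates the model's confidence for a particular token.
`promptTokenCount`	The number of tokens in the request.
`candidatesTokenCount`	The number of tokens in the response.
`totalTokenCount`	The number of tokens in the request and response.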
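
As an illustration of reading these elements, the sketch below pulls the text, finish reason, blocked categories, and token totals out of a response that has already been parsed into a dictionary. The helper name `summarize_response` and the sample payload are hand-written assumptions for this example, not real API output, and the finish reason string is simply copied from the table above.

```python
def summarize_response(response):
    """Extract commonly used fields from a parsed generateContent response dict."""
    candidate = response["candidates"][0]
    # content can be empty when a content filter blocks the output.
    parts = candidate.get("content", {}).get("parts", [])
    return {
        "text": "".join(part.get("text", "") for part in parts),
        "finishReason": candidate.get("finishReason"),
        "blockedCategories": [
            rating["category"]
            for rating in candidate.get("safetyRatings", [])
            if rating.get("blocked")
        ],
        "totalTokenCount": response.get("usageMetadata", {}).get("totalTokenCount"),
    }


# Hand-written sample shaped like the response body above (not real output).
sample_response = {
    "candidates": [
        {
            "content": {"parts": [{"text": "Everlasting "}, {"text": "Blooms"}]},
            "finishReason": "FINISH_REASON_STOP",
            "safetyRatings": [
                {
                    "category": "HARM_CATEGORY_HARASSMENT",
                    "probability": "NEGLIGIBLE",
                    "blocked": False,
                }
            ],
        }
    ],
    "usageMetadata": {
        "promptTokenCount": 12,
        "candidatesTokenCount": 3,
        "totalTokenCount": 15,
    },
    "modelVersion": "gemini-1.5-flash-002",
}

summary = summarize_response(sample_response)
print(summary)
```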

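Examples

Non-streaming text response

Generate a non-streaming model response from a text input.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
MODEL_ID: The model ID of the model that you want to use (for example, gemini-1.5-flash-002). See the list of supported models.
TEXT: The text instructions to include in the prompt.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent

Request JSON body: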

{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "TEXT"
    }]
  }]
}
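
The URL and body above can also be assembled programmatically. The following sketch uses only the Python standard library and stops short of sending the request; the function name, the project ID `my-project`, and the placeholder token are assumptions for illustration. A real access token would come from your own authentication flow.

```python
import json
import urllib.request


def build_generate_content_request(project_id, location, model_id, text, access_token):
    """Build (but do not send) a POST request for the generateContent method."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/{model_id}:generateContent"
    )
    body = {"contents": [{"role": "user", "parts": [{"text": text}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )


request = build_generate_content_request(
    "my-project",            # hypothetical project ID
    "us-central1",
    "gemini-1.5-flash-002",
    "Why is the sky blue?",
    "ACCESS_TOKEN",          # placeholder, not a real token
)
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it, which requires valid credentials and network access.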

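To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: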

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent"

PowerShell

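Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: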

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent" | Select-Object -Expand Content

Python

import vertexai
from vertexai.generative_models import GenerativeModel

# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel("gemini-1.5-flash-002")

response = model.generate_content(
    "What's a good name for a flower shop that specializes in selling bouquets of dried flowers?"
)

print(response.text)
# Example response:
# **Emphasizing the Dried Aspect:**
# * Everlasting Blooms
# * Dried & Delightful
# * The Petal Preserve
# ...

NodeJS

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
async function generate_from_text_input(projectId = 'PROJECT_ID') {
  const vertexAI = new VertexAI({project: projectId, location: 'us-central1'});

  const generativeModel = vertexAI.getGenerativeModel({
    model: 'gemini-1.5-flash-001',
  });

  const prompt =
    "What's a good name for a flower shop that specializes in selling bouquets of dried flowers?";

  const resp = await generativeModel.generateContent(prompt);
  const contentResponse = await resp.response;
  console.log(JSON.stringify(contentResponse));
}

Java

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.ResponseHandler;

public class QuestionAnswer {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    String output = simpleQuestion(projectId, location, modelName);
    System.out.println(output);
  }

  // Asks a question to the specified Vertex AI Gemini model and returns the generated answer.
  public static String simpleQuestion(String projectId, String location, String modelName)
      throws Exception {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      String output;
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);
      // Send the question to the model for processing.
      GenerateContentResponse response = model.generateContent("Why is the sky blue?");
      // Extract the generated text from the model's response.
      output = ResponseHandler.getText(response);
      return output;
    }
  }
}

Go

import (
	"context"
	"encoding/json"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
)

func generateContentFromText(w io.Writer, projectID string) error {
	location := "us-central1"
	modelName := "gemini-1.5-flash-001"

	ctx := context.Background()
	client, err := genai.NewClient(ctx, projectID, location)
	if err != nil {
		return fmt.Errorf("error creating client: %w", err)
	}
	gemini := client.GenerativeModel(modelName)
	prompt := genai.Text(
		"What's a good name for a flower shop that specializes in selling bouquets of dried flowers?")

	resp, err := gemini.GenerateContent(ctx, prompt)
	if err != nil {
		return fmt.Errorf("error generating content: %w", err)
	}
	// See the JSON response in
	// https://pkg.go.dev/cloud.google.com/go/vertexai/genai#GenerateContentResponse.
	rb, err := json.MarshalIndent(resp, "", "  ")
	if err != nil {
		return fmt.Errorf("json.MarshalIndent: %w", err)
	}
	fmt.Fprintln(w, string(rb))
	return nil
}

C#


using Google.Cloud.AIPlatform.V1;
using System;
using System.Threading.Tasks;

public class TextInputSample
{
    public async Task<string> TextInput(
        string projectId = "your-project-id",
        string location = "us-central1",
        string publisher = "google",
        string model = "gemini-1.5-flash-001")
    {

        var predictionServiceClient = new PredictionServiceClientBuilder
        {
            Endpoint = $"{location}-aiplatform.googleapis.com"
        }.Build();
        string prompt = @"What's a good name for a flower shop that specializes in selling bouquets of dried flowers?";

        var generateContentRequest = new GenerateContentRequest
        {
            Model = $"projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}",
            Contents =
            {
                new Content
                {
                    Role = "USER",
                    Parts =
                    {
                        new Part { Text = prompt }
                    }
                }
            }
        };

        GenerateContentResponse response = await predictionServiceClient.GenerateContentAsync(generateContentRequest);

        string responseText = response.Candidates[0].Content.Parts[0].Text;
        Console.WriteLine(responseText);

        return responseText;
    }
}

REST (OpenAI)

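You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
MODEL_ID: The name of the model to use.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body: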

{
  "model": "google/MODEL_ID",
  "messages": [{
    "role": "user",
    "content": "Write a story about a magic backpack."
  }]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

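Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: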

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

Python (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

import vertexai
import openai

from google.auth import default, transport

# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
location = "us-central1"

vertexai.init(project=PROJECT_ID, location=location)

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
auth_request = transport.requests.Request()
credentials.refresh(auth_request)

# # OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT_ID}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-1.5-flash-002",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
)

print(response.choices[0].message.content)
# Example response:
# The sky is blue due to a phenomenon called **Rayleigh scattering**.
# Sunlight is made up of all the colors of the rainbow.
# As sunlight enters the Earth's atmosphere ...

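Non-streaming multimodal response

Generate a non-streaming model response from a multimodal input, such as text and an image.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region to process the request.
MODEL_ID: The model ID of the model that you want to use (for example, gemini-1.5-flash-002). See the list of supported models.
TEXT: The text instructions to include in the prompt.
FILE_URI: The Cloud Storage URI of the file that stores the data.
MIME_TYPE: The IANA MIME type of the data.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent

Request JSON body: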

{
  "contents": [{
    "role": "user",
    "parts": [
      {
        "text": "TEXT"
      },
      {
        "fileData": {
          "fileUri": "FILE_URI",
          "mimeType": "MIME_TYPE"
        }
      }
    ]
  }]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent"

PowerShell

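Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: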

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:generateContent" | Select-Object -Expand Content

Python

import vertexai

from vertexai.generative_models import GenerativeModel, Part

# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel("gemini-1.5-flash-002")

response = model.generate_content(
    [
        Part.from_uri(
            "gs://cloud-samples-data/generative-ai/image/scones.jpg",
            mime_type="image/jpeg",
        ),
        "What is shown in this image?",
    ]
)

print(response.text)
# That's a lovely overhead shot of a rustic-style breakfast or brunch spread.
# Here's what's in the image:
# * **Blueberry scones:** Several freshly baked blueberry scones are arranged on parchment paper.
# They look crumbly and delicious.
# ...

NodeJS

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
async function createNonStreamingMultipartContent(
  projectId = 'PROJECT_ID',
  location = 'us-central1',
  model = 'gemini-1.5-flash-001',
  image = 'gs://generativeai-downloads/images/scones.jpg',
  mimeType = 'image/jpeg'
) {
  // Initialize Vertex with your Cloud project and location
  const vertexAI = new VertexAI({project: projectId, location: location});

  // Instantiate the model
  const generativeVisionModel = vertexAI.getGenerativeModel({
    model: model,
  });

  // For images, the SDK supports both Google Cloud Storage URI and base64 strings
  const filePart = {
    fileData: {
      fileUri: image,
      mimeType: mimeType,
    },
  };

  const textPart = {
    text: 'what is shown in this image?',
  };

  const request = {
    contents: [{role: 'user', parts: [filePart, textPart]}],
  };

  console.log('Prompt Text:');
  console.log(request.contents[0].parts[1].text);

  console.log('Non-Streaming Response Text:');

  // Generate a response
  const response = await generativeVisionModel.generateContent(request);

  // Select the text from the response
  const fullTextResponse =
    response.response.candidates[0].content.parts[0].text;

  console.log(fullTextResponse);
}

Java

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.api.GenerateContentResponse;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.PartMaker;
import com.google.cloud.vertexai.generativeai.ResponseHandler;

public class Multimodal {
  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    String output = nonStreamingMultimodal(projectId, location, modelName);
    System.out.println(output);
  }

  // Ask a simple question and get the response.
  public static String nonStreamingMultimodal(String projectId, String location, String modelName)
      throws Exception {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);

      String videoUri = "gs://cloud-samples-data/video/animals.mp4";
      String imgUri = "gs://cloud-samples-data/generative-ai/image/character.jpg";

      // Get the response from the model.
      GenerateContentResponse response = model.generateContent(
          ContentMaker.fromMultiModalData(
              PartMaker.fromMimeTypeAndData("video/mp4", videoUri),
              PartMaker.fromMimeTypeAndData("image/jpeg", imgUri),
              "Are this video and image correlated?"
          ));

      // Extract the generated text from the model's response.
      String output = ResponseHandler.getText(response);
      return output;
    }
  }
}

Go

import (
	"context"
	"encoding/json"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
)

func tryGemini(w io.Writer, projectID string, location string, modelName string) error {
	// location := "us-central1"
	// modelName := "gemini-1.5-flash-001"

	ctx := context.Background()
	client, err := genai.NewClient(ctx, projectID, location)
	if err != nil {
		return fmt.Errorf("error creating client: %w", err)
	}
	gemini := client.GenerativeModel(modelName)

	img := genai.FileData{
		MIMEType: "image/jpeg",
		FileURI:  "gs://generativeai-downloads/images/scones.jpg",
	}
	prompt := genai.Text("What is in this image?")

	resp, err := gemini.GenerateContent(ctx, img, prompt)
	if err != nil {
		return fmt.Errorf("error generating content: %w", err)
	}
	rb, err := json.MarshalIndent(resp, "", "  ")
	if err != nil {
		return fmt.Errorf("json.MarshalIndent: %w", err)
	}
	fmt.Fprintln(w, string(rb))
	return nil
}

C#


using Google.Api.Gax.Grpc;
using Google.Cloud.AIPlatform.V1;
using System.Text;
using System.Threading.Tasks;

public class GeminiQuickstart
{
    public async Task<string> GenerateContent(
        string projectId = "your-project-id",
        string location = "us-central1",
        string publisher = "google",
        string model = "gemini-1.5-flash-001"
    )
    {
        // Create client
        var predictionServiceClient = new PredictionServiceClientBuilder
        {
            Endpoint = $"{location}-aiplatform.googleapis.com"
        }.Build();

        // Initialize content request
        var generateContentRequest = new GenerateContentRequest
        {
            Model = $"projects/{projectId}/locations/{location}/publishers/{publisher}/models/{model}",
            GenerationConfig = new GenerationConfig
            {
                Temperature = 0.4f,
                TopP = 1,
                TopK = 32,
                MaxOutputTokens = 2048
            },
            Contents =
            {
                new Content
                {
                    Role = "USER",
                    Parts =
                    {
                        new Part { Text = "What's in this photo?" },
                        new Part { FileData = new() { MimeType = "image/jpeg", FileUri = "gs://generativeai-downloads/images/scones.jpg" } }
                    }
                }
            }
        };

        // Make the request, returning a streaming response
        using PredictionServiceClient.StreamGenerateContentStream response = predictionServiceClient.StreamGenerateContent(generateContentRequest);

        StringBuilder fullText = new();

        // Read streaming responses from server until complete
        AsyncResponseStream<GenerateContentResponse> responseStream = response.GetResponseStream();
        await foreach (GenerateContentResponse responseItem in responseStream)
        {
            fullText.Append(responseItem.Candidates[0].Content.Parts[0].Text);
        }

        return fullText.ToString();
    }
}

REST (OpenAI)

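You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The name of the model to use.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body: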

{
  "model": "google/MODEL_ID",
  "messages": [{
    "role": "user",
    "content": [
       {
          "type": "text",
          "text": "Describe the following image:"
       },
       {
          "type": "image_url",
          "image_url": {
             "url": "gs://generativeai-downloads/images/character.jpg"
          }
       }
     ]
  }]
}
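The request body above can also be assembled programmatically. The following is a minimal Python sketch that only builds the URL and JSON payload shown in this section and does not send anything. The helper names `openai_endpoint` and `build_chat_request` are hypothetical and are not part of any SDK.

```python
import json


def openai_endpoint(project_id: str, location: str) -> str:
    """Return the chat-completions URL used in this section."""
    return (
        f"https://{location}-aiplatform.googleapis.com/v1beta1/"
        f"projects/{project_id}/locations/{location}/"
        "endpoints/openapi/chat/completions"
    )


def build_chat_request(model_id: str, prompt: str, image_uri: str) -> dict:
    """Build a one-turn request with a text part and an image part."""
    return {
        "model": f"google/{model_id}",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_uri}},
                ],
            }
        ],
    }


if __name__ == "__main__":
    body = build_chat_request(
        "gemini-1.5-flash-002",
        "Describe the following image:",
        "gs://generativeai-downloads/images/character.jpg",
    )
    print(openai_endpoint("your-project-id", "us-central1"))
    print(json.dumps(body, indent=2))
```

Writing the output of `json.dumps` to request.json gives a file in the same shape as the body above.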

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

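Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: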

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

Python (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

import vertexai
import openai

from google.auth import default, transport

# TODO(developer): Update and un-comment below lines
# PROJECT_ID = "your-project-id"
location = "us-central1"

vertexai.init(project=PROJECT_ID, location=location)

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
auth_request = transport.requests.Request()
credentials.refresh(auth_request)

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT_ID}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-1.5-flash-002",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the following image:"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "gs://cloud-samples-data/generative-ai/image/scones.jpg"
                    },
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
# Example response:
# Here's a description of the image:
# High-angle, close-up view of a rustic arrangement featuring several blueberry scones
# on a piece of parchment paper. The scones are golden-brown...

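Stream text response

Generate a streaming model response from a text input.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The model ID of the model that you want to use (for example, gemini-1.5-flash-002). See the list of supported models.
TEXT: The text instructions to include in the prompt.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent

Request JSON body: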

{
  "contents": [{
    "role": "user",
    "parts": [{
      "text": "TEXT"
    }]
  }]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent"

PowerShell

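Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: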

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent" | Select-Object -Expand Content
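The REST call above returns the answer in pieces. The sketch below reassembles the text. It assumes the response body is a JSON array of chunk objects, each using the `candidates[].content.parts[].text` shape that the SDK samples in this section read. Verify that assumption against the response body reference before relying on it. `collect_text` is a hypothetical helper name.

```python
import json


def collect_text(raw_body: str) -> str:
    """Concatenate the text parts of every chunk in a streamed response body."""
    chunks = json.loads(raw_body)
    pieces = []
    for chunk in chunks:
        for candidate in chunk.get("candidates", []):
            # A final chunk may carry only metadata and no content.
            parts = candidate.get("content", {}).get("parts", [])
            pieces.extend(part["text"] for part in parts if "text" in part)
    return "".join(pieces)


if __name__ == "__main__":
    # Hand-written sample data, not real model output.
    sample = json.dumps([
        {"candidates": [{"content": {"role": "model", "parts": [{"text": "Once upon "}]}}]},
        {"candidates": [{"content": {"role": "model", "parts": [{"text": "a time."}]}}]},
        {"candidates": [{"finishReason": "STOP"}]},
    ])
    print(collect_text(sample))
```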

Python

import vertexai

from vertexai.generative_models import GenerativeModel

# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel("gemini-1.5-flash-002")
responses = model.generate_content(
    "Write a story about a magic backpack.", stream=True
)

for response in responses:
    print(response.text)
# Example response:
# El
# ara wasn't looking for magic. She was looking for rent money.
# Her tiny apartment, perched precariously on the edge of Whispering Woods,
# ...
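The loop above prints each chunk on its own line, which is why the example output splits "El" and "ara" across two lines. The sketch below joins the chunks into one continuous string instead. It assumes each chunk exposes a `.text` attribute, as in the sample above. `stream_to_text` is a hypothetical helper, and the fake chunks stand in for a real response stream. A real stream may contain chunks without text, which this sketch does not handle.

```python
from types import SimpleNamespace
from typing import Iterable


def stream_to_text(responses: Iterable, echo: bool = False) -> str:
    """Join streamed chunks into one string, optionally echoing them as they arrive."""
    pieces = []
    for response in responses:
        pieces.append(response.text)
        if echo:
            # end="" avoids the line break that print() adds after each chunk.
            print(response.text, end="", flush=True)
    if echo:
        print()
    return "".join(pieces)


if __name__ == "__main__":
    # Stand-in chunks; a real call would pass the result of
    # model.generate_content(..., stream=True).
    fake_stream = [
        SimpleNamespace(text="El"),
        SimpleNamespace(text="ara wasn't looking for magic."),
    ]
    print(stream_to_text(fake_stream))
```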

NodeJS

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
const PROJECT_ID = process.env.CAIP_PROJECT_ID;
const LOCATION = process.env.LOCATION;
const MODEL = 'gemini-1.5-flash-001';

async function generateContent() {
  // Initialize Vertex with your Cloud project and location
  const vertexAI = new VertexAI({project: PROJECT_ID, location: LOCATION});

  // Instantiate the model
  const generativeModel = vertexAI.getGenerativeModel({
    model: MODEL,
  });

  const request = {
    contents: [
      {
        role: 'user',
        parts: [
          {
            text: 'Write a story about a magic backpack.',
          },
        ],
      },
    ],
  };

  console.log(JSON.stringify(request));

  const result = await generativeModel.generateContentStream(request);
  for await (const item of result.stream) {
    console.log(item.candidates[0].content.parts[0].text);
  }
}

Java

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.generativeai.GenerativeModel;

public class StreamingQuestionAnswer {

  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    streamingQuestion(projectId, location, modelName);
  }

  // Ask a simple question and get the response via streaming.
  public static void streamingQuestion(String projectId, String location, String modelName)
      throws Exception {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);

      // Stream the result.
      model.generateContentStream("Write a story about a magic backpack.")
          .stream()
          .forEach(System.out::println);

      System.out.println("Streaming complete.");
    }
  }
}

Go

import (
	"context"
	"errors"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
	"google.golang.org/api/iterator"
)

// generateContent shows how to send a basic streaming text prompt, writing
// the response to the provided io.Writer.
func generateContent(w io.Writer, projectID, modelName string) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, projectID, "us-central1")
	if err != nil {
		return fmt.Errorf("unable to create client: %w", err)
	}
	defer client.Close()

	model := client.GenerativeModel(modelName)

	iter := model.GenerateContentStream(
		ctx,
		genai.Text("Write a story about a magic backpack."),
	)
	for {
		resp, err := iter.Next()
		if err == iterator.Done {
			return nil
		}
		// Check the error before reading resp, which is nil when the call fails.
		if err != nil {
			return err
		}
		if len(resp.Candidates) == 0 || len(resp.Candidates[0].Content.Parts) == 0 {
			return errors.New("empty response from model")
		}
		fmt.Fprint(w, "generated response: ")
		for _, c := range resp.Candidates {
			for _, p := range c.Content.Parts {
				fmt.Fprintf(w, "%s ", p)
			}
		}
	}
}

REST (OpenAI)

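You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The name of the model to use.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body: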

{
  "model": "google/MODEL_ID",
  "stream": true,
  "messages": [{
    "role": "user",
    "content": "Write a story about a magic backpack."
  }]
}
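With `"stream": true`, the answer arrives incrementally rather than as one JSON object. The sketch below extracts the text from such a stream. It assumes the endpoint follows the OpenAI streaming convention of server-sent events: `data:` lines that each carry a JSON chunk, ending with a `data: [DONE]` line. This page does not confirm that framing for this endpoint, so treat it as an assumption. `iter_sse_content` is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield the text deltas found in OpenAI-style `data:` event lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream marker
        event = json.loads(payload)
        for choice in event.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                yield content


if __name__ == "__main__":
    # Hand-written sample lines, not captured from a real response.
    sample = [
        'data: {"choices": [{"delta": {"role": "assistant", "content": "Once "}}]}',
        "",
        'data: {"choices": [{"delta": {"content": "upon a time."}}]}',
        "",
        "data: [DONE]",
    ]
    print("".join(iter_sse_content(sample)))
```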

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

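Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: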

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

Python (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

import vertexai
import openai

from google.auth import default, transport

# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
location = "us-central1"

vertexai.init(project=PROJECT_ID, location=location)

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
auth_request = transport.requests.Request()
credentials.refresh(auth_request)

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT_ID}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-1.5-flash-002",
    messages=[{"role": "user", "content": "Why is the sky blue?"}],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content)
# Example response:
# The sky is blue due to a phenomenon called **Rayleigh scattering**. Sunlight is
# made up of all the colors of the rainbow. When sunlight enters the Earth's atmosphere,
# it collides with tiny air molecules (mostly nitrogen and oxygen). ...

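Stream multimodal response

Generate a streaming model response from a multimodal input, such as text and an image.

REST

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The model ID of the model that you want to use (for example, gemini-1.5-flash-002). See the list of supported models.
TEXT: The text instructions to include in the prompt.
FILE_URI1: The Cloud Storage URI of the file that stores the data.
MIME_TYPE1: The IANA MIME type of the data.
FILE_URI2: The Cloud Storage URI of the file that stores the data.
MIME_TYPE2: The IANA MIME type of the data.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent

Request JSON body: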

{
  "contents": [{
    "role": "user",
    "parts": [
      {
        "text": "TEXT"
      },
      {
        "fileData": {
          "fileUri": "FILE_URI1",
          "mimeType": "MIME_TYPE1"
        }
      },
      {
        "fileData": {
          "fileUri": "FILE_URI2",
          "mimeType": "MIME_TYPE2"
        }
      }
    ]
  }]
}
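The body above has one text part followed by one `fileData` part per file. The sketch below generates it from a list of (URI, MIME type) pairs. `build_multimodal_request` is a hypothetical helper name. It only builds the dictionary and does not call the API.

```python
import json
from typing import Iterable, Tuple


def build_multimodal_request(text: str, files: Iterable[Tuple[str, str]]) -> dict:
    """Build a single-turn body with one text part and one fileData part per file.

    `files` is an iterable of (file_uri, mime_type) pairs.
    """
    parts = [{"text": text}]
    for file_uri, mime_type in files:
        parts.append({"fileData": {"fileUri": file_uri, "mimeType": mime_type}})
    return {"contents": [{"role": "user", "parts": parts}]}


if __name__ == "__main__":
    body = build_multimodal_request(
        "Are these video and image correlated?",
        [
            ("gs://cloud-samples-data/generative-ai/video/animals.mp4", "video/mp4"),
            ("gs://cloud-samples-data/generative-ai/image/character.jpg", "image/jpeg"),
        ],
    )
    print(json.dumps(body, indent=2))
```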

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent"

PowerShell

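Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: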

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/MODEL_ID:streamGenerateContent" | Select-Object -Expand Content

Python

import vertexai

from vertexai.generative_models import GenerativeModel, Part

# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"

vertexai.init(project=PROJECT_ID, location="us-central1")

model = GenerativeModel("gemini-1.5-flash-002")
responses = model.generate_content(
    [
        Part.from_uri(
            "gs://cloud-samples-data/generative-ai/video/animals.mp4", "video/mp4"
        ),
        Part.from_uri(
            "gs://cloud-samples-data/generative-ai/image/character.jpg",
            "image/jpeg",
        ),
        "Are these video and image correlated?",
    ],
    stream=True,
)

for response in responses:
    print(response.candidates[0].content.text)
# Example response:
# No, the video and image are not correlated. The video shows a Google Photos
# project where animals at the Los Angeles Zoo take selfies using modified cameras.
# The image is a simple drawing of a wizard.

NodeJS

const {VertexAI} = require('@google-cloud/vertexai');

/**
 * TODO(developer): Update these variables before running the sample.
 */
const PROJECT_ID = process.env.CAIP_PROJECT_ID;
const LOCATION = process.env.LOCATION;
const MODEL = 'gemini-1.5-flash-001';

async function generateContent() {
  // Initialize Vertex AI
  const vertexAI = new VertexAI({project: PROJECT_ID, location: LOCATION});
  const generativeModel = vertexAI.getGenerativeModel({model: MODEL});

  const request = {
    contents: [
      {
        role: 'user',
        parts: [
          {
            fileData: {
              fileUri: 'gs://cloud-samples-data/video/animals.mp4',
              mimeType: 'video/mp4',
            },
          },
          {
            fileData: {
              fileUri:
                'gs://cloud-samples-data/generative-ai/image/character.jpg',
              mimeType: 'image/jpeg',
            },
          },
          {text: 'Are these video and image correlated?'},
        ],
      },
    ],
  };

  const result = await generativeModel.generateContentStream(request);

  for await (const item of result.stream) {
    console.log(item.candidates[0].content.parts[0].text);
  }
}

Java

import com.google.cloud.vertexai.VertexAI;
import com.google.cloud.vertexai.generativeai.ContentMaker;
import com.google.cloud.vertexai.generativeai.GenerativeModel;
import com.google.cloud.vertexai.generativeai.PartMaker;

public class StreamingMultimodal {
  public static void main(String[] args) throws Exception {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "your-google-cloud-project-id";
    String location = "us-central1";
    String modelName = "gemini-1.5-flash-001";

    streamingMultimodal(projectId, location, modelName);
  }

  // Ask a question about a video and an image, and get the response via streaming.
  public static void streamingMultimodal(String projectId, String location, String modelName)
      throws Exception {
    // Initialize client that will be used to send requests.
    // This client only needs to be created once, and can be reused for multiple requests.
    try (VertexAI vertexAI = new VertexAI(projectId, location)) {
      GenerativeModel model = new GenerativeModel(modelName, vertexAI);

      String videoUri = "gs://cloud-samples-data/video/animals.mp4";
      String imgUri = "gs://cloud-samples-data/generative-ai/image/character.jpg";

      // Stream the result.
      model.generateContentStream(
          ContentMaker.fromMultiModalData(
              PartMaker.fromMimeTypeAndData("video/mp4", videoUri),
              PartMaker.fromMimeTypeAndData("image/jpeg", imgUri),
              "Are these video and image correlated?"
          ))
          .stream()
          .forEach(System.out::println);
    }
  }
}

Go

import (
	"context"
	"errors"
	"fmt"
	"io"

	"cloud.google.com/go/vertexai/genai"
	"google.golang.org/api/iterator"
)

func generateContent(w io.Writer, projectID, modelName string) error {
	ctx := context.Background()

	client, err := genai.NewClient(ctx, projectID, "us-central1")
	if err != nil {
		return fmt.Errorf("unable to create client: %w", err)
	}
	defer client.Close()

	model := client.GenerativeModel(modelName)
	iter := model.GenerateContentStream(
		ctx,
		genai.FileData{
			MIMEType: "video/mp4",
			FileURI:  "gs://cloud-samples-data/generative-ai/video/animals.mp4",
		},
		genai.FileData{
			MIMEType: "image/jpeg",
			FileURI:  "gs://cloud-samples-data/generative-ai/image/character.jpg",
		},
		genai.Text("Are these video and image correlated?"),
	)
	for {
		resp, err := iter.Next()
		if err == iterator.Done {
			return nil
		}
		// Check the error before reading resp, which is nil when the call fails.
		if err != nil {
			return err
		}
		if len(resp.Candidates) == 0 || len(resp.Candidates[0].Content.Parts) == 0 {
			return errors.New("empty response from model")
		}

		fmt.Fprint(w, "generated response: ")
		for _, c := range resp.Candidates {
			for _, p := range c.Content.Parts {
				fmt.Fprintf(w, "%s ", p)
			}
		}
		fmt.Fprint(w, "\n")
	}
}

REST (OpenAI)

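You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your project ID.
LOCATION: The region that processes the request.
MODEL_ID: The name of the model to use.

HTTP method and URL:

POST https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions

Request JSON body: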

{
  "model": "google/MODEL_ID",
  "stream": true,
  "messages": [{
    "role": "user",
    "content": [
       {
          "type": "text",
          "text": "Describe the following image:"
       },
       {
          "type": "image_url",
          "image_url": {
             "url": "gs://generativeai-downloads/images/character.jpg"
          }
       }
     ]
  }]
}

To send your request, choose one of these options:

curl

Save the request body in a file named request.json, and execute the following command:

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions"

PowerShell

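Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: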

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1beta1/projects/PROJECT_ID/locations/LOCATION/endpoints/openapi/chat/completions" | Select-Object -Expand Content

Python (OpenAI)

You can call the Inference API by using the OpenAI library. For more information, see Call Vertex AI models by using the OpenAI library.

import vertexai
import openai

from google.auth import default, transport

# TODO(developer): Update and un-comment below line
# PROJECT_ID = "your-project-id"
location = "us-central1"

vertexai.init(project=PROJECT_ID, location=location)

# Programmatically get an access token
credentials, _ = default(scopes=["https://www.googleapis.com/auth/cloud-platform"])
auth_request = transport.requests.Request()
credentials.refresh(auth_request)

# OpenAI Client
client = openai.OpenAI(
    base_url=f"https://{location}-aiplatform.googleapis.com/v1beta1/projects/{PROJECT_ID}/locations/{location}/endpoints/openapi",
    api_key=credentials.token,
)

response = client.chat.completions.create(
    model="google/gemini-1.5-flash-002",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe the following image:"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "gs://cloud-samples-data/generative-ai/image/scones.jpg"
                    },
                },
            ],
        }
    ],
    stream=True,
)
for chunk in response:
    print(chunk.choices[0].delta.content)
# Example response:
# Here's a description of the image:
# High-angle, close-up view of a rustic scene featuring several blueberry
# scones arranged on a piece of parchment paper...

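Model versions

To use the auto-updated version, specify the model name without the trailing version number, for example gemini-1.5-flash instead of gemini-1.5-flash-001.

For more information, see Gemini model versions and lifecycle.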
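As a small illustration of the naming rule, the sketch below derives the auto-updated name from a pinned model ID. It assumes the version suffix is always a hyphen followed by three digits, as in the model IDs listed on this page. `auto_updated_name` is a hypothetical helper, not an SDK function.

```python
import re

# Assumption: pinned versions end in "-NNN" (three digits), e.g. "-001".
_VERSION_SUFFIX = re.compile(r"-\d{3}$")


def auto_updated_name(model_id: str) -> str:
    """Strip a trailing three-digit version suffix, if present."""
    return _VERSION_SUFFIX.sub("", model_id)


if __name__ == "__main__":
    for name in ("gemini-1.5-flash-001", "gemini-1.5-pro-002", "gemini-1.0-pro"):
        print(name, "->", auto_updated_name(name))
```

IDs without a version suffix are returned unchanged, so the helper can be applied to either form.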

What's next

Learn more about the Gemini API.
Learn more about function calling.
Learn more about grounding responses for Gemini models.