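Image captions

imagetext is the name of the model that supports image captioning. imagetext generates a caption for an image you provide, in the language you specify. The model supports the following languages: English (en), German (de), French (fr), Spanish (es), and Italian (it).

To explore this model in the console, see the Image Captioning model card in Model Garden.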

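Use cases

Some common use cases for image captioning include:

- Creators can generate captions for uploaded images and videos (for example, a short description of a video sequence)
- Generate captions to describe products
- Integrate captioning into an app through the API to create new experiences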

HTTP request

POST https://us-central1-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/us-central1/publishers/google/models/imagetext:predict

Request body

{
  "instances": [
    {
      "image": {
        // Union field can be only one of the following:
        "bytesBase64Encoded": string,
        "gcsUri": string,
        // End of list of possible types for union field.
        "mimeType": string
      }
    }
  ],
  "parameters": {
    "sampleCount": integer,
    "storageUri": string,
    "language": string,
    "seed": integer
  }
}

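Use the following parameters for the Imagen model imagetext. For more information, see Get image descriptions using visual captioning.

Parameter	Description	Acceptable values
`instances`	An array that contains the object with the image details to get information about.	Array (1 image object allowed)
`bytesBase64Encoded`	The image to caption.	Base64-encoded image string (PNG or JPEG, 20 MB max)
`gcsUri`	The Cloud Storage URI of the image to caption.	String URI of the image file in Cloud Storage (PNG or JPEG, 20 MB max)
`mimeType`	Optional. The MIME type of the image you specify.	String (`image/jpeg` or `image/png`)
`sampleCount`	The number of text strings to generate.	Integer value: 1-3
`seed`	Optional. The seed for the random number generator (RNG). If the RNG seed is the same for requests with the same inputs, the prediction results are the same.	Integer
`storageUri`	Optional. The Cloud Storage location to save the generated text responses.	String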
`language`	Optional. The language code of the generated captions.	String: `en` (default), `de`, `fr`, `it`, `es`
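The constraints in the parameter table can be checked on the client before a request is sent. The following is a minimal sketch in Python, not part of any official client library: the function name `build_caption_request` and its keyword arguments are hypothetical, and the checks only mirror the accepted values listed above.

```python
SUPPORTED_LANGUAGES = {"en", "de", "fr", "it", "es"}
SUPPORTED_MIME_TYPES = {"image/jpeg", "image/png"}


def build_caption_request(*, bytes_b64=None, gcs_uri=None, mime_type=None,
                          sample_count=1, language="en",
                          seed=None, storage_uri=None):
    """Build a request body for imagetext:predict (hypothetical helper)."""
    # The image source is a union field: exactly one of the two may be set.
    if (bytes_b64 is None) == (gcs_uri is None):
        raise ValueError("set exactly one of bytes_b64 or gcs_uri")
    if not 1 <= sample_count <= 3:
        raise ValueError("sample_count must be an integer from 1 to 3")
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    if mime_type is not None and mime_type not in SUPPORTED_MIME_TYPES:
        raise ValueError(f"unsupported MIME type: {mime_type!r}")

    if bytes_b64 is not None:
        image = {"bytesBase64Encoded": bytes_b64}
    else:
        image = {"gcsUri": gcs_uri}
    if mime_type is not None:
        image["mimeType"] = mime_type

    parameters = {"sampleCount": sample_count, "language": language}
    # Optional parameters are only included when the caller supplies them.
    if seed is not None:
        parameters["seed"] = seed
    if storage_uri is not None:
        parameters["storageUri"] = storage_uri

    # The table allows exactly one image object in the instances array.
    return {"instances": [{"image": image}], "parameters": parameters}
```

Rejecting invalid values locally is a design choice of this sketch; the service performs its own validation regardless.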

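Sample request

REST

To test a text prompt by using the Vertex AI API, send a POST request to the publisher model endpoint.

Before using any of the request data, make the following replacements:

PROJECT_ID: Your Google Cloud project ID.
LOCATION: Your project's region. For example, us-central1, europe-west2, or asia-northeast3. For a list of available regions, see Generative AI on Vertex AI locations.
B64_IMAGE: The image to get captions for. The image must be specified as a base64-encoded byte string. Size limit: 10 MB.
RESPONSE_COUNT: The number of image captions you want to generate. Accepted integer values: 1-3.
LANGUAGE_CODE: One of the supported language codes. Languages supported:
- English (en)
- French (fr)
- German (de)
- Italian (it)
- Spanish (es)

HTTP method and URL: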

POST https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict

Request JSON body:

{
  "instances": [
    {
      "image": {
          "bytesBase64Encoded": "B64_IMAGE"
      }
    }
  ],
  "parameters": {
    "sampleCount": RESPONSE_COUNT,
    "language": "LANGUAGE_CODE"
  }
}
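The B64_IMAGE placeholder can be filled in programmatically. The following Python sketch reads a local image, base64-encodes it, and writes the request body shown above to a file. The function name `write_request_file` and its arguments are hypothetical; the default output name `request.json` matches the file name used by the commands in this section.

```python
import base64
import json
from pathlib import Path


def write_request_file(image_path, out_path="request.json",
                       response_count=1, language_code="en"):
    """Encode a local image and save the request body as JSON (hypothetical helper)."""
    # Read the raw image bytes and encode them as an ASCII base64 string.
    image_bytes = Path(image_path).read_bytes()
    b64_image = base64.b64encode(image_bytes).decode("ascii")

    body = {
        "instances": [{"image": {"bytesBase64Encoded": b64_image}}],
        "parameters": {
            "sampleCount": response_count,
            "language": language_code,
        },
    }
    Path(out_path).write_text(json.dumps(body, indent=2), encoding="utf-8")
    return body
```

For example, `write_request_file("photo.jpg", response_count=2, language_code="es")` would produce a `request.json` asking for two Spanish captions, where `photo.jpg` is a placeholder path.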

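To send your request, choose one of these options:

curl

Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login, or by using Cloud Shell, which automatically logs you into the gcloud CLI. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: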

curl -X POST \
     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
     -H "Content-Type: application/json; charset=utf-8" \
     -d @request.json \
     "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict"

PowerShell

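Note: The following command assumes that you have logged in to the gcloud CLI with your user account by running gcloud init or gcloud auth login. You can check the currently active account by running gcloud auth list.

Save the request body in a file named request.json, and execute the following command: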

$cred = gcloud auth print-access-token
$headers = @{ "Authorization" = "Bearer $cred" }

Invoke-WebRequest `
    -Method POST `
    -Headers $headers `
    -ContentType: "application/json; charset=utf-8" `
    -InFile request.json `
    -Uri "https://LOCATION-aiplatform.googleapis.com/v1/projects/PROJECT_ID/locations/LOCATION/publishers/google/models/imagetext:predict" | Select-Object -Expand Content
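As an alternative to the curl and PowerShell commands above, the same request can be assembled with the Python standard library. This is a sketch under assumptions: the function name `build_predict_request` is hypothetical, and the access token is expected to come from `gcloud auth print-access-token`, as in the commands above. The function only constructs the request object; it does not send it.

```python
import json
import urllib.request


def build_predict_request(project_id, location, access_token, body):
    """Return a POST request for the imagetext:predict endpoint (hypothetical helper)."""
    url = (
        f"https://{location}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{location}/publishers/google/models/imagetext:predict"
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json; charset=utf-8",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` sends the request; the response body can then be read and decoded as JSON.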

The following sample responses are for a request with "sampleCount": 2. The response returns two prediction strings.

English (en):

{
  "predictions": [
    "a yellow mug with a sheep on it sits next to a slice of cake",
    "a cup of coffee with a heart shaped latte art next to a slice of cake"
  ],
  "deployedModelId": "DEPLOYED_MODEL_ID",
  "model": "projects/PROJECT_ID/locations/LOCATION/models/MODEL_ID",
  "modelDisplayName": "MODEL_DISPLAYNAME",
  "modelVersionId": "1"
}

Spanish (es):

{
  "predictions": [
    "una taza de café junto a un plato de pastel de chocolate",
    "una taza de café con una forma de corazón en la espuma"
  ]
}

Response body

{
  "predictions": [ string ]
}

Response element	Description
`predictions`	List of text strings representing captions, sorted by confidence.

Sample response

{
  "predictions": [
    "text1",
    "text2"
  ]
}
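A response of this shape can be read with a few lines of Python. The sketch below uses a hypothetical helper named `extract_captions` and assumes only the `predictions` field described above; any other fields in the response are ignored.

```python
import json


def extract_captions(response_text):
    """Return the caption strings from a predict response (hypothetical helper)."""
    payload = json.loads(response_text)
    predictions = payload.get("predictions", [])
    # Guard against an unexpected shape instead of failing later on.
    if not isinstance(predictions, list) or not all(
        isinstance(item, str) for item in predictions
    ):
        raise ValueError("expected 'predictions' to be a list of strings")
    return predictions


sample = '{"predictions": ["text1", "text2"]}'
captions = extract_captions(sample)
# Captions are listed in order of confidence, so the first entry ranks highest.
best_caption = captions[0] if captions else None
```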