Ask or Find with Images as query

Agentic RAG now supports multimodal search: you can find information and get generative answers using images as queries. This feature lets you combine visual context with a textual question, opening the door to more vision-based use cases.

This is perfect for scenarios like:

Component Identification: Combining an image of a vehicle part with the query "What vehicles does this fit?"
Technical Specifications: Using a photo of an industrial lathe with the query "Show me the dimensions for this machine."
Product Catalogs: Searching with an image of a product and asking, "Does this come in different colors?"

For detailed examples of specific use cases, keep an eye out for the related article on nuclia.com.

How it works

Using an image as part of a query is available on both the find and ask endpoints.

To use this feature, include the query_image object in your request payload.

Parameters

The query_image object requires the following fields:

b64encoded: The base64-encoded image data.
content-type: The MIME type of the image (e.g., image/jpeg, image/png).
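
As an illustration, the two fields above could be assembled from a local file as in the sketch below. The helper name build_query_image is hypothetical and not part of any SDK, and the MIME type is guessed from the file extension with the standard library, which assumes the extension is accurate.

```python
import base64
import mimetypes
from pathlib import Path


def build_query_image(image_path: str) -> dict:
    """Return a dict shaped like the query_image object (hypothetical helper)."""
    path = Path(image_path)
    # Guess the MIME type from the file extension (assumes the extension is correct).
    content_type, _ = mimetypes.guess_type(path.name)
    if content_type is None or not content_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path.name!r}")
    # Base64-encode the raw bytes and decode to str so the value is JSON-serializable.
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return {"b64encoded": encoded, "content-type": content_type}
```

The returned dictionary can be placed under the query_image key of a find or ask request body.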

Notes

You must always include a text query alongside the image. The text query provides context for the image search.
The functionality is only available if the generative model selected supports image inputs.
The effectiveness of the image search may vary depending on the generative model selected.
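
The first note can be enforced on the client side before a request is sent. The following is a minimal sketch, assuming a hypothetical helper named build_image_query_payload; it only builds and checks the JSON body and does not verify whether the selected generative model supports image inputs.

```python
import json


def build_image_query_payload(query: str, b64encoded: str, content_type: str) -> str:
    """Build a JSON request body with a text query and an image (hypothetical helper)."""
    # A text query is mandatory: it provides the context for the image search.
    if not query or not query.strip():
        raise ValueError("A text query is required alongside the image")
    # Reject values that are clearly not image MIME types.
    if not content_type.startswith("image/"):
        raise ValueError(f"Expected an image MIME type, got {content_type!r}")
    payload = {
        "query": query,
        "query_image": {
            "b64encoded": b64encoded,
            "content-type": content_type,
        },
    }
    return json.dumps(payload)
```

Failing early here avoids sending a request that lacks the required text query.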

API & SDK Examples

Example with ask

Through the API

POST https://{region-x}.rag.progress.cloud/api/v1/kb/{YOUR-KB-ID}/ask
{
  "query": "What vehicles does this part fit?",
  "query_image": {
    "b64encoded": "{BASE64_ENCODED_IMAGE}",
    "content-type": "image/jpeg"
  }
}

With the nuclia.py SDK

import base64
from nuclia import sdk
from nucliadb_models.search import AskRequest, Image

search = sdk.NucliaSearch()

# Read the image file and encode it as base64
image_path = "path/to/my_image.jpg"
with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

# Perform the ask
request = AskRequest(
    query="What vehicles does this part fit?",
    query_image=Image(
        b64encoded=image_data,
        content_type="image/jpeg",
    ),
)
ask_result = search.ask(query=request)

Example with find

Through the API

POST https://{region-x}.rag.progress.cloud/api/v1/kb/{YOUR-KB-ID}/find
{
  "query": "What vehicles does this part fit?",
  "query_image": {
    "b64encoded": "{BASE64_ENCODED_IMAGE}",
    "content-type": "image/jpeg"
  }
}

With the nuclia.py SDK

import base64
from nuclia import sdk
from nucliadb_models.search import FindRequest, Image

search = sdk.NucliaSearch()

# Read the image file and encode it as base64
image_path = "path/to/my_image.jpg"
with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

# Perform the find
request = FindRequest(
    query="What vehicles does this part fit?",
    query_image=Image(
        b64encoded=image_data,
        content_type="image/jpeg",
    ),
)
find_result = search.find(query=request)
