Ask or Find with Images as query
Agentic RAG now supports multimodal search: you can find information and get generative answers using images as queries. This feature lets you combine visual context with a textual question, opening the door to more vision-based use cases.
This is perfect for scenarios like:
- Component Identification: Combining an image of a vehicle part with the query "What vehicles does this fit?"
- Technical Specifications: Using a photo of an industrial lathe with the query "Show me the dimensions for this machine."
- Product Catalogs: Searching with an image of a product and asking, "Does this come in different colors?"
For detailed examples of specific use cases, keep an eye out for the related article on nuclia.com.
How it works
Using an image as part of a query is available on both the find and ask endpoints.
To use this feature, include the query_image object in your request payload.
Parameters
As part of this query_image object, you will need to supply:
- b64encoded: The base64-encoded image data.
- content-type: The MIME type of the image (e.g., image/jpeg, image/png).
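A minimal sketch of building the query_image object from a local file, using only the Python standard library. The helper name build_query_image is hypothetical, and guessing the MIME type from the file extension is an assumption made for convenience; pass the type explicitly when you know it.

```python
import base64
import mimetypes
from pathlib import Path


def build_query_image(image_path, content_type=None):
    """Return a dict shaped like the query_image object (hypothetical helper)."""
    path = Path(image_path)
    if content_type is None:
        # Assumption: the file extension reflects the real image format.
        content_type, _ = mimetypes.guess_type(path.name)
    if content_type is None or not content_type.startswith("image/"):
        raise ValueError(f"Cannot determine an image MIME type for {path.name!r}")
    # The API expects the raw image bytes as a base64 string.
    encoded = base64.b64encode(path.read_bytes()).decode("utf-8")
    return {"b64encoded": encoded, "content-type": content_type}
```

The returned dictionary can be placed under the query_image key of a request payload.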
Notes
- You must always include a text query alongside the image. The text query provides context for the image search.
- The functionality is only available if the generative model selected supports image inputs.
- The effectiveness of the image search may vary depending on the generative model selected.
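The text-query requirement can be enforced on the client before a request is sent. The sketch below assumes a hypothetical helper named build_image_query_payload; whether the selected generative model accepts image inputs cannot be checked locally, so only the text-query rule is covered.

```python
def build_image_query_payload(query, b64encoded, content_type):
    """Assemble a request payload, refusing image-only queries (hypothetical helper)."""
    # The image must always be accompanied by a non-empty text query.
    if not query or not query.strip():
        raise ValueError("A text query is required alongside the image")
    return {
        "query": query,
        "query_image": {
            "b64encoded": b64encoded,
            "content-type": content_type,
        },
    }
```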
API & SDK Examples
Example with ask
Through the API
POST https://{region-x}.rag.progress.cloud/api/v1/kb/{YOUR-KB-ID}/ask
{
  "query": "What vehicles does this part fit?",
  "query_image": {
    "b64encoded": "{BASE64_ENCODED_IMAGE}",
    "content-type": "image/jpeg"
  }
}
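One way to prepare this request from Python without the SDK is sketched below, using only the standard library. The function name and its region, kb_id, and auth_headers arguments are hypothetical. Authentication headers are left to the caller because this article does not describe them.

```python
import json
import urllib.request


def build_image_query_request(region, kb_id, payload, endpoint="ask", auth_headers=None):
    """Build (but do not send) a POST request for the ask or find endpoint."""
    if endpoint not in ("ask", "find"):
        raise ValueError("endpoint must be 'ask' or 'find'")
    # URL pattern taken from the examples in this article.
    url = f"https://{region}.rag.progress.cloud/api/v1/kb/{kb_id}/{endpoint}"
    headers = {"Content-Type": "application/json"}
    # Assumption: the caller supplies whatever authentication headers are required.
    headers.update(auth_headers or {})
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers=headers,
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen sends the request; the same helper covers the find endpoint with endpoint="find".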
With the nuclia.py SDK
import base64

from nuclia import sdk
from nucliadb_models.search import AskRequest, Image

search = sdk.NucliaSearch()

# Read the image file and encode it as base64
image_path = "path/to/my_image.jpg"
with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

# Perform the ask
request = AskRequest(
    query="What vehicles does this part fit?",
    query_image=Image(
        b64encoded=image_data,
        content_type="image/jpeg",
    ),
)
ask_result = search.ask(query=request)
Example with find
Through the API
POST https://{region-x}.rag.progress.cloud/api/v1/kb/{YOUR-KB-ID}/find
{
  "query": "What vehicles does this part fit?",
  "query_image": {
    "b64encoded": "{BASE64_ENCODED_IMAGE}",
    "content-type": "image/jpeg"
  }
}
With the nuclia.py SDK
import base64

from nuclia import sdk
from nucliadb_models.search import FindRequest, Image

search = sdk.NucliaSearch()

# Read the image file and encode it as base64
image_path = "path/to/my_image.jpg"
with open(image_path, "rb") as image_file:
    image_data = base64.b64encode(image_file.read()).decode("utf-8")

# Perform the find
request = FindRequest(
    query="What vehicles does this part fit?",
    query_image=Image(
        b64encoded=image_data,
        content_type="image/jpeg",
    ),
)
find_result = search.find(query=request)