Skip to main content

Oracle AI Vector Search: Generate Summary

Oracle AI Vector Search is designed for Artificial Intelligence (AI) workloads that allows you to query data based on semantics, rather than keywords. One of the biggest benefit of Oracle AI Vector Search is that semantic search on unstructured data can be combined with relational search on business data in one single system. This is not only powerful but also significantly more effective because you don't need to add a specialized vector database, eliminating the pain of data fragmentation between multiple systems.

The guide demonstrates how to use Summary Capabilities within Oracle AI Vector Search to generate summary for your documents using OracleSummary.

Prerequisites

Please install Oracle Python Client driver to use Langchain with Oracle AI Vector Search.

# pip install oracledb

Connect to Oracle Database

The following sample code will show how to connect to Oracle Database.

import sys

import oracledb

# please update with your username, password, hostname and service_name
username = "<username>"
password = "<password>"
dsn = "<hostname>/<service_name>"

try:
conn = oracledb.connect(user=username, password=password, dsn=dsn)
print("Connection successful!")
except Exception as e:
print("Connection failed!")
sys.exit(1)

Generate Summary

The Oracle AI Vector Search Langchain library provides APIs to generate summaries of documents. There are a few summary generation provider options including Database, OCIGENAI, HuggingFace and so on. The users can choose their preferred provider to generate a summary. They just need to set the summary parameters accordingly. Please refer to the Oracle AI Vector Search Guide book for complete information about these parameters.

Note: The users may need to set proxy if they want to use some 3rd party summary generation providers other than Oracle's in-house and default provider: 'database'. If you don't have proxy, please remove the proxy parameter when you instantiate the OracleSummary.

# proxy to be used when we instantiate summary and embedder object
proxy = "<proxy>"

The following sample code will show how to generate summary:

from langchain_community.utilities.oracleai import OracleSummary
from langchain_core.documents import Document

"""
# using 'ocigenai' provider
summary_params = {
"provider": "ocigenai",
"credential_name": "OCI_CRED",
"url": "https://inference.generativeai.us-chicago-1.oci.oraclecloud.com/20231130/actions/summarizeText",
"model": "cohere.command",
}

# using 'huggingface' provider
summary_params = {
"provider": "huggingface",
"credential_name": "HF_CRED",
"url": "https://api-inference.huggingface.co/models/",
"model": "facebook/bart-large-cnn",
"wait_for_model": "true"
}
"""

# using 'database' provider
summary_params = {
"provider": "database",
"glevel": "S",
"numParagraphs": 1,
"language": "english",
}

# get the summary instance
# Remove proxy if not required
summ = OracleSummary(conn=conn, params=summary_params, proxy=proxy)
summary = summ.get_summary(
"In the heart of the forest, "
+ "a lone fox ventured out at dusk, seeking a lost treasure. "
+ "With each step, memories flooded back, guiding its path. "
+ "As the moon rose high, illuminating the night, the fox unearthed "
+ "not gold, but a forgotten friendship, worth more than any riches."
)

print(f"Summary generated by OracleSummary: {summary}")

API Reference:

End to End Demo

Please refer to our complete demo guide Oracle AI Vector Search End-to-End Demo Guide to build an end to end RAG pipeline with the help of Oracle AI Vector Search.


Help us out by providing feedback on this documentation page: