Vectorisation

The vectors module allows you to transform a graph into a collection of documents and vectorise those documents using an embedding function. Since the AI space moves quickly, Raphtory allows you to plug in your preferred embedding model either locally or from an API.

With the resulting vectors, you can perform semantic search over your graph data and build AI systems that use graph-based retrieval-augmented generation (RAG).

Vectorise a graph

To vectorise a graph, you must create an embedding function that takes a list of strings and returns a matching list of embeddings. This function can use any model or library you prefer. In this example, we use the openai library and direct it to a local Ollama service that exposes an OpenAI-compatible API.
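To illustrate only the required shape of such a function (a list of strings in, an equally long list of vectors out), here is a minimal sketch that swaps the model call for a deterministic hashing stand-in. The names `toy_embedding` and `DIM` are hypothetical and the vectors carry no real semantic meaning; a real function would call your chosen model instead.

```python
import hashlib
import math

DIM = 16  # hypothetical embedding size chosen for this sketch


def toy_embedding(texts: list[str]) -> list[list[float]]:
    """Stand-in embedding function: returns one fixed-length vector per input string."""
    vectors = []
    for text in texts:
        vec = [0.0] * DIM
        for token in text.lower().split():
            # Hash each token into one of DIM buckets (deterministic across runs).
            digest = hashlib.sha256(token.encode("utf-8")).digest()
            vec[digest[0] % DIM] += 1.0
        # Normalise to unit length; an empty string stays as the zero vector.
        norm = math.sqrt(sum(x * x for x in vec))
        vectors.append([x / norm for x in vec] if norm else vec)
    return vectors
```

The important property is that the output list has the same length and order as the input list, so each document can be matched to its vector.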

When you call vectorise(), Raphtory automatically creates a document for each node and edge in your graph. Optionally, you can pass template strings to vectorise() to control how these documents are formatted. This is useful when you know which properties are semantically relevant or want to present information in a specific format when it is retrieved by a human or machine user. Additionally, you can cache the embedded graph to disk to avoid recomputing the vectors when nothing has changed.

Document templates

The templates for entity documents use a subset of Jinja syntax, as implemented by MiniJinja.

Additionally, graph attributes and properties are exposed so that you can use them in template expressions. The nesting of attributes reflects the Python interface, so you can chain lookups such as properties.prop_name or src.name, which follow the same typing as in Python. By default, Raphtory converts datetime values into milliseconds since the Unix epoch, but it provides an optional datetimeformat function to convert them to a human-readable format.
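To make the millisecond representation concrete, the following sketch performs the same kind of conversion in plain Python. It is not Raphtory's datetimeformat function; the helper name `format_epoch_millis`, the default format string, and the choice of UTC are assumptions made for illustration.

```python
from datetime import datetime, timezone


def format_epoch_millis(millis: int, fmt: str = "%Y-%m-%d %H:%M:%S") -> str:
    """Convert milliseconds since the Unix epoch to a human-readable UTC string."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc).strftime(fmt)


print(format_epoch_millis(0))  # 1970-01-01 00:00:00
```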

Retrieve documents

You can retrieve relevant information from the VectorisedGraph by making selections.

A VectorSelection is a general object for holding embedded documents. You can create an empty selection or perform a similarity query against a VectorisedGraph to populate a new selection.

You can add to a selection by combining existing selections or by adding the documents associated with specific nodes and edges, identified by their IDs. Additionally, you can expand a selection by making similarity queries relative to the entities in the current selection. This uses the graph's relationships to constrain your query.
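The idea behind similarity queries and graph-constrained expansion can be sketched in plain Python. This is a conceptual illustration only, not Raphtory's implementation or API: the functions `cosine`, `top_k`, and `expand_by_similarity`, the use of cosine similarity, and the toy data are all assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def top_k(query_vec, doc_vecs, k, candidates=None):
    """Return the k (entity_id, score) pairs most similar to the query."""
    ids = list(doc_vecs) if candidates is None else list(candidates)
    scored = [(eid, cosine(query_vec, doc_vecs[eid])) for eid in ids]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def expand_by_similarity(selection, query_vec, doc_vecs, neighbours, k):
    """Add the k best matches, considering only neighbours of the current selection."""
    frontier = {n for eid in selection for n in neighbours.get(eid, ())}
    frontier -= set(selection)
    best = top_k(query_vec, doc_vecs, k, sorted(frontier))
    return list(selection) + [eid for eid, _ in best]


# Hypothetical two-dimensional embeddings and adjacency lists.
doc_vecs = {
    "router": [1.0, 0.0],
    "switch": [0.9, 0.1],
    "laptop": [0.8, 0.2],
    "printer": [0.0, 1.0],
}
neighbours = {"router": ["switch", "printer"], "switch": ["router", "laptop"]}

query = [1.0, 0.0]
selection = [eid for eid, _ in top_k(query, doc_vecs, 1)]
expanded = expand_by_similarity(selection, query, doc_vecs, neighbours, 2)
```

In this toy data, "laptop" is more similar to the query than "printer", but only entities adjacent to the current selection are considered during expansion, so "laptop" is not added. That is the sense in which graph relationships constrain the query.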

Once you have a selection containing the information you want you can:

Get the associated graph entities using nodes() or edges().
Get the associated documents using get_documents() or get_documents_with_scores().

Each Document holds a unique entity in the graph, the contents of the associated document, and its vector representation. You can pull any of these out to retrieve information about an entity for a RAG system, compose a subgraph to analyse using Raphtory's algorithms, or feed them into a more complex pipeline.

Asking questions about your network

Using the Network example from the discussion of ingestion using dataframes, you can set up a graph and add some simple AI tools to create a VectorisedGraph:

Using this VectorisedGraph, you can perform similarity queries and feed the results into an LLM to ground its responses in your data.
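One common way to ground a response is to place the retrieved document text in the prompt ahead of the question. The sketch below shows this under stated assumptions: `build_grounded_prompt` is a hypothetical helper, the prompt wording is arbitrary, and the document strings are invented rather than taken from a real graph.

```python
def build_grounded_prompt(question: str, documents: list[str]) -> str:
    """Combine retrieved document text and a question into a single prompt."""
    context = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(documents, start=1))
    return (
        "Answer the question using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )


# Hypothetical document contents standing in for a query result.
prompt = build_grounded_prompt(
    "Which servers talk to ServerA?",
    ["ServerA sends traffic to ServerB.", "ServerC sends traffic to ServerA."],
)
```

The resulting string can then be sent to whichever LLM client you use.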

Bear in mind, however, that LLM responses are statistical, so they will vary between runs. In production systems, you may want to use a structured output tool to enforce a specific format.
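As a minimal illustration of enforcing a format, the sketch below validates a reply that is expected to be a JSON object with fixed fields. It uses only the standard library and is no substitute for a dedicated structured output tool; the schema in `REQUIRED_FIELDS` and the function name `parse_structured_reply` are assumptions.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"answer": str, "sources": list}


def parse_structured_reply(reply: str) -> dict:
    """Parse an LLM reply as JSON and check that the required fields are present."""
    data = json.loads(reply)  # raises json.JSONDecodeError (a ValueError) on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    for field, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(field), expected_type):
            raise ValueError(f"field {field!r} is missing or not a {expected_type.__name__}")
    return data
```

A caller can catch ValueError and retry the request or fall back to a default when the reply does not match the expected shape.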