graph_loader

Load and save Raphtory graphs from/to file(s)

Functions

| Function | Description |
| --- | --- |
| karate_club_graph | karate_club_graph constructs a karate club graph. |
| lotr_graph | Load the Lord of the Rings dataset into a graph. |
| lotr_graph_with_props | Same as lotr_graph() but with additional properties race and gender for some of the nodes. |
| neo4j_movie_graph | Returns the neo4j movie graph example. |
| reddit_hyperlink_graph | Load (a subset of) Reddit hyperlinks dataset into a graph. |
| reddit_hyperlink_graph_local | Returns the Reddit hyperlink graph example. |
| stable_coin_graph | Returns the stablecoin graph example. |

Function Details

karate_club_graph

karate_club_graph constructs a karate club graph.

This function uses Zachary's karate club dataset to create a graph object. Nodes represent members of the club, and edges represent relationships between them. A node property indicates the club to which each member belongs.

Background: These are data collected from the members of a university karate club by Wayne Zachary. The ZACHE matrix represents the presence or absence of ties among the members of the club; the ZACHC matrix indicates the relative strength of the associations (number of situations in and outside the club in which interactions occurred). Zachary (1977) used these data and an information flow model of network conflict resolution to explain the split-up of this group following disputes among the members.

Reference: Zachary W. (1977). An information flow model for conflict and fission in small groups. Journal of Anthropological Research, 33, 452-473.
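The relationship between the two matrices described in the background can be illustrated with a small sketch. The 4-member matrices below are made up for illustration and are not Zachary's data, and the helper name weighted_edges is hypothetical; this does not show how karate_club_graph builds its graph internally. It only shows how a presence/absence matrix (like ZACHE) and a strength matrix (like ZACHC) combine into a weighted edge list.

```python
# Toy 4-member example. These values are invented, NOT Zachary's data.
# ties[i][j] == 1 means members i and j have a tie (ZACHE-style matrix).
ties = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 1, 0],
]
# strength[i][j] is the number of situations in which i and j interacted
# (ZACHC-style matrix).
strength = [
    [0, 4, 2, 0],
    [4, 0, 0, 0],
    [2, 0, 0, 3],
    [0, 0, 3, 0],
]


def weighted_edges(ties, strength):
    """Return (i, j, weight) for each tie, visiting each undirected pair once."""
    edges = []
    n = len(ties)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only: the matrices are symmetric
            if ties[i][j]:
                edges.append((i, j, strength[i][j]))
    return edges


edges = weighted_edges(ties, strength)
print(edges)
```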

Returns

| Type | Description |
| --- | --- |
| Graph | |

lotr_graph

Load the Lord of the Rings dataset into a graph. The dataset is available at https://raw.githubusercontent.com/Raphtory/Data/main/lotr.csv and is a list of interactions between characters in the Lord of the Rings books and movies. The dataset is a CSV file with the following columns:

  • src_id: The ID of the source character
  • dst_id: The ID of the destination character
  • time: The time of the interaction (in page)
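To make the three-column layout concrete, here is a minimal standard-library sketch that reads rows of this shape and counts distinct characters and distinct (source, destination) pairs. The sample rows are invented for illustration and are not taken from the dataset. The sketch also assumes that the file has no header row and that time is an integer; neither is confirmed by this page. The helper name read_interactions is hypothetical and is not part of graph_loader.

```python
import csv
import io

# Invented sample rows in src_id,dst_id,time order (not real dataset rows).
# Assumption: no header row, and time is an integer.
SAMPLE = """Gandalf,Elrond,33
Frodo,Bilbo,114
Gandalf,Frodo,120
Frodo,Bilbo,150
"""


def read_interactions(text):
    """Parse CSV text into a list of (src_id, dst_id, time) tuples."""
    rows = []
    for src_id, dst_id, time in csv.reader(io.StringIO(text)):
        rows.append((src_id, dst_id, int(time)))
    return rows


interactions = read_interactions(SAMPLE)

# Every character that appears as a source or a destination.
nodes = {name for src, dst, _ in interactions for name in (src, dst)}
# Repeated interactions between the same ordered pair collapse to one entry.
pairs = {(src, dst) for src, dst, _ in interactions}

print(len(nodes), len(pairs))
```

Repeated interactions between the same two characters keep their own timestamps in the row list, which is why the number of rows can exceed the number of distinct pairs.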

Dataset statistics:

  • Number of nodes (characters) 139
  • Number of edges (interactions between characters) 701

Returns

| Type | Description |
| --- | --- |
| Graph | A Graph containing the LOTR dataset |

lotr_graph_with_props

Same as lotr_graph(), but with the additional properties race and gender on some of the nodes.

Returns

| Type | Description |
| --- | --- |
| Graph | |

neo4j_movie_graph

Signature: neo4j_movie_graph(uri, username, password, database=...)

Returns the neo4j movie graph example.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| database | str, optional | ... | |
| password | str | - | |
| uri | str | - | |
| username | str | - | |

Returns

| Type | Description |
| --- | --- |
| Graph | |

reddit_hyperlink_graph

Signature: reddit_hyperlink_graph(timeout_seconds=600)

Load (a subset of) the Reddit hyperlinks dataset into a graph. The dataset is available at http://snap.stanford.edu/data/soc-redditHyperlinks-title.tsv

The hyperlink network represents the directed connections between two subreddits (a subreddit is a community on Reddit). We also provide subreddit embeddings. The network is extracted from publicly available Reddit data of 2.5 years from Jan 2014 to April 2017.

NOTE: It may take a while to download the dataset.

Dataset statistics:

  • Number of nodes (subreddits) 35,776
  • Number of edges (hyperlink between subreddits) 137,821
  • Timespan Jan 2014 - April 2017

Source:

  • S. Kumar, W.L. Hamilton, J. Leskovec, D. Jurafsky. Community Interaction and Conflict on the Web. World Wide Web Conference, 2018.

Properties:

  • SOURCE_SUBREDDIT: the subreddit where the link originates
  • TARGET_SUBREDDIT: the subreddit where the link ends
  • POST_ID: the post in the source subreddit that starts the link
  • TIMESTAMP: time of the post
  • POST_LABEL: label indicating whether the source post is explicitly negative towards the target post. The value is -1 if the source is negative towards the target, and 1 if it is neutral or positive. The label is created using crowd-sourcing and a trained text-based classifier, and is better than simple sentiment analysis of the posts. Please see the reference paper for details.
  • POST_PROPERTIES: a vector representing the text properties of the source post, given as a list of comma-separated numbers. Details can be found on the source website.
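The properties listed above can be decoded with a short standard-library sketch. Everything in the sample is an assumption made for illustration: the row is invented, the header is assumed to use the property names exactly as listed here, and the TIMESTAMP format is assumed to be "YYYY-MM-DD HH:MM:SS". The helper name parse_rows is hypothetical and is not part of graph_loader.

```python
import csv
import io
from datetime import datetime

# Invented header and row (tab-separated). The column names follow the
# property list above; the timestamp format is an assumption.
HEADER = (
    "SOURCE_SUBREDDIT\tTARGET_SUBREDDIT\tPOST_ID\t"
    "TIMESTAMP\tPOST_LABEL\tPOST_PROPERTIES\n"
)
ROW = "examplesub_a\texamplesub_b\tabc123\t2014-01-01 12:00:00\t-1\t345.0,298.0,0.75\n"


def parse_rows(text):
    """Yield one dict per hyperlink row with typed fields."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    for row in reader:
        yield {
            "src": row["SOURCE_SUBREDDIT"],
            "dst": row["TARGET_SUBREDDIT"],
            "post_id": row["POST_ID"],
            "time": datetime.strptime(row["TIMESTAMP"], "%Y-%m-%d %H:%M:%S"),
            # -1 means the source post is negative towards the target;
            # 1 means neutral or positive.
            "is_negative": int(row["POST_LABEL"]) == -1,
            # Comma-separated numbers become a list of floats.
            "properties": [float(x) for x in row["POST_PROPERTIES"].split(",")],
        }


links = list(parse_rows(HEADER + ROW))
print(links[0]["src"], "->", links[0]["dst"], links[0]["is_negative"])
```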

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| timeout_seconds | int, optional | 600 | The number of seconds to wait for the dataset to download. Defaults to 600. |

Returns

| Type | Description |
| --- | --- |
| Graph | A Graph containing the Reddit hyperlinks dataset |

reddit_hyperlink_graph_local

Signature: reddit_hyperlink_graph_local(file_path)

Returns the Reddit hyperlink graph example.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| file_path | str | - | |

Returns

| Type | Description |
| --- | --- |
| Graph | |

stable_coin_graph

Signature: stable_coin_graph(path=None, subset=None)

Returns the stablecoin graph example.

Parameters

| Name | Type | Default | Description |
| --- | --- | --- | --- |
| path | str, optional | None | |
| subset | bool, optional | None | |

Returns

| Type | Description |
| --- | --- |
| Graph | |