torch_geometric.data
Data Objects and Helper Functions
A data object describing a homogeneous graph.
A data object describing a heterogeneous graph, holding multiple node and/or edge types in disjunct storage objects.
A data object describing a batch of graphs as one big (disconnected) graph.
A data object composed by a stream of events describing a temporal graph.
Dataset base class for creating graph datasets.
Dataset base class for creating graph datasets which easily fit into CPU memory.
An abstract base class to access features from a remote feature store.
Defines the attributes of a
An abstract base class to access edges from a remote graph store.
Defines the attributes of a
Recursive directory creation function.
Downloads the content of a URL to a specific folder.
Extracts a tar archive to a specific folder.
Extracts a zip archive to a specific folder.
Extracts a bz2 archive to a specific folder.
Extracts a gz archive to a specific folder.
- class Data(x: Optional[Tensor] = None, edge_index: Optional[Tensor] = None, edge_attr: Optional[Tensor] = None, y: Optional[Tensor] = None, pos: Optional[Tensor] = None, **kwargs)[source]
A data object describing a homogeneous graph. The data object can hold node-level, link-level and graph-level attributes. In general, `Data` tries to mimic the behaviour of a regular Python dictionary. In addition, it provides useful functionality for analyzing graph structures, and provides basic PyTorch tensor functionalities. See the accompanying tutorial.

```python
import torch
from torch_geometric.data import Data

data = Data(x=x, edge_index=edge_index, ...)

# Add additional arguments to `data`:
data.train_idx = torch.tensor([...], dtype=torch.long)
data.test_mask = torch.tensor([...], dtype=torch.bool)

# Analyzing the graph structure:
data.num_nodes
>>> 23

data.is_directed()
>>> False

# PyTorch tensor functionality:
data = data.pin_memory()
data = data.to('cuda:0', non_blocking=True)
```
- Parameters
x (Tensor, optional) – Node feature matrix with shape `[num_nodes, num_node_features]`. (default: `None`)
edge_index (LongTensor, optional) – Graph connectivity in COO format with shape `[2, num_edges]`. (default: `None`)
edge_attr (Tensor, optional) – Edge feature matrix with shape `[num_edges, num_edge_features]`. (default: `None`)
y (Tensor, optional) – Graph-level or node-level ground-truth labels with arbitrary shape. (default: `None`)
pos (Tensor, optional) – Node position matrix with shape `[num_nodes, num_dimensions]`. (default: `None`)
**kwargs (optional) – Additional attributes.
- to_namedtuple() NamedTuple[source]
Returns a `NamedTuple` of stored key/value pairs.
- update(data: Data) Data[source]
Updates the data object with the elements from another data object.
- __cat_dim__(key: str, value: Any, *args, **kwargs) Any[source]
Returns the dimension for which the value `value` of the attribute `key` will get concatenated when creating mini-batches using `torch_geometric.loader.DataLoader`.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
- __inc__(key: str, value: Any, *args, **kwargs) Any[source]
Returns the incremental count to cumulatively increase the value `value` of the attribute `key` when creating mini-batches using `torch_geometric.loader.DataLoader`.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
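To make the roles of `__cat_dim__` and `__inc__` more concrete, here is a minimal, library-independent sketch of mini-batching. It assumes that node features are stacked along the node dimension and that the `edge_index` entries of each graph are shifted by the number of nodes seen so far. The dict-based graphs and the `collate` helper are hypothetical stand-ins for illustration, not PyG code.

```python
from typing import Dict, List

Graph = Dict[str, list]  # hypothetical plain-dict stand-in for a graph object


def collate(graphs: List[Graph]) -> Graph:
    """Merge small graphs into one big disconnected graph (illustrative only)."""
    x: List[List[float]] = []
    src: List[int] = []
    dst: List[int] = []
    batch: List[int] = []
    offset = 0  # running node count: the increment applied to edge indices
    for graph_id, g in enumerate(graphs):
        num_nodes = len(g["x"])
        x.extend(g["x"])  # features are concatenated along the node axis
        src.extend(s + offset for s in g["edge_index"][0])
        dst.extend(d + offset for d in g["edge_index"][1])
        batch.extend([graph_id] * num_nodes)  # graph membership of each node
        offset += num_nodes
    return {"x": x, "edge_index": [src, dst], "batch": batch}


g1 = {"x": [[1.0], [2.0]], "edge_index": [[0, 1], [1, 0]]}
g2 = {"x": [[3.0], [4.0], [5.0]], "edge_index": [[0, 1], [1, 2]]}
merged = collate([g1, g2])
```

Without the offset, the edges of the second graph would point at nodes of the first one, which is the kind of corruption the notes above refer to.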
- is_node_attr(key: str) bool[source]
Returns `True` if the object at key `key` denotes a node-level tensor attribute.
- is_edge_attr(key: str) bool[source]
Returns `True` if the object at key `key` denotes an edge-level tensor attribute.
- subgraph(subset: Tensor) Data[source]
Returns the induced subgraph given by the node indices `subset`.
- Parameters
subset (LongTensor or BoolTensor) – The nodes to keep.
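As a rough illustration of what an induced subgraph is, the sketch below keeps only edges whose endpoints are both in `subset`. It works on a plain list of `(source, destination)` pairs and assumes that the kept nodes are relabeled to consecutive indices; `induced_subgraph` is a hypothetical helper, not the library implementation.

```python
from typing import List, Tuple


def induced_subgraph(
    edges: List[Tuple[int, int]], subset: List[int]
) -> List[Tuple[int, int]]:
    # Map each kept node to its new, compact index.
    new_id = {node: i for i, node in enumerate(subset)}
    # Keep an edge only if both endpoints survive, then relabel it.
    return [(new_id[s], new_id[d]) for s, d in edges if s in new_id and d in new_id]


edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
sub_edges = induced_subgraph(edges, subset=[1, 2, 3])
```

Here the two edges touching node 0 are dropped, and nodes 1, 2, 3 become 0, 1, 2.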
- edge_subgraph(subset: Tensor) Data[source]
Returns the induced subgraph given by the edge indices `subset`. Will currently preserve all the nodes in the graph, even if they are isolated after subgraph computation.
- Parameters
subset (LongTensor or BoolTensor) – The edges to keep.
- to_heterogeneous(node_type: Optional[Tensor] = None, edge_type: Optional[Tensor] = None, node_type_names: Optional[List[str]] = None, edge_type_names: Optional[List[Tuple[str, str, str]]] = None)[source]
Converts a `Data` object to a heterogeneous `HeteroData` object. For this, node and edge attributes are split according to the node-level and edge-level vectors `node_type` and `edge_type`, respectively. `node_type_names` and `edge_type_names` can be used to give meaningful node and edge type names, respectively. That is, the node_type `0` is given by `node_type_names[0]`. If the `Data` object was constructed via `to_homogeneous()`, the object can be reconstructed without any need to pass in additional arguments.
- Parameters
node_type (Tensor, optional) – A node-level vector denoting the type of each node. (default: `None`)
edge_type (Tensor, optional) – An edge-level vector denoting the type of each edge. (default: `None`)
node_type_names (List[str], optional) – The names of node types. (default: `None`)
edge_type_names (List[Tuple[str, str, str]], optional) – The names of edge types. (default: `None`)
- classmethod from_dict(mapping: Dict[str, Any]) Data[source]
Creates a `Data` object from a Python dictionary.
- property num_features: int
Returns the number of features per node in the graph. Alias for `num_node_features`.
- get_all_tensor_attrs() List[TensorAttr][source]
Obtains all feature attributes stored in Data.
- apply(func: Callable, *args: List[str])
Applies the function `func`, either to all attributes or only the ones given in `*args`.
- apply_(func: Callable, *args: List[str])
Applies the in-place function `func`, either to all attributes or only the ones given in `*args`.
- clone(*args: List[str])
Performs cloning of tensors, either for all attributes or only the ones given in `*args`.
- coalesce()
Sorts and removes duplicated entries from edge indices `edge_index`.
- contiguous(*args: List[str])
Ensures a contiguous memory layout, either for all attributes or only the ones given in `*args`.
- coo(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the `GraphStore` in COO format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to `None`, will return the edge indices of all existing edge types. (default: `None`)
store (bool, optional) – Whether to store converted edge indices in the `GraphStore`. (default: `False`)
- cpu(*args: List[str])
Copies attributes to CPU memory, either for all attributes or only the ones given in `*args`.
- csc(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the `GraphStore` in CSC format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to `None`, will return the edge indices of all existing edge types. (default: `None`)
store (bool, optional) – Whether to store converted edge indices in the `GraphStore`. (default: `False`)
- csr(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the `GraphStore` in CSR format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to `None`, will return the edge indices of all existing edge types. (default: `None`)
store (bool, optional) – Whether to store converted edge indices in the `GraphStore`. (default: `False`)
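COO, CSC and CSR are different layouts of the same set of edges. As a rough, library-independent illustration of how a COO pair of index lists relates to CSR, the sketch below builds a row pointer, the column indices sorted by row, and the permutation that was applied. `coo_to_csr` is a hypothetical helper; keeping the permutation is an assumption about what one would typically need in order to reorder edge attributes consistently.

```python
from typing import List, Tuple


def coo_to_csr(
    row: List[int], col: List[int], num_nodes: int
) -> Tuple[List[int], List[int], List[int]]:
    # Permutation that sorts edges by (row, col).
    perm = sorted(range(len(row)), key=lambda e: (row[e], col[e]))
    # Count the edges per row, then turn the counts into a prefix sum.
    rowptr = [0] * (num_nodes + 1)
    for r in row:
        rowptr[r + 1] += 1
    for i in range(num_nodes):
        rowptr[i + 1] += rowptr[i]
    col_sorted = [col[e] for e in perm]
    return rowptr, col_sorted, perm


row = [2, 0, 1, 0]
col = [1, 1, 2, 2]
rowptr, col_sorted, perm = coo_to_csr(row, col, num_nodes=3)
```

The neighbours of node `i` are then `col_sorted[rowptr[i]:rowptr[i + 1]]`. CSC is the same construction with the roles of rows and columns swapped.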
- cuda(device: Optional[Union[int, str]] = None, *args: List[str], non_blocking: bool = False)
Copies attributes to CUDA memory, either for all attributes or only the ones given in `*args`.
- detach(*args: List[str])
Detaches attributes from the computation graph by creating a new tensor, either for all attributes or only the ones given in `*args`.
- detach_(*args: List[str])
Detaches attributes from the computation graph, either for all attributes or only the ones given in `*args`.
- generate_ids()
Generates and sets `n_id` and `e_id` attributes to assign each node and edge to a continuously ascending and unique ID.
- get_edge_index(*args, **kwargs) Tuple[Tensor, Tensor]
Synchronously obtains an `edge_index` tuple from the `GraphStore`.
- get_tensor(*args, **kwargs) Union[Tensor, ndarray]
Synchronously obtains a `tensor` from the `FeatureStore`.
- Parameters
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the `TensorAttr` documentation for required and optional attributes.
- Raises
ValueError – If the input `TensorAttr` is not fully specified.
KeyError – If the tensor corresponding to the input `TensorAttr` was not found.
- get_tensor_size(*args, **kwargs) Optional[Tuple[int, ...]]
Obtains the size of a tensor given its `TensorAttr`, or `None` if the tensor does not exist.
- is_coalesced() bool
Returns `True` if edge indices `edge_index` are sorted and do not contain duplicate entries.
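The relationship between `coalesce()` and `is_coalesced()` can be sketched on a plain list of `(source, destination)` pairs. This is an illustration of the idea only, assuming lexicographic ordering by source and then destination; both helper functions below are hypothetical and do not touch any PyG object.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def coalesce(edges: List[Edge]) -> List[Edge]:
    # Drop duplicates, then sort lexicographically by (source, destination).
    return sorted(set(edges))


def is_coalesced(edges: List[Edge]) -> bool:
    # Strictly increasing order implies both "sorted" and "no duplicates".
    return all(a < b for a, b in zip(edges, edges[1:]))


raw_edges = [(1, 0), (0, 1), (1, 0), (0, 2)]
clean_edges = coalesce(raw_edges)
```

Coalescing an already coalesced edge list leaves it unchanged.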
- property is_cuda: bool
Returns `True` if any `torch.Tensor` attribute is stored on the GPU, `False` otherwise.
- multi_get_tensor(attrs: List[TensorAttr]) List[Union[Tensor, ndarray]]
Synchronously obtains a list of tensors from the `FeatureStore` for each tensor associated with the attributes in `attrs`.
Note
The default implementation simply iterates over all calls to `get_tensor()`. Implementor classes that can provide additional, more performant functionality are recommended to override this method.
- Parameters
attrs (List[TensorAttr]) – A list of input `TensorAttr` objects that identify the tensors to obtain.
- Raises
ValueError – If any input `TensorAttr` is not fully specified.
KeyError – If any of the tensors corresponding to the input `TensorAttr` was not found.
- property num_edges: int
Returns the number of edges in the graph. For undirected graphs, this will return the number of bi-directional edges, which is double the amount of unique edges.
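The factor of two for undirected graphs follows from storing each undirected edge once per direction. A small sketch on plain Python pairs, where `to_undirected` is a hypothetical helper written for this illustration:

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def to_undirected(edges: List[Edge]) -> List[Edge]:
    # Store every edge in both directions, without duplicates.
    return sorted(set(edges) | {(d, s) for s, d in edges})


unique_edges = [(0, 1), (1, 2)]
stored_edges = to_undirected(unique_edges)
```

Two unique undirected edges are stored as four directed entries, which is the count an edge-counting property would report.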
- property num_nodes: Optional[int]
Returns the number of nodes in the graph.
Note
The number of nodes in the data object is automatically inferred in case node-level attributes are present, e.g., `data.x`. In some cases, however, a graph may only be given without any node-level attributes. PyG then guesses the number of nodes according to `edge_index.max().item() + 1`. However, if isolated nodes exist, this number may be incorrect, which can result in unexpected behaviour. Thus, we recommend setting the number of nodes in your data object explicitly via `data.num_nodes = ...`. Otherwise, you will be given a warning that requests you to do so.
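The pitfall described in this note can be reproduced without any tensors. The sketch below applies the "largest index plus one" guess to plain Python lists; `infer_num_nodes` is a hypothetical helper that mirrors the formula quoted above.

```python
from typing import List


def infer_num_nodes(edge_index: List[List[int]]) -> int:
    src, dst = edge_index
    if not src:
        return 0  # no edges: nothing to infer from
    return max(max(src), max(dst)) + 1


# A graph with 5 nodes in which node 4 is isolated (it appears in no edge).
true_num_nodes = 5
edge_index = [[0, 1, 2], [1, 2, 3]]
guessed_num_nodes = infer_num_nodes(edge_index)
```

The guess is 4 rather than 5, because the isolated node never shows up in the edge list. Setting the node count explicitly avoids this.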
- pin_memory(*args: List[str])
Copies attributes to pinned memory, either for all attributes or only the ones given in `*args`.
- put_edge_index(edge_index: Tuple[Tensor, Tensor], *args, **kwargs) bool
Synchronously adds an `edge_index` tuple to the `GraphStore`. Returns whether insertion was successful.
- Parameters
edge_index (Tuple[torch.Tensor, torch.Tensor]) – The `edge_index` tuple in a format specified in `EdgeAttr`.
**kwargs (EdgeAttr) – Any relevant edge attributes that correspond to the `edge_index` tuple. See the `EdgeAttr` documentation for required and optional attributes.
- put_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) bool
Synchronously adds a `tensor` to the `FeatureStore`. Returns whether insertion was successful.
- Parameters
tensor (torch.Tensor or np.ndarray) – The feature tensor to be added.
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the `TensorAttr` documentation for required and optional attributes.
- Raises
ValueError – If the input `TensorAttr` is not fully specified.
- record_stream(stream: Stream, *args: List[str])
Ensures that the tensor memory is not reused for another tensor until all current work queued on `stream` has been completed, either for all attributes or only the ones given in `*args`.
- remove_edge_index(*args, **kwargs) bool
Synchronously deletes an `edge_index` tuple from the `GraphStore`. Returns whether deletion was successful.
- remove_tensor(*args, **kwargs) bool
Removes a tensor from the `FeatureStore`. Returns whether deletion was successful.
- Parameters
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the `TensorAttr` documentation for required and optional attributes.
- Raises
ValueError – If the input `TensorAttr` is not fully specified.
- requires_grad_(*args: List[str], requires_grad: bool = True)
Tracks gradient computation, either for all attributes or only the ones given in `*args`.
- share_memory_(*args: List[str])
Moves attributes to shared memory, either for all attributes or only the ones given in `*args`.
- size(dim: Optional[int] = None) Optional[Union[Tuple[Optional[int], Optional[int]], int]]
Returns the size of the adjacency matrix induced by the graph.
- to(device: Union[int, str], *args: List[str], non_blocking: bool = False)
Performs tensor device conversion, either for all attributes or only the ones given in `*args`.
- update_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) bool
Updates a `tensor` in the `FeatureStore` with a new value. Returns whether the update was successful.
Note
Implementor classes can choose to define more efficient update methods; the default performs a removal and insertion.
- Parameters
tensor (torch.Tensor or np.ndarray) – The feature tensor to be updated.
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the `TensorAttr` documentation for required and optional attributes.
- view(*args, **kwargs) AttrView
Returns a view of the `FeatureStore` given a not yet fully-specified `TensorAttr`.
- get_all_edge_attrs() List[EdgeAttr][source]
Obtains all edge attributes stored in the `GraphStore`.
- class HeteroData(_mapping: Optional[Dict[str, Any]] = None, **kwargs)[source]
A data object describing a heterogeneous graph, holding multiple node and/or edge types in disjunct storage objects. Storage objects can hold either node-level, link-level or graph-level attributes. In general, `HeteroData` tries to mimic the behaviour of a regular nested Python dictionary. In addition, it provides useful functionality for analyzing graph structures, and provides basic PyTorch tensor functionalities.

```python
import torch
from torch_geometric.data import HeteroData

data = HeteroData()

# Create two node types "paper" and "author" holding a feature matrix:
data['paper'].x = torch.randn(num_papers, num_paper_features)
data['author'].x = torch.randn(num_authors, num_authors_features)

# Create an edge type "(author, writes, paper)" and build the
# graph connectivity:
data['author', 'writes', 'paper'].edge_index = ...  # [2, num_edges]

data['paper'].num_nodes
>>> 23

data['author', 'writes', 'paper'].num_edges
>>> 52

# PyTorch tensor functionality:
data = data.pin_memory()
data = data.to('cuda:0', non_blocking=True)
```
Note that there exist multiple ways to create heterogeneous graph data, e.g.:

To initialize a node of type `"paper"` holding a node feature matrix `x_paper` named `x`:

```python
from torch_geometric.data import HeteroData

data = HeteroData()
data['paper'].x = x_paper

data = HeteroData(paper={ 'x': x_paper })

data = HeteroData({'paper': { 'x': x_paper }})
```

To initialize an edge from source node type `"author"` to destination node type `"paper"` with relation type `"writes"` holding a graph connectivity matrix `edge_index_author_paper` named `edge_index`:

```python
data = HeteroData()
data['author', 'writes', 'paper'].edge_index = edge_index_author_paper

data = HeteroData(author__writes__paper={ 'edge_index': edge_index_author_paper })

data = HeteroData({
    ('author', 'writes', 'paper'): { 'edge_index': edge_index_author_paper }
})
```
- node_items() List[Tuple[str, NodeStorage]][source]
Returns a list of node type and node storage pairs.
- edge_items() List[Tuple[Tuple[str, str, str], EdgeStorage]][source]
Returns a list of edge type and edge storage pairs.
- to_namedtuple() NamedTuple[source]
Returns a `NamedTuple` of stored key/value pairs.
- update(data: HeteroData) HeteroData[source]
Updates the data object with the elements from another data object.
- __cat_dim__(key: str, value: Any, store: Optional[Union[NodeStorage, EdgeStorage]] = None, *args, **kwargs) Any[source]
Returns the dimension for which the value `value` of the attribute `key` will get concatenated when creating mini-batches using `torch_geometric.loader.DataLoader`.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
- __inc__(key: str, value: Any, store: Optional[Union[NodeStorage, EdgeStorage]] = None, *args, **kwargs) Any[source]
Returns the incremental count to cumulatively increase the value `value` of the attribute `key` when creating mini-batches using `torch_geometric.loader.DataLoader`.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
- property num_node_features: Dict[str, int]
Returns the number of features per node type in the graph.
- property num_features: Dict[str, int]
Returns the number of features per node type in the graph. Alias for `num_node_features`.
- property num_edge_features: Dict[Tuple[str, str, str], int]
Returns the number of features per edge type in the graph.
- metadata() Tuple[List[str], List[Tuple[str, str, str]]][source]
Returns the heterogeneous meta-data, i.e. its node and edge types.
```python
data = HeteroData()
data['paper'].x = ...
data['author'].x = ...
data['author', 'writes', 'paper'].edge_index = ...
print(data.metadata())
>>> (['paper', 'author'], [('author', 'writes', 'paper')])
```
- collect(key: str) Dict[Union[str, Tuple[str, str, str]], Any][source]
Collects the attribute `key` from all node and edge types.

```python
data = HeteroData()
data['paper'].x = ...
data['author'].x = ...
print(data.collect('x'))
>>> { 'paper': ..., 'author': ...}
```
Note
This is equivalent to writing `data.x_dict`.
- get_node_store(key: str) NodeStorage[source]
Gets the `NodeStorage` object of a particular node type `key`. If the storage is not present yet, will create a new `torch_geometric.data.storage.NodeStorage` object for the given node type.

```python
data = HeteroData()
node_storage = data.get_node_store('paper')
```
- get_edge_store(src: str, rel: str, dst: str) EdgeStorage[source]
Gets the `EdgeStorage` object of a particular edge type given by the tuple `(src, rel, dst)`. If the storage is not present yet, will create a new `torch_geometric.data.storage.EdgeStorage` object for the given edge type.

```python
data = HeteroData()
edge_storage = data.get_edge_store('author', 'writes', 'paper')
```
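Edge types are addressed by a `(src, rel, dst)` triplet, but the examples in this section also use a keyword form (`author__writes__paper`) and a two-element key (`'author', 'paper'`, which is printed as `(author, to, paper)`). The sketch below shows one way such keys could be normalized to a triplet. `canonical_edge_type` is a hypothetical helper, and the default relation name `"to"` is taken from the printed output above rather than from the library source.

```python
from typing import Tuple, Union

EdgeKey = Union[str, Tuple[str, ...]]


def canonical_edge_type(key: EdgeKey, default_rel: str = "to") -> Tuple[str, str, str]:
    """Normalize an edge-type key to a (src, rel, dst) triplet (illustrative only)."""
    if isinstance(key, str):
        # Keyword form, e.g. "author__writes__paper".
        key = tuple(key.split("__"))
    if len(key) == 2:
        # Relation omitted: fall back to the default relation name.
        src, dst = key
        return (src, default_rel, dst)
    if len(key) == 3:
        return (key[0], key[1], key[2])
    raise ValueError(f"Cannot interpret {key!r} as an edge type")


from_keyword = canonical_edge_type("author__writes__paper")
from_pair = canonical_edge_type(("author", "paper"))
```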
- rename(name: str, new_name: str) HeteroData[source]
Renames the node type `name` to `new_name` in-place.
- subgraph(subset_dict: Dict[str, Tensor]) HeteroData[source]
Returns the induced subgraph containing the node types and corresponding nodes in `subset_dict`.

```python
data = HeteroData()
data['paper'].x = ...
data['author'].x = ...
data['conference'].x = ...
data['paper', 'cites', 'paper'].edge_index = ...
data['author', 'paper'].edge_index = ...
data['paper', 'conference'].edge_index = ...
print(data)
>>> HeteroData(
    paper={ x=[10, 16] },
    author={ x=[5, 32] },
    conference={ x=[5, 8] },
    (paper, cites, paper)={ edge_index=[2, 50] },
    (author, to, paper)={ edge_index=[2, 30] },
    (paper, to, conference)={ edge_index=[2, 25] }
)

subset_dict = {
    'paper': torch.tensor([3, 4, 5, 6]),
    'author': torch.tensor([0, 2]),
}

print(data.subgraph(subset_dict))
>>> HeteroData(
    paper={ x=[4, 16] },
    author={ x=[2, 32] },
    (paper, cites, paper)={ edge_index=[2, 24] },
    (author, to, paper)={ edge_index=[2, 5] }
)
```
- Parameters
subset_dict (Dict[str, LongTensor or BoolTensor]) – A dictionary holding the nodes to keep for each node type.
- edge_subgraph(subset_dict: Dict[Tuple[str, str, str], Tensor]) HeteroData[source]
Returns the induced subgraph given by the edge indices in `subset_dict` for certain edge types. Will currently preserve all the nodes in the graph, even if they are isolated after subgraph computation.
- node_type_subgraph(node_types: List[str]) HeteroData[source]
Returns the subgraph induced by the given `node_types`, i.e. the returned `HeteroData` object only contains the node types which are included in `node_types`, and only contains the edge types where both end points are included in `node_types`.
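The rule "keep an edge type only if both end points are kept" can be sketched on metadata alone. The helper below is hypothetical and operates on plain lists of type names, using the edge types from the `subgraph` example above.

```python
from typing import List, Tuple

EdgeType = Tuple[str, str, str]


def filter_edge_types(edge_types: List[EdgeType], node_types: List[str]) -> List[EdgeType]:
    keep = set(node_types)
    # An edge type survives only if its source and destination types both survive.
    return [(src, rel, dst) for src, rel, dst in edge_types if src in keep and dst in keep]


edge_types = [
    ("paper", "cites", "paper"),
    ("author", "to", "paper"),
    ("paper", "to", "conference"),
]
kept = filter_edge_types(edge_types, node_types=["paper", "author"])
```

Dropping `"conference"` removes the `("paper", "to", "conference")` edge type, since one of its end points is gone.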
- edge_type_subgraph(edge_types: List[Tuple[str, str, str]]) HeteroData[source]
Returns the subgraph induced by the given `edge_types`, i.e. the returned `HeteroData` object only contains the edge types which are included in `edge_types`, and only contains the node types of the end points which are included in `edge_types`.
- to_homogeneous(node_attrs: Optional[List[str]] = None, edge_attrs: Optional[List[str]] = None, add_node_type: bool = True, add_edge_type: bool = True, dummy_values: bool = True) Data[source]
Converts a `HeteroData` object to a homogeneous `Data` object. By default, all features with same feature dimensionality across different types will be merged into a single representation, unless otherwise specified via the `node_attrs` and `edge_attrs` arguments. Furthermore, attributes named `node_type` and `edge_type` will be added to the returned `Data` object, denoting node-level and edge-level vectors holding the node and edge type as integers, respectively.
- Parameters
node_attrs (List[str], optional) – The node features to combine across all node types. These node features need to be of the same feature dimensionality. If set to `None`, will automatically determine which node features to combine. (default: `None`)
edge_attrs (List[str], optional) – The edge features to combine across all edge types. These edge features need to be of the same feature dimensionality. If set to `None`, will automatically determine which edge features to combine. (default: `None`)
add_node_type (bool, optional) – If set to `False`, will not add the node-level vector `node_type` to the returned `Data` object. (default: `True`)
add_edge_type (bool, optional) – If set to `False`, will not add the edge-level vector `edge_type` to the returned `Data` object. (default: `True`)
dummy_values (bool, optional) – If set to `True`, will fill attributes of remaining types with dummy values. Dummy values are `NaN` for floating point attributes, and `-1` for integers. (default: `True`)
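As a simplified picture of the merge, the sketch below concatenates per-type node feature lists into one list and records an integer `node_type` for every node. It uses plain Python lists, ignores edges and dummy values, and assumes types are numbered in insertion order; `merge_node_types` is a hypothetical helper, not the library routine.

```python
from typing import Dict, List, Tuple


def merge_node_types(
    x_dict: Dict[str, List[List[float]]]
) -> Tuple[List[List[float]], List[int], Dict[str, int]]:
    x: List[List[float]] = []
    node_type: List[int] = []
    offset: Dict[str, int] = {}
    for type_id, (name, features) in enumerate(x_dict.items()):
        offset[name] = len(x)  # global index of the first node of this type
        x.extend(features)
        node_type.extend([type_id] * len(features))
    return x, node_type, offset


x_dict = {
    "paper": [[0.1, 0.2], [0.3, 0.4]],
    "author": [[1.0, 1.1], [1.2, 1.3], [1.4, 1.5]],
}
x, node_type, offset = merge_node_types(x_dict)
```

The `node_type` vector is what makes the conversion reversible: grouping rows of `x` by their type id recovers the per-type feature lists.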
- get_all_tensor_attrs() List[TensorAttr][source]
Obtains all tensor attributes stored in this `FeatureStore`.
- get_all_edge_attrs() List[EdgeAttr][source]
Obtains all edge attributes stored in the `GraphStore`.
- apply(func: Callable, *args: List[str])
Applies the function `func`, either to all attributes or only the ones given in `*args`.
- apply_(func: Callable, *args: List[str])
Applies the in-place function `func`, either to all attributes or only the ones given in `*args`.
- clone(*args: List[str])
Performs cloning of tensors, either for all attributes or only the ones given in `*args`.
- coalesce()
Sorts and removes duplicated entries from edge indices `edge_index`.
- contiguous(*args: List[str])
Ensures a contiguous memory layout, either for all attributes or only the ones given in `*args`.
- coo(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the `GraphStore` in COO format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to `None`, will return the edge indices of all existing edge types. (default: `None`)
store (bool, optional) – Whether to store converted edge indices in the `GraphStore`. (default: `False`)
- cpu(*args: List[str])
Copies attributes to CPU memory, either for all attributes or only the ones given in `*args`.
- csc(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the `GraphStore` in CSC format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to `None`, will return the edge indices of all existing edge types. (default: `None`)
store (bool, optional) – Whether to store converted edge indices in the `GraphStore`. (default: `False`)
- csr(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]]
Obtains the edge indices in the `GraphStore` in CSR format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to `None`, will return the edge indices of all existing edge types. (default: `None`)
store (bool, optional) – Whether to store converted edge indices in the `GraphStore`. (default: `False`)
- cuda(device: Optional[Union[int, str]] = None, *args: List[str], non_blocking: bool = False)
Copies attributes to CUDA memory, either for all attributes or only the ones given in `*args`.
- detach(*args: List[str])
Detaches attributes from the computation graph by creating a new tensor, either for all attributes or only the ones given in `*args`.
- detach_(*args: List[str])
Detaches attributes from the computation graph, either for all attributes or only the ones given in `*args`.
- generate_ids()
Generates and sets `n_id` and `e_id` attributes to assign each node and edge to a continuously ascending and unique ID.
- get_edge_index(*args, **kwargs) Tuple[Tensor, Tensor]
Synchronously obtains an `edge_index` tuple from the `GraphStore`.
- get_tensor(*args, **kwargs) Union[Tensor, ndarray]
Synchronously obtains a `tensor` from the `FeatureStore`.
- Parameters
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the `TensorAttr` documentation for required and optional attributes.
- Raises
ValueError – If the input `TensorAttr` is not fully specified.
KeyError – If the tensor corresponding to the input `TensorAttr` was not found.
- get_tensor_size(*args, **kwargs) Optional[Tuple[int, ...]]
Obtains the size of a tensor given its `TensorAttr`, or `None` if the tensor does not exist.
- is_coalesced() bool
Returns `True` if edge indices `edge_index` are sorted and do not contain duplicate entries.
- property is_cuda: bool
Returns `True` if any `torch.Tensor` attribute is stored on the GPU, `False` otherwise.
- multi_get_tensor(attrs: List[TensorAttr]) List[Union[Tensor, ndarray]]
Synchronously obtains a list of tensors from the `FeatureStore` for each tensor associated with the attributes in `attrs`.
Note
The default implementation simply iterates over all calls to `get_tensor()`. Implementor classes that can provide additional, more performant functionality are recommended to override this method.
- Parameters
attrs (List[TensorAttr]) – A list of input `TensorAttr` objects that identify the tensors to obtain.
- Raises
ValueError – If any input `TensorAttr` is not fully specified.
KeyError – If any of the tensors corresponding to the input `TensorAttr` was not found.
- property num_edges: int
Returns the number of edges in the graph. For undirected graphs, this will return the number of bi-directional edges, which is double the amount of unique edges.
- pin_memory(*args: List[str])
Copies attributes to pinned memory, either for all attributes or only the ones given in `*args`.
- put_edge_index(edge_index: Tuple[Tensor, Tensor], *args, **kwargs) bool
Synchronously adds an `edge_index` tuple to the `GraphStore`. Returns whether insertion was successful.
- Parameters
edge_index (Tuple[torch.Tensor, torch.Tensor]) – The `edge_index` tuple in a format specified in `EdgeAttr`.
**kwargs (EdgeAttr) – Any relevant edge attributes that correspond to the `edge_index` tuple. See the `EdgeAttr` documentation for required and optional attributes.
- put_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) bool
Synchronously adds a `tensor` to the `FeatureStore`. Returns whether insertion was successful.
- Parameters
tensor (torch.Tensor or np.ndarray) – The feature tensor to be added.
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the `TensorAttr` documentation for required and optional attributes.
- Raises
ValueError – If the input `TensorAttr` is not fully specified.
- record_stream(stream: Stream, *args: List[str])
Ensures that the tensor memory is not reused for another tensor until all current work queued on `stream` has been completed, either for all attributes or only the ones given in `*args`.
- remove_edge_index(*args, **kwargs) bool
Synchronously deletes an `edge_index` tuple from the `GraphStore`. Returns whether deletion was successful.
- remove_tensor(*args, **kwargs) bool
Removes a tensor from the `FeatureStore`. Returns whether deletion was successful.
- Parameters
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the `TensorAttr` documentation for required and optional attributes.
- Raises
ValueError – If the input `TensorAttr` is not fully specified.
- requires_grad_(*args: List[str], requires_grad: bool = True)
Tracks gradient computation, either for all attributes or only the ones given in `*args`.
- share_memory_(*args: List[str])
Moves attributes to shared memory, either for all attributes or only the ones given in `*args`.
- size(dim: Optional[int] = None) Optional[Union[Tuple[Optional[int], Optional[int]], int]]
Returns the size of the adjacency matrix induced by the graph.
- to(device: Union[int, str], *args: List[str], non_blocking: bool = False)
Performs tensor device conversion, either for all attributes or only the ones given in `*args`.
- update_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) bool
Updates a `tensor` in the `FeatureStore` with a new value. Returns whether the update was successful.
Note
Implementor classes can choose to define more efficient update methods; the default performs a removal and insertion.
- Parameters
tensor (torch.Tensor or np.ndarray) – The feature tensor to be updated.
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the `TensorAttr` documentation for required and optional attributes.
- view(*args, **kwargs) AttrView
Returns a view of the `FeatureStore` given a not yet fully-specified `TensorAttr`.
- class TemporalData(src: Optional[Tensor] = None, dst: Optional[Tensor] = None, t: Optional[Tensor] = None, msg: Optional[Tensor] = None, **kwargs)[source]
A data object composed by a stream of events describing a temporal graph. The `TemporalData` object can hold a list of events (that can be understood as temporal edges in a graph) with structured messages. An event is composed by a source node, a destination node, a timestamp and a message. Any Continuous-Time Dynamic Graph (CTDG) can be represented with these four values.

In general, `TemporalData` tries to mimic the behaviour of a regular Python dictionary. In addition, it provides useful functionality for analyzing graph structures, and provides basic PyTorch tensor functionalities.

```python
from torch import Tensor
from torch_geometric.data import TemporalData

events = TemporalData(
    src=Tensor([1,2,3,4]),
    dst=Tensor([2,3,4,5]),
    t=Tensor([1000,1010,1100,2000]),
    msg=Tensor([1,1,0,0])
)

# Add additional arguments to `events`:
events.y = Tensor([1,1,0,0])

# It is also possible to set additional arguments in the constructor
events = TemporalData(
    ...,
    y=Tensor([1,1,0,0])
)

# Get the number of events:
events.num_events
>>> 4

# Analyzing the graph structure:
events.num_nodes
>>> 5

# PyTorch tensor functionality:
events = events.pin_memory()
events = events.to('cuda:0', non_blocking=True)
```
- Parameters
src (Tensor, optional) – A list of source nodes for the events with shape `[num_events]`. (default: `None`)
dst (Tensor, optional) – A list of destination nodes for the events with shape `[num_events]`. (default: `None`)
t (Tensor, optional) – The timestamps for each event with shape `[num_events]`. (default: `None`)
msg (Tensor, optional) – Messages feature matrix with shape `[num_events, num_msg_features]`. (default: `None`)
**kwargs (optional) – Additional attributes.
Note
The shape of `src`, `dst`, `t` and the first dimension of `msg` should be the same (`num_events`).
- to_namedtuple() NamedTuple[source]
Returns a `NamedTuple` of stored key/value pairs.
- property num_events: int
Returns the number of events loaded.
Note
In a TemporalData object, each row denotes an event, so events can also be understood as edges.
- property num_edges: int
Alias for num_events.
- size(dim: Optional[int] = None) Optional[Union[Tuple[Optional[int], Optional[int]], int]][source]
Returns the size of the adjacency matrix induced by the graph.
- __cat_dim__(key: str, value: Any, *args, **kwargs) Any[source]
Returns the dimension along which the value (value) of the attribute key is concatenated when creating mini-batches using torch_geometric.loader.DataLoader.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
- __inc__(key: str, value: Any, *args, **kwargs) Any[source]
Returns the incremental count by which the value (value) of the attribute key is cumulatively increased when creating mini-batches using torch_geometric.loader.DataLoader.
Note
This method is for internal use only, and should only be overridden in case the mini-batch creation process is corrupted for a specific attribute.
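As a rough illustration of what a cumulative increment means during mini-batching, the sketch below batches two small graphs by hand in plain Python. It does not call PyG: collate_edge_lists is a hypothetical helper, and it assumes the common convention that node indices of each graph are shifted by the number of nodes in the graphs that precede it.

```python
from typing import List, Tuple

Edge = Tuple[int, int]


def collate_edge_lists(graphs: List[Tuple[int, List[Edge]]]) -> List[Edge]:
    """Concatenate per-graph edge lists, shifting node indices by an offset.

    Each entry of `graphs` is (num_nodes, edges). The running offset plays the
    role of the cumulative increment: it grows by `num_nodes` per graph.
    """
    batched: List[Edge] = []
    offset = 0
    for num_nodes, edges in graphs:
        batched.extend((src + offset, dst + offset) for src, dst in edges)
        offset += num_nodes
    return batched


graph_a = (3, [(0, 1), (1, 2)])  # 3 nodes, 2 edges
graph_b = (2, [(0, 1)])          # 2 nodes, 1 edge
batched_edges = collate_edge_lists([graph_a, graph_b])
print(batched_edges)  # [(0, 1), (1, 2), (3, 4)]
```

Attributes that do not hold node indices (for example timestamps or labels) would use an increment of zero in this picture and simply be concatenated.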
- train_val_test_split(val_ratio: float = 0.15, test_ratio: float = 0.15)[source]
Splits the data in training, validation and test sets according to time.
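The idea of a time-based split can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the library's implementation: chronological_split is a hypothetical helper that sorts events by timestamp and cuts by event counts, whereas the actual method may derive its cut points differently (for instance from quantiles of the timestamps).

```python
from typing import List, Sequence, Tuple


def chronological_split(
    timestamps: Sequence[float],
    val_ratio: float = 0.15,
    test_ratio: float = 0.15,
) -> Tuple[List[int], List[int], List[int]]:
    """Return event indices for train/val/test, ordered by time."""
    order = sorted(range(len(timestamps)), key=lambda i: timestamps[i])
    num_val = int(len(order) * val_ratio)
    num_test = int(len(order) * test_ratio)
    num_train = len(order) - num_val - num_test
    train = order[:num_train]
    val = order[num_train:num_train + num_val]
    test = order[num_train + num_val:]
    return train, val, test


t = [1000, 1010, 1100, 2000, 2100, 2500, 3000, 3100]
train_idx, val_idx, test_idx = chronological_split(t, val_ratio=0.25, test_ratio=0.25)
print(train_idx, val_idx, test_idx)  # [0, 1, 2, 3] [4, 5] [6, 7]
```

Because the earliest events go to training and the latest to testing, no future event leaks into an earlier split.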
- apply(func: Callable, *args: List[str])
Applies the function func, either to all attributes or only the ones given in *args.
- apply_(func: Callable, *args: List[str])
Applies the in-place function func, either to all attributes or only the ones given in *args.
- clone(*args: List[str])
Performs cloning of tensors, either for all attributes or only the ones given in *args.
- contiguous(*args: List[str])
Ensures a contiguous memory layout, either for all attributes or only the ones given in *args.
- cpu(*args: List[str])
Copies attributes to CPU memory, either for all attributes or only the ones given in *args.
- cuda(device: Optional[Union[int, str]] = None, *args: List[str], non_blocking: bool = False)
Copies attributes to CUDA memory, either for all attributes or only the ones given in *args.
- detach(*args: List[str])
Detaches attributes from the computation graph by creating a new tensor, either for all attributes or only the ones given in *args.
- detach_(*args: List[str])
Detaches attributes from the computation graph, either for all attributes or only the ones given in *args.
- generate_ids()
Generates and sets n_id and e_id attributes to assign each node and edge to a continuously ascending and unique ID.
- is_coalesced() bool
Returns True if edge indices edge_index are sorted and do not contain duplicate entries.
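The coalesced condition can be illustrated with a small stand-alone check. The function below is a hypothetical pure-Python sketch, not the library's code, and it assumes that "sorted" means lexicographic order on (row, col) pairs.

```python
from typing import List, Tuple


def is_coalesced(edges: List[Tuple[int, int]]) -> bool:
    """True if edges are strictly increasing in (row, col) order.

    Strictly increasing means both sorted and free of duplicates.
    """
    return all(a < b for a, b in zip(edges, edges[1:]))


print(is_coalesced([(0, 1), (0, 2), (1, 0)]))  # True
print(is_coalesced([(0, 2), (0, 1)]))          # False (unsorted)
print(is_coalesced([(0, 1), (0, 1)]))          # False (duplicate)
```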
- property is_cuda: bool
Returns True if any torch.Tensor attribute is stored on the GPU, False otherwise.
- pin_memory(*args: List[str])
Copies attributes to pinned memory, either for all attributes or only the ones given in *args.
- record_stream(stream: Stream, *args: List[str])
Ensures that the tensor memory is not reused for another tensor until all current work queued on stream has been completed, either for all attributes or only the ones given in *args.
- requires_grad_(*args: List[str], requires_grad: bool = True)
Tracks gradient computation, either for all attributes or only the ones given in *args.
- share_memory_(*args: List[str])
Moves attributes to shared memory, either for all attributes or only the ones given in *args.
- to(device: Union[int, str], *args: List[str], non_blocking: bool = False)
Performs tensor device conversion, either for all attributes or only the ones given in *args.
- update(data: BaseData) BaseData
Updates the data object with the elements from another data object.
- class Batch(*args, **kwargs)[source]
A data object describing a batch of graphs as one big (disconnected) graph. Inherits from torch_geometric.data.Data or torch_geometric.data.HeteroData. In addition, single graphs can be identified via the assignment vector batch, which maps each node to its respective graph identifier.
- classmethod from_data_list(data_list: List[BaseData], follow_batch: Optional[List[str]] = None, exclude_keys: Optional[List[str]] = None)[source]
Constructs a Batch object from a Python list of Data or HeteroData objects. The assignment vector batch is created on the fly. In addition, creates assignment vectors for each key in follow_batch. Will exclude any keys given in exclude_keys.
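To make the assignment vector concrete, here is a minimal plain-Python sketch of how such a vector could be derived from the node counts of the individual graphs. build_batch_vector is a hypothetical helper used only for illustration; the library builds the vector as a tensor.

```python
from typing import List


def build_batch_vector(num_nodes_per_graph: List[int]) -> List[int]:
    """Map every node in the merged graph to the index of its source graph."""
    batch: List[int] = []
    for graph_id, num_nodes in enumerate(num_nodes_per_graph):
        batch.extend([graph_id] * num_nodes)
    return batch


batch = build_batch_vector([3, 2, 1])
print(batch)  # [0, 0, 0, 1, 1, 2]
```

With this vector, a per-graph readout (for example the mean of node features) reduces to grouping node rows by their entry in batch.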
- get_example(idx: int) BaseData[source]
Gets the Data or HeteroData object at index idx. The Batch object must have been created via from_data_list() in order to be able to reconstruct the initial object.
- index_select(idx: Union[slice, Tensor, ndarray, Sequence]) List[BaseData][source]
Creates a subset of Data or HeteroData objects from specified indices idx. Indices idx can be a slicing object, e.g., [2:5], a list, a tuple, or a torch.Tensor or np.ndarray of type long or bool. The Batch object must have been created via from_data_list() in order to be able to reconstruct the initial objects.
- to_data_list() List[BaseData][source]
Reconstructs the list of Data or HeteroData objects from the Batch object. The Batch object must have been created via from_data_list() in order to be able to reconstruct the initial objects.
- class Dataset(root: Optional[str] = None, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, log: bool = True)[source]
Dataset base class for creating graph datasets. See here for the accompanying tutorial.
- Parameters
root (string, optional) – Root directory where the dataset should be saved. (default: None)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
log (bool, optional) – Whether to print any console output while downloading and processing the dataset. (default: True)
- property raw_file_names: Union[str, List[str], Tuple]
The names of the files in the self.raw_dir folder that must be present in order to skip downloading.
- property processed_file_names: Union[str, List[str], Tuple]
The names of the files in the self.processed_dir folder that must be present in order to skip processing.
- property num_features: int
Returns the number of features per node in the dataset. Alias for num_node_features.
- property raw_paths: List[str]
The absolute filepaths that must be present in order to skip downloading.
- property processed_paths: List[str]
The absolute filepaths that must be present in order to skip processing.
- index_select(idx: Union[slice, Tensor, ndarray, Sequence]) Dataset[source]
Creates a subset of the dataset from specified indices idx. Indices idx can be a slicing object, e.g., [2:5], a list, a tuple, or a torch.Tensor or np.ndarray of type long or bool.
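The accepted index forms can be thought of as being normalized to a plain list of integer positions. The following sketch shows one way to do that with Python lists standing in for tensors and arrays; normalize_indices is a hypothetical helper and not part of the library.

```python
from typing import List, Sequence, Union

Index = Union[slice, Sequence[int], Sequence[bool]]


def normalize_indices(idx: Index, length: int) -> List[int]:
    """Turn a slice, an integer sequence, or a boolean mask into positions."""
    if isinstance(idx, slice):
        return list(range(*idx.indices(length)))
    items = list(idx)
    if items and all(isinstance(i, bool) for i in items):
        # Boolean mask: keep the positions whose entry is True.
        if len(items) != length:
            raise IndexError("boolean mask must match the dataset length")
        return [i for i, keep in enumerate(items) if keep]
    # Integer positions: resolve negative indices relative to the end.
    return [i if i >= 0 else i + length for i in items]


print(normalize_indices(slice(2, 5), 6))                   # [2, 3, 4]
print(normalize_indices([True, False, True, False], 4))    # [0, 2]
print(normalize_indices((0, -1), 4))                       # [0, 3]
```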
- shuffle(return_perm: bool = False) Union[Dataset, Tuple[Dataset, Tensor]][source]
Randomly shuffles the examples in the dataset.
- to_datapipe()[source]
Converts the dataset into a torch.utils.data.DataPipe.
The returned instance can then be used with PyG’s built-in DataPipes for batching graphs as follows:
    from torch_geometric.datasets import QM9

    dp = QM9(root='./data/QM9/').to_datapipe()
    dp = dp.batch_graphs(batch_size=2, drop_last=True)

    for batch in dp:
        pass
See the PyTorch tutorial for further background on DataPipes.
- class InMemoryDataset(root: Optional[str] = None, transform: Optional[Callable] = None, pre_transform: Optional[Callable] = None, pre_filter: Optional[Callable] = None, log: bool = True)[source]
Dataset base class for creating graph datasets which easily fit into CPU memory. Inherits from torch_geometric.data.Dataset. See here for the accompanying tutorial.
- Parameters
root (string, optional) – Root directory where the dataset should be saved. (default: None)
transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before every access. (default: None)
pre_transform (callable, optional) – A function/transform that takes in a torch_geometric.data.Data object and returns a transformed version. The data object will be transformed before being saved to disk. (default: None)
pre_filter (callable, optional) – A function that takes in a torch_geometric.data.Data object and returns a boolean value, indicating whether the data object should be included in the final dataset. (default: None)
log (bool, optional) – Whether to print any console output while downloading and processing the dataset. (default: True)
- property raw_file_names: Union[str, List[str], Tuple]
The names of the files in the self.raw_dir folder that must be present in order to skip downloading.
- property processed_file_names: Union[str, List[str], Tuple]
The names of the files in the self.processed_dir folder that must be present in order to skip processing.
- static collate(data_list: List[Data]) Tuple[Data, Optional[Dict[str, Tensor]]][source]
Collates a Python list of torch_geometric.data.Data objects to the internal storage format of InMemoryDataset.
- copy(idx: Optional[Union[slice, Tensor, ndarray, Sequence]] = None) InMemoryDataset[source]
Performs a deep-copy of the dataset. If idx is not given, will clone the full dataset. Otherwise, will only clone a subset of the dataset from indices idx. Indices can be slices, lists, tuples, and a torch.Tensor or np.ndarray of type long or bool.
- class FeatureStore(tensor_attr_cls: Optional[Any] = None)[source]
An abstract base class to access features from a remote feature store.
- Parameters
tensor_attr_cls (TensorAttr, optional) – A user-defined TensorAttr class to customize the required attributes and their ordering to uniquely identify tensor values. (default: None)
- put_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) bool[source]
Synchronously adds a tensor to the FeatureStore. Returns whether insertion was successful.
- Parameters
tensor (torch.Tensor or np.ndarray) – The feature tensor to be added.
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the TensorAttr documentation for required and optional attributes.
- Raises
ValueError – If the input TensorAttr is not fully specified.
- get_tensor(*args, **kwargs) Union[Tensor, ndarray][source]
Synchronously obtains a tensor from the FeatureStore.
- Parameters
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the TensorAttr documentation for required and optional attributes.
- Raises
ValueError – If the input TensorAttr is not fully specified.
KeyError – If the tensor corresponding to the input TensorAttr was not found.
- multi_get_tensor(attrs: List[TensorAttr]) List[Union[Tensor, ndarray]][source]
Synchronously obtains a list of tensors from the FeatureStore for each tensor associated with the attributes in attrs.
Note
The default implementation simply iterates over all calls to get_tensor(). Implementor classes that can provide additional, more performant functionality are recommended to override this method.
- Parameters
attrs (List[TensorAttr]) – A list of input TensorAttr objects that identify the tensors to obtain.
- Raises
ValueError – If any input TensorAttr is not fully specified.
KeyError – If any of the tensors corresponding to the input TensorAttr was not found.
- remove_tensor(*args, **kwargs) bool[source]
Removes a tensor from the FeatureStore. Returns whether deletion was successful.
- Parameters
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the TensorAttr documentation for required and optional attributes.
- Raises
ValueError – If the input TensorAttr is not fully specified.
- update_tensor(tensor: Union[Tensor, ndarray], *args, **kwargs) bool[source]
Updates a tensor in the FeatureStore with a new value. Returns whether the update was successful.
Note
Implementor classes can choose to define more efficient update methods; the default performs a removal and insertion.
- Parameters
tensor (torch.Tensor or np.ndarray) – The feature tensor to be updated.
**kwargs (TensorAttr) – Any relevant tensor attributes that correspond to the feature tensor. See the TensorAttr documentation for required and optional attributes.
- get_tensor_size(*args, **kwargs) Optional[Tuple[int, ...]][source]
Obtains the size of a tensor given its TensorAttr, or None if the tensor does not exist.
- abstract get_all_tensor_attrs() List[TensorAttr][source]
Obtains all tensor attributes stored in this FeatureStore.
- view(*args, **kwargs) AttrView[source]
Returns a view of the FeatureStore given a not yet fully-specified TensorAttr.
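The put/get/remove/update contract described above can be mimicked with a small dictionary-backed store. This is a conceptual sketch only: InMemoryStore and its method names are hypothetical, it does not subclass FeatureStore, and plain lists stand in for tensors. It assumes a key made of group_name and attr_name and treats a missing part as "not fully specified".

```python
from typing import Dict, List, Optional, Sequence, Tuple

Key = Tuple[str, str]  # (group_name, attr_name)


class InMemoryStore:
    """Toy key-value store mirroring the documented error behaviour."""

    def __init__(self) -> None:
        self._data: Dict[Key, List[float]] = {}

    @staticmethod
    def _key(group_name: Optional[str], attr_name: Optional[str]) -> Key:
        if group_name is None or attr_name is None:
            raise ValueError("attribute is not fully specified")
        return (group_name, attr_name)

    def put(self, values: Sequence[float], group_name=None, attr_name=None) -> bool:
        self._data[self._key(group_name, attr_name)] = list(values)
        return True

    def get(self, group_name=None, attr_name=None) -> List[float]:
        # Raises KeyError if nothing is stored under this key.
        return self._data[self._key(group_name, attr_name)]

    def remove(self, group_name=None, attr_name=None) -> bool:
        return self._data.pop(self._key(group_name, attr_name), None) is not None

    def update(self, values: Sequence[float], group_name=None, attr_name=None) -> bool:
        # Default strategy from the note above: removal followed by insertion.
        self.remove(group_name, attr_name)
        return self.put(values, group_name, attr_name)


store = InMemoryStore()
store.put([1.0, 2.0], group_name="paper", attr_name="x")
store.update([3.0], group_name="paper", attr_name="x")
print(store.get(group_name="paper", attr_name="x"))  # [3.0]
```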
- class TensorAttr(group_name: Optional[str] = FieldStatus.UNSET, attr_name: Optional[str] = FieldStatus.UNSET, index: Optional[Union[Tensor, ndarray, slice, int]] = FieldStatus.UNSET)[source]
Defines the attributes of a FeatureStore tensor. It holds all the parameters necessary to uniquely identify a tensor from the FeatureStore.
Note that the order of the attributes is important; this is the order in which attributes must be provided for indexing calls. FeatureStore implementations can define a different ordering by overriding TensorAttr.__init__().
- is_set(key: str) bool[source]
Whether an attribute is set in TensorAttr.
- is_fully_specified() bool[source]
Whether the TensorAttr has no unset fields.
- fully_specify() TensorAttr[source]
Sets all UNSET fields to None.
- update(attr: TensorAttr)[source]
Updates a TensorAttr with set attributes from another TensorAttr.
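The distinction between an unset field and a field explicitly set to None can be sketched with a sentinel object. The Attr class below is a hypothetical stand-in written with the standard library; its UNSET sentinel only imitates the role of FieldStatus.UNSET and the class is not the library's TensorAttr.

```python
from dataclasses import dataclass, fields
from typing import Any

UNSET = object()  # hypothetical sentinel standing in for FieldStatus.UNSET


@dataclass
class Attr:
    group_name: Any = UNSET
    attr_name: Any = UNSET
    index: Any = UNSET

    def is_set(self, key: str) -> bool:
        return getattr(self, key) is not UNSET

    def is_fully_specified(self) -> bool:
        return all(self.is_set(f.name) for f in fields(self))

    def fully_specify(self) -> "Attr":
        # Replace every unset field by None, which counts as "set".
        for f in fields(self):
            if not self.is_set(f.name):
                setattr(self, f.name, None)
        return self

    def update(self, other: "Attr") -> None:
        # Copy over only the fields that are set on `other`.
        for f in fields(other):
            if other.is_set(f.name):
                setattr(self, f.name, getattr(other, f.name))


attr = Attr(group_name="paper")
print(attr.is_fully_specified())   # False
attr.update(Attr(attr_name="x"))
print(attr.is_set("attr_name"))    # True
print(attr.fully_specify().index)  # None
```

Using a dedicated sentinel keeps None available as a meaningful value, for example an index of None meaning "all rows".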
- class GraphStore(edge_attr_cls: Optional[Any] = None)[source]
An abstract base class to access edges from a remote graph store.
- Parameters
edge_attr_cls (EdgeAttr, optional) – A user-defined EdgeAttr class to customize the required attributes and their ordering to uniquely identify edges. (default: None)
- put_edge_index(edge_index: Tuple[Tensor, Tensor], *args, **kwargs) bool[source]
Synchronously adds an edge_index tuple to the GraphStore. Returns whether insertion was successful.
- Parameters
edge_index (Tuple[torch.Tensor, torch.Tensor]) – The edge_index tuple in a format specified in EdgeAttr.
**kwargs (EdgeAttr) – Any relevant edge attributes that correspond to the edge_index tuple. See the EdgeAttr documentation for required and optional attributes.
- get_edge_index(*args, **kwargs) Tuple[Tensor, Tensor][source]
Synchronously obtains an edge_index tuple from the GraphStore.
- remove_edge_index(*args, **kwargs) bool[source]
Synchronously deletes an edge_index tuple from the GraphStore. Returns whether deletion was successful.
- abstract get_all_edge_attrs() List[EdgeAttr][source]
Obtains all edge attributes stored in the GraphStore.
- coo(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]][source]
Obtains the edge indices in the GraphStore in COO format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
- csr(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]][source]
Obtains the edge indices in the GraphStore in CSR format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
- csc(edge_types: Optional[List[Any]] = None, store: bool = False) Tuple[Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Tensor], Dict[Tuple[str, str, str], Optional[Tensor]]][source]
Obtains the edge indices in the GraphStore in CSC format.
- Parameters
edge_types (List[Any], optional) – The edge types of edge indices to obtain. If set to None, will return the edge indices of all existing edge types. (default: None)
store (bool, optional) – Whether to store converted edge indices in the GraphStore. (default: False)
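For readers unfamiliar with the layouts, the relationship between COO and CSR can be sketched in plain Python. coo_to_csr is a hypothetical helper that works on lists for a single edge type; it only illustrates the layouts and does not reproduce the dictionaries returned by the methods above.

```python
from typing import List, Tuple


def coo_to_csr(row: List[int], col: List[int], num_rows: int) -> Tuple[List[int], List[int]]:
    """Convert COO (row, col) pairs into CSR (rowptr, col) arrays."""
    # Sort edges by row (then column) so each row's entries are contiguous.
    order = sorted(range(len(row)), key=lambda e: (row[e], col[e]))
    # Count entries per row, then take the running sum to get row pointers.
    rowptr = [0] * (num_rows + 1)
    for r in row:
        rowptr[r + 1] += 1
    for i in range(num_rows):
        rowptr[i + 1] += rowptr[i]
    return rowptr, [col[e] for e in order]


row = [0, 2, 0, 1]
col = [1, 0, 2, 2]
rowptr, col_sorted = coo_to_csr(row, col, num_rows=3)
print(rowptr)      # [0, 2, 3, 4]
print(col_sorted)  # [1, 2, 2, 0]
```

The neighbors of row i are then col_sorted[rowptr[i]:rowptr[i + 1]]. CSC is the same construction with the roles of rows and columns swapped.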
- class EdgeAttr(edge_type: Tuple[str, str, str], layout: EdgeLayout, is_sorted: bool = False, size: Optional[Tuple[int, int]] = None)[source]
Defines the attributes of a GraphStore edge. It holds all the parameters necessary to uniquely identify an edge from the GraphStore.
Note that the order of the attributes is important; this is the order in which attributes must be provided for indexing calls. GraphStore implementations can define a different ordering by overriding EdgeAttr.__init__().
- download_url(url: str, folder: str, log: bool = True, filename: Optional[str] = None)[source]
Downloads the content of a URL to a specific folder.
- extract_tar(path: str, folder: str, mode: str = 'r:gz', log: bool = True)[source]
Extracts a tar archive to a specific folder.
- extract_zip(path: str, folder: str, log: bool = True)[source]
Extracts a zip archive to a specific folder.
PyTorch Lightning Wrappers
- class LightningDataset(train_dataset: Dataset, val_dataset: Optional[Dataset] = None, test_dataset: Optional[Dataset] = None, batch_size: int = 1, num_workers: int = 0, **kwargs)[source]
Converts a set of Dataset objects into a pytorch_lightning.LightningDataModule variant, which can be automatically used as a datamodule for multi-GPU graph-level training via PyTorch Lightning. LightningDataset will take care of providing mini-batches via DataLoader.
Note
Currently only the pytorch_lightning.strategies.SingleDeviceStrategy and pytorch_lightning.strategies.DDPSpawnStrategy training strategies of PyTorch Lightning are supported in order to correctly share data across all devices/processes:
    import pytorch_lightning as pl

    trainer = pl.Trainer(strategy="ddp_spawn", accelerator="gpu", devices=4)
    trainer.fit(model, datamodule)
- Parameters
train_dataset (Dataset) – The training dataset.
val_dataset (Dataset, optional) – The validation dataset. (default: None)
test_dataset (Dataset, optional) – The test dataset. (default: None)
batch_size (int, optional) – How many samples per batch to load. (default: 1)
num_workers (int, optional) – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
**kwargs (optional) – Additional arguments of torch_geometric.loader.DataLoader.
- class LightningNodeData(data: Union[Data, HeteroData], input_train_nodes: Union[Tensor, None, str, Tuple[str, Optional[Tensor]]] = None, input_train_time: Optional[Tensor] = None, input_val_nodes: Union[Tensor, None, str, Tuple[str, Optional[Tensor]]] = None, input_val_time: Optional[Tensor] = None, input_test_nodes: Union[Tensor, None, str, Tuple[str, Optional[Tensor]]] = None, input_test_time: Optional[Tensor] = None, input_pred_nodes: Union[Tensor, None, str, Tuple[str, Optional[Tensor]]] = None, input_pred_time: Optional[Tensor] = None, loader: str = 'neighbor', node_sampler: Optional[BaseSampler] = None, batch_size: int = 1, num_workers: int = 0, **kwargs)[source]
Converts a Data or HeteroData object into a pytorch_lightning.LightningDataModule variant, which can be automatically used as a datamodule for multi-GPU node-level training via PyTorch Lightning. LightningNodeData will take care of providing mini-batches via NeighborLoader.
Note
Currently only the pytorch_lightning.strategies.SingleDeviceStrategy and pytorch_lightning.strategies.DDPSpawnStrategy training strategies of PyTorch Lightning are supported in order to correctly share data across all devices/processes:
    import pytorch_lightning as pl

    trainer = pl.Trainer(strategy="ddp_spawn", accelerator="gpu", devices=4)
    trainer.fit(model, datamodule)
- Parameters
data (Data or HeteroData) – The Data or HeteroData graph object.
input_train_nodes (torch.Tensor or str or (str, torch.Tensor)) – The indices of training nodes. If not given, will try to automatically infer them from the data object by searching for train_mask, train_idx, or train_index attributes. (default: None)
input_train_time (Tensor, optional) – The timestamp of training nodes. (default: None)
input_val_nodes (torch.Tensor or str or (str, torch.Tensor)) – The indices of validation nodes. If not given, will try to automatically infer them from the data object by searching for val_mask, valid_mask, val_idx, valid_idx, val_index, or valid_index attributes. (default: None)
input_val_time (Tensor, optional) – The timestamp of validation nodes. (default: None)
input_test_nodes (torch.Tensor or str or (str, torch.Tensor)) – The indices of test nodes. If not given, will try to automatically infer them from the data object by searching for test_mask, test_idx, or test_index attributes. (default: None)
input_test_time (Tensor, optional) – The timestamp of test nodes. (default: None)
input_pred_nodes (torch.Tensor or str or (str, torch.Tensor)) – The indices of prediction nodes. If not given, will try to automatically infer them from the data object by searching for pred_mask, pred_idx, or pred_index attributes. (default: None)
input_pred_time (Tensor, optional) – The timestamp of prediction nodes. (default: None)
loader (str) – The scalability technique to use ("full", "neighbor"). (default: "neighbor")
node_sampler (BaseSampler, optional) – A custom sampler object to generate mini-batches. If set, will ignore the loader option. (default: None)
batch_size (int, optional) – How many samples per batch to load. (default: 1)
num_workers (int, optional) – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
**kwargs (optional) – Additional arguments of torch_geometric.loader.NeighborLoader.
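The automatic inference of split nodes described in the parameter list can be pictured as a lookup over candidate attribute names. The sketch below is an assumption-laden illustration: infer_input_nodes is a hypothetical helper, the search order simply follows the order in which the names are listed above, and a SimpleNamespace stands in for the data object.

```python
from types import SimpleNamespace
from typing import Any, Optional, Sequence


def infer_input_nodes(data: Any, prefixes: Sequence[str]) -> Optional[Any]:
    """Return the first matching `<prefix>_mask`, `_idx`, or `_index` attribute."""
    for suffix in ("mask", "idx", "index"):
        for prefix in prefixes:
            name = f"{prefix}_{suffix}"
            if hasattr(data, name):
                return getattr(data, name)
    return None  # nothing found, so this split would be left unset


data = SimpleNamespace(train_idx=[0, 1], val_mask=[False, True, True])
print(infer_input_nodes(data, ["train"]))         # [0, 1]
print(infer_input_nodes(data, ["val", "valid"]))  # [False, True, True]
print(infer_input_nodes(data, ["test"]))          # None
```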
- class LightningLinkData(data: Union[Data, HeteroData, Tuple[FeatureStore, GraphStore]], input_train_edges: Union[Tensor, None, Tuple[str, str, str], Tuple[Tuple[str, str, str], Optional[Tensor]]] = None, input_train_labels: Optional[Tensor] = None, input_train_time: Optional[Tensor] = None, input_val_edges: Union[Tensor, None, Tuple[str, str, str], Tuple[Tuple[str, str, str], Optional[Tensor]]] = None, input_val_labels: Optional[Tensor] = None, input_val_time: Optional[Tensor] = None, input_test_edges: Union[Tensor, None, Tuple[str, str, str], Tuple[Tuple[str, str, str], Optional[Tensor]]] = None, input_test_labels: Optional[Tensor] = None, input_test_time: Optional[Tensor] = None, loader: str = 'neighbor', link_sampler: Optional[BaseSampler] = None, batch_size: int = 1, num_workers: int = 0, **kwargs)[source]
Converts a Data or HeteroData object into a pytorch_lightning.LightningDataModule variant, which can be automatically used as a datamodule for multi-GPU link-level training (such as for link prediction) via PyTorch Lightning. LightningLinkData will take care of providing mini-batches via LinkNeighborLoader.
Note
Currently only the pytorch_lightning.strategies.SingleDeviceStrategy and pytorch_lightning.strategies.DDPSpawnStrategy training strategies of PyTorch Lightning are supported in order to correctly share data across all devices/processes:
    import pytorch_lightning as pl

    trainer = pl.Trainer(strategy="ddp_spawn", accelerator="gpu", devices=4)
    trainer.fit(model, datamodule)
- Parameters
data (Data or HeteroData or Tuple[FeatureStore, GraphStore]) – The Data or HeteroData graph object, or a tuple of a FeatureStore and GraphStore objects.
input_train_edges (Tensor or EdgeType or Tuple[EdgeType, Tensor]) – The training edges. (default: None)
input_train_labels (Tensor, optional) – The labels of train edges. (default: None)
input_train_time (Tensor, optional) – The timestamp of train edges. (default: None)
input_val_edges (Tensor or EdgeType or Tuple[EdgeType, Tensor]) – The validation edges. (default: None)
input_val_labels (Tensor, optional) – The labels of validation edges. (default: None)
input_val_time (Tensor, optional) – The timestamp of validation edges. (default: None)
input_test_edges (Tensor or EdgeType or Tuple[EdgeType, Tensor]) – The test edges. (default: None)
input_test_labels (Tensor, optional) – The labels of test edges. (default: None)
input_test_time (Tensor, optional) – The timestamp of test edges. (default: None)
loader (str) – The scalability technique to use ("full", "neighbor"). (default: "neighbor")
link_sampler (BaseSampler, optional) – A custom sampler object to generate mini-batches. If set, will ignore the loader option. (default: None)
batch_size (int, optional) – How many samples per batch to load. (default: 1)
num_workers (int, optional) – How many subprocesses to use for data loading. 0 means that the data will be loaded in the main process. (default: 0)
**kwargs (optional) – Additional arguments of torch_geometric.loader.LinkNeighborLoader.