mlcolvar.data.DictDataset

class mlcolvar.data.DictDataset(dictionary: dict = None, feature_names=None, metadata: dict = None, data_type: str = 'descriptors', create_ref_idx: bool = False, **kwargs)

Bases: Dataset

Define a torch dataset from a dictionary of lists/arrays/tensors and names.

E.g. { 'data': torch.Tensor([1,2,3,4]), 'labels': [0,0,1,1], 'weights': np.asarray([0.5,1.5,1.5,0.5]) }
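To make the idea concrete, here is a minimal pure-Python sketch of a dictionary-backed dataset. It is not mlcolvar's implementation: `ToyDictDataset` is a hypothetical stand-in that stores plain lists instead of tensors, and it assumes every entry has the same length so that one integer index selects the same sample under each name.

```python
class ToyDictDataset:
    """Hypothetical stand-in (not mlcolvar's class): a dataset backed by a dict of equal-length lists."""

    def __init__(self, dictionary=None, **kwargs):
        # Accept either a dictionary or keyword arguments (or both).
        entries = dict(dictionary or {})
        entries.update(kwargs)
        # Assumption: all entries must describe the same number of samples.
        lengths = {len(values) for values in entries.values()}
        if len(lengths) > 1:
            raise ValueError(f"all entries must have the same length, got {sorted(lengths)}")
        self._entries = {name: list(values) for name, values in entries.items()}
        self._length = lengths.pop() if lengths else 0

    @property
    def keys(self):
        # Names of the stored entries, in insertion order.
        return tuple(self._entries)

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        # A string returns the whole entry; an integer returns one sample as a dict.
        if isinstance(index, str):
            return self._entries[index]
        return {name: values[index] for name, values in self._entries.items()}


toy = ToyDictDataset(
    {"data": [1.0, 2.0, 3.0, 4.0], "labels": [0, 0, 1, 1]},
    weights=[0.5, 1.5, 1.5, 0.5],
)
sample = toy[2]  # one sample, keyed by entry name
```

Returning a dictionary per sample keeps each value tied to its name, so downstream code can ask for `sample["labels"]` rather than relying on tuple positions.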

__init__(dictionary: dict = None, feature_names=None, metadata: dict = None, data_type: str = 'descriptors', create_ref_idx: bool = False, **kwargs)

Create a Dataset from a dictionary or from a list of kwargs.

Parameters:

dictionary (dict) – Dictionary with names and tensors
feature_names (array-like) – List or numpy array with feature names
metadata (dict) – Dictionary with metadata quantities shared across the whole dataset.
data_type (str) – Type of data stored in the dataset, either ‘descriptors’ or ‘graphs’, by default ‘descriptors’. This will be stored in the dataset.metadata dictionary.

Methods

`__init__`([dictionary, feature_names, ...])	Create a Dataset from a dictionary or from a list of kwargs.
`get_graph_inputs`()	Generate an input suitable for graph models.
`get_stats`()	Compute statistics ('mean','Std','Min','Max') of the dataset.

property feature_names

Feature names.

get_graph_inputs()

Generate an input suitable for graph models. Returns the whole dataset as a single, unshuffled batch.

get_stats()

Compute statistics ('mean', 'Std', 'Min', 'Max') of the dataset.

Returns: stats, a dictionary of dictionaries with statistics
Return type: dict
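The "dictionary of dictionaries" shape can be illustrated with a small standard-library sketch. This is not the library's code: `toy_stats` is a hypothetical helper, the lowercase key names are an assumption, and the use of the population standard deviation is also an assumption (the actual method may use a different convention).

```python
import statistics


def toy_stats(dictionary):
    """Hypothetical helper: map each entry name to a dict of summary statistics."""
    stats = {}
    for name, values in dictionary.items():
        stats[name] = {
            "mean": statistics.fmean(values),
            # Assumption: population standard deviation (divides by N, not N - 1).
            "std": statistics.pstdev(values),
            "min": min(values),
            "max": max(values),
        }
    return stats


stats = toy_stats({"data": [1.0, 2.0, 3.0, 4.0], "weights": [0.5, 1.5, 1.5, 0.5]})
```

The outer keys are the entry names and the inner keys are the statistic names, so a single value is read as `stats["data"]["mean"]`.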

Attributes

`feature_names`	Feature names.
`keys`