mlcolvar.data.DictDataset

class mlcolvar.data.DictDataset(dictionary: dict = None, feature_names=None, metadata: dict = None, data_type: str = 'descriptors', create_ref_idx: bool = False, **kwargs)[source]

Bases: Dataset

Define a torch dataset from a dictionary of lists/arrays/tensors and names.

E.g. {'data': torch.Tensor([1,2,3,4]), 'labels': [0,0,1,1], 'weights': np.asarray([0.5,1.5,1.5,0.5])}

__init__(dictionary: dict = None, feature_names=None, metadata: dict = None, data_type: str = 'descriptors', create_ref_idx: bool = False, **kwargs)[source]

Create a Dataset from a dictionary or from a list of kwargs.

Parameters:
  • dictionary (dict) – Dictionary with names and tensors

  • feature_names (array-like) – List or numpy array with feature names

  • metadata (dict) – Dictionary with metadata quantities shared across the whole dataset.

  • data_type (str) – Type of data stored in the dataset, either ‘descriptors’ or ‘graphs’, by default ‘descriptors’. This will be stored in the dataset.metadata dictionary.
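The idea of a dictionary-backed dataset can be illustrated without mlcolvar or torch. The sketch below is a plain-Python stand-in, not the library's implementation: the class name `TinyDictDataset` is hypothetical, and the indexing behaviour shown (a string key returns the whole column, an integer returns one sample as a dictionary) is an assumption made for illustration.

```python
class TinyDictDataset:
    """Hypothetical stand-in for a dictionary-backed dataset (not mlcolvar code)."""

    def __init__(self, dictionary=None, **kwargs):
        # Accept a dictionary, keyword arguments, or both, mirroring the
        # "dictionary or kwargs" construction described above.
        columns = dict(dictionary or {})
        columns.update(kwargs)
        if not columns:
            raise ValueError("at least one named column is required")
        # Every column must describe the same number of samples.
        lengths = {len(values) for values in columns.values()}
        if len(lengths) != 1:
            raise ValueError("all columns must have the same length")
        self._columns = {name: list(values) for name, values in columns.items()}
        self._length = lengths.pop()

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        # Assumed behaviour: a string selects a column, an int selects a sample.
        if isinstance(index, str):
            return self._columns[index]
        return {name: values[index] for name, values in self._columns.items()}

    @property
    def keys(self):
        return tuple(self._columns)


dataset = TinyDictDataset(
    {"data": [1.0, 2.0, 3.0, 4.0], "labels": [0, 0, 1, 1]},
    weights=[0.5, 1.5, 1.5, 0.5],
)
sample = dataset[2]  # one sample, with one entry per column
```

Keeping every column the same length is what lets a single integer index address one consistent sample across data, labels, and weights.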

Methods

__init__([dictionary, feature_names, ...])

Create a Dataset from a dictionary or from a list of kwargs.

get_graph_inputs()

Generate an input suitable for graph models.

get_stats()

Compute statistics ('mean','Std','Min','Max') of the dataset.

property feature_names

Feature names.

get_graph_inputs()[source]

Generate an input suitable for graph models. Returns the whole dataset as a single, unshuffled batch.

get_stats()[source]

Compute statistics ('mean','Std','Min','Max') of the dataset.

Returns:

stats – dictionary of dictionaries with statistics

Return type:

dict
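To make the "dictionary of dictionaries" shape concrete, here is a minimal sketch using only the Python standard library. It is not the mlcolvar implementation: the helper name `column_stats` is hypothetical, the lowercase statistic keys are an assumption, and the choice of the sample standard deviation is also an assumption.

```python
import statistics


def column_stats(columns):
    """Return {column name: {statistic name: value}} for numeric columns."""
    stats = {}
    for name, values in columns.items():
        stats[name] = {
            "mean": statistics.fmean(values),
            "std": statistics.stdev(values),  # sample standard deviation (assumed)
            "min": min(values),
            "max": max(values),
        }
    return stats


stats = column_stats(
    {"data": [1.0, 2.0, 3.0, 4.0], "weights": [0.5, 1.5, 1.5, 0.5]}
)
# The outer key selects the column, the inner key selects the statistic.
data_mean = stats["data"]["mean"]
```

The nested layout keeps the statistics for each named entry together, so a consumer such as a normalization step can look up a column first and then the statistic it needs.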

Attributes

feature_names

Feature names.

keys