deeprob.spn.utils package
Submodules
deeprob.spn.utils.filter module
deeprob.spn.utils.partitioning module
- class deeprob.spn.utils.partitioning.Partition(row_ids, col_ids, uncond_vars, parent_partition=None, is_naive=False, is_conj=False)[source]
Bases: object
Create a partition, i.e. an object modeling a data slice (and some of its properties) by keeping track of its indices (i.e. row_ids and col_ids).
- Parameters
row_ids (list) – The row indices of the modeled slice.
col_ids (list) – The column indices of the modeled slice.
uncond_vars (list) – Ordered list of variables from which the conjunction variables will be extracted to horizontally split the current partition.
parent_partition (Optional[Partition]) – The optional parent partition.
is_naive (Optional[bool]) – If True and determinism is not required, a naive factorization will be learnt over the data slice modeled by the current partition; otherwise, if True and determinism is required, a disjunction will be learnt over the data slice modeled by the current partition.
is_conj (Optional[bool]) – True if the modeled slice is associated with a conjunction, i.e. all rows in the slice are identical.
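To make the notion of a data slice concrete, the sketch below selects the sub-matrix identified by row_ids and col_ids from a small dataset. It uses plain Python lists rather than the library's own code; `take_slice` is a hypothetical helper introduced only for illustration and is not part of deeprob.

```python
def take_slice(data, row_ids, col_ids):
    """Return the sub-matrix of `data` at the given row and column indices.

    Hypothetical helper: it only illustrates what a (row_ids, col_ids)
    pair refers to, not how Partition is implemented.
    """
    return [[data[r][c] for c in col_ids] for r in row_ids]


data = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 1],
]

# Rows 0 and 2, restricted to columns 1, 2 and 3.
data_slice = take_slice(data, row_ids=[0, 2], col_ids=[1, 2, 3])
print(data_slice)
```

Storing only the indices keeps a partition lightweight: the data matrix itself is never copied until a slice is actually needed.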
- set_parent_partition(parent_partition)[source]
Set the parent partition and update its sub_partitions attribute.
- Parameters
parent_partition (Partition) – The parent partition.
- is_horizontally_partitioned()[source]
- Returns
True if the partition is horizontally partitioned, False otherwise.
- get_vertical_split()[source]
If possible, split the current partition vertically.
- Return type
list[np.ndarray, np.ndarray]
- get_conj_row_ids(data, conj, min_part_inst)[source]
Return the row ids of the instances satisfying the given conjunction. The row ids are searched only within the slice modeled by this partition.
- Parameters
data (ndarray) – The input data.
conj (list) – Conjunction modeled as a list of two lists: the first contains the IDs of the variables, the second the related assignment. For example, [[8,3],[1,0]] models the conjunction X8=1 and X3=0.
min_part_inst (int) – The minimum number of instances required for a non-empty result.
- Returns
The row ids of the instances satisfying the given conjunction if the number of such instances is greater than or equal to min_part_inst; otherwise, an empty array.
- Return type
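The conjunction format and the min_part_inst threshold described above can be sketched as follows. This is a plain-Python illustration under stated assumptions, not the library's implementation: `conj_row_ids` is a hypothetical free function, the partition's rows are passed explicitly as `row_ids`, and lists stand in for arrays.

```python
def conj_row_ids(data, row_ids, conj, min_part_inst):
    """Return the ids in `row_ids` whose rows satisfy `conj`.

    `conj` is a pair [variable_ids, assignments]. If fewer than
    `min_part_inst` rows match, an empty list is returned instead.
    """
    var_ids, values = conj
    matching = [
        r for r in row_ids
        if all(data[r][v] == a for v, a in zip(var_ids, values))
    ]
    return matching if len(matching) >= min_part_inst else []


data = [
    [1, 0],
    [1, 1],
    [0, 0],
    [1, 0],
]
conj = [[0, 1], [1, 0]]  # models X0=1 and X1=0

enough = conj_row_ids(data, [0, 1, 2, 3], conj, min_part_inst=2)
too_few = conj_row_ids(data, [0, 1, 2, 3], conj, min_part_inst=3)
print(enough, too_few)
```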
- get_horizontal_split(data, min_part_inst, conj_len, arity, sd, random_state)[source]
If possible, split the current partition horizontally.
- Parameters
data (ndarray) – The input data matrix.
min_part_inst (int) – The minimum number of instances allowed per partition.
conj_len (int) – The conjunction length.
arity (int) – The maximum number of sub-partitions of a horizontally partitioned partition.
sd (bool) – True if the generated tree will be used to model an SD PC, False otherwise.
random_state (RandomState) – The random state.
- Return type
- deeprob.spn.utils.partitioning.generate_random_partitioning(data, min_part_inst, n_max_parts, conj_len, arity, sd, uncond_vars, random_state)[source]
Create a random partition tree.
- Parameters
data (ndarray) – The input data matrix.
min_part_inst (int) – The minimum number of instances allowed per partition.
n_max_parts (int) – The maximum number of partitions in the tree.
conj_len (int) – The conjunction length.
arity (int) – The maximum number of sub-partitions of a horizontally partitioned partition.
sd (bool) – True if the generated tree will be used to model an SD PC, False otherwise.
uncond_vars (list) – Ordered list of variables from which the first conj_len ones are extracted as conjunction variables to partition the root partition.
random_state (RandomState) – The random state.
- Return partition_root
The partition root of the tree.
- Return cl_parts_l
List containing the leaf partitions over which a CLTree will be learnt.
- Return conj_vars_l
List of lists. Every sublist contains the variables of a conjunction (e.g. [[3, 5]]). If a sublist occurs before another, then the former has been used first. There are no duplicates.
- Return n_partitions
The number of partitions in the generated tree.
deeprob.spn.utils.statistics module
- deeprob.spn.utils.statistics.compute_statistics(root)[source]
Compute some statistics of an SPN given its root. The computed statistics are the following:
n_nodes, the number of nodes
n_sum, the number of sum nodes
n_prod, the number of product nodes
n_leaves, the number of leaves
n_edges, the number of edges
n_params, the number of parameters
depth, the depth of the network
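As a rough illustration of how such counts can be obtained from a root node, the sketch below walks a toy node graph. Everything here is an assumption made for the example: the `Node` class and `toy_statistics` function are hypothetical, depth is taken as the number of edges on the longest root-to-leaf path, and n_params is omitted because the source does not say how parameters are counted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Toy node (hypothetical, not deeprob's Node class)."""
    kind: str  # "sum", "prod" or "leaf"
    children: List["Node"] = field(default_factory=list)


def collect(root):
    """Return every distinct node reachable from `root` (handles shared children)."""
    seen, stack, nodes = set(), [root], []
    while stack:
        node = stack.pop()
        if id(node) in seen:
            continue
        seen.add(id(node))
        nodes.append(node)
        stack.extend(node.children)
    return nodes


def depth(node):
    """Longest root-to-leaf path length in edges (an assumed convention)."""
    if not node.children:
        return 0
    return 1 + max(depth(child) for child in node.children)


def toy_statistics(root):
    nodes = collect(root)
    return {
        "n_nodes": len(nodes),
        "n_sum": sum(n.kind == "sum" for n in nodes),
        "n_prod": sum(n.kind == "prod" for n in nodes),
        "n_leaves": sum(n.kind == "leaf" for n in nodes),
        "n_edges": sum(len(n.children) for n in nodes),
        "depth": depth(root),
    }


# A sum over two products that share the same two leaves.
a, b = Node("leaf"), Node("leaf")
root = Node("sum", [Node("prod", [a, b]), Node("prod", [a, b])])
stats = toy_statistics(root)
print(stats)
```

Collecting the nodes once and deriving every count from that list avoids repeated traversals, which matters when children are shared between parents.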
- deeprob.spn.utils.statistics.compute_edges_count(root)[source]
Get the number of edges of an SPN given its root.
deeprob.spn.utils.validity module
- deeprob.spn.utils.validity.check_spn(root, labeled=True, smooth=False, decomposable=False, structured_decomposable=False)[source]
Check that an SPN has certain properties. By default, only ‘labeled’ is checked. This function combines several checks over an SPN, so the nodes are retrieved from the SPN only once rather than once per check.
- Parameters
root (Node) – The root node of the SPN.
labeled (bool) – Whether to check if the SPN is correctly labeled.
smooth (bool) – Whether to check if the SPN is smooth.
decomposable (bool) – Whether to check if the SPN is decomposable.
structured_decomposable (bool) – Whether to check if the SPN is structured decomposable.
- Raises
ValueError – If the SPN doesn’t have a certain property.
- deeprob.spn.utils.validity.is_labeled(root, nodes=None)[source]
Check if the SPN is labeled correctly. It checks that the initial id is zero and each id is consecutive.
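The labeling condition can be sketched on a bare list of ids. This is a minimal illustration, not the library's code: `ids_are_consecutive` is a hypothetical helper, and it assumes the order in which the ids are visited does not matter.

```python
def ids_are_consecutive(node_ids):
    """True if the ids are exactly 0, 1, ..., n-1 with no gaps or repeats."""
    return sorted(node_ids) == list(range(len(node_ids)))


print(ids_are_consecutive([0, 1, 2]))  # starts at zero, no gaps
print(ids_are_consecutive([1, 2, 3]))  # does not start at zero
print(ids_are_consecutive([0, 2]))     # gap between ids
```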
- deeprob.spn.utils.validity.is_smooth(root, nodes=None)[source]
Check if the SPN is smooth (or complete). It checks that all children of each sum node have the same scope.
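A minimal sketch of the smoothness condition on a toy structure follows. Nodes are plain dictionaries here, which is an assumption made for the example; `scope` and `smooth` are hypothetical helpers, not deeprob functions.

```python
def scope(node):
    """Scope of a toy node: the set of variables appearing below it."""
    if node["kind"] == "leaf":
        return frozenset(node["scope"])
    return frozenset().union(*(scope(c) for c in node["children"]))


def smooth(node):
    """True if the children of every sum node share one scope."""
    if node["kind"] == "leaf":
        return True
    if node["kind"] == "sum":
        if len({scope(c) for c in node["children"]}) != 1:
            return False
    return all(smooth(c) for c in node["children"])


def leaf(var):
    return {"kind": "leaf", "scope": [var]}


good = {"kind": "sum", "children": [leaf(0), leaf(0)]}
bad = {"kind": "sum", "children": [leaf(0), leaf(1)]}
print(smooth(good), smooth(bad))
```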
- deeprob.spn.utils.validity.is_decomposable(root, nodes=None)[source]
Check if the SPN is decomposable (or consistent). It checks that the children of each product node have pairwise disjoint scopes.
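The disjoint-scope condition can be sketched in the same spirit. As before, this is a toy illustration under stated assumptions: nodes are plain dictionaries, and `scope` and `decomposable` are hypothetical helpers rather than library functions.

```python
def scope(node):
    """Scope of a toy node: the set of variables appearing below it."""
    if node["kind"] == "leaf":
        return frozenset(node["scope"])
    return frozenset().union(*(scope(c) for c in node["children"]))


def decomposable(node):
    """True if the children of every product node have pairwise disjoint scopes."""
    if node["kind"] == "leaf":
        return True
    if node["kind"] == "prod":
        seen = set()
        for child in node["children"]:
            child_scope = scope(child)
            if seen & child_scope:  # a variable is shared with an earlier child
                return False
            seen |= child_scope
    return all(decomposable(c) for c in node["children"])


def leaf(var):
    return {"kind": "leaf", "scope": [var]}


ok = {"kind": "prod", "children": [leaf(0), leaf(1)]}
overlapping = {"kind": "prod", "children": [leaf(0), leaf(0)]}
print(decomposable(ok), decomposable(overlapping))
```

Accumulating the variables seen so far is enough to detect any pairwise overlap, since a shared variable must already be in the accumulated set when its second occurrence is reached.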
- deeprob.spn.utils.validity.is_structured_decomposable(root, nodes=None)[source]
Check if the PC is structured decomposable. It checks that product nodes follow a vtree. Note that if a PC is structured decomposable then it’s also decomposable.
- deeprob.spn.utils.validity.are_compatible(root_a, root_b, nodes_a=None, nodes_b=None)[source]
Check if two PCs are compatible.
- Parameters
root_a (Node) – The root of the first PC.
root_b (Node) – The root of the second PC.
nodes_a (Optional[List[Node]]) – The list of nodes of the first PC. If None, it will be retrieved starting from the root node.
nodes_b (Optional[List[Node]]) – The list of nodes of the second PC. If None, it will be retrieved starting from the root node.
- Returns
None if the two PCs are compatible, a reason otherwise.
- Return type