bblean.merges

Merging criteria for BitBIRCH clustering

The functionality in this module is advanced and not needed for normal usage of the library. Make sure you understand the features here before applying them.

Classes

Exceptions

DiscardSubcluster

If raised in hooks, immediately exit the merge, discarding the incident subcluster

RejectMerge

If raised in hooks, immediately exit the merge and reject it

exception bblean.merges.DiscardSubcluster

If raised in hooks, immediately exit the merge, discarding the incident subcluster

Discarded subclusters are not stored in the final tree. They only appear, with a cluster label of 0, in the output of bblean.BitBirch.get_assigments (or in the labels_ attribute when using bblean.sklearn).
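As a minimal sketch of how a hook might discard a subcluster, the class below raises DiscardSubcluster from on_check_merge_start whenever the nominee contains an index from a blocklist. The BlocklistMerge class and its blocked attribute are hypothetical, and the two stand-in classes exist only so the snippet runs without bblean installed; in real code the exception and the base class would be imported from bblean.merges.

```python
# Local stand-ins so this sketch runs without bblean installed. In real code:
#   from bblean.merges import DiscardSubcluster, MergeAcceptFunction
class DiscardSubcluster(Exception):
    """Stand-in for bblean.merges.DiscardSubcluster."""


class MergeAcceptFunction:
    """Stand-in for bblean.merges.MergeAcceptFunction."""


class BlocklistMerge(MergeAcceptFunction):
    """Hypothetical criterion that discards nominees containing blocked indices."""

    blocked = frozenset({7, 42})  # hypothetical molecule indices to drop

    def on_check_merge_start(self, threshold, new_sum, new_n, old_sum,
                             nominee_sum, old_n, nominee_n, old_idxs,
                             nominee_idxs):
        # Leave the merge immediately if any nominee index is blocked.
        if any(int(i) in self.blocked for i in nominee_idxs):
            raise DiscardSubcluster
        # Otherwise hand the threshold back unchanged.
        return threshold
```

A complete criterion would also override check_merge; it is left out here to keep the focus on the exception.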

exception bblean.merges.RejectMerge

If raised in hooks, immediately exit the merge and reject it
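One possible use of RejectMerge, sketched under assumptions: a hook that refuses any merge that would grow a cluster past a fixed size. SizeCappedMerge and max_size are hypothetical names, and the stand-in classes only make the snippet runnable without bblean; real code would import RejectMerge and MergeAcceptFunction from bblean.merges.

```python
# Local stand-ins so this sketch runs without bblean installed. In real code:
#   from bblean.merges import RejectMerge, MergeAcceptFunction
class RejectMerge(Exception):
    """Stand-in for bblean.merges.RejectMerge."""


class MergeAcceptFunction:
    """Stand-in for bblean.merges.MergeAcceptFunction."""


class SizeCappedMerge(MergeAcceptFunction):
    """Hypothetical criterion that rejects merges producing oversized clusters."""

    max_size = 3  # hypothetical cap on the merged cluster size

    def on_check_merge_start(self, threshold, new_sum, new_n, old_sum,
                             nominee_sum, old_n, nominee_n, old_idxs,
                             nominee_idxs):
        # new_n is the size the cluster would have after the merge.
        if int(new_n) > self.max_size:
            raise RejectMerge
        return threshold
```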

class bblean.merges.MergeAcceptFunction

Base class for user-defined merge criteria

To implement a custom BitBirch merge criterion, subclass this class and pass an instance of your subclass to bblean.BitBirch upon creation, as BitBirch(..., merge_criterion=instance).

Warning

This is an advanced feature. Make sure you fully understand what you are doing!

on_check_merge_start(threshold, new_sum, new_n, old_sum, nominee_sum, old_n, nominee_n, old_idxs, nominee_idxs)

Hook called before a merge is checked (meant to be overridden)

See MergeAcceptFunction.check_merge for an explanation of the arguments.

Warning

NumPy arrays passed to this function may use unsigned integer (uint) dtypes. Watch out for the pitfalls of unsigned integer arithmetic, such as wraparound when subtracting.

This function must return a threshold. Return it unchanged to keep the default behavior. If a modified value is returned, the new threshold is used for this specific merge check only.
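The return contract of this hook can be illustrated with a small sketch that tightens the threshold once the existing cluster is already large. It assumes that a higher threshold makes merges stricter; StricterForLargeClusters, large_cluster_size, and bump are hypothetical names, and the stand-in base class is only there so the snippet runs without bblean.

```python
# Local stand-in so this sketch runs without bblean installed. In real code:
#   from bblean.merges import MergeAcceptFunction
class MergeAcceptFunction:
    """Stand-in for bblean.merges.MergeAcceptFunction."""


class StricterForLargeClusters(MergeAcceptFunction):
    """Hypothetical criterion that raises the threshold for large clusters."""

    large_cluster_size = 100  # hypothetical size at which the bump applies
    bump = 0.1                # hypothetical increase, assuming higher = stricter

    def on_check_merge_start(self, threshold, new_sum, new_n, old_sum,
                             nominee_sum, old_n, nominee_n, old_idxs,
                             nominee_idxs):
        if int(old_n) >= self.large_cluster_size:
            # The returned value applies to this merge check only.
            return min(1.0, float(threshold) + self.bump)
        # Small clusters keep the threshold they were given.
        return threshold
```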

check_merge(threshold, new_sum, new_n, old_sum, nominee_sum, old_n, nominee_n, old_idxs, nominee_idxs)

Check if a merge should be accepted.

All user-defined merge criteria should override this method.

threshold: Threshold for the merge
new_sum: old_sum + nominee_sum
new_n: old_n + nominee_n
old_sum: Column-wise sum of all fingerprints in this cluster
nominee_sum: Column-wise sum of all fingerprints in the nominee cluster
old_n: Size of this cluster
nominee_n: Size of the nominee cluster
old_idxs: Indices of the molecules in this cluster before the nominee cluster is merged
nominee_idxs: Indices of the molecules in the nominee cluster

If merging a single molecule (the most common case, when calling bblean.BitBirch.fit), nominee_idxs will be a list with a single index, nominee_n = 1, and nominee_sum will be the molecule fingerprint.

This function must return a boolean that indicates whether the merge is accepted.

Warning

NumPy arrays passed to this function may use unsigned integer (uint) dtypes. Watch out for the pitfalls of unsigned integer arithmetic, such as wraparound when subtracting.
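To make the check_merge contract concrete, here is a minimal sketch of a custom criterion. It combines a size cap with an iSIM-style similarity estimate computed from the column-wise sums: the ratio of pairs sharing an on bit to pairs in which at least one member has that bit on. This formula is an illustrative assumption and is not claimed to be what the built-in classes compute. CappedSimilarityMerge, max_size, and isim_tanimoto are hypothetical names, and the stand-in base class only makes the snippet runnable without bblean. Column sums are converted to Python ints first, which sidesteps unsigned wraparound.

```python
# Local stand-in so this sketch runs without bblean installed. In real code:
#   from bblean.merges import MergeAcceptFunction
class MergeAcceptFunction:
    """Stand-in for bblean.merges.MergeAcceptFunction."""


def isim_tanimoto(col_sums, n):
    """iSIM-style Tanimoto estimate from column-wise sums of n fingerprints."""
    n = int(n)
    counts = [int(c) for c in col_sums]  # Python ints cannot wrap around
    # Pairs of fingerprints that both have a given bit on, summed over bits.
    shared = sum(c * (c - 1) // 2 for c in counts)
    # Pairs in which exactly one fingerprint has the bit on, summed over bits.
    mismatched = sum(c * (n - c) for c in counts)
    total = shared + mismatched
    return shared / total if total else 0.0


class CappedSimilarityMerge(MergeAcceptFunction):
    """Hypothetical criterion: similarity check plus a cap on cluster size."""

    max_size = 1000  # hypothetical cap on the merged cluster size

    def check_merge(self, threshold, new_sum, new_n, old_sum, nominee_sum,
                    old_n, nominee_n, old_idxs, nominee_idxs):
        if int(new_n) > self.max_size:
            return False
        # bool(...) guarantees a plain Python boolean is returned.
        return bool(isim_tanimoto(new_sum, new_n) >= threshold)
```

An instance of such a class would then be passed at construction time as BitBirch(..., merge_criterion=CappedSimilarityMerge()).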

on_check_merge_end(accepted, old_idxs, nominee_idxs)

Hook called after a merge is checked (meant to be overridden)

accepted: Whether the merge was accepted
old_idxs: Indices of the molecules in this cluster before the nominee cluster is merged
nominee_idxs: Indices of the molecules in the nominee cluster

If merging a single molecule (the most common case, when calling bblean.BitBirch.fit), nominee_idxs will be a list with a single index.

This function must not return a value.
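A typical use of this hook is bookkeeping. The sketch below tallies how many merge checks were accepted or rejected and how many nominee molecules were involved. CountingMerge and its stats attribute are hypothetical, the stand-in base class only makes the snippet runnable without bblean, and the call to super().__init__() assumes the real base class needs no constructor arguments.

```python
from collections import Counter


# Local stand-in so this sketch runs without bblean installed. In real code:
#   from bblean.merges import MergeAcceptFunction
class MergeAcceptFunction:
    """Stand-in for bblean.merges.MergeAcceptFunction."""


class CountingMerge(MergeAcceptFunction):
    """Hypothetical criterion that records the outcome of every merge check."""

    def __init__(self):
        super().__init__()  # assumption: the base class takes no arguments
        self.stats = Counter()

    def on_check_merge_end(self, accepted, old_idxs, nominee_idxs):
        outcome = "accepted" if accepted else "rejected"
        self.stats[outcome] += 1
        self.stats["nominee_mols_" + outcome] += len(nominee_idxs)
        # Intentionally no return statement: the hook must not return a value.
```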

class bblean.merges.RadiusMerge
class bblean.merges.DiameterMerge
class bblean.merges.FlexibleToleranceDiameterMerge(tolerance=0.05, n_max=1000, decay=0.001, adaptive=True)
class bblean.merges.ToleranceDiameterMerge(tolerance=0.05, n_max=1000, decay=0.001, adaptive=True)
class bblean.merges.ToleranceRadiusMerge(tolerance=0.05, n_max=1000, decay=0.001, adaptive=True)