bblean.fingerprints

Utilities for manipulating fingerprints and fingerprint files

Functions

make_fake_fingerprints

Make random fingerprints with statistics similar to (some) real databases

fps_from_smiles

Convert a sequence of SMILES strings into chemical fingerprints

pack_fingerprints

Pack binary (only 0s and 1s) uint8 fingerprint arrays

unpack_fingerprints

Unpack packed uint8 arrays into binary uint8 arrays (with only 0s and 1s)

bblean.fingerprints.make_fake_fingerprints(num, n_features=2048, pack=True, seed=None, dtype=<class 'numpy.uint8'>)

Make random fingerprints with statistics similar to (some) real databases

bblean.fingerprints.fps_from_smiles(smiles, kind='ecfp4', n_features=2048, dtype=<class 'numpy.uint8'>, sanitize='all', skip_invalid=False, pack=True)

Convert a sequence of SMILES strings into chemical fingerprints

bblean.fingerprints.pack_fingerprints(a)

Pack binary (only 0s and 1s) uint8 fingerprint arrays

bblean.fingerprints.unpack_fingerprints(a, n_features=None)

Unpack packed uint8 arrays into binary uint8 arrays (with only 0s and 1s)

Note

If n_features is not passed, unpacking recovers the correct number of features only if that number is a multiple of 8; otherwise, the unpacked fingerprints are padded with zeros up to the next multiple of 8. This is generally not an issue, since the most common fingerprint sizes (2048, 1024, etc.) are multiples of 8, but if you use a non-standard number of features you should pass n_features explicitly.
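The padding behavior described in this note can be illustrated with plain NumPy. The sketch below does not call bblean; it assumes that packing behaves like numpy.packbits along the last axis (8 bits per uint8 byte), which may differ in detail from the library's actual implementation. The 12-feature fingerprint is a made-up example.

```python
import numpy as np

# Hypothetical binary fingerprint with a non-standard size (12 features)
fp = np.array([[1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 0, 1]], dtype=np.uint8)

# Packing stores 8 bits per byte, so 12 bits need 2 bytes;
# the last 4 bits of the second byte are zero padding.
packed = np.packbits(fp, axis=-1)

# Unpacking without a feature count returns all 16 bits, padding included
unpacked_padded = np.unpackbits(packed, axis=-1)

# Unpacking with an explicit count recovers the original 12 features
unpacked_exact = np.unpackbits(packed, axis=-1, count=12)

print(packed.shape)           # (1, 2)
print(unpacked_padded.shape)  # (1, 16)
print(unpacked_exact.shape)   # (1, 12)
```

Under that assumption, passing n_features to unpack_fingerprints would play the same role as count does here: it trims the trailing zero padding so the round trip returns the original array.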