DENOPTIM
Classes | |
class | DistanceAsRMSD |
Distance in terms of RMSD between sets of 3D points expressed as a single vector of coordinates [x1,y1,z1,x2,y2,z2,...,xN,yN,zN]. | |
Public Member Functions | |
FragmentClusterer (List< ClusterableFragment > data, FragmenterParameters settings) | |
Constructor for a clusterer of fragments. | |
FragmentClusterer (List< ClusterableFragment > data, FragmenterParameters settings, Logger logger) | |
Constructor for a clusterer of fragments. | |
void | cluster () |
Runs the clustering algorithm. | |
List< DynamicCentroidCluster > | getClusters () |
Once the clustering is done, this method returns the list of resulting clusters. | |
List< List< Fragment > > | getTransformedClusters () |
Once the clustering is done, this method returns the list of clusters. | |
List< Fragment > | getClusterCentroids () |
Once the clustering is done, this method returns the list of cluster centroids. | |
List< Fragment > | getNearestToClusterCentroids () |
Once the clustering is done, this method returns the list of fragments that are nearest to the respective cluster centroid. | |
Static Protected Member Functions | |
static SummaryStatistics | getRMSDStatsOfNoisyDistorsions (double[] center, int sampleSize, double maxNoise) |
Computes statistics for a unimodal, normally noise-distorted population of points generated by distorting a given N-dimensional vector. | |
Protected Attributes | |
FragmenterParameters | settings |
Settings from the user. | |
Private Member Functions | |
boolean | mergeClusters () |
Private Attributes | |
List< ClusterableFragment > | data |
The list of fragments to be clustered. | |
List< DynamicCentroidCluster > | clusters |
Current list of clusters. | |
Logger | logger |
Logger. | |
This tool clusters fragments based on geometric features. For each fragment, all atoms and all attachment points are used to define a set of points in space (see ClusterableFragment). The RMSD of the points' positions upon superposition is then used to decide whether geometries belong to the same cluster. The threshold RMSD value used to make this decision is calculated from a population of geometries generated from the centroid of the cluster by altering its coordinates with normally distributed noise. This population of normally distorted geometries is unimodal by construction, and is used to calculate the threshold RMSD as
threshold = RMSD_mean + x * RMSD_standard_deviation
where mean and standard deviation are the values for a normally distributed noise-distorted population generated on-the-fly for the cluster centroid of interest.
The factor x above, the size of the noise-distorted population, and the maximum amount of noise are parameters that are defined via the FragmenterParameters object given to the constructor.
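The threshold rule above can be illustrated with a minimal, self-contained sketch. It is not the DENOPTIM implementation: the class name `RmsdThresholdSketch` and both method names are hypothetical, the RMSD is assumed to be the usual root-mean-square of per-point distances over flat `[x1,y1,z1,...]` vectors that are already superposed, and the standard deviation is assumed to be the bias-corrected (n-1) sample value.

```java
/** Hypothetical illustration only; not part of DENOPTIM. */
final class RmsdThresholdSketch {

    /**
     * RMSD between two flat coordinate vectors [x1,y1,z1,...,xN,yN,zN].
     * Assumes the two geometries are already superposed.
     */
    static double rmsd(double[] a, double[] b) {
        if (a.length != b.length || a.length == 0 || a.length % 3 != 0) {
            throw new IllegalArgumentException("Expected two equal-length 3N vectors");
        }
        double sumSq = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sumSq += d * d;
        }
        int nPoints = a.length / 3;
        return Math.sqrt(sumSq / nPoints);
    }

    /** threshold = mean + x * standard deviation of a sample of RMSD values. */
    static double threshold(double[] rmsdSample, double x) {
        if (rmsdSample.length < 2) {
            throw new IllegalArgumentException("Need at least two RMSD values");
        }
        double mean = 0.0;
        for (double v : rmsdSample) {
            mean += v / rmsdSample.length;
        }
        double sumSq = 0.0;
        for (double v : rmsdSample) {
            sumSq += (v - mean) * (v - mean);
        }
        // Assumption: bias-corrected sample standard deviation (divide by n-1).
        double sd = Math.sqrt(sumSq / (rmsdSample.length - 1));
        return mean + x * sd;
    }

    public static void main(String[] args) {
        double[] sample = {1.0, 2.0, 3.0};
        System.out.println("threshold = " + threshold(sample, 2.0));
    }
}
```

A larger factor x makes the threshold more permissive, so more geometries are treated as members of the same unimodal population.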
Definition at line 48 of file FragmentClusterer.java.
denoptim.fragmenter.FragmentClusterer.FragmentClusterer | ( | List< ClusterableFragment > | data, |
FragmenterParameters | settings | ||
) |
Constructor for a clusterer of fragments.
Clustering is based on the geometry of the arrangement of atoms and attachment points. To compare the positions in space of each point consistently, we need a consistent mapping of the points, i.e., a definition of what is the correct order of points for each fragment to analyze. Use FragmentAlignement to find such a mapping and produce ClusterableFragment objects that have ordered sets of points reflecting a consistent mapping throughout the list of fragments.
data | collection of fragments to cluster. The coordinates vector of each of these is expected to have a consistent ordering, but the values of the coordinates will be edited to align the geometries. |
settings | configuration of the clustering method. This includes the size and the maximum amount of noise of the reference unimodal population with normally distributed noise, which is used to calculate the RMSD of a unimodal distribution of distortions. It also defines the factor used to weight the standard deviation when adding it to the mean of the RMSD of the unimodal population. The resulting value is the threshold RMSD value that is used to decide if two geometries are part of the same unimodal distribution, i.e., the same cluster. |
DENOPTIMException | if an isomorphism is not found. |
Definition at line 98 of file FragmentClusterer.java.
References denoptim.fragmenter.FragmentClusterer.data, and denoptim.fragmenter.FragmentClusterer.settings.
denoptim.fragmenter.FragmentClusterer.FragmentClusterer | ( | List< ClusterableFragment > | data, |
FragmenterParameters | settings, | ||
Logger | logger | ||
) |
Constructor for a clusterer of fragments.
Clustering is based on the geometry of the arrangement of atoms and attachment points. To compare the positions in space of each point consistently, we need a consistent mapping of the points, i.e., a definition of what is the correct order of points for each fragment to analyze. Use FragmentAlignement to find such a mapping and produce ClusterableFragment objects that have ordered sets of points reflecting a consistent mapping throughout the list of fragments.
data | collection of fragments to cluster. The coordinates vector of each of these is expected to have a consistent ordering, but the values of the coordinates will be edited to align the geometries. |
settings | configuration of the clustering method. This includes the size and the maximum amount of noise of the reference unimodal population with normally distributed noise, which is used to calculate the RMSD of a unimodal distribution of distortions. It also defines the factor used to weight the standard deviation when adding it to the mean of the RMSD of the unimodal population. The resulting value is the threshold RMSD value that is used to decide if two geometries are part of the same unimodal distribution, i.e., the same cluster. |
logger | where to write all log messages. Because fragment clustering tasks are likely to run in parallel, there are usually multiple instances of this class, and each instance is best logged independently. Therefore, a specific logger can be given here to be used instead of the one from the FragmenterParameters parameter. |
DENOPTIMException | if an isomorphism is not found. |
Definition at line 136 of file FragmentClusterer.java.
References denoptim.fragmenter.FragmentClusterer.data, denoptim.fragmenter.FragmentClusterer.logger, and denoptim.fragmenter.FragmentClusterer.settings.
void denoptim.fragmenter.FragmentClusterer.cluster | ( | ) |
Runs the clustering algorithm:
The threshold for merging is derived from the RMSD of a sample of distorted geometries of the centroid, where the distortion is normally distributed. The threshold is calculated as:
threshold = RMSD_mean + x * RMSD_standard_deviation
where mean and standard deviation are calculated on the sample of normally distorted geometries of the centroid (see getRMSDStatsOfNoisyDistorsions(double[], int, double)). The factor x, the maximum amount of noise, and the size of the sample are controlled by the settings given upon construction of an instance of this class.
Definition at line 169 of file FragmentClusterer.java.
References denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.clusters, denoptim.fragmenter.FragmentClusterer.data, denoptim.fragmenter.FragmentClusterer.logger, denoptim.fragmenter.FragmentClusterer.mergeClusters(), denoptim.programs.RunTimeParameters.NL, and denoptim.fragmenter.FragmentClusterer.settings.
Referenced by denoptim.fragmenter.ConformerExtractorTask.call(), denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.getClusterCentroids(), denoptim.fragmenter.FragmentClusterer.getNearestToClusterCentroids(), denoptim.fragmenter.FragmentClusterer.getTransformedClusters(), denoptim.fragmenter.FragmentClustererTest.testCluster(), and denoptim.fragmenter.FragmentClustererTest.testCluster2().
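As a rough illustration of threshold-based merging, the following sketch starts with one cluster per data point and merges clusters whose centroids are closer than a threshold until no further merge occurs. It is a simplification under stated assumptions, not the actual `cluster()`/`mergeClusters()` code: the class `CentroidMergeSketch`, its nested `Cluster` type, and `mergeOnce` are hypothetical names, the threshold is passed in as a fixed number rather than recomputed for each centroid, and no superposition of geometries is performed.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; not part of DENOPTIM. */
final class CentroidMergeSketch {

    /** A cluster whose centroid is recomputed from its current members. */
    static final class Cluster {
        final List<double[]> points = new ArrayList<>();

        double[] centroid() {
            double[] c = new double[points.get(0).length];
            for (double[] p : points) {
                for (int j = 0; j < c.length; j++) {
                    c[j] += p[j];
                }
            }
            for (int j = 0; j < c.length; j++) {
                c[j] /= points.size();
            }
            return c;
        }
    }

    /** RMSD over flat [x1,y1,z1,...] vectors, assumed already superposed. */
    static double rmsd(double[] a, double[] b) {
        double sumSq = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sumSq += d * d;
        }
        return Math.sqrt(sumSq / (a.length / 3));
    }

    /** One cluster per data point, then merge until nothing changes. */
    static List<Cluster> cluster(List<double[]> data, double threshold) {
        List<Cluster> clusters = new ArrayList<>();
        for (double[] p : data) {
            Cluster c = new Cluster();
            c.points.add(p);
            clusters.add(c);
        }
        boolean merged = true;
        while (merged) {
            merged = mergeOnce(clusters, threshold);
        }
        return clusters;
    }

    /** Merges the first pair of clusters found within the threshold. */
    static boolean mergeOnce(List<Cluster> clusters, double threshold) {
        for (int i = 0; i < clusters.size(); i++) {
            for (int j = i + 1; j < clusters.size(); j++) {
                double d = rmsd(clusters.get(i).centroid(), clusters.get(j).centroid());
                if (d < threshold) {
                    clusters.get(i).points.addAll(clusters.get(j).points);
                    clusters.remove(j);
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<double[]> data = new ArrayList<>();
        data.add(new double[]{0.0, 0.0, 0.0});
        data.add(new double[]{0.1, 0.0, 0.0});
        data.add(new double[]{5.0, 0.0, 0.0});
        System.out.println("clusters: " + cluster(data, 0.5).size());
    }
}
```

Because every successful merge removes one cluster, the loop is guaranteed to terminate after at most as many merges as there are data points minus one.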
List< Fragment > denoptim.fragmenter.FragmentClusterer.getClusterCentroids | ( | ) |
Once the clustering is done, this method returns the list of cluster centroids.
Note that the centroids are not part of the initial data. Use getNearestToClusterCentroids() to get the actual fragments from the initial dataset that are closest to their respective cluster centroid.
clusters has not been called.
Definition at line 463 of file FragmentClusterer.java.
References denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.clusters, and denoptim.fragmenter.ClusterableFragment.getTransformedCopy().
Referenced by denoptim.fragmenter.ConformerExtractorTask.call(), and denoptim.fragmenter.FragmentClustererTest.testCluster().
List< DynamicCentroidCluster > denoptim.fragmenter.FragmentClusterer.getClusters | ( | ) |
Once the clustering is done, this method returns the list of resulting clusters.
Definition at line 422 of file FragmentClusterer.java.
References denoptim.fragmenter.FragmentClusterer.clusters.
Referenced by denoptim.fragmenter.FragmentClustererTest.testCluster(), and denoptim.fragmenter.FragmentClustererTest.testCluster2().
List< Fragment > denoptim.fragmenter.FragmentClusterer.getNearestToClusterCentroids | ( | ) |
Once the clustering is done, this method returns the list of fragments that are nearest to the respective cluster centroid.
Note that the centroids are not part of the initial data, but the fragment nearest to each centroid is.
clusters has not been called.
Definition at line 487 of file FragmentClusterer.java.
References denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.clusters, and denoptim.fragmenter.ClusterableFragment.getTransformedCopy().
Referenced by denoptim.fragmenter.ConformerExtractorTask.call().
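static SummaryStatistics denoptim.fragmenter.FragmentClusterer.getRMSDStatsOfNoisyDistorsions | ( | double[] | center, |
int | sampleSize, | ||
double | maxNoise | ||
) |
static protected |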
Computes statistics for a unimodal, normally noise-distorted population of points generated by distorting a given N-dimensional vector.
This is done by producing a dataset of N-dimensional points by adding normally distributed noise to the given N-dimensional center. Then, this method computes the new centroid of the dataset and produces the statistics of the RMSD upon superposition of the new centroid to the centroid.
center | the N-dimensional point around which noise is added. |
sampleSize | the size of the distribution of N-dimensional points that we generate around the center. |
maxNoise | absolute value of the maximum noise. Noise is generated with a Normal distribution centered at 0.0 and going from -maxNoise to +maxNoise. |
Definition at line 345 of file FragmentClusterer.java.
References denoptim.utils.MathUtils.centroidOf(), denoptim.fragmenter.FragmentClusterer.DistanceAsRMSD.compute(), and denoptim.utils.Randomizer.nextNormalDouble().
Referenced by denoptim.fragmenter.FragmentClusterer.mergeClusters(), and denoptim.fragmenter.FragmentClustererTest.testGetRMSDStatsOfNoisyDistorsions().
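The procedure described for this method can be sketched with the standard library alone. This is one possible reading of the description, not the DENOPTIM code: the class `NoisyDistortionStatsSketch`, the explicit `seed` argument, and the `double[] {mean, sd}` return value are hypothetical stand-ins (the real method returns a SummaryStatistics). The noise is assumed to be a normal deviate scaled by maxNoise and redrawn until it falls within [-maxNoise, +maxNoise], and the RMSD is assumed to be taken between each noisy vector and the centroid of the noisy sample.

```java
import java.util.Random;

/** Hypothetical illustration only; not part of DENOPTIM. */
final class NoisyDistortionStatsSketch {

    /** Returns {mean, standard deviation} of the RMSD of noisy copies of center. */
    static double[] rmsdStats(double[] center, int sampleSize, double maxNoise, long seed) {
        if (sampleSize < 2 || maxNoise < 0.0 || center.length == 0 || center.length % 3 != 0) {
            throw new IllegalArgumentException("Invalid sample size, noise, or vector length");
        }
        Random rng = new Random(seed);
        double[][] sample = new double[sampleSize][center.length];
        double[] centroid = new double[center.length];
        for (int k = 0; k < sampleSize; k++) {
            for (int j = 0; j < center.length; j++) {
                sample[k][j] = center[j] + boundedNormalNoise(rng, maxNoise);
                centroid[j] += sample[k][j];
            }
        }
        for (int j = 0; j < centroid.length; j++) {
            centroid[j] /= sampleSize;
        }
        double[] rmsds = new double[sampleSize];
        double mean = 0.0;
        for (int k = 0; k < sampleSize; k++) {
            rmsds[k] = rmsd(sample[k], centroid);
            mean += rmsds[k] / sampleSize;
        }
        double sumSq = 0.0;
        for (double r : rmsds) {
            sumSq += (r - mean) * (r - mean);
        }
        // Assumption: bias-corrected sample standard deviation (divide by n-1).
        double sd = Math.sqrt(sumSq / (sampleSize - 1));
        return new double[] {mean, sd};
    }

    /** Assumption: normal deviate scaled by maxNoise, redrawn until within bounds. */
    static double boundedNormalNoise(Random rng, double maxNoise) {
        double noise;
        do {
            noise = maxNoise * rng.nextGaussian();
        } while (Math.abs(noise) > maxNoise);
        return noise;
    }

    /** RMSD over flat [x1,y1,z1,...] vectors. */
    static double rmsd(double[] a, double[] b) {
        double sumSq = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sumSq += d * d;
        }
        return Math.sqrt(sumSq / (a.length / 3));
    }

    public static void main(String[] args) {
        double[] stats = rmsdStats(new double[]{0, 0, 0, 1, 0, 0}, 50, 0.1, 42L);
        System.out.println("mean=" + stats[0] + " sd=" + stats[1]);
    }
}
```

The resulting mean and standard deviation are the two quantities that enter the threshold formula, weighted by the factor x from the settings.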
List< List< Fragment > > denoptim.fragmenter.FragmentClusterer.getTransformedClusters | ( | ) |
Once the clustering is done, this method returns the list of clusters.
Each cluster contains objects that are transformed to best align with the centroid of the cluster.
clusters has not been called.
Definition at line 436 of file FragmentClusterer.java.
References denoptim.fragmenter.FragmentClusterer.cluster(), and denoptim.fragmenter.FragmentClusterer.clusters.
Referenced by denoptim.fragmenter.ConformerExtractorTask.call(), and denoptim.fragmenter.FragmentClustererTest.testCluster().
boolean denoptim.fragmenter.FragmentClusterer.mergeClusters | ( | ) |
private |
Definition at line 207 of file FragmentClusterer.java.
References denoptim.fragmenter.DynamicCentroidCluster.addPoint(), denoptim.fragmenter.FragmentClusterer.clusters, denoptim.fragmenter.ClusterableFragment.convertToPointArray(), denoptim.fragmenter.DynamicCentroidCluster.getCentroid(), denoptim.programs.fragmenter.FragmenterParameters.getFactorForSDOnStatsOfUnimodalPop(), denoptim.programs.fragmenter.FragmenterParameters.getMaxNoiseUnimodalPop(), denoptim.fragmenter.ClusterableFragment.getPoint(), denoptim.fragmenter.DynamicCentroidCluster.getPoints(), denoptim.fragmenter.FragmentClusterer.getRMSDStatsOfNoisyDistorsions(), denoptim.programs.fragmenter.FragmenterParameters.getSizeUnimodalPop(), denoptim.fragmenter.FragmentClusterer.logger, denoptim.fragmenter.DynamicCentroidCluster.removeAll(), and denoptim.fragmenter.FragmentClusterer.settings.
Referenced by denoptim.fragmenter.FragmentClusterer.cluster().
List< DynamicCentroidCluster > denoptim.fragmenter.FragmentClusterer.clusters |
private |
Current list of clusters.
Initially empty, it then contains one cluster per data point, and is finally pruned to retain only the clusters that survive the merging.
Definition at line 60 of file FragmentClusterer.java.
Referenced by denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.getClusterCentroids(), denoptim.fragmenter.FragmentClusterer.getClusters(), denoptim.fragmenter.FragmentClusterer.getNearestToClusterCentroids(), denoptim.fragmenter.FragmentClusterer.getTransformedClusters(), and denoptim.fragmenter.FragmentClusterer.mergeClusters().
List< ClusterableFragment > denoptim.fragmenter.FragmentClusterer.data |
private |
The list of fragments to be clustered.
Definition at line 53 of file FragmentClusterer.java.
Referenced by denoptim.fragmenter.FragmentClusterer.cluster(), and denoptim.fragmenter.FragmentClusterer.FragmentClusterer().
Logger denoptim.fragmenter.FragmentClusterer.logger |
private |
Logger.
Definition at line 71 of file FragmentClusterer.java.
Referenced by denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.FragmentClusterer(), and denoptim.fragmenter.FragmentClusterer.mergeClusters().
FragmenterParameters denoptim.fragmenter.FragmentClusterer.settings |
protected |
Settings from the user.
Definition at line 66 of file FragmentClusterer.java.
Referenced by denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.FragmentClusterer(), and denoptim.fragmenter.FragmentClusterer.mergeClusters().