DENOPTIM
denoptim.fragmenter.FragmentClusterer Class Reference

Classes

class  DistanceAsRMSD
 Distance in terms of RMSD between sets of 3D points expressed as a single vector of coordinates [x1,y1,z1,x2,y2,z2,...xN,yN,zN]. More...
 

Public Member Functions

 FragmentClusterer (List< ClusterableFragment > data, FragmenterParameters settings)
 Constructor for a clusterer of fragments. More...
 
 FragmentClusterer (List< ClusterableFragment > data, FragmenterParameters settings, Logger logger)
 Constructor for a clusterer of fragments. More...
 
void cluster ()
 Runs the clustering algorithm: More...
 
List< DynamicCentroidCluster > getClusters ()
 Once the clustering is done, this method returns the list of resulting clusters. More...
 
List< List< Fragment > > getTransformedClusters ()
 Once the clustering is done, this method returns the list of clusters. More...
 
List< Fragment > getClusterCentroids ()
 Once the clustering is done, this method returns the list of cluster centroids. More...
 
List< Fragment > getNearestToClusterCentroids ()
 Once the clustering is done, this method returns the list of fragments that are nearest to the respective cluster centroid. More...
 

Static Protected Member Functions

static SummaryStatistics getRMSDStatsOfNoisyDistorsions (double[] center, int sampleSize, double maxNoise)
 Computes statistics for a unimodal, normally noise-distorted population of points generated by distorting a given N-dimensional vector. More...
 

Protected Attributes

FragmenterParameters settings
 Settings from the user. More...
 

Private Member Functions

boolean mergeClusters ()
 

Private Attributes

List< ClusterableFragment > data
 The list of fragments to be clustered. More...
 
List< DynamicCentroidCluster > clusters
 Current list of clusters. More...
 
Logger logger
 Logger. More...
 

Detailed Description

This tool clusters fragments based on geometric features. For each fragment, all atoms and all attachment points are used to define a set of points in space (see ClusterableFragment). The RMSD of the points' positions upon superposition is then used to decide whether geometries belong to the same cluster. The threshold RMSD value used to make this decision is calculated from a unimodal distribution of geometries generated from the centroid of the cluster by perturbing its geometry with normally distributed noise. The population of these normally distorted geometries is unimodal by definition, and is used to calculate the threshold RMSD as

threshold = RMSD_mean + x * RMSD_standard_deviation

where mean and standard deviation are the values for a normally distributed noise-distorted population generated on-the-fly for the cluster centroid of interest.

The factor x above, the size of the noise-distorted population, and the max amount of noise are parameters that are defined via the FragmenterParameters object given to the constructor.
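The threshold formula above can be illustrated with a minimal, self-contained sketch. The class and method names below are hypothetical and are not part of DENOPTIM; the use of the sample (n - 1) standard deviation is an assumption about how the statistics are computed.

```java
/** Illustrative sketch only; this class is hypothetical and not part of DENOPTIM. */
final class RmsdThresholdSketch {

    /** Arithmetic mean of the sampled RMSD values. */
    static double mean(double[] values) {
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Sample standard deviation (n - 1 in the denominator; an assumption). */
    static double standardDeviation(double[] values) {
        if (values.length < 2) {
            return 0.0;
        }
        double m = mean(values);
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - m) * (v - m);
        }
        return Math.sqrt(sumSq / (values.length - 1));
    }

    /** threshold = RMSD_mean + x * RMSD_standard_deviation */
    static double threshold(double[] rmsdSample, double x) {
        return mean(rmsdSample) + x * standardDeviation(rmsdSample);
    }

    public static void main(String[] args) {
        // Made-up RMSD values standing in for a noise-distorted population.
        double[] rmsdSample = {0.10, 0.12, 0.08, 0.11, 0.09};
        System.out.println(threshold(rmsdSample, 1.0));
    }
}
```

A larger factor x widens the threshold, so more geometries are treated as members of the same unimodal distribution.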

Author
Marco Foscato

Definition at line 48 of file FragmentClusterer.java.

Constructor & Destructor Documentation

◆ FragmentClusterer() [1/2]

denoptim.fragmenter.FragmentClusterer.FragmentClusterer ( List< ClusterableFragment >  data,
FragmenterParameters  settings 
)

Constructor for a clusterer of fragments.

Clustering is based on the geometry of the arrangement of atoms and attachment points. To compare the positions in space of each point consistently, we need a consistent mapping of the points, i.e., a definition of what is the correct order of points for each fragment to analyze. Use FragmentAlignement to find such a mapping and produce ClusterableFragment instances that have an ordered set of points reflecting a consistent mapping throughout the list of fragments.

Parameters
data  collection of fragments to cluster. The coordinates vector of each of these is expected to have a consistent ordering, but the values of the coordinates will be edited to align the geometries.
settings  configuration of the clustering method. This includes the size and the maximum amount of noise of the reference unimodal population with normally distributed noise that is used to calculate the RMSD of a unimodal distribution of distortions. It also defines the factor used to weight the standard deviation when adding it to the mean of the RMSD of the unimodal population. The resulting value is the threshold RMSD value that is used to decide if two geometries are part of the same unimodal distribution, i.e., the same cluster.
Exceptions
DENOPTIMException  if an isomorphism is not found.

Definition at line 98 of file FragmentClusterer.java.

References denoptim.fragmenter.FragmentClusterer.data, and denoptim.fragmenter.FragmentClusterer.settings.

◆ FragmentClusterer() [2/2]

denoptim.fragmenter.FragmentClusterer.FragmentClusterer ( List< ClusterableFragment >  data,
FragmenterParameters  settings,
Logger  logger 
)

Constructor for a clusterer of fragments.

Clustering is based on the geometry of the arrangement of atoms and attachment points. To compare the positions in space of each point consistently, we need a consistent mapping of the points, i.e., a definition of what is the correct order of points for each fragment to analyze. Use FragmentAlignement to find such a mapping and produce ClusterableFragment instances that have an ordered set of points reflecting a consistent mapping throughout the list of fragments.

Parameters
data  collection of fragments to cluster. The coordinates vector of each of these is expected to have a consistent ordering, but the values of the coordinates will be edited to align the geometries.
settings  configuration of the clustering method. This includes the size and the maximum amount of noise of the reference unimodal population with normally distributed noise that is used to calculate the RMSD of a unimodal distribution of distortions. It also defines the factor used to weight the standard deviation when adding it to the mean of the RMSD of the unimodal population. The resulting value is the threshold RMSD value that is used to decide if two geometries are part of the same unimodal distribution, i.e., the same cluster.
logger  where to send all log messages. Because fragment clustering tasks are likely to be parallelized, there are usually multiple instances of this class, and we usually prefer to log each instance independently. Therefore, a specific logger can be given here to use instead of the one from the FragmenterParameters parameter.
Exceptions
DENOPTIMException  if an isomorphism is not found.

Definition at line 136 of file FragmentClusterer.java.

References denoptim.fragmenter.FragmentClusterer.data, denoptim.fragmenter.FragmentClusterer.logger, and denoptim.fragmenter.FragmentClusterer.settings.

Member Function Documentation

◆ cluster()

void denoptim.fragmenter.FragmentClusterer.cluster ( )

Runs the clustering algorithm:

  1. creates a cluster for each fragment
  2. tries to merge clusters. The condition for merging is that the centroids of the clusters have an RMSD upon superposition that is lower than the threshold (see below).
  3. repeat the merging until no more changes occur in the list of clusters.

The threshold for merging is derived from the RMSD of a sample of distorted geometries of the centroid, where the distortion is normally distributed. The threshold is calculated as:

threshold = RMSD_mean + x * RMSD_standard_deviation

where mean and standard deviation are calculated on the sample of normally distorted geometries of the centroid (see getRMSDStatsOfNoisyDistorsions(double[], int, double)). The factor x, the maximum amount of noise, and the size of the sample are controlled by the settings given upon construction of an instance of this class.
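The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the DENOPTIM implementation: all names are hypothetical, geometries are compared without superposition, a single fixed threshold is passed in instead of being derived from each centroid, and the RMSD is normalized by the number of 3D points.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; these types are hypothetical and not part of DENOPTIM. */
final class MergeLoopSketch {

    /** A cluster whose centroid is recomputed from its current members. */
    static final class Cluster {
        final List<double[]> members = new ArrayList<>();

        double[] centroid() {
            int n = members.get(0).length;
            double[] c = new double[n];
            for (double[] m : members) {
                for (int i = 0; i < n; i++) {
                    c[i] += m[i] / members.size();
                }
            }
            return c;
        }
    }

    /** RMSD between two flattened vectors [x1,y1,z1,...]; no superposition (assumption). */
    static double rmsd(double[] a, double[] b) {
        double sumSq = 0.0;
        for (int i = 0; i < a.length; i++) {
            sumSq += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sumSq / (a.length / 3.0));
    }

    /** Step 2: merges the first pair of clusters whose centroids are closer than the threshold. */
    static boolean mergeOnce(List<Cluster> clusters, double threshold) {
        for (int i = 0; i < clusters.size(); i++) {
            for (int j = i + 1; j < clusters.size(); j++) {
                double d = rmsd(clusters.get(i).centroid(), clusters.get(j).centroid());
                if (d < threshold) {
                    clusters.get(i).members.addAll(clusters.get(j).members);
                    clusters.remove(j);
                    return true;
                }
            }
        }
        return false;
    }

    static List<Cluster> cluster(List<double[]> data, double threshold) {
        // Step 1: one cluster per data point.
        List<Cluster> clusters = new ArrayList<>();
        for (double[] d : data) {
            Cluster c = new Cluster();
            c.members.add(d);
            clusters.add(c);
        }
        // Step 3: repeat the merging until the list of clusters stops changing.
        while (mergeOnce(clusters, threshold)) {
            // nothing else to do; mergeOnce mutates the list
        }
        return clusters;
    }
}
```

Restarting the scan after every merge keeps the sketch short, because each merge moves a centroid and can change which pairs fall under the threshold.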

Definition at line 169 of file FragmentClusterer.java.

References denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.clusters, denoptim.fragmenter.FragmentClusterer.data, denoptim.fragmenter.FragmentClusterer.logger, denoptim.fragmenter.FragmentClusterer.mergeClusters(), denoptim.programs.RunTimeParameters.NL, and denoptim.fragmenter.FragmentClusterer.settings.

Referenced by denoptim.fragmenter.ConformerExtractorTask.call(), denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.getClusterCentroids(), denoptim.fragmenter.FragmentClusterer.getNearestToClusterCentroids(), denoptim.fragmenter.FragmentClusterer.getTransformedClusters(), denoptim.fragmenter.FragmentClustererTest.testCluster(), and denoptim.fragmenter.FragmentClustererTest.testCluster2().


◆ getClusterCentroids()

List< Fragment > denoptim.fragmenter.FragmentClusterer.getClusterCentroids ( )

Once the clustering is done, this method returns the list of cluster centroids.

Note that the centroids are not part of the initial data. Use getNearestToClusterCentroids() to get the fragments from the initial dataset that are closest to their respective cluster centroid.

Returns
the cluster centroids, or an empty list if cluster() has not been called.

Definition at line 463 of file FragmentClusterer.java.

References denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.clusters, and denoptim.fragmenter.ClusterableFragment.getTransformedCopy().

Referenced by denoptim.fragmenter.ConformerExtractorTask.call(), and denoptim.fragmenter.FragmentClustererTest.testCluster().


◆ getClusters()

List< DynamicCentroidCluster > denoptim.fragmenter.FragmentClusterer.getClusters ( )

Once the clustering is done, this method returns the list of resulting clusters.

Returns
the clusters.

Definition at line 422 of file FragmentClusterer.java.

References denoptim.fragmenter.FragmentClusterer.clusters.

Referenced by denoptim.fragmenter.FragmentClustererTest.testCluster(), and denoptim.fragmenter.FragmentClustererTest.testCluster2().


◆ getNearestToClusterCentroids()

List< Fragment > denoptim.fragmenter.FragmentClusterer.getNearestToClusterCentroids ( )

Once the clustering is done, this method returns the list of fragments that are nearest to the respective cluster centroid.

Note that the centroids are not part of the initial data, whereas the fragment nearest to each centroid is.

Returns
the fragment that is closest to each cluster centroid, or an empty list if cluster() has not been called.
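The difference between a centroid and the member nearest to it can be shown with a small sketch. The names are hypothetical and not part of DENOPTIM, and the distance is a plain RMSD without superposition, which is a simplifying assumption.

```java
import java.util.List;

/** Illustrative sketch only; this class is hypothetical and not part of DENOPTIM. */
final class NearestToCentroidSketch {

    /** RMSD between two flattened vectors [x1,y1,z1,...]; no superposition (assumption). */
    static double rmsd(double[] a, double[] b) {
        double sumSq = 0.0;
        for (int i = 0; i < a.length; i++) {
            sumSq += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sumSq / (a.length / 3.0));
    }

    /** Returns the index of the member closest to the centroid, or -1 for an empty cluster. */
    static int indexOfNearest(List<double[]> members, double[] centroid) {
        int best = -1;
        double bestDistance = Double.POSITIVE_INFINITY;
        for (int i = 0; i < members.size(); i++) {
            double d = rmsd(members.get(i), centroid);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return best;
    }
}
```

The centroid is an average and generally matches none of the input geometries, while the selected member is always one of the original data points.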

Definition at line 487 of file FragmentClusterer.java.

References denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.clusters, and denoptim.fragmenter.ClusterableFragment.getTransformedCopy().

Referenced by denoptim.fragmenter.ConformerExtractorTask.call().


◆ getRMSDStatsOfNoisyDistorsions()

static SummaryStatistics denoptim.fragmenter.FragmentClusterer.getRMSDStatsOfNoisyDistorsions ( double[]  center,
int  sampleSize,
double  maxNoise 
)
static protected

Computes statistics for a unimodal, normally noise-distorted population of points generated by distorting a given N-dimensional vector.

This is done by producing a dataset of N-dimensional points by adding normally distributed noise to the given N-dimensional center. Then, this method computes the new centroid of the dataset and produces the statistics of the RMSD upon superposition of the new centroid to the centroid.

Parameters
center  the N-dimensional point around which noise is added.
sampleSize  the size of the distribution of N-dimensional points that we generate around the center.
maxNoise  absolute value of the maximum noise. Noise is generated with a Normal distribution centered at 0.0 and going from -maxNoise to +maxNoise.
Returns
the statistics of the RMSD for the normally distributed noise-distorted population.
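The procedure can be sketched as below, under several assumptions that the documentation does not confirm: the RMSD is taken between each distorted point and the centroid of the generated dataset, without superposition; the noise is bounded by resampling a Gaussian with a standard deviation of maxNoise / 3; and the returned statistics are reduced to a mean and a sample standard deviation. All names are hypothetical and not part of DENOPTIM.

```java
import java.util.Random;

/** Illustrative sketch only; these types are hypothetical and not part of DENOPTIM. */
final class NoisyDistortionSketch {

    /** Mean and sample standard deviation of a set of RMSD values. */
    static final class Stats {
        final double mean;
        final double stdDev;

        Stats(double mean, double stdDev) {
            this.mean = mean;
            this.stdDev = stdDev;
        }
    }

    /** Normal noise centered at 0.0, resampled until it lies in [-maxNoise, +maxNoise]. */
    static double boundedNoise(Random rng, double maxNoise) {
        if (maxNoise <= 0.0) {
            return 0.0;
        }
        double sigma = maxNoise / 3.0; // assumption: three sigma spans the allowed range
        double value;
        do {
            value = rng.nextGaussian() * sigma;
        } while (Math.abs(value) > maxNoise);
        return value;
    }

    /** RMSD between two flattened vectors [x1,y1,z1,...]; no superposition (assumption). */
    static double rmsd(double[] a, double[] b) {
        double sumSq = 0.0;
        for (int i = 0; i < a.length; i++) {
            sumSq += (a[i] - b[i]) * (a[i] - b[i]);
        }
        return Math.sqrt(sumSq / (a.length / 3.0));
    }

    static Stats rmsdStats(double[] center, int sampleSize, double maxNoise, Random rng) {
        // Generate the distorted population and accumulate its centroid.
        double[][] sample = new double[sampleSize][center.length];
        double[] centroid = new double[center.length];
        for (int s = 0; s < sampleSize; s++) {
            for (int i = 0; i < center.length; i++) {
                sample[s][i] = center[i] + boundedNoise(rng, maxNoise);
                centroid[i] += sample[s][i] / sampleSize;
            }
        }
        // RMSD of every distorted point with respect to the new centroid.
        double[] values = new double[sampleSize];
        double sum = 0.0;
        for (int s = 0; s < sampleSize; s++) {
            values[s] = rmsd(sample[s], centroid);
            sum += values[s];
        }
        double mean = sum / sampleSize;
        double sumSq = 0.0;
        for (double v : values) {
            sumSq += (v - mean) * (v - mean);
        }
        double stdDev = sampleSize > 1 ? Math.sqrt(sumSq / (sampleSize - 1)) : 0.0;
        return new Stats(mean, stdDev);
    }
}
```

Passing the random number generator as an argument keeps the sketch reproducible; the documented method takes only center, sampleSize, and maxNoise.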

Definition at line 345 of file FragmentClusterer.java.

References denoptim.utils.MathUtils.centroidOf(), denoptim.fragmenter.FragmentClusterer.DistanceAsRMSD.compute(), and denoptim.utils.Randomizer.nextNormalDouble().

Referenced by denoptim.fragmenter.FragmentClusterer.mergeClusters(), and denoptim.fragmenter.FragmentClustererTest.testGetRMSDStatsOfNoisyDistorsions().


◆ getTransformedClusters()

List< List< Fragment > > denoptim.fragmenter.FragmentClusterer.getTransformedClusters ( )

Once the clustering is done, this method returns the list of clusters.

Each cluster contains objects that are transformed to best align with the centroid of the cluster.

Returns
the list of clusters, or an empty list if cluster() has not been called.

Definition at line 436 of file FragmentClusterer.java.

References denoptim.fragmenter.FragmentClusterer.cluster(), and denoptim.fragmenter.FragmentClusterer.clusters.

Referenced by denoptim.fragmenter.ConformerExtractorTask.call(), and denoptim.fragmenter.FragmentClustererTest.testCluster().


◆ mergeClusters()

Member Data Documentation

◆ clusters

List<DynamicCentroidCluster> denoptim.fragmenter.FragmentClusterer.clusters
private
Initial value:
= new ArrayList<DynamicCentroidCluster>()

Current list of clusters.

Initially empty, then contains one cluster per data point, and is then pruned to retain only the clusters that survive the merging.

Definition at line 60 of file FragmentClusterer.java.

Referenced by denoptim.fragmenter.FragmentClusterer.cluster(), denoptim.fragmenter.FragmentClusterer.getClusterCentroids(), denoptim.fragmenter.FragmentClusterer.getClusters(), denoptim.fragmenter.FragmentClusterer.getNearestToClusterCentroids(), denoptim.fragmenter.FragmentClusterer.getTransformedClusters(), and denoptim.fragmenter.FragmentClusterer.mergeClusters().

◆ data

List<ClusterableFragment> denoptim.fragmenter.FragmentClusterer.data
private

The list of fragments to be clustered.

Definition at line 53 of file FragmentClusterer.java.

Referenced by denoptim.fragmenter.FragmentClusterer.cluster(), and denoptim.fragmenter.FragmentClusterer.FragmentClusterer().

◆ logger

Logger denoptim.fragmenter.FragmentClusterer.logger
private

◆ settings

FragmenterParameters denoptim.fragmenter.FragmentClusterer.settings
protected

The documentation for this class was generated from the following file: