  Chemical library subset selection algorithms: A unified derivation using spatial statistics

Hamprecht, F. A., Thiel, W., & van Gunsteren, W. F. (2002). Chemical library subset selection algorithms: A unified derivation using spatial statistics. Journal of Chemical Information and Computer Sciences, 42(2), 414-428. doi:10.1021/ci010376b.

Creators

 Creators:
Hamprecht, F. A.1, Author
Thiel, W.2, Author           
van Gunsteren, W. F.1, Author
Affiliations:
1Univ Heidelberg, IWR, Interdisciplinary Ctr Sci Comp, Neuenheimer Feld 368, D-69120 Heidelberg, Germany; Swiss Fed Inst Technol, Phys Chem Lab, CH-8092 Zürich, Switzerland, ou_persistent22
2Research Department Thiel, Max-Planck-Institut für Kohlenforschung, Max Planck Society, ou_1445590

Content

Free keywords: -
 Abstract: If similar compounds have similar activity, rational subset selection becomes superior to random selection in screening for pharmacological lead discovery programs. Traditional approaches to this experimental design problem fall into two classes: (i) a linear or quadratic response function is assumed, or (ii) some space-filling criterion is optimized. The assumptions underlying the first approach are clear but not always defensible; the second approach yields more intuitive designs but lacks a clear theoretical foundation. We model activity in a bioassay as a realization of a stochastic process and use the best linear unbiased estimator to construct spatial sampling designs that optimize the integrated mean square prediction error, the maximum mean square prediction error, or the entropy. We argue that our approach constitutes a unifying framework encompassing most proposed techniques as limiting cases and sheds light on their underlying assumptions. In particular, vector quantization is obtained, in dimensions up to eight, in the limiting case of very smooth response surfaces for the integrated mean square error criterion. Closest packing is obtained for very rough surfaces under the integrated mean square error and entropy criteria. We suggest using either the integrated mean square prediction error or the entropy as optimization criteria rather than approximations thereof and propose a scheme for direct iterative minimization of the integrated mean square prediction error. Finally, we discuss how the quality of chemical descriptors manifests itself and clarify the assumptions underlying the selection of diverse or representative subsets.
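
The integrated mean square prediction error criterion named in the abstract can be illustrated with a small sketch. This is not the authors' iterative scheme, which this record does not describe; it is a generic point-exchange search under assumptions made only for illustration. The activity is treated as a zero-mean, unit-variance stationary process with a Gaussian covariance, so the best linear unbiased estimator reduces to simple kriging, and the integral over descriptor space is approximated by an average over the candidate compounds. The names `gaussian_cov`, `imspe`, and `exchange_design`, and the parameters `length_scale` and `nugget`, are hypothetical.

```python
import numpy as np


def gaussian_cov(a, b, length_scale=1.0):
    """Gaussian covariance between rows of a and b (assumed model, unit variance)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * length_scale ** 2))


def imspe(points, selected, length_scale=1.0, nugget=1e-8):
    """Mean square prediction error of simple kriging, averaged over all points.

    The average over the candidate set stands in for the integral over
    descriptor space (an approximation assumed for this sketch).
    """
    design = points[selected]
    K = gaussian_cov(design, design, length_scale) + nugget * np.eye(len(selected))
    k = gaussian_cov(design, points, length_scale)  # shape (n_selected, n_points)
    # Variance explained at each point: k^T K^{-1} k
    reduction = np.einsum("ij,ij->j", k, np.linalg.solve(K, k))
    return float(np.mean(1.0 - reduction))


def exchange_design(points, n_select, length_scale=1.0, seed=0, max_sweeps=20):
    """Point-exchange search: swap one selected compound at a time while the
    averaged prediction error keeps decreasing."""
    rng = np.random.default_rng(seed)
    n = len(points)
    selected = list(rng.choice(n, size=n_select, replace=False))
    best = imspe(points, selected, length_scale)
    for _ in range(max_sweeps):
        improved = False
        for pos in range(n_select):
            for cand in range(n):
                if cand in selected:
                    continue
                trial = selected.copy()
                trial[pos] = cand
                score = imspe(points, trial, length_scale)
                if score < best - 1e-12:
                    selected, best, improved = trial, score, True
        if not improved:
            break  # local optimum: no single swap lowers the criterion
    return sorted(int(i) for i in selected), best


if __name__ == "__main__":
    # Synthetic two-dimensional "descriptors" for 40 hypothetical compounds.
    library = np.random.default_rng(1).uniform(size=(40, 2))
    subset, error = exchange_design(library, n_select=5, length_scale=0.3)
    print(subset, error)
```

Varying `length_scale` in such a sketch is one way to probe the limiting cases the abstract mentions: a large value corresponds to a smooth response surface and a small value to a rough one. The search only finds a local optimum, so results depend on the random starting subset.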

Details

Language(s): eng - English
 Dates: 2002-03
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Identifiers: eDoc: 20115
DOI: 10.1021/ci010376b
ISI: 000174733500033
 Degree: -

Source 1

Title: Journal of Chemical Information and Computer Sciences
  Alternative Title : J. Chem. Inf. Comput. Sci.
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: 42 (2)
Sequence Number: -
Start / End Page: 414 - 428
Identifier: ISSN: 0095-2338