Machine learning methods are often used to classify objects described by hundreds of attributes; in many applications of this kind a large fraction of the attributes may be totally irrelevant to the classification problem. Moreover, one usually cannot decide a priori which attributes are relevant. In this paper we present an improved version of an algorithm for identifying the full set of truly important variables in an information system. It is an extension of the random forest method which utilises the importance measure generated by the original algorithm. It compares, in an iterative fashion, the importances of the original attributes with the importances of their randomised copies. We analyse the performance of the algorithm on several examples of synthetic data, as well as on a biologically important problem, namely the identification of the sequence motifs that are important for the aptameric activity of short RNA sequences.
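The comparison of original attributes with their randomised copies can be illustrated with a minimal sketch. This is not the authors' implementation: absolute Pearson correlation is swapped in for the random forest importance measure so that the block needs only the Python standard library, and no statistical test or attribute removal is performed. The names `abs_correlation`, `count_hits` and `make_toy_data`, the toy data, and the number of iterations are all assumptions made for illustration.

```python
import random


def abs_correlation(column, target):
    """Stand-in importance (assumption): |Pearson correlation| with the target.

    The paper uses the random forest importance measure instead.
    """
    n = len(column)
    mean_c = sum(column) / n
    mean_t = sum(target) / n
    cov = sum((c - mean_c) * (t - mean_t) for c, t in zip(column, target))
    var_c = sum((c - mean_c) ** 2 for c in column)
    var_t = sum((t - mean_t) ** 2 for t in target)
    if var_c == 0 or var_t == 0:
        return 0.0
    return abs(cov) / (var_c * var_t) ** 0.5


def count_hits(columns, target, n_iter=50, seed=0):
    """Count how often each attribute beats the best randomised copy."""
    rng = random.Random(seed)
    hits = [0] * len(columns)
    for _ in range(n_iter):
        # Randomised copies: same values, shuffled order, so any
        # relation to the target is destroyed.
        shadows = []
        for col in columns:
            shadow = list(col)
            rng.shuffle(shadow)
            shadows.append(shadow)
        best_shadow = max(abs_correlation(s, target) for s in shadows)
        # An original attribute scores a hit when its importance
        # exceeds the highest importance among the randomised copies.
        for j, col in enumerate(columns):
            if abs_correlation(col, target) > best_shadow:
                hits[j] += 1
    return hits


def make_toy_data(n=200, seed=1):
    """Hypothetical data: one informative attribute, two pure-noise ones."""
    rng = random.Random(seed)
    informative = [rng.gauss(0, 1) for _ in range(n)]
    noise_a = [rng.gauss(0, 1) for _ in range(n)]
    noise_b = [rng.gauss(0, 1) for _ in range(n)]
    target = [1 if v > 0 else 0 for v in informative]
    return [informative, noise_a, noise_b], target


if __name__ == "__main__":
    columns, target = make_toy_data()
    print(count_hits(columns, target))
```

In this sketch the informative attribute should score a hit in every iteration, while the noise attributes beat the best randomised copy only occasionally. Turning hit counts into a decision about which attributes to keep would need a statistical criterion, which is left out here.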
Miron B. Kursa
Centrum Kopernika Badań Interdyscyplinarnych
Aleksander Jankowski
University of Warsaw
Witold R. Rudnicki
University of Białystok
Fundamenta Informaticae
University of Warsaw
Kursa et al. (Fri,) studied this question.
synapsesocial.com/papers/69de6cf87ed287395e558cd6 — DOI: https://doi.org/10.3233/fi-2010-288