FSDM 2006

International Workshop on Feature Selection for Data Mining:
Interfacing Machine Learning and Statistics
April 22, 2006, Bethesda, Maryland

in conjunction with

2006 SIAM Conference on Data Mining (SDM)

Workshop Proceedings available online.

Knowledge discovery and data mining (KDD) is a multidisciplinary effort to extract nuggets of information from data. Massive data sets have become common in many applications and pose novel challenges for KDD. Along with changes in size, the context of these data runs from the loose structure of text and images to the designs of microarray experiments. Researchers in computer science, engineering, and statistics confront similar issues in feature selection, and we see a pressing need for, and benefit in, the interdisciplinary exchange and discussion of ideas. We anticipate that our collaborations will shed light on research directions and provide the stimulus for creative breakthroughs.

This workshop will bring together researchers from different disciplines and encourage collaborative research in feature selection. Feature selection is an essential step in successful data mining applications and has practical significance in many areas, such as statistics, pattern recognition, machine learning, and data mining. The objectives of feature selection include building simpler and more comprehensible models, improving data mining performance, and helping to prepare, clean, and understand data.

Submissions that consider knowledge in feature selection will receive special consideration. Knowledge here means declarative knowledge, such as constraints, that a domain expert can express explicitly. One way of using knowledge is semi-supervised learning. The semi-supervised situation remains prevalent, even in the presence of massive data sets. The high expense of “marking documents” leads to situations in which one has massive data describing the feature space, but relatively little describing the relationship between features and the response. We encourage presentations featuring both the theory behind feature selection and novel applications to data. Additional workshop topics include the following.

Dimensionality reduction

  Feature ranking

  Subset selection

  Feature extraction

Feature construction

Improving data mining performance

Novel data structures

Streaming data reduction and time series

Selection for labeled and unlabeled data

Modeling variable and feature selection

Goodness measures and evaluation

  False discovery rates

Ensemble methods

Selection bias

Sampling methods

Selection with small samples

Cross-discipline comparative studies

  Microarray, text, Web

Integration with data mining algorithms

Real-world case studies and applications

Emerging challenges

  Survival analysis

  Connecting selection and causality

  Knowledge in feature selection



There is no separate workshop registration. Please visit the SIAM DM 2006 website to register.

Paper Format, Important Dates, and Submission

  • A paper (maximum 8 pages, single column, font no smaller than 11 pt) should be submitted in PDF or Word format
  • Submissions should be emailed to featureselection@gmail.com
  • Quality short papers and position papers are also welcome
  • Submission deadline: January 9, Monday
  • Acceptance notification: February 1, Wednesday
  • Camera ready due: February 14, Tuesday
  • The accepted papers will be published in the workshop proceedings.
  • Accepted papers will be considered for a special issue in a prestigious journal.

This workshop follows the highly successful FSDM 2005 workshop, held in Newport Beach, CA.

For more information about FSDM 2006, please contact us.