Java Data Mining
Java Data Mining (JDM) is a standard Java API for developing data mining applications and tools. JDM defines an object model and Java API for data mining objects and processes, and it enables applications to integrate data mining technology for developing predictive analytics applications and tools. The JDM 1.0 standard was developed under the Java Community Process as JSR 73. In 2006, the JDM 2.0 specification was being developed under JSR 247, but it was withdrawn in 2011 without standardization.
The 1.0 release of the standard covers various data mining functions and techniques, including statistical classification and association, regression analysis, data clustering, and attribute importance.
See also
- AIDA (Abstract Interfaces for Data Analysis) is a language-neutral standard, with a Java implementation
- Mark F. Hornick, Erik Marcadé, Sunil Venkayala: "Java Data Mining: Strategy, Standard, and Practice: A Practical Guide for Architecture, Design, and Implementation" (paperback)
- jHepWork, a Java data analysis and data mining framework
- Weka (machine learning)
- R (programming language)
- SPSS
- Apache Mahout
Books
- Java Data Mining: Strategy, Standard, and Practice, Morgan Kaufmann, ISBN 0-12-370452-9
External links
- JSR 247 (JDM 2.0)
- JSR 73 (JDM 1.0)
- Datamining (java.net project)
- Java Data Mining concepts article by Mark F. Hornick, Erik Marcadé, and Sunil Venkayala, at JavaWorld.com
- Mine Your Own Data with the JDM API article by Frank Sommers
- Using Java Data Mining to Develop Advanced Analytics Applications article by Sunil Venkayala at SYS-CON