SeerSuite
SeerSuite is a collection of open source tools that provide the underlying application software for creating academic search engines and digital libraries such as CiteSeerX, ChemXSeer, and ArchSeer. The collection of tools is available on SourceForge under the Apache Software Foundation License.

Each tool in SeerSuite is a standalone service that can be used to perform a particular task or can be combined with other applications to create a larger, more complex system such as CiteSeerX. Many of the tools in the suite are the result of research initiatives conducted at the Pennsylvania State University. Some of the tools may have external dependencies, which are freely distributable or can be obtained freely from public sources.

In addition, SeerSuite aims to provide resources such as algorithms, data, metadata, services, techniques, and software that can be used to promote and create other digital libraries.

SeerSuite Tools

  • Header Parser
  • Citation Parser (ParsCit)
  • Document Filter
  • ID Server
  • CiteSeerX web application
  • Installation and database creation scripts
  • Configuration files
  • YouSeer: a complete open source search engine, available on SourceForge, that integrates the open source crawler Heritrix with the open source indexer Solr. The ingestion software is flexible and allows for user-specific data extraction implementations. YouSeer also provides a simple interface to query the index and another interface to retrieve cached versions of the documents.
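As an illustration of what querying a Solr index involves, the following minimal sketch builds a request URL for Solr's standard select handler. It is a generic Solr request, not YouSeer's own query interface. The host, port, and the core name "documents" are hypothetical, and older Solr deployments may expose the handler at a path without a core name.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, core, text, rows=10):
    """Build a URL for Solr's /select handler.

    base_url and core are placeholders for a local installation;
    they are assumptions, not values taken from YouSeer.
    """
    params = {
        "q": text,      # the query string
        "rows": rows,   # maximum number of results to return
        "wt": "json",   # ask Solr for a JSON response
    }
    return f"{base_url.rstrip('/')}/{core}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local Solr instance and core name.
    print(build_solr_query_url("http://localhost:8983/solr", "documents", "citation parsing"))
```

Sending an HTTP GET request to the resulting URL would return the matching documents from a running Solr instance. The sketch only constructs the URL, so it runs without a server.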

The source of this article is Wikipedia, the free encyclopedia. The text of this article is licensed under the GFDL.
 