BotSeer
BotSeer was a Web-based information system and search tool that provided resources and services for research on Web robots and on trends in Robot Exclusion Protocol deployment and adherence. It was created and designed by Yang Sun, Isaac G. Councill, Ziming Zhuang and C. Lee Giles.
BotSeer provided three major services: robots.txt searching, robot bias analysis, and robot-generated log analysis. The prototype also allowed users to search six thousand documentation and source code files from 18 open source crawler projects. BotSeer served as a resource for studying the regulation and behavior of Web robots, and for information on creating effective robots.txt files and crawler implementations. It was publicly available on the World Wide Web at the College of Information Sciences and Technology at the Pennsylvania State University. BotSeer indexed and analyzed 2.2 million robots.txt files obtained from 13.2 million websites, as well as a large Web server log of real-world robot behavior and related analysis. Its goals were to assist researchers, webmasters, web crawler developers and others with research and information needs related to web robots.
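As an illustration of what robot bias analysis looks at, the sketch below checks whether a robots.txt file grants different access to different robots. It is not BotSeer's implementation; it uses Python's standard `urllib.robotparser`, and the sample file, the robot name `ExampleBot`, and the helper `allowed_by_agent` are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for illustration only.
SAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /
"""


def allowed_by_agent(robots_txt, agents, path="/index.html"):
    """Return a dict mapping each user-agent to whether it may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, path) for agent in agents}


# A file is "biased" in this simple sense if the answers differ between robots.
result = allowed_by_agent(SAMPLE_ROBOTS_TXT, ["Googlebot", "ExampleBot"])
is_biased = len(set(result.values())) > 1
print(result, is_biased)
```

Here the named robot is allowed to fetch the page while every other robot is excluded, so the file favors one crawler over the rest.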
BotSeer also set up a honeypot at http://www.v4d.net to test the ethicality, performance and behavior of web crawlers.
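One way to picture such a test is to compare a crawler's logged requests against the site's robots.txt rules. The sketch below is only an assumption about how that check might look, not a description of the honeypot itself; the rules, the robot names, the log records, and the helper `find_violations` are all invented for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a test site: no robot may enter /trap/.
HONEYPOT_ROBOTS_TXT = """\
User-agent: *
Disallow: /trap/
"""

# Hypothetical (user-agent, requested path) pairs extracted from a server log.
LOGGED_REQUESTS = [
    ("PoliteBot", "/index.html"),
    ("RudeBot", "/trap/secret.html"),
    ("PoliteBot", "/about.html"),
]


def find_violations(robots_txt, requests):
    """Return the logged requests that the robots.txt rules disallow."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [
        (agent, path)
        for agent, path in requests
        if not parser.can_fetch(agent, path)
    ]


violations = find_violations(HONEYPOT_ROBOTS_TXT, LOGGED_REQUESTS)
print(violations)
```

A crawler that appears in the result requested a path it was told to avoid, which is one simple signal of non-adherence to the Robot Exclusion Protocol.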
Currently, BotSeer is inactive.