Blog scraping
Encyclopedia
Blog scraping is the process of scanning through a large number of blog
Blog
A blog is a type of website or part of a website supposed to be updated with new content from time to time. Blogs are usually maintained by an individual with regular entries of commentary, descriptions of events, or other material such as graphics or video. Entries are commonly displayed in...

s, usually daily, searching for and copying content. This process is conducted through automated software. The software and the individuals who run the software are sometimes referred to as blog scrapers.

Scraping is copying a blog that is not owned by the individual initiating the scraping process. If the material is copyrighted it is considered copyright infringement
Copyright infringement
Copyright infringement is the unauthorized or prohibited use of works under copyright, infringing the copyright holder's exclusive rights, such as the right to reproduce or perform the copyrighted work, or to make derivative works.- "Piracy" :...

, unless there is a license relaxing the copyright. The scraped content is often used on spam blogs or splogs.

Issues

A blog scraper who gathers content that is copyrighted
Copyright
Copyright is a legal concept, enacted by most governments, giving the creator of an original work exclusive rights to it, usually for a limited time...

 material is considered in violation of the law. Blog scraping can create problems for the individual or business who owns the blog. Blog scraping is particularly worrisome for business owners and business bloggers. Scrapers can copy an entire post from an independent or business blog. The duplicated content will include the author's tag and a link back to the author's site (if that link appears in the author's tag). However, most blog scrapers copy only a portion of the content that is keyword-relevant to their splog topic. By doing this, the keyword relevancy of the scraper's site is increased. Secondly, by not scraping the entire post, any outbound links are eliminated which means their search engine
Search engine
A search engine is an information retrieval system designed to help find information stored on a computer system. The search results are usually presented in a list and are commonly called hits. Search engines help to minimize the time required to find information and the amount of information...

 ranking is not reduced.

Additionally, scraped content can appear on literally any type of splog or RSS
RSS (file format)
RSS is a family of web feed formats used to publish frequently updated works—such as blog entries, news headlines, audio, and video—in a standardized format...

-fed spam site. This means an unsuspecting individual could find their creative or copyrighted material copied onto a site promoting pornography or similar type of content that may be offensive to the original author and his/her audience. This may be damaging to the original author's reputation.

External links

The source of this article is wikipedia, the free encyclopedia.  The text of this article is licensed under the GFDL.
 
x
OK