Webgraph
Encyclopedia
The webgraph describes the directed links between pages of the World Wide Web
. A graph
, in general, consists of several vertices, some pairs connected by edges. In a directed graph
, edges are directed lines or arcs. The webgraph is a directed graph, whose vertices correspond to the pages of the WWW, and a directed edge connects page X to page Y if there exists a hyperlink
on page X, referring to page Y.
World Wide Web
The World Wide Web is a system of interlinked hypertext documents accessed via the Internet...
. A graph
Graph (mathematics)
In mathematics, a graph is an abstract representation of a set of objects where some pairs of the objects are connected by links. The interconnected objects are represented by mathematical abstractions called vertices, and the links that connect some pairs of vertices are called edges...
, in general, consists of several vertices, some pairs connected by edges. In a directed graph
Directed graph
A directed graph or digraph is a pair G= of:* a set V, whose elements are called vertices or nodes,...
, edges are directed lines or arcs. The webgraph is a directed graph, whose vertices correspond to the pages of the WWW, and a directed edge connects page X to page Y if there exists a hyperlink
Hyperlink
In computing, a hyperlink is a reference to data that the reader can directly follow, or that is followed automatically. A hyperlink points to a whole document or to a specific element within a document. Hypertext is text with hyperlinks...
on page X, referring to page Y.
Properties
- The degree distribution of the webgraph strongly differs from the degree distribution of the classical random graph model, the Erdős–Rényi modelErdos–Rényi modelIn graph theory, the Erdős–Rényi model, named for Paul Erdős and Alfréd Rényi, is either of two models for generating random graphs, including one that sets an edge between each pair of nodes with equal probability, independently of the other edges...
: in the Erdős–Rényi model, there are very few large degree nodes, relative to the webgraph's degree distribution. The precise distribution is unclear, however: it is well described by a lognormal distribution, as well as the Barabási–Albert model for power laws. - The webgraph is an example of a scale-free networkScale-free networkA scale-free network is a network whose degree distribution follows a power law, at least asymptotically. That is, the fraction P of nodes in the network having k connections to other nodes goes for large values of k as...
. - The webgraph is also an example of a small-world networkSmall-world networkIn mathematics, physics and sociology, a small-world network is a type of mathematical graph in which most nodes are not neighbors of one another, but most nodes can be reached from every other by a small number of hops or steps...
.
Applications
- The webgraph is used for computing the PageRankPageRankPageRank is a link analysis algorithm, named after Larry Page and used by the Google Internet search engine, that assigns a numerical weighting to each element of a hyperlinked set of documents, such as the World Wide Web, with the purpose of "measuring" its relative importance within the set...
of the WWW pages. - The webgraph is used for computing the personalized PageRank.
- The webgraph can be used for detecting webpages of similar topics, through graph-theoretical properties only, like co-citation
- The webgraph is applied in the HITS algorithmHITS algorithmHyperlink-Induced Topic Search is a link analysis algorithm that rates Web pages, developed by Jon Kleinberg. It was a precursor to PageRank...
for identifying hubs and authorities in the web.