online advertising

Spamdexing

Spamdexing, or search engine spamming, is the practice of deliberately creating web pages that search engines will index in a way that increases the chance of a website or page being placed near the top of search results, or that influences the category to which the page is assigned. Many web page designers try to obtain a good ranking in search engines and design their pages accordingly. The word is a portmanteau of spamming and indexing.

Spamdexing refers exclusively to practices that are dishonest and that mislead search and indexing programs into giving a page a ranking it does not deserve. "White hat" techniques for making a website indexable by search engines, without misleading the indexing process, are known as search engine optimization (SEO). SEO techniques do not involve deceit.

Search engine spammers, by contrast, are generally aware that the content they promote is not very useful or relevant to the ordinary internet user. Search engines use a variety of algorithms to determine relevancy ranking. Some check whether the search term appears in the META keywords tag; others check whether it appears in the body text of a web page. A variety of techniques are used to spamdex (see below). Many search engines check for instances of spamdexing and remove suspect pages from their indexes.
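The two relevancy signals mentioned above, a term appearing in the META keywords tag versus in the body text, can be illustrated with a minimal sketch. The class name `TermLocator`, the function `term_signals`, and the sample page are hypothetical and chosen only for illustration; no real search engine is claimed to work this way, and only single-word terms are handled.

```python
from html.parser import HTMLParser


class TermLocator(HTMLParser):
    """Collects META keywords and body text words from an HTML page."""

    def __init__(self):
        super().__init__()
        self.meta_keywords = []
        self.body_words = []
        self._in_body = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            # Keywords are conventionally comma-separated.
            self.meta_keywords += [
                k.strip().lower() for k in content.split(",") if k.strip()
            ]
        elif tag == "body":
            self._in_body = True

    def handle_endtag(self, tag):
        if tag == "body":
            self._in_body = False

    def handle_data(self, data):
        if self._in_body:
            self.body_words += data.lower().split()


def term_signals(html, term):
    """Report where a single-word term appears (hypothetical helper)."""
    parser = TermLocator()
    parser.feed(html)
    parser.close()
    term = term.lower()
    return {
        "in_meta_keywords": term in parser.meta_keywords,
        "in_body_text": term in parser.body_words,
    }


# Sample page (made up): the keywords tag does not match the body text.
PAGE = """<html><head>
<meta name="keywords" content="cheap flights, hotels">
</head><body><p>Buy discount watches today</p></body></html>"""

signals = term_signals(PAGE, "hotels")
```

A term that appears in the keywords tag but nowhere in the body text, as "hotels" does here, is the kind of mismatch an engine relying only on the META tag would miss.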

The rise of spamdexing in the mid-1990s made the leading search engines of the time less useful. Google's success at both producing better search results and combating keyword spamming, through its reputation-based PageRank link analysis system, helped it become the dominant search site late in the decade, a position it still holds. While Google has not been rendered useless by spamdexing, it has not been immune to more sophisticated methods either. Google bombing is another form of web vandalism, which involves creating pages that directly affect the rank of other sites [1].

Common spamdexing techniques can be classified into two broad classes: content spam and link spam.

Content spam

These techniques involve altering the logical view that a search engine has of the page's contents. They all target variants of the vector space model for information retrieval on text collections.
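To see why the vector space model is a target, consider a minimal sketch that scores pages by the cosine similarity between raw term-frequency vectors. The sample texts and function names are assumptions made for illustration; real engines use more elaborate weighting, which is not modelled here.

```python
import math
from collections import Counter


def tf_vector(text):
    """Raw term-frequency vector of a text, as a word -> count mapping."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical query and pages; the second page repeats the query terms.
query = tf_vector("cheap flights")
honest = tf_vector("we compare cheap flights from many airlines")
stuffed = tf_vector("cheap flights cheap flights cheap flights from many airlines")

honest_score = cosine_similarity(query, honest)
stuffed_score = cosine_similarity(query, stuffed)
```

Under this simplified scoring, the page that repeats the query terms scores higher than the honest page even though it offers no additional information, which is the weakness content spam exploits.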

Link spam

Link spam takes advantage of link-based ranking algorithms, such as Google's PageRank algorithm, which gives a higher ranking to a website the more other highly-ranked websites link to it. These techniques also aim at influencing other link-based ranking techniques such as the HITS algorithm.

One technique involves creating tightly-knit communities of pages that reference each other, also known humorously as mutual admiration societies [2].
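The effect of such a community on a link-based ranking can be sketched with a simplified PageRank computed by power iteration. This is an illustrative approximation, not Google's implementation: the damping factor of 0.85, the iteration count, the handling of pages without outgoing links, and all page names are assumptions.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank by power iteration.

    links maps each page to the list of pages it links to.
    """
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page receives a small baseline share of rank.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly among its outgoing links.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Page with no outgoing links: spread its rank over all pages.
                share = damping * rank[page] / n
                for t in pages:
                    new_rank[t] += share
        rank = new_rank
    return rank


# Hypothetical small web: nobody links to "target".
web = {"a": ["b"], "b": ["c"], "c": ["a"], "target": ["a"]}
without_farm = pagerank(web)

# Add five made-up pages that link to each other in a ring and to "target".
farm = {
    f"farm{i}": ["target", f"farm{(i + 1) % 5}"] for i in range(5)
}
with_farm = pagerank({**web, **farm})
```

In this toy graph the rank of `target` rises once the interlinked pages point at it, even though no independent site has added a link.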

Some of these techniques may be applied to create a Google bomb, that is, to cooperate with other users to boost the ranking of a particular page for a particular query.

Other types of spamdexing

One form is "code swapping": optimizing a page for top ranking, then swapping another page in its place once a top ranking is achieved.

The following techniques are also widely acknowledged as being spam, or "black hat":


