Web Optimization News

Creating unique useful content on your website  

     
Why is it so important to have unique content?

In the early days of the internet, web developers discovered that it was very easy to copy content from other websites using automated scripts. This became a huge problem for search engines, because a search for a term could return literally millions of websites with exactly the same content.

From a searcher's point of view, there was no value in finding millions of pages with exactly the same content. Many search engines that could not deal with this problem fast enough quickly faded out of existence.

Duplicate Content Filtering


Many techniques were developed to filter duplicate content, but Google's system is by far the most advanced. Google uses several methods in combination to detect duplicate content. However, there are many pitfalls in detecting duplicate content, and they can lead to the wrong site being dropped or penalized.

Problems search engines have when trying to filter duplicate content

Deciding which site to keep and which site to drop in cases of duplicate content.
Detecting the originator of the content. Who created it first? (Impossible to know for certain, but it can be guessed.)
Dealing with very similar but not identical text.
Finding text that was machine translated (copied) from another language.
Dealing with plagiarized text.

Known filters used by Google


First, Google strips all consistent design elements from the page: anything that appears on multiple pages across the site is removed. That includes navigation links, headers, footers, block statements, etc.
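The stripping step can be sketched roughly as follows. This is only an illustration, not Google's actual method: the function name `strip_repeated_blocks`, the line-by-line comparison and the `min_pages` cut-off are all assumptions made for the example.

```python
from collections import Counter

def strip_repeated_blocks(pages, min_pages=2):
    """Remove any line that appears on at least `min_pages` pages of one site.

    `pages` maps a URL to the page text. The name and the cut-off are
    hypothetical; a real system would compare page structure, not raw lines.
    """
    # Count on how many distinct pages each non-empty line occurs.
    seen_on = Counter()
    for text in pages.values():
        for line in {ln.strip() for ln in text.splitlines() if ln.strip()}:
            seen_on[line] += 1

    # Keep only the lines that are not shared across pages.
    return {
        url: "\n".join(
            ln.strip() for ln in text.splitlines()
            if ln.strip() and seen_on[ln.strip()] < min_pages
        )
        for url, text in pages.items()
    }

site = {
    "/a": "Home | About | Contact\nUnique article about locum work.\n(c) Example Ltd",
    "/b": "Home | About | Contact\nAnother page about nursing jobs.\n(c) Example Ltd",
}
cleaned = strip_repeated_blocks(site)
print(cleaned["/a"])
```

Only the text that is left after this step would be passed on to the duplicate content filters.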

Exact content filtering
Exact content filtering is the most obvious and widely used method. If two or more sites contain exactly the same data, only one site is kept and the others are dropped.
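A minimal sketch of exact filtering, assuming pages are compared by a hash of their whitespace-normalized, lowercased text. The function names, the example URLs and the "keep the first one seen" rule are assumptions for illustration; how a search engine actually picks the survivor is covered by the other filters.

```python
import hashlib
import re

def content_hash(text):
    """Hash the text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def drop_exact_duplicates(pages):
    """Keep the first URL seen for each distinct content hash (assumed rule)."""
    kept = {}
    for url, text in pages.items():
        kept.setdefault(content_hash(text), url)
    return list(kept.values())

pages = {
    "site-a.example/job": "Are you a Doctor looking for locum work?",
    "site-b.example/job": "Are you a  Doctor looking for\nlocum work?",
    "site-c.example/job": "Are you a Nurse looking for locum work?",
}
print(drop_exact_duplicates(pages))
```

Here the second page differs from the first only in whitespace, so it is dropped, while the third page has different wording and is kept.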

Fingerprinting
Google splits the text into paragraphs and sentences, then creates a "fingerprint image" based on the proximity and semantics of each block of text. This is to catch template spammers and content plagiarism.

For example, the text below

Are you a Doctor looking for locum work?? Do you believe in having yourself represented to your skills and attitude to work and not at your grade? Currently we are looking for a SHO, Acute Medicine Doctor to start ASAP.

will have the same fingerprint as

Proximity content - shuffled sentences
Are you a Doctor looking for locum work?? Currently we are looking for a SHO, Acute Medicine Doctor to start ASAP. Do you believe in having yourself represented to your skills and attitude to work and not at your grade?
----

Template content - word or phrase replacement
Are you a Nurse looking for locum work?? Do you believe in having yourself represented to your skills and attitude to work and not at your grade? Currently we are looking for a Obs & Gynae Nurse to start ASAP.
----

Plagiarized content - semantically obvious rewording
Are you a Doctor looking for locum work?? Are your skills and attitude above your current grade? Currently we are looking for a SHO, Acute Medicine Doctor to start ASAP.
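To give a feel for how shuffled and templated text can still be matched, here is a rough sketch. It is not Google's fingerprint: it assumes a fingerprint is simply the set of normalized sentences (so sentence order does not matter) and uses word overlap (Jaccard similarity) to catch word replacement. The function names are made up for the example, and semantic rewording is not handled at all.

```python
import re

def sentence_fingerprint(text):
    """Order-insensitive fingerprint: the set of normalized sentences."""
    sentences = re.split(r"[.?!]+", text)
    normalized = (" ".join(re.findall(r"[a-z0-9]+", s.lower())) for s in sentences)
    return frozenset(s for s in normalized if s)

def word_overlap(a, b):
    """Jaccard similarity between the word sets of two texts (0.0 to 1.0)."""
    words_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    words_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not words_a and not words_b:
        return 1.0
    return len(words_a & words_b) / len(words_a | words_b)

original = (
    "Are you a Doctor looking for locum work? Do you believe in having yourself "
    "represented to your skills and attitude to work and not at your grade? "
    "Currently we are looking for a SHO, Acute Medicine Doctor to start ASAP."
)
shuffled = (
    "Are you a Doctor looking for locum work? Currently we are looking for a SHO, "
    "Acute Medicine Doctor to start ASAP. Do you believe in having yourself "
    "represented to your skills and attitude to work and not at your grade?"
)
templated = (
    "Are you a Nurse looking for locum work? Do you believe in having yourself "
    "represented to your skills and attitude to work and not at your grade? "
    "Currently we are looking for a Obs & Gynae Nurse to start ASAP."
)

print(sentence_fingerprint(original) == sentence_fingerprint(shuffled))
print(word_overlap(original, templated))
```

The shuffled version produces an identical fingerprint, and the templated version still shares most of its words with the original, so a similarity cut-off (which would have to be guessed) could flag it as a near duplicate.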

BackLink filtering
If two or more pages carry the same duplicated content, Google will check the number of backlinks pointing to each page, then keep the page with the highest number of backlinks. The reasoning is that the page most other webmasters link to is likely to be the most useful.
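In code, the selection described here comes down to picking the maximum. A tiny sketch, assuming the backlink counts are already known; the function name, URLs and counts are invented for the example:

```python
def pick_by_backlinks(duplicates):
    """Return the URL with the most backlinks from a group of duplicate pages.

    `duplicates` maps each URL to its backlink count (hypothetical data).
    """
    return max(duplicates, key=duplicates.get)

group = {
    "site-a.example/post": 12,
    "site-b.example/post": 340,
    "site-c.example/post": 57,
}
print(pick_by_backlinks(group))
```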

Site wide filters
Any site with a disproportionate amount of copied pages relative to unique content will have a penalty applied to it. This penalty affects the whole website, unique content as well as duplicate content.
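As a rough illustration of "disproportionate", the check can be thought of as a ratio with a cut-off. The 0.5 threshold below is purely a guess for the example; the real cut-off, if one exists in this form, is not public.

```python
def duplicate_ratio(page_is_duplicate):
    """Fraction of a site's pages that were flagged as duplicates."""
    if not page_is_duplicate:
        return 0.0
    return sum(page_is_duplicate) / len(page_is_duplicate)

def site_is_penalized(page_is_duplicate, threshold=0.5):
    """Apply a site-wide penalty when the ratio exceeds a hypothetical threshold."""
    return duplicate_ratio(page_is_duplicate) > threshold

mostly_copied = [True, True, True, False]   # 3 of 4 pages are copies
mostly_unique = [False, False, True]        # 1 of 3 pages is a copy
print(site_is_penalized(mostly_copied), site_is_penalized(mostly_unique))
```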

Historical profile
When a site is first launched, Google begins to build a historical profile of that site. A site that starts out spammy with duplicate content will find it much harder to rank in the future with clean unique content, due to its previous history of spammy behaviour. The first 6 months of a new site are critical in building a nice clean history.

Hand check
Google will occasionally send a human evaluator to your site to check for duplication that the algorithm may have missed. If this happens and the site gets marked as a spam site, it's near impossible to fix for at least 90 days, and it can completely destroy any future progress on the website.
As a rule, it's best to assume a human will be looking at your site: if you can spot duplicate content, then a trained Google evaluator will spot it too.

Authority FreePass
Authority sites are trusted websites that have many links from other authorities pointing to them and a very good historical record of clean, non-spammy behaviour. Authority websites can get away with copying data and outranking the original content, due to the trust built up over time with Google.

Sites like the BBC, MSN, .GOV (government) and .EDU (education) are good examples of strong authority websites.
We try to build all our sites up over time to gain authority status. This takes careful link and content management over a year or two.

*Notes
Google is fairly lenient with copied data. You will sometimes see multiple sites ranking with the same content; this is because Google could not decide who originally created the content, or because the sites listed are all authorities in their own way. However, you never want to rely on this, as it's unpredictable and can go badly wrong.


From the above filters we can build a nice clear path to a successful site


Write clean unique content for your visitors.

Don't copy from elsewhere or use obvious templates to create content.

Never post your own content to other authority sites; they will outrank you and damage your future traffic.

Get as many backlinks as you can to each page.

Get authority sites linking to your site.

Asif

 


 

Understanding Optimization  

     
Although all our sites are built around Search Engine Optimization (SEO), few clients are aware of what this actually means.


I will try to explain some of the more arcane terminology used in the SEO world.

---------------------
There are two widely accepted types of optimization:

White Hat
This is the classic optimization, widely used and acceptable to all search engines. White Hat optimization in theory will never get you penalized or banned, and covers these topics:

Meta Tags - accurate Description and Title tags
Keyworded Content - main content must contain the keywords and variants.
Site Layout (Theme) - keeping the site within a particular theme
Links - sensible linking to and from related sites, unsolicited inbound links, organic link growth.

Black Hat
This covers the lesser-known, highly aggressive approach. Almost all competitive categories require some degree of black hat optimization.

Cloaking - displaying one page to a search engine and another to a human visitor
Shadow Domains - domain names that replicate the main site (or a segment) for the purpose of spamming a particular search engine.
Doorway Pages - pages that are designed purely for search engines.
KeyWord Spamming - repeating phrases and keywords many times on a page
Page Hijacking - Stealing someone else's rankings with a redirect.
CSS Tricks - Hiding links, resizing elements, blocking PR transfer
Aggressive Linking - buying text links, utilizing link farms, guest books, solicited unrelated 2-way and 3-way links.

Generally, most competitive sites use varying degrees of black hat optimization. Understanding what type of optimization maximizes the chances of ranking well with the minimum risk of penalties is down to the skill of the optimizer.

-------------

PageRank (PR)
A method of working out the probability of landing on a particular page by randomly surfing links, used mainly by Google, although all other search engines have a similar system. Essentially, the more links you have from sites with high PageRank, the higher your PageRank will be. PageRank is used to boost on-page optimization factors; on its own it's pretty useless.

Consider, if you will, the fastest Formula 1 car: without a good driver it never achieves its full potential. PageRank is like that car.
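For the curious, the "random surfer" idea can be sketched in a few lines using the textbook power-iteration form of PageRank. The damping factor of 0.85 is the value usually quoted in the literature, and the tiny four-page site is made up; this is not a reproduction of what Google runs in production.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Approximate PageRank by power iteration.

    `links` maps each page to the list of pages it links to.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small base share: the surfer jumping to a random page.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its rank evenly between the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its rank over all pages.
                for t in pages:
                    new_rank[t] += damping * rank[page] / n
        rank = new_rank
    return rank

web = {
    "home": ["about", "blog"],
    "about": ["home"],
    "blog": ["home"],
    "orphan": ["home"],
}
ranks = pagerank(web)
for page, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(page, round(score, 3))
```

The page that everything links to ("home") comes out on top, and the page nobody links to ("orphan") ends up with only the base share.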

Pay-Per-Click (PPC)
As the name suggests, it's paying for adverts in search engines and affiliate sites; you only pay when someone clicks on the advert. The current system is prone to abuse but can be very effective for some sites.

Bots / Crawlers / Spiders
Automated programs that crawl and index your site.

Main Algorithm / Algo
This is the primary system used to determine which pages should rank where. It is usually made up of a series of filters and lesser algorithms.


Filters
This is where optimization really begins...
Filters are usually used to detect spam, or spammy behaviour, in websites. They are constantly being changed by search engines in response to growing optimized spam, and optimizers respond by tweaking their sites to match the filters. This ongoing symbiosis between search engines and optimizers will continue as long as sites are ranked in an automated manner.

Currently all optimizers make educated guesses about the probability of a filter being in place and the cut-off point of that filter. If you can just touch the upper limit before the filter kicks in, you can rank highly; push too hard and you will be dropped to the bottom of the pile. Being able to take advantage of the various filters in place is key to good optimization. Everything else is pretty much general knowledge, and can easily be read up on in a few days.
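To make the idea of a cut-off point concrete, here is a toy keyword-stuffing filter. Everything in it is an assumption for illustration: real filters are not published, the 0.2 cut-off is invented, and the density measure only handles single-word keywords.

```python
import re

def keyword_density(text, keyword):
    """Share of the words in `text` that are exactly `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return words.count(keyword.lower()) / len(words)

def trips_filter(text, keyword, cutoff=0.2):
    """True when the density goes past a hypothetical cut-off point."""
    return keyword_density(text, keyword) > cutoff

natural = "We build fast websites and explain how search ranking works."
stuffed = "cheap flights cheap flights cheap deals cheap cheap"

print(keyword_density(natural, "websites"), trips_filter(natural, "websites"))
print(keyword_density(stuffed, "cheap"), trips_filter(stuffed, "cheap"))
```

The optimizer's guessing game is about where that `cutoff` sits for each filter, since pages just under it are treated very differently from pages just over it.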


Asif.
The Optimizer.


 


 
     
 
     


Tel: 020 8554 4847 - Fax: 020 8554 5959 - Email: info:@eits.info © 2003 - 2008 e-IT Solutions UK. All rights reserved