Explaining how a search engine works can be a tough task, especially when you’re talking to people who wouldn’t call themselves “technical”. I always try to keep it as simple as possible, and until recently I have used this video by Matt Cutts (head of webspam at Google, uploaded exactly 3 years ago!) to explain everything, since it does a much better job and comes straight from Google themselves. Now, 3 years later, Google have released an interactive infographic that explains how their search engine works in an even simpler way, with some pretty cool information along the way. This makes it much easier to understand and to explain to others as well! You can check it out here.
What’s Inside the Infographic:
Some quick items/facts from the webpage:
- Search is made up of 30 trillion individual pages (and it’s constantly growing)
- The Google index is over 100 million gigabytes
- The algorithm has over 200 ranking factors. They fall into the below categories:
- Site and page quality
- There are 10 different types of spam according to Google:
- Pure spam – general rubbish. You’ll know these sites when you see them!
- Hidden text and/or keyword stuffing – page contains text that the user can’t see!
- User-generated spam – remember the old “leave a comment in my guestbook!” websites? Things like that!
- Parked domains – domains that don’t yet have websites and have little to no unique content. Google won’t show these in results.
- Thin content with little or no added value – pages that do not provide much value to the user such as copied content, doorway pages etc.
- Unnatural links to a site – links from low-quality domains that appear to be deceptive/manipulative
- Spammy free hosts and dynamic DNS providers – these generally have a significant fraction of spammy content. Don’t use them!
- Cloaking and/or sneaky redirects – showing different sets of content to the user and the search engines
- Hacked site – hacked sites are often unsafe and used to spam links.
- Unnatural links from a site – when Google detects a pattern of manipulative outbound links on your site, such as selling links.
Besides the obvious “oh, that’s how it works” moment you get from going through the page, there is, as I mentioned earlier, some seriously awesome information on it as well.
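Google doesn’t publish how it actually detects any of these, but to get a feel for what something like keyword stuffing looks like in numbers, here is a rough Python sketch that measures keyword density (how often one word appears relative to all the words on a page). The function names and the 20% threshold are my own assumptions for illustration, not anything Google is known to use.

```python
import re
from collections import Counter


def keyword_density(text: str, keyword: str) -> float:
    """Return the share of words in `text` that equal `keyword` (case-insensitive)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if not words:
        return 0.0
    return Counter(words)[keyword.lower()] / len(words)


def looks_stuffed(text: str, keyword: str, threshold: float = 0.2) -> bool:
    """Flag text where one keyword makes up more than `threshold` of all words.

    The threshold is an arbitrary illustrative value, not a known ranking rule.
    """
    return keyword_density(text, keyword) > threshold


if __name__ == "__main__":
    spammy = "cheap shoes cheap shoes buy cheap shoes"
    natural = "We sell comfortable running shoes for every budget"
    print(looks_stuffed(spammy, "cheap"))
    print(looks_stuffed(natural, "cheap"))
```

A real spam filter would look at far more than a single word count, but the idea is the same: text written for search engines rather than people tends to repeat itself in ways that are easy to measure.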
Live spam screenshots
This feature shows examples of websites that Google has determined to be “pure spam” and removed by manual action:
Manual Action by Month
You can even see the actions taken over time in the form of a sexy graph, along with a history of algorithm milestones, including Penguin from last year:
Google have been sending webmasters notifications for a few years now, but have recently gone into more detail with regard to unnatural link warnings. The graph below shows a history of their messages per month:
Even more interestingly, Google is openly sharing how many requests they get for reconsideration every week:
Besides creating an awesome resource that webmasters can refer to and use to educate their clients, Google is doing a great job of becoming more transparent and open about how they work. Whilst the information and data here aren’t exactly actionable or mind-blowing, I find them very interesting and am somewhat happier knowing exactly what Google are doing to try and clean up their search results.
Personally, I think they can do a much better job and have quite a long way to go, but props to them for being more transparent.
For more information on how search engines work, check out these awesome articles!
- How Do Search Engines Work? – Paula Lay (E-Web Marketing)
- How Search Engines Work – Google
- How Search Works – Google (Video)
- How Does Google Search Work? – Google (Video)
- How Search Engines Operate – SEOmoz