Google Webmaster Central Tips

Latest news and tips from the official Google Webmaster Central Blog. Get tips on optimizing your website to improve its quality for your visitors and, at the same time, rank higher on Google Search!

Importance of link architecture

Published on 10/06/2008 at 10:57 PM
In Day 2 of links week, we'd like to discuss the importance of link architecture and answer more advanced questions on the topic. Link architecture—the method of internal linking on your site—is a crucial step in site design if you want your site indexed by search engines. It plays a critical role in Googlebot's ability to find your site's pages and ensures that your visitors can navigate and enjoy your site.

Keep important pages within several clicks from the homepage

Although you may believe that users prefer a search box on your site rather than category navigation, it's uncommon for search engine crawlers to type into search boxes or navigate via pulldown menus. So make sure your important pages are clickable from the homepage and easy for Googlebot to find throughout your site. It's best to create a link architecture that's intuitive for users and crawlable for search engines. Here are more ideas to get started:
Intuitive navigation for users

Create common user scenarios, get "in character," then try working through your site. For example, if your site is about basketball, imagine being a visitor (in this case a "baller" :) trying to learn the best dribbling technique.
  • Starting at the homepage, if the user doesn't use the search box on your site or a pulldown menu, can they easily find the desired information (ball handling like a superstar) from the navigation links?

  • Let's say a user found your site through an external link, but they didn't land on the homepage. Starting from any (sub-/child) page on your site, make sure they can easily find their way to the homepage and/or other relevant sections. In other words, make sure users aren't trapped or stuck. Was the "best dribbling technique" easy for your imaginary user to find? Often breadcrumbs such as "Home > Techniques > Dribbling" help users to understand where they are.
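
One way to check how many clicks separate a page from the homepage is a breadth-first search over the site's internal links. The sketch below is only an illustration: the `site` dictionary and its paths are made up for the basketball example, and a real site would first need its link map collected by a crawler.

```python
from collections import deque

def click_depths(links, homepage):
    """Return the minimum number of clicks from the homepage to each reachable page.

    `links` maps a page to the list of pages it links to (a hypothetical,
    hand-written link map; not something Webmaster Tools exports).
    """
    depths = {homepage: 0}
    queue = deque([homepage])
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in depths:          # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths

# Made-up link map for the basketball site example.
site = {
    "/": ["/techniques/", "/videos.html"],
    "/techniques/": ["/techniques/dribbling.html"],
    "/techniques/dribbling.html": ["/"],
    "/orphan.html": ["/"],                    # links out, but nothing links to it
}

depths = click_depths(site, "/")
unreachable = sorted(set(site) - set(depths))
```

Pages with a large depth, or pages that land in `unreachable`, are the ones a visitor (or a crawler following links) would struggle to find from the homepage.
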
Crawlable links for search engines
  • Text links are easily discovered by search engines and are often the safest bet if your priority is having your content crawled. While you're welcome to try the latest technologies, keep in mind that when text-based links are available and easily navigable for users, chances are that search engines can crawl your site as well.

    This <a href="new-page.html">text link</a> is easy for search engines to find.

  • Sitemap submission is also helpful for major search engines, though it shouldn't be a substitute for crawlable link architecture. If your site utilizes newer techniques, such as AJAX, see "Verify that Googlebot finds your internal links" below.
Use descriptive anchor text

Writing descriptive anchor text, the clickable words in a link, is a useful signal to help search engines and users alike to better understand your content. The more Google knows about your site—through your content, page titles, anchor text, etc.—the more relevant results we can return for users (and your potential search visitors). For example, if you run a basketball site and you have videos to accompany the textual content, a not-very-optimal way of linking would be:

To see all our basketball videos, <a href="videos.html">click here</a> for the entire listing.

However, instead of the generic "click here," you could rewrite the anchor text more descriptively as:

Feel free to browse all of our <a href="videos.html">basketball videos</a>.
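
If you'd like to audit your own pages for generic anchor text, a small script can help. The sketch below uses Python's standard `html.parser` module; the `GENERIC_ANCHORS` list is our own assumption for illustration, not a list Google publishes.

```python
from html.parser import HTMLParser

# Hypothetical list of phrases that say nothing about the link target.
GENERIC_ANCHORS = {"click here", "here", "read more", "more", "link"}

class AnchorCollector(HTMLParser):
    """Collect (href, anchor text) pairs from <a href="..."> elements."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None     # href of the link currently being read, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def generic_anchors(html):
    """Return the links whose anchor text is one of the generic phrases."""
    parser = AnchorCollector()
    parser.feed(html)
    parser.close()
    return [(href, text) for href, text in parser.links
            if text.lower() in GENERIC_ANCHORS]
```

Run against the two examples above, the "click here" version is flagged and the "basketball videos" version is not.
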

Verify that Googlebot finds your internal links

For verified site owners, Webmaster Tools has the feature "Links > Pages with internal links" that's great for verifying that Googlebot finds most of the links you'd expect. This is especially useful if your site uses navigation involving JavaScript (which Googlebot doesn't always execute)—you'll want to make sure that Googlebot is finding other internal links as expected.

Here's an abridged snapshot of our internal links to the introductory post for "404 week at Webmaster Central." Our internal links are discovered as we had hoped.


Feel free to ask more internal linking questions
Here are some to get you started...

Q: What about using rel="nofollow" for maximizing PageRank flow in my internal link architecture (such as PageRank sculpting, or PageRank siloing)?
A: It's not something we, as webmasters who also work at Google, would really spend time or energy on. In other words, if your site already has strong link architecture, it's far more productive to work on keeping users happy with fresh and compelling content rather than to worry about PageRank sculpting.

Matt Cutts answered more questions about "appropriate uses of nofollow" in our webmaster discussion group.
Q: Let's say my website is about my favorite hobbies: biking and camping. Should I keep my internal linking architecture "themed" and not cross-link between the two?
A: We haven't found a case where a webmaster would benefit by intentionally "theming" their link architecture for search engines. And keep in mind that if a visitor to one part of your site can't easily reach other parts of your site, that may be a problem for search engines as well.
Perhaps it's cliche, but at the end of the day, and at the end of this post, :) it's best to create solid link architecture (making navigation intuitive for users and crawlable for search engines)—implementing what makes sense for your users and their experience on your site.

Thanks for your time today! Information about outbound links will soon be available in Day 3 of links week. And, if you have helpful tips about internal links or questions for our team, please share them in the comments below.

Links information straight from the source

Published on 10/06/2008 at 10:20 AM
We hope that you're able to focus on helping users (and improving the web) by creating great content or providing a great service on your site. In between creating content and working on your site, you may have read some of the (often conflicting) link discussions circling the web. If you're asking, "What's going on -- what do I need to know about links?" then welcome to the first day of links week!

Day 2: Internal links (links within your site)
Internal linking is your homepage linking to your "Contact us" page, or your "Contact us" page linking to your "About me" page. Internal linking (also known as link architecture) is important because it's a major factor in how easily visitors can navigate your site. Additionally, internal linking contributes to your site's "crawlability" -- how easily a spider can reach your pages. More in Day 2 of links week.
Day 3: Outbound links (sites you link to)
Outbound links are external sites that you're linking to. For example, www.google.com/webmasters links to the domain googlewebmastercentral.blogspot.com (our lovely blog!). Outbound links allow us to surf the web -- they're a big reason why the web is so exciting and collaborative. Without outbound links, your site can seem isolated from the community because each page becomes "brochure-ware." Most sites include outbound links naturally and it shouldn't be a big concern. If you still have questions, we'll be covering outbound linking in more detail on Day 3.
Day 4: Inbound links (sites linking to you)
Inbound links are external sites linking to you. There are many webmasters who (rightfully) aren't preoccupied by the subject of inbound links. So why do some webmasters care? It's likely because merit-based or volunteered inbound links may seem like a quick way to increase rankings and traffic. Answers to your questions like, "Are there no-cost methods to maximize my merit-based links?" are provided on Day 4.

Advanced Website Diagnostics with Google Webmaster Tools

Published on 09/30/2008 at 11:55 AM
Running a website can be complicated, so we've provided Google Webmaster Tools to help webmasters recognize potential issues before they become real problems. Some of the issues that you can spot there are relatively small (such as having duplicate titles and descriptions); others can be bigger (such as your website not being reachable). While Google Webmaster Tools can't tell you exactly what you need to change, it can help you to recognize that there could be a problem that needs to be addressed.

Let's take a look at a few examples that we ran across in the Google Webmaster Help Groups:

Is your server treating Googlebot like a normal visitor?

While Googlebot tries to act like a normal user, some servers may get confused and react in strange ways. For example, although your server may work flawlessly most of the time, some servers running IIS may react with a server error (or some other action that is tied to a server error occurring) when visited by a user with Googlebot's user-agent. In the Webmaster Help Group, we've seen IIS servers return result code 500 (Server error) and result code 404 (File not found) in the "Web crawl" diagnostics section, as well as result code 302 when submitting Sitemap files. If your server is redirecting to an error page, you should make sure that we can crawl the error page and that it returns the proper result code. Once you've done that, we'll be able to show you these errors in Webmaster Tools as well. For more information about this issue and possible resolutions, please see http://todotnet.com/archive/0001/01/01/7472.aspx and http://www.kowitz.net/archive/2006/12/11/asp.net-2.0-mozilla-browser-detection-hole.aspx.

If your website is hosted on a Microsoft IIS server, also keep in mind that URLs are case-sensitive by definition (and that's how we treat them). This includes URLs in the robots.txt file, which is something that you should be careful with if your server is using URLs in a non-case-sensitive way. For example, "disallow: /paris" will block /paris but not /Paris.
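
The case-sensitivity point can be illustrated with Python's standard `urllib.robotparser` module. Note the assumption: this is Python's own robots.txt parser, not Googlebot's, so it only demonstrates the prefix-matching behavior described above. The `is_blocked` helper is a name made up for this sketch.

```python
from urllib.robotparser import RobotFileParser

# Parse an in-memory robots.txt containing the rule from the example above.
rules = RobotFileParser()
rules.parse([
    "User-agent: *",
    "Disallow: /paris",
])

def is_blocked(path, user_agent="Googlebot"):
    """Return True if the parsed rules disallow `path` for `user_agent`."""
    return not rules.can_fetch(user_agent, path)
```

Here `is_blocked("/paris")` is true while `is_blocked("/Paris")` is false, so a server that treats both spellings as the same page would still expose the capitalized URL to crawlers.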

Does your website have systematically broken links somewhere?

Modern content management systems (CMS) can make it easy to create issues that affect a large number of pages. Sometimes these issues are straightforward and visible when you view the pages; sometimes they're a bit harder to spot on your own. If an issue like this creates a large number of broken links, they will generally show up in the "Web crawl" diagnostics section in your Webmaster Tools account (provided those broken URLs return a proper 404 result code). In one recent case, a site had a small encoding issue in its RSS feed, resulting in over 60,000 bad URLs being found and listed in their Webmaster Tools account. As you can imagine, we would have preferred to spend time crawling content instead of these 404 errors :).

Is your website redirecting some users elsewhere?

For some websites, it can make sense to concentrate on a group of users in a certain geographic location. One method of doing that can be to redirect users located elsewhere to a different page. However, keep in mind that Googlebot might not be crawling from within your target area, so it might be redirected as well. This could mean that Googlebot will not be able to access your home page. If that happens, it's likely that Webmaster Tools will run into problems when it tries to confirm the verification code on your site, resulting in your site becoming unverified. This is not the only reason for a site becoming unverified, but if you notice this on a regular basis, it would be a good idea to investigate. On this subject, always make sure that Googlebot is treated the same way as other users from that location, otherwise that might be seen as cloaking.

Is your server unreachable when we try to crawl?

It can happen to the best of sites: servers can go down and firewalls can be overly protective. If that happens when Googlebot tries to access your site, we won't be able to crawl the website and you might not even know that we tried. Luckily, we keep track of these issues and you can spot "Network unreachable" and "robots.txt unreachable" errors in your Webmaster Tools account when we can't reach your site.

Has your website been hacked?

Hackers sometimes add strange, off-topic hidden content and links to questionable pages. If it's hidden, you might not even notice it right away; but nonetheless, it can be a big problem. While the Message Center may be able to give you a warning about some kinds of hidden text, it's best if you also keep an eye out yourself. Google Webmaster Tools can show you keywords from your pages in the "What Googlebot sees" section, so you can often spot a hack there. If you see totally irrelevant keywords, it would be a good idea to investigate what's going on. You might also try setting up Google Alerts or doing queries such as [site:example.com spammy words], where "spammy words" might be words like porn, viagra, tramadol, sex or other words that your site wouldn't normally show. If you find that your site actually was hacked, I'd recommend going through our blog post about things to do after being hacked.

There are a lot of issues that can be recognized with Webmaster Tools; these are just some of the more common ones that we've seen lately. Because it can be really difficult to recognize some of these problems, it's a great idea to check your Webmaster Tools account to make sure that you catch any issues before they become real problems. If you spot something that you absolutely can't pin down, why not post in the discussion group and ask the experts there for help?

Have you checked your site lately?

Keeping comment spam off your site and away from users

Published on 09/26/2008 at 02:26 PM
So, you've set up a forum on your site for the first time, or enabled comments on your blog. You carefully craft a post or two, click the submit button, and wait with bated breath for comments to come in.

And they do come in. Perhaps you get a friendly note from a fellow blogger, a pressing update from an MMORPG guild member, or a reminder from your Aunt Millie about dinner on Thursday. But then you get something else. Something... disturbing. Offers for deals that are too good to be true, bizarre logorrhean gibberish, and explicit images you certainly don't want Aunt Millie to see. You are now buried in a deluge of dreaded comment spam.

Comment spam is bad stuff all around. It's bad for you, because it adds to your workload. It's bad for your users, who want to find information on your site and certainly aren't interested in dodgy links and unrelated content. It's bad for the web as a whole, since it discourages people from opening up their sites for user-contributed content and joining conversations on existing forums.

So what can you, as a webmaster, do about it? 

A quick disclaimer: the list below is a good start, but not exhaustive. There are so many different blog, forum, and bulletin board systems out there that we can't possibly provide detailed instructions for each, so the points below are general enough to make sense on most systems.

Make sure your commenters are real people
  • Add a CAPTCHA. CAPTCHAs require users to read a bit of obfuscated text and type it back in to prove they're a human being and not an automated script. If your blog or forum system doesn't have CAPTCHAs built in, you may be able to find a plugin like reCAPTCHA, a project which also helps digitize old books. CAPTCHAs are not foolproof, but they make life a little more difficult for spammers. You can read more about the many different types of CAPTCHAs, but keep in mind that just adding a simple one can be fairly effective.

  • Block suspicious behavior. Many forums allow you to set time limits between posts, and you can often find plugins to look for excessive traffic from individual IP addresses or proxies and other activity more common to bots than human beings.
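
If your system has no built-in time limit between posts, the idea is simple enough to sketch. The class below is a hypothetical, in-memory illustration (the `PostThrottle` name and the 30-second default are our assumptions); a real forum would keep this state in its database or cache and would likely combine it with other signals.

```python
class PostThrottle:
    """Reject posts that arrive too soon after the previous post from the same IP."""

    def __init__(self, min_interval_seconds=30.0):
        self.min_interval = min_interval_seconds
        self._last_post = {}   # IP address -> time of the last accepted post

    def allow(self, ip, now):
        """Return True and record the post if `ip` may post at time `now` (in seconds)."""
        last = self._last_post.get(ip)
        if last is not None and now - last < self.min_interval:
            return False       # too soon: more typical of a script than a person
        self._last_post[ip] = now
        return True
```

Passing the current time in explicitly (for example from `time.time()`) keeps the sketch easy to test; rejected attempts do not reset the timer.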

Use automatic filtering systems
  • Block obviously inappropriate comments by adding words to a blacklist. Spammers obfuscate words in their comments so this isn't a very scalable solution, but it can keep blatant spam at bay.

  • Use built-in features or plugins that delete or mark comments as spam for you. Spammers use automated methods to besmirch your site, so why not use an automated system to defend yourself? Comprehensive systems like Akismet, which has plugins for many blog and forum systems, and TypePad AntiSpam, which is open-source and compatible with Akismet, are easy to install and do most of the work for you.

  • Try using Bayesian filtering options, if available. Training the system to recognize spam may require some effort on your part, but this technique has been used successfully to fight email spam.
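
To make the blacklist idea concrete, here is a minimal sketch of a word filter that undoes a few common character substitutions before matching. The word list and the substitution table are assumptions chosen for illustration, and the approach shares the weakness noted above: spammers can always find another way to disguise a word (spacing the letters out, for example, defeats it).

```python
import re

# Hypothetical blacklist; a real one would be maintained per site.
BLACKLIST = {"viagra", "casino"}

# A few common look-alike substitutions (an assumption, far from complete).
LOOKALIKES = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a",
                            "5": "s", "@": "a", "$": "s"})

def normalize(text):
    """Lowercase, undo look-alike characters, and drop everything but letters and spaces."""
    text = text.lower().translate(LOOKALIKES)
    return re.sub(r"[^a-z\s]", "", text)

def is_blacklisted(comment):
    """Return True if any normalized word of the comment is on the blacklist."""
    return any(word in BLACKLIST for word in normalize(comment).split())
```

A filter like this is best used to hold comments for review rather than to delete them outright, since a legitimate comment can occasionally contain a listed word.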

Make your settings a bit stricter
  • Nofollow untrusted links. Many systems have a setting to add a rel="nofollow" attribute to the links in comments, or do so by default. This may discourage some types of spam, but it's definitely not the only measure you should take.

  • Consider requiring users to create accounts before they can post a comment. This adds steps to the user experience and may discourage some casual visitors from posting comments, but may keep the signal-to-noise ratio higher as well.

  • Change your settings so that comments need to be approved before they show up on your site. This is a great tactic if you want to hold comments to a high standard, don't expect a lot of comments, or have a small, personal site. You may be able to allow employees or trusted users to approve posts themselves, spreading the workload. 

  • Think about disabling some types of comments. For example, you may want to disable comments on very old posts that are unlikely to get legitimate comments. On blogs you can often disable trackbacks and pingbacks, which are very cool features but can be major avenues for automated spam.

Keep your site up-to-date
  • Take the time to keep your software up-to-date and pay special attention to important security updates. Some spammers take advantage of security holes in older versions of blogs, bulletin boards, and other content management systems. Check the Quick Security Checklist for additional measures.

You may need to strike a balance on which tactics you choose to implement depending on your blog or bulletin board software, your user base, and your level of experience. Opening up a site for comments without any protection is a big risk, whether you have a small personal blog or a huge site with thousands of users. Also, if your forum has been completely filled with thousands of spam posts and doesn't even show up in Google searches, you may want to submit a reconsideration request after you clear out the bad content and take measures to prevent further spam.

As a long-time blogger and web developer myself, I can tell you that a little time spent setting up measures like these up front can save you a ton of time and effort later. I'm new to the Webmaster Central team, originally from Cleveland. I'm very excited to help fellow webmasters, and have a passion for usability and search quality (I've even done a bit of academic research on the topic). Please share your tips on preventing comment and forum spam in the comments below, and as always you're welcome to ask questions in our discussion group.

More webmaster questions - Answered!

Published on 09/23/2008 at 02:05 PM
When it comes to answering your webmaster related questions, we just can't get enough. I wanted to follow-up and answer some additional questions that webmasters asked in our latest installment of Popular Picks. In case you missed it, you can find our answers to image search ranking, sitelinks, reconsideration requests, redirects, and our communication with webmasters in this blog post.



Check out these resources for additional details on questions answered in the video:

Video Transcript:

Hi everyone, I'm Reid from the Search Quality team. Today I'd like to answer some of the unanswered questions from our latest round of popular picks.

Searchmaster had a question about duplicate content. Understandably, this is a popular concern from webmasters. You should check out the Google Webmaster Central Blog, where my colleague Susan Moskwa recently posted "Demystifying the 'duplicate content penalty,'" which answers many questions and concerns about duplicate content.

Jay is the Boss wanted to know if e-commerce websites suffer if they have two or more different themes. For example, you could have a site that sells auto parts, but also sightseeing guides. In general, I'd encourage webmasters to create a website that they feel is relevant for users. If it makes sense to sell auto parts and sightseeing guides, then go for it. Those are the sites that perform well, because users want to visit those sites, and they'll link to them as well.

emma2 wanted to know if Google will follow links on a page using the "noindex" attribute in the "robots" meta tag. To answer this question, Googlebot will follow links on a page which uses the meta "noindex" tag, but that page will not appear in our search results. As a reminder, if you would like to prevent Googlebot from crawling any links on a page, use the "nofollow" attribute in the "robots" meta tag.

Aaron Pratt wanted to know about some ways a webmaster can rank well for local searches. A quick recommendation is to add your business to the Local Business Center. There, you can add contact information as well as store operating hours and coupons. Another example, or a tip, is to take advantage and purchase a country-specific top-level domain, or use the geotargeting feature in Webmaster Tools.

jdeb901 said it would be helpful if we could let webmasters know if we are having problems with Webmaster Tools. This is an excellent point, and we're always thinking about better ways to communicate with webmasters. If you're having problems with Webmaster Tools, chances are someone else is as well, and they've posted to the Google Webmaster Help Group about this. In the past, if we've experienced problems with Webmaster Tools, we've also created a "sticky" post to let users know that we know about these issues with Webmaster Tools, and we're working to find a solution.

Well, that about wraps it up with our Popular Picks. Thanks again for all of your questions, and I look forward to seeing you around the group.

Dynamic URLs vs. static URLs

Published on 09/22/2008 at 03:36 PM
Chatting with webmasters often reveals widespread beliefs that might have been accurate in the past, but are not necessarily up-to-date any more. This was the case when we recently talked to a couple of friends about the structure of a URL. One friend was concerned about using dynamic URLs, since (as she told us) "search engines can't cope with these." Another friend thought that dynamic URLs weren't a problem at all for search engines and that these issues were a thing of the past. One even admitted that he never understood the fuss about dynamic URLs in comparison to static URLs. For us, that was the moment we decided to read up on the topic of dynamic and static URLs. First, let's clarify what we're talking about:

What is a static URL?
A static URL is one that does not change, so it typically does not contain any URL parameters. It can look like this: http://www.example.com/archive/january.htm. You can search for static URLs on Google by typing filetype:htm in the search field. Updating these kinds of pages can be time-consuming, especially if the amount of information grows quickly, since every single page has to be hard-coded. This is why webmasters who deal with large, frequently updated sites like online shops, forum communities, blogs or content management systems may use dynamic URLs.

What is a dynamic URL?
If the content of a site is stored in a database and pulled for display on pages on demand, dynamic URLs may be used. In that case the site serves basically as a template for the content. Usually, a dynamic URL would look something like this: http://code.google.com/p/google-checkout-php-sample-code/issues/detail?id=31. You can spot dynamic URLs by looking for characters like: ? = &. Dynamic URLs have the disadvantage that different URLs can have the same content. So different users might link to URLs with different parameters which have the same content. That's one reason why webmasters sometimes want to rewrite their URLs to static ones.
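
Spotting the ? = & characters can also be done programmatically. The sketch below uses Python's standard `urllib.parse` module to check for a query string and list its parameters; the function names are made up for this illustration, and a URL without a query string is not necessarily backed by a static file.

```python
from urllib.parse import urlsplit, parse_qsl

def looks_dynamic(url):
    """Return True if the URL carries a query string (the part after '?')."""
    return bool(urlsplit(url).query)

def query_parameters(url):
    """Return the URL's query parameters as a list of (name, value) pairs."""
    return parse_qsl(urlsplit(url).query, keep_blank_values=True)
```

For the two example URLs above, the January archive page has no parameters, while the issue page has a single `id` parameter with the value `31`.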

Should I try to make my dynamic URLs look static?
Following are some key points you should keep in mind while dealing with dynamic URLs:
  1. It's quite hard to correctly create and maintain rewrites that change dynamic URLs to static-looking URLs.
  2. It's much safer to serve us the original dynamic URL and let us handle the problem of detecting and avoiding problematic parameters.
  3. If you want to rewrite your URL, please remove unnecessary parameters while maintaining a dynamic-looking URL.
  4. If you want to serve a static URL instead of a dynamic URL you should create a static equivalent of your content.
Which can Googlebot read better, static or dynamic URLs?
We've come across many webmasters who, like our friend, believed that static or static-looking URLs were an advantage for indexing and ranking their sites. This is based on the presumption that search engines have issues with crawling and analyzing URLs that include session IDs or source trackers. However, as a matter of fact, we at Google have made some progress in both areas. While static URLs might have a slight advantage in terms of clickthrough rates because users can easily read the URLs, the decision to use database-driven websites does not imply a significant disadvantage in terms of indexing and ranking. Providing search engines with dynamic URLs should be favored over hiding parameters to make them look static.

Let's now look at some of the widespread beliefs concerning dynamic URLs and correct some of the assumptions which spook webmasters. :)

Myth: "Dynamic URLs cannot be crawled."
Fact: We can crawl dynamic URLs and interpret the different parameters. We might have problems crawling and ranking your dynamic URLs if you try to make your URLs look static and in the process hide parameters which offer Googlebot valuable information. One recommendation is to avoid reformatting a dynamic URL to make it look static. It's always advisable to use static content with static URLs as much as possible, but in cases where you decide to use dynamic content, you should give us the possibility to analyze your URL structure and not remove information by hiding parameters and making them look static.

Myth: "Dynamic URLs are okay if you use fewer than three parameters."
Fact: There is no limit on the number of parameters, but a good rule of thumb would be to keep your URLs short (this applies to all URLs, whether static or dynamic). You may be able to remove some parameters which aren't essential for Googlebot and offer your users a nice looking dynamic URL. If you are not able to figure out which parameters to remove, we'd advise you to serve us all the parameters in your dynamic URL and our system will figure out which ones do not matter. Hiding your parameters keeps us from analyzing your URLs properly and we won't be able to recognize the parameters as such, which could cause a loss of valuable information.

Following are some questions we thought you might have at this point.

Does that mean I should avoid rewriting dynamic URLs at all?
That's our recommendation, unless your rewrites are limited to removing unnecessary parameters, or you are very diligent in removing all parameters that could cause problems. If you transform your dynamic URL to make it look static you should be aware that we might not be able to interpret the information correctly in all cases. If you want to serve a static equivalent of your site, you might want to consider transforming the underlying content by serving a replacement which is truly static. One example would be to generate files for all the paths and make them accessible somewhere on your site. However, if you're using URL rewriting (rather than making a copy of the content) to produce static-looking URLs from a dynamic site, you could be doing harm rather than good. Feel free to serve us your standard dynamic URL and we will automatically find the parameters which are unnecessary.

Can you give me an example?
If you have a dynamic URL which is in the standard format like foo?key1=value&key2=value2 we recommend that you leave the URL unchanged, and Google will determine which parameters can be removed; or you could remove unnecessary parameters for your users. Be careful that you only remove parameters which do not matter. Here's an example of a URL with a couple of parameters:

www.example.com/article/bin/answer.foo?language=en&answer=3&sid=98971298178906&query=URL
  • language=en - indicates the language of the article
  • answer=3 - the article has the number 3
  • sid=98971298178906 - the session ID number is 98971298178906
  • query=URL - the query with which the article was found is [URL]
Not all of these parameters offer additional information. So rewriting the URL to www.example.com/article/bin/answer.foo?language=en&answer=3 probably would not cause any problems as all irrelevant parameters are removed.
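
Removing irrelevant parameters while keeping the URL dynamic can be sketched in a few lines with Python's standard `urllib.parse` module. Which parameters are irrelevant depends on the site; the `IRRELEVANT` set below simply follows the example (`sid` and `query`), and the `http://` scheme in the usage note is added for illustration.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Parameters that do not change the content in this example (site-specific assumption).
IRRELEVANT = {"sid", "query"}

def strip_parameters(url, irrelevant=IRRELEVANT):
    """Drop the named query parameters and keep the rest in their original order."""
    parts = urlsplit(url)
    kept = [(name, value)
            for name, value in parse_qsl(parts.query, keep_blank_values=True)
            if name not in irrelevant]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

Applied to http://www.example.com/article/bin/answer.foo?language=en&answer=3&sid=98971298178906&query=URL, this returns the shorter URL with only `language` and `answer`, and two visits with different session IDs map to the same result.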

The following are some examples of static-looking URLs which may cause more crawling problems than serving the dynamic URL without rewriting:
  • www.example.com/article/bin/answer.foo/en/3/98971298178906/URL
  • www.example.com/article/bin/answer.foo/language=en/answer=3/
    sid=98971298178906/query=URL
  • www.example.com/article/bin/answer.foo/language/en/answer/3/
    sid/98971298178906/query/URL
  • www.example.com/article/bin/answer.foo/en,3,98971298178906,URL
Rewriting your dynamic URL to one of these examples could cause us to crawl the same piece of content needlessly via many different URLs with varying values for session IDs (sid) and query. These forms make it difficult for us to understand that URL and 98971298178906 have nothing to do with the actual content which is returned via this URL. However, here's an example of a rewrite where all irrelevant parameters have been removed:
  • www.example.com/article/bin/answer.foo/en/3
Although we are able to process this URL correctly, we would still discourage you from using this rewrite as it is hard to maintain and needs to be updated as soon as a new parameter is added to the original dynamic URL. Failure to do this would again result in a static looking URL which is hiding parameters. So the best solution is often to keep your dynamic URLs as they are. Or, if you remove irrelevant parameters, bear in mind to leave the URL dynamic as the above example of a rewritten URL shows:
  • www.example.com/article/bin/answer.foo?language=en&answer=3
We hope this article is helpful to you and our friends to shed some light on the various assumptions around dynamic URLs. Please feel free to join our discussion group if you have any further questions.

Webmaster Tools made easier in French, Italian, German and Spanish

Published on 09/16/2008 at 01:49 PM
We're always looking for new ways to make life a bit easier for webmasters. We've had great feedback to many of the initiatives that have taken place in Webmaster Tools and beyond, but given the complex nature of managing a website, there are some questions regarding the tools that come up quite often across the Webmaster Help Groups. This got us thinking: how can we best address these questions?

Well, if you're like me, then you find it a lot easier to learn how to use something if you actually get to see someone else doing it first; with that in mind, we'll launch a series of six video tutorials in French, German, Italian and Spanish over the next couple of months. The videos will take you through the basics of Webmaster Tools as well as how to use the information in the tools to make improvements to your site and hence your site's visibility in Google's index.

Our first video provides an overview of the different information you can access depending on whether you've verified ownership of your site in Webmaster Tools. We'll also explain the different verification methods available. And just to whet your appetite, here are the topics covered in the series:

Video 1: Getting started, signing in, benefits of verifying a site
Video 2: Setting preferences for crawling and indexing
Video 3: Creating and submitting Sitemaps
Video 4: Removing and preventing your content from being indexed
Video 5: Utilizing the Diagnostics, Statistics and Links sections
Video 6: Communicating between Webmasters and Google

You can access the first of these videos in the links provided below and keep a lookout in the local Webmaster Help Groups for upcoming video releases.

Italian Video Tutorials - Italian Webmaster Help Group
Latin America and Spain Video Tutorials - Spanish Webmaster Help Group
French Video Tutorials - French Webmaster Help Group
German Video Tutorials - German Webmaster Help Group - German Webmaster Blog

Enjoy!

Read Full Story...

Demystifying the "duplicate content penalty"

Published on 09/12/2008 at 08:30 AM

Duplicate content. There's just something about it. We keep writing about it, and people keep asking about it. In particular, I still hear a lot of webmasters worrying about whether they may have a "duplicate content penalty."

Let's put this to bed once and for all, folks: There's no such thing as a "duplicate content penalty." At least, not in the way most people mean when they say that.

There are some penalties that are related to the idea of having the same content as another site—for example, if you're scraping content from other sites and republishing it, or if you republish content without adding any additional value. These tactics are clearly outlined (and discouraged) in our Webmaster Guidelines:

  • Don't create multiple pages, subdomains, or domains with substantially duplicate content.
  • Avoid... "cookie cutter" approaches such as affiliate programs with little or no original content.
  • If your site participates in an affiliate program, make sure that your site adds value. Provide unique and relevant content that gives users a reason to visit your site first.

(Note that while scraping content from others is discouraged, having others scrape you is a different story; check out this post if you're worried about being scraped.)

But most site owners whom I hear worrying about duplicate content aren't talking about scraping or domain farms; they're talking about things like having multiple URLs on the same domain that point to the same content. Like www.example.com/skates.asp?color=black&brand=riedell and www.example.com/skates.asp?brand=riedell&color=black. Having this type of duplicate content on your site can potentially affect your site's performance, but it doesn't cause penalties. From our article on duplicate content:

Duplicate content on a site is not grounds for action on that site unless it appears that the intent of the duplicate content is to be deceptive and manipulate search engine results. If your site suffers from duplicate content issues, and you don't follow the advice listed above, we do a good job of choosing a version of the content to show in our search results.

This type of non-malicious duplication is fairly common, especially since many CMSs don't handle this well by default. So when people say that having this type of duplicate content can affect your site, it's not because you're likely to be penalized; it's simply due to the way that web sites and search engines work.

Most search engines strive for a certain level of variety; they want to show you ten different results on a search results page, not ten different URLs that all have the same content. To this end, Google tries to filter out duplicate documents so that users experience less redundancy. You can find details in this blog post, which states:

  1. When we detect duplicate content, such as through variations caused by URL parameters, we group the duplicate URLs into one cluster.
  2. We select what we think is the "best" URL to represent the cluster in search results.
  3. We then consolidate properties of the URLs in the cluster, such as link popularity, to the representative URL.
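
To make step 1 a bit more concrete, here is a small Python sketch that groups URLs differing only in the order of their query parameters. This is an illustration only, not a description of how Google actually clusters duplicates; the function names are hypothetical, and real duplicate detection looks at much more than parameter order.

```python
from collections import defaultdict
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def canonical_key(url):
    """Sort the query parameters so that their order no longer matters."""
    parts = urlsplit(url)
    params = sorted(parse_qsl(parts.query, keep_blank_values=True))
    return urlunsplit(parts._replace(query=urlencode(params)))

def cluster_urls(urls):
    """Group URLs that share the same canonical key into one cluster."""
    clusters = defaultdict(list)
    for url in urls:
        clusters[canonical_key(url)].append(url)
    return dict(clusters)

urls = [
    "http://www.example.com/skates.asp?color=black&brand=riedell",
    "http://www.example.com/skates.asp?brand=riedell&color=black",
    "http://www.example.com/skates.asp?color=white&brand=riedell",
]
for key, members in cluster_urls(urls).items():
    print(key, "<-", len(members), "URL(s)")
```

Here the two black Riedell URLs land in one cluster and the white one in another. Steps 2 and 3 (picking a representative and consolidating its properties) are not modeled.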

Here's how this could affect you as a webmaster:

  • In step 2, Google's idea of what the "best" URL is might not be the same as your idea. If you want to have control over whether www.example.com/skates.asp?color=black&brand=riedell or www.example.com/skates.asp?brand=riedell&color=black gets shown in our search results, you may want to take action to mitigate your duplication. One way of letting us know which URL you prefer is by including the preferred URL in your Sitemap.
  • In step 3, if we aren't able to detect all the duplicates of a particular page, we won't be able to consolidate all of their properties. This may dilute the strength of that content's ranking signals by splitting them across multiple URLs.
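
If you want to list your preferred URL in a Sitemap, a minimal Python sketch for generating one could look like the following. It assumes the standard Sitemap protocol 0.9 namespace and emits only the required loc element; the build_sitemap helper is a hypothetical name, not part of any Google tool.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the Sitemap protocol (sitemaps.org), version 0.9.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"

def build_sitemap(preferred_urls):
    """Return a Sitemap XML string with one <url><loc> entry per preferred URL."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in preferred_urls:
        entry = ET.SubElement(urlset, "url")
        # ElementTree escapes "&" as "&amp;" when the tree is serialized.
        ET.SubElement(entry, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")

print(build_sitemap([
    "http://www.example.com/skates.asp?color=black&brand=riedell",
]))
```

Only the version of each page you prefer goes into the list; the alternate parameter orderings are simply left out.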

In most cases Google does a good job of handling this type of duplication. However, you may also want to consider content that's being duplicated across domains. In particular, deciding to build a site whose purpose inherently involves content duplication is something you should think twice about if your business model is going to rely on search traffic, unless you can add a lot of additional value for users. For example, we sometimes hear from Amazon.com affiliates who are having a hard time ranking for content that originates solely from Amazon. Is this because Google wants to stop them from trying to sell Everyone Poops? No; it's because how the heck are they going to outrank Amazon if they're providing the exact same listing? Amazon has a lot of online business authority (most likely more than a typical Amazon affiliate site does), and the average Google search user probably wants the original information on Amazon, unless the affiliate site has added a significant amount of additional value.

Lastly, consider the effect that duplication can have on your site's bandwidth. Duplicated content can lead to inefficient crawling: when Googlebot discovers ten URLs on your site, it has to crawl each of those URLs before it knows whether they contain the same content (and thus before we can group them as described above). The more time and resources that Googlebot spends crawling duplicate content across multiple URLs, the less time it has to get to the rest of your content.
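
The point that every URL has to be fetched before duplicates can be recognized can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not Googlebot's method: pages is a hypothetical mapping from URL to an already downloaded page body, and a plain SHA-256 hash of the whitespace-normalized body stands in for whatever comparison a real crawler uses.

```python
import hashlib
from collections import defaultdict

def fingerprint(body):
    """Hash the page body after collapsing runs of whitespace."""
    normalized = " ".join(body.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

def group_by_content(pages):
    """Return lists of URLs whose downloaded bodies are identical."""
    groups = defaultdict(list)
    for url, body in pages.items():
        groups[fingerprint(body)].append(url)
    return [urls for urls in groups.values() if len(urls) > 1]

# Hypothetical crawl results: each body had to be downloaded first.
pages = {
    "http://www.example.com/skates.asp?color=black&brand=riedell": "<h1>Black Riedell skates</h1>",
    "http://www.example.com/skates.asp?brand=riedell&color=black": "<h1>Black  Riedell skates</h1>",
    "http://www.example.com/about.asp": "<h1>About us</h1>",
}
print(group_by_content(pages))
```

Note that the grouping only happens after all three bodies are in hand, which is exactly the crawling cost described above.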

In summary: Having duplicate content can affect your site in a variety of ways; but unless you've been duplicating deliberately, it's unlikely that one of those ways will be a penalty. This means that:

  • You typically don't need to submit a reconsideration request when you're cleaning up innocently duplicated content.
  • If you're a webmaster of beginner-to-intermediate savviness, you probably don't need to put too much energy into worrying about duplicate content, since most search engines have ways of handling it.
  • You can help your fellow webmasters by not perpetuating the myth of duplicate content penalties! The remedies for duplicate content are entirely within your control. Here are some good places to start.

Read Full Story...

Your burning questions - Answered!

Published on 09/10/2008 at 01:12 PM
In a recent blog post highlighting our Webmaster Help Group, I asked for your webmaster-related questions. In our second installment of Popular Picks, we hoped to discover which issues webmasters wanted to learn more about, and then respond with some better documentation on those topics. It looks like it was a success, so please get clicking:
Thanks again for your questions! See you around the group.

Read Full Story...

Workin' it on all browsers

Published on 09/05/2008 at 12:48 PM
To web surfers, Google Chrome is a quick, exciting new browser. For webmasters, it's a good reminder that regardless of the browser your visitors use to access your site (Firefox, Internet Explorer, Google Chrome, Safari, etc.), browser compatibility is often a high priority. When your site renders poorly or is difficult to use on many browsers, you risk losing your visitors' interest and, if you're running a monetized site, perhaps their business. Here's a quick list to make sure you're covering the basics:

Step 1: Ensure browser compatibility by focusing on accessibility
The same techniques that make your site more accessible to search engines, such as static HTML versus fancy features like AJAX, often help your site's compatibility on various browsers and numerous browser versions. Simpler HTML is often more easily cross-compatible than the latest techniques.

Step 2: Consider validating your code
If your code passes validation, you've eliminated one potential issue in browser compatibility. With validated code, you won't need to rely on each browser's error-handling technique. There's a greater chance that your code will function across different browsers, and it's easier to debug potential problems.

Step 3: Check that it's usable (not just properly rendered)
It's important that your site displays well; but equally important, make sure that users can actually use your site's features in their browser. Rather than just looking at a snapshot of your site, try navigating through your site on various browsers or adding items to your shopping cart. It's possible that the clickable area of a linked image or button may change from browser to browser. Additionally, if you use JavaScript for components like your shopping cart, it may work in one browser but not another.

Step 4: Straighten out the kinks
This step requires some trial and error, but there are several good places to help reduce the "trials" as you make your site cross-browser compatible. Doctype is an open source reference with test cases for cross-browser compatibility, as well as CSS tips and tricks.

For example, let's say you're wondering how to find the offset for an element on your page. You notice that your code works in Internet Explorer, but not Firefox and Safari. It turns out that certain browsers are a bit finicky when it comes to finding the offset—thankfully contributors to Doctype provide the code to work around the issue.

Step 5: Share your browser compatibility tips and resources!
We'd love to hear the steps you're taking to ensure your site works for the most visitors. We've written a more in-depth Help Center article on the topic which discusses such things as specifying a character encoding. If you have additional tips, please share. And, if you have browser compatibility questions regarding search, please ask!

Read Full Story...

The Impact of User Feedback, Part 2 (and more Popular Picks!)

Published on 08/26/2008 at 05:34 PM
As a follow-up to my recent post about how user reports of webspam and paid links help improve Google's search results for millions of users, I wanted to highlight one of the most essential parts of Google Webmaster Central: our Webmaster Help Group. With over 37,000 members in our English group and support in 15 other languages, the group is the place to get your questions answered regarding crawling and indexing or Webmaster Tools. We're thankful for a fabulous group of Bionic Posters who have dedicated their time and energy to making the Webmaster Help Group a great place to be. When appropriate, Googlers, including myself, jump in to clarify issues or participate in the dialogue. One thing to note: we try hard to read most posts in the group, and although we may not respond to each one, your feedback and concerns help drive the features we work on. Here are a few examples:

Sitemap details
Submitting a Sitemap through Webmaster Tools is one way to let Google know which pages exist on your site. Users were quick to note that even though they submitted a Sitemap of all the pages on their site, they only found a sampling of URLs indexed through a site: search. In response, the Webmaster Tools team created a Sitemaps details page to better tell you how your Sitemap was processed. You can read a refresher about the Sitemaps details page in Jonathan's blog post.

Contextual help
One request we received early on with Webmaster Tools was for better documentation on the data displayed. We saw several questions about meta description and title tag issues using our Content Analysis tool, which led us to beef up our documentation on that page and link to that Help Center article directly from that page. Similarly, we discovered that users needed clarification on the distinction between "top search queries" and "top clicked queries" and how the data can be used. We added an expandable section entitled "How do I use this data?" and placed contextual help information across Webmaster Tools to explain what each feature is and where to get more information about it.

Blog posts
The Webmaster Help Group is also a way for us to keep a pulse on what overarching questions are on the minds of webmasters so we can address some of those concerns through this blog. Whether it's how to submit a reconsideration request using Webmaster Tools, deal with duplicate content, move a site, or design for accessibility, we're always open to hearing more about your concerns in the Group. Which reminds me...

It's time for more Popular Picks!
Last year, we devoted two weeks to soliciting and answering five of your most pressing webmaster-related questions. These Popular Picks covered the following topics:
Seeing as this was a well-received initiative, I'm happy to announce that we're going to do it again. Head on over to this thread to ask your webmaster-related questions. See you there!

Read Full Story...

silver_medal_count++

Published on 08/22/2008 at 04:29 PM
Since both tennis and table tennis are in the Olympics, perhaps you're wondering: if there's soccer, why not "table soccer?" Of course, we know table soccer by another name; and while foosball may not be an Olympic sport, we still cheered Nathan Johns and Jan Backes—two members of our Search Quality team—as they brought home the foosball silver medal at the search engine foosball smackdown at SES San Jose.

"Smackdown" doesn't quite equate to "Olympics," but check out the intensity—you could hear a pin drop!

silver medalists at foosball

The gold medal (cup) went to the search engine down the road. :)

gold medalists at foosball
Yahoo's first place winners Daniel Wong and Jake Rosenberg.

Just to be sure they weren't ringers, I quizzed Daniel and Jake, "How can you prevent a file from being crawled?" They correctly answered, "robots.txt."
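
For anyone who wants to check their own answer, here is a small Python sketch using the standard library's urllib.robotparser. The robots.txt content and the file path are made-up examples, and is_allowed is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt that blocks a single file for all crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/report.pdf
"""

def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

print(is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/private/report.pdf"))
print(is_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/index.html"))
```

The first check reports the blocked file as not fetchable, while the rest of the site stays crawlable.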

Gold cup well deserved.

Read Full Story...

Hey Google, I no longer have badware

Published on 08/21/2008 at 05:43 PM
This post is for anyone who has been emailed or notified by Google about badware, received a badware warning when browsing their own site using Firefox, or has come across malware-labeled search results for their own site(s).  As you know, these warnings are produced by our automated scanning systems, which we've put in place to ensure the quality of our results by protecting our users.  Whatever the case, if you are dealing with badware, here are a few recommendations that can help you out. 

1.  If you have badware, it usually means that your web server, your website, or a database used by your website has been compromised. We have a nifty post on how to handle being hacked.  Be very careful when inspecting for malware on your site so as to avoid exposing your computer to infection.

2. Once everything is clear and dandy, you can follow the steps in our post about malware reviews via Webmaster Tools. Please note that the screenshot in the previous post is outdated; the new malware review form is now on the Overview page.



  • Other programs, such as Firefox, also use our badware data and may not recognize the change immediately due to their caching of the data.  So even if the badware label in search is removed, it may take some time for that to be visible in such programs.

3. Lastly, if you believe that your rankings were somehow affected by the malware, such as compromised content that violated our Webmaster Guidelines (e.g., hacked pages with hidden pharmacy text links), you should fill out a reconsideration request. To clarify, reconsideration requests are usually used when you notice issues stemming from violations of our Webmaster Guidelines and are separate from malware requests.

If you have additional questions, please review our documentation or post to the discussion group with the URL of your site. We hope you find this updated feature in Webmaster Tools useful in discovering and fixing any malware-related problems. 

Written by Evan Tang, Search Quality Team
Read Full Story...

Make your 404 pages more useful

Published on 08/19/2008 at 12:51 PM
Your visitors may stumble into a 404 "Not found" page on your website for a variety of reasons:
  • A mistyped URL, or a copy-and-paste mistake
  • Broken or truncated links on web pages or in an email message
  • Moved or deleted content
Confronted by a 404 page, they may then attempt to manually correct the URL, click the back button, or even navigate away from your site. As hinted in an earlier post for "404 week at Webmaster Central", there are various ways to help your visitors get out of the dead-end situation. In our quest to make 404 pages more useful, we've just added a section in Webmaster Tools called "Enhance 404 pages". If you've created a custom 404 page, this allows you to embed a widget in your 404 page that helps your visitors find what they're looking for by providing suggestions based on the incorrect URL.


Example: Jamie receives the link www.example.com/activities/adventurecruise.html in an email message. Because of formatting due to a bad email client, the URL is truncated to www.example.com/activities/adventur. As a result it returns a 404 page. With the 404 widget added, however, she could instead see the following:



In addition to attempting to correct the URL, the 404 widget also suggests the following, if available:
  • a link to the parent subdirectory
  • a sitemap webpage
  • site search query suggestions and search box
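
To give a feel for the kind of suggestions involved, here is a rough Python sketch that proposes a closest known URL and a parent subdirectory for a truncated path. It is not the widget's actual logic, which is not described here: KNOWN_PATHS is a hypothetical list of pages on the site, and the matching uses difflib's generic string similarity with an arbitrary cutoff.

```python
import difflib
import posixpath

# Hypothetical list of paths that really exist on the site.
KNOWN_PATHS = [
    "/activities/adventurecruise.html",
    "/activities/hiking.html",
    "/contact.html",
]

def suggest(broken_path, known_paths=KNOWN_PATHS):
    """Suggest the most similar known path and the parent subdirectory."""
    closest = difflib.get_close_matches(broken_path, known_paths, n=1, cutoff=0.6)
    parent = posixpath.dirname(broken_path.rstrip("/"))
    if not parent.endswith("/"):
        parent += "/"
    return {"closest": closest[0] if closest else None, "parent": parent}

print(suggest("/activities/adventur"))
# prints {'closest': '/activities/adventurecruise.html', 'parent': '/activities/'}
```

For Jamie's truncated link, the sketch points back to the adventure cruise page and offers the /activities/ directory as a fallback.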

How do you add the widget? Visit the "Enhance 404 pages" section in Webmaster Tools, which allows you to generate a JavaScript snippet. You can then copy and paste this into your custom 404 page's code. As always, don't forget to return a proper 404 code.
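
As a reminder of what returning a proper 404 code means in practice, here is a minimal WSGI sketch in Python. It assumes a toy site whose pages live in a dictionary; PAGES, CUSTOM_404, and status_for are hypothetical names, and the widget snippet itself is not reproduced.

```python
# Hypothetical site content; a real site would serve files or templates.
PAGES = {"/": b"<h1>Home</h1>"}

# Custom 404 page; the generated widget snippet would be pasted into this HTML.
CUSTOM_404 = b"<h1>Sorry, we could not find that page</h1>"

def app(environ, start_response):
    """Minimal WSGI app: known paths get 200, everything else a real 404."""
    path = environ.get("PATH_INFO", "/")
    if path in PAGES:
        status, body = "200 OK", PAGES[path]
    else:
        # Send the friendly page together with a 404 status, never a 200.
        status, body = "404 Not Found", CUSTOM_404
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]

def status_for(path):
    """Call the app directly (no server needed) and return the status line."""
    seen = []
    app({"PATH_INFO": path}, lambda status, headers: seen.append(status))
    return seen[0]

print(status_for("/"))                      # prints 200 OK
print(status_for("/activities/adventur"))   # prints 404 Not Found
```

To try it in a browser, the app can be served with the standard library, for example via wsgiref.simple_server.make_server("localhost", 8000, app).serve_forever().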

Can you change the way it looks? Sure. We leave the HTML unstyled initially, but you can edit the CSS block that we've included. For more information, check out our guide on how to customize the look of your 404 widget.

This feature is currently experimental -- we might not provide corrections and suggestions for your site but we'll be working to improve the coverage. In the meantime, let us know what you think in the comments below or in our group discussion. Thanks for helping us make the Internet a more friendly place!

Written by Sahala Swenson, Webmaster Tools team
Read Full Story...

More on 404

Published on 08/15/2008 at 03:21 PM
Now that we've bid farewell to soft 404s, in this post for 404 week we'll answer your burning 404 questions.

How do you treat the response code 410 "Gone"?
Just like a 404.

Do you index content or follow links from a page with a 404 response code?
We aim to understand as much as possible about your site and its content. So while we wouldn't want to show a hard 404 to users in search results, we may utilize a 404's content or links if it's detected as a signal to help us better understand your site.

Keep in mind that if you want links crawled or content indexed, it's far more beneficial to include them in a non-404 page.

What about 404s with a 10-second meta refresh?
Yahoo! currently utilizes this method on their 404s. They respond with a 404, but the 404 content also shows:

<meta http-equiv="refresh" content="10;url=http://www.yahoo.com/?xxx">

We feel this technique is fine because it reduces confusion by giving users 10 seconds to make a new selection, only offering the homepage after 10 seconds without the user's input.

Should I 301-redirect misspelled 404s to the correct URL?
Redirecting/301-ing 404s is a good idea when it's helpful to users (i.e. not confusing like soft 404s). For instance, if you notice that the Crawl Errors section of Webmaster Tools shows a 404 for a misspelled version of your URL, feel free to 301 the misspelled version of the URL to the correct version.

For example, if we saw this 404 in Crawl Errors:
http://www.google.com/webmsters  <-- typo for "webmasters"

we may first correct the typo if it exists on our own site, then 301 the URL to the correct version (as the broken link may occur elsewhere on the web):
http://www.google.com/webmasters
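
In code, this kind of redirect can be as simple as a lookup table. The Python sketch below is one possible shape, assuming a hand-maintained map of misspellings collected from Crawl Errors; REDIRECTS, KNOWN_PATHS, and respond are hypothetical names, and most sites would configure this in their web server instead.

```python
# Hypothetical map from misspelled paths (seen in Crawl Errors) to the correct path.
REDIRECTS = {"/webmsters": "/webmasters"}

# Hypothetical set of paths that exist on the site.
KNOWN_PATHS = {"/webmasters"}

def respond(path):
    """Return (status line, headers) for a requested path."""
    if path in REDIRECTS:
        # Permanent redirect: visitors and crawlers are sent to the right URL.
        return "301 Moved Permanently", [("Location", REDIRECTS[path])]
    if path in KNOWN_PATHS:
        return "200 OK", []
    # Anything else stays a genuine 404 and is not redirected.
    return "404 Not Found", []

print(respond("/webmsters"))
print(respond("/webmasters"))
print(respond("/no-such-page"))
```

Only known misspellings are redirected; every other missing URL still returns a 404, which avoids the soft 404 confusion mentioned above.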

Have you guys seen any good 404s?
Yes, we have! (Confession: no one asked us this question, but few things are as fun to discuss as response codes. :) We've put together a list of some of our favorite 404 pages. If you have more 404-related questions, let us know, and thanks for joining us for 404 week!
http://www.metrokitchen.com/nice-404-page
"If you're looking for an item that's no longer stocked (as I was), this makes it really easy to find an alternative."
-Riona, domestigeek

http://www.comedycentral.com/another-404
"Blame the robot monkeys"
-Reid, tells really bad jokes

http://www.splicemusic.com/and-another
"Boost your 'Time on site' metrics with a 404 page like this."
-Susan, dabbler in music and Analytics

http://www.treachery.net/wow-more-404s
"It's not reassuring, but it's definitive."
-Jonathan, has trained actual spiders to build websites, ants handle the 404s

http://www.apple.com/iPhone4g
"Good with respect to usability."
http://thcnet.net/lost-in-a-forest
"At least there's a mailbox."
-JohnMu, adventurous

http://lookitsme.co.uk/404
"It's pretty cute. :)"
-Jessica, likes cute things

http://www.orangecoat.com/a-404-page.html
"Flow charts rule."
-Sahala, internet traveller

http://icanhascheezburger.com/iz-404-page
"I can has useful links and even e-mail address for questions! But they could have added 'OH NOES! IZ MISSING PAGE! MAYBE TIPO OR BROKN LINKZ?' so folks'd know what's up."
-Adam, lindy hop geek

Read Full Story...
