[Home] [Headlines] [Latest Articles] [Latest Comments] [Post] [Mail] [Sign-in] [Setup] [Help] [Register]
United States News
Title: Dumb Asses That Use Facebook Will Now Have Their Comments Indexed by Google

Mind what you say in Facebook comments; Google will soon be indexing them and serving them up as part of the company's standard search results. Google's all-seeing search robots still can't find comments on private pages within Facebook, but now any time you use a Facebook comment form on another site, or on a public page within Facebook, those comments will be indexed by Google.

The new indexing plan isn't just about Facebook comments; it applies to nearly any content that's previously been accessible only through an HTTP POST request. Google's goal is to include anything hiding behind a form: comment systems like Disqus or Facebook, and other JavaScript-based sites and forms.

Typically, when Google announces it's going to expand its search index in some way, everyone is happy: sites get more searchable content into Google, and users can find more of what they're looking for. But that's not the case with the latest changes to Google's indexing policy. Developers are upset because Google is no longer the passive crawler it once was, and users will likely become upset once they realize that comments about drunken parties, embarrassing moments or what they thought were private details are going to start showing up next to their names in Google's search results.

For now, most of the ire seems limited to concerned web developers worried that Google's new indexing plan ignores the HTML specification and breaks the web's underlying architecture. To understand what Google is planning to do, and why it breaks one of the fundamental gentlemen's agreements of the web, you first have to understand how various web requests work. There are two primary requests you can initiate on the web: GET and POST. In a nutshell, GET requests are intended for reading data, POST for changing or adding data. That's why search engine robots like Google's have always stuck to GET crawling.
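The GET/POST distinction above can be sketched in a few lines of Python. This is just an illustration; the URL is a placeholder, and nothing is actually sent over the network here, we only inspect which HTTP method the standard library would use:

```python
from urllib.request import Request

# A request with no body defaults to GET: it only reads data,
# which is why crawlers have always considered it safe.
read_req = Request("https://example.com/comments")

# Attaching a body turns the same URL into a POST: it may change or
# add data on the server, which is what crawlers traditionally avoided.
write_req = Request("https://example.com/comments",
                    data=b"comment=hello+world")

print(read_req.get_method())   # GET
print(write_req.get_method())  # POST
```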
There's no danger of the Googlebot altering a site's data with GET; it just reads the page without ever touching the actual data. Now that Google is crawling POST pages, the Googlebot is no longer a passive observer; it's actually interacting with, and potentially altering, the websites it crawls. While it's unlikely that the new Googlebot will alter a site's data (as the Google Webmaster Blog writes, "Googlebot may now perform POST requests when we believe it's safe and appropriate"), it's certainly possible now, and that's what worries some developers. As any webmaster knows, mistakes happen, especially when robots are involved, and no one wants to wake up one day to discover that the Googlebot has wreaked havoc across their site.

If you'd like to stop the Googlebot from crawling your site's forms, Google suggests using the robots.txt file to disallow the Googlebot on any POST URLs your site might have. So long as you're surfacing your content in other ways (and you should be, provided you want it indexed), there shouldn't be any harm in blocking the Googlebot from POST requests. If, on the other hand, you'd like to stop the Googlebot from indexing any embarrassing comments you may have left on the web, well, you're out of luck.
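A robots.txt along the lines Google suggests might look like the sketch below. The `/comment` path is a made-up example standing in for whatever URL your site's forms actually POST to:

```text
# Hypothetical example: block Googlebot from the URL a comment form
# posts to, while leaving the rest of the site crawlable.
User-agent: Googlebot
Disallow: /comment

User-agent: *
Allow: /
```

Because robots.txt rules match URL paths rather than HTTP methods, this only works cleanly when your POST endpoints live at distinct paths from your readable pages.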
#2. To: A K A Stone (#0)
Your site is indexed by google as well.
#3. To: jwpegler (#2)
absent your real name.