blog
HOME · CREATIVE · WEB · TECH · BLOG

spiders - Tag Archive

Saturday, April 28th, 2007

I feel sorry for Google…

I managed to botch the launch of over 166,000 pages, and GoogleBot just dealt with it…

Read More...

Tuesday, April 17th, 2007

Be Careful: Robots.txt Is Case Sensitive

Controlling spiders on your site can be difficult. They may access pages you never intended because they treat the paths in robots.txt as case sensitive - even when your site itself is not case sensitive…
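
To make the case-sensitivity point concrete, here is a minimal sketch using Python's standard `urllib.robotparser`, which (like the spiders described above) compares paths exactly as written. The robots.txt contents, the `/private/` path, and the `ExampleBot` agent name are all made up for illustration; real crawlers may differ in details.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the owner wants /private/ kept away from crawlers.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, path: str, agent: str = "ExampleBot") -> bool:
    """Return True if `agent` may fetch `path` under the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    # Same directory on a case-insensitive server, two different spellings.
    for path in ("/private/report.html", "/Private/report.html"):
        print(path, is_allowed(ROBOTS_TXT, path))
```

On a server that treats `/private/` and `/Private/` as the same directory, the second spelling slips past the rule, so a rule written in one case does not cover the others.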

Read More...

Tuesday, April 17th, 2007

Google’s "As-It-Happens" Alert Isn’t As It Happens

Google Alerts is a wonderful service - but you don’t always get what you expect when you set up a Google alert. Case in point - old pages in an “as-it-happens” alert…

Read More...

Sunday, April 15th, 2007

Keep Control of Your Feed Subscribers AND Use FeedBurner

The URL you give out for your feed matters. Ideally you want people to subscribe to a URL that’s on your domain, yet you also want to use a service like FeedBurner to make your feed more powerful and track its usage. Now you can do both, but it requires a few server configuration tricks…

Read More...

Friday, April 13th, 2007

Media/Image Crawlers Need to See HTML Pages

When crafting your robots.txt file, don’t forget that the search engines have specialized spiders that crawl for image search. These spiders need to see not only the image file, but also the page it is used on.
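
One way to sanity-check a robots.txt against this rule is sketched below with Python's standard `urllib.robotparser`. The agent name `ExampleImageBot`, the two rule sets, and the `/images/` and `/articles/` paths are assumptions for illustration only, not any search engine's actual crawler or layout.

```python
from urllib.robotparser import RobotFileParser

IMAGE_BOT = "ExampleImageBot"  # hypothetical image-search spider

# Hypothetical rules that hide the HTML pages from the image spider.
BLOCKS_PAGES = """\
User-agent: ExampleImageBot
Disallow: /articles/
"""

# Hypothetical rules that leave both images and pages open to it.
ALLOWS_BOTH = """\
User-agent: ExampleImageBot
Disallow:
"""


def can_see_image_and_page(robots_txt, image_path, page_path, agent=IMAGE_BOT):
    """True only if the agent may fetch both the image and the page using it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, image_path) and parser.can_fetch(agent, page_path)


if __name__ == "__main__":
    for name, rules in (("BLOCKS_PAGES", BLOCKS_PAGES), ("ALLOWS_BOTH", ALLOWS_BOTH)):
        ok = can_see_image_and_page(rules, "/images/photo.jpg", "/articles/post.html")
        print(name, ok)
```

The check passes only when both fetches are permitted, which mirrors the requirement that the spider reach the image and the page that embeds it.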

Read More...