Google Offers Robots.txt Generator
Google’s rolled out a new tool at Google Webmaster Central: a robots.txt generator. It’s designed to allow site owners to easily create a robots.txt file, one of the two main ways (along with the meta robots tag) to prevent search engines from indexing content. Robots.txt generators aren’t new. You can find many of them out there by searching. But this is the first time a major search engine has provided a generator tool of its own.
It’s nice to see the addition. Robots.txt files aren’t complicated to create.
You can write them using a text editor such as Notepad with just a few simple
commands. But they can still be scary or hard for some site owners to
contemplate.

To access the tool, log in to your Google Webmaster Tools account, then click on the Tools menu option on the left-hand side of the screen after you select one of your verified sites. You’ll see a “Generate robots.txt” link among the tool options. That’s what you want.
By default, the tool is designed to let you create a robots.txt file to allow
all robots into your site. That’s kind of odd. By default, all robots will come
into your site. If you want them, then there’s no need to have a robots.txt file
at all. It’s like pinning a note to your chest reminding yourself to breathe.
Promise, you’ll keep breathing even if you forget to look at the note.
Instead, you generally want to put up a robots.txt file to block crawling of some type. I may dig into this in a future article and examine when you might want to mix allow and disallow statements, but off the top of my head, there aren’t a lot of reasons to do so.
You can change the default option to “Block all robots” easily enough. Do
that, and you get the standard and familiar two-line keep-out code:
User-Agent: *
Disallow: /
The first line — User-Agent — is how you tell particular spiders or robots
to pay attention to the following instructions. Using the wildcard — * — says
“hey ALL spiders, listen up.”
The second line says what they can’t access. In this case, the / means to not
spider anything within the web site. You know how pages within a web site all
begin domain/something, like this:
https://website.com/page.html
See that / between website.com and page.html? Technically, that slash is the start of the URL path. So if you disallow all paths beginning with a slash, you’re blocking all pages within the entire site.
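If you’re curious how a well-behaved crawler interprets those two lines, here’s a quick way to check using Python’s standard urllib.robotparser module. This is just an illustrative sketch; the website.com URLs are placeholders:

from urllib.robotparser import RobotFileParser

# The two-line "keep everyone out" file from above
rules = """\
User-Agent: *
Disallow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Every path starts with /, so every URL on the site is off limits
print(rp.can_fetch("*", "https://website.com/page.html"))  # False
print(rp.can_fetch("*", "https://website.com/"))           # False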
Let’s move on from our mini-robots.txt 101 course. Maybe you only want to
block Google. Well, the tool is supposed to make this type of thing easy, but I
was perplexed. Step 1 is to either allow or block ALL robots. Then in step 2, you decide if you want to block specific robots. So which do you go with in step 1: allow all or block all?
I figured you’d want to allow all robots, then believe the reassuring text
next to that option that said “you can fine-tune this rule in the next step.”
The problem is, I couldn’t. If I tried to block Googlebot, the instructions
didn’t change. If I tried to choose, say, Googlebot-Mobile, same thing.
Eventually, I figured it out. If you decide to block specific spiders, you have to choose the spider, then also specify what you want to block in the “Files or directories” box. So say I kept all print-only versions of stories in a directory called /print. I’d enter that directory to get this:
User-Agent: *
Allow: /
User-Agent: Googlebot
Disallow: /print
Allow: /
The first part tells spiders they can access the entire site. As I said, this
is entirely unnecessary, but you get it anyway. The second part says that
Googlebot cannot access the /print area.
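To see that the more specific record really does take over for Googlebot while everyone else falls back to the catch-all, you can run the same file through urllib.robotparser. Again, just a sketch with placeholder URLs:

from urllib.robotparser import RobotFileParser

rules = """\
User-Agent: *
Allow: /
User-Agent: Googlebot
Disallow: /print
Allow: /
""".splitlines()

rp = RobotFileParser()
rp.parse(rules)

# Googlebot is kept out of /print but can crawl everything else...
print(rp.can_fetch("Googlebot", "https://website.com/print/story.html"))  # False
print(rp.can_fetch("Googlebot", "https://website.com/page.html"))         # True

# ...while any other crawler falls back to the catch-all record
print(rp.can_fetch("SomeOtherBot", "https://website.com/print/story.html"))  # True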
The tool lets you craft specific rules for particular Google crawlers, such as Googlebot and Googlebot-Mobile. I wish the names were accompanied by a parenthetical quickly explaining what each crawler does, and what blocking it will do. Instead, you have to look through the various help files to understand what each does. Ironically, the older Analyze Robots.txt tool within Google Webmaster Tools DOES have these helpful explanations, so I expect they’ll migrate over.
You can also use the tool to enter a name for another crawler. The problem
is, someone using this tool probably doesn’t know the crawler names out there
that they want to block. I’d have given Google serious kudos points if they added
some of the other major crawlers. But then again, if they had, no doubt someone
would have accused them of trying to get people to block other search engines 🙂
Another thing that would have been nice: letting people paste full URLs into the box and having the tool convert them. A site owner using this tool might not realize they need to drop the domain portion of a URL to block a particular page. It would have been handy if you could paste something like this:
https://website.com/page-i-want-to-block.html
and have the tool automatically turn it into this:
User-Agent: *
Disallow: /page-i-want-to-block.html
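That conversion is trivial to do yourself, by the way. Here’s a rough sketch of what it could look like in Python using the standard urllib.parse module. This is a hypothetical helper, not something the tool actually offers:

from urllib.parse import urlparse

def url_to_disallow(full_url):
    # Keep only the path portion; that's all robots.txt cares about
    path = urlparse(full_url).path or "/"
    return "Disallow: " + path

print(url_to_disallow("https://website.com/page-i-want-to-block.html"))
# Disallow: /page-i-want-to-block.html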
After you make your file, upload it to the root directory of your web site.
If you don’t know what that is, find someone who does! This is important. Google
allows for subdirectories of web sites to be registered within Google Webmaster
Tools. However, robots.txt files do NOT work on a subdirectory basis. They have
to go at the root level of a web site. If you don’t put them there, then you
won’t be preventing access to any part of the site. Remember, after you upload
to the root level, you can go back into Google Webmaster Tools and use that
aforementioned analysis tool to see if it is really blocking the pages you want
to keep out.
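If you’d rather double-check from your own machine as well, the same urllib.robotparser module can fetch the live file from your root and tell you whether a given page is blocked. A minimal sketch, with website.com and the /print path as placeholders:

from urllib.robotparser import RobotFileParser

# Point the parser at the robots.txt sitting at the site root
rp = RobotFileParser()
rp.set_url("https://website.com/robots.txt")
rp.read()  # fetches the file over HTTP

# Confirm the pages you meant to block really are blocked
print(rp.can_fetch("Googlebot", "https://website.com/print/story.html"))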
Overall, I’m glad to see the new tool, and I imagine it will improve over time to become even more user friendly.
In related news, Google says that the Web Crawl diagnostics area now has a new filter letting you see only web crawl errors related to sitemaps you’ve submitted. Also, there have been some UI tweaks to the iGoogle gadgets from Webmaster Central that were rolled out last month.
For more about Google’s webmaster tools, be sure to check out the quick start guide they offer and see our Google Webmaster Central archives.