I created a new website and do not want it crawled by search engines or shown in search results.
I created a robots.txt:
User-agent: *
Disallow: /
I have an HTML page, and I wanted to use
<meta name="robots" content="noindex">
but Google's documentation says this tag should only be used when the page is not blocked by robots.txt, because a crawler blocked by robots.txt will not see the noindex tag at all.
Is there a way I can use both noindex and robots.txt?
There are two solutions, neither of them elegant.
You are correct that if you use Disallow: /, your URLs might still appear in search results, just without a meta description and with a Google-generated title.
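To see why the noindex tag goes unread, here is a minimal sketch using Python's standard-library robots.txt parser as a stand-in for a well-behaved crawler. It is not Googlebot's actual implementation, and the example.com URL and the crawler_may_fetch helper are made up for illustration. The point is only that a crawler obeying Disallow: / never downloads the page, so it never gets to parse the meta tag inside it.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt from the question.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawler_may_fetch(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url.

    Hypothetical helper: it wraps the stdlib parser, which only approximates
    how a real search engine crawler interprets robots.txt.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL. A crawler that is not allowed to fetch the page
# cannot read the <meta name="robots" content="noindex"> inside it.
print(crawler_may_fetch(ROBOTS_TXT, "Googlebot", "https://example.com/index.html"))
```

With the robots.txt above this prints False for any path on the site, which is why the URL can still be listed (from links elsewhere) without Google ever seeing the noindex instruction.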
Assuming you are doing this temporarily, the recommended approach is to put basic HTTP auth in front of the site. This isn't great, since users will have to enter a username and password, but it will prevent the site from getting crawled and indexed.
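As a rough illustration of what basic HTTP auth looks like, here is a minimal sketch using only Python's standard library. Everything specific in it is an assumption: the credentials, the realm name, the port, and the helper names are invented, and on a real site you would normally switch this on in your web server or hosting panel rather than write it yourself. Basic auth also sends the password essentially in the clear, so it should sit behind HTTPS.

```python
import base64
import hmac
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Hypothetical credentials, for illustration only.
USERNAME = "preview"
PASSWORD = "change-me"


def is_authorized(header):
    """Return True if an Authorization header carries the expected Basic credentials."""
    if not header or not header.startswith("Basic "):
        return False
    try:
        decoded = base64.b64decode(header[len("Basic "):]).decode("utf-8")
    except (ValueError, UnicodeDecodeError):
        return False
    expected = f"{USERNAME}:{PASSWORD}"
    # Constant-time comparison of the decoded "user:password" pair.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))


class AuthHandler(SimpleHTTPRequestHandler):
    """Serves files from the current directory, but only to authenticated clients."""

    def _reject(self):
        # 401 plus WWW-Authenticate makes browsers show a login prompt.
        self.send_response(401)
        self.send_header("WWW-Authenticate", 'Basic realm="Preview"')
        self.end_headers()

    def do_GET(self):
        if is_authorized(self.headers.get("Authorization")):
            super().do_GET()
        else:
            self._reject()

    def do_HEAD(self):
        if is_authorized(self.headers.get("Authorization")):
            super().do_HEAD()
        else:
            self._reject()


def run(port=8000):
    """Start the server on the given (assumed) port; blocks until interrupted."""
    HTTPServer(("", port), AuthHandler).serve_forever()
```

Calling run() serves the current directory on port 8000. A crawler that requests any URL without credentials only ever receives the 401 response, so there is no page content for it to crawl or index.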
If you can't or don't want to put basic auth in front of the site, the alternative is to keep Disallow: / in the robots.txt file and use Google Search Console to regularly purge the Google index by requesting that the site be removed from it.
This is inelegant in multiple ways:
- You'll have to monitor the search results to see whether your URLs have been indexed.
- You'll have to manually request removal in Google Search Console.
- Google didn't intend the removal feature to be used in this fashion, and who knows whether they'll start ignoring such requests over time. I'd imagine it will continue to work, though they'd prefer you didn't use it this way.