Usually, if I didn't want Google to crawl a page, I would add the page to my robots.txt file like so:
User-agent: *
Disallow: /my-page
To prevent Google from indexing that page, I would remove it from my sitemap.xml and add the following meta tag to the <head> of the page:
<meta name="robots" content="noindex">
Now, if I use AngularJS to handle all routing for a single-page application, how do I stop Google from indexing and/or crawling a route? Angular loads the content for each route into the ng-view, so the information in the <head> remains the same on every route. I don't think I can add the meta tag in this case.
If your root module is placed on the <html> tag (<html ng-app="myApp">), you can modify any property in the <head>. That allows you to set the robots <meta> tag dynamically for each page. You can do that by listening for the $routeChangeSuccess event in your root module. If you are using ui-router, you could set a 'data' property on the state and read it on every state change. You could also use $rootScope to update this value from other modules, but that's not good practice. A better approach is to broadcast a change to your root module from child controllers/directives.
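A minimal sketch of the $routeChangeSuccess approach, assuming ngRoute, a module named `myApp`, a hypothetical `RobotsController` placed on the <html> tag, and a made-up `noindex` flag on each route definition (none of these names come from the linked app). The `robotsContentFor` helper is kept outside Angular so it can be checked on its own:

```javascript
// Hypothetical helper: derive the robots meta content from a route object.
// Assumes routes opt out of indexing with a custom `noindex: true` property.
function robotsContentFor(route) {
  return route && route.noindex ? 'noindex' : 'index, follow';
}

// Register with AngularJS only when it is loaded (i.e. in the browser).
if (typeof angular !== 'undefined') {
  angular.module('myApp', ['ngRoute'])
    .config(['$routeProvider', function ($routeProvider) {
      $routeProvider
        // `noindex` is a custom property, not part of the ngRoute API.
        .when('/my-page', { templateUrl: 'my-page.html', noindex: true })
        .otherwise({ templateUrl: 'home.html' });
    }])
    .controller('RobotsController', ['$scope', function ($scope) {
      // `current` is the newly activated route; custom properties passed
      // to `when()` can be read from it.
      $scope.$on('$routeChangeSuccess', function (event, current) {
        $scope.robots = robotsContentFor(current);
      });
    }]);
}
```

With `ng-controller="RobotsController"` next to `ng-app` on the <html> tag, the head could then bind the value with something like `<meta name="robots" content="{{robots}}">`.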
I have an example that changes the page <title> dynamically, but it is a bit more complex because the app is bootstrapped manually. For simplicity, imagine there are ng-app="" and ng-controller="" directives on the <html> tag.
Here is the state change event: https://github.com/danmindru/angular-boilerplate-study/blob/master/src/app/_app-main/_app-main.controller.js#L14-L24
Here is the listener for the broadcast: https://github.com/danmindru/angular-boilerplate-study/blob/master/src/app/_app-main/_app-main.controller.js#L38-L40
Here's how the broadcast is triggered: https://github.com/danmindru/angular-boilerplate-study/blob/master/src/app/profile-feature/customer-page/customer-page.controller.js#L12
Here's the <title> binding: https://github.com/danmindru/angular-boilerplate-study/blob/master/src/index.html#L4
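For readers who don't want to follow the links, the same listener/broadcast pattern applied to the robots tag might look roughly like this. It is only a sketch: the event name `robots:change` and the controller bodies are hypothetical, and the linked repository may wire things up differently (here the child uses `$emit`, which travels up the scope tree to the root controller).

```javascript
// Root controller, assumed to sit on the <html> tag next to ng-app.
function AppMainController($scope) {
  $scope.robots = 'index, follow'; // default for indexable pages

  // Child controllers/directives notify the root through an event
  // instead of writing to $rootScope directly.
  $scope.$on('robots:change', function (event, content) {
    $scope.robots = content;
  });
}
AppMainController.$inject = ['$scope'];

// Hypothetical child controller for a route that should not be indexed.
function CustomerPageController($scope) {
  $scope.$emit('robots:change', 'noindex');
}
CustomerPageController.$inject = ['$scope'];

// Register with AngularJS only when it is loaded (i.e. in the browser).
// Assumes the `myApp` module has already been created elsewhere.
if (typeof angular !== 'undefined') {
  angular.module('myApp')
    .controller('AppMainController', AppMainController)
    .controller('CustomerPageController', CustomerPageController);
}
```

A real app would also need to reset the value when navigating away, for example in the state change handler, so that a `noindex` set by one route does not leak into the next.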
However, Google is not very good at evaluating these dynamic bindings, so you would have to use a pre-rendering service to make sure Googlebot parses <meta name="robots" content="noindex"> instead of something like <meta name="robots" content="{{index}}">.