Prevent Google indexing an AngularJS route

Usually, if I didn't want Google to crawl a page, I would add the page to my robots.txt file like so:

User-agent: *
Disallow: /my-page

To prevent Google from indexing that page, I would remove the page from my sitemap.xml and add the following meta tag to the <head> of the page:

<meta name="robots" content="noindex">

Now, if I use AngularJS to handle all routing for a single page application, how do I stop Google from indexing and/or crawling a route? Angular loads the content for each route into ng-view, so the information in the <head> stays the same on every route. I don't think I can add the meta tag in this case.

If your root module is placed on the <html> tag (<html ng-app="myApp">), everything in the <head> is inside the app, so you can modify any of its properties. That allows you to set the robots <meta> dynamically for each route. You can do that by listening for the $routeChangeSuccess event in your root module. If you are using ui-router, you could set a 'data' property on each state and read it on every state change.

You could also update this value from other modules by writing to $rootScope directly, but that is not good practice. A better way is to fire an event from child controllers/directives and have the root module listen for it.
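As a rough sketch of the $routeChangeSuccess approach, assuming ngRoute is loaded: the `noindex` route property, the `robotsContentFor` helper, the `robots` scope property, and the example routes are all names I made up for illustration, not part of AngularJS.

```javascript
// Hypothetical helper: choose the robots directive for a route definition.
// Routes opt out of indexing with a custom `noindex: true` property (assumed name).
function robotsContentFor(route) {
  return route && route.noindex ? 'noindex' : 'index, follow';
}

// Wire it up only when AngularJS (with ngRoute) is present on the page.
if (typeof angular !== 'undefined') {
  angular.module('myApp', ['ngRoute'])
    .config(['$routeProvider', function ($routeProvider) {
      $routeProvider
        .when('/', { template: '<h1>Home</h1>' })
        // Custom properties on a route definition are readable from the
        // `current` route object passed to $routeChangeSuccess.
        .when('/my-page', { template: '<h1>Private</h1>', noindex: true });
    }])
    .run(['$rootScope', function ($rootScope) {
      $rootScope.$on('$routeChangeSuccess', function (event, current) {
        // Recomputed on every route change, so leaving /my-page resets the value.
        $rootScope.robots = robotsContentFor(current);
      });
    }]);
}
```

With <html ng-app="myApp">, the tag in the <head> could then be bound as <meta name="robots" content="{{robots}}">.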

I have an example that changes the page <title> dynamically. It is a bit more complex because that app is bootstrapped manually, but you can read it as if there were ng-app="" and ng-controller="" directives on the <html> tag.

Here is the state change event: https://github.com/danmindru/angular-boilerplate-study/blob/master/src/app/_app-main/_app-main.controller.js#L14-L24

Here is the listener for the broadcast: https://github.com/danmindru/angular-boilerplate-study/blob/master/src/app/_app-main/_app-main.controller.js#L38-L40

Here's how the broadcast is triggered: https://github.com/danmindru/angular-boilerplate-study/blob/master/src/app/profile-feature/customer-page/customer-page.controller.js#L12

Here's the <title> binding: https://github.com/danmindru/angular-boilerplate-study/blob/master/src/index.html#L4
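The pattern behind those links, applied to the robots tag instead of the <title>, might look roughly like this. It is only a sketch: the event name, module name, controller names, and the `onRobotsChange` helper are hypothetical, and I use $emit here because the event has to travel upward from a child scope; the linked project may do it differently.

```javascript
// Hypothetical event name; any unique string would do.
var ROBOTS_EVENT = 'robots:change';

// Hypothetical handler factory for the root controller:
// stores the new directive on the given scope.
function onRobotsChange(scope) {
  return function (event, content) {
    scope.robots = content;
  };
}

// Wire it up only when AngularJS is present on the page.
if (typeof angular !== 'undefined') {
  // 'robotsEvents' is an assumed module name that the root module would depend on.
  angular.module('robotsEvents', [])
    // Meant to sit on the <html> tag via ng-controller.
    .controller('RootController', ['$scope', function ($scope) {
      $scope.robots = 'index, follow';
      $scope.$on(ROBOTS_EVENT, onRobotsChange($scope));
    }])
    // Controller of a view that should not be indexed.
    .controller('PrivatePageController', ['$scope', function ($scope) {
      // $emit travels up the scope hierarchy to the root controller's scope.
      $scope.$emit(ROBOTS_EVENT, 'noindex');
    }]);
}
```

Note that the value stays at 'noindex' until something sets it back, so the root controller would also need to reset it on each route or state change.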

However, Google is not that good at reading values that are set by JavaScript after the page loads, so you would have to use a pre-rendering service to make sure Googlebot parses <meta name="robots" content="noindex"> rather than the raw template, something like <meta name="robots" content="{{index}}">.