A Change to Hubspot Blog Topic Behaviour That Hubspot Users Should Know About
UPDATE: In January 2018 Hubspot updated their approach to canonicals and now provide users with some control over canonical URLs for blog pages and blog listing pages (I’d still like more granular control over topic/tag page canonicals): https://www.hubspot.com/product-updates/edit-the-canonical-url-of-a-page
If you or your clients use the Hubspot blogging software and make use of its topic pages (similar to category pages in WordPress), here’s something you should know about and take a close look at.
At some point in the past few months, Hubspot’s topic pages have changed how they use canonical tags. This was an unannounced change (which I feel is poor form), and it could mean you suddenly have a lot of blog topic pages in Google’s index where you didn’t before. For those with well-organized topic pages, this could well be a great move; for many others it could be a source of significant index bloat and duplicate content, and it may have affected the distribution of some of your rankings. In the image above, you can see the jump in indexed pages after this change in canonical behaviour.
When I first set up this particular client’s blog on Hubspot earlier this year, I made a point of looking at how Hubspot handles topic pages, which, like WordPress category pages, can be a source of large amounts of duplicate content when categories or topics are not well managed. Of course, properly used, category pages can also be a great way to rank for broader category/topic keywords. Previously Hubspot’s support documentation read as follows:
“Topics and pagination listing pages have their canonical URL tag set to the URL of your main listing page. For example, http://blog.hubspot.com/marketing/topic/marketing-automation is a listing page for a particular blog topic. This page automatically includes the following meta tag:
<link rel="canonical" href="http://blog.hubspot.com/marketing"/>”
The documentation has now been updated to state:
“Topics and pagination listing pages have their canonical URL tag set to the URL of the topic page itself. For example, http://blog.hubspot.com/marketing/topic/marketing-automation is a listing page for a particular blog topic. This page automatically includes the following meta tag:
<link rel="canonical" href="http://blog.hubspot.com/marketing/topic/marketing-automation"/>”
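If you want to check which of the two behaviours a given topic page shows, a small script can read the canonical tag out of the page source and compare it with the page's own URL. The following is only a minimal sketch using Python's standard library. The names `CanonicalFinder`, `find_canonical` and `is_self_canonical` are my own and not part of any Hubspot tooling, and the HTML strings are trimmed-down stand-ins built from the two tags quoted above rather than real page source. Fetching the live page is left out.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <link .../> are also routed here.
        if tag != "link" or self.canonical is not None:
            return
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr_map.get("href")


def find_canonical(html):
    """Return the canonical URL declared in the HTML, or None if absent."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


def is_self_canonical(page_url, html):
    """True if the page declares itself (ignoring a trailing slash) as canonical."""
    canonical = find_canonical(html)
    if canonical is None:
        return False
    return canonical.rstrip("/") == page_url.rstrip("/")


# Stand-in page sources built from the two documented tags (assumed, not fetched).
topic_url = "http://blog.hubspot.com/marketing/topic/marketing-automation"
old_html = '<head><link rel="canonical" href="http://blog.hubspot.com/marketing"/></head>'
new_html = (
    '<head><link rel="canonical" '
    'href="http://blog.hubspot.com/marketing/topic/marketing-automation"/></head>'
)

print(is_self_canonical(topic_url, old_html))  # False: points at the main listing page
print(is_self_canonical(topic_url, new_html))  # True: points at the topic page itself
```

A result of `True` for your own topic pages would suggest they are now eligible to be indexed in their own right.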
I’m a Hubspot partner and I love Hubspot, but I feel like this is too big a change to just gloss over. Hubspot users should be armed with this knowledge to decide for themselves if this is the right approach for them. I have spoken with Hubspot support about this, but they didn’t seem to think this was a significant issue. (Update July 28th 2015: I have had further conversations with the Hubspot support team and they are looking into this further.)
So what can you do with this information? I would suggest:
- Check your Google Search Console (GSC) reports. Did you see an increase in indexed blog pages recently? Is GSC reporting an increase in duplicate page titles and meta descriptions?
- Figure out if this change is helping or hurting your search visibility. Have topic pages taken keyword rankings from your individual posts? Is this desirable? Have you seen any kind of recent drop in keyword rankings, organic traffic, or in user engagement?
- If you are happy with your topic pages being indexed, then do everything you can to make these topic pages unique. Avoid using large numbers of topics and cross-posting the same posts across many different topics. If you do reduce your number of topics, be careful what you prune: check how each topic page is performing in terms of rankings and traffic first. Make sure to redirect removed topic pages to their new equivalent (http://knowledge.hubspot.com/site-pages-user-guide/how-to-use-the-url-mapping-tool-to-redirect-pages).
- If you don’t want your topic pages indexed, the best you can do to avoid this at the moment is to exclude your topic pages via robots.txt. See this Hubspot article on how to edit your robots.txt file, but of course, make sure you know what you are doing before adding or changing anything in this file: http://knowledge.hubspot.com/site-pages-user-guide/how-to-customize-the-robotstxt-file
Note that Google may still list your topic page URLs in their index even if you block them via robots.txt. The only way to remove these URLs completely is to noindex the pages and perhaps request their removal via GSC. However, Hubspot doesn’t currently allow us to add a noindex directive to these topic pages.
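Before saving a robots.txt change, it can be worth testing the rule against a few URLs to confirm it blocks topic pages without also blocking your posts. Here is a rough sketch using Python's built-in `urllib.robotparser`. The `Disallow` path is an assumption based on the `/marketing/topic/` pattern in the example URL from Hubspot's documentation, so adjust it to match your own blog. `is_crawlable` is a hypothetical helper name and `some-post` is a made-up post URL.

```python
from urllib.robotparser import RobotFileParser

# Assumed rule: topic pages live under /marketing/topic/ as in the documented example.
ROBOTS_TXT = """\
User-agent: *
Disallow: /marketing/topic/
"""


def is_crawlable(robots_txt, url, user_agent="Googlebot"):
    """Return True if the given robots.txt text lets user_agent fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The topic page should be blocked; the (made-up) post URL should stay crawlable.
print(is_crawlable(ROBOTS_TXT, "http://blog.hubspot.com/marketing/topic/marketing-automation"))
print(is_crawlable(ROBOTS_TXT, "http://blog.hubspot.com/marketing/some-post"))
```

This only confirms how the rule is interpreted for crawling. As noted above, a blocked URL can still appear in Google's index.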
Still not the solution you are looking for? Then upvote this idea to allow users to choose to noindex topic pages or to choose which approach to canonical URLs they prefer: