This may come as a surprise to you, but I’m pretty conservative and behind-the-times when it comes to trying out new technology and online services. I created my first blog in 2003 on LiveJournal, only to discover that everyone I knew had been using it for a few years. I moved to WordPress in 2006 and have been using it ever since for my blog. I tested Facebook out when it launched (back before it was publicly available) and didn’t think it was all that exciting or would go anywhere. I never got into MySpace. Admittedly, I’ve had my Twitter account for quite some time, but I only signed up for Instagram a short while ago.
I think about these things, and try to find a use-case for my life before signing up. I might test something out, but it really needs to fit a niche that isn’t being met by another service. And much to my surprise, Tumblr is fitting into my set of online tools really nicely.
I’ve spent a lot of quality time with the Google Webmaster Tools (GWT) this week, and it has been an altogether frustrating and enlightening experience. The bottom line is that it is showing my site as having a lot of errors of the 404 – Not Found variety, and this caused a bit of concern because 250+ of those have got to be hurting my search engine ranking.
It is additionally frustrating because I’ve gone to great lengths to prevent this very sort of thing from happening. I use Robots Meta to prevent certain pages from being indexed by search engines, All In One SEO to create meta data, and Redirection to make sure modification or deletion of posts doesn’t cause any disruption. And yet, there they are, staring me in the face. A bunch of pages that can’t be found and are returning errors. First, I’m going to talk about where these errors came from–because not all errors are equal–and whether they actually need to be fixed or not. Second, I’ll let you in on the secret to 404s and SEO.
What causes the errors?
GWT itself admits that not all errors are really a problem, with this note:
Note: Not all errors may be actual problems. For example, you may have chosen to deliberately block crawlers from some pages. If that’s the case, there’s no need to fix the error.
If you have deleted a post or page, updated your sitemap, and you consider the case closed, you probably don’t need to worry about it. Eventually Google will stop trying to reach the link and the error will disappear all on its own. The problem is if you have other pages on your site that link to those you have deleted. GWT will tell you what those pages are, and you should edit them to remove the offending links.
This is probably the most benign of the errors because you can see it coming. Others are more mysterious.
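If you would rather not click through every report entry by hand, the check for leftover links can be roughed out in a few lines of Python. This is only a sketch under stated assumptions: the `pages` dictionary (page URL to HTML source), the `example.com` addresses, and the helper names `LinkCollector` and `find_stale_links` are all made up for illustration and have nothing to do with GWT or WordPress themselves.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag in an HTML document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def find_stale_links(pages, deleted_urls):
    """Map each page URL to the deleted URLs it still links to.

    pages: dict of page URL -> HTML source (hypothetical input).
    deleted_urls: set of absolute URLs that now return 404.
    """
    stale = {}
    for page_url, html in pages.items():
        collector = LinkCollector()
        collector.feed(html)
        # Resolve relative links against the page they appear on.
        resolved = {urljoin(page_url, href) for href in collector.hrefs}
        hits = sorted(resolved & deleted_urls)
        if hits:
            stale[page_url] = hits
    return stale


# Hypothetical sample data: post A still links to the deleted post B.
pages = {
    "https://example.com/post-a/": (
        '<p>See <a href="/post-b/">post B</a> and '
        '<a href="/post-c/">post C</a>.</p>'
    ),
    "https://example.com/post-c/": "<p>No links here.</p>",
}
deleted = {"https://example.com/post-b/"}

print(find_stale_links(pages, deleted))
```

The result lists only the pages that need editing, which is the same information GWT gives you, just in a form you can run against an export of your own posts.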
Related Posts Plugin
Similar to the last, Related Posts plugins (I use YARPP and rather like it) generate a ton of internal links on your site. These links aren’t generally set to nofollow because 1) they’re internal and 2) if you delete a post, Related Posts will update automatically and won’t link to the deleted post anymore. Unfortunately, Google has already indexed that Page A links to Page B, so when Page B gets deleted, Google decides there’s an error. This, too, will pass in time as Google catches up, but it’s something of which you should be aware.
Back-end or Codeish Errors
I have no idea what causes these or where they come from, but GWT claims that a lot of my pages are linking to things that simply don’t exist. Namely, some pages are supposedly linking to */function.include, but near as I can tell, there are no links on the originating page that point at */function.include. This would point to there being a problem with the theme I’m using–maybe it has some code pointing to the wrong place and that’s throwing errors–but if that were the case, the errors should be happening from every single page, not just a few.
I went through and manually removed these links from Google’s index, but I’m skeptical of that solution. I’d rather know what is causing it and get it fixed, but this issue is so perplexing that I don’t know how. The good news is that actual users of the site aren’t attempting to follow these links because they don’t really exist on the page, so while the crawler may have trouble, the readers won’t.
This one is more because I’m spastic than anything else. For those of you who have followed this site for a while, you might recall that it has undergone significant changes in the last four years. I’ve gone from WordPress to Mambo!+WordPress to Joomla!+WordPress and then back to WordPress exclusively. I have created a dozen different sub-sites, spin-off blogs, forums, wikis, etc., and consequently deleted those blogs and come back to just having the one centralized site.
As such, I should have gone back and edited my robots.txt to exclude… well, pretty much everything. I’ve done that now, in addition to removing those links from Google’s index, so hopefully that will take care of it.
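Before trusting a new robots.txt, it can help to test that the rules block what you think they block. The sketch below uses Python's standard `urllib.robotparser` module. The directory names in `ROBOTS_TXT` are placeholders, not the actual paths of the old sub-sites, and `blocked_paths` is a hypothetical helper.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: these directory names are placeholders only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/
Disallow: /wiki/
Disallow: /old-blog/
"""


def blocked_paths(robots_txt, paths, agent="*"):
    """Return the paths that the given robots.txt text forbids for agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [path for path in paths if not parser.can_fetch(agent, path)]


# The retired sections should be blocked; a regular post should not be.
candidates = ["/forum/topic-1", "/wiki/Home", "/2009/01/a-post/"]
print(blocked_paths(ROBOTS_TXT, candidates))
```

If a regular post shows up in the output, a Disallow rule is too broad and should be tightened before the file goes live.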
Combining WordPress blogs
When I closed the blogs I mentioned above, I usually imported their posts into my primary site. This causes so many headaches if you’re not careful, so be prepared to sort out the kinks. GWT’s ability to tell you where the errors are happening is great for going back and editing posts to remove or update links, but it’s definitely a manual process. There is simply no way around fixing this stuff: you’re going to have to set aside a block of time, sit down, and get it right.
This one originally perplexed me, as I had pages and pages of errors due to pagination. This is where you’re browsing through the site and you’re on */page/108, and you can go to either */page/107 or */page/109. When I was typing this, it finally hit me what caused this: going from a single blog post on each page to 5 or 10. I suddenly have fewer pages, but Google hasn’t caught up yet and is still trying to hit those old links. It’ll learn eventually.
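The arithmetic behind those pagination errors is easy to sketch. The numbers here are assumptions chosen for illustration (300 posts, moving from 1 post per page to 10), not figures taken from the actual error report.

```python
import math


def page_count(total_posts, posts_per_page):
    """Number of archive pages needed to list every post."""
    return math.ceil(total_posts / posts_per_page)


def stale_page_numbers(total_posts, old_per_page, new_per_page):
    """Page numbers that existed under the old setting but not the new one."""
    old_pages = page_count(total_posts, old_per_page)
    new_pages = page_count(total_posts, new_per_page)
    return range(new_pages + 1, old_pages + 1)


# Hypothetical example: 300 posts, switching from 1 to 10 posts per page.
stale = stale_page_numbers(300, 1, 10)
print(len(stale), stale[0], stale[-1])  # 270 31 300
```

In that example every address from */page/31 through */page/300 stops existing at once, which is how a single settings change can produce hundreds of Not Found errors.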
So, do 404s hurt SEO?
That depends, as I alluded to above, on whether they are internal or external links that are Not Found. Search engines won’t penalize you if other sites link incorrectly to your content and those links can’t be followed. If they did penalize you for that, then spammers or trolls could create sites with massive amounts of broken links to any site they wanted and drop its pagerank immediately. This obviously wouldn’t be fair, and thankfully search engines don’t work that way. Regardless, it is best to have a custom 404 page to deal with external links that 404. The key is making sure that actual people (rather than bots or crawlers) find your site helpful and get to the information they need/want.
Internal 404s will most certainly cause harm, and that’s where GWT can be of great benefit. By displaying not just the pages that can’t be found but also the pages that link to them, it helps you find the offending pages and fix them. As far as search engines are concerned, if your site can’t maintain internal link integrity, it isn’t trustworthy or helpful, so why would they send people your way? If Google started sending people to a bunch of broken sites that didn’t work well, people would stop trusting Google to provide good search results and they’d use a different search provider. That’s why the search engine checks to make sure sites are holding up and working well, and if the site isn’t, its pagerank will drop.
Maintaining internal link integrity is essential, not just for SEO, but also for keeping your readers happy. If someone clicks on a link on your site that goes to your site, they expect that link to work. When it doesn’t, no custom 404 page is going to make them happy. They might accept one error, but beyond that they’re more likely to just surf away.
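Since the internal errors are the ones worth your time, a first pass over a list of broken links can simply sort them by where the link lives. This is a minimal sketch that assumes you have the report as pairs of (linking page, missing page). The `example.com` hostnames and the `split_by_origin` helper are hypothetical.

```python
from urllib.parse import urlsplit


def split_by_origin(site_host, broken_links):
    """Separate broken links found on your own site from those found elsewhere.

    broken_links: iterable of (linking_page_url, missing_url) pairs.
    Returns (internal, external) lists of the same pairs.
    """
    internal, external = [], []
    for source, target in broken_links:
        host = (urlsplit(source).hostname or "").lower()
        if host == site_host.lower():
            internal.append((source, target))  # your own page: fix the link
        else:
            external.append((source, target))  # someone else's page
    return internal, external


# Hypothetical report entries.
report = [
    ("https://example.com/post-a/", "https://example.com/post-b/"),
    ("https://other.example.org/links", "https://example.com/post-b/"),
]
internal, external = split_by_origin("example.com", report)
print(len(internal), "internal,", len(external), "external")
```

The comparison is an exact hostname match, so a site served from both `www.example.com` and `example.com` would need both names treated as internal.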
While it would be ideal to never generate errors, chances are you’ll have at least a few if you’ve been around for a while and actually do something with your website. After 4+ years of active development and changes and well over 300 blog posts in just the last year and a half, these things happen, so I’m going to try to not let them get me down. Use the Google Webmaster Tools to your benefit and get your errors sorted. The work will be worth it in the end, and both the crawlers and your users will be happier when they are able to breeze through without hitting brick walls.
And once you get them taken care of, make sure to check back with GWT regularly to make sure the problem never gets out of hand. Once I get this all fixed, I’ll be logging into GWT at least once a week to make sure nothing new has cropped up. I am confident that my pagerank will benefit from the diligence, and it’ll make my readers happier to have a site that functions entirely as it should. For that happiness, it is well worth the extra work.
This post is part of an ongoing series of collaborative conversations. See that initial post for a table of contents of all articles in the series.
I was recently having a conversation with a young photographer I know about his aspirations for having a fancy new website designed. He was looking at spending a decent amount of cash to have something really slick put together for his photo gallery, and though the company was going to charge him a reasonable rate for that level of design work and manageability (meaning that it would be easily updated by the photographer himself), I wasn’t sure spending that much money on a website was a good idea at this point in his career. Though a fancy website is nice and will help accent, present, and convey your material, it is secondary to the material itself.
I read an article several years ago that looked with great curiosity at a number of online businesses that seemed to be succeeding despite their best efforts. These businesses had ugly, poorly formatted websites with outdated modes of communication and little information about their business or product. Designed in a style I usually refer to as “Angelfire-esque” or “Geocities ghetto,” the independent owners had put together something on the web that looked similar to what a cat might produce after eating too fast. They had a product, but they had no idea how to market it on the web.
And yet, they were succeeding. They were doing business online and turning a decent profit, to the confusion of everyone else who felt that a great design was needed to make your voice heard.
When surveying their customers, the journalist discovered that the people ordering goods from these sites actually preferred the poor design. It communicated to the customer that the owner cared less about a fancy website and more about them, the customers; that they spent more time on their product than on marketing; and that the end-result was higher quality service and goods.
I would never go so far as to say that this is always the case. Rather, I tend to think that if you are a seller of repute and quality, all aspects of your business should be of similar quality, and that extends to your website. But I do think the story highlights something that a lot of people are beginning to forget: the Content is More Important than the Wrapper.
Yes, a good design will help sell your product better, and once you’ve got a good product, your next step should be a good marketing approach and/or website design. If your product is no good, though, the fanciness of your website becomes irrelevant.
I have known numerous photographers, webcomic artists, and authors whose websites were little more than a page with a single picture and the most rudimentary of navigation, or maybe they just threw their work onto a Blogger account (note: I personally detest Blogger and highly recommend WordPress as an alternative), and yet they were remarkable successes. This is because their work was of high quality and appealed to people. The content was good, so the wrapper or site design didn’t matter as much.
And generally speaking, once you’ve got the audience and fans, things move of their own accord and you eventually get a nicer website. But no one starts at the top, and likewise it probably isn’t wise to invest like you’re already there when you’re not.
A beginning musician doesn’t buy a five-million-dollar Stradivarius violin, just like a beginning photographer doesn’t learn how to shoot photos on a ten-thousand-dollar camera and a beginning author usually has nothing but a pen and paper. We all have to start somewhere and learn what we’re doing. We move up to the higher quality tools as we learn how to use them most effectively. Eventually, we reach a point where our work demands a better toolset, and we adjust accordingly.
But just because you have a Stradivarius doesn’t mean you can play like a master, and just because you have spent a few thousand dollars on a site doesn’t mean you’ll instantly have a booming business. So start small and focus on the quality of your product. Your customers will be attracted by your work, and they’ll be more attracted if they know that your focus is on them, not on yourself or your site. Put your work and your fans first and the rest will fall into place.
When I was a wee lad, I was quite unpopular at school. Regularly picked on, beaten up, and mocked, I was known to be a pushover, and the other kids could get away with whatever torture they devised for me. The problem was that I was trying to be everyone’s friend, to please everyone, and consequently I attempted to become whatever anyone wanted me to be. But because I didn’t know how to become what they wanted, I was just an uncool, dorkish poser, painting a big target on his chest for the barbs of others.
Sometime late in 7th grade, though, I snapped and decided to be my own person. Screw them, I thought, I’m going to figure out what I want and do it; who cares what they think? And, much to my surprise, the mocking stopped. Within a year I was, if not popular, at least respected. When I stopped trying to be everything to everyone and became my own person, I was finally recognized as such.
I say this by way of introduction to Unix. A common mandate or philosophy of Unix and Unix software/commands is to do only one thing but do it very well. Too many software companies try to make their product do everything, or try to please all of their customers, and what they end up doing is making something overly complex that nobody can use or even likes. By trying to do everything, they end up doing nothing.
We can take some obvious life lessons from this, but also find some guidance regarding the tools we settle on. In building this site, as well as my other work resources, I try to find those things that do their job simply and well. WordPress is simply the best blogging software I’ve found, and since this site is primarily a blog, it’s my tool. It’s not nearly as powerful as Joomla!, but it works significantly better. Just like Zenphoto doesn’t have all the capabilities of Coppermine, or PunBB is considerably leaner than phpBB, it does one thing and does it well. Coppermine and phpBB are bloated and difficult to work with… so I don’t. If a tool makes my life more difficult, requiring more of my time than it saves, then it fails and isn’t worth using.
I am finally beginning to learn what I do well, and I’m going to focus on that. Throughout this week, I’ll be writing about some of the changes happening both in my life and at SilverPen Publishing. I’m taking steps for what I am calling SilverPen Pub rev. 3.1. Revision 3 began around the end of August 2007, I believe, and this is the next phase of that progression. I’ll be implementing changes a bit at a time until December 31, with revision 3.1 formally going live on January 1. You probably won’t notice many differences, to be honest, but we’ll get into that later. Stay tuned!