Are you tired of having dates in WordPress permalinks?
I’ve been working on two sites lately where events tend to come back around every year. If the URL has an old date in it, that’s confusing to the visitor and can hurt click-through in Google.
In both cases, we wanted to switch to a simple URL: just the slug on the WordPress site, and the form /events/great-annual-event on Sitecore (but I’m not going to discuss Sitecore here).
In WordPress, there are two simple steps, and one somewhat complicated one.
- Change your permalink structure.
- Add the necessary RewriteRule.
- Search and replace all date-based URLs in your database to remove the dates.
Because we’re dealing with serialized data in many cases, step #3 is the one that gets complicated. More on that in a second.
1. Update your permalink structure
This is super simple. From your WordPress dashboard, go to Settings ➔ Permalinks and choose Post Name, one of the built-in options. In the old days, this structure had bad performance implications if you had more than about 50 pages, but that was fixed as of WordPress 3.3, so it should be fine. Namespace collisions are possible, though I have never had a problem with them. Still, if you give a page the same name as a post, things could get squirrelly.
2. Redirect to your new URLs
Again, this step is quite easy on Apache and most systems that run WordPress. These instructions are specific to Apache.
- Look for the lines in your .htaccess file in your WordPress root that look like this:
RewriteEngine On
RewriteBase /
- Now add the following rule below the RewriteBase rule:
RewriteRule ^([0-9]{4})/([0-9]{2})/([0-9]{2})/(?!page/)(.+)$ http://%{HTTP_HOST}/$4 [L,R=301]
Note that this applies in an .htaccess or Directory context. If you are doing this in a Virtual Host file outside of a Directory context you need to add a leading slash like so:
RewriteRule ^/([0-9]{4})/([0-9]{2})/([0-9]{2})/(?!page/)(.+)$ http://%{HTTP_HOST}/$4 [L,R=301]
If that is completely confusing, see the “What’s Matched” section of the mod_rewrite documentation or the friendly explanation on this page on Virtual Host vs Directory Context.
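If you want to sanity-check the pattern before touching your server, here is a minimal Python sketch that applies the same regular expression to a few sample paths. It is only an approximation of what Apache does: the function name redirect_target and the host example.com are my own placeholders, and it assumes the .htaccess form of the rule, where the path arrives without a leading slash.

```python
import re
from typing import Optional

# Same pattern as the .htaccess RewriteRule above (no leading slash).
DATE_PERMALINK = re.compile(r"^([0-9]{4})/([0-9]{2})/([0-9]{2})/(?!page/)(.+)$")


def redirect_target(path: str, host: str = "example.com") -> Optional[str]:
    """Return the URL the rule would redirect to, or None if the rule does not match.

    `host` stands in for %{HTTP_HOST}; it is a placeholder, not a real setting.
    """
    match = DATE_PERMALINK.match(path)
    if match is None:
        return None
    # $4 in the RewriteRule is the fourth capture group: everything after the date.
    return f"http://{host}/{match.group(4)}"
```

A dated post such as 2017/09/04/my-post maps to http://example.com/my-post, while paginated day archives such as 2017/09/04/page/2/ are left alone because of the (?!page/) lookahead.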
If you need help figuring out which RewriteRule to use, you can use the neat tool by Joost that creates the regular expression for you. Two things to note, however:
- He generates a RedirectMatch, not a RewriteRule. RedirectMatch uses mod_alias and RewriteRule uses mod_rewrite. In general, you shouldn’t mix the two in the same .htaccess file, and you already have mod_rewrite running for the basic WordPress rewriting, so it’s better to use a RewriteRule. Mixing them can produce unexpected results: all RewriteRules fire in the order they are written and all RedirectMatches fire in the order they are written, but which set of rules fires first depends on the order in which the modules load. If the RewriteRules fire first, the RedirectMatch will never get picked up unless you use the [PT] (passthrough) flag on your RewriteRule. It is true that Redirect is more efficient than RewriteRule, but most of that efficiency comes from not loading and running the more complex mod_rewrite. Since mod_rewrite is already evaluating your .htaccess file anyway, mixing them saves almost no CPU time and adds complexity.
- The RedirectMatch has a leading slash in the regular expression search string. In a RewriteRule in an .htaccess or Directory context, you have to remove that.
3. Updating the URLs in your WordPress database
Now comes the tricky part. You might think that you can just download your database file and, with a little bit of regular expression magic, change out all your URLs. However, a lot of the data in the WordPress database is serialized. In other words, a complex data structure like an array is turned into a string like so:
a:3:{s:4:"home";s:42:"https://www.example.com/2017/09/04/my-post";s:4:"link";}
That means that it is an array (“a”) with 3 elements, all of which are strings (“s”) and our URL string (the second element) is 42 characters long. If you do a simple regular expression search and replace on a MySQL dump file and get rid of the date, you’ll end up with:
a:3:{s:4:"home";s:42:"https://www.example.com/my-post";s:4:"link";}
That will break because the declared length (42) no longer matches the actual string. What you need is:
a:3:{s:4:"home";s:31:"https://www.example.com/my-post";s:4:"link";}
Note that our second string is now defined as being 31 characters long. That’s what we need.
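To make the length problem concrete, here is a minimal Python sketch of a serialization-aware replace. It is not how any particular tool does it, just an illustration under some assumptions: the function name replace_in_serialized is mine, it only looks at string tokens of the form s:N:"...";, it does not descend into serialized data nested inside another string, and it works on bytes because PHP counts string lengths in bytes.

```python
import re

# Matches the start of a serialized string token, e.g. s:42:"
STRING_TOKEN = re.compile(rb's:(\d+):"')


def replace_in_serialized(data: bytes, pattern: bytes, repl: bytes) -> bytes:
    """Apply a regex replace inside each serialized string and rewrite its length."""
    out = bytearray()
    pos = 0
    while True:
        token = STRING_TOKEN.search(data, pos)
        if token is None:
            out += data[pos:]          # copy whatever is left unchanged
            return bytes(out)
        out += data[pos:token.start()]  # copy structure before the string token
        length = int(token.group(1))
        start = token.end()
        end = start + length
        # Trust the declared length, then check the closing quote and semicolon.
        if data[end:end + 2] != b'";':
            raise ValueError("declared length does not match the string")
        new_value = re.sub(pattern, repl, data[start:end])
        # Re-emit the token with the length of the *new* value.
        out += b's:%d:"%s";' % (len(new_value), new_value)
        pos = end + 2
```

Run against the example above with the pattern /[0-9]{4}/[0-9]{2}/[0-9]{2}/ and the replacement /, this turns the s:42 entry into the s:31 entry and leaves the other strings untouched.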
Interconnect IT Database Search and Replace to the rescue
Fortunately, Interconnect IT has a great PHP script that lets you do a regular expression search and replace on a MySQL database in a way that properly handles serialized data.
You have to check off a set of boxes verifying that you understand that, incorrectly used, this will be a site killer and a major security risk. I ran it on my development environment. If you know that nobody will be modifying the live content, you can import your live database to dev, run the script, and then export your dev database to live.
In my case, I didn’t want to do that, so I thoroughly tested on dev, then uploaded the script to live, ran it, then deleted it immediately. It was on my live server with an obfuscated name for only a few minutes. Also, note that more recent versions of the script are more secure, so there is less risk here. Still, it’s not the kind of thing to leave lying about a live server.
Follow the instructions and install in a separate directory that’s a peer to wp-admin and wp-content. I gave the directory a name like dbsr-185903 so that it would be short, but not easy to guess, even though it was only on the server for a few minutes. Then you just go to http://example.com/dbsr-185903/index.php and you’ll see a nice admin interface.
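If you would rather not invent the hard-to-guess suffix yourself, a few lines of Python can generate one. This is just a sketch: the function name temp_dir_name and the dbsr prefix are my own choices, and the suffix here is random hex rather than digits.

```python
import secrets


def temp_dir_name(prefix: str = "dbsr") -> str:
    """Return a short directory name with a random, hard-to-guess suffix."""
    # token_hex(4) yields 8 hexadecimal characters from a secure random source.
    return f"{prefix}-{secrets.token_hex(4)}"
```

An obscure name is only a small extra hurdle, so the script should still be deleted as soon as you are finished.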
Enter your regular expression for search, what you want to replace it with, and enter your database credentials. You can also restrict the action to certain tables or columns.
Then you can test out your search and replace with a dry run that lets you examine the results.
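The idea behind a dry run is simple enough to sketch in Python: compute the replacement for each value, but only report what would change instead of writing anything back. This is an illustration, not the script's actual code. The names DATE_IN_URL and dry_run, and the (row id, value) row format, are assumptions of mine, and it only handles plain text values, not serialized ones.

```python
import re
from typing import Iterable, List, Tuple

# Captures the scheme and host, then matches a /YYYY/MM/DD/ segment after it.
# The lookahead skips paginated date archives, as in the RewriteRule.
DATE_IN_URL = re.compile(
    r"(https?://[^/\s\"']+)/[0-9]{4}/[0-9]{2}/[0-9]{2}/(?!page/)"
)


def dry_run(rows: Iterable[Tuple[int, str]]) -> List[Tuple[int, str, str]]:
    """Return (row_id, before, after) for every row the replace would change."""
    changes = []
    for row_id, value in rows:
        updated = DATE_IN_URL.sub(r"\1/", value)
        if updated != value:
            # Nothing is written anywhere; the change is only recorded.
            changes.append((row_id, value, updated))
    return changes
```

Reviewing the before and after pairs is how you catch a pattern that matches more than you intended.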
When you’re happy with that, perform the live run. Test it on your dev platform and, if it works, deploy to test and live.
And you’re done.
- New URLs will be in the new form
- Pages will be accessible by the new URL, which will be your canonical URL now
- You’ll have redirects in place for when people follow older external links
- And you have updated all internal links
Thanks for taking the time to document the full process.
I switched my permalinks in a fit of “productivity” and just needed the regex for my .htaccess file. Yours worked great, and I was delighted to also find a full guide that I can point others to and know they’re getting the right instructions.
Thanks Sarah. I’m glad it was useful. Your comment got me to reread the post, which I had forgotten about, and realize I had an error. I mention that outside a Directory context you need to add a leading slash to your RewriteRule, which I then failed to do in the original! It’s corrected now. Thanks 🙂