If your website lets users create content, you will likely want to include the name or title of that content in its URL. The classic example, a blog post, would have a URL like this one: http://example.com/blog/1234/title-of-the-post
As SEO experts will tell you, having good keywords or content titles in the URL should increase the odds that your page shows up near the top of search results. Friendly URLs are also easier to remember and to type, and they look great.
The most common problem with user-generated content (UGC) is data filtering and normalization. You will want your URLs to contain only URL-friendly characters so that you do not generate broken links. This block of PHP code strips all unwanted characters and builds a clean content “name” that can be used in your URLs.
// Strip accents
$name = str_replace(
    array('à', 'á', 'â', 'ã', 'ä', 'ç', 'è', 'é', 'ê', 'ë', 'ì', 'í', 'î',
          'ï', 'ñ', 'ò', 'ó', 'ô', 'õ', 'ö', 'ù', 'ú', 'û', 'ü', 'ý', 'ÿ',
          'À', 'Á', 'Â', 'Ã', 'Ä', 'Ç', 'È', 'É', 'Ê', 'Ë', 'Ì', 'Í', 'Î',
          'Ï', 'Ñ', 'Ò', 'Ó', 'Ô', 'Õ', 'Ö', 'Ù', 'Ú', 'Û', 'Ü', 'Ý'),
    array('a', 'a', 'a', 'a', 'a', 'c', 'e', 'e', 'e', 'e', 'i', 'i', 'i',
          'i', 'n', 'o', 'o', 'o', 'o', 'o', 'u', 'u', 'u', 'u', 'y', 'y',
          'A', 'A', 'A', 'A', 'A', 'C', 'E', 'E', 'E', 'E', 'I', 'I', 'I',
          'I', 'N', 'O', 'O', 'O', 'O', 'O', 'U', 'U', 'U', 'U', 'Y'),
    $name
);
// Remove everything except letters, digits, underscores, whitespace and dashes.
// (\w already covers digits and the underscore.)
$name = preg_replace('/[^\w\s-]+/', '', $name);
// Trim the white space at the start and at the end of the string.
$name = trim($name);
// Replace each run of whitespace (spaces, tabs, newlines) with a dash.
$name = preg_replace('/\s+/', '-', $name);
// Replace runs of dashes with a single dash.
$name = preg_replace('/-{2,}/', '-', $name);
// Lowercase all characters.
$name = strtolower($name);
// URL-encode the string so that any remaining unsafe character is escaped.
$name = urlencode($name);
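If you need this in more than one place, the same steps can be wrapped in a function. Here is a minimal sketch of that idea. The function name slugify() is hypothetical, and the accent map is deliberately shortened, so extend it with the full list above. Two things differ from the snippet: it also trims dashes left at either end, and it skips urlencode() because only a-z, 0-9, underscores and dashes can remain in the result.

```php
<?php
// Hypothetical helper: the name slugify() and the shortened accent map are
// assumptions made for this sketch, not part of the original snippet.
function slugify(string $title): string
{
    // Accent map written as 'from' => 'to' pairs so each pair stays together.
    // Only a handful of letters are listed here; extend it as needed.
    $accents = [
        'à' => 'a', 'â' => 'a', 'ä' => 'a', 'ç' => 'c',
        'è' => 'e', 'é' => 'e', 'ê' => 'e', 'ë' => 'e',
        'î' => 'i', 'ï' => 'i', 'ñ' => 'n', 'ô' => 'o',
        'ö' => 'o', 'ù' => 'u', 'û' => 'u', 'ü' => 'u',
        'À' => 'A', 'Ç' => 'C', 'É' => 'E', 'È' => 'E',
    ];
    $name = strtr($title, $accents);

    // Keep only ASCII letters, digits, underscores, whitespace and dashes.
    // Bytes of any unmapped multi-byte character are dropped here.
    $name = preg_replace('/[^A-Za-z0-9_\s-]+/', '', $name);

    // Collapse every run of whitespace and dashes into a single dash.
    $name = preg_replace('/[\s-]+/', '-', $name);

    // Drop dashes left at either end, then lowercase.
    return strtolower(trim($name, '-'));
}

echo slugify("  Café   à la -- crème!  "), "\n"; // prints: cafe-a-la-creme
```

Writing the map as key/value pairs for strtr() makes it harder to get the two lists out of step than with two parallel arrays.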
Think this could have been done in a better way? Feel free to share your idea in the comments.