Posted by: horizonguy | October 18, 2008

Optimize a URL for SEO using C#

Below is some code you can use to generate a valid, SEO-friendly URL for a web page.

Example: Given the title of a post, it generates the title portion of the URL to use when rewriting URLs for SEO.
Original Url: http://www.knowledgedrink.com/View-Class-Workshop-Details.aspx?postId=480
Post Title: How to SEO optimize site
Output title: how-to-seo-optimize-site
Final Url: http://www.knowledgedrink.com/class-workshop-how-to-seo-optimize-site-480.aspx
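
As a rough sketch of the last step in the example above, the output title can be combined with the post id to form the final URL. The `SeoUrlBuilder` class, its `BuildFinalUrl` method, and the parameter names are hypothetical; the `class-workshop` prefix and `.aspx` suffix are simply read off the example URL, and the site's actual rewrite rules may differ.

```csharp
using System;

/// <summary>
/// Hypothetical helper (not part of the original post) that assembles
/// the rewritten URL from its parts.
/// </summary>
public static class SeoUrlBuilder
{
    public static string BuildFinalUrl(string siteRoot, string prefix, string optimizedTitle, int postId)
    {
        if (string.IsNullOrEmpty(siteRoot))
            throw new ArgumentException("Site root is required.", "siteRoot");

        // Assumed pattern, inferred from the example: {root}/{prefix}-{title}-{id}.aspx
        return siteRoot.TrimEnd('/') + "/" + prefix + "-" + optimizedTitle + "-" + postId + ".aspx";
    }
}
```

Calling `SeoUrlBuilder.BuildFinalUrl("http://www.knowledgedrink.com", "class-workshop", "how-to-seo-optimize-site", 480)` yields the Final Url shown in the example.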

Unit Tests:
The code has been unit tested.
Here is a sample of input titles and the output titles they produce.

   Input                                        Output
1. "Too---many---dashes"                        "too-many-dashes"
2. "Too   many   spaces"                        "too-many-spaces"
3. "Too ^&* many ;'\"[]\\_+= invalidChars"      "too-many-invalidchars"
4. "Really~1@# Bad-*7s;:Title"                  "really1-bad-7stitle"
5. "This is a post about optimizing a title"    "this-is-a-post-about-optimizing-a-title"
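
The sample cases above could be checked with a small table-driven helper along these lines. This is only a sketch, not the unit tests that were actually run: `SeoTitleChecks`, `Cases`, and `Run` are made-up names, and the repeated dashes and spaces in the first two inputs are an assumption about what the original inputs looked like.

```csharp
using System;
using System.Collections.Generic;

/// <summary>
/// Hypothetical table-driven check (not the author's actual unit tests).
/// </summary>
public static class SeoTitleChecks
{
    // Input/expected pairs from the sample table. The runs of dashes and
    // spaces in cases 1 and 2 are an assumption about the original inputs.
    public static readonly string[][] Cases =
    {
        new[] { "Too---many---dashes", "too-many-dashes" },
        new[] { "Too   many   spaces", "too-many-spaces" },
        new[] { "Too ^&* many ;'\"[]\\_+= invalidChars", "too-many-invalidchars" },
        new[] { "Really~1@# Bad-*7s;:Title", "really1-bad-7stitle" },
        new[] { "This is a post about optimizing a title", "this-is-a-post-about-optimizing-a-title" },
    };

    /// <summary>
    /// Runs every case through the supplied generator and returns one
    /// message per mismatch (an empty list means every case matched).
    /// </summary>
    public static List<string> Run(Func<string, string> generate)
    {
        List<string> failures = new List<string>();
        foreach (string[] testCase in Cases)
        {
            string actual = generate(testCase[0]);
            if (actual != testCase[1])
            {
                failures.Add("Input '" + testCase[0] + "': expected '" + testCase[1] + "', got '" + actual + "'");
            }
        }
        return failures;
    }
}
```

With the class from this post, the call would be `SeoTitleChecks.Run(UrlSeoOptimizer.GenerateUrl)`; taking the generator as a delegate keeps the check reusable for other implementations.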

    using System.Collections.Generic;
    using System.Text;

    /// <summary>
    /// Url optimizer utility class.
    /// </summary>
    public class UrlSeoOptimizer
    {
        /// <summary>
        /// Map of invalid characters that should not appear in
        /// an SEO optimized url.
        /// </summary>
        private static IDictionary<char, bool> _invalidChars;
        /// <summary>
        /// String containing each invalid character.
        /// </summary>
        public const string InvalidSeoUrlChars = @"$%#@!*?;:~`_+=()[]{}|\'<>,/^&"".";

       
        /// <summary>
        /// Initialize the list of mappings.
        /// </summary>
        static UrlSeoOptimizer()
        {
            char[] invalidChars = InvalidSeoUrlChars.ToCharArray();
            _invalidChars = new Dictionary<char, bool>();

            // Store each invalid char.
            foreach (char invalidChar in invalidChars)
            {
                _invalidChars.Add(invalidChar, true);
            }
        }
        /// <summary>
        /// Generates an SEO optimized url.
        /// </summary>
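        /// <param name="title">The post title to convert.</param>
        /// <returns>The lowercase, dash-separated title, or an empty string if the title is null or empty.</returns>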
        public static string GenerateUrl(string title)
        {
            // Validate.
            if (string.IsNullOrEmpty(title)) return string.Empty;

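            // Lowercase (culture-invariant) and trim surrounding whitespace.
            title = title.ToLowerInvariant().Trim();

            StringBuilder buffer = new StringBuilder();
            // Seed lastAddedChar with a character that is not a space or dash.
            char currentChar, lastAddedChar = 'a';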

            // Now go through each character.
            for (int ndx = 0; ndx < title.Length; ndx++)
            {
                currentChar = title[ndx];
               
                // Invalid char ? Go to next one.
                if ( _invalidChars.ContainsKey(currentChar) )
                {
                    continue;
                }

                // Handle spaces or dashes.
                if (currentChar == ' ' || currentChar == '-')
                {
                    // Only add if the previous char was not a space or dash (' ', '-').
                    // This is to avoid having multiple "-" dashes in the url.
                    if (lastAddedChar != ' ' && lastAddedChar != '-')
                    {
                        buffer.Append('-');
                        lastAddedChar = '-';
                    }
                }
                else
                {
                    buffer.Append(currentChar);
                    lastAddedChar = currentChar;
                }               
            }
            return buffer.ToString();
        }
    }


Responses

  1. You can find URL rewriting for ASP.NET at http://dotnetguts.blogspot.com/2008/07/generate-url-seo-friendly-aspnet.html

  2. Great post, and found at just the right time, thanks. My question is about performance: which is faster, this method or a suitable regular expression?

  3. Very late reply… but I’ve done some testing and using a RegEx seems considerably slower.

    I did a test of the above method vs. a simple regex in a loop of 100,000 times and the RegEx was about 9x slower.

    The tests were run in batches and then averaged.
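
For anyone who wants to repeat the comparison, a regex-based equivalent might look roughly like the following. It is not the regex used in the test described above, which was not posted; `RegexSeoTitle` and its member names are made up for illustration, and no timing result is implied.

```csharp
using System.Text.RegularExpressions;

/// <summary>
/// Hypothetical regex-based variant, for comparison only.
/// </summary>
public static class RegexSeoTitle
{
    // Same characters as InvalidSeoUrlChars, written as a character class.
    private static readonly Regex Invalid =
        new Regex(@"[$%#@!*?;:~`_+=()\[\]{}|\\'<>,/^&"".]", RegexOptions.Compiled);

    // One or more spaces or dashes in a row.
    private static readonly Regex Separators = new Regex("[ -]+", RegexOptions.Compiled);

    public static string Generate(string title)
    {
        if (string.IsNullOrEmpty(title)) return string.Empty;

        // Strip the invalid characters first, then collapse each run of
        // spaces/dashes into a single dash.
        string cleaned = Invalid.Replace(title.ToLowerInvariant().Trim(), string.Empty);
        return Separators.Replace(cleaned, "-");
    }
}
```

Both regexes are created once as static fields, so a timing loop would measure matching rather than pattern construction.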

