You want to extract the domain from a URL. The URL formats to support are listed below.
For this list you expect the domain sub.domain.com:
- sub.domain.com/folder?p1=v1
- www.sub.domain.com/folder?p1=v1
- http://sub.domain.com/folder?p1=v1
- https://sub.domain.com/folder?p1=v1
- https://www.sub.domain.com/folder?p1=v1
For this list you expect the domain domain.com:
- domain.com/folder?p1=v1
- www.domain.com/folder?p1=v1
- http://domain.com/folder?p1=v1
- https://domain.com/folder?p1=v1
- https://www.domain.com/folder?p1=v1
For this list you expect the domain sub.sub.domain.com:
- sub.sub.domain.com/folder?p1=v1
- www.sub.sub.domain.com/folder?p1=v1
- http://sub.sub.domain.com/folder?p1=v1
- https://sub.sub.domain.com/folder?p1=v1
- https://www.sub.sub.domain.com/folder?p1=v1
Extract domain
If all your URLs start with http:// or https://, you can use the Uri class.
var host = new Uri(url).Host;
if (host.StartsWith("www.", StringComparison.OrdinalIgnoreCase))
    host = host.Remove(0, 4);
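The Uri constructor throws a UriFormatException when the string is not a valid absolute URI, so it may be worth wrapping the idea in a helper. The following is only a sketch: the names DomainExtractor and GetDomainFromAbsoluteUrl are hypothetical, and returning an empty string for unsupported input is an assumption rather than something the approach above prescribes.

```csharp
using System;

public static class DomainExtractor
{
    // Hypothetical helper: returns the host without a leading "www.",
    // or an empty string when the input is not an absolute HTTP(S) URL.
    public static string GetDomainFromAbsoluteUrl(string url)
    {
        // TryCreate avoids the exception that "new Uri(url)" would throw
        // for input without a protocol, e.g. "domain.com/folder".
        if (!Uri.TryCreate(url, UriKind.Absolute, out var uri))
            return string.Empty;

        // Reject other schemes (file, ftp, ...) that can also parse as absolute.
        if (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps)
            return string.Empty;

        var host = uri.Host;
        return host.StartsWith("www.", StringComparison.OrdinalIgnoreCase)
            ? host.Substring(4)
            : host;
    }
}
```

With this sketch, "https://www.sub.domain.com/folder?p1=v1" gives sub.domain.com, while "sub.domain.com/folder?p1=v1" gives an empty string because it has no protocol.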
To extract the same domain from URLs that do not specify a protocol, use a regular expression.
const string urlPattern = @"^(http://|https://)?(www\.)?((?<domain>[a-zA-Z0-9.\-_]+)\/)";
var match = Regex.Match(url, urlPattern);
if (match.Success)
    return match.Groups["domain"].Value;
return string.Empty;
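This expression anchors the match at the beginning of the string (^), optionally skips the protocol and the www prefix, and then captures into the named group domain the run of host characters that precedes the first /. The / itself is matched but is not part of the captured group. It extracts the expected domain from every URL listed above. Note that the pattern requires a / after the host, so a URL with no path, such as https://domain.com, does not match.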
See example at RegexStorm.net.
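The regular expression approach can be packaged as a reusable method. This is a minimal sketch, not a definitive implementation: the names RegexDomainExtractor and GetDomain are hypothetical, the dot after www is escaped so that it matches only a literal dot, and case-insensitive matching is an added assumption. As in the pattern described above, a / after the host is still required.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexDomainExtractor
{
    // Optional protocol, optional "www.", then the host captured in the
    // named group "domain", followed by a mandatory "/".
    private static readonly Regex UrlRegex = new Regex(
        @"^(http://|https://)?(www\.)?(?<domain>[a-zA-Z0-9.\-_]+)/",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Hypothetical helper: returns the captured domain,
    // or an empty string when the input does not match.
    public static string GetDomain(string url)
    {
        if (string.IsNullOrEmpty(url))
            return string.Empty;

        var match = UrlRegex.Match(url);
        return match.Success ? match.Groups["domain"].Value : string.Empty;
    }
}
```

For example, RegexDomainExtractor.GetDomain("www.sub.domain.com/folder?p1=v1") returns sub.domain.com, and GetDomain("https://domain.com") returns an empty string because no / follows the host.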