How To Get Domain Name From Subdomain Using JavaScript?

Published September 30, 2024

Problem: Extracting Domain Names from Subdomains

A hostname such as sub1.example.com contains the main domain it belongs to (example.com), but extracting that domain reliably can be difficult because hostnames vary in how many subdomain levels they have and in how long their suffix is. This task comes up in JavaScript applications that need to handle different URL structures.

JavaScript Solutions for Domain Name Extraction

Using the split() Method

The split() method is a way to extract domain names from subdomains. Here's how it works:

  1. Break down the hostname into parts using the split() method.
  2. Remove the subdomain component.
  3. Join the remaining parts to form the domain name.

Here's an example of how to use the split() method:

function getDomainName(url) {
    const parts = url.split('.');
    return parts.slice(-2).join('.');
}

const subdomain = 'sub1.example.com';
console.log(getDomainName(subdomain)); // Output: example.com

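This function splits the hostname on periods, takes the last two parts (which typically represent the domain name), and joins them back together. It expects a bare hostname rather than a full URL, and it gives the wrong result for multi-part suffixes such as co.uk (sub.example.co.uk becomes co.uk).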
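When the input is a full URL rather than a bare hostname, the hostname has to be isolated first. The sketch below uses the standard URL constructor for that step and then applies the same last-two-parts logic. The function name getDomainFromUrl is a hypothetical helper, not part of the original example, and the input is assumed to include a scheme such as https://.

```javascript
// Hypothetical helper: accepts a full URL (scheme required) instead of a hostname.
function getDomainFromUrl(input) {
    // new URL() throws a TypeError when the string cannot be parsed as a URL.
    const { hostname } = new URL(input);
    // hostname excludes the scheme, port, path, and query string.
    return hostname.split('.').slice(-2).join('.');
}

console.log(getDomainFromUrl('https://sub1.example.com/path?q=1')); // example.com
console.log(getDomainFromUrl('http://sub1.example.com:8080/'));     // example.com
```

Passing a bare hostname such as 'sub1.example.com' to this helper throws, because the URL constructor needs a scheme to parse the string.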

Tip: Handling Complex Domain Structures

To handle more complex domain structures, such as the two-part suffixes used under many country-code top-level domains (ccTLDs), for example co.uk or com.au, you can modify the function to check a list of known suffixes:

function getDomainName(url) {
    const parts = url.split('.');
    const knownCcTLDs = ['co.uk', 'com.au', 'co.jp']; // Add more as needed
    if (knownCcTLDs.includes(parts.slice(-2).join('.'))) {
        return parts.slice(-3).join('.');
    }
    return parts.slice(-2).join('.');
}

console.log(getDomainName('sub.example.co.uk')); // Output: example.co.uk

Using Regular Expressions

Regular expressions offer a way to extract domain names from subdomains. Here's how to use them:

  1. Create a regex pattern to match domain names.
  2. Apply the regex to extract the domain from a full hostname.

Here's an example using regular expressions:

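function getDomainName(url) {
    // Anchored at the end: one label, a period, then an alphabetic TLD
    const regex = /[a-z0-9-]+\.[a-z]+$/i;
    const match = url.match(regex);
    return match ? match[0] : null;
}

const subdomain = 'sub1.example.com';
console.log(getDomainName(subdomain)); // Output: example.com

This regex pattern is anchored to the end of the hostname with $. It matches one label made of letters, digits, or hyphens, followed by a period and an alphabetic top-level domain, so any subdomain labels in front are ignored. Without the anchor, a pattern that allows repeated labels would match the whole hostname, subdomains included. Like the basic split() approach, it does not handle multi-part suffixes such as co.uk.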
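A regex can also validate the input while extracting from it. The sketch below anchors the pattern to the whole hostname and captures the last two labels in a group, so input that is not shaped like a hostname returns null. The function name extractDomainStrict and the pattern are illustrative assumptions, and multi-part suffixes such as co.uk are still not handled.

```javascript
// Hypothetical helper: returns the last two labels, or null for malformed input.
function extractDomainStrict(hostname) {
    // ^(?:label.)*  any number of leading subdomain labels (not captured)
    // (label.tld)$  the captured domain: one label plus an alphabetic TLD
    const pattern = /^(?:[a-z0-9-]+\.)*([a-z0-9-]+\.[a-z]{2,})$/i;
    const match = hostname.match(pattern);
    return match ? match[1] : null;
}

console.log(extractDomainStrict('sub1.example.com'));   // example.com
console.log(extractDomainStrict('a.b.c.example.org'));  // example.org
console.log(extractDomainStrict('not a hostname'));     // null
console.log(extractDomainStrict('192.168.0.1'));        // null
```

Because the last label must be alphabetic, IPv4 addresses are rejected rather than truncated.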

Both methods are useful for extracting domain names from subdomains in JavaScript. The split() method is simpler and works well for basic cases, while a regular expression can also be written to reject input that is not a valid hostname.

Advanced Techniques for Domain Extraction

Handling Multiple Subdomain Levels

When working with URLs that have multiple subdomain levels, the function has to work at any subdomain depth. Because slice(-2) counts from the end of the array, the split() approach already does this. The version below adds a guard that returns the input unchanged when it has two or fewer parts:

function getDomainName(url) {
    const parts = url.split('.');
    if (parts.length > 2) {
        return parts.slice(-2).join('.');
    }
    return url;
}

console.log(getDomainName('sub1.sub2.example.com')); // Output: example.com
console.log(getDomainName('sub1.sub2.sub3.example.com')); // Output: example.com

This function splits the hostname into parts and returns the last two, which usually represent the domain name, no matter how many subdomain levels come before them. Inputs with two or fewer parts, such as example.com or localhost, are returned unchanged.
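If the removed subdomain levels are also needed, the same split can return both halves. This is a minimal sketch under the same two-label assumption. The name splitHostname and the shape of the returned object are hypothetical choices for illustration.

```javascript
// Hypothetical helper: separates the subdomain labels from the two-label domain.
function splitHostname(hostname) {
    const parts = hostname.split('.');
    if (parts.length <= 2) {
        // Nothing to strip: the input is already a bare domain or a single label.
        return { subdomains: [], domain: hostname };
    }
    return {
        subdomains: parts.slice(0, -2),      // everything before the last two labels
        domain: parts.slice(-2).join('.'),   // the last two labels
    };
}

const result = splitHostname('sub1.sub2.sub3.example.com');
console.log(result.domain);            // example.com
console.log(result.subdomains.length); // 3
```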

Tip: Handling IP Addresses

When extracting domain names, remember to account for IP addresses. Without a check, an IPv4 address such as 192.168.0.1 would be cut down to 0.1. You can modify the function to test whether the input is an IPv4 address before processing:

function getDomainName(url) {
    // Check if the input is an IP address
    if (/^\d+\.\d+\.\d+\.\d+$/.test(url)) {
        return url; // Return the IP address as is
    }

    const parts = url.split('.');
    if (parts.length > 2) {
        return parts.slice(-2).join('.');
    }
    return url;
}

console.log(getDomainName('192.168.0.1')); // Output: 192.168.0.1
console.log(getDomainName('sub.example.com')); // Output: example.com
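The check above accepts any four dot-separated numbers and does not cover IPv6. A slightly stricter sketch follows, assuming that each IPv4 octet must be at most 255 and that any input containing a colon is an IPv6 literal (a colon cannot appear in a DNS hostname). The helper names isIPv4, looksLikeIPv6, and getDomainNameOrIp are hypothetical.

```javascript
// Hypothetical helper: four numeric parts, each between 0 and 255.
function isIPv4(host) {
    const parts = host.split('.');
    return parts.length === 4 &&
        parts.every(p => /^\d{1,3}$/.test(p) && Number(p) <= 255);
}

// Hypothetical helper: a rough marker test, not a full IPv6 validator.
function looksLikeIPv6(host) {
    return host.includes(':');
}

function getDomainNameOrIp(host) {
    if (isIPv4(host) || looksLikeIPv6(host)) {
        return host; // Return IP addresses as is
    }
    const parts = host.split('.');
    return parts.length > 2 ? parts.slice(-2).join('.') : host;
}

console.log(getDomainNameOrIp('192.168.0.1'));     // 192.168.0.1
console.log(getDomainNameOrIp('[::1]'));           // [::1]
console.log(getDomainNameOrIp('sub.example.com')); // example.com
```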

Working with Top-Level Domains (TLDs)

To handle country code TLDs (ccTLDs) and new generic TLDs (gTLDs), you need a more flexible approach. Here's a function that can handle different TLD lengths:

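function getDomainName(url) {
    const parts = url.split('.');
    const knownTLDs = ['com', 'org', 'net', 'edu', 'gov', 'co.uk', 'com.au'];

    // Start at 1 so one label always remains in front of the suffix.
    // Scanning left to right tries the longest candidate suffix first.
    for (let i = 1; i < parts.length; i++) {
        const suffix = parts.slice(i).join('.');
        if (knownTLDs.includes(suffix)) {
            return parts.slice(i - 1).join('.');
        }
    }

    return parts.slice(-2).join('.');
}

console.log(getDomainName('sub.example.com')); // Output: example.com
console.log(getDomainName('sub.example.co.uk')); // Output: example.co.uk
console.log(getDomainName('sub.example.com.au')); // Output: example.com.au

This function looks for the longest trailing portion of the hostname that exactly matches an entry in the list of known TLDs, including the two-part ccTLD suffixes. It returns that suffix together with the one label in front of it. If no match is found, it returns the last two parts of the hostname. An exact match is required here: testing the whole remaining hostname with endsWith() would succeed on the first iteration and return the input unchanged, subdomains included.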

To make this approach more robust, you can add more TLDs to the list or use a full TLD database. This method offers a flexible solution for handling various TLD lengths and structures.
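As the list grows, it is convenient to load it from a text file into a Set and look suffixes up there. The sketch below parses a small hand-written sample, one suffix per line with // comment lines, which is the basic layout the Public Suffix List uses. The sample data and the names parseSuffixes and registrableDomain are assumptions for illustration, and wildcard or exception rules are not handled.

```javascript
// Illustrative sample only: a real suffix database is far larger.
const SAMPLE_SUFFIX_TEXT = `
// generic
com
org
net
// United Kingdom
uk
co.uk
// Australia
au
com.au
`;

// Hypothetical helper: turns the text into a Set, skipping blanks and comments.
function parseSuffixes(text) {
    return new Set(
        text.split('\n')
            .map(line => line.trim().toLowerCase())
            .filter(line => line !== '' && !line.startsWith('//'))
    );
}

// Hypothetical helper: longest matching suffix plus the one label before it.
function registrableDomain(hostname, suffixes) {
    const parts = hostname.toLowerCase().split('.');
    // i starts at 1 so one label always remains in front of the suffix.
    for (let i = 1; i < parts.length; i++) {
        if (suffixes.has(parts.slice(i).join('.'))) {
            return parts.slice(i - 1).join('.');
        }
    }
    return null; // No known suffix matched
}

const suffixes = parseSuffixes(SAMPLE_SUFFIX_TEXT);
console.log(registrableDomain('sub.example.co.uk', suffixes)); // example.co.uk
console.log(registrableDomain('a.b.example.org', suffixes));   // example.org
console.log(registrableDomain('example.dev', suffixes));       // null
```

Returning null for an unknown suffix, instead of falling back to the last two parts, makes gaps in the list visible to the caller.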

Example: Using a TLD Database

To improve the accuracy of domain extraction, you can use a comprehensive TLD database. Here's an example using the 'tldjs' library (installed with npm install tldjs), which is based on the Public Suffix List:

const tldjs = require('tldjs');

function getDomainName(url) {
    const parsed = tldjs.parse(url);
    return parsed.domain || url;
}

console.log(getDomainName('sub.example.com')); // Output: example.com
console.log(getDomainName('sub.example.co.uk')); // Output: example.co.uk
console.log(getDomainName('sub.example.yokohama')); // Output: example.yokohama

This approach handles a wide range of TLDs, including newer and less common ones, without the need to maintain your own TLD list.