Get a Domain Name in Python

Contents

Get Domain Name Information using Python
Introduction
Check domain name registration using Python
Get domain name information using Python
Conclusion
Extract Domain From URL in Python
Use urlparse() to Extract Domain From the URL

Get Domain Name Information using Python

In this article we will discuss how to get domain name information using Python.

Introduction

A domain name is a human-readable name that maps to the IP address of a resource. When you visit https://pyshark.com/, you are connecting to the IP address of the website, and the domain name is just its identification string.

To get a domain name, it needs to be purchased from a domain registration company. During registration, the registrant provides information such as name, address, country, and more.

All of this information is stored and can be retrieved using WHOIS, a protocol widely used to query the databases that store domain registration data.

Let’s see how we can get domain name information using Python.

To continue following this tutorial we will need the following Python library: python-whois.

If you don’t have it installed, open “Command Prompt” (on Windows) and install it by running pip install python-whois.

Check domain name registration using Python

To get started, we import the whois library and store the domain we want information about in a domain variable.

The python-whois library is simple to use. We already know that www.pyshark.com exists, since the website is live.

To get an object that contains the WHOIS information about this domain name, we call whois.whois() with the domain.
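A minimal sketch of this lookup, assuming the python-whois package is installed and that its whois.whois() function performs the query. The helper name get_whois_record is hypothetical, and the import sits inside the function only so that the snippet loads on a machine without the package.

```python
def get_whois_record(domain_name):
    """Return the WHOIS record for domain_name (hypothetical helper)."""
    # Assumption: python-whois is installed; it is imported as "whois".
    import whois

    # whois.whois() sends the WHOIS query and returns a dictionary-like object.
    return whois.whois(domain_name)


# The domain used throughout this article.
domain = "www.pyshark.com"

# Uncomment to run a live lookup (requires network access):
# record = get_whois_record(domain)
# print(record)
```

The live call is left commented out because it depends on network access and on the WHOIS server responding.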

Note that this code will only execute successfully if the domain name is registered. If it is not, the call raises an error.

We can use this behavior to build a function that returns True if the domain name is registered and False if it is not.
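One way such a function might look is sketched below. It assumes python-whois is installed for live use. The lookup parameter and the found and not_found stand-ins are hypothetical additions, included only so the example can run without network access.

```python
def is_registered(domain_name, lookup=None):
    """Return True if a WHOIS record is found for domain_name, else False."""
    if lookup is None:
        # Assumption: python-whois is installed; imported lazily as "whois".
        import whois
        lookup = whois.whois
    try:
        record = lookup(domain_name)
    except Exception:
        # The lookup failed, which is treated here as "not registered".
        # Note that network errors end up in this branch as well.
        return False
    # Guard against a record that has no domain name set.
    return bool(record.domain_name)


# Offline stand-ins for the lookup (hypothetical, for demonstration only).
class FakeRecord:
    domain_name = "PYSHARK.COM"


def found(domain_name):
    return FakeRecord()


def not_found(domain_name):
    raise LookupError("no WHOIS match")


print(is_registered("www.pyshark.com", lookup=found))             # True
print(is_registered("no-such-domain.example", lookup=not_found))  # False
```

Calling is_registered("www.pyshark.com") without the lookup argument performs a real WHOIS query instead.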

The function tries to retrieve the WHOIS object with information about the domain name. If it succeeds, it returns True. If not, it returns False, which means that the domain name isn’t registered.

For www.pyshark.com the function returns True, which tells us that it is a registered domain. For us this means that we can retrieve some information about it.

If you ran this function against a random domain that doesn’t exist, it would return False, which means that no further information can be retrieved, simply because the domain isn’t registered.

Get domain name information using Python

Now let’s look into how we can actually retrieve the registrar’s information from a valid domain name.

In the previous section we learned how to get an object containing the WHOIS information by calling whois.whois() with the domain.

What we get in return is a WHOIS object that we can work with like a dictionary.

Since we can work with it like a dictionary, we can get its keys to determine what information we have contained in it:

domain_name
registrar
whois_server
referral_url
updated_date
creation_date
expiration_date
name_servers
status
emails
dnssec
name
org
address
city
state
zipcode
country
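The field names above can be listed with the object's keys() method. The sketch below uses a plain dictionary, sample_record, as a stand-in for the object returned by the lookup so that it runs without network access. The helper name list_fields is hypothetical.

```python
def list_fields(record):
    """Return the field names of a dictionary-like WHOIS record."""
    return list(record.keys())


# Stand-in for a live WHOIS record, using two values from this article.
sample_record = {
    "domain_name": "PYSHARK.COM",
    "registrar": "FastDomain Inc.",
}

for field in list_fields(sample_record):
    print(field)
```

With a live record, the same helper should work unchanged, since the WHOIS object is used like a dictionary.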

That is quite a lot of information, and you can select only the fields you need.

The final step is to print out the key-value pairs to have the actual information about our domain:

domain_name : PYSHARK.COM
registrar : FastDomain Inc.
whois_server : whois.bluehost.com
referral_url : None
updated_date : [datetime.datetime(2020, 2, 4, 0, 39, 22), datetime.datetime(2020, 2, 4, 0, 39, 23)]
creation_date : 2020-02-04 00:39:22
expiration_date : 2021-02-04 00:39:22
name_servers : ['NS1.BLUEHOST.COM', 'NS2.BLUEHOST.COM']
status : clientTransferProhibited https://icann.org/epp#clientTransferProhibited
emails : ['support@bluehost.com', 'WHOIS@BLUEHOST.COM']
dnssec : unsigned
name : DOMAIN PRIVACY SERVICE FBO REGISTRANT
org : THE ENDURANCE INTERNATIONAL GROUP, INC.
address : 10 CORPORATE DR, STE 300
city : BURLINGTON
state : MASSACHUSETTS
zipcode : 01803
country : US
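A listing in this "key : value" form can be produced by iterating over the record's items. The sketch below again uses a plain dictionary as a stand-in for the live WHOIS object, filled with a few values from the output above. The helper name format_record is hypothetical.

```python
def format_record(record):
    """Return one "key : value" line per field of a dictionary-like record."""
    return [f"{key} : {value}" for key, value in record.items()]


# Stand-in for a live WHOIS record, using values shown in this article.
sample_record = {
    "domain_name": "PYSHARK.COM",
    "registrar": "FastDomain Inc.",
    "whois_server": "whois.bluehost.com",
    "referral_url": None,
}

for line in format_record(sample_record):
    print(line)
```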

Conclusion

In this article we explored how to retrieve the domain name information using WHOIS.

This information is publicly available: the details you provide when you register a domain name are stored and can later be retrieved.

Extract Domain From URL in Python

This article will use practical examples to explain Python’s urlparse() function to parse and extract the domain name from a URL. We’ll also discuss improving our ability to resolve URLs and use their different components.

Use urlparse() to Extract Domain From the URL

The urlparse() function is part of Python’s urllib.parse module. It is useful when you need to split a URL into its components and use them for various purposes. Let us look at an example:

from urllib.parse import urlparse

component = urlparse('http://www.google.com/doodles/mothers-day-2021-april-07')
print(component)

In this code snippet, we first import urlparse from the urllib.parse module. Then we pass a URL to the urlparse() function. The return value is a ParseResult object, a named tuple with the six elements listed below:

scheme: the protocol used to access the resource, for instance http or https.
netloc: the network location of the URL (net for network, loc for location), typically the host name, optionally with a port.
path: the hierarchical path to the resource on the server.
params: parameters for the last path element, written after a semicolon.
query: the query string that follows the path after a question mark and carries data the resource can use.
fragment: the part after the # sign, which identifies a location within the resource.

When we display this object using the print function, it prints the value of each component. The output of the code above is as follows:

ParseResult(scheme='http', netloc='www.google.com', path='/doodles/mothers-day-2021-april-07', params='', query='', fragment='')
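Because the result is a named tuple, each of the six components can be read by position as well as by name. A short sketch, using an illustrative URL (an assumption, not taken from the article) that fills in all six parts:

```python
from urllib.parse import urlparse

# Illustrative URL with every component present.
parts = urlparse("https://www.example.com/docs/page;version=2?lang=en#intro")

print(parts.scheme)    # https
print(parts.netloc)    # www.example.com
print(parts.path)      # /docs/page
print(parts.params)    # version=2
print(parts.query)     # lang=en
print(parts.fragment)  # intro

# Positional access works too: index 1 is the network location.
print(parts[1] == parts.netloc)  # True
```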

The ParseResult output shows that all the URL components are separated and stored as individual elements in the object. We can get the value of any component by using its name like this:

from urllib.parse import urlparse

domain_name = urlparse('http://www.google.com/doodles/mothers-day-2021-april-07').netloc
print(domain_name)

Using the netloc component, the code above prints the domain name of the URL, www.google.com.

This way, we can get our URL parsed and use its different components for various purposes in our programming.
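Two details are worth keeping in mind when using netloc for this. It also includes any port or credentials present in the URL, and urlparse() leaves it empty when the URL has no scheme. The sketch below is one possible way to handle both, using the hostname attribute of the parsed result. The helper name extract_domain is hypothetical.

```python
from urllib.parse import urlparse


def extract_domain(url):
    """Return the lowercase host name of url, or None if it has none."""
    # Without a scheme or a leading "//", urlparse() treats the whole
    # string as a path, so mark the start of the network location.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    # hostname drops any port and credentials and lowercases the host.
    return urlparse(url).hostname


print(extract_domain("http://www.google.com/doodles/mothers-day-2021-april-07"))
print(extract_domain("www.google.com/doodles"))
print(extract_domain("https://Example.com:8080/a"))
```

The first two calls print www.google.com and the third prints example.com.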