PHP HEAD HTTP method

What is the easiest way to use the HEAD command of HTTP in PHP?

I would like to send an HTTP HEAD request to a server in PHP to retrieve the headers, but not the content, of a URL. How can I do this efficiently? Probably the most common use case is checking for dead web links. For this I only need the status code of the HTTP response, not the page content. Fetching web pages in PHP is easy with file_get_contents("http://..."), but for checking links this is really inefficient, as it downloads the whole page content, image, or whatever else the URL points to.


You can do this neatly with cURL:
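The code that followed this sentence is missing from the text, so the block below is not the original answer. It is a minimal sketch of the usual cURL approach, assuming the cURL extension is loaded. The function names headStatus and isDeadLink, the timeout values, and the "400 or above means dead" policy are hypothetical choices made for illustration.

```php
<?php
// Hypothetical helper: send a HEAD request and return the HTTP status code,
// or 0 if the request failed before any response arrived.
function headStatus(string $url): int
{
    $ch = curl_init($url);
    curl_setopt_array($ch, [
        CURLOPT_NOBODY         => true, // makes cURL issue a HEAD request
        CURLOPT_RETURNTRANSFER => true, // do not print anything
        CURLOPT_FOLLOWLOCATION => true, // report the status after redirects
        CURLOPT_CONNECTTIMEOUT => 5,    // assumed timeouts, adjust as needed
        CURLOPT_TIMEOUT        => 10,
    ]);
    $ok   = curl_exec($ch);
    $code = ($ok === false) ? 0 : (int) curl_getinfo($ch, CURLINFO_HTTP_CODE);
    // The handle is released when $ch goes out of scope.
    return $code;
}

// Hypothetical policy: no response at all, or any 4xx/5xx status,
// counts as a dead link.
function isDeadLink(int $status): bool
{
    return $status === 0 || $status >= 400;
}
```

A call such as isDeadLink(headStatus('http://php.net')) would then answer the question for one URL. Some servers reject HEAD with 405 Method Not Allowed, so a link checker may want to retry such URLs with GET before declaring them dead.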

As an alternative to curl you can use the http context options to set the request method to HEAD . Then open a (http wrapper) stream with these options and fetch the meta data.

$context = stream_context_create(array('http' => array('method' => 'HEAD')));
$fd = fopen('http://php.net', 'rb', false, $context);
var_dump(stream_get_meta_data($fd));
fclose($fd);

I prefer this solution over the ones using curl, because I like using built-in functions. Maybe somebody else can comment on the performance of each possibility?

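Note that fopen() fails with a warning on 401 responses (and, by default, on other 4xx and 5xx status codes), while curl gives you the actual response. Setting the http context option ignore_errors to true makes the stream wrapper return such responses as well.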

stream_context_create() can also be used with file_get_contents(). get_headers() may be an even better fit: combine it with stream_context_set_default() to set the request method to HEAD. See php.net/manual/es/function.get-headers.php
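As a rough illustration of the get_headers() idea, the sketch below passes a HEAD context directly as the third argument instead of changing the default context. It assumes PHP 7.1 or later (where get_headers() accepts a context) and allow_url_fopen enabled. The function names headHeaders and statusFromLine are hypothetical.

```php
<?php
// Fetch the response headers of $url with a HEAD request.
// Returns an associative array of headers, or false on failure.
// Index 0 of the result holds the status line, e.g. "HTTP/1.1 200 OK".
function headHeaders(string $url)
{
    $context = stream_context_create([
        'http' => ['method' => 'HEAD'],
    ]);
    return get_headers($url, true, $context);
}

// Extract the numeric status code from a status line,
// or return null if the line does not look like one.
function statusFromLine(string $statusLine): ?int
{
    return preg_match('#^HTTP/\S+\s+(\d{3})#', $statusLine, $m)
        ? (int) $m[1]
        : null;
}
```

For example, $headers = headHeaders('http://php.net'); followed by statusFromLine($headers[0]) would give the status code of the first response, provided $headers is not false. When the server redirects, get_headers() stores the later status lines under the following numeric keys.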


PHP: HEAD Request (cURL)

How to send an HTTP HEAD request using cURL in PHP.


When a server receives a HEAD request, it should return only the response headers of the given resource. If content (a response body) is returned anyway, clients should ignore it.

Clients may send an HTTP HEAD request to check if a resource has been updated by comparing the response headers with the timestamp of a cached copy. If the cached copy is outdated, it will typically be invalidated, and a fresh GET request for the resource will be performed.

When a server responds to a HEAD request, the body part of the response should be excluded. While this is mostly useful for caching mechanisms, it is also useful to developers while testing request and response headers in their applications.


In PHP, you can send a HEAD request through the cURL extension by setting the CURLOPT_NOBODY option to true; to have the response returned to you as a string, the CURLOPT_RETURNTRANSFER option should also be set:

$url = 'https://beamtic.com/Examples/ip.php';
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_NOBODY, true);
curl_setopt($ch, CURLOPT_HEADER, true);
$response = curl_exec($ch);
if (curl_errno($ch)) {
    echo 'Error:' . curl_error($ch);
    exit();
}
curl_close($ch);
// Output the response:
echo $response;

Note. The CURLOPT_HEADER option is used to include the response headers in the response. Without it, the response will be empty.

Verbose information

The cURL library has an option to return verbose information about the request, which is very useful while debugging; this can be enabled by setting the CURLOPT_VERBOSE option to true:

curl_setopt($ch, CURLOPT_VERBOSE, true); 

This allows you to view the request type, but it will also allow you to extract other useful details, such as info on the TLS handshake and SSL certificate.

How are HEAD requests used

A HEAD request is similar to a GET, but it specifically means that the server should only return the HTTP response headers of the requested resource, without including the response body; sometimes the body may be included anyway due to errors and carelessness, but clients will generally ignore it.

The main advantage of sending a HEAD request to a resource is that a client can compare the caching headers before deciding whether the full resource should be requested.

Servers may inform clients about the supported request methods for a given resource in the Allow header, with the methods separated by commas.

According to rfc7231#section-7.4.1: If an unsupported request method is used by a client, the server should respond with a 405 Method Not Allowed status, and then it must include the Allow header to show the supported methods.

As you may have noticed, not all servers or web applications use the Allow header properly.
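To make the comma-separated format concrete, here is a small sketch that splits an Allow header value into a list of methods and checks whether a method is listed. The helper names parseAllowHeader and supportsMethod, and the sample values in the usage note, are assumptions made for illustration.

```php
<?php
// Split an Allow header value such as "GET, HEAD, POST" into a list of
// method names. Empty items (e.g. from a trailing comma) are dropped.
function parseAllowHeader(string $value): array
{
    $methods = array_map('trim', explode(',', $value));
    return array_values(array_filter($methods, 'strlen'));
}

// Check whether $method appears in the Allow header value.
// HTTP method names are case-sensitive, so the comparison is exact.
function supportsMethod(string $allowValue, string $method): bool
{
    return in_array($method, parseAllowHeader($allowValue), true);
}
```

With these helpers, supportsMethod('GET, HEAD', 'HEAD') is true, while supportsMethod('GET', 'HEAD') is false. Since servers do not always send an accurate Allow header, a result of false is best treated as a hint rather than proof.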

The format of an HTTP response

In the HTTP protocol, response headers are sent before the response body, and will look like this in plain text:

HTTP/1.1 200 OK
Date: Fri, 22 Jan 2021 12:41:43 GMT
Server: Apache
Upgrade: h2
Connection: Upgrade, Keep-Alive
Vary: Accept-Encoding
Keep-Alive: timeout=5, max=100
Transfer-Encoding: chunked
Content-Type: text/plain; charset=utf-8

Hallo World

The response headers and the response body are separated by two pairs of CRLF (A carriage return + a line feed character).

In PHP, CRLF is written as \r\n inside a double-quoted string; for example, echo "\r\n"; outputs a single CRLF.

Each header is separated by a single CRLF, while the header and body parts of the response are separated by CRLFCRLF (the equivalent of \r\n\r\n in PHP).
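The separator rule can be sketched in a few lines: splitting a raw response at the first \r\n\r\n yields the header block and the body. The function name splitResponse is hypothetical, and the sample response is a shortened version of the one shown earlier.

```php
<?php
// Split a raw HTTP response into its header block and body at the first
// blank line (CRLF CRLF). Returns [headers, body]; body is '' if absent.
function splitResponse(string $raw): array
{
    $parts = explode("\r\n\r\n", $raw, 2);
    return [$parts[0], $parts[1] ?? ''];
}

$raw = "HTTP/1.1 200 OK\r\n"
     . "Server: Apache\r\n"
     . "Content-Type: text/plain; charset=utf-8\r\n"
     . "\r\n"
     . "Hallo World";

[$headers, $body] = splitResponse($raw);
echo $body, "\n"; // prints "Hallo World"
```

This assumes the string holds exactly one response. When cURL follows redirects with CURLOPT_HEADER enabled, the output contains one header block per response, so splitting at the first blank line would return only the first of them.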

The idea is to allow HTTP clients to check caching headers such as last-modified and etag before deciding whether a client-side cache should be invalidated. If the client determines that the cached copy is outdated, the resource is typically re-downloaded with a fresh GET request. The main benefit of supporting HEAD is that the client avoids re-downloading resources that have not changed, and at the same time servers avoid wasting resources on generating and sending content that the client has already downloaded.
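The decision described here could look roughly like the sketch below. It assumes the headers of the HEAD response are already available as an array with lower-case names as keys, and that the client stored the ETag and the download time alongside its cached copy. The function name isCacheFresh and its parameters are hypothetical.

```php
<?php
// Decide whether a cached copy is still fresh, given the headers of a HEAD
// response and the validators stored together with the cached copy.
function isCacheFresh(array $headHeaders, ?string $cachedEtag, ?int $cachedTime): bool
{
    // Prefer the ETag: an exact match means the resource is unchanged.
    if ($cachedEtag !== null && isset($headHeaders['etag'])) {
        return $headHeaders['etag'] === $cachedEtag;
    }
    // Otherwise compare Last-Modified with the time the copy was stored.
    if ($cachedTime !== null && isset($headHeaders['last-modified'])) {
        $modified = strtotime($headHeaders['last-modified']);
        return $modified !== false && $modified <= $cachedTime;
    }
    // No usable validators: treat the copy as stale to be safe.
    return false;
}
```

If the function returns false, the client would follow up with a GET request and replace its copy. The sketch compares ETags byte for byte and does not treat weak validators (values starting with W/) specially.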

Far from all web resources support HEAD requests. In fact, if you are using a server-side scripting language such as PHP, you will have to add caching support for dynamically generated pages yourself; a good CMS will already support client-side caching without users having to do anything.

Parsing Response headers

To work with the response headers easily, it helps to place them in an associative array. However, cURL returns the headers and the body together as one string, so you first need to cut the headers out of the response.

1. Obtain the headers after performing a request:

// Call curl_getinfo() before curl_close($ch)
$header_size = curl_getinfo($ch, CURLINFO_HEADER_SIZE);
$headers = substr($response, 0, $header_size);
$body = substr($response, $header_size);

Now you have the response headers and the response body stored in separate variables.

2. Now you can iterate over the lines in the $headers variable, creating an array in the process. Using the strtok function is probably the fastest way to do it:

// Define the $response_headers array for later use
$response_headers = [];
// Get the first line (the status line)
$line = strtok($headers, "\r\n");
$status_code = trim($line);
// Parse the remaining lines, saving them into the array
while (($line = strtok("\r\n")) !== false) {
    $matches = explode(':', $line, 2);
    // Skip lines without a colon; they are not "name: value" headers
    if (count($matches) === 2) {
        // Header names are case-insensitive, so store them in lower case
        $response_headers[strtolower(trim($matches[0]))] = trim($matches[1]);
    }
}

3. Since the headers are now stored as an associative array that uses the header-names as keys, you can now use isset to check if a given header was returned:

if (isset($response_headers['allow'])) {
    echo 'The allow header was present, here is its contents:';
    var_dump($response_headers['allow']);
    exit();
}
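The parsing steps above can also be wrapped in one reusable function. The sketch below is one possible variant, not the tutorial's own code: it lower-cases header names so that lookups such as isset($headers['allow']) work regardless of how the server capitalises them, and it collects headers that occur more than once into a list. The function name parseHeaderBlock and the sample header block are assumptions.

```php
<?php
// Parse a raw header block into [statusLine, headers]. Names are lower-cased;
// a header that occurs more than once (e.g. Set-Cookie) becomes a list.
function parseHeaderBlock(string $rawHeaders): array
{
    $lines = preg_split('/\r\n|\n/', trim($rawHeaders));
    $statusLine = array_shift($lines);
    $headers = [];
    foreach ($lines as $line) {
        $parts = explode(':', $line, 2);
        if (count($parts) !== 2) {
            continue; // not a "name: value" line
        }
        $name  = strtolower(trim($parts[0]));
        $value = trim($parts[1]);
        if (!isset($headers[$name])) {
            $headers[$name] = $value;
        } elseif (is_array($headers[$name])) {
            $headers[$name][] = $value;
        } else {
            $headers[$name] = [$headers[$name], $value];
        }
    }
    return [$statusLine, $headers];
}

// Sample header block, made up for this example.
[$status, $headers] = parseHeaderBlock(
    "HTTP/1.1 405 Method Not Allowed\r\n"
    . "Allow: GET, HEAD\r\n"
    . "Set-Cookie: a=1\r\n"
    . "Set-Cookie: b=2\r\n\r\n"
);

if (isset($headers['allow'])) {
    echo 'Allowed methods: ', $headers['allow'], "\n"; // prints "Allowed methods: GET, HEAD"
}
```

Limiting explode() to two parts keeps values that contain colons, such as dates, intact.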

For more information about this subject, you may want to read: Parsing HTTP Response Headers


Tools:

You can use the following API endpoints for testing purposes:

https://beamtic.com/api/user-agent
https://beamtic.com/api/request-headers


The HEAD method

The HEAD method is identical to GET, except that the server sends nothing in the body of the response. HEAD requests only information about a file or resource; that is, it returns nothing but headers. It is used when a client wants information about a document without fetching it: for example, to find out the size of a file, whether the file exists, or when it was last modified. This can save a lot of time and traffic.

The headers of a HEAD request should be the same as those of the corresponding GET request.

Example of the HEAD method

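<?php
// Note: the beginning of this script was lost. The fsockopen() call, the
// error branch and the request path "/" are assumptions added so that it runs.
$fp = fsockopen('ekimoff.ru', 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)\n";
} else {
    // build the HTTP request headers for the server
    $request  = "HEAD / HTTP/1.1\r\n";
    $request .= "Host: ekimoff.ru\r\n";
    $request .= "User-Agent: Mozilla/2.0\r\n";
    $request .= "Connection: close\r\n\r\n";
    // send the request to the server
    fputs($fp, $request);
    // read the response from the server
    $content = '';
    while (!feof($fp)) {
        $content .= fgets($fp);
    }
    echo $content;
    fclose($fp);
}
?>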

As we can see, the file contents are not sent in response to a HEAD request. Only the headers arrive:

HTTP/1.1 200 OK
Server: nginx/0.6.32
Date: Sat, 24 Apr 2010 14:19:41 GMT
Content-Type: text/plain; charset=UTF-8
Connection: close
Last-Modified: Sat, 24 Apr 2010 13:19:02 GMT
ETag: "43ace4b-b-4bd2efc6"
Accept-Ranges: bytes
Content-Length: 11

Now let us replace the HEAD method with GET in the source code:

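<?php
// Note: the beginning of this script was lost. The fsockopen() call, the
// error branch and the request path "/" are assumptions added so that it runs.
$fp = fsockopen('ekimoff.ru', 80, $errno, $errstr, 30);
if (!$fp) {
    echo "$errstr ($errno)\n";
} else {
    // build the HTTP request headers for the server
    $request  = "GET / HTTP/1.1\r\n";
    $request .= "Host: ekimoff.ru\r\n";
    $request .= "User-Agent: Mozilla/2.0\r\n";
    $request .= "Connection: close\r\n\r\n";
    // send the request to the server
    fputs($fp, $request);
    // read the response from the server
    $content = '';
    while (!feof($fp)) {
        $content .= fgets($fp);
    }
    echo $content;
    fclose($fp);
}
?>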

The response is the same, but this time the server also sent the file contents:

HTTP/1.1 200 OK
Server: nginx/0.6.32
Date: Sat, 24 Apr 2010 15:04:36 GMT
Content-Type: text/plain; charset=UTF-8
Connection: close
Last-Modified: Sat, 24 Apr 2010 13:19:02 GMT
ETag: "43ace4b-b-4bd2efc6"
Accept-Ranges: bytes
Content-Length: 11

bla-bla-bla

The file is only 11 bytes. Now imagine that the file weighs 11 MB and is requested every second (for example, a feed of the company's latest product updates). If an external application requests this file directly every second, traffic will grow by leaps and bounds.

Incidentally, we use a similar scheme at work, except that next to the large XML file sits a small file of a few bytes containing the MD5 hash of the large file. By first making a GET request for the small file with the MD5 hash, we find out whether we need to request the main file at all. We keep the MD5 hash in our cache so that we can compare it with the contents of the MD5 file on the remote server. This example does not use the HEAD method, but the principle of saving resources and traffic is the same.
