PHP Apache Log Parser Library
kbsali/apache-log-parser
Web server access Log Parser
php composer.phar require kassner/apache-log-parser:dev-master
Instantiate the parser class:
$parser = new \Kassner\ApacheLogParser\ApacheLogParser();
Then parse the lines of your access log file:
$lines = file('/var/log/apache2/access.log', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
foreach ($lines as $line) {
    $entry = $parser->parse($line);
}
Each $entry object holds all the data parsed from the line:
stdClass Object
(
    [host] => 193.191.216.76
    [logname] => -
    [user] => www-data
    [stamp] => 1390794676
    [time] => 27/Jan/2014:04:51:16 +0100
    [request] => GET /wp-content/uploads/2013/11/whatever.jpg HTTP/1.1
    [status] => 200
    [responseBytes] => 58678
)
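The request property above holds the whole first line of the request. If you need the method, target and protocol separately, a small helper can split it. The sketch below is only an illustration: splitRequestLine() is a hypothetical function, not part of the library, and it assumes the request line has exactly three space-separated parts.

```php
<?php
/**
 * Hypothetical helper (not provided by the library): splits a request line
 * such as "GET /index.html HTTP/1.1" into method, target and protocol.
 * Returns null when the line does not have exactly three parts.
 */
function splitRequestLine(string $request): ?array
{
    $parts = explode(' ', trim($request));
    if (count($parts) !== 3) {
        return null; // malformed or truncated request line
    }
    [$method, $target, $protocol] = $parts;

    return ['method' => $method, 'target' => $target, 'protocol' => $protocol];
}

// Example with the request value shown in the sample output above.
var_export(splitRequestLine('GET /wp-content/uploads/2013/11/whatever.jpg HTTP/1.1'));
```

Returning null rather than throwing keeps the helper easy to use on logs that contain malformed requests.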
You may customize the log format (by default it matches the Apache common log format):
# default Nginx format:
$parser->setFormat('%h %l %u %t "%r" %>s %O "%{Referer}i" "%{User-Agent}i"');
Here is the full list of log format strings supported by Apache, and whether they are supported by the library:
Supported? | Format String | Property name | Description |
---|---|---|---|
Y | %% | percent | The percent sign |
Y | %A | localIp | Local IP-address |
Y | %a | remoteIp | Remote IP-address |
Y | %B | responseBytes | Size of response in bytes, excluding HTTP headers. |
Y | %b | responseBytes | Size of response in bytes, excluding HTTP headers. In CLF format, i.e. a '-' rather than a 0 when no bytes are sent. |
Y | %D | responseTime | The time taken to serve the request, in microseconds. |
Y | %f | filename | Filename |
Y | %h | host | Remote host |
N | %H | protocol | The request protocol |
Y | %I | receivedBytes | Bytes received, including request and headers, cannot be zero. You need to enable mod_logio to use this. |
Y | %k | keepAliveRequests | Number of keepalive requests handled on this connection. Interesting if KeepAlive is being used, so that, for example, a '1' means the first keepalive request after the initial one, '2' the second, etc.; otherwise this is always 0 (indicating the initial request). Available in versions 2.2.11 and later. |
Y | %l | logname | Remote logname (from identd, if supplied). This will return a dash unless mod_ident is present and IdentityCheck is set On. |
Y | %m | requestMethod | The request method |
Y | %O | sentBytes | Bytes sent, including headers, cannot be zero. You need to enable mod_logio to use this. |
Y | %p | port | The canonical port of the server serving the request |
Y | %P | PID | The process ID of the child that serviced the request. |
N | %q | queryString | The query string (prepended with a ? if a query string exists, otherwise an empty string) |
Y | %r | request | First line of request |
N | %R | handler | The handler generating the response (if any). |
Y | %s | originalStatus | Status. For requests that got internally redirected, this is the status of the original request; use %>s for the last. |
Y | %>s | status | Status of the last request, after any internal redirects. |
Y | %T | timeRequestServed | The time taken to serve the request, in seconds. |
Y | %t | time | Time the request was received (standard English format) |
Y | %u | user | Remote user (from auth; may be bogus if return status (%s) is 401) |
Y | %U | URL | The URL path requested, not including any query string. |
Y | %v | serverName | The canonical ServerName of the server serving the request. |
Y | %V | canonicalServerName | The server name according to the UseCanonicalName setting. |
Y | %X | connectionStatus | Connection status when response is completed: X = connection aborted before the response completed. + = connection may be kept alive after the response is sent. - = connection will be closed after the response is sent. |
Y | %{Foobar}C | *Cookie | The contents of cookie Foobar in the request sent to the server. Only version 0 cookies are fully supported. |
Y | %{FOOBAR}e | *Env | The contents of the environment variable FOOBAR |
Y | %{Foobar}i | *Header | The contents of Foobar: header line(s) in the request sent to the server. Changes made by other modules (e.g. mod_headers) affect this. If you're interested in what the request header was prior to when most modules would have modified it, use mod_setenvif to copy the header into an internal environment variable and log that value with the %{VARNAME}e described above. |
N | %{Foobar}n | *Note | The contents of note Foobar from another module. |
N | %{Foobar}o | *Headers | The contents of Foobar: header line(s) in the reply. |
N | %{format}p | *Port | The canonical port of the server serving the request or the server's actual port or the client's actual port. Valid formats are canonical, local, or remote. |
N | %{format}P | *PID | The process ID or thread id of the child that serviced the request. Valid formats are pid, tid, and hextid. hextid requires APR 1.2.0 or higher. |
N | %{format}t | *Time | The time, in the form given by format, which should be in strftime(3) format. (potentially localized) (This directive was %c in late versions of Apache 1.3, but this conflicted with the historical ssl %{var}c syntax.) |
If a line does not match the defined format, a \Kassner\ApacheLogParser\FormatException is thrown.
BenMorel/apache-log-parser
A PHP library to parse Apache logs.
This library is installable via Composer. Just run:
composer require benmorel/apache-log-parser
This library requires PHP 7.1 or later.
Project status & release process
This library is under development.
The current releases are numbered 0.x.y. When a non-breaking change is introduced (adding new methods, optimizing existing code, etc.), y is incremented.
When a breaking change is introduced, a new 0.x version cycle is always started.
It is therefore safe to lock your project to a given release cycle, such as 0.1.*.
If you need to upgrade to a newer release cycle, check the release history for a list of changes introduced by each further 0.x.0 version.
This library provides a single class, Parser.
First construct a Parser object with the LogFormat defined in the httpd.conf file of the server that generated the log file:
use BenMorel\ApacheLogParser\Parser;

$logFormat = "%h %l %u %t \"%{Host}i\" \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"";
$parser = new Parser($logFormat);
The library converts every format string of your log format to a field name; the list of fields can be accessed through the getFieldNames() method:
var_export( $parser->getFieldNames() );
array (
  0 => 'remoteHostname',
  1 => 'remoteLogname',
  2 => 'remoteUser',
  3 => 'time',
  4 => 'requestHeader:Host',
  5 => 'firstRequestLine',
  6 => 'status',
  7 => 'responseSize',
  8 => 'requestHeader:Referer',
  9 => 'requestHeader:User-Agent',
)
You're then ready to parse a single line of your log file. The parse() method accepts the log line and a boolean that selects the shape of the result. Pass false to get a numeric array, whose keys match those of the field names array:
$line = '1.2.3.4 - - [30/May/2018:15:00:23 +0200] "www.example.com" "GET / HTTP/1.0" 200 1234 "-" "Mozilla/5.0"';
var_export( $parser->parse($line, false) );
array (
  0 => '1.2.3.4',
  1 => '-',
  2 => '-',
  3 => '30/May/2018:15:00:23 +0200',
  4 => 'www.example.com',
  5 => 'GET / HTTP/1.0',
  6 => '200',
  7 => '1234',
  8 => '-',
  9 => 'Mozilla/5.0',
)
Or pass true to get an associative array, with the field names as keys:
var_export( $parser->parse($line, true) );
array (
  'remoteHostname' => '1.2.3.4',
  'remoteLogname' => '-',
  'remoteUser' => '-',
  'time' => '30/May/2018:15:00:23 +0200',
  'requestHeader:Host' => 'www.example.com',
  'firstRequestLine' => 'GET / HTTP/1.0',
  'status' => '200',
  'responseSize' => '1234',
  'requestHeader:Referer' => '-',
  'requestHeader:User-Agent' => 'Mozilla/5.0',
)
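All values in the output above are strings, including the time field. If you need a date object or a Unix timestamp, the default Apache time layout can be converted with PHP's own date classes. This is a minimal sketch: parseAccessLogTime() is a hypothetical helper, not part of the library, and it assumes the default %t layout (day/month/year:hour:minute:second offset) rather than a custom time format.

```php
<?php
/**
 * Hypothetical helper (not provided by the library): converts a time value
 * such as "30/May/2018:15:00:23 +0200" into a DateTimeImmutable.
 * Assumes the default Apache %t layout.
 */
function parseAccessLogTime(string $time): DateTimeImmutable
{
    // d = day, M = short English month name, Y = year, O = offset like +0200
    $date = DateTimeImmutable::createFromFormat('d/M/Y:H:i:s O', $time);
    if ($date === false) {
        throw new InvalidArgumentException("Unrecognized time value: $time");
    }

    return $date;
}

$date = parseAccessLogTime('30/May/2018:15:00:23 +0200');

// Same instant expressed in UTC: 2018-05-30T13:00:23+00:00
echo $date->setTimezone(new DateTimeZone('UTC'))->format(DATE_ATOM), "\n";
// Seconds since the Unix epoch
echo $date->getTimestamp(), "\n";
```

The helper throws instead of returning false so that a bad value is not silently treated as a date.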
If a line cannot be parsed, an InvalidArgumentException is thrown. Be sure to wrap your parse() calls in a try-catch block:
try {
    $parser->parse($line, true);
} catch (\InvalidArgumentException $e) {
    // ...
}
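When processing a whole file, the same try-catch pattern can live inside a loop that reads the log line by line, so one malformed line does not abort the run. The following is a sketch under stated assumptions: parseLogFile() and its $onError callback are hypothetical names, and the line parser is passed in as a callable so the example does not depend on any particular library.

```php
<?php
/**
 * Hypothetical helper (not provided by the library): lazily reads $path and
 * yields lineNumber => record for every line that $parseLine accepts.
 * Lines for which $parseLine throws InvalidArgumentException are skipped
 * and, if given, reported through $onError(lineNumber, exception).
 */
function parseLogFile(string $path, callable $parseLine, ?callable $onError = null): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    try {
        $lineNumber = 0;
        while (($line = fgets($handle)) !== false) {
            $lineNumber++;
            $line = rtrim($line, "\r\n");
            if ($line === '') {
                continue; // ignore blank lines
            }
            try {
                $record = $parseLine($line);
            } catch (InvalidArgumentException $e) {
                if ($onError !== null) {
                    $onError($lineNumber, $e);
                }
                continue; // skip the unparsable line
            }
            yield $lineNumber => $record;
        }
    } finally {
        fclose($handle); // also runs if the caller stops iterating early
    }
}
```

With the Parser object from above, the callable could be function (string $line) use ($parser) { return $parser->parse($line, true); }. Reading with fgets() keeps memory use flat on large logs, unlike loading the whole file into an array first.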
Field names returned by the library
This table shows how format strings are mapped to field names by the library:
Format string | Field name |
---|---|
%a | clientIp |
%{c}a | clientIp:c |
%A | localIp |
%B | responseSize |
%b | responseSize |
%{VARNAME}C | cookie:VARNAME |
%D | responseTime |
%{VARNAME}e | env:VARNAME |
%f | filename |
%h | remoteHostname |
%H | requestProtocol |
%{VARNAME}i | requestHeader:VARNAME |
%k | keepaliveRequests |
%l | remoteLogname |
%L | requestLogId |
%m | requestMethod |
%{VARNAME}n | note:VARNAME |
%{VARNAME}o | responseHeader:VARNAME |
%p | canonicalPort |
%{FORMAT}p | canonicalPort:FORMAT |
%P | processId |
%{FORMAT}P | processId:FORMAT |
%q | queryString |
%r | firstRequestLine |
%R | handler |
%s | status |
%t | time |
%{FORMAT}t | time:FORMAT |
%T | timeToServe |
%{UNIT}T | timeToServe:UNIT |
%u | remoteUser |
%U | urlPath |
%v | serverName |
%V | serverName |
%X | connectionStatus |
%I | bytesReceived |
%O | bytesSent |
%S | bytesTransferred |
%{VARNAME}^ti | requestTrailerLine:VARNAME |
%{VARNAME}^to | responseTrailerLine:VARNAME |
If two or more format strings yield the same field name, the second one will get a :2 suffix, the third one a :3 suffix, etc.
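To make the suffix rule concrete, here is a small stand-alone illustration. It is not the library's own code: suffixDuplicateFieldNames() is a hypothetical function that applies the rule as described above to a plain list of names.

```php
<?php
/**
 * Hypothetical illustration of the naming rule (not the library's code):
 * the first occurrence of a name is kept as-is, the second gets ":2",
 * the third ":3", and so on.
 */
function suffixDuplicateFieldNames(array $names): array
{
    $seen = [];   // name => number of occurrences so far
    $result = [];

    foreach ($names as $name) {
        $count = ($seen[$name] ?? 0) + 1;
        $seen[$name] = $count;
        $result[] = $count === 1 ? $name : $name . ':' . $count;
    }

    return $result;
}

// %b and %B both map to responseSize, so a format using both yields a duplicate.
var_export(suffixDuplicateFieldNames(['remoteHostname', 'responseSize', 'responseSize']));
```

Under this rule, a format such as "%h %b %B" would produce the field names remoteHostname, responseSize and responseSize:2.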
You can expect to parse more than 250,000 records per second (> 50 MiB/s) when reading logs from a file on a modern server with an SSD drive.
Returning records as an associative array comes with a small performance penalty of about 6%.