PHP parse raw HTTP request

Manually parse raw multipart/form-data data with PHP

I can’t seem to find a real answer to this problem so here I go:

How do you parse raw HTTP request data in multipart/form-data format in PHP? I know that raw POST is automatically parsed if formatted correctly, but the data I’m referring to is coming from a PUT request, which is not being parsed automatically by PHP. The data is multipart and looks something like:

    ------------------------------b2449e94a11c
    Content-Disposition: form-data; name="user_id"

    3
    ------------------------------b2449e94a11c
    Content-Disposition: form-data; name="post_id"

    5
    ------------------------------b2449e94a11c
    Content-Disposition: form-data; name="image"; filename="/tmp/current_file"
    Content-Type: application/octet-stream

    �����JFIF���������. a bunch of binary data

I’m sending the data with libcurl like so (pseudo code):

    curl_setopt_array($ch, array(
        CURLOPT_POSTFIELDS => array(
            'user_id' => 3,
            'post_id' => 5,
            'image' => '@/tmp/current_file',
        ),
        CURLOPT_CUSTOMREQUEST => 'PUT',
    ));

If I drop the CURLOPT_CUSTOMREQUEST bit, the request is handled as a POST on the server and everything is parsed just fine.

Is there a way to manually invoke PHP's HTTP data parser or some other nice way of doing this?
And yes, I have to send the request as PUT 🙂

Solution – 1

I haven't dealt with HTTP headers much, but found this bit of code that might help:

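    function http_parse_headers($header)
    {
        $retVal = array();
        // unfold continuation lines, then split on CRLF
        $fields = explode("\r\n", preg_replace('/\x0D\x0A[\x09\x20]+/', ' ', $header));
        foreach ($fields as $field) {
            if (preg_match('/([^:]+): (.+)/m', $field, $match)) {
                // normalise the name to Title-Case; a callback is used because
                // the /e modifier was removed in PHP 7
                $match[1] = preg_replace_callback(
                    '/(?<=^|[\x09\x20\x2D])./',
                    function ($m) { return strtoupper($m[0]); },
                    strtolower(trim($match[1]))
                );
                if (isset($retVal[$match[1]])) {
                    // repeated header: collect the values in an array
                    $retVal[$match[1]] = array($retVal[$match[1]], $match[2]);
                } else {
                    $retVal[$match[1]] = trim($match[2]);
                }
            }
        }
        return $retVal;
    }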

Solution – 2

Have you looked at fopen("php://input", "r") for parsing the content?

Headers can also be found as $_SERVER['HTTP_*']; names are always uppercased and dashes become underscores, e.g. $_SERVER['HTTP_ACCEPT_LANGUAGE'].
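To make that concrete, here is a minimal sketch of both ideas. The function names headers_from_server() and read_raw_body() are made up for illustration; only fopen(), stream_get_contents() and the $_SERVER naming rule come from PHP itself.

```php
<?php
// Hypothetical helper: rebuild request headers from a $_SERVER-style array.
function headers_from_server(array $server): array
{
    $headers = [];
    foreach ($server as $key => $value) {
        if (strncmp($key, 'HTTP_', 5) === 0) {
            $name = substr($key, 5);
        } elseif ($key === 'CONTENT_TYPE' || $key === 'CONTENT_LENGTH') {
            // These two are exposed without the HTTP_ prefix.
            $name = $key;
        } else {
            continue;
        }
        // ACCEPT_LANGUAGE -> Accept-Language
        $headers[ucwords(strtolower(str_replace('_', '-', $name)), '-')] = $value;
    }
    return $headers;
}

// Hypothetical helper: read the unparsed request body from php://input.
function read_raw_body(): string
{
    $stream = fopen('php://input', 'r');
    $body = stream_get_contents($stream);
    fclose($stream);
    return $body === false ? '' : $body;
}
```

In a request handler you would call headers_from_server($_SERVER) and read_raw_body() once, near the top of the script. The original capitalisation of the header names is not recoverable from $_SERVER, so the Title-Case form is only a convention.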

Solution – 3

I would suspect the best way to go about it is ‘doing it yourself’, although you might find inspiration in multipart email parsers that use a similar (if not the exact same) format.

Grab the boundary from the Content-Type HTTP header, and use that to explode the various parts of the request. If the request is very large, keep in mind that you might store the entire request in memory, possibly even multiple times.

The related RFC is RFC 2388 (since obsoleted by RFC 7578), which fortunately is pretty short.
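A rough sketch of the "grab the boundary and explode" approach could look like this. Both function names are hypothetical, and the whole body is held in memory, so the caveat about large requests applies.

```php
<?php
// Hypothetical helper: pull the boundary parameter out of a Content-Type value.
function extract_boundary(string $contentType): ?string
{
    // The boundary may be quoted and may be followed by other parameters.
    if (preg_match('/boundary=(?:"([^"]+)"|([^;\s]+))/i', $contentType, $m) !== 1) {
        return null;
    }
    return ($m[1] ?? '') !== '' ? $m[1] : $m[2];
}

// Hypothetical helper: split a multipart body into raw parts
// (each part is "headers, blank line, content", still unparsed).
function split_multipart(string $body, string $boundary): array
{
    $parts = [];
    // Prepend CRLF so the first delimiter looks like all the others.
    $chunks = explode("\r\n--" . $boundary, "\r\n" . $body);
    array_shift($chunks); // anything before the first delimiter is preamble
    foreach ($chunks as $chunk) {
        if (strncmp($chunk, '--', 2) === 0) {
            break; // closing delimiter "--boundary--"
        }
        // Drop the line break that ends the delimiter line.
        $parts[] = preg_replace('/^[ \t]*\r\n/', '', $chunk, 1);
    }
    return $parts;
}
```

Splitting on CRLF plus the delimiter, rather than on the bare boundary string, keeps binary content intact because the trailing line break belongs to the delimiter and not to the part.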

Solution – 4

Edit – please read first: this answer is still getting regular hits 7 years later. I have never used this code since then and do not know if there is a better way to do it these days. Please view the comments below and know that there are many scenarios where this code will not work. Use at your own risk.

OK, so with Dave's and Evert's suggestions I decided to parse the raw request data manually. I didn't find any other way to do this after searching around for about a day.

I got some help from this thread. I didn't have any luck tampering with the raw data like they do in the referenced thread, as that will break the files being uploaded. So it's all regex. This wasn't tested very well, but seems to be working for my use case. Without further ado and in the hope that this may help someone else someday:

    function parse_raw_http_request(array &$a_data)
    {
        // read incoming data
        $input = file_get_contents('php://input');

        // grab multipart boundary from content type header
        preg_match('/boundary=(.*)$/', $_SERVER['CONTENT_TYPE'], $matches);
        $boundary = $matches[1];

        // split content by boundary and get rid of last -- element
        // (the boundary is quoted so regex metacharacters in it are harmless)
        $a_blocks = preg_split('/-+' . preg_quote($boundary, '/') . '/', $input);
        array_pop($a_blocks);

        // loop data blocks
        foreach ($a_blocks as $id => $block) {
            if (empty($block)) {
                continue;
            }

            // you'll have to var_dump $block to understand this and maybe replace \n or \r with a visible char

            // parse uploaded files
            if (strpos($block, 'application/octet-stream') !== FALSE) {
                // match "name", then everything after "stream" (optional) except for prepending newlines
                preg_match("/name=\"([^\"]*)\".*stream[\n|\r]+([^\n\r].*)?$/s", $block, $matches);
            }
            // parse all other fields
            else {
                // match "name" and optional value in between newline sequences
                preg_match('/name=\"([^\"]*)\"[\n|\r]+([^\n\r].*)?\r$/s', $block, $matches);
            }
            // the value group is optional, so default to an empty string
            $a_data[$matches[1]] = $matches[2] ?? '';
        }
    }

Usage by reference (in order not to copy around the data too much):

    $a_data = array();
    parse_raw_http_request($a_data);
    var_dump($a_data);

Solution – 5

I used Chris's example function and added some needed functionality, such as R Porter's need for arrays of $_FILES. Hope it helps some people.

Solution – 6

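I'm surprised no one mentioned parse_str or mb_parse_str (note that these only understand query-string style bodies, i.e. application/x-www-form-urlencoded, not multipart/form-data):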

    $result = [];
    $rawPost = file_get_contents('php://input');
    mb_parse_str($rawPost, $result);
    var_dump($result);
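For reference, a small sketch of when this shortcut applies. The wrapper parse_urlencoded_body() is a hypothetical name; parse_str() itself is the only real API used, and it expects the body to be in query-string form.

```php
<?php
// Hypothetical helper: parse a request body with parse_str(), but only when the
// Content-Type says it is urlencoded. Returns null for anything else.
function parse_urlencoded_body(string $body, string $contentType): ?array
{
    if (stripos($contentType, 'application/x-www-form-urlencoded') !== 0) {
        return null; // e.g. multipart/form-data needs a real multipart parser
    }
    parse_str($body, $fields);
    return $fields;
}

// A body as a urlencoded PUT would carry it; %5B%5D is "[]".
$fields = parse_urlencoded_body(
    'user_id=3&post_id=5&tags%5B%5D=a&tags%5B%5D=b',
    'application/x-www-form-urlencoded; charset=UTF-8'
);
```

All values come back as strings, and bracketed names are expanded into nested arrays just as they are in $_POST.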

Solution – 7

Here is a universal solution working with arbitrary multipart/form-data content and tested for POST, PUT, and PATCH:

    /**
     * Parse arbitrary multipart/form-data content
     * Note: null result or null values for headers or value means error
     * @return array|null [{"headers":array|null,"value":string|null}]
     * @param string|null $boundary
     * @param string|null $content
     */
    function parse_multipart_content(?string $content, ?string $boundary): ?array {
      if(empty($content) || empty($boundary)) return null;
      $sections = array_map("trim", explode("--$boundary", $content));
      $parts = [];
      foreach($sections as $section) {
        if($section === "" || $section === "--") continue;
        // split headers from value at the first blank line only
        $fields = explode("\r\n\r\n", $section, 2);
        // accept any number of header lines (plain fields have one, files usually two)
        if(preg_match_all("/([a-z0-9_-]+)\s*:\s*([^\r\n]+)/iu", $fields[0] ?? "", $matches, PREG_SET_ORDER) > 0) {
          $headers = [];
          foreach($matches as $match) $headers[$match[1]] = $match[2];
        } else $headers = null;
        $parts[] = ["headers" => $headers, "value" => $fields[1] ?? null];
      }
      return empty($parts) ? null : $parts;
    }
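The function above leaves each header as a raw string, so the field name still has to be dug out of Content-Disposition. A minimal sketch of that step, assuming quoted parameters as in the request shown in the question (parse_content_disposition() is a made-up name, not part of the answer):

```php
<?php
// Hypothetical helper: extract name and filename from a Content-Disposition value.
function parse_content_disposition(string $value): array
{
    $result = ['name' => null, 'filename' => null];
    // Parameters look like: name="image"; filename="photo.jpg"
    // \b keeps "name" from matching inside "filename".
    if (preg_match_all('/\b(name|filename)="([^"]*)"/i', $value, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $result[strtolower($m[1])] = $m[2];
        }
    }
    return $result;
}

$info = parse_content_disposition('form-data; name="image"; filename="/tmp/current_file"');
```

A part with a non-null filename can then be treated as an upload and everything else as a plain field. Unquoted or RFC 5987 style (filename*=) parameters are not handled here.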

Solution – 8

Update
The function was updated to support arrays in form fields. That is fields like level1[level2] will be translated into proper (multidimensional) arrays.

I've just added a small function to my HTTP20 library that can help with this. It is made to parse form data for PUT, DELETE and PATCH and add it to the respective static variable to simulate the $_POST global.
For now it’s just for text fields, though, no binary support, since I currently do not have a good use case in my project to properly test it and I’d prefer not to share something I can’t test extensively. But if I do get to it at some point – I will update this answer.
Here is the code:

    public function multiPartFormParse(): void
    {
        #Get method
        $method = $_SERVER['HTTP_ACCESS_CONTROL_REQUEST_METHOD'] ?? $_SERVER['REQUEST_METHOD'] ?? null;
        #Get Content-Type
        $contentType = $_SERVER['CONTENT_TYPE'] ?? '';
        #Exit if not one of the supported methods or wrong content-type
        if (!in_array($method, ['PUT', 'DELETE', 'PATCH']) || preg_match('/^multipart\/form-data; boundary=.*$/ui', $contentType) !== 1) {
            return;
        }
        #Get boundary value
        $boundary = preg_replace('/(^multipart\/form-data; boundary=)(.*$)/ui', '$2', $contentType);
        #Get input stream
        $formData = file_get_contents('php://input');
        #Exit if failed to get the input or if it's not compliant with the RFC2046
        if ($formData === false || preg_match('/^\s*--'.$boundary.'.*\s*--'.$boundary.'--\s*$/muis', $formData) !== 1) {
            return;
        }
        #Strip ending boundary
        $formData = preg_replace('/(^\s*--'.$boundary.'.*)(\s*--'.$boundary.'--\s*$)/muis', '$1', $formData);
        #Split data into array of fields
        $formData = preg_split('/\s*--'.$boundary.'\s*Content-Disposition: form-data;\s*/muis', $formData, 0, PREG_SPLIT_NO_EMPTY);
        #Convert to associative array
        $parsedData = [];
        foreach ($formData as $field) {
            $name = preg_replace('/(name=")(?<name>[^"]+)("\s*)(?<value>.*$)/mui', '$2', $field);
            $value = preg_replace('/(name=")(?<name>[^"]+)("\s*)(?<value>.*$)/mui', '$4', $field);
            #Check if we have multiple keys
            if (str_contains($name, '[')) {
                #Explode keys into array
                $keys = explode('[', trim($name));
                $name = '';
                #Build JSON array string from keys
                foreach ($keys as $key) {
                    $name .= '{"' . rtrim($key, ']') . '":';
                }
                #Add the value itself (as string, since in this case it will always be a string) and closing brackets
                $name .= '"' . trim($value) . '"' . str_repeat('}', count($keys));
                #Convert into actual PHP array
                $array = json_decode($name, true);
                #Check if we actually got an array and did not fail
                if (!is_null($array)) {
                    #"Merge" the array into existing data. Doing recursive replace, so that new fields will be added, and in case of duplicates, only the latest will be used
                    $parsedData = array_replace_recursive($parsedData, $array);
                }
            } else {
                #Single key - simple processing
                $parsedData[trim($name)] = trim($value);
            }
        }
        #Update static variable based on method value
        self::${'_'.strtoupper($method)} = $parsedData;
    }

Obviously you can safely remove the method check and the assignment to a static variable if you do not need those.
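As an alternative to building a JSON string, bracketed field names such as level1[level2] can also be expanded by handing them to parse_str(), which applies the same rules PHP uses for $_POST. This is a different technique from the one in the answer, shown only as a sketch; fields_to_array() is a hypothetical name.

```php
<?php
// Hypothetical helper: turn [name, value] pairs into a nested array
// by re-encoding them as a query string and letting parse_str() expand it.
function fields_to_array(array $pairs): array
{
    $query = [];
    foreach ($pairs as [$name, $value]) {
        // Encode everything, then restore [ and ] in the name so parse_str()
        // still sees the nesting. The value stays fully encoded.
        $encodedName = str_replace(['%5B', '%5D'], ['[', ']'], urlencode($name));
        $query[] = $encodedName . '=' . urlencode($value);
    }
    parse_str(implode('&', $query), $result);
    return $result;
}

$data = fields_to_array([
    ['level1[level2]', 'a&b'],
    ['plain', 'x y'],
    ['list[]', '1'],
    ['list[]', '2'],
]);
```

Because parse_str() is doing the work, its usual quirks apply: dots and spaces in field names are turned into underscores, and the max_input_vars setting limits how many fields are accepted.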

cwhsu1984 / parse_raw_http_request.php

    // The code is inspired by the following discussions and post:
    // http://stackoverflow.com/questions/5483851/manually-parse-raw-http-data-with-php/5488449#5488449
    // http://www.chlab.ch/blog/archives/webdevelopment/manually-parse-raw-http-data-php
    /**
     * Parse raw HTTP request data
     *
     * Pass in $a_data as an array to have the parsed fields merged into it.
     * The array is taken by value; the merged result is returned.
     *
     * Any files found in the request will be added by their field name to the
     * $a_data['files'] array.
     *
     * @param array Array to merge the parsed data into (optional)
     * @return array Associative array of request data
     */
    function parse_raw_http_request($a_data = [])
    {
        // read incoming data
        $input = file_get_contents('php://input');
        // grab multipart boundary from content type header
        preg_match('/boundary=(.*)$/', $_SERVER['CONTENT_TYPE'], $matches);
        // content type is probably regular form-encoded
        if (!count($matches)) {
            // we expect regular puts to contain a query string containing data
            // (parse_str() already url-decodes, so no extra urldecode() here)
            parse_str($input, $a_data);
            return $a_data;
        }
        $boundary = $matches[1];
        // split content by boundary and get rid of last -- element
        // (the boundary is quoted so regex metacharacters in it are harmless)
        $a_blocks = preg_split('/-+' . preg_quote($boundary, '/') . '/', $input);
        array_pop($a_blocks);
        $keyValueStr = '';
        // loop data blocks
        foreach ($a_blocks as $id => $block) {
            if (empty($block)) {
                continue;
            }
            // you'll have to var_dump $block to understand this and maybe replace \n or \r with a visible char
            // parse uploaded files
            if (strpos($block, 'application/octet-stream') !== FALSE) {
                // match "name", then everything after "stream" (optional) except for prepending newlines
                preg_match("/name=\"([^\"]*)\".*stream[\n|\r]+([^\n\r].*)?$/s", $block, $matches);
                $a_data['files'][$matches[1]] = $matches[2] ?? '';
            }
            // parse all other fields
            else {
                // match "name" and optional value in between newline sequences
                preg_match('/name=\"([^\"]*)\"[\n|\r]+([^\n\r].*)?\r$/s', $block, $matches);
                // encode the value so "&" or "=" inside it survive parse_str()
                $keyValueStr .= $matches[1] . '=' . urlencode($matches[2] ?? '') . '&';
            }
        }
        $keyValueArr = [];
        parse_str($keyValueStr, $keyValueArr);
        return array_merge($a_data, $keyValueArr);
    }

hericklr/parse-raw-http-request

Parses the raw data received with the PUT verb and handles it the way PHP handles POST (including, especially, the attached files)

  • expansion of field names with square brackets to multi-dimensional array
  • extraction of any attached files and writing to the temporary folder until the execution of the script is complete (with automatic removal at the end)
  • construction of the same data structure with details about each file received (name, type, tmp_name, error and size)
  • automatic scope execution and removal
  • currently supports the verbs get, post, put and delete
  • keeps the usual superglobal variables untouched ($_GET, $_POST and $_FILES)
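The file-handling items in the list above could be approximated along these lines. This is only a sketch of the idea, not the repository's code; store_upload() is a hypothetical name and the temp-file prefix is arbitrary.

```php
<?php
// Hypothetical helper: store one uploaded part in a temp file and describe it
// with the same keys $_FILES uses (name, type, tmp_name, error, size).
function store_upload(string $filename, string $type, string $content): array
{
    $name = basename($filename); // never trust a client-supplied path
    $tmp = tempnam(sys_get_temp_dir(), 'put');
    if ($tmp === false || file_put_contents($tmp, $content) === false) {
        return ['name' => $name, 'type' => $type, 'tmp_name' => '',
                'error' => UPLOAD_ERR_CANT_WRITE, 'size' => 0];
    }
    // Remove the file when the script ends, mirroring PHP's own upload cleanup.
    register_shutdown_function(static function () use ($tmp): void {
        if (is_file($tmp)) {
            unlink($tmp);
        }
    });
    return ['name' => $name, 'type' => $type, 'tmp_name' => $tmp,
            'error' => UPLOAD_ERR_OK, 'size' => strlen($content)];
}

$file = store_upload('/tmp/current_file', 'application/octet-stream', "abc");
```

One caveat with any approach like this: is_uploaded_file() and move_uploaded_file() only accept files that PHP itself received through its upload mechanism, so files written this way have to be moved with rename() or copy() instead.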

Just import the file at the beginning of the script that should receive the form data:

    require 'parse_raw_http_request.php';

    trigger_error(print_r($_GET, true));
    trigger_error(print_r($_POST, true));
    trigger_error(print_r($_PUT, true));
    trigger_error(print_r($_DELETE, true));
    trigger_error(print_r($_FILES, true));
    // .

There is a complete example in the "test" folder

Feel free to submit any contributions
