Convert file to array php

jaywilliams / csv_to_array.php


<?php
/**
 * Convert a comma separated file into an associative array.
 * The first row should contain the array keys.
 *
 * Example:
 *
 * @param string $filename Path to the CSV file
 * @param string $delimiter The separator used in the file
 * @return array
 * @link http://gist.github.com/385876
 * @author Jay Williams
 * @copyright Copyright (c) 2010, Jay Williams
 * @license http://www.opensource.org/licenses/mit-license.php MIT License
 */
function csv_to_array($filename = '', $delimiter = ',')
{
    if (!file_exists($filename) || !is_readable($filename))
        return FALSE;

    $header = NULL;
    $data = array();
    if (($handle = fopen($filename, 'r')) !== FALSE)
    {
        while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE)
        {
            if (!$header)
                $header = $row;
            else
                $data[] = array_combine($header, $row);
        }
        fclose($handle);
    }
    return $data;
}

/**
 * Example
 */
print_r(csv_to_array('example.csv'));
?>
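
One thing csv_to_array() above does not guard against is a data row whose field count differs from the header: array_combine() then throws a ValueError in PHP 8 (and returns false with a warning in earlier versions). The following is only a sketch of one way to handle that, not part of the original gist. The name csv_to_array_checked() and the sample data are made up for illustration, and it assumes PHP 7.4 or later so that an empty escape string can be passed to fgetcsv().

```php
<?php
// Hypothetical variant of csv_to_array(): rows whose field count differs
// from the header are skipped instead of being passed to array_combine().
function csv_to_array_checked(string $filename, string $delimiter = ','): array
{
    $data = [];
    $handle = @fopen($filename, 'r');
    if ($handle === false) {
        return $data; // unreadable file: return an empty list
    }

    $header = null;
    // Length 0 means "no line length limit"; '' disables the escape character.
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '')) !== false) {
        if ($row === [null]) {
            continue; // fgetcsv() reports a blank line as [null]
        }
        if ($header === null) {
            $header = $row;
        } elseif (count($row) === count($header)) {
            $data[] = array_combine($header, $row);
        }
    }
    fclose($handle);
    return $data;
}

// Usage with a temporary file (sample data invented for this sketch).
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,name\n1,Ann\n2\n3,Bob\n");
$rows = csv_to_array_checked($path);
unlink($path);
print_r($rows); // the short row "2" is skipped
```

Skipping malformed rows silently is a design choice; collecting them in a second array or throwing an exception may suit other callers better.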


str_getcsv

Parses a string input for fields in CSV format and returns an array containing the fields read.

Note:

The locale settings are taken into account by this function. If LC_CTYPE is e.g. en_US.UTF-8 , strings in one-byte encodings may be read wrongly by this function.

Parameters

string: The string to parse.

separator: Set the field delimiter (one single-byte character only).

enclosure: Set the field enclosure character (one single-byte character only).

escape: Set the escape character (at most one single-byte character). Defaults to a backslash (\). An empty string ("") disables the proprietary escape mechanism.

Note: Usually an enclosure character is escaped inside a field by doubling it; however, the escape character can be used as an alternative. So for the default parameter values "" and \" have the same meaning. Other than allowing the enclosure character to be escaped, the escape character has no special meaning; it isn't even meant to escape itself.
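
As a minimal sketch of the doubling rule described in the note above: the sample line is made up, and the call assumes PHP 7.4 or later, where an empty string is accepted as the escape argument.

```php
<?php
// A quote inside an enclosed field is written as two quotes ("").
$line = 'one,"two ""quoted"" words",three';

// Passing '' as the escape character turns off the backslash mechanism;
// doubled enclosure characters are still understood.
$fields = str_getcsv($line, ',', '"', '');

var_dump($fields);
// $fields is ['one', 'two "quoted" words', 'three']
```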

Return Values

Returns an indexed array containing the fields read.

Changelog

Version Description
7.4.0 The escape parameter now interprets an empty string as signal to disable the proprietary escape mechanism. Formerly, an empty string was treated like the default parameter value.

Examples

Example #1 str_getcsv() example

<?php
$string = 'PHP,Java,Python,Kotlin,Swift';
$data = str_getcsv($string);
var_dump($data);
?>

The above example will output:

array(5) {
  [0]=>
  string(3) "PHP"
  [1]=>
  string(4) "Java"
  [2]=>
  string(6) "Python"
  [3]=>
  string(6) "Kotlin"
  [4]=>
  string(5) "Swift"
}

User Contributed Notes 36 notes

[Editor’s Note (cmb): that does not produce the desired results, if fields contain linebreaks.]

Handy one liner to parse a CSV file into an array

$csv = array_map('str_getcsv', file('data.csv'));
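
To illustrate the editor's note about line breaks inside fields, here is a rough sketch that reads the same kind of data through fgetcsv() on a stream instead of splitting into lines first. The helper name read_csv_rows() and the sample text are hypothetical, and PHP 7.4 or later is assumed (arrow functions, empty escape string).

```php
<?php
// Hypothetical helper: parse CSV text record by record with fgetcsv(),
// which keeps a quoted field together even when it contains a newline.
function read_csv_rows(string $csvText): array
{
    $handle = fopen('php://memory', 'r+');
    fwrite($handle, $csvText);
    rewind($handle);

    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '')) !== false) {
        if ($row === [null]) {
            continue; // blank line
        }
        $rows[] = $row;
    }
    fclose($handle);
    return $rows;
}

$csvText = "id,comment\n1,\"first line\nsecond line\"\n";

// Splitting on newlines first cuts the quoted field in two.
$naive = array_map(
    fn ($line) => str_getcsv($line, ',', '"', ''),
    explode("\n", trim($csvText))
);

$rows = read_csv_rows($csvText);

echo count($naive), " rows when split by line\n"; // 3
echo count($rows), " rows with fgetcsv()\n";      // 2
```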

Based on James’ line, this will create an array of associative arrays with the first row column headers as the keys.

<?php
$csv = array_map('str_getcsv', file($file));
array_walk($csv, function (&$a) use ($csv) {
    $a = array_combine($csv[0], $a);
});
array_shift($csv); // remove column header
?>

This will yield something like
[2] => Array
    (
        [Campaign ID] => 295095038
        [Ad group ID] => 22460178158
        [...] => 3993587178

Because str_getcsv(), unlike fgetcsv(), does not split a CSV string into rows, I have found the following easy workaround:

<?php
$Data = str_getcsv($CsvString, "\n"); // parse the rows
foreach ($Data as &$Row) $Row = str_getcsv($Row, ";"); // parse the items in rows
?>

Why not use explode() instead of str_getcsv() to parse rows? Because explode() would not treat enclosed parts of the string or escaped characters correctly.
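
A small sketch of that difference, using a made-up sample line with a comma inside an enclosed field (the explicit empty escape argument assumes PHP 7.4 or later):

```php
<?php
$line = 'a,"b,c",d';

// explode() knows nothing about enclosures and splits inside the quotes.
$naive = explode(',', $line);

// str_getcsv() keeps the enclosed field together and drops the quotes.
$parsed = str_getcsv($line, ',', '"', '');

var_dump($naive);  // four pieces: 'a', '"b', 'c"', 'd'
var_dump($parsed); // three fields: 'a', 'b,c', 'd'
```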

Like some other users here noted, str_getcsv() cannot be used if you want to comply with either the RFC or with most spreadsheet tools like Excel or Google Docs.

These tools do not escape commas or new lines, but instead place double-quotes (") around the field. If there are any double-quotes in the field, these are escaped with another double-quote (" becomes ""). All this may look odd, but it is what the RFC and most tools do.

For instance, try exporting as .csv a Google Docs spreadsheet (File > Download as > .csv) which has new lines and commas as part of the field values and see how the .csv content looks, then try to parse it using str_getcsv(). It will fail spectacularly regardless of the arguments you pass to it.
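
The quoting convention described in this note can be seen without a spreadsheet by writing one record with fputcsv() and looking at the raw text. This is only an illustrative sketch: the field values are invented, and it assumes PHP 7.4 or later so that the escape character can be switched off with an empty string.

```php
<?php
// One record with a comma, a double quote and a line break in its fields.
$fields = ['a,b', 'say "hi"', "two\nlines"];

$handle = fopen('php://memory', 'r+');
fputcsv($handle, $fields, ',', '"', '');

// Look at the raw CSV text that was written.
rewind($handle);
$raw = stream_get_contents($handle);

// Read the record back from the stream with fgetcsv().
rewind($handle);
$back = fgetcsv($handle, 0, ',', '"', '');
fclose($handle);

echo $raw;
// "a,b","say ""hi""","two
// lines"

var_dump($back === $fields); // bool(true)
```

Reading from a stream with fgetcsv() is what lets the embedded line break survive; splitting the text into lines first would cut the last field in two.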

Here is a function that can handle everything correctly, and more:

— doesn’t use any for or while loops,
— it allows for any separator (any string of any length),
— option to skip empty lines,
— option to trim fields,
— can handle UTF8 data too (although .csv files are likely non-unicode).

Here is the more human readable version of the function:

<?php
// returns a two-dimensional array of rows and fields

function parse_csv($csv_string, $delimiter = ",", $skip_empty_lines = true, $trim_fields = true)
{
    // replace escaped double quotes ("") with a placeholder
    $enc = preg_replace('/(?<!")""/', '!!Q!!', $csv_string);
    // url-encode the content of every quoted field so that delimiters
    // and line breaks inside it no longer take part in the splitting
    $enc = preg_replace_callback(
        '/"(.*?)"/s',
        function ($field) {
            return urlencode(utf8_encode($field[1]));
        },
        $enc
    );
    $lines = preg_split($skip_empty_lines ? ($trim_fields ? '/( *\R)+/s' : '/\R+/s') : '/\R/s', $enc);
    return array_map(
        function ($line) use ($delimiter, $trim_fields) {
            $fields = $trim_fields ? array_map('trim', explode($delimiter, $line)) : explode($delimiter, $line);
            return array_map(
                function ($field) {
                    return str_replace('!!Q!!', '"', utf8_decode(urldecode($field)));
                },
                $fields
            );
        },
        $lines
    );
}

?>

Since this is not using any loops, you can actually write it as a one-line statement (one-liner).

Here’s the function using just one line of code for the function body, formatted nicely though:

<?php
// returns the same two-dimensional array as above, but with a one-liner code

function parse_csv($csv_string, $delimiter = ",", $skip_empty_lines = true, $trim_fields = true)
{
    return array_map(
        function ($line) use ($delimiter, $trim_fields) {
            return array_map(
                function ($field) {
                    return str_replace('!!Q!!', '"', utf8_decode(urldecode($field)));
                },
                $trim_fields ? array_map('trim', explode($delimiter, $line)) : explode($delimiter, $line)
            );
        },
        preg_split(
            $skip_empty_lines ? ($trim_fields ? '/( *\R)+/s' : '/\R+/s') : '/\R/s',
            preg_replace_callback(
                '/"(.*?)"/s',
                function ($field) {
                    return urlencode(utf8_encode($field[1]));
                },
                $enc = preg_replace('/(?<!")""/', '!!Q!!', $csv_string)
            )
        )
    );
}

?>

Replace !!Q!! with another placeholder if you wish.

PHP fails when parsing UTF-8 text that starts with a Byte Order Mark. Strip it from the string with the following code before passing the string to the CSV parser:

<?php
$bom = pack('CCC', 0xEF, 0xBB, 0xBF);
$body = $yourString; // unchanged when there is no BOM
if (strncmp($yourString, $bom, 3) === 0) {
    $body = substr($yourString, 3);
}
?>
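
The same idea can be wrapped in a small function. This is a sketch only: strip_utf8_bom() is a hypothetical name, the sample header is invented, and the str_getcsv() calls assume PHP 7.4 or later for the empty escape string.

```php
<?php
// Hypothetical helper: return $text without a leading UTF-8 BOM, if any.
function strip_utf8_bom(string $text): string
{
    return strncmp($text, "\xEF\xBB\xBF", 3) === 0 ? substr($text, 3) : $text;
}

$raw = "\xEF\xBB\xBFid,name";

// Without stripping, the BOM bytes end up inside the first field.
$dirty = str_getcsv($raw, ',', '"', '');
$clean = str_getcsv(strip_utf8_bom($raw), ',', '"', '');

var_dump($dirty[0] === 'id'); // bool(false)
var_dump($clean[0] === 'id'); // bool(true)
```

This matters most when the first row is used for array keys, because a lookup such as $row['id'] silently misses a key that carries the three BOM bytes.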

Here is a quick and easy way to convert a CSV file to an associative array:

<?php
/**
 * @link http://gist.github.com/385876
 */
function csv_to_array($filename = '', $delimiter = ',')
{
    if (!file_exists($filename) || !is_readable($filename))
        return FALSE;

    $header = NULL;
    $data = array();
    if (($handle = fopen($filename, 'r')) !== FALSE)
    {
        while (($row = fgetcsv($handle, 1000, $delimiter)) !== FALSE)
        {
            if (!$header)
                $header = $row;
            else
                $data[] = array_combine($header, $row);
        }
        fclose($handle);
    }
    return $data;
}
?>

I wanted the best of the 2 solutions by james at moss dot io and Jay Williams (csv_to_array()) — create associative array from a CSV file with a header row.

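<?php
$array = array_map('str_getcsv', file('data.csv'));

$header = array_shift($array);

array_walk($array, '_combine_array', $header);

// array_walk() passes the value, the key and the extra argument
function _combine_array(&$row, $key, $header)
{
    $row = array_combine($header, $row);
}

?>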

Then I thought why not try some benchmarking? I grabbed a sample CSV file with 50,000 rows (10 columns each) and Vulcan Logic Disassembler (VLD) which hooks into the Zend Engine and dumps all the opcodes (execution units) of a script — see http://pecl.php.net/package/vld and example here: http://fabien.potencier.org/article/8/print-vs-echo-which-one-is-faster

array_walk() and array_map() — 39 opcodes
csv_to_array() — 69 opcodes
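
Opcode counts describe how much code is compiled, not how long it takes to run, so a wall-clock measurement can be a useful complement. The following is a rough sketch only: time_call() is a hypothetical helper, the 1,000-line sample is generated in memory rather than taken from the 50,000-row file mentioned above, and hrtime() requires PHP 7.3 or later (arrow functions need 7.4).

```php
<?php
// Hypothetical helper: run $fn once and return the elapsed milliseconds.
function time_call(callable $fn): float
{
    $start = hrtime(true); // monotonic clock, in nanoseconds
    $fn();
    return (hrtime(true) - $start) / 1e6;
}

// Build a small in-memory sample: one header line plus 1,000 data lines.
$lines = ['id,name,city'];
for ($i = 1; $i <= 1000; $i++) {
    $lines[] = "$i,name$i,city$i";
}

$result = [];
$elapsed = time_call(function () use ($lines, &$result) {
    $rows = array_map(fn ($line) => str_getcsv($line, ',', '"', ''), $lines);
    $header = array_shift($rows);
    $result = array_map(fn ($row) => array_combine($header, $row), $rows);
});

printf("%d rows in %.2f ms\n", count($result), $elapsed);
```

A single run is noisy; repeating the call several times and comparing the fastest or median time gives a steadier picture.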

@normadize: that is a nice start, but it fails on situations where a field is empty but quoted (returning a string with one double quote instead of an empty string) and cases like """""foo""""" that should result in ""foo"" but instead return "foo". I also get a row with 1 empty field at the end because of the final CRLF in the CSV. Plus, I don't really like the !!Q!! magic or urlencoding to get around things. Also, \R doesn't work in pcre on any of my php installations.

//parse a CSV file into a two-dimensional array
//this seems as simple as splitting a string by lines and commas, but this only works if tricks are performed
//to ensure that you do NOT split on lines and commas that are inside of double quotes.
function parse_csv($str)
{
    //match all the non-quoted text and one series of quoted text (or the end of the string)
    //each group of matches will be parsed with the callback, with $matches[1] containing all the non-quoted text,
    //and $matches[3] containing everything inside the quotes
    $str = preg_replace_callback('/([^"]*)("((""|[^"])*)"|$)/s', 'parse_csv_quotes', $str);

    //remove the very last newline to prevent a 0-field array for the last line
    $str = preg_replace('/\n$/', '', $str);

    //split on LF and parse each line with a callback
    return array_map('parse_csv_line', explode("\n", $str));
}

//replace all the csv-special characters inside double quotes with markers using an escape sequence
function parse_csv_quotes($matches)
{
    //anything inside the quotes that might be used to split the string into lines and fields later,
    //needs to be quoted. The only character we can guarantee as safe to use, because it will never appear in the unquoted text, is a CR
    //So we're going to use CR as a marker to make escape sequences for CR, LF, Quotes, and Commas.
    //($matches[3] is not set when the match ends at the end of the string instead of at a quoted part)
    $str = str_replace("\r", "\rR", $matches[3] ?? '');
    $str = str_replace("\n", "\rN", $str);
    $str = str_replace('""', "\rQ", $str);
    $str = str_replace(',', "\rC", $str);

    //The unquoted text is where commas and newlines are allowed, and where the splits will happen
    //We're going to remove all CRs from the unquoted text, by normalizing all line endings to just LF
    //This ensures us that the only place CR is used, is as the escape sequences for quoted text
    return preg_replace('/\r\n?/', "\n", $matches[1]) . $str;
}

//split on comma and parse each field with a callback
function parse_csv_line($line)
{
    return array_map('parse_csv_field', explode(',', $line));
}

//restore any csv-special characters that are part of the data
function parse_csv_field($field)
{
    $field = str_replace("\rC", ',', $field);
    $field = str_replace("\rQ", '"', $field);
    $field = str_replace("\rN", "\n", $field);
    $field = str_replace("\rR", "\r", $field);
    return $field;
}
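
The cases raised in this note (empty quoted fields, runs of doubled quotes, a trailing CRLF) can also be handled without placeholders by walking the text one character at a time. The following is a sketch of that alternative technique, not any commenter's code: parse_csv_text() is a hypothetical name, the input is assumed to use only double quotes as the enclosure, and the sample data is invented.

```php
<?php
// Hypothetical helper: a small state machine that tracks whether the
// current position is inside a quoted field.
function parse_csv_text(string $text, string $delimiter = ','): array
{
    $rows = [];
    $row = [];
    $field = '';
    $inQuotes = false;
    $pending = false; // true while the current record has consumed input
    $length = strlen($text);

    for ($i = 0; $i < $length; $i++) {
        $char = $text[$i];
        $pending = true;
        if ($inQuotes) {
            if ($char !== '"') {
                $field .= $char; // delimiters and newlines are plain data here
            } elseif ($i + 1 < $length && $text[$i + 1] === '"') {
                $field .= '"';   // doubled quote stands for one literal quote
                $i++;
            } else {
                $inQuotes = false; // closing quote
            }
        } elseif ($char === '"') {
            $inQuotes = true;
        } elseif ($char === $delimiter) {
            $row[] = $field;
            $field = '';
        } elseif ($char === "\r" || $char === "\n") {
            if ($char === "\r" && $i + 1 < $length && $text[$i + 1] === "\n") {
                $i++; // treat CRLF as a single line ending
            }
            $row[] = $field;
            $rows[] = $row;
            $row = [];
            $field = '';
            $pending = false;
        } else {
            $field .= $char;
        }
    }

    // Flush the last record only if the text did not end with a line ending.
    if ($pending) {
        $row[] = $field;
        $rows[] = $row;
    }
    return $rows;
}

$sample = "a,\"\",\"\"\"\"\"foo\"\"\"\"\"\r\n\"x\ny\",2\r\n";
print_r(parse_csv_text($sample));
```

With this sample the first record has three fields (a, an empty string, and ""foo"" with two literal quotes on each side), the second record keeps the line break inside its first field, and the final CRLF does not produce an extra row. A blank line in the input yields a record with one empty field, which callers may want to filter out.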
