Set charset php script

PHP declare encoding

PHP 5.6 ships with the default_charset directive set to UTF-8. In some cases this causes problems for pages whose meta tag declares latin1. You can override the directive by calling ini_set('default_charset', 'iso-8859-1') in your scripts.

To do that, apply the override at the beginning of every PHP file that should be served as latin1.

One way to arrange this is to create a folder named "php" under your website root, put the ini_set() call into a config.php file there, and load that file at the top of each script.
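The original snippets did not survive on this page, so here is a minimal sketch of what such a config.php could look like. The php/config.php path follows the folder layout described above; the require_once line in the comment is an assumption about how each page would load it.

```php
<?php
// php/config.php (path assumed from the description above)
// Each page that should be served as latin1 would start with something like:
//   require_once __DIR__ . '/php/config.php';

// Override the UTF-8 default that PHP 5.6 introduced.
ini_set('default_charset', 'iso-8859-1');

// Show the value PHP will now use for its default Content-Type charset.
echo ini_get('default_charset'), PHP_EOL;
```

Keeping the call in one shared file means the charset only has to be changed in one place later.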

If your php.ini is set to latin1 (ISO-8859-1) and you want to serve a UTF-8 (Unicode) page, you can force the encoding the same way, passing utf-8 instead of iso-8859-1: ini_set('default_charset', 'utf-8').

I hope you find my answer useful, I solved my problem in this way!

I appreciate your answer but the question was about PHP’s declare function when used to declare encoding, what it does, and how it differs from other ways of declaring the character set such as what you proposed here.

; Allows to set the default encoding for the scripts. This value will be used
; unless "declare(encoding=...)" directive appears at the top of the script.
; Only affects if zend.multibyte is set.
; Default: ""
;zend.script_encoding =

Directives are handled as the file is being compiled.

A script’s encoding can be specified per-script using the encoding directive.

In other words if the zend.multibyte directive is set, an optional declare directive at the top of each PHP file can be used to declare each file’s character encoding. This means you can have each of your PHP files in different encodings as long as you declare their encodings at the top of each PHP file, and the string literals contained in each of the files will be transparently converted at compile time to the internal_encoding set in php.ini (tested in PHP 7.4.6). The default_charset and internal_encoding configuration options are not changed and your code is unaware of the original encodings since the conversions have taken place at compile time.

  1. How does this differ from setting the directives mbstring.internal_encoding (before PHP 5.6) and default_charset (as of PHP 5.6) or using the mb_internal_encoding() function?

internal_encoding directive (formerly mbstring.internal_encoding)

The declared character encoding at the top of each file is the actual encoding of said file, while the internal_encoding setting in php.ini is the desired character encoding. So if you want your code to see UTF-8 but your PHP files are saved in Windows-1252, you could set your internal_encoding in php.ini to UTF-8 while putting a declare directive at the top of each of your files stating that they are encoded as Windows-1252 and the string literals contained within them will be converted to UTF-8 at compile time. (Tested in PHP 7.4.6)
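As a rough illustration of the Windows-1252 scenario just described, a file might look like the sketch below. It assumes the file is really saved as Windows-1252 and that php.ini has zend.multibyte = On and internal_encoding = "UTF-8"; the variable name is only for illustration. If zend.multibyte is off, PHP ignores the declaration and emits a warning.

```php
<?php
declare(encoding='Windows-1252'); // must be the very first statement in the file

// Assumptions (not verified by this script):
//   - this file is saved on disk as Windows-1252
//   - php.ini has zend.multibyte = On and internal_encoding = "UTF-8"
$label = 'café';

// Under those assumptions the literal should already be UTF-8 at this point,
// so the é would show up as the two bytes c3 a9 instead of the single byte e9.
echo bin2hex($label), PHP_EOL;
```

Dumping the bytes with bin2hex() is a simple way to check which encoding the literal ended up in on a given server.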

This setting is used for multibyte modules such as mbstring and iconv.

If empty, default_charset is used.

For more information see mb_internal_encoding() function below


mb_internal_encoding function

Setting mb_internal_encoding at run time tells your mb_* functions what multibyte encoding you are using so that calls to functions like mb_strtolower will be able to recognize your multibyte characters so that they can substitute them with their lowercase equivalents. If you don’t set this at runtime it will assume the encoding set in the internal_encoding directive in php.ini.

The mb_internal_encoding function executes at runtime and therefore can’t be used to tell PHP what each PHP file’s declared encoding should be converted to at compile time. (See above.)

[Set/Get] the character encoding name used for the HTTP input character encoding conversion, HTTP output character encoding conversion, and the default character encoding for string functions defined by the mbstring module. You should notice that the internal encoding is totally different from the one for multibyte regex.
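A small sketch of the run-time behaviour described above, assuming the mbstring extension is loaded. The sample word is written with a \u{...} escape so the encoding of the script file itself does not matter.

```php
<?php
// Tell the mb_* functions that strings in this script are UTF-8.
mb_internal_encoding('UTF-8');

// "ÉCOLE": U+00C9 followed by ASCII letters.
$word = "\u{00C9}COLE";

// Uses the internal encoding set above to recognise the multibyte É.
$lower = mb_strtolower($word);

// The encoding can also be passed per call instead of relying on the global setting.
$lowerExplicit = mb_strtolower($word, 'UTF-8');

echo $lower, PHP_EOL;
echo mb_internal_encoding(), PHP_EOL; // called without arguments, it returns the current setting
```

Passing the encoding explicitly on each call avoids surprises when other code changes the global setting.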

default_charset directive

Setting the default_charset directive tells PHP what value to use in the content-type HTTP response header. For example content-type: text/html; charset=UTF-8

This directive also tells PHP what character encoding to look for in certain functions such as htmlspecialchars and htmlentities. For example if your default_charset is UTF-8 but your database is set to use latin1 then htmlspecialchars will have trouble with non-ascii characters if Windows-1252 is not specified as the encoding because Windows-1252 contains byte sequences that are considered invalid in UTF-8. It’s also used as the internal_encoding if the internal_encoding is not explicitly set.
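The htmlspecialchars problem can be reproduced without a database. The sketch below hard-codes the bytes a latin1 source would return for "café" and passes the encodings explicitly, which mirrors what happens when the argument is omitted and default_charset supplies the value. The flags are given explicitly because newer PHP versions include ENT_SUBSTITUTE in the default flags, which changes the result.

```php
<?php
// "café" as a latin1 / Windows-1252 source would deliver it: é is the single byte 0xE9.
$latin1 = "caf\xE9";

// Interpreted as UTF-8, the lone 0xE9 byte is an invalid sequence. Without
// ENT_SUBSTITUTE or ENT_IGNORE, htmlspecialchars() returns an empty string.
$asUtf8 = htmlspecialchars($latin1, ENT_QUOTES, 'UTF-8');

// Naming the real encoding lets the function pass the byte through unchanged.
$asCp1252 = htmlspecialchars($latin1, ENT_QUOTES, 'Windows-1252');

// Alternative: convert to UTF-8 first (needs mbstring), then escape as UTF-8.
$converted = htmlspecialchars(
    mb_convert_encoding($latin1, 'UTF-8', 'Windows-1252'),
    ENT_QUOTES,
    'UTF-8'
);

var_dump($asUtf8 === '');
echo bin2hex($asCp1252), PHP_EOL;
echo bin2hex($converted), PHP_EOL;
```

Converting once at the boundary (for example by setting the connection charset) is usually less error-prone than passing an encoding to every call.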

default_charset string

In PHP 5.6 onwards, "UTF-8" is the default value and its value is used as the default character encoding for htmlentities(), html_entity_decode() and htmlspecialchars() if the encoding parameter is omitted. The value of default_charset will also be used to set the default character set for iconv functions if the iconv.input_encoding, iconv.output_encoding and iconv.internal_encoding configuration options are unset, and for mbstring functions if the mbstring.http_input, mbstring.http_output and mbstring.internal_encoding configuration options are unset.

All versions of PHP will use this value as the charset within the default Content-Type header sent by PHP if the header isn’t overridden by a call to header().

Setting default_charset to an empty value is not recommended.


mysqli_set_charset

Sets the character set to be used when exchanging data with the database server.

Parameters

Procedural style only: a mysqli object returned by mysqli_connect() or mysqli_init().


The character set to be set.

Return Values

Returns true on success or false on failure.

Errors

If mysqli error reporting is enabled (MYSQLI_REPORT_ERROR) and the requested operation fails, a warning is generated. If, in addition, the mode is set to MYSQLI_REPORT_STRICT, a mysqli_sql_exception is thrown instead.

Examples

Example #1 mysqli::set_charset() example

Object-oriented style

mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$mysqli = new mysqli("localhost", "my_user", "my_password", "test");

printf("Initial character set: %s\n", $mysqli->character_set_name());

/* change character set to utf8mb4 */
$mysqli->set_charset("utf8mb4");

printf("Current character set: %s\n", $mysqli->character_set_name());

Procedural style

mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$link = mysqli_connect('localhost', 'my_user', 'my_password', 'test');

printf("Initial character set: %s\n", mysqli_character_set_name($link));

/* change character set to utf8mb4 */
mysqli_set_charset($link, "utf8mb4");

printf("Current character set: %s\n", mysqli_character_set_name($link));

The above examples will output something like:

Initial character set: latin1
Current character set: utf8mb4

Notes

Note:

To use this function on Windows platforms you need MySQL client library version 4.1.11 or above (for MySQL 5.0, version 5.0.6 or above).

Note:

This is the preferred way to set the character set. Using mysqli_query() for this purpose (for example SET NAMES utf8) is not recommended. For more information, see Character sets in MySQL.

See Also

  • mysqli_character_set_name() - Returns the current character set of the database connection
  • mysqli_real_escape_string() - Escapes special characters in a string for use in an SQL statement, taking into account the current character set of the connection
  • MySQL character set concepts
  • List of character sets that MySQL supports

User Contributed Notes 5 notes

Setting the charset (it’s really the encoding) like this after setting up your connection:
$connection->set_charset("utf8mb4")

FAILS to set the proper collation for the connection:

character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
character_set_filesystem: binary
character_set_results: utf8mb4
character_set_server: utf8mb4
character_set_system: utf8
collation_connection: utf8mb4_general_ci
collation_database: utf8mb4_unicode_ci
collation_server: utf8mb4_unicode_ci

If you use SET NAMES, that works:
$connection->query("SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci");

character_set_client: utf8mb4
character_set_connection: utf8mb4
character_set_database: utf8mb4
character_set_filesystem: binary
character_set_results: utf8mb4
character_set_server: utf8mb4
character_set_system: utf8
collation_connection: utf8mb4_unicode_ci
collation_database: utf8mb4_unicode_ci
collation_server: utf8mb4_unicode_ci

Please note, that I set the following variables on the server:

Set the following to be: utf8mb4_unicode_ci

character-set-client-handshake = FALSE or 0
skip-character-set-client-handshake = TRUE or 1

So in my case, I had tried changing the collation from utf8mb4_unicode_ci for MySQL and had to change it to utf8_general_ci. Then I added

mysqli_set_charset($con, 'utf8');

right before I did the SELECT command.

This is my code for reading from db :

$con = mysqli_connect($DB_SERVER, $DB_USER_READER, $DB_PASS_READER, $DB_NAME, $DB_PORT); // this is the unique connection for the selection

mysqli_set_charset($con, 'utf8');

$slct_stmnt = "SELECT " . $SELECT_WHAT . " FROM " . $WHICH_TBL . " WHERE " . $ON_WHAT_CONDITION;

$slct_query = mysqli_query($con, $slct_stmnt);

if ($slct_query == true) {
    // Do your stuff here . . .
}

And it worked like a charm. All the best. The above code works for reading Chinese, Russian, Arabic or any other international language from a table column holding such data.


Although the documentation says that using this function is preferred over using SET NAMES, it is not sufficient if you use a collation other than the default one:

// That will reset collation_connection to latin1_swedish_ci
// (the default collation for latin1):
$mysqli->set_charset('latin1');

// You have to execute the following statement *after* mysqli::set_charset()
// in order to get the desired value for collation_connection:
$mysqli->query("SET NAMES latin1 COLLATE latin1_german1_ci");
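If the charset and collation come from configuration rather than being hard-coded, the SET NAMES statement has to be assembled as a string, because these names cannot be bound as query parameters. Below is a sketch of a small guard for that. The function buildSetNames() is hypothetical and not part of mysqli, and the prefix check relies on the simplifying assumption that a collation name starts with its character set name.

```php
<?php
/**
 * Hypothetical helper: builds a SET NAMES statement from a charset and a
 * collation after checking that both look like plain identifiers.
 */
function buildSetNames(string $charset, string $collation): string
{
    foreach ([$charset, $collation] as $name) {
        // Accept only letters, digits and underscores to keep the SQL safe.
        if (preg_match('/^[A-Za-z0-9_]+$/', $name) !== 1) {
            throw new InvalidArgumentException("Unexpected identifier: $name");
        }
    }

    // Simplifying assumption: the collation is prefixed with the charset name,
    // as in latin1_german1_ci or utf8mb4_unicode_ci.
    if (strpos($collation, $charset . '_') !== 0) {
        throw new InvalidArgumentException("$collation does not look like a $charset collation");
    }

    return "SET NAMES $charset COLLATE $collation";
}

echo buildSetNames('latin1', 'latin1_german1_ci'), PHP_EOL;
```

The returned string would then be passed to $mysqli->query() after the set_charset() call, in the order the note above describes.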

To align both the character set (e.g., utf8mb4) AND the collation sequence with the schema (database) settings:

$mysqli = new mysqli(DB_HOST, DB_USER, DB_PASSWORD, DB_SCHEMA, DB_PORT);
if (0 !== $mysqli->connect_errno)
    throw new \Exception($mysqli->connect_error, $mysqli->connect_errno);

if (TRUE !== $mysqli->set_charset('utf8mb4'))
    throw new \Exception($mysqli->error, $mysqli->errno);

if (TRUE !== $mysqli->query('SET collation_connection = @@collation_database;'))
    throw new \Exception($mysqli->error, $mysqli->errno);

To confirm:

echo 'character_set_name: ', $mysqli->character_set_name(), PHP_EOL;
foreach ($mysqli->query("SHOW VARIABLES LIKE '%_connection';")->fetch_all() as $setting)
    echo $setting[0], ': ', $setting[1], PHP_EOL;

will output something like:
character_set_name: utf8mb4
character_set_connection: utf8mb4
collation_connection: utf8mb4_unicode_520_ci

Note that using utf8mb4 with this function may cause it to return false, depending on the MySQL client library compiled into PHP. If the client library predates the introduction of utf8mb4, PHP's call to the library's 'mysql_set_character_set' will return an error because it won't recognise that character set.

The only way you will know there’s an error is by checking the return value, because PHP warnings are not emitted by this function.
mysqli_error will return something like:
"Can't initialize character set utf8mb4 (path: /usr/share/mysql/charsets/)"
(I don’t think the directory has anything to do with it; I think the utf8mb4 vs utf8 distinction is handled internally)

A workaround is to call the function again with utf8, then run a 'SET NAMES' query with utf8mb4.

If your MySQL server is configured to use utf8 by default, then you may not notice any of this until you get obscure bugs. It seems it will still save into the database correctly in terms of bytes. However, you may get "Data too long for column" errors if you are truncating strings to fit fields, because from MySQL's point of view during the length check every 4-byte character will actually be multiple individual characters. This caused me hours of debugging.
