PHP 8.4 - The Quantum Leap
PHP 8.4 has introduced some huge leaps in the modernization of the language. Up to now, I’ve not really seen much of a deep dive into all the changes, so here’s one that might just blow your socks off.
OPcache Improvements
PHP 8.4 introduces significant changes to the OPcache JIT (Just-In-Time) compilation defaults, which have important implications for PHP performance and development practices. Here’s a comprehensive summary of these changes, their significance, and implications:
Overview of OPcache JIT
OPcache JIT is a feature introduced in PHP 8.0 that compiles PHP code into machine code at runtime, potentially improving performance by reducing the overhead of interpreting PHP scripts. This can lead to faster execution times, especially for CPU-intensive operations, as the code is executed directly by the CPU rather than being interpreted by the PHP engine
Specific Changes in PHP 8.4
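opcache.jit default value:
- Previous default: tracing
- New default in PHP 8.4: disable
opcache.jit_buffer_size default value:
- Previous default: 0
- New default in PHP 8.4: 64M
The net effect is unchanged: JIT remains disabled by default. Before PHP 8.4 it was off because the buffer size was 0; now it is off because opcache.jit is set to disable.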
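To see which of these defaults a given installation actually runs with, a small script can read the settings back. The following is a minimal sketch: jitSummary() is a hypothetical helper name, and the php.ini values in the comment are illustrative rather than recommended.

```php
<?php
// To opt in to JIT, php.ini would contain something like (illustrative values):
//   opcache.enable=1
//   opcache.jit=tracing
//   opcache.jit_buffer_size=64M

/**
 * Hypothetical helper: summarize the JIT-related settings PHP is running with.
 */
function jitSummary(): array
{
    // opcache_get_status() returns false when OPcache is not active for this SAPI.
    $status = function_exists('opcache_get_status') ? @opcache_get_status(false) : false;

    return [
        // ini_get() returns false when the OPcache extension is not loaded.
        'opcache.jit'             => ini_get('opcache.jit'),
        'opcache.jit_buffer_size' => ini_get('opcache.jit_buffer_size'),
        'jit_enabled'             => is_array($status) && !empty($status['jit']['enabled']),
    ];
}

var_dump(jitSummary());
```

On the command line, OPcache is usually inactive unless opcache.enable_cli is switched on, so jit_enabled will often report false there even when the web SAPI has JIT enabled.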
Rationale Behind the Changes
Stability and Compatibility: By setting opcache.jit to disable by default, PHP ensures that applications that do not explicitly enable JIT will not experience unexpected behavior or performance issues. This is crucial for maintaining the reliability of PHP applications across different environments
Simplification of Configuration: The change in default values simplifies the configuration process for developers who wish to enable JIT. By explicitly setting opcache.jit to disable, it becomes clearer that this setting needs to be changed to enable JIT
Performance Considerations: Increasing the default opcache.jit_buffer_size to 64M provides a more reasonable starting point for those who choose to enable JIT. This change reflects a balance between memory usage and performance, allowing for better performance out of the box without requiring immediate configuration adjustments
Addressing Segfault Issues: There have been concerns about segfaults related to JIT. By disabling JIT by default, PHP mitigates the risk of these issues affecting users who are not prepared to handle them.
Significance and Implications
For CPU-intensive tasks: JIT can significantly enhance performance by reducing the time taken to execute these tasks
For I/O-bound applications: The impact of JIT might be less pronounced since the bottleneck is not CPU processing but rather input/output operations
With JIT, PHP becomes more viable for workloads traditionally reserved for compiled languages such as C or C++, including persistent daemons, parsers, and other long-running CPU-intensive processes. The gains are largest for numerical code, which benefits applications that involve complex calculations.
Smaller Changes
Increased Minimum OpenSSL Version Requirement
PHP 8.4 has raised the minimum required version of OpenSSL to 1.1.1
This change is crucial for several reasons:
- Enhanced Security: It ensures that PHP applications can leverage the security features and improvements provided by OpenSSL 1.1.1.
- Modern Cryptographic Standards: This update aligns PHP with more secure and modern cryptographic protocols.
- Compatibility: Systems and applications using PHP 8.4 must have OpenSSL 1.1.1 or higher installed to use the OpenSSL functions.
This change reflects PHP’s commitment to maintaining a secure environment for web development, protecting developers and users against vulnerabilities associated with older OpenSSL versions
PHP_ZTS and PHP_DEBUG Are Now Booleans
PHP 8.4 changes the type of the PHP_ZTS and PHP_DEBUG constants:
- Data Type Change: Both constants have been changed from int to bool.
- PHP_ZTS: Indicates whether the current PHP build is thread-safe.
- PHP_DEBUG: Indicates whether the current PHP build is a debug build.
This change improves type safety and clarity, since both constants only ever express a true/false state.
Developers may need to update code that relied on integer values for these constants, for example strict comparisons against 0 or 1.
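A version-independent way to read these constants is to cast them to bool, which gives the same answer whether the build exposes them as int or bool. This is a small sketch; the variable names are just for illustration.

```php
<?php
// Casting to bool works on PHP 8.3 (int 0/1) and on PHP 8.4 (true/false) alike.
$isThreadSafe = (bool) PHP_ZTS;
$isDebugBuild = (bool) PHP_DEBUG;

// By contrast, a strict check such as PHP_ZTS === 1 can never be true on
// PHP 8.4, because the constant is no longer an integer.
printf("Thread-safe build: %s\n", $isThreadSafe ? 'yes' : 'no');
printf("Debug build: %s\n", $isDebugBuild ? 'yes' : 'no');
```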
Unicode 16 support
PHP 8.4 introduces support for Unicode 16.0 within its MBString extension, a significant upgrade from the Unicode 14.0 support in PHP 8.3
Key features include:
- Character Additions: Support for 10,301 new characters, including 5,185 from Unicode 16.0 alone, and 11 additional scripts.
- MBString Extension Enhancements: Updated character case folding rules and East Asian Width value assignments, affecting functions like mb_strtolower, mb_strtoupper, and mb_strwidth.
- Emoji Support: Improved handling of the latest emoji characters.
- Backward Compatibility: The update is designed to be backward compatible, but slight differences in data processing may occur.
- Parity Considerations: While MBString is updated to Unicode 16, other extensions like Intl and PCRE may still use older Unicode versions, potentially leading to minor inconsistencies in character handling.
Exit and Die Are Now Functions
The exit keyword and its alias die are language constructs that output a message and terminate the current script. In CLI applications, exit/die can be used to terminate the application with a given exit code.
Some language constructs such as require, include, echo, and exit work similarly to PHP functions, but they have their own tokens, functionality, and do not necessarily have return values nor need to be called with parentheses.
Because exit and die were language constructs, they could be called in several ways. Parentheses were optional, and the argument could be a string, which was printed to STDOUT, or an int, which was used as the exit code:
<?php
exit; // Allowed
exit(); // Allowed
exit(1); // Allowed
exit("Fatal error, exiting"); // Allowed
The optional “parameter”, prior to PHP 8.4, accepted a string or an int value, but it did not follow the same type juggling or strict typing behavior.
<?php
declare(strict_types=1);
exit([]);
// Warning: Array to string conversion in ... on line ...
// Array
In PHP 8.4, exit and die are declared as PHP functions. They have special handling to allow them to be called without the parentheses to ensure backward compatibility with older PHP applications.
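One practical consequence is that exit() now validates its argument like any other function. The sketch below assumes PHP 8.4 or later for the interesting branch and is guarded so that older versions skip it; the variable name $caught is arbitrary.

```php
<?php
// On PHP 8.4+, an unsupported argument type raises a TypeError instead of
// being converted to a string with a warning.
$caught = null;

if (PHP_VERSION_ID >= 80400) {
    try {
        exit([]); // the TypeError is thrown before the script can terminate
    } catch (TypeError $e) {
        $caught = $e->getMessage();
    }
}

echo $caught ?? 'Running on PHP < 8.4, check skipped', "\n";
```

Since the error is an ordinary TypeError, it can be caught and logged like any other argument type error.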
Removals
PHP 8.4 changes its extension management strategy by unbundling several extensions from the core distribution. This section covers the unbundling of the pspell, imap, and oci8/pdo_oci extensions and the reasons behind it.
Overview of Unbundled Extensions
- Pspell Extension: The Pspell extension, which provides spell-checking capabilities, has been moved from the PHP core to PECL (PHP Extension Community Library).
- IMAP Extension: The IMAP (Internet Message Access Protocol) extension, used for email handling, has also been unbundled and relocated to PECL.
- OCI8 and PDO_OCI Extensions: These extensions, used for interfacing with Oracle databases, have been removed from the PHP core and are now available through PECL.
Reasons for Unbundling
- Streamlining the PHP Core: The decision to unbundle these extensions is part of a broader strategy to streamline the PHP core by moving less commonly used extensions to PECL. This allows for a more modular approach where developers can choose to install only the extensions they need.
- Dependency Management: Particularly for OCI8 and PDO_OCI, the dependency on Oracle’s proprietary libraries complicates maintenance and updates within the PHP core.
- Resource Allocation: Moving these extensions to PECL allows the PHP development community to focus on core language features while specialized extensions can be maintained by those with specific expertise.
- Flexibility in Updates: Unbundling enables more frequent updates and bug fixes for these extensions, independent of the PHP core release cycle.
- Addressing Accumulated Issues: Some extensions, like OCI8 and PDO_OCI, have accumulated unfixed bugs over time, making it challenging to maintain them within the core.
Deprecations
PHP 8.4 introduces several significant deprecations as part of its ongoing effort to modernize the language, improve code clarity, and remove outdated or redundant features.
Alternative Signature for session_set_save_handler()
Deprecation:
- The overloaded signature of session_set_save_handler() that accepts six or more individual callable values, instead of a single SessionHandlerInterface object, has been deprecated
Reasons:
- Part of a broader initiative to remove overloaded functions
- Encourages the use of object-oriented programming principles
Impacts:
- Requires code refactoring for applications using the deprecated signature
- May affect session handling in existing applications
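The refactoring usually means wrapping the callables in a class that implements SessionHandlerInterface. The sketch below uses a hypothetical in-memory handler, ArraySessionHandler, purely to show the shape of the object-based call; a real handler would persist data somewhere durable.

```php
<?php
// Hypothetical in-memory handler, for illustration only.
final class ArraySessionHandler implements SessionHandlerInterface
{
    /** @var array<string, string> session id => serialized data */
    private array $data = [];

    public function open(string $path, string $name): bool { return true; }
    public function close(): bool { return true; }
    public function read(string $id): string|false { return $this->data[$id] ?? ''; }
    public function write(string $id, string $data): bool { $this->data[$id] = $data; return true; }
    public function destroy(string $id): bool { unset($this->data[$id]); return true; }
    public function gc(int $max_lifetime): int|false { return 0; }
}

// Deprecated in PHP 8.4:
//   session_set_save_handler($open, $close, $read, $write, $destroy, $gc);
// Supported: pass one handler object. The second argument registers
// session_write_close() as a shutdown function.
$handler = new ArraySessionHandler();
session_set_save_handler($handler, true);
```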
Implicitly Marking Parameter Types as Nullable
Deprecation:
- The practice of implicitly marking parameters as nullable by setting a default value of null without explicitly declaring the type as nullable is deprecated
Reasons:
- Improves code clarity and consistency
- Enforces a clearer contract in function signatures
Impacts:
- Affects existing codebases that rely on implicit nullable behavior
- Requires review and update of function and method signatures
CURLOPT_BINARYTRANSFER Option
Deprecation:
- The CURLOPT_BINARYTRANSFER option in the Curl extension has been deprecated
Reasons:
- Redundancy: The option has had no functional impact since PHP 5.1.2
- Simplification of the Curl extension
Impacts:
- Minimal impact on most developers due to long-standing redundancy
- May cause deprecation warnings in existing code
E_STRICT Error Level
Deprecation:
- The E_STRICT error level has been deprecated
Reasons:
- Redundancy: E_STRICT has been included in E_ALL since PHP 5.4
- Simplification of the error reporting system
Impacts:
- Requires updates to error reporting configurations
- May cause deprecation notices in projects using E_STRICT explicitly
Escape Parameter in CSV Functions
Deprecation:
- Relying on the default value of the $escape parameter in CSV-related functions (e.g., fgetcsv(), fputcsv(), str_getcsv()) is deprecated; the parameter should now be passed explicitly
Reasons:
- Ensures compliance with established standards like RFC 4180
- Prevents creation of non-compliant CSV files
Impacts:
- Affects code that relies on default escape behavior in CSV functions
- May cause deprecation notices in existing CSV handling code
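In practice the fix is to pass the escape argument at every call site. The sketch below uses str_getcsv() with an empty escape string, which turns off PHP’s proprietary escape mechanism; the sample line is made up for the example.

```php
<?php
$line = 'a,"b ""quoted""",c';

// Passing $escape explicitly avoids the PHP 8.4 deprecation notice.
// With an empty string, only the doubled-quote convention of RFC 4180 applies.
$fields = str_getcsv($line, ',', '"', '');

var_dump($fields); // ['a', 'b "quoted"', 'c']
```

Code that depends on the old backslash behaviour can pass "\\" instead to keep its current output while silencing the notice.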
SUNFUNCS_RET_* Constants
Deprecation:
- The SUNFUNCS_RET_* constants (SUNFUNCS_RET_TIMESTAMP, SUNFUNCS_RET_STRING, SUNFUNCS_RET_DOUBLE) have been deprecated
Reasons:
- Part of phasing out older functions and constants
- Encourages use of more modern and flexible alternatives
Impacts:
- Affects code using these constants with date_sunrise() and date_sunset() functions (which were deprecated in PHP 8.1)
- Will emit deprecation notices when used
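Code that still needs sunrise and sunset times can move to date_sun_info(), which does not use these constants. The following is a minimal sketch; the coordinates (roughly Lisbon) and the date are sample values chosen only for the example.

```php
<?php
// Sample inputs, not taken from any real application.
$latitude  = 38.7223;
$longitude = -9.1393;
$timestamp = gmmktime(12, 0, 0, 6, 21, 2024);

// date_sun_info() returns an array keyed by event name. Values are Unix
// timestamps, or true/false when the sun never sets or never rises that day.
$info = date_sun_info($timestamp, $latitude, $longitude);

foreach (['sunrise', 'sunset'] as $event) {
    echo $event, ': ', gmdate('H:i:s', $info[$event]), " UTC\n";
}
```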
Implicit Nullable Deprecation
This deprecation deserves a closer look because it affects many existing codebases. In PHP 8.4, declaring a typed parameter with a default value of null, without marking the type itself as nullable, is deprecated.
The change aims to improve code clarity, reduce ambiguity, and prevent type-handling errors, as part of a broader initiative to enhance type safety and code readability in PHP.
Deprecated Behavior
Previously, PHP allowed parameters to be implicitly nullable by assigning a default value of null to a typed parameter. For example:
<?php
function exampleFunction(string $param = null) {
// Function logic
}
In this case, $param was implicitly nullable because it defaulted to null, but there was no explicit type declaration indicating this.
Recommended Approach
In PHP 8.4, the recommended approach is to explicitly declare the parameter as nullable using the ? syntax:
<?php
function exampleFunction(?string $param = null) {
// Function logic
}
Here, ?string explicitly indicates that $param can be either a string or null, making the code more readable and reducing ambiguity. Alternatively, developers can use union types (introduced in PHP 8.0) to declare nullable parameters:
<?php
function exampleFunction(string|null $param = null) {
// Function logic
}
This syntax is more verbose but provides the same level of explicitness.
AEGIS-128L and AEGIS-256 Support
PHP 8.4 adds the AEGIS-128L and AEGIS-256 encryption algorithms to the Sodium extension. They are available when the extension is compiled with libsodium version 1.0.19 or later. AEGIS is an AES-based family of authenticated encryption algorithms known for being faster than AES-GCM. The implementation in PHP 8.4 includes six new functions, three per algorithm following the _keygen, _encrypt, and _decrypt pattern, along with four constants for each algorithm.
These functions let developers generate cryptographically secure keys, encrypt messages, and decrypt them while verifying their authenticity, all within the familiar Sodium extension API. Both algorithms are significantly faster than the existing AES-GCM and ChaCha20-Poly1305 constructions, which makes them a good fit for high-speed encryption requirements in PHP applications.
AEGIS also has robust security characteristics, including resistance to partitioning oracle attacks and support for random nonces without practical limits.
The algorithms were developed as part of the CAESAR competition and are optimized for CPUs with hardware support for parallelizable AES block encryption.
By incorporating AEGIS, PHP 8.4 gives developers a high-performance, secure encryption option within the bundled Sodium extension.
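The round trip below is a minimal sketch of the _keygen, _encrypt, and _decrypt pattern for AEGIS-256. It assumes PHP 8.4 with a Sodium extension built against libsodium 1.0.19 or later, and skips the work when those functions are missing; the message and additional data are placeholder values.

```php
<?php
$plaintext = 'attack at dawn';
$decrypted = null;

// Guarded so the script still runs where AEGIS is unavailable.
if (function_exists('sodium_crypto_aead_aegis256_encrypt')) {
    $key   = sodium_crypto_aead_aegis256_keygen();
    $nonce = random_bytes(SODIUM_CRYPTO_AEAD_AEGIS256_NPUBBYTES);
    $aad   = 'header-v1'; // additional data: authenticated but not encrypted

    $ciphertext = sodium_crypto_aead_aegis256_encrypt($plaintext, $aad, $nonce, $key);

    // Yields false if the ciphertext, additional data, nonce, or key do not match.
    $decrypted = sodium_crypto_aead_aegis256_decrypt($ciphertext, $aad, $nonce, $key);
}

var_dump($decrypted === $plaintext);
```

The AEGIS-128L functions follow the same argument order, with aegis128l in place of aegis256 in the function and constant names.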
New Array Functionality
PHP 8.4 adds four new functions that help dramatically with common array tasks: array_find, array_find_key, array_any, and array_all. Frameworks have offered equivalents for a while, but native support makes them significantly more convenient to use, and they are likely to be better optimised than your own solutions.
The PHP 8.3 Versions
<?php
// PHP 8.3 way of finding elements and checking conditions
$fruits = ['apple' => 2, 'banana' => 5, 'orange' => 1, 'mango' => 4];
// Finding first element meeting a condition
$firstFruitWithMoreThanThree = null;
foreach ($fruits as $fruit => $quantity) {
if ($quantity > 3) {
$firstFruitWithMoreThanThree = $quantity;
break;
}
}
// Finding first key meeting a condition
$firstFruitKeyWithMoreThanThree = null;
foreach ($fruits as $fruit => $quantity) {
if ($quantity > 3) {
$firstFruitKeyWithMoreThanThree = $fruit;
break;
}
}
// Checking if any element meets condition
$hasAnyMoreThanThree = false;
foreach ($fruits as $quantity) {
if ($quantity > 3) {
$hasAnyMoreThanThree = true;
break;
}
}
// Checking if all elements meet condition
$allMoreThanOne = true;
foreach ($fruits as $quantity) {
if ($quantity <= 1) {
$allMoreThanOne = false;
break;
}
}
var_dump($firstFruitWithMoreThanThree); // int(5)
var_dump($firstFruitKeyWithMoreThanThree); // string(6) "banana"
var_dump($hasAnyMoreThanThree); // bool(true)
var_dump($allMoreThanOne); // bool(false)
The PHP 8.4 Versions
<?php
$fruits = ['apple' => 2, 'banana' => 5, 'orange' => 1, 'mango' => 4];
// array_find: Find first element meeting condition
$firstFruitWithMoreThanThree = array_find($fruits, fn($quantity) => $quantity > 3);
// array_find_key: Find first key meeting condition
$firstFruitKeyWithMoreThanThree = array_find_key($fruits, fn($quantity) => $quantity > 3);
// array_any: Check if any element meets condition
$hasAnyMoreThanThree = array_any($fruits, fn($quantity) => $quantity > 3);
// array_all: Check if all elements meet condition
$allMoreThanOne = array_all($fruits, fn($quantity) => $quantity > 1);
var_dump($firstFruitWithMoreThanThree); // int(5)
var_dump($firstFruitKeyWithMoreThanThree); // string(6) "banana"
var_dump($hasAnyMoreThanThree); // bool(true)
var_dump($allMoreThanOne); // bool(false)
Grapheme Split - Better String Splitting
PHP 8.4 introduces a new function called grapheme_str_split that splits strings into arrays based on grapheme clusters. This is particularly important for handling complex Unicode characters and emojis correctly. The function addresses a gap in PHP’s string handling capabilities, specifically for:
- Correctly splitting strings containing complex Unicode characters
- Handling emojis with skin modifiers (though there are some limitations with PCRE2 versions ≤ 10.43)
- Providing better internationalization support through ICU (International Components for Unicode)
<?php
// Split string into individual graphemes
$result = grapheme_str_split("Hello 👋🏽");
// ['H', 'e', 'l', 'l', 'o', ' ', '👋🏽']: the emoji and its skin-tone modifier stay together
// Split into chunks of specific length
$result = grapheme_str_split("Hello 👋🏽", 2);
// ['He', 'll', 'o ', '👋🏽']: chunks of up to 2 graphemes each
This function is particularly valuable for applications dealing with international text or emoji-heavy content, as it provides more accurate string splitting than traditional methods when working with Unicode characters.
The Addition of bcdivmod
The bcdivmod function, introduced in PHP 8.4, is a significant addition to the BCMath extension, designed to enhance arbitrary precision arithmetic operations. This new function combines division and modulus operations into a single call, offering improved efficiency and convenience for developers working with high-precision calculations. The primary purpose of bcdivmod is to simultaneously return both the quotient and remainder of a division operation, eliminating the need for separate function calls to bcdiv and bcmod.
The introduction of bcdivmod brings several benefits to PHP developers, particularly those working in fields that require high precision calculations such as financial applications, scientific computing, and cryptography. By providing both the quotient and remainder in one operation, bcdivmod ensures consistency between these values, which is crucial in scenarios where the relationship between the quotient and remainder is critical, such as in algorithms relying on modular arithmetic.
Additionally, this function maintains the arbitrary precision capabilities of the BCMath extension, allowing for accurate calculations with very large numbers beyond the limits of PHP’s native integer and float types
To illustrate the difference between the PHP 8.3 and PHP 8.4 approaches, let’s consider a simple example where we need to perform a division and obtain both the quotient and remainder. PHP 8.3 code example:
<?php
$dividend = '42';
$divisor = '10';
$scale = 2; // Number of decimal places for division
$quotient = bcdiv($dividend, $divisor, $scale);
$remainder = bcmod($dividend, $divisor);
echo "Quotient: " . $quotient . "\n";
echo "Remainder: " . $remainder . "\n";
// Output:
// Quotient: 4.20
// Remainder: 2
In this PHP 8.3 example, we need to make two separate function calls: bcdiv() for division and bcmod() for the modulus operation.
This approach requires more code and potentially more computational overhead, especially when dealing with large numbers or performing these operations frequently. Now, let’s look at how the same operation can be performed using the new bcdivmod function in PHP 8.4. PHP 8.4 code example using bcdivmod:
<?php
$dividend = '42';
$divisor = '10';
$scale = 2; // Number of decimal places for remainder
$result = bcdivmod($dividend, $divisor, $scale);
echo "Quotient: " . $result[0] . "\n";
echo "Remainder: " . $result[1] . "\n";
// Output:
// Quotient: 4
// Remainder: 2.00
In this PHP 8.4 example, a single call to bcdivmod() returns an array containing both the quotient and the remainder. Note that the quotient is always an integer (4 here, rather than the 4.20 produced by bcdiv() with a scale of 2), and the scale applies only to the remainder.
This approach is more concise and potentially more efficient in applications that frequently need both results, whether in financial calculations, cryptographic algorithms, or scientific computing.
The Introduction Of get_iana_id
PHP 8.4 adds a new timezone helper to its Internationalization (Intl) extension: the IntlTimeZone::getIanaID method and its procedural counterpart, the intltz_get_iana_id function. They complement the existing IntlTimeZone::getCanonicalID method and aim to provide more accurate and standardized timezone identifiers, which is crucial for applications that operate across multiple regions and need to handle time-related data with precision.
The intltz_get_iana_id function returns the IANA (Internet Assigned Numbers Authority) timezone identifier for a given timezone ID. It is particularly useful when a timezone identifier is deprecated or has been superseded, because it canonicalizes the identifier to ensure consistency and accuracy.
The function returns the corresponding IANA identifier if a valid timezone ID is provided, or false if the timezone ID is invalid.
The IntlTimeZone::getCanonicalID Method
The IntlTimeZone::getCanonicalID method, which predates PHP 8.4, serves a similar purpose. It provides a canonical system timezone ID or a normalized custom timezone ID for a given timezone ID.
This method not only returns the canonical timezone ID but can also indicate whether the ID is a system ID through the optional isSystemId parameter.
The primary purposes of these new additions are:
- To provide standardized IANA timezone identifiers, ensuring consistency across different systems and applications.
- To normalize timezone identifiers, which is particularly useful when dealing with user input or data from various sources that might use different timezone representations.
- To simplify the process of retrieving canonical timezone IDs, reducing the potential for errors in timezone management.
These functions enhance PHP’s capability to handle timezones more effectively, providing built-in support for retrieving standardized timezone identifiers directly within the PHP environment. This is a significant improvement over previous versions where developers often had to rely on manual mapping or external libraries to ensure timezone identifier consistency and accuracy
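A lookup might look like the sketch below. It assumes the Intl extension is loaded and that the underlying ICU library is recent enough to expose the IANA lookup, so the call is guarded; 'Asia/Calcutta' is used only as a sample legacy identifier.

```php
<?php
// Sample input: a legacy alias that newer timezone data has superseded.
$legacyId = 'Asia/Calcutta';
$ianaId   = null;

// The method only exists on PHP 8.4+ builds whose ICU version supports it.
if (class_exists('IntlTimeZone') && method_exists('IntlTimeZone', 'getIanaID')) {
    // Returns the IANA identifier as a string, or false for an unknown ID.
    $ianaId = IntlTimeZone::getIanaID($legacyId);
}

var_dump($ianaId);
```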
A Set Of Curl changes
PHP 8.4 introduces several significant enhancements to the cURL extension, providing developers with more powerful and flexible tools for handling HTTP requests and network operations. These changes aim to improve performance, security, and debugging capabilities. Here’s a comprehensive summary of the cURL changes in PHP 8.4 and their impact on PHP development:
CURLOPT_DNS_USE_GLOBAL_CACHE Now Has No Effect
The PHP Curl extension exposes a libcurl option that allowed Curl to use a shared global DNS cache. When enabled, the DNS information is cached between requests and Curl handles, even after a Curl handle is destroyed.
The option is enabled by setting the CURLOPT_DNS_USE_GLOBAL_CACHE Curl option via curl_setopt function. However, the libcurl Global DNS cache is not thread-safe, and in certain situations, it can lead to security vulnerabilities and undesired behavior. Thread-safe PHP builds prevent setting this option and emit a warning to alleviate this.
Libcurl deprecated this feature in version 7.11.1 and removed it in version 7.62.0. In PHP 8.4 and later, regardless of the libcurl version, setting CURLOPT_DNS_USE_GLOBAL_CACHE no longer has any effect.
Minimum LibCurl Version 7.61.0
The Curl extension exposes libcurl functionality in PHP. While it is possible to compile the Curl extension with any supported libcurl version, the extension requires a certain minimum libcurl version. This makes it easier for the extension to ensure that certain functionality and APIs are always available.
Prior to PHP 8.4, the Curl extension required libcurl version 7.29.0 (released in 2013) or higher. In PHP 8.4 and later, the Curl extension requires libcurl version 7.61.0 (released in 2018) or later.
This minimum requirement bump is made with the consideration of Linux distributions such as RHEL 7, CentOS 7, and Ubuntu 18 reaching their End Of Life dates by the time PHP 8.4 is released.
Enhanced curl_version() Function:
The curl_version() function now includes a new feature_list key in its return array. This key contains an associative array that lists all known cURL features and indicates whether each feature is supported (true) or not (false)
This enhancement allows developers to programmatically check for feature support, enabling more robust and portable PHP applications that rely on cURL for HTTP requests and other network operations
Impact: This change facilitates better compatibility checks, improves debugging and maintenance, and supports the development of more adaptive applications that can adjust their behavior based on available cURL capabilities.
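A feature check built on feature_list could be sketched as follows. curlSupports() is a hypothetical helper, and the key names 'HTTP3' and 'HTTP2' are assumptions about how the features are labelled; dumping curl_version()['feature_list'] on the target system shows the exact keys.

```php
<?php
/**
 * Hypothetical helper: true when the linked libcurl reports the named feature.
 */
function curlSupports(string $feature): bool
{
    if (!function_exists('curl_version')) {
        return false; // ext-curl is not loaded
    }

    // 'feature_list' only exists on PHP 8.4+, so fall back to an empty list.
    $features = curl_version()['feature_list'] ?? [];

    return (bool) ($features[$feature] ?? false);
}

var_dump(curlSupports('HTTP3'));
var_dump(curlSupports('HTTP2'));
```

Unknown feature names simply report false, so the helper degrades quietly on older PHP versions.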
New cURL Options:
CURLOPT_TCP_KEEPCNT: This option allows developers to specify the maximum number of TCP keep-alive probes that can be sent before dropping a connection. The default value is 9, but it can be set to any integer value of 0 or higher
This addition provides more granular control over TCP connections, which is particularly useful for optimizing network performance and reliability in applications dealing with specific network configurations or services.
CURLOPT_SERVER_RESPONSE_TIMEOUT: This option sets a timeout period (in seconds) for the server to send a response message for a command before the session is considered dead. It replaces the older CURLOPT_FTP_RESPONSE_TIMEOUT option and is applicable to multiple protocols including FTP, SFTP, SCP, IMAP, POP3, and SMTP
This change provides a more flexible and protocol-agnostic way to handle server response timeouts, reducing the risk of hanging processes due to unresponsive servers.
CURLOPT_DEBUGFUNCTION: This option allows developers to set a custom callback function for handling debug information during cURL requests. It provides more control over how debug information is processed and logged
This enhancement enables more sophisticated debugging and logging strategies, allowing developers to tailor error handling to their application’s specific needs.
New cURL Constants:
CURL_HTTP_VERSION_3 and CURL_HTTP_VERSION_3ONLY: These constants are used with the CURLOPT_HTTP_VERSION option to specify the use of HTTP/3 for requests
CURL_HTTP_VERSION_3 (value: 30) attempts to use HTTP/3 but falls back to earlier versions if not supported.
CURL_HTTP_VERSION_3ONLY (value: 31) forces the use of HTTP/3 exclusively.
These constants enable developers to leverage the benefits of HTTP/3, potentially improving performance and security in their applications.
CURLINFO_POSTTRANSFER_TIME_T: This constant is used with curl_getinfo() to retrieve the time it took from the start of a request until the last byte is sent (post-transfer time) in microseconds
This addition enhances the ability to measure and optimize the performance of data transfer operations, especially for scenarios involving large data uploads.
Backward Compatibility Considerations:
The new options and constants are generally backward compatible, as they are additions rather than modifications to existing functionality. However, the CURLOPT_DEBUGFUNCTION option is not compatible with CURLINFO_HEADER_OUT in PHP 8.4, and enabling both simultaneously will result in a ValueError exception
Some features, like CURLOPT_TCP_KEEPCNT and CURLINFO_POSTTRANSFER_TIME_T, require specific libcurl versions (8.9.0 and 8.10.0 respectively) to be fully functional
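Because several of these options depend on the libcurl version, it can help to apply them defensively. The sketch below only configures a handle and sends no request; the URL is a placeholder, the option values are arbitrary, and the debug callback shape (handle, type, data) is an assumption about how PHP invokes it.

```php
<?php
$applied = [];

if (function_exists('curl_init')) {
    $ch = curl_init('https://example.com/'); // placeholder URL, never requested here

    // Apply an option only if this build defines it and libcurl accepts it.
    $set = static function (string $option, mixed $value) use ($ch, &$applied): void {
        if (defined($option) && curl_setopt($ch, constant($option), $value)) {
            $applied[] = $option;
        }
    };

    if (defined('CURL_HTTP_VERSION_3')) {
        $set('CURLOPT_HTTP_VERSION', CURL_HTTP_VERSION_3); // HTTP/3 with fallback
    }
    $set('CURLOPT_TCP_KEEPCNT', 5);              // give up after 5 keep-alive probes
    $set('CURLOPT_SERVER_RESPONSE_TIMEOUT', 30); // seconds to wait for a server reply
    $set('CURLOPT_DEBUGFUNCTION', static function ($handle, int $type, string $data): void {
        error_log(sprintf('[curl debug %d] %s', $type, rtrim($data)));
    });
}

print_r($applied); // lists the options this environment accepted
```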
The Introduction of mb_ucfirst, mb_trim, mb_ltrim, mb_rtrim
The mb_ucfirst function is designed to address a long-standing limitation in PHP’s string handling capabilities. Its primary purpose is to provide a multibyte-safe version of the traditional ucfirst function
While ucfirst is limited to single-byte character encodings and only works with ASCII characters, mb_ucfirst extends this functionality to support multibyte encodings, ensuring that developers can correctly capitalize the first character of strings in various languages and scripts
Key aspects of mb_ucfirst functionality include:
Multibyte Support: It handles multibyte character encodings, which are common in languages such as Japanese, Chinese, Korean, and many others that use non-Latin scripts
Unicode Compliance: The function applies Unicode title case rules to the first character of a string, making it particularly useful for languages with complex character mappings, such as Georgian and Vietnamese
Encoding Specification: Developers can specify the character encoding, ensuring that the function correctly interprets and manipulates the string according to its encoding
Preservation of String Integrity: Unlike ucfirst, which might fail to recognize the boundaries of multibyte characters, mb_ucfirst respects the multibyte nature of characters, preventing string corruption
Benefits and Use Cases
The introduction of mb_ucfirst brings several benefits to PHP developers:
Improved Internationalization: It enables more accurate and culturally appropriate string capitalization across a wide range of languages and scripts
Simplified Code: Developers no longer need to implement complex workarounds or rely on third-party libraries for multibyte-safe string capitalization
Consistency with Unicode Standards: The function adheres to Unicode title case rules, ensuring proper capitalization even for characters with special casing rules
Enhanced Text Processing: It facilitates more robust text processing in multilingual applications, particularly for tasks like name formatting or title generation
Use cases for mb_ucfirst include:
- Capitalizing names or titles in multilingual content management systems
- Formatting user input in internationalized web applications
- Standardizing text display in multi-language user interfaces
- Handling proper nouns in natural language processing tasks
Code Examples
Here are some examples demonstrating the usage of mb_ucfirst:
<?php
// Basic usage
echo mb_ucfirst('test'); // Output: Test
// Handling non-ASCII characters
echo mb_ucfirst('łámał'); // Output: Łámał
// Unicode character handling
echo mb_ucfirst("\u{01CA}"); // Output: "\u{01CB}"
// Special cases (e.g., German Eszett)
echo mb_ucfirst("ß"); // Output: "Ss" (Only the first S is uppercase)
// Specifying encoding
echo mb_ucfirst('test', 'UTF-8'); // Output: Test
Impact on PHP Development
The introduction of mb_ucfirst has several implications for PHP development:
Enhanced Internationalization Capabilities: It significantly improves PHP’s ability to handle multilingual text processing, making it easier for developers to create truly global applications
Standardization of String Manipulation: mb_ucfirst provides a standard, built-in solution for a common string manipulation task, reducing reliance on custom implementations or external libraries
Performance Considerations: While mb_ucfirst may have a slight performance overhead compared to ucfirst due to the complexity of handling multibyte encodings, this trade-off is generally acceptable given the benefits of correct string manipulation in a multilingual context.
PHP 8.4 also introduces three additional new multibyte string functions: mb_trim, mb_ltrim, and mb_rtrim. These functions are designed to enhance PHP’s capability to handle multibyte strings, addressing a long-standing need in the language for more robust internationalization support.
The primary purpose of mb_trim, mb_ltrim, and mb_rtrim is to provide multibyte-safe string trimming operations. These functions are part of the mbstring extension and are specifically designed to handle multibyte encodings such as UTF-8, which are needed for text in languages like Japanese, Chinese, and Korean.
mb_trim: Removes whitespace or specified characters from both the beginning and end of a multibyte string.
mb_ltrim: Removes whitespace or specified characters from the beginning of a multibyte string.
mb_rtrim: Removes whitespace or specified characters from the end of a multibyte string.
These functions are analogous to their non-multibyte counterparts (trim, ltrim, rtrim) but are designed to correctly process multibyte characters, ensuring that string operations do not inadvertently break or incorrectly manipulate multibyte character sequences
Unlike the standard trimming functions, mb_trim, mb_ltrim, and mb_rtrim can safely handle multibyte character encodings, which is crucial for processing strings in languages with complex character sets. They accept an additional $encoding parameter, so developers can specify the character encoding explicitly and get correct results across different encoding schemes.
The default list of characters trimmed by these functions is also broader: it includes Unicode whitespace such as the characters in the Unicode separator ("Z") categories, which the traditional functions do not cover. Unlike the traditional trim functions, the mb_ variants do not support specifying a range of characters with the .. notation (such as "a..z"). Instead, each character to be trimmed must be listed explicitly.
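As a rough illustration of the difference described above, the sketch below trims a string padded with U+3000 (the ideographic space common in CJK text). The sample strings and variable names are made up for the example, and it assumes PHP 8.4 with the mbstring extension loaded.

```php
<?php
// U+3000 is the ideographic (full-width) space; trim() does not treat it as whitespace.
$padded = "\u{3000}東京\u{3000}";

$byteTrimmed = trim($padded);      // byte-oriented: U+3000 is left in place
$bothSides   = mb_trim($padded);   // removes U+3000 from both ends
$leftOnly    = mb_ltrim($padded);  // removes it from the start only
$rightOnly   = mb_rtrim($padded);  // removes it from the end only

// Custom characters are listed one by one (no ".." ranges), here two CJK brackets.
$unquoted = mb_trim('「引用」', '「」');

var_dump($byteTrimmed === $padded);
var_dump($bothSides, $leftOnly, $rightOnly, $unquoted);
```

The second argument is a plain list of characters, so trimming a whole range means spelling out every character in it.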
New Rounding Changes
PHP 8.4 adds four new rounding modes to the round() function, complementing the existing modes and offering more flexibility in handling various rounding scenarios
Passing Invalid Mode Now Throws A ValueError Exception
The round() function rounds a float value to the nearest integer or a decimal value of a specified precision. It supports fine-tuning the rounding method with an additional parameter.
Prior to PHP 8.4, passing an invalid rounding mode silently fell back to the default PHP_ROUND_HALF_UP mode. In PHP 8.4 and later, passing an invalid rounding mode throws a \ValueError exception instead of being treated as PHP_ROUND_HALF_UP.
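A minimal sketch of what this means for calling code, assuming PHP 8.4. The helper name tryRound and the mode value 99 are invented for the example; 99 simply stands in for any integer that is not a valid rounding mode.

```php
<?php
// Hypothetical helper: returns the rounded value as a string,
// or the text "ValueError" when the mode is rejected.
function tryRound(float $number, int $mode): string
{
    try {
        return (string) round($number, 0, $mode);
    } catch (ValueError $e) {
        return 'ValueError';
    }
}

echo tryRound(2.5, PHP_ROUND_HALF_UP), PHP_EOL; // valid mode
echo tryRound(2.5, 99), PHP_EOL;                // invalid mode: rejected on PHP 8.4
```

On versions before 8.4 the second call would quietly behave like PHP_ROUND_HALF_UP, so code that relied on that fallback needs an explicit mode.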
PHP_ROUND_CEILING
Rounds a number up to the nearest integer that is greater than or equal to the given number. Equivalent to the ceil() function when precision is set to zero.
<?php
round(1.3, 0, PHP_ROUND_CEILING); // Returns 2
round(-1.3, 0, PHP_ROUND_CEILING); // Returns -1
PHP_ROUND_FLOOR
Rounds a number down to the nearest integer that is less than or equal to the given number. Equivalent to the floor() function when precision is set to zero.
<?php
round(1.9, 0, PHP_ROUND_FLOOR); // Returns 1
round(-1.9, 0, PHP_ROUND_FLOOR); // Returns -2
PHP_ROUND_TOWARD_ZERO
Rounds a number towards zero by truncating the decimal part. Moves the number closer to zero regardless of its sign.
<?php
round(1.9, 0, PHP_ROUND_TOWARD_ZERO); // Returns 1
round(-1.9, 0, PHP_ROUND_TOWARD_ZERO); // Returns -1
PHP_ROUND_AWAY_FROM_ZERO
Rounds a number away from zero, increasing its absolute value. Moves the number further from zero regardless of its sign.
<?php
round(1.1, 0, PHP_ROUND_AWAY_FROM_ZERO); // Returns 2
round(-1.1, 0, PHP_ROUND_AWAY_FROM_ZERO); // Returns -2
DateTime Improvements
PHP 8.4 introduces significant enhancements to the DateTime and DateTimeImmutable classes, providing developers with more powerful and precise tools for handling date and time operations. These enhancements focus on improving microsecond precision handling and simplifying the creation of DateTime objects from timestamps. Here’s a comprehensive summary of the key DateTime enhancements in PHP 8.4:
DateTime::createFromTimestamp()
The createFromTimestamp() method is a new addition to both DateTime and DateTimeImmutable classes, offering a more straightforward and efficient way to create DateTime instances from Unix timestamps (https://php.watch/versions/8.4/datetime-createFromTimestamp). Key features and benefits:
Simplified Instantiation: This method allows direct creation of DateTime objects from timestamps, eliminating the need for complex workarounds or string manipulations
Microsecond Precision Support: It accepts both integer and float values for timestamps, enabling the creation of DateTime instances with microsecond precision
Improved Code Readability: The method enhances code clarity and reduces the potential for errors in timestamp handling
Performance Optimization: By providing a direct method for timestamp conversion, it potentially improves performance in applications with frequent DateTime operations
This method significantly simplifies the process of creating DateTime objects from timestamps, especially when compared to previous approaches like createFromFormat() or the @ prefix.
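The following sketch shows the method with an integer and a float timestamp, next to the older @ prefix form for comparison. The timestamp value is arbitrary, and the example assumes PHP 8.4.

```php
<?php
// Integer timestamp: whole seconds since the Unix epoch.
$fromInt = DateTimeImmutable::createFromTimestamp(1700000000);

// Float timestamp: the fractional part becomes microseconds.
$fromFloat = DateTimeImmutable::createFromTimestamp(1700000000.5);

// Older approach for comparison: the "@" prefix in the constructor string.
$legacy = new DateTimeImmutable('@1700000000');

echo $fromInt->format('Y-m-d H:i:s.u P'), PHP_EOL;
echo $fromFloat->format('Y-m-d H:i:s.u P'), PHP_EOL;
var_dump($fromInt == $legacy); // both represent the same instant
```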
getMicrosecond() and setMicrosecond() Methods
PHP 8.4 introduces getMicrosecond() and setMicrosecond() methods to both DateTime and DateTimeImmutable classes, enhancing the handling of microsecond precision
Precise Microsecond Manipulation: These methods allow developers to directly retrieve and set the microsecond component of a DateTime object without affecting other parts of the timestamp
Enhanced Time Precision: They provide a more intuitive and accurate way to handle microsecond-level precision, which is crucial for applications requiring high-resolution time measurements
Simplified Time Manipulation: Developers can now adjust the microsecond value independently of other time components, simplifying complex time manipulation tasks
Improved Code Readability: These methods enhance code clarity by providing dedicated functions for microsecond operations, reducing the need for complex string manipulations
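A short sketch of the two methods on DateTimeImmutable, assuming PHP 8.4. The date string is arbitrary sample data.

```php
<?php
$start = new DateTimeImmutable('2024-11-21 10:15:30.123456');

// Read the microsecond component as an integer.
$micro = $start->getMicrosecond();

// DateTimeImmutable returns a new object; $start itself is unchanged.
$adjusted = $start->setMicrosecond(999999);

echo $micro, PHP_EOL;
echo $adjusted->format('Y-m-d H:i:s.u'), PHP_EOL;
echo $start->format('Y-m-d H:i:s.u'), PHP_EOL;
```

Only the microsecond field differs between $start and $adjusted; the date, hour, minute, and second are carried over as they were.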
request_parse_body()
The request_parse_body() function introduced in PHP 8.4 extends PHP’s handling of HTTP request bodies to methods beyond the traditional POST request.
Its primary purpose is to streamline the parsing of HTTP request bodies for non-POST methods such as PUT, PATCH, and DELETE.
This function is designed to:
Parse multipart/form-data and application/x-www-form-urlencoded requests across various HTTP methods, expanding the flexibility of handling form data beyond just POST requests
Read the php://input stream and return an array with two elements that mirror the $_POST and $_FILES superglobals, leaving the superglobals themselves untouched
Comply with RFC1867 (multipart) requests, which are commonly used for file uploads and complex form submissions
Expose PHP’s built-in request parsing functionality, reducing the need for custom parsing logic and boilerplate code
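The points above might translate into a handler along these lines. This is only a sketch: the function name parseUpdateRequest and the shape of the returned array are invented for the example, and it assumes PHP 8.4 running under a web SAPI.

```php
<?php
// Hypothetical helper for a PUT/PATCH endpoint.
function parseUpdateRequest(): array
{
    try {
        // Element 0 mirrors $_POST, element 1 mirrors $_FILES.
        [$fields, $files] = request_parse_body();
    } catch (RequestParseBodyException | InvalidArgumentException $e) {
        // Raised for bodies that cannot be parsed; caught here so the
        // endpoint can answer with an error instead of crashing.
        return ['ok' => false, 'error' => $e->getMessage()];
    }

    return ['ok' => true, 'fields' => $fields, 'files' => $files];
}

// Only act on the request methods this sketch is meant for.
if (in_array($_SERVER['REQUEST_METHOD'] ?? '', ['PUT', 'PATCH'], true)) {
    header('Content-Type: application/json');
    echo json_encode(parseUpdateRequest());
}
```

Because the parsed data comes back as a return value, the handler decides what to do with it; $_POST and $_FILES keep whatever they held before the call.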
http_get_last_response_headers() and http_clear_last_response_headers()
PHP 8.4 improves HTTP header handling with two new functions: http_get_last_response_headers() and http_clear_last_response_headers().
The first returns the response headers from the most recent request made through PHP’s HTTP stream wrapper, or null if no such request has been made. The second clears the stored headers.
These functions are designed to replace the historical $http_response_header variable, which was implicitly created in the local scope and often confusing due to its “magic” nature.
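As a sketch of how the pair could be used together, the helper below fetches a URL through the HTTP stream wrapper and returns the first header line, which is normally the status line. The name fetchStatusLine is invented for the example, and the code assumes PHP 8.4.

```php
<?php
// Hypothetical helper: returns the status line of the response, or null
// when no headers were captured (for example, if the request failed early).
function fetchStatusLine(string $url): ?string
{
    // Start from a clean slate so stale headers from an earlier request are not reported.
    http_clear_last_response_headers();

    // Any HTTP stream wrapper call records its response headers.
    @file_get_contents($url);

    $headers = http_get_last_response_headers(); // array of header lines, or null

    return $headers[0] ?? null;
}
```

A call such as fetchStatusLine('https://example.com/') would then return the status line of that response, with example.com standing in for a real endpoint.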
PCRE2 and REGEX changes
PHP 8.4 updates the bundled PCRE2 (Perl-Compatible Regular Expressions) library, which powers PHP’s regular expression functions, to version 10.44. The update brings new features and changes to the regular expression syntax.
Key PCRE2 Syntax Changes in PHP 8.4
Quantifiers Without Minimum Quantity: PHP 8.4 now allows quantifiers without a specified minimum quantity, interpreting them as having a zero minimum. For example, /a{,3}/ is now equivalent to /a{0,3}/. This change aligns PHP with Perl 5.34.0 and Python, but differs from JavaScript and Java
Spaces in Curly Braces: Spaces and horizontal tabs are now permitted within quantifier curly braces, next to the braces and around the comma. For instance, /a{ 5,10 }/ is now a valid quantifier in PHP 8.4. This follows the behavior Perl adopted in version 5.34.0
Unicode 15 Support: PHP 8.4 includes support for Unicode 15, introducing new character classes and scripts such as Kawi and Nag Mundari. This enhances the expressiveness of regular expressions, particularly for applications dealing with diverse character sets
Expanded \w Character Class: In Unicode mode, the \w character class now includes non-spacing marks (\p{Mn}) and connector punctuation (\p{Pc}), in addition to letters, numbers, and underscores. This expands the match set by 1,849 characters, aligning PHP more closely with Perl’s behavior
Caseless Restrict Modifier: PHP 8.4 introduces a “caseless restrict” modifier, implemented using the (?r) and (?-r) syntax or the “r” flag. It prevents caseless matching from mixing ASCII and non-ASCII characters, offering more precise control over pattern matching
Variable-Length Lookbehind: Support for variable-length lookbehind assertions with a defined maximum length is now available. This allows patterns like (?<=Hello{1,5}) to be valid, increasing the flexibility of lookbehind assertions
Increased Named Capture Group Label Length: The maximum length for named capture group labels has been increased from 32 to 128 characters, allowing for more descriptive labels in complex regular expressions
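Several of these syntax changes can be probed with preg_match(). The sketch below is illustrative only: the patterns and subjects are made up, and the results depend on the PCRE2 version your PHP build is linked against, which is why the script prints PCRE_VERSION first.

```php
<?php
// The outcome depends on the linked PCRE2 library, so show its version.
echo 'PCRE version: ', PCRE_VERSION, PHP_EOL;

// Each label maps to a [pattern, subject] pair.
$cases = [
    'quantifier without minimum' => ['/^a{,3}$/', 'aa'],
    'spaces inside braces'       => ['/^a{ 2,3 }$/', 'aaa'],
    'variable-length lookbehind' => ['/(?<=Hel{1,5})o/', 'Hello'],
    // U+212A is the Kelvin sign, which is case-equivalent to "k" in Unicode.
    'caseless restrict (r flag)' => ['/k/iur', "\u{212A}"],
];

$results = [];
foreach ($cases as $label => [$pattern, $subject]) {
    // @ hides compile warnings on older builds, where preg_match() returns false.
    $results[$label] = @preg_match($pattern, $subject);
    printf("%-28s %s\n", $label, var_export($results[$label], true));
}
```

On PHP 8.4 with the bundled PCRE2 10.44, the first three cases are expected to report 1, and the last is expected to report 0 because the r flag stops the ASCII "k" from matching the non-ASCII Kelvin sign. Older builds may report 0 or false instead.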
PHP Integer Size now in phpinfo()
PHP 8.4 adds a PHP Integer Size field to the phpinfo() output. The field is only an indicative value; code that needs the integer size or the range of supported integers should use the following constants. PHP_INT_SIZE and PHP_INT_MAX have been available since PHP 5.0.5, and PHP_INT_MIN since PHP 7.0.
PHP_INT_SIZE: The integer size in bytes, e.g. a 64-bit build has PHP_INT_SIZE = 8.
PHP_INT_MIN: The minimum supported integer value; -2147483648 on 32-bit systems and -9223372036854775808 on 64-bit systems.
PHP_INT_MAX: The maximum supported integer value; 2147483647 on 32-bit systems and 9223372036854775807 on 64-bit systems.
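For code that needs these values at runtime, a small sketch like the following reads the constants directly instead of relying on phpinfo(). The variable names are arbitrary.

```php
<?php
// Integer width in bits, derived from the size in bytes.
$bits = PHP_INT_SIZE * 8;

printf("PHP_INT_SIZE: %d bytes (%d bits)\n", PHP_INT_SIZE, $bits);
printf("PHP_INT_MIN:  %d\n", PHP_INT_MIN);
printf("PHP_INT_MAX:  %d\n", PHP_INT_MAX);

// Going past PHP_INT_MAX does not wrap around; PHP converts the result to float.
$overflow = PHP_INT_MAX + 1;
var_dump(is_float($overflow));
```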