1.2.845. No mb_substr In Loop

Do not call mb_substr() inside a loop.

mb_substr() always starts at the beginning of the string to locate the nth character, and repeats that work on every call. The first iterations are therefore about as fast as substr() (for comparison), but each call gets slower as the offset grows: the longer the string, the slower the loop on mb_substr().
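The difference between byte offsets and character offsets can be illustrated with a short sketch. The sample string is an assumption chosen for illustration; it is written with a \u{...} escape so the result does not depend on the encoding of the source file.

```php
<?php

// "é" (U+00E9) takes two bytes in UTF-8, so byte and character offsets differ.
$string = "h\u{E9}llo";

$bytes = strlen($string);              // counts bytes
$chars = mb_strlen($string, 'UTF-8');  // counts characters

// substr() jumps straight to byte offset 1, which is only the first byte of "é".
$byteSlice = substr($string, 1, 1);

// mb_substr() has to count characters from the start to find character offset 1.
$charSlice = mb_substr($string, 1, 1, 'UTF-8');

printf("%d bytes, %d characters\n", $bytes, $chars);
printf("substr: 0x%s, mb_substr: %s\n", bin2hex($byteSlice), $charSlice);
```

Because a character may span several bytes, the position of the nth character cannot be computed directly from n, which is why each mb_substr() call walks the string again.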

The recommendation is to use preg_split() with the u modifier to split the string into an array of characters once, then loop over that array. This saves the repeated recalculations.

<?php

// Split the string by characters
$array = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
foreach($array as $c) {
    doSomething($c);
}

// Slow version
$nb = mb_strlen($string);
for($i = 0; $i < $nb; ++$i) {
    // Fetch a character
    $c = mb_substr($string, $i, 1);
    doSomething($c);
}

?>
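A rough way to compare both approaches on a given machine is sketched below. The helper names charsWithPregSplit() and charsWithMbSubstr() are hypothetical, and the sample string and its length are arbitrary assumptions. Timings vary with the PHP version and the input, so no figures are given here.

```php
<?php

// Hypothetical helper: split once, then iterate over the resulting array.
function charsWithPregSplit(string $string): array
{
    return preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
}

// Hypothetical helper: fetch each character with mb_substr(), as in the slow version.
function charsWithMbSubstr(string $string): array
{
    $chars = [];
    $nb = mb_strlen($string, 'UTF-8');
    for ($i = 0; $i < $nb; ++$i) {
        $chars[] = mb_substr($string, $i, 1, 'UTF-8');
    }
    return $chars;
}

// Arbitrary multibyte sample: 12 characters repeated 500 times.
$string = str_repeat("h\u{E9}llo w\u{F6}rld ", 500);

$start = microtime(true);
$fast = charsWithPregSplit($string);
$fastMs = (microtime(true) - $start) * 1000;

$start = microtime(true);
$slow = charsWithMbSubstr($string);
$slowMs = (microtime(true) - $start) * 1000;

printf("preg_split: %.2f ms\n", $fastMs);
printf("mb_substr:  %.2f ms\n", $slowMs);
```

Both helpers return the same list of characters, so the loop body can stay unchanged when switching from one to the other.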

See also "Optimization: How I made my PHP code run 100 times faster" and "How to iterate UTF-8 string in PHP?".

1.2.845.1. Suggestions

  • Use preg_split() and loop on its results.
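When applying this suggestion, note that preg_split() returns false instead of an array when the u modifier meets malformed UTF-8. A minimal sketch of a wrapper that handles this case follows. The function name splitUtf8Chars() is hypothetical and is not part of PHP or Exakat, and throwing an exception is only one possible way to react.

```php
<?php

// Hypothetical helper: split a UTF-8 string into characters,
// and fail loudly when preg_split() rejects the input.
function splitUtf8Chars(string $string): array
{
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        // With the u modifier, malformed UTF-8 makes preg_split() return false.
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return $chars;
}

// Loop on the result, one character at a time.
foreach (splitUtf8Chars("na\u{EF}ve") as $c) {
    echo $c, "\n";
}
```

Without such a check, a foreach on the false return value would emit a warning and skip the loop entirely.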

1.2.845.2. Specs

Short name

Performances/MbStringInLoop

Rulesets

All, Changed Behavior, Performances

Exakat since

1.9.6

PHP Version

All

Severity

Minor

Time To Fix

Quick (30 mins)

Precision

High

Features

csv

Available in

Enterprise Edition, Exakat Cloud