Transliterator::transliterate() - Transliterator class
Transliterator::transliterate()
(PHP 5 >= 5.4.0, PHP 7, PECL intl >= 2.0.0)
Transliterate a string
Description
Object-oriented style
public Transliterator::transliterate(string $subject [, int $start [, int $end ]]): string
Procedural style
transliterator_transliterate(mixed $transliterator, string $subject [, int $start [, int $end ]])
Transforms a string or part thereof using an ICU transliterator.
Parameters
$transliterator: In the procedural version, either a Transliterator or a string from which a Transliterator can be built.
$subject: The string to be transformed.
$start: The start index (in UTF-16 code units) from which the string will start to be transformed, inclusive. Indexing starts at 0. The text before will be left as is.
$end: The end index (in UTF-16 code units) until which the string will be transformed, exclusive. Indexing starts at 0. The text after will be left as is.
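The effect of $start and $end can be sketched as follows. This is a minimal illustration, assuming the intl extension is loaded and that the ICU transform ID "Any-Upper" is available; the sample strings are made up for the example.

```php
<?php
// Uppercase only the UTF-16 code units in the half-open range [1, 3).
$partial = transliterator_transliterate('Any-Upper', 'abcdef', 1, 3);
var_dump($partial); // expected: "aBCdef"

// 𝄞 (U+1D11E) occupies two UTF-16 code units, so the letters start at
// index 2, not index 1. Omitting $end transforms up to the end of the string.
$afterClef = transliterator_transliterate('Any-Upper', '𝄞abc', 2);
var_dump($afterClef); // expected: "𝄞ABC"
```

Because the indexes count UTF-16 code units rather than bytes or characters, offsets obtained from strlen() on UTF-8 input will generally not line up.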
Return Values
The transformed string on success, or FALSE on failure.
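Since failure is signalled by a FALSE return value, callers may want to check it explicitly. The sketch below is one way to do that in the object-oriented style; transliterateStrict() is a hypothetical helper name, and the code assumes PHP 7 or later with the intl extension loaded.

```php
<?php
// Hypothetical helper: returns the transformed string or throws with ICU's message.
function transliterateStrict(string $id, string $subject): string
{
    // Transliterator::create() returns null when the ID cannot be resolved.
    $transliterator = Transliterator::create($id);
    if ($transliterator === null) {
        throw new RuntimeException('Cannot create transliterator: ' . intl_get_error_message());
    }

    // transliterate() returns false on failure.
    $result = $transliterator->transliterate($subject);
    if ($result === false) {
        throw new RuntimeException('Transliteration failed: ' . $transliterator->getErrorMessage());
    }

    return $result;
}

echo transliterateStrict('Latin-ASCII', 'på'), "\n";
```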
Examples
Converting escaped UTF-16 code units
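A minimal sketch of such a conversion, assuming the intl extension is loaded and the ICU transform IDs "Hex-Any/Java" and "Any-Hex/Java" are available. The variable names are illustrative, and grapheme_strlen() is used here only to count characters.

```php
<?php
// Escaped UTF-16 code units; single quotes keep the backslashes literal.
$escaped = '\u304A\u65E9\u3046\u3054\u3056\u3044\u307E\u3059';
$decoded = transliterator_transliterate('Hex-Any/Java', $escaped);
echo $decoded, "\n";

// The reverse operation with a supplementary character (U+1D11E).
$supplChar = '𝄞';
echo grapheme_strlen($supplChar), "\n";

// Encoding yields two escaped UTF-16 code units (a surrogate pair).
$encoded = transliterator_transliterate('Any-Hex/Java', $supplChar);
echo $encoded, "\n";

// Decoding the pair gives the original character back.
$roundTrip = transliterator_transliterate('Hex-Any/Java', $encoded);
echo $roundTrip, "\n";
```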
The above example will output something similar to:
お早うございます
1
\uD834\uDD1E
𝄞
See Also
- Transliterator::getErrorMessage() - Get last error message
- Transliterator::__construct() - Private constructor to deny instantiation
I quite like hdogan's idea, but it misses at least one group of characters: ligatures. They are used in Norwegian at least, and I have read that French uses them too. Some are used purely for styling (e.g. the fi ligature). The following approach should support all characters, at least according to the documentation: first convert any character to a Latin character, then replace all Latin characters with their ASCII equivalents.
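The two-step conversion described in this note can be sketched with a compound transform ID. This is an illustration rather than the note author's original listing; toAscii() is a hypothetical helper name, and the code assumes the intl extension is loaded.

```php
<?php
// Hypothetical helper: chain "Any-Latin" (any script to Latin) with
// "Latin-ASCII" (Latin letters to their ASCII equivalents).
function toAscii($text)
{
    return transliterator_transliterate('Any-Latin; Latin-ASCII', $text);
}

var_dump(toAscii('A æ Übérmensch på høyeste nivå!'));
```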
Sorry for posting again, but I found a bug in my code: a character such as the Cyrillic ь (the soft sign, which has no sound) is translated by "Any-Latin" to a prime character, and "Latin-ASCII" does not touch prime characters. I therefore added a rule that removes all characters from \u0100 upward (the rule covers \u0100-\u7fff). Here is my new code, including an example:

var_dump(transliterator_transliterate('Any-Latin; Latin-ASCII; [\u0100-\u7fff] remove', "A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. fi"));
// string(50) "A ae Ubermensch pa hoyeste niva! I a lublu PHP! est. fi"

Another approach I found quite helpful, if you do not want to remove characters at all, is to use iconv() in addition. This should return ASCII characters only. See: http://stackoverflow.com/a/3542748/517914

An example:

var_dump(iconv("UTF-8", "ASCII//TRANSLIT//IGNORE", transliterator_transliterate('Any-Latin; Latin-ASCII', "A æ Übérmensch på høyeste nivå! И я люблю PHP! есть. fi")));
// string(50) "A ae Ubermensch pa hoyeste niva! I a lublu PHP! est'. fi"
You can create slugs easily with:
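A minimal procedural sketch of a slug function, assuming the intl extension is loaded. slugify() is a hypothetical name, and the transform chain and regular expression are one possible choice, not necessarily the ones the note's author used.

```php
<?php
// Hypothetical helper: transliterate to lowercase ASCII, then collapse every
// run of non-alphanumeric characters into a single hyphen.
function slugify($string)
{
    $ascii = transliterator_transliterate('Any-Latin; Latin-ASCII; Lower()', $string);
    if ($ascii === false) {
        return false; // propagate the transliteration failure
    }
    $slug = preg_replace('/[^a-z0-9]+/', '-', $ascii);
    return trim($slug, '-');
}

var_dump(slugify('Übérmensch på høyeste nivå!'));
```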
OOP version:
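A sketch of the same idea in the object-oriented style, assuming the intl extension is loaded. slugifyOop() is a hypothetical name; creating the Transliterator once and reusing it avoids rebuilding it on every call.

```php
<?php
// Hypothetical helper using a Transliterator instance instead of an ID string.
function slugifyOop($string)
{
    static $transliterator = null;
    if ($transliterator === null) {
        // Transliterator::create() returns null if the ID cannot be resolved.
        $transliterator = Transliterator::create('Any-Latin; Latin-ASCII; Lower()');
        if ($transliterator === null) {
            return false;
        }
    }

    $ascii = $transliterator->transliterate($string);
    if ($ascii === false) {
        return false;
    }

    // Collapse every run of non-alphanumeric characters into one hyphen.
    return trim(preg_replace('/[^a-z0-9]+/', '-', $ascii), '-');
}

var_dump(slugifyOop('Übérmensch på høyeste nivå!'));
```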
ASCII//TRANSLIT//IGNORE produces some possibly undesirable conversions, and your users may require custom handling. You might want to run a substitution up front for certain things, such as replacing currency symbols with their three-letter ISO codes. For example, £ transliterates to "lb", which is incorrect since it is a currency symbol, not a weight symbol (#).

ASCII//TRANSLIT//IGNORE does a great job within the realm of possibility :-) When it does not do what you want, you can set up a CSV with one replacement per line and run a function like:

function stripByMap($inputString, $mapFile) {
    // Each line of the map file has the form: search,replacement
    $csv = file($mapFile, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    foreach ($csv as $line) {
        $arrLine = explode(',', trim($line), 2);
        if (count($arrLine) < 2) {
            continue; // skip lines without a replacement column
        }
        $inputString = str_replace($arrLine[0], $arrLine[1], $inputString);
    }
    return $inputString;
}

Alternatively, you can write some regexes. Transliterating with ASCII//TRANSLIT//IGNORE works so well that your map probably won't be very long.
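The up-front substitution can also be sketched with an in-memory map instead of a CSV file. The helper name and the map contents below are illustrative assumptions, and the sketch pairs the substitution with transliterator_transliterate() rather than iconv(), whose //TRANSLIT output varies between platforms.

```php
<?php
// Hypothetical helper: replace currency symbols with ISO 4217 codes first,
// then transliterate whatever non-ASCII text remains.
function transliterateWithCurrencyCodes($text)
{
    // Illustrative map; extend it with whatever symbols your data contains.
    $map = array(
        '£' => 'GBP',
        '€' => 'EUR',
        '¥' => 'JPY',
    );
    $substituted = strtr($text, $map);
    return transliterator_transliterate('Any-Latin; Latin-ASCII', $substituted);
}

var_dump(transliterateWithCurrencyCodes('Price: £5 or €6'));
```

strtr() with an array performs all replacements in a single pass, so a replacement is never itself rewritten by a later entry in the map.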