strcmp() - Binary safe string comparison - PHP string functions

Posted by 乐乐 on 2023-11-21 · #Technical articles

Tags: strings

strcmp()

(PHP 4, PHP 5, PHP 7)

Binary safe string comparison

Description

strcmp(string $str1, string $str2): int

Note that this comparison is case sensitive.

Parameters

$str1

The first string.

$str2

The second string.

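Return Values

Returns a value less than 0 if $str1 is less than $str2; a value greater than 0 if $str1 is greater than $str2; and 0 if they are equal.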

Examples

strcmp() example
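The example listing is missing from this copy of the page. A minimal sketch of a case-sensitive comparison follows; the variable names and values are chosen for illustration and are not taken from this page.

```php
<?php
// Illustrative values; $var1 and $var2 are assumed names.
$var1 = "Hello";
$var2 = "hello";

// strcmp() is case sensitive, so these two strings are not equal.
if (strcmp($var1, $var2) !== 0) {
    echo '$var1 is not equal to $var2 in a case sensitive string comparison', "\n";
}
```

Comparing the result against 0 with !== keeps the intent explicit, since any non-zero result means the strings differ.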

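See Also

strcasecmp() - Binary safe case-insensitive string comparison
preg_match() - Perform a regular expression match
substr_compare() - Binary safe comparison of two strings from an offset, up to a given length
strncmp() - Binary safe string comparison of the first n characters
strstr() - Find the first occurrence of a string
substr() - Return part of a string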

If you rely on strcmp() for safe string comparisons, both parameters must be strings; otherwise the result is extremely unpredictable.
For instance, you may get an unexpected 0, or return values of NULL, -2, 2, 3 and -3.
strcmp("5", 5) => 0
strcmp("15", 0xf) => 0
strcmp(61529519452809720693702583126814, 61529519452809720000000000000000) => 0
strcmp(NULL, false) => 0
strcmp(NULL, "") => 0
strcmp(NULL, 0) => -1
strcmp(false, -1) => -2
strcmp("15", NULL) => 2
strcmp(NULL, "foo") => -3
strcmp("foo", NULL) => 3
strcmp("foo", false) => 3
strcmp("foo", 0) => 1
strcmp("foo", 5) => 1
strcmp("foo", array()) => NULL + PHP Warning
strcmp("foo", new stdClass) => NULL + PHP Warning
strcmp(function(){}, "") => NULL + PHP Warning

I hope this gives you a clear idea of how strcmp() works internally.
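One way to avoid the surprises listed above is to reject non-string arguments before calling strcmp(). The sketch below uses a hypothetical helper, strict_strcmp(), which is not part of PHP; the exact results in the list above also depend on the PHP version, so treat this as an illustration rather than a definitive fix.

```php
<?php
// Hypothetical helper (not part of PHP): refuse anything that is not a string
// instead of letting strcmp() coerce or reject it in version-dependent ways.
function strict_strcmp($a, $b): int
{
    if (!is_string($a) || !is_string($b)) {
        throw new InvalidArgumentException('strict_strcmp() expects two strings');
    }
    return strcmp($a, $b);
}

var_dump(strict_strcmp("5", "5") === 0); // bool(true)

try {
    strict_strcmp("5", 5); // integer argument: rejected
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n";
}
```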

One big caveat: strings retrieved from the backtick operator may be zero-terminated (C-style), and therefore will not be equal to the non-zero-terminated strings (roughly Pascal-style) that are normal in PHP. The workaround is to wrap every `` pair or shell_exec() call in trim(). This is likely to be an issue with other functions that invoke shells; I haven't bothered to check.
I observed this on Debian Lenny (and on RHEL 5, with minor differences).
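A minimal sketch of the trim() workaround described in this note. To keep it runnable anywhere, the command output is simulated with a literal string; the trailing newline and NUL byte are assumptions standing in for whatever the shell appended.

```php
<?php
// Simulated command output; a real call would be shell_exec('...') or backticks.
// The trailing "\n" and "\0" are assumed here for illustration.
$raw = "hello\n\0";

var_dump(strcmp($raw, "hello") === 0);       // bool(false)
var_dump(strcmp(trim($raw), "hello") === 0); // bool(true)
```

With its default character list, trim() removes spaces, tabs, newlines, carriage returns, vertical tabs and NUL bytes from both ends of the string.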

In summary, strcmp() does not necessarily use the ASCII code order of each character as in the 'C' locale, but instead parses each string to match language-specific character entities (such as 'ch' in Spanish, or 'dz' in Czech), whose collation order is then compared. When both character entities have the same collation order (such as 'ss' and 'ß' in German), they are compared relative to their code by strcmp(), or considered equal by strcasecmp().
The LC_COLLATE locale setting is then considered: only if LC_COLLATE=C or LC_ALL=C does strcmp() compare strings by character code.
Generally, most locales define the following order:
control, space, punctuation and underscore, digit, alpha (lower then upper with Latin scripts; or final, middle, then isolated, initial with Arabic script), symbols, others...
With strcasecmp(), the alpha subclass is ignored and all forms of a letter are considered equal.
Note also that some locales behave differently with accented characters: some consider them the same letter as the unaccented letter (with a minor collation order, e.g. French, Italian, Spanish), while others consider them distinct letters with an independent collation order (e.g. in the C locale, or in Nordic languages).
Finally, the collation does not consider individual characters but instead groups of characters that form a single letter:
- for example "ch" or "CH" in Spanish, which always sorts after all other strings beginning with 'c' or 'C', including "cz", but before 'd' or 'D';
- 'ss' and 'ß' in German;
- 'dz', 'DZ' and 'Dz' in some Central European languages written with the Latin script...
- UTF-8, UTF-16 (Unicode), S-JIS, Big5, ISO2022 character encoding of a locale (the suffix in the locale name) first decode the characters into the UCS4/ISO10646 code position before applying the rules of the language indicated by the main locale...
So be extremely careful about what you consider a "character", as it may just mean an encoding byte with no significance in the string collation algorithm: the first character of the string "cholera" in Spanish is "ch", not "c"!
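For comparison, PHP's own strcmp() compares the raw bytes of both strings regardless of locale. The function that consults LC_COLLATE is strcoll(), as another note on this page also points out. The sketch below shows both; the locale name es_ES.UTF-8 is an assumption and may not be installed on your system.

```php
<?php
// strcmp() looks only at byte values: 'h' (0x68) sorts before 'z' (0x7A).
var_dump(strcmp("ch", "cz") < 0); // bool(true)

// strcoll() uses the current LC_COLLATE setting. setlocale() returns false
// when the requested locale is unavailable, so check before relying on it.
$locale = setlocale(LC_COLLATE, 'es_ES.UTF-8'); // assumed locale name
if ($locale !== false) {
    echo "strcoll() under $locale: ", strcoll("ch", "cz"), "\n";
} else {
    echo "Locale not available; strcoll() keeps using the current locale.\n";
}
```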

Note a difference between versions 5.2 and 5.3:
echo (int)strcmp('pending',array());
will output -1 in PHP 5.2.16 (probably in all versions prior to 5.3)
but will output 0 in PHP 5.3.3.
Of course, you never need to pass an array as a parameter in string comparisons.

Just a short comment to the note of arnar at hm dot is: md5() is a hash function and therefore it may happen (although it is very unlikely) that the md5() checksums of two different strings will be equal (hash collision) ...

Some notes about the Spanish locale. I've read some notes saying that "CH", "RR" or "LL" must be considered a single letter in Spanish. That's not really true. "CH", "RR" and "LL" were considered single letters in the past (many years ago); for that you must use the "Traditional Sort". Nowadays, the Academy uses the Modern Sort and recommends no longer treating "CH", "RR" and "LL" as single letters. They must be considered two separate letters, and sorted and compared that way.
You just have to take a look at the Official Spanish Language Dictionary to see that for many years there has been no separate section for "CH", "LL" or "RR", i.e. words starting with CH must come after the ones starting with CG, and before the ones starting with CI.

php dot or dot die at phpuser dot net wrote that he had an unexpected difference between case-sensitive and case-insensitive comparison. The key there is that the case-insensitive comparison converts both strings to lowercase before comparing. Since the underscore character is in a different place relative to uppercase and lowercase letters, the result is different.
There is no 'clear' order of punctuation and other characters in or around the alphabet. Most code assumes ASCII order, in which case there are several characters before both upper- and lowercase, a few in between, and some after both upper- and lowercase.
Note also that many other/older sorting implementations sort accented characters wrongly, since they appear after all other alphabetical characters in most character sets. There is probably a function in PHP to take this into account, though.
Therefore I would not recommend making detailed assumptions about how punctuation and other characters sort in relation to alphabetical characters. If sorting these characters at a specific place and in a specific order is important to you, you should probably write a custom string comparison function that does it the way you want. Usually it's sufficient to have a consistent sorting order, though, which is what you get by using either strcmp() or strcasecmp() consistently.

1) If the two strings have identical BEGINNING parts, those parts are truncated from both strings.
2) The resulting strings are compared, with two possible outcomes:
a) if one of the resulting strings is an empty string, then the length of the non-empty string is returned (the sign depending on the order in which you pass the arguments to the function)
b) in any other case, just the numerical values of the FIRST characters are compared. The result is +1 or -1 no matter how big the difference between the numerical values is.

With Apache/2.4.37 (Win32) OpenSSL/1.1.1 PHP/7.2.12, strcmp() produces the following results:
(,a) = -1 //comparing with an empty string produces the length of the NON-empty string
(a,) = 1 // ditto
(,afox) = -4 // ditto
(afox,) = 4 // ditto
(,foxa) = -4 // ditto
(foxa,) = 4 // ditto
(a,afox) = -3 // The identical BEGINNING part ("a") is truncated from both strings. Then the remaining "fox" is compared to the remaining empty string in the other argument. Produces the length of the NON-empty string. Same as in all the above examples.
(afox,a) = 3 // ditto
(a,foxa) = -1 // Nothing to truncate. Just the numerical values of the first letters are compared
(foxa,a) = 1 // ditto
(afox,foxa) = -1 // ditto
(foxa,afox) = 1 // ditto

If you always want results of -1, 0 or 1 (like JS indexOf()), normalize the sign of strcmp()'s return value.
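The code for this note is missing from this copy. One possible sketch, assuming PHP 7 or later for the spaceship operator; strcmp_sign() is a hypothetical helper name.

```php
<?php
// Hypothetical helper: collapse any strcmp() result to -1, 0 or 1.
function strcmp_sign(string $a, string $b): int
{
    // The spaceship operator compares the result with 0 and yields its sign.
    return strcmp($a, $b) <=> 0;
}

var_dump(strcmp_sign("a", "afox"));  // int(-1)
var_dump(strcmp_sign("afox", "a"));  // int(1)
var_dump(strcmp_sign("fox", "fox")); // int(0)
```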


strcmp() returns strlen($str1)-strlen($str2) when one string is the leading part of the other. Otherwise it returns -1 or 1 if the two strings are not identical, and 0 when they are.

I hope the above example will help you.

Sometimes when you compare two strings that look "the same", you will find that they aren't. If you don't want to bother finding out why, then this is a simple solution:
$string = implode(str_split($string));
Comparing the md5 hashes of the strings is also a nice way to see whether they're equal:
echo md5($str1)."\n";
echo md5($str2)."\n\n";
____________
Arnar Yngvason
ThinkSoftware

Vulnerability (in PHP >=5.3) :

$ curl -d password=sekret http://andersk.scripts.mit.edu/strcmp.php
Welcome, authorized user!
$ curl -d password=wrong http://andersk.scripts.mit.edu/strcmp.php
Go away, imposter.
$ curl -d password[]=wrong http://andersk.scripts.mit.edu/strcmp.php
Welcome, authorized user!
SRC of this example: https://www.quora.com/Why-is-PHP-hated-by-so-many-developers
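The PHP source behind that demo is not shown here. The sketch below shows the kind of check that would behave this way, assuming a loose == 0 test on the result of strcmp(), followed by a stricter variant. Sending password[]=wrong makes the posted value an array; on the PHP versions this note targets, strcmp() then returns NULL with a warning, and NULL == 0 is true. Both function names are hypothetical.

```php
<?php
// Loose check (the pattern to avoid): on the PHP versions the note targets,
// strcmp() returns NULL for an array argument and NULL == 0 is true.
// Newer PHP versions throw a TypeError instead.
function loose_check($input, string $expected): bool
{
    return strcmp($input, $expected) == 0;
}

// Stricter check: require a string, then compare in constant time.
function safe_check($input, string $expected): bool
{
    return is_string($input) && hash_equals($expected, $input);
}

var_dump(safe_check('sekret', 'sekret'));  // bool(true)
var_dump(safe_check('wrong', 'sekret'));   // bool(false)
var_dump(safe_check(['wrong'], 'sekret')); // bool(false)
```

In real code the input would come from $_POST['password']; literals are used here so the sketch runs without a web server.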

For those that are confused about the way this function works:

Alphabetically 'a' precedes 'b'. If we view the strings as values, 'a' is less than 'b', and therefore strcmp('a', 'b') returns -1.
If we were searching through an alphabetically sorted list we'd have a numerical index ($i) and compare the search string ($sstr) against each member of the string list ($slist), using strcmp we can check whether to go "up"($i++) or "down"($i--) through this list. 
Here's the example function:
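The function itself is missing from this copy. Below is a hedged sketch of the idea, using binary search rather than stepping one entry at a time. $sstr and $slist follow the names in the note, while find_sorted() and the sample data are assumptions.

```php
<?php
// Hypothetical helper: binary search in a list sorted in strcmp() order.
// Returns the index of $sstr in $slist, or -1 when it is absent.
function find_sorted(array $slist, string $sstr): int
{
    $lo = 0;
    $hi = count($slist) - 1;
    while ($lo <= $hi) {
        $i = intdiv($lo + $hi, 2);
        $cmp = strcmp($sstr, $slist[$i]);
        if ($cmp === 0) {
            return $i;      // found
        }
        if ($cmp < 0) {
            $hi = $i - 1;   // search string sorts earlier: go "down"
        } else {
            $lo = $i + 1;   // search string sorts later: go "up"
        }
    }
    return -1;
}

$slist = ["apple", "banana", "cherry", "date"];
var_dump(find_sorted($slist, "cherry")); // int(2)
var_dump(find_sorted($slist, "fig"));    // int(-1)
```

Only the sign of strcmp()'s result is used, so the search does not depend on the exact magnitude returned.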

The definition of the return values of this function is listed correctly on this page; however, there is a common misconception in the notes previously posted here by users.
A previous poster said:
If $str1 == $str2 strcmp return 0.
If $str1  > $str2 strcmp return 1.
If $str1  < $str2 strcmp return -1.
That is not guaranteed: the documented contract is only that the result is less than 0, equal to 0, or greater than 0.

If you want to compare strings according to locale, use strcoll() instead.
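On the return-value point above, here is a small sketch of code that relies only on the sign of the result. The word list is made up for illustration.

```php
<?php
$words = ["pear", "apple", "fig", "apples"];

// usort() only needs a negative, zero or positive result, so strcmp() can be
// passed directly as the comparison callback.
usort($words, 'strcmp');
print_r($words); // apple, apples, fig, pear

// When branching on the result, test the sign rather than a specific value.
if (strcmp("a", "afox") < 0) {
    echo "\"a\" sorts before \"afox\"\n";
}
```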

When using strcmp() to compare values received from a form, keep in mind that the way you encapsulate (quote) the form values will affect your strcmp() results.
Example:

strcmp() will not return 0 for the values sent from this form.
However, if BOTH values are enclosed in single quotes or double quotes, strcmp() will return 0.

Regarding bizarre return values from str*cmp(), I was having similar troubles until I realized that I was attempting to compare a string with HTML formatting with its plain-text equivalent. The formatted string contained HTML tags, so the browser rendered it without showing the markup I was using. Consequently the formatted and unformatted strings were rendered identically in the browser. D'oh!
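
Here is my function to compare Russian words.
You can replace $abc with your own alphabet.
function strcmp_rus($str1, $str2)
{
  // Alphabet that defines the sort order.
  $abc = "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯабвгдеёжзийклмнопрстуфхцчшщъыьэюя";
  // Assumes UTF-8 input and the mbstring extension.
  $len = min(mb_strlen($str1, 'UTF-8'), mb_strlen($str2, 'UTF-8'));
  for ($i = 0; $i < $len; $i++) {
    $s1 = mb_strpos($abc, mb_substr($str1, $i, 1, 'UTF-8'), 0, 'UTF-8');
    $s2 = mb_strpos($abc, mb_substr($str2, $i, 1, 'UTF-8'), 0, 'UTF-8');
    if ($s1 < $s2) return -1;
    if ($s1 > $s2) return 1;
  }
  // Equal up to the shorter length: the shorter string sorts first.
  return mb_strlen($str1, 'UTF-8') <=> mb_strlen($str2, 'UTF-8');
}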

It's definitely worth noting that the return value of strcmp(), when used e.g. for password checking, is the opposite of that of the == operator.
For example:
$pw1 = "yeah";
$pw2 = "yeah";
if (strcmp($pw1, $pw2)) {  // strcmp() returns 0 here, which evaluates to false.
  // $pw1 and $pw2 are NOT the same.
} else {
  // $pw1 and $pw2 are the same.
}
Whereas the == operator would give us:
if ($pw1==$pw2) {  // This evaluates to true.
  // $pw1 and $pw2 are the same.
} else {
  // $pw1 and $pw2 are NOT the same.
}
Additionally, to check whether $pw1 and $pw2 are also of the same type, you can use the === operator.
