fgetcsv() - Gets a line from a file pointer and parses it for CSV fields - PHP filesystem functions

Lele, 1 year ago (2023-11-21) · Views: 21 · #Technical articles

Article tags: file

fgetcsv()

(PHP 4, PHP 5, PHP 7)

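Gets a line from a file pointer and parses it for CSV fields

Description

fgetcsv(resource $handle [, int $length = 0 [, string $delimiter = ',' [, string $enclosure = '"' [, string $escape = '\\' ]]]]): array

Similar to fgets(), except that fgetcsv() parses the line it reads for fields in CSV format and returns an array containing those fields.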

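Parameters

$handle

A valid file pointer produced by fopen(), popen(), or fsockopen().

$length

Must be greater than the longest line in the CSV file. The parameter is optional as of PHP 5. If it is omitted (or set to 0 in PHP 5.0.4 and later), the line length is not limited, although this may be slightly slower.

$delimiter

Sets the field delimiter (one character only).

$enclosure

Sets the field enclosure character (one character only).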

$escape

Sets the escape character (one character only). Defaults to a backslash.

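Return Values

Returns an indexed array containing the fields read.

Note:

A blank line in a CSV file is returned as an array containing a single null field and is not treated as an error.

Note:

If PHP does not properly recognize the line endings when reading files on, or created by, a Macintosh computer, enabling the auto_detect_line_endings run-time configuration option may help resolve the problem.

fgetcsv() returns NULL if an invalid file pointer is supplied. On other errors, including end of file, it returns FALSE.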

Changelog

Version	Description
5.3.0	The $escape parameter was added.
4.3.5	fgetcsv() is now binary safe.
4.3.0	The $enclosure parameter was added.

Examples

Read and print the entire contents of a CSV file
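The example code for this heading is missing from this copy. What follows is a minimal sketch of the usual read loop, not the manual's own listing. The temporary file and its contents are made up so that the snippet runs on its own, and all five arguments are passed explicitly.

```php
<?php
// Hypothetical sample file, created only so the sketch is self-contained.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,name\n1,Alice\n2,Bob\n");

$rows = [];
if (($handle = fopen($path, 'r')) !== false) {
    // A length of 0 means "no line-length limit".
    while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $data;
        echo count($data) . ' fields: ' . implode(' | ', $data) . PHP_EOL;
    }
    fclose($handle);
}
unlink($path);
```

The loop compares against false with !== so that a row which happens to be "falsy" is not mistaken for end of file.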

Notes

Note:

This function is locale-sensitive. For example, if LANG is set to en_US.UTF-8, files in single-byte encodings will be read incorrectly.

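See Also

str_getcsv() Parse a CSV string into an array
explode() Split a string by another string
file() Read an entire file into an array
pack() Pack data into a binary string
fputcsv() Format a line as CSV and write it to a file pointer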

If you need to set auto_detect_line_endings to deal with Mac line endings, it may seem obvious but remember it should be set before fopen, not after:
This will work:

This won't, you will still get concatenated fields at the new line position:
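Both snippets are missing from this copy. A minimal sketch of the working order might look like the following, assuming a made-up file that uses bare "\r" line endings. Note that auto_detect_line_endings is deprecated as of PHP 8.1, so the sketch suppresses the notice and makes no promise about versions where the option is no longer honoured.

```php
<?php
// Hypothetical file with classic Mac line endings (a bare "\r" after each row).
$path = tempnam(sys_get_temp_dir(), 'mac');
file_put_contents($path, "a,b\rc,d\r");

// Set the option BEFORE fopen(); the stream picks it up when it is opened.
// The @ hides the deprecation notice emitted by PHP 8.1 and later.
@ini_set('auto_detect_line_endings', '1');

$handle = fopen($path, 'r');
$rows = [];
while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $rows[] = $row;
}
fclose($handle);
unlink($path);

// Where the option is honoured, each "\r"-terminated row comes back separately.
print_r($rows);
```

Calling ini_set() after fopen() is the non-working order the note warns about.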

This function has no special BOM handling. The first cell of the first row will inherit the BOM bytes, i.e. will be 3 bytes longer than expected. As the BOM is invisible you may not notice.
Excel on Windows, or text editors like Notepad, may add the BOM.

fgetcsv seems to handle newlines within fields fine. So in fact it is not reading a single line: it keeps reading until it finds a \n character that is not inside a quoted field.
Example:

Returns:
array(3) {
 [0]=>
 string(5) "col 1"
 [1]=>
 string(4) "col2"
 [2]=>
 string(4) "col3"
}
array(3) {
 [0]=>
 string(29) "this
is
having
multiple
lines"
 [1]=>
 string(8) "this not"
 [2]=>
 string(13) "this also not"
}
array(3) {
 [0]=>
 string(13) "normal record"
 [1]=>
 string(19) "nothing to see here"
 [2]=>
 string(7) "no data"
}
This means that you can expect fgetcsv to handle newlines within fields fine. This was not clear from the documentation.
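The example input for this note is missing from this copy. The sketch below reconstructs data from the output shown above (an assumption, since the original file is not available) and reads it from an in-memory stream rather than a real file.

```php
<?php
// Data reconstructed from the dumped output: the first field of the second
// record spans five physical lines inside its quotes.
$csv = "\"col 1\",\"col2\",\"col3\"\n"
     . "\"this\nis\nhaving\nmultiple\nlines\",\"this not\",\"this also not\"\n"
     . "\"normal record\",\"nothing to see here\",\"no data\"\n";

$handle = fopen('php://memory', 'w+');
fwrite($handle, $csv);
rewind($handle);

$records = [];
while (($record = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $records[] = $record;
}
fclose($handle);

var_dump($records);
```

Seven physical lines of input come back as three records, because line breaks inside the enclosure belong to the field.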

When a BOM character is supplied, `fgetcsv` may appear to wrap the first element in "double quotation marks". The simplest way to ignore it is to advance the file pointer to the 4th byte before using `fgetcsv`.
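A minimal sketch of that idea, assuming a UTF-8 BOM (the three bytes EF BB BF). The helper name openCsvSkippingBom() and the sample file are hypothetical. The pointer is only left past the first three bytes when they really are a BOM.

```php
<?php
// Hypothetical helper: open a CSV file and leave the pointer just past a
// UTF-8 BOM if the file starts with one.
function openCsvSkippingBom(string $path)
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return false;
    }
    if (fread($handle, 3) !== "\xEF\xBB\xBF") {
        rewind($handle); // No BOM: start again from byte 0.
    }
    return $handle;
}

// Made-up sample file that begins with a BOM.
$path = tempnam(sys_get_temp_dir(), 'bom');
file_put_contents($path, "\xEF\xBB\xBFname,city\nAnna,Oslo\n");

$handle = openCsvSkippingBom($path);
$header = fgetcsv($handle, 0, ',', '"', '\\');
fclose($handle);
unlink($path);

var_dump($header);
```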

Here is an OOP-based importer similar to the one posted earlier. However, this one is slightly more flexible in that you can import huge files without running out of memory; you just have to pass a limit to the get() method.
Sample usage for small files:-
-------------------------------------

Sample usage for large files:-
-------------------------------------

And here's the class:-
-------------------------------------
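The class and its usage samples are missing from this copy. The following is a hypothetical reconstruction, not the original author's code. The class name CsvImporter, the constructor arguments, and the exact behaviour of get() are assumptions based only on the description above.

```php
<?php
// Hypothetical sketch of a chunked CSV importer.
class CsvImporter
{
    private $handle;
    private $header = null;
    private $delimiter;

    public function __construct(string $path, bool $hasHeader = true, string $delimiter = ',')
    {
        $this->handle = fopen($path, 'r');
        if ($this->handle === false) {
            throw new RuntimeException("Cannot open $path");
        }
        $this->delimiter = $delimiter;
        if ($hasHeader) {
            $first = $this->readRow();
            $this->header = ($first === false) ? null : $first;
        }
    }

    public function __destruct()
    {
        if (is_resource($this->handle)) {
            fclose($this->handle);
        }
    }

    private function readRow()
    {
        return fgetcsv($this->handle, 0, $this->delimiter, '"', '\\');
    }

    // Returns up to $maxRows rows (0 = all remaining rows).
    // An empty array means the end of the file has been reached.
    public function get(int $maxRows = 0): array
    {
        $rows = [];
        while (($maxRows === 0 || count($rows) < $maxRows)
            && ($row = $this->readRow()) !== false) {
            if ($row === [null]) {
                continue; // Skip blank lines.
            }
            // Key the row by header names when the column counts match.
            $rows[] = ($this->header !== null && count($row) === count($this->header))
                ? array_combine($this->header, $row)
                : $row;
        }
        return $rows;
    }
}

// Made-up sample data; large files would be consumed in a loop of get($n) calls.
$path = tempnam(sys_get_temp_dir(), 'imp');
file_put_contents($path, "id,name\n1,Alice\n2,Bob\n3,Carol\n");

$importer = new CsvImporter($path);
$first = $importer->get(2); // first chunk of two rows
$rest  = $importer->get(2); // remaining row
$end   = $importer->get(2); // empty array at end of file
unset($importer);
unlink($path);
```

Because get() stops reading as soon as the limit is reached, only one chunk of rows is held in memory at a time.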

I needed a function to analyse a file for delimiters and line endings prior to importing the file into MySQL using LOAD DATA LOCAL INFILE
I wrote this function to do the job, the results are (mostly) very accurate and it works nicely with large files too.

Example Usage:

Full function output:
Array
(
  [peak_mem] => Array
    (
      [start] => 786432
      [end] => 786432
    )
  [line_ending] => Array
    (
      [results] => Array
        (
          [nr] => 0
          [r] => 4
          [n] => 4
          [rn] => 4
        )
      [count] => 4
      [key] => rn
      [value] => 
    )
  [lines] => Array
    (
      [count] => 4
      [length] => 94
    )
  [delimiter] => Array
    (
      [results] => Array
        (
          [colon] => 0
          [semicolon] => 0
          [pipe] => 0
          [tab] => 1
          [comma] => 17
        )
      [count] => 17
      [key] => comma
      [value] => ,
    )
  [read_kb] => 10
)
Enjoy!
Ashley
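The function itself is missing from this copy. As a much smaller illustration of the delimiter part only, here is a hypothetical helper (guessDelimiter() is not the original function) that counts candidate delimiters in the first 10 KB of a file and reports the most frequent one, loosely mirroring the [delimiter] section of the output above.

```php
<?php
// Hypothetical helper: guess the delimiter from a sample of the file.
function guessDelimiter(string $path, int $maxBytes = 10240): array
{
    // Read only the first $maxBytes so large files stay cheap to inspect.
    $sample = (string) file_get_contents($path, false, null, 0, $maxBytes);

    $candidates = [
        'colon'     => ':',
        'semicolon' => ';',
        'pipe'      => '|',
        'tab'       => "\t",
        'comma'     => ',',
    ];

    $results = [];
    foreach ($candidates as $name => $char) {
        $results[$name] = substr_count($sample, $char);
    }

    arsort($results);      // Highest count first, keys preserved.
    reset($results);
    $key = key($results);  // Name of the most frequent candidate.

    return [
        'results' => $results,
        'count'   => $results[$key],
        'key'     => $key,
        'value'   => $candidates[$key],
    ];
}

// Made-up sample: six commas and one tab inside a quoted field.
$path = tempnam(sys_get_temp_dir(), 'dlm');
file_put_contents($path, "id,name,city\n1,Alice,Oslo\n2,Bob,\"Bergen\tNorth\"\n");
$guess = guessDelimiter($path);
unlink($path);

print_r($guess);
```

This counts raw occurrences, so delimiters that appear inside quoted fields are counted too; it is a heuristic, not a parser.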

This style is shown as an example on this page and in a number of examples on the Internet:

Note, this won't handle new lines within csv fields and thus should probably be avoided.
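The snippet this note refers to is missing from this copy. Assuming it is the common idiom of splitting the file into lines with file() and parsing each line with str_getcsv(), the failure can be sketched as follows. The sample data is made up.

```php
<?php
// Two logical records; the second has a line break inside a quoted field.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,comment\n1,\"first line\nsecond line\"\n");

// Line-based approach: file() splits on every newline before any CSV parsing.
$lineBased = array_map(
    function ($line) {
        return str_getcsv($line, ',', '"', '\\');
    },
    file($path, FILE_IGNORE_NEW_LINES)
);

// Stream-based approach: fgetcsv() keeps reading until the quoted field closes.
$streamBased = [];
$handle = fopen($path, 'r');
while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $streamBased[] = $row;
}
fclose($handle);
unlink($path);

echo count($lineBased) . " rows vs " . count($streamBased) . " rows" . PHP_EOL;
```

The line-based version reports three rows because the quoted field is cut in two, while fgetcsv() returns the two records that are actually in the file.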

Note that fgetcsv, at least in PHP 5.3 or earlier, will NOT work with UTF-16 encoded files. Your options are to convert the entire file to ISO-8859-1 (latin1), or to read it line by line, convert each line to ISO-8859-1, and then use str_getcsv (or a backwards-compatible implementation of it). If you need to read non-Latin alphabets, it is probably best to convert to UTF-8.
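One possible shape for the "convert the entire file" option is sketched below. It assumes the input is UTF-16LE, that the iconv extension is available, and that the file is small enough to convert in memory. The helper name readUtf16Csv() and the sample data are hypothetical.

```php
<?php
// Hypothetical helper: convert a UTF-16LE file to UTF-8, then parse it.
function readUtf16Csv(string $path): array
{
    $raw = (string) file_get_contents($path);

    // Drop the UTF-16LE byte order mark (FF FE) if present.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        $raw = substr($raw, 2);
    }
    $utf8 = (string) iconv('UTF-16LE', 'UTF-8', $raw);

    // Parse the converted text from a temporary stream.
    $handle = fopen('php://temp', 'w+');
    fwrite($handle, $utf8);
    rewind($handle);

    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);
    return $rows;
}

// Made-up UTF-16LE sample with a BOM.
$path = tempnam(sys_get_temp_dir(), 'u16');
file_put_contents($path, "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE', "city,country\nMalmö,Sweden\n"));
$rows = readUtf16Csv($path);
unlink($path);

print_r($rows);
```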
See str_getcsv for a backwards-compatible version of it with PHP
Another version [modified michael from mediaconcepts]
For anyone else struggling with disappearing non-Latin characters in one-byte encodings: setting the LANG env var (as the manual states) does not help at all. Look at LC_ALL instead.
In my case it was set to "pl_PL.utf8", but since my input file was in CP1250, most of the Polish characters (but not all of them!) had gone missing and the city of "Łódź" had become just "dź". I've "fixed" it with "pl_PL".
If you want to load some translations for your application, don't use csv files for that, even if it's easier to handle.
The following code snippet:

is about 400% slower than this code:

That's the reason why you should always use .ini files for translations...
http://php.net/parse_ini_file
Only problem with fgetcsv(), at least in PHP 4.x -- any stray slash in the data that happens to come before a double-quote delimiter will break it -- ie, cause the field delimiter to be escaped. I can't find a direct way to deal with it, since fgetcsv() doesn't give you a chance to manipulate the line before it reads it and parses it...I've had to change all occurrences of '\"' to '" in the file first before feeding it to fgetcsv(). Otherwise this is perfect for that Microsoft-CSV formula, deals gracefully with all the issues.
I used fgetcsv to read pipe-delimited data files, and ran into the following quirk.
The data file contained data similar to this:
RECNUM|TEXT|COMMENT
1|hi!|some comment
2|"error!|another comment
3|where does this go?|yet another comment
4|the end!"|last comment
I read the file like this:

This causes a problem on record 2: the quote immediately after the pipe causes the file to be read up to the following quote --in this case, in record 4. Everything in between was stored in a single element of $row.
In this particular case it is easy to spot, but my script was processing thousands of records and it took me some time to figure out what went wrong.
The annoying thing is, that there doesn't seem to be an elegant fix. You can't tell PHP not to use an enclosure --for example, like this:

(Well, you can tell PHP that, but it doesn't work.)
So you'd have to resort to a solution where you use an extremely unlikely enclosure, but since the enclosure can only be one character long, it may be hard to find.
Alternatively (and IMNSHO: more elegantly), you can choose to read these files like this, instead:

As it's more intuitive and resilient, I've decided to favor this 'construct' over fgetcsv from now on.
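The replacement snippet is missing from this copy. Assuming the author meant plain line reading plus explode(), a minimal sketch using the sample data above could be:

```php
<?php
// The pipe-delimited sample from the note, written to a temporary file.
$path = tempnam(sys_get_temp_dir(), 'pipe');
file_put_contents(
    $path,
    "RECNUM|TEXT|COMMENT\n"
    . "1|hi!|some comment\n"
    . "2|\"error!|another comment\n"
    . "3|where does this go?|yet another comment\n"
    . "4|the end!\"|last comment\n"
);

$rows = [];
$handle = fopen($path, 'r');
while (($line = fgets($handle)) !== false) {
    // Strip the line ending, then split on the pipe.
    // Quotes are treated as ordinary data.
    $rows[] = explode('|', rtrim($line, "\r\n"));
}
fclose($handle);
unlink($path);

print_r($rows[2]);
```

The trade-off is that this approach cannot represent a delimiter or a line break inside a field, which is exactly what enclosures exist for.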
Setting the $escape parameter doesn't return unescaped strings, but just avoids splitting on a $delimiter that has an escape character in front of it:
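The original snippet is missing from this copy. As a related illustration, the sketch below shows that the escape character is not removed from the returned value. It uses str_getcsv(), which parses a single string with the same rules, and a made-up input with backslash-escaped quotes inside a quoted field.

```php
<?php
// One quoted field containing backslash-escaped quotes, then a second field.
// In a single-quoted PHP string, \" stays as a backslash followed by a quote.
$line = '"a \"b\" c",d';

$fields = str_getcsv($line, ',', '"', '\\');

var_dump($fields);
```

The first field comes back with its backslashes still in place, so any unescaping has to be done by the caller.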
