Made by Mike_Zhang

All posts in this series:
R Learning 1: Basics
R Learning 2: I/O
R Learning 3: Vector List Matrix
R Learning 4: Data Frame
R Learning 5: Graphics

1 Formatted Print

1.1 `print()`

print(x,digits=y)

The digits argument sets how many significant digits are printed (not the number of decimal places), for example:

> print(pi)
[1] 3.141593
> print(pi,digits=3)
[1] 3.14
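Because digits counts significant digits, the effect on larger numbers may be surprising. The following is a minimal sketch; the variable names x and s are only illustrative.

```r
# digits counts significant digits, not digits after the decimal point
x <- 1234.56789
print(x, digits = 3)              # the integer part is never truncated

# the same rule applies to every element of a vector
print(c(pi, exp(1)), digits = 3)

# format() follows the same rule and returns a string instead of printing
s <- format(x, digits = 3)
print(s)
```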

1.2 `cat()`

cat() has no digits argument, so format the value with format() first and pass the result to cat():

> cat(format(pi,digits=3),'\n')
3.14
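As a small sketch of the same idea with several values, format() can be applied to a whole vector before it is handed to cat(). The use of sprintf() for a fixed number of decimal places is an addition not covered in the text above; the variable names are illustrative.

```r
x <- c(pi, exp(1))

# format() is vectorized: each element is rendered with 3 significant digits
formatted <- format(x, digits = 3)
cat("values:", formatted, "\n")

# sprintf() is an alternative when a fixed number of decimals is wanted
two_decimals <- sprintf("%.2f", pi)
cat("pi to two decimals:", two_decimals, "\n")
```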

1.3 `options()`

options(digits=...) changes the format of all subsequent output in the session, so it is not recommended:

> pi
[1] 3.141593
> options(digits=3)
> pi
[1] 3.14
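If the global setting is needed temporarily, one way to limit the side effect is to save the old value and restore it afterwards. This is a sketch that relies on options() returning the previous settings; the name old is illustrative.

```r
# options() returns the previous values of the options it changes
old <- options(digits = 3)
print(pi)                 # printed with 3 significant digits

# restore the earlier setting so later output is unaffected
options(old)
print(pi)
restored <- getOption("digits")
```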

2 Write to File

> cat("something to write", file="filename")

By default this overwrites the file. To add to the end of an existing file instead, set append=TRUE:

> cat("something to write", file="filename", append=TRUE)

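Alternatively, open a file connection, store it in a variable, and pass it in place of the file name, which is convenient for repeated writes: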

f <- file("filename.txt","w")
cat("something to write", file=f)
cat("write something", file=f)
close(f)

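When writing to an open connection, each cat() call continues where the previous one stopped, so append=TRUE is not needed (the append argument only applies when file is a file name).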
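The two approaches can be compared in a self-contained sketch. It writes to a temporary file created with tempfile() (an assumption made so that the example leaves no files behind) and reads the result back with readLines().

```r
path <- tempfile(fileext = ".txt")

# 1) open connection: successive writes follow one another
f <- file(path, "w")
cat("first line\n", file = f)
cat("second line\n", file = f)
close(f)

# 2) file name: append = TRUE is required to keep the existing contents
cat("third line\n", file = path, append = TRUE)

result <- readLines(path)
print(result)
```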

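Alternatively, call sink() with a file name to redirect all output from print() and cat() to that file. When you are done, call sink() with no arguments to close the file and return output to the console: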

> sink("filename") # Begin write to file
...
# write something
...
> sink() # End writing to file, back to console
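A runnable version of this pattern might look as follows. The temporary file and the sample values are assumptions chosen for illustration.

```r
path <- tempfile(fileext = ".txt")

sink(path)                       # begin writing to the file
print(summary(c(1, 2, 3, 4)))    # goes to the file, not the console
cat("done\n")
sink()                           # back to the console

captured <- readLines(path)
print(captured)
```

Every sink(file) call should be paired with a sink() call; otherwise later output keeps going to the file.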

To see which files are in the current directory, use list.files(),
or list.files(all.files=TRUE) to include hidden files as well.

3 Read from Fixed-Width File

Use read.fwf(), giving the width of each column:

> records <- read.fwf("filename", widths=c(w1, w2, ..., wn))

Data:

Fisher    R.A.      1890 1962
Pearson   Karl      1857 1936
Cox       Gertrude  1900 1978
Yates     Frank     1902 1994
Smith     Kirstine  1878 1939

> records <- read.fwf("fixed-width.txt", widths=c(10,10,4,-1,4))
> records
V1 V2 V3 V4
1 Fisher R.A. 1890 1962
2 Pearson Karl 1857 1936
3 Cox Gertrude 1900 1978
4 Yates Frank 1902 1994
5 Smith Kirstine 1878 1939

Here, -1 means that a column of width 1 is skipped (the space between the two years).

> records <- read.fwf("fixed-width.txt", widths=c(10,10,4,-1,4),
+ col.names=c("Last","First","Born","Died"))
> records
Last First Born Died
1 Fisher R.A. 1890 1962
2 Pearson Karl 1857 1936
3 Cox Gertrude 1900 1978
4 Yates Frank 1902 1994
5 Smith Kirstine 1878 1939

As shown above, the col.names argument can be set to give the columns names.
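The following self-contained sketch writes a small fixed-width file and reads it back. The temporary file, the strip.white=TRUE and stringsAsFactors=FALSE arguments (passed through to read.table()), and the Age column are assumptions added for illustration.

```r
# each record: 10 chars last name, 10 chars first name, 4-digit year,
# one separating space, 4-digit year
lines <- c(
  "Fisher    R.A.      1890 1962",
  "Pearson   Karl      1857 1936",
  "Cox       Gertrude  1900 1978"
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

records <- read.fwf(path,
                    widths = c(10, 10, 4, -1, 4),   # -1 skips the space
                    col.names = c("Last", "First", "Born", "Died"),
                    strip.white = TRUE,             # drop the padding spaces
                    stringsAsFactors = FALSE)

# the year columns are read as numbers, so arithmetic works directly
records$Age <- records$Died - records$Born
print(records)
```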

4 Read from Table

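> dfrm <- read.table("filename")

The file being read must satisfy the following:

Each line is one record.

Within a line, fields are separated by a delimiter such as a space, tab, colon, or comma. The default is white space; for other delimiters, set the sep argument.

Every line has the same number of fields.

For example:

Data: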

Fisher R.A. 1890 1962
Pearson Karl 1857 1936
Cox Gertrude 1900 1978
Yates Frank 1902 1994
Smith Kirstine 1878 1939

> dfrm <- read.table("statisticians.txt")
> print(dfrm)
V1 V2 V3 V4
1 Fisher R.A. 1890 1962
2 Pearson Karl 1857 1936
3 Cox Gertrude 1900 1978
4 Yates Frank 1902 1994
5 Smith Kirstine 1878 1939

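The resulting data frame is automatically given row numbers and default column names (V1, V2, ...).

Cells can be accessed by row and column, and a whole column by its name, for example:

> class(dfrm$V1)
[1] "factor"

The "factor" result applies to R versions before 4.0.0, where stringsAsFactors defaulted to TRUE. From R 4.0.0 on the default is FALSE and the same call returns "character".

By default, read.table() interprets the string "NA" as a missing value. If the file marks missing values differently, set the na.strings argument:

> dfrm <- read.table("filename.txt", na.strings=".")

If the file has a header line, set header=TRUE so that the data frame takes its column names from it: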

Data:

lastname firstname born died
Fisher R.A. 1890 1962
Pearson Karl 1857 1936
Cox Gertrude 1900 1978
Yates Frank 1902 1994
Smith Kirstine 1878 1939

> dfrm <- read.table("statisticians.txt", header=TRUE, stringsAsFactors=FALSE)
> print(dfrm)
lastname firstname born died
1 Fisher R.A. 1890 1962
2 Pearson Karl 1857 1936
3 Cox Gertrude 1900 1978
4 Yates Frank 1902 1994
5 Smith Kirstine 1878 1939

read.table() also ignores comments: everything from a '#' to the end of the line is skipped.
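The header, na.strings, and comment handling can be tried together in one sketch. The file contents, including the "." used as a missing-value marker and the comment line, are made up for illustration.

```r
lines <- c(
  "lastname firstname born died",
  "# this comment line is skipped by read.table()",
  "Fisher R.A. 1890 1962",
  "Pearson Karl 1857 1936",
  "Cox Gertrude 1900 ."
)
path <- tempfile(fileext = ".txt")
writeLines(lines, path)

dfrm <- read.table(path,
                   header = TRUE,            # first line holds column names
                   na.strings = ".",         # "." marks a missing value
                   stringsAsFactors = FALSE) # keep names as character strings
print(dfrm)
print(class(dfrm$lastname))
```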

5 Read from CSV

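If the CSV file has a header line:

> tbl <- read.csv("filename")

If it has no header line:

> tbl <- read.csv("filename", header=FALSE)

read.csv() works like read.table("filename"), with one difference regarding comments: its default is comment.char="", so lines starting with '#' are read as data. To have comment lines skipped, set comment.char="#".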
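The effect of comment.char can be seen by reading the same file twice. This is a sketch on a made-up three-line file; the names raw and clean are illustrative.

```r
lines <- c(
  "name,year",
  "# a comment line",
  "Fisher,1890"
)
path <- tempfile(fileext = ".csv")
writeLines(lines, path)

# default comment.char = "": the comment line is read as a data row
raw <- read.csv(path)

# with comment.char = "#" the comment line is skipped
clean <- read.csv(path, comment.char = "#")

print(raw)
print(clean)
```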

6 Write to CSV

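> write.csv(x, file="filename", row.names=FALSE)

Setting row.names=FALSE leaves out the row names. Otherwise they are written as the first column, using the names that were set or, by default, the row numbers.
The header line cannot be turned off in write.csv(): an attempt to set col.names is ignored with a warning. To write without a header, use write.table() with sep="," and col.names=FALSE.

Note: write.csv() always overwrites the file and cannot append to it.
If you need to append, use write.table().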
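One possible way to append rows to an existing CSV file with write.table() is sketched below. The data frames x and y and the temporary file are assumptions for illustration.

```r
x <- data.frame(name = c("Fisher", "Pearson"), born = c(1890, 1857))
y <- data.frame(name = "Cox", born = 1900)
path <- tempfile(fileext = ".csv")

# first write: creates (or overwrites) the file, including the header
write.csv(x, file = path, row.names = FALSE)

# later write: append the new rows without repeating the header
write.table(y, file = path, sep = ",",
            row.names = FALSE, col.names = FALSE, append = TRUE)

combined <- read.csv(path)
print(combined)
```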

7 Read from Web

R can read txt, csv, and similar files directly from the web:

> tbl <- read.csv("http://www.example.com/download/data.csv")
> tbl <- read.table("ftp://ftp.example.com/download/data.txt")

8 Read from HTML

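R can read tables from an HTML page on the web using the XML package:

> library(XML)
> url <- 'http://www.example.com/data/table.html'
> tbls <- readHTMLTable(url)

This returns a list of all tables found on the page. Use the which argument to select a particular table, for example the third one:

> tbl <- readHTMLTable(url, which=3)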

9 Complex Reading

9.1 `readLines()`

readLines() reads a file line by line and returns the lines as a character vector:

> lines <- readLines("file.txt")

The maximum number of lines to read can also be set:

> lines <- readLines("file.txt", n=3)
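A self-contained sketch of both calls follows. Reading from an open connection, so that successive calls continue where the previous one stopped, is an addition not covered in the text above; the file contents are made up.

```r
path <- tempfile(fileext = ".txt")
writeLines(c("alpha", "beta", "gamma", "delta"), path)

all_lines <- readLines(path)          # every line in the file
first_two <- readLines(path, n = 2)   # at most two lines

# with an open connection, each call picks up after the previous one
con <- file(path, "r")
line1 <- readLines(con, n = 1)
line2 <- readLines(con, n = 1)
close(con)

print(all_lines)
print(first_two)
```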

9.2 `scan()`

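scan() reads a file token by token and returns the tokens as a vector of the requested type:

Data:

2355.09 2246.73 1738.74 1841.01 2027.85

> singles <- scan("singles.txt", what=numeric(0))
> singles
[1] 2355.09 2246.73 1738.74 1841.01 2027.85

The second argument, what, is an example value that determines the type of token to read: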

what=numeric(0): number.
what=integer(0): integer.
what=complex(0): complex number.
what=character(0): character string.
what=logical(0): logical value.

scan() accepts further arguments, including:

n=number:
Stop after reading this many tokens. (Default: stop at end of file.)

nlines=number:
Stop after reading this many input lines. (Default: stop at end of file.)

skip=number:
Number of input lines to skip before reading data.

na.strings=list:
A list of strings to be interpreted as NA.
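A few of these arguments can be combined in a short sketch. The file contents are made up, quiet=TRUE is added only to suppress the "Read N items" message, and passing a list as what (to read records with mixed types) is an extension beyond the options listed above.

```r
path <- tempfile(fileext = ".txt")
writeLines(c("prices in USD",
             "2355.09 2246.73 1738.74",
             "1841.01 2027.85"), path)

# skip the title line, then read every numeric token
all_vals <- scan(path, what = numeric(0), skip = 1, quiet = TRUE)

# same, but stop after three tokens
first3 <- scan(path, what = numeric(0), skip = 1, n = 3, quiet = TRUE)

# a list for `what` reads repeating records of mixed types
path2 <- tempfile(fileext = ".txt")
writeLines(c("Fisher 1890", "Pearson 1857"), path2)
recs <- scan(path2,
             what = list(name = character(0), born = numeric(0)),
             quiet = TRUE)

print(all_vals)
print(first3)
print(recs)
```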

References

P. Teetor, R Cookbook. Sebastopol: O’Reilly Media, Incorporated, 2011.

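Closing Remarks

I will keep learning R and keep updating these notes.
Finally, I hope we can all discuss, share, and point out problems together. Thank you!

Original article; please credit the source when reposting.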

R Learning 2: I/O
https://ultrafish.io/post/R-learning-2/
Author: Mike_Zhang
Posted on December 18, 2021