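Preface
A while ago, in an interview at Tencent, I was asked this question: using common Linux commands, how do you find every line containing the letter a in a table-like text file, sort those lines by the third column, and write the result to a file?
I didn't answer it very well at the time. Solving it actually involves quite a few everyday commands, so I have collected the Linux commands related to this problem and how to use them.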

Test file (test.txt)

NAME            JOB             AGE     BIRTH_DATE
Sheldon         physicist       27      1.1
Penny           actress         26      2.2
Leonard         physicist       28      3.3
Howard          engineer        24      1.31
Rajesh          physicist       26      4.4
Bernadette      scientist       26      2.28
Amy             biologist       28      12.31

The first step is, of course, to read the contents of the text file.

Commands for getting a file's contents

  1. cat
$ cat test.txt 
NAME		JOB		AGE	BIRTH_DATE
Sheldon		physicist	27	1.1	
Penny		actress		26	2.2
Leonard		physicist	28	3.3
Howard		engineer	24	1.31
Rajesh		physicist	26	4.4
Bernadette	scientist	26	2.28
Amy 		biologist	28	12.31
$
  1. nano (a text editor)
$ nano test.txt

NAME            JOB             AGE     BIRTH_DATE
Sheldon         physicist       27      1.1
Penny           actress         26      2.2
Leonard         physicist       28      3.3
Howard          engineer        24      1.31
Rajesh          physicist       26      4.4
Bernadette      scientist       26      2.28
Amy             biologist       28      12.31

                             [ Unbound key: M-^A ]
^G Get Help  ^O Write Out ^W Where Is  ^K Cut Text  ^J Justify   ^C Cur Pos
^X Exit      ^R Read File ^\ Replace   ^U Uncut Text^T To Spell  ^_ Go To Line

  1. vim (an extremely powerful editor; I won't go into how to use it here)
$ vim test.txt

NAME            JOB             AGE     BIRTH_DATE
Sheldon         physicist       27      1.1
Penny           actress         26      2.2
Leonard         physicist       28      3.3
Howard          engineer        24      1.31
Rajesh          physicist       26      4.4
Bernadette      scientist       26      2.28
Amy             biologist       28      12.31

:wq
  1. head (reads only the first few lines; handy for large files)
$ # Read only the first 5 lines
$ head -n 5 test.txt 
NAME		JOB		AGE	BIRTH_DATE
Sheldon		physicist	27	1.1	
Penny		actress		26	2.2
Leonard		physicist	28	3.3
Howard		engineer	24	1.31
$
  1. tail (reads only the last few lines; handy for large files)
$ # Read only the last 2 lines
$ tail -n 2 test.txt 
Bernadette	scientist	26	2.28
Amy 		biologist	28	12.31
$ 
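head and tail can also be combined, which is handy for a table like this one: tail -n +2 starts at the second line and so drops the header, and piping head into tail picks out a range of lines. The sketch below is only an illustration; it recreates the sample table in a scratch directory (an assumption made here so that an existing test.txt is not overwritten), and the variable names are hypothetical.

```shell
# Recreate the sample table in a scratch directory.
cd "$(mktemp -d)"
cat > test.txt <<'EOF'
NAME       JOB        AGE  BIRTH_DATE
Sheldon    physicist  27   1.1
Penny      actress    26   2.2
Leonard    physicist  28   3.3
Howard     engineer   24   1.31
Rajesh     physicist  26   4.4
Bernadette scientist  26   2.28
Amy        biologist  28   12.31
EOF

# tail -n +2 starts output at line 2, i.e. everything except the header.
body=$(tail -n +2 test.txt)

# head then tail: take the first 4 lines, keep the last 2 of those (lines 3-4).
middle=$(head -n 4 test.txt | tail -n 2)

printf '%s\n' "$body"
printf '%s\n' "$middle"
```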
  1. awk (selects only specific rows and columns)
$  awk '{print $3}' test.txt
AGE
27
26
28
24
26
26
28

$  awk '{print $3 $2}' test.txt
AGEJOB
27physicist
26actress
28physicist
24engineer
26physicist
26scientist
28biologist

$ awk 'NR==3{print}' test.txt
Penny		actress		26	2.2
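Two details of the awk examples above are easy to miss. Writing $3 $2 concatenates the two fields with nothing in between, whereas a comma inserts the output field separator (a space by default). A row condition and a column selection can also be combined in one program. The following is a minimal sketch; it rebuilds the sample table in a scratch directory (an assumption, so as not to touch an existing test.txt), and the variable names are hypothetical.

```shell
# Recreate the sample table in a scratch directory.
cd "$(mktemp -d)"
cat > test.txt <<'EOF'
NAME       JOB        AGE  BIRTH_DATE
Sheldon    physicist  27   1.1
Penny      actress    26   2.2
Leonard    physicist  28   3.3
Howard     engineer   24   1.31
Rajesh     physicist  26   4.4
Bernadette scientist  26   2.28
Amy        biologist  28   12.31
EOF

# A comma between fields prints them separated by a single space.
with_sep=$(awk '{print $3, $2}' test.txt)

# Row condition plus column selection: skip the header (NR > 1)
# and print the name of everyone whose AGE is 26.
age26=$(awk 'NR > 1 && $3 == 26 {print $1}' test.txt)

printf '%s\n' "$with_sep"
printf '%s\n' "$age26"
```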

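It is worth mentioning that awk is quite powerful. For example, here is a quick way to compute the average of the third column of the file. NR > 1 skips the header row, which would otherwise be counted as an extra row with the value 0 and drag the average down to 23.125:

$ awk 'NR > 1 {sum += $3} END {print "Avg =", sum/(NR-1)}' test.txt
Avg = 26.4286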

awk command tutorial

Finding all lines that contain 'a'

  1. grep does the job
$ cat test.txt | grep a
Penny		actress		26	2.2
Leonard		physicist	28	3.3
Howard		engineer	24	1.31
Rajesh		physicist	26	4.4
Bernadette	scientist	26	2.28
  1. awk can do this as well, and can even return only the rows where a particular column contains an a
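As a rough sketch of the awk variant: a bare /a/ pattern matches against the whole line, much like grep a, while $2 ~ /a/ tests only the second column. The scratch directory and variable names below are assumptions made for the illustration.

```shell
# Recreate the sample table in a scratch directory.
cd "$(mktemp -d)"
cat > test.txt <<'EOF'
NAME       JOB        AGE  BIRTH_DATE
Sheldon    physicist  27   1.1
Penny      actress    26   2.2
Leonard    physicist  28   3.3
Howard     engineer   24   1.31
Rajesh     physicist  26   4.4
Bernadette scientist  26   2.28
Amy        biologist  28   12.31
EOF

# Whole-line match: prints every line that contains a lowercase "a".
all=$(awk '/a/' test.txt)

# Column match: prints only the rows whose JOB column contains an "a".
job_a=$(awk '$2 ~ /a/' test.txt)

printf '%s\n' "$all"
printf '%s\n' "$job_a"
```

With the sample data, only the actress row passes the column test, because actress is the only job title that contains an a.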

Sorting by the third column

  1. The sort command
$ cat test.txt | grep a | sort -n -k 3
Howard		engineer	24	1.31
Bernadette	scientist	26	2.28
Penny		actress		26	2.2
Rajesh		physicist	26	4.4
Leonard		physicist	28	3.3
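One subtlety: -k 3 starts the sort key at the third field and extends it to the end of the line. Writing -k3,3n limits the key to the third column alone and compares it numerically. The sketch below also adds an explicit tie-break on the name column and redirects the output to a file, as the original question asked. The file name result.txt and the scratch directory are assumptions chosen for this example.

```shell
# Recreate the sample table in a scratch directory.
cd "$(mktemp -d)"
cat > test.txt <<'EOF'
NAME       JOB        AGE  BIRTH_DATE
Sheldon    physicist  27   1.1
Penny      actress    26   2.2
Leonard    physicist  28   3.3
Howard     engineer   24   1.31
Rajesh     physicist  26   4.4
Bernadette scientist  26   2.28
Amy        biologist  28   12.31
EOF

# Keep lines containing "a", sort numerically on column 3 only,
# break ties by column 1, and write the result to result.txt.
grep a test.txt | sort -k3,3n -k1,1 > result.txt

cat result.txt
```

grep reads the file directly here, so the leading cat is not needed.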

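Task complete. To write the result to a file rather than the terminal, as the question asks, append a redirect such as > result.txt to the pipeline.
sort and awk are two very handy commands for text processing; they are worth learning and using often.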



Though the night is cold, hold on to the starlight.