Linux下如何使用awk解析json数据
来源:互联网 发布:华硕mg248q设置软件 编辑:程序博客网 时间:2024/05/16 08:07
近期在做一个项目,调用api后返回了一个嵌套较多层的json格式的数据,常规解析json的方式也已经很难或做不到提取出需要的信息点,恰好对awk比较熟悉,考虑到awk本身也是一种用于数据处理的工具,于是就有了以下用awk解析json的尝试。下面以face++ 的detect api返回值,json格式:
{ "faces": [ { "face_token": "826536dba3973d71d219a15e18c9597b", "face_rectangle": { "height": 113, "left": 198, "top": 125, "width": 113 }, "attributes": { "age": { "value": 26 }, "gender": { "value": "Female" } }, "landmark": { "mouth_upper_lip_bottom": { "x": 257, "y": 203 }, "right_eye_upper_right_quarter": { "x": 284, "y": 142 }, "left_eyebrow_upper_right_quarter": { "x": 235, "y": 127 }, "mouth_upper_lip_right_contour3": { "x": 267, "y": 202 }, "left_eye_top": { "x": 228, "y": 143 }, "nose_contour_right3": { "x": 265, "y": 185 }, "contour_right6": { "x": 298, "y": 207 }, "mouth_lower_lip_bottom": { "x": 257, "y": 211 }, "nose_contour_right2": { "x": 268, "y": 172 }, "nose_contour_right1": { "x": 264, "y": 149 }, "nose_contour_left3": { "x": 250, "y": 185 }, "nose_contour_left2": { "x": 246, "y": 172 }, "nose_contour_left1": { "x": 249, "y": 150 }, "right_eyebrow_upper_right_quarter": { "x": 286, "y": 128 }, "left_eyebrow_upper_middle": { "x": 226, "y": 126 }, "right_eye_upper_left_quarter": { "x": 273, "y": 143 }, "right_eyebrow_right_corner": { "x": 292, "y": 133 }, "contour_right8": { "x": 283, "y": 227 }, "right_eyebrow_upper_left_quarter": { "x": 271, "y": 127 }, "contour_right2": { "x": 309, "y": 159 }, "contour_right3": { "x": 309, "y": 172 }, "contour_right1": { "x": 309, "y": 147 }, "left_eye_lower_left_quarter": { "x": 222, "y": 149 }, "contour_left1": { "x": 201, "y": 150 }, "contour_left2": { "x": 201, "y": 162 }, "contour_left3": { "x": 203, "y": 175 }, "contour_left4": { "x": 205, "y": 187 }, "contour_left5": { "x": 209, "y": 199 }, "contour_left6": { "x": 215, "y": 210 }, "contour_left7": { "x": 224, "y": 220 }, "right_eye_bottom": { "x": 280, "y": 149 }, "right_eye_lower_right_quarter": { "x": 284, "y": 148 }, "right_eyebrow_left_corner": { "x": 266, "y": 132 }, "right_eyebrow_upper_middle": { "x": 279, "y": 126 }, "mouth_lower_lip_top": { "x": 257, "y": 205 }, "right_eyebrow_lower_left_quarter": { "x": 272, "y": 132 }, "left_eyebrow_lower_left_quarter": { "x": 219, "y": 133 }, "mouth_upper_lip_left_contour3": { "x": 247, "y": 203 }, "left_eyebrow_lower_middle": { "x": 226, "y": 132 }, "left_eye_upper_left_quarter": { "x": 222, "y": 144 }, "mouth_upper_lip_left_contour1": { "x": 252, "y": 198 }, "mouth_upper_lip_top": { "x": 257, "y": 199 }, "mouth_upper_lip_left_contour2": { "x": 244, "y": 200 }, "right_eye_pupil": { "x": 279, "y": 145 }, "mouth_lower_lip_right_contour1": { "x": 267, "y": 204 }, "mouth_lower_lip_left_contour2": { "x": 242, "y": 207 }, "mouth_lower_lip_right_contour3": { "x": 265, "y": 210 }, "mouth_lower_lip_right_contour2": { "x": 272, "y": 206 }, "contour_chin": { "x": 259, "y": 237 }, "contour_left9": { "x": 245, "y": 235 }, "left_eye_lower_right_quarter": { "x": 233, "y": 150 }, "mouth_left_corner": { "x": 236, "y": 202 }, "contour_right4": { "x": 307, "y": 184 }, "contour_right7": { "x": 291, "y": 217 }, "left_eyebrow_left_corner": { "x": 211, "y": 135 }, "nose_right": { "x": 272, "y": 181 }, "nose_tip": { "x": 257, "y": 177 }, "contour_right5": { "x": 303, "y": 196 }, "nose_contour_lower_middle": { "x": 258, "y": 187 }, "right_eye_top": { "x": 278, "y": 141 }, "mouth_lower_lip_left_contour3": { "x": 249, "y": 210 }, "right_eye_right_corner": { "x": 288, "y": 146 }, "mouth_upper_lip_right_contour1": { "x": 261, "y": 198 }, "mouth_upper_lip_right_contour2": { "x": 270, "y": 199 }, "right_eyebrow_lower_right_quarter": { "x": 285, "y": 132 }, "left_eye_left_corner": { "x": 218, "y": 148 }, "mouth_right_corner": { "x": 278, "y": 201 }, "right_eye_lower_left_quarter": { "x": 275, "y": 149 }, "left_eyebrow_right_corner": { "x": 242, "y": 132 }, "left_eyebrow_lower_right_quarter": { "x": 234, "y": 132 }, "right_eye_center": { "x": 279, "y": 146 }, "left_eye_pupil": { "x": 228, "y": 146 }, "nose_left": { "x": 243, "y": 182 }, "mouth_lower_lip_left_contour1": { "x": 247, "y": 204 }, "left_eye_upper_right_quarter": { "x": 233, "y": 145 }, "right_eyebrow_lower_middle": { "x": 279, "y": 132 }, "left_eye_center": { "x": 228, "y": 148 }, "contour_left8": { "x": 233, "y": 228 }, "contour_right9": { "x": 272, "y": 234 }, "right_eye_left_corner": { "x": 270, "y": 149 }, "left_eyebrow_upper_left_quarter": { "x": 218, "y": 128 }, "left_eye_bottom": { "x": 227, "y": 150 }, "left_eye_right_corner": { "x": 237, "y": 149 } } } ], "time_used": 251, "request_id": "1492784536,22b21204-0e1e-4071-91ba-b29e5a68fe03", "image_id": "6kox5v0GjhI9f+MWEc3wIA=="}
我们从调用api开始讲起,以下是我对face++的一个调用,直接运用shell命令的curl发送的post请求:
curl -X POST "https://api-cn.faceplusplus.com/facepp/v3/detect" \ -F "api_key=RF_13WYz9Cgz9KXtJxaqAIbIF7_jxuC-" \ -F "api_secret=MeCM2hq-SSiWXgxMt-JcpOo-t5IAIdz4" \ -F "image_file=@/home/gec/timg.jpeg" \ -F "return_landmark=1" -F "return_attributes=gender,age"
为了方便接下来awk对数据的解析,我们可以将调用api的返回值重定向到一个文件里,将以上命令改写成:
curl -X POST "https://api-cn.faceplusplus.com/facepp/v3/detect" \ -F "api_key=RF_13WYz9Cgz9KXtJxaqAIbIF7_jxuC-" \ -F "api_secret=MeCM2hq-SSiWXgxMt-JcpOo-t5IAIdz4" \ -F "image_file=@/home/gec/timg.jpeg" \ -F "return_landmark=1" -F "return_attributes=gender,age" >face
如此,当调用成功时,我们就可以在face中看到face++的detect api的正确返回值,查看文件用cat指令或者直接用文本编辑器打开都可以,如下:
cat face
{"image_id": "6kox5v0GjhI9f+MWEc3wIA==", "request_id": "1492784536,22b21204-0e1e-4071-91ba-b29e5a68fe03", "time_used": 251, "faces": [{"landmark": {"mouth_upper_lip_left_contour2": {"y": 200, "x": 244}, "mouth_upper_lip_top": {"y": 199, "x": 257}, "mouth_upper_lip_left_contour1": {"y": 198, "x": 252}, "left_eye_upper_left_quarter": {"y": 144, "x": 222}, "left_eyebrow_lower_middle": {"y": 132, "x": 226}, "mouth_upper_lip_left_contour3": {"y": 203, "x": 247}, "left_eyebrow_lower_left_quarter": {"y": 133, "x": 219}, "right_eyebrow_lower_left_quarter": {"y": 132, "x": 272}, "right_eye_pupil": {"y": 145, "x": 279}, "mouth_lower_lip_right_contour1": {"y": 204, "x": 267}, "mouth_lower_lip_left_contour2": {"y": 207, "x": 242}, "mouth_lower_lip_right_contour3": {"y": 210, "x": 265}, "mouth_lower_lip_right_contour2": {"y": 206, "x": 272}, "contour_chin": {"y": 237, "x": 259}, "contour_left9": {"y": 235, "x": 245}, "left_eye_lower_right_quarter": {"y": 150, "x": 233}, "mouth_lower_lip_top": {"y": 205, "x": 257}, "right_eyebrow_upper_middle": {"y": 126, "x": 279}, "right_eyebrow_left_corner": {"y": 132, "x": 266}, "right_eye_lower_right_quarter": {"y": 148, "x": 284}, "right_eye_bottom": {"y": 149, "x": 280}, "contour_left7": {"y": 220, "x": 224}, "contour_left6": {"y": 210, "x": 215}, "contour_left5": {"y": 199, "x": 209}, "contour_left4": {"y": 187, "x": 205}, "contour_left3": {"y": 175, "x": 203}, "contour_left2": {"y": 162, "x": 201}, "contour_left1": {"y": 150, "x": 201}, "left_eye_lower_left_quarter": {"y": 149, "x": 222}, "contour_right1": {"y": 147, "x": 309}, "contour_right3": {"y": 172, "x": 309}, "contour_right2": {"y": 159, "x": 309}, "mouth_left_corner": {"y": 202, "x": 236}, "contour_right4": {"y": 184, "x": 307}, "contour_right7": {"y": 217, "x": 291}, "left_eyebrow_left_corner": {"y": 135, "x": 211}, "nose_right": {"y": 181, "x": 272}, "nose_tip": {"y": 177, "x": 257}, "contour_right5": {"y": 196, "x": 303}, "nose_contour_lower_middle": {"y": 187, "x": 258}, "right_eye_top": {"y": 141, "x": 278}, "mouth_lower_lip_left_contour3": {"y": 210, "x": 249}, "right_eye_right_corner": {"y": 146, "x": 288}, "mouth_upper_lip_right_contour1": {"y": 198, "x": 261}, "mouth_upper_lip_right_contour2": {"y": 199, "x": 270}, "right_eyebrow_lower_right_quarter": {"y": 132, "x": 285}, "left_eye_left_corner": {"y": 148, "x": 218}, "mouth_right_corner": {"y": 201, "x": 278}, "right_eye_lower_left_quarter": {"y": 149, "x": 275}, "left_eyebrow_right_corner": {"y": 132, "x": 242}, "left_eyebrow_lower_right_quarter": {"y": 132, "x": 234}, "right_eye_center": {"y": 146, "x": 279}, "left_eye_pupil": {"y": 146, "x": 228}, "nose_left": {"y": 182, "x": 243}, "mouth_lower_lip_left_contour1": {"y": 204, "x": 247}, "left_eye_upper_right_quarter": {"y": 145, "x": 233}, "right_eyebrow_lower_middle": {"y": 132, "x": 279}, "left_eye_center": {"y": 148, "x": 228}, "contour_left8": {"y": 228, "x": 233}, "contour_right9": {"y": 234, "x": 272}, "right_eye_left_corner": {"y": 149, "x": 270}, "left_eyebrow_upper_left_quarter": {"y": 128, "x": 218}, "left_eye_bottom": {"y": 150, "x": 227}, "left_eye_right_corner": {"y": 149, "x": 237}, "right_eyebrow_upper_left_quarter": {"y": 127, "x": 271}, "contour_right8": {"y": 227, "x": 283}, "right_eyebrow_right_corner": {"y": 133, "x": 292}, "right_eye_upper_left_quarter": {"y": 143, "x": 273}, "left_eyebrow_upper_middle": {"y": 126, "x": 226}, "right_eyebrow_upper_right_quarter": {"y": 128, "x": 286}, "nose_contour_left1": {"y": 150, "x": 249}, "nose_contour_left2": {"y": 172, "x": 246}, "nose_contour_left3": {"y": 185, "x": 250}, "nose_contour_right1": {"y": 149, "x": 264}, "nose_contour_right2": {"y": 172, "x": 268}, "mouth_lower_lip_bottom": {"y": 211, "x": 257}, "contour_right6": {"y": 207, "x": 298}, "nose_contour_right3": {"y": 185, "x": 265}, "left_eye_top": {"y": 143, "x": 228}, "mouth_upper_lip_right_contour3": {"y": 202, "x": 267}, "left_eyebrow_upper_right_quarter": {"y": 127, "x": 235}, "right_eye_upper_right_quarter": {"y": 142, "x": 284}, "mouth_upper_lip_bottom": {"y": 203, "x": 257}}, "attributes": {"gender": {"value": "Female"}, "age": {"value": 26}}, "face_rectangle": {"width": 113, "top": 125, "left": 198, "height": 113}, "face_token": "826536dba3973d71d219a15e18c9597b"}]}原本json结构是相当清晰的,每条数据都会按行分开,但是经实验Linux下(也可能不是环境原因,这里不是很清楚,还请朋友们不吝指正)返回的json是不分行的,所有数据都挤在一行上,这对解析来说无疑又加大了难度。
接下来尝试从json当中提取需要的信息,以年龄age为例。如果信息太多觉得不好找,可以用grep指令标记一下,这样就可以很方便的找到关键词的所在位置:
grep age face
博主最初的思路是:awk能不能通过匹配到age这个词,然后输出当前位置若干字符后的字符串26?博主按这个思路尝试了很久,同时也发现网上有很多朋友问过类似的问题,但是都得不到答案,大概awk本身并没有能直接实现这种需求的语法吧。
我最终想出来的方法是awk指定分隔符结合管道运用,指定 所需数据附近的符号 作为分隔符将原json数据切块,越切越小,切到最后保留的那块数据就是我们想要的数据。
这个方法的优点在于无视了json格式的多层嵌套进行了直接切块,完整指令如下:
awk -F '"' '{print $(NF-14)}' face|awk -F} '{print $1}'|awk -F:' ' '{print $2}'
将上面的指令剖析开来
第一步:
awk -F '"' '{print $(NF-14)}' face #指定引号为分隔符,打印倒数第14个字段
得到:
: 26}},
第二步,基于上面的结果再进行:
awk -F} '{print $1}' #指定}为分隔符,打印第1个字段得到:
: 26
第三步,基于上面结果再进行:
awk -F:' ' '{print $2}' #指定:和空格 为分隔符,打印第2个字段得到结果:
26
至此,awk顺利提取得到了json格式中的数据。由于json格式都是差不多的形式,指定分隔符加管道的方法同样适用于其他api返回数据的解析,仅需修改字段数即可。
另外,linux下使用jq也是一个不错的选择:linux下使用jq解析json
2 0
- Linux下如何使用awk解析json数据
- Linux下如何使用jq解析json数据
- Linux下如何使用jq解析json数据
- Linux下使用jq解析JSON格式的数据
- Linux下使用jq解析JSON格式的数据
- Android下使用Gson解析JSON数据
- Json数据如何解析?
- Json数据如何解析?
- 如何使用JSON Framework库解析与生成json数据
- 如何使用JSON Framework库解析与生成json数据
- Linux 下使用awk处理数据并写入数据库
- 使用Json解析Json数据
- linux 下awk 的使用
- Jquery如何解析JSON数据
- Jquery如何解析JSON数据
- Android下Json数据解析
- 如何使用fastJson来解析JSON格式数据和生成JSON格式数据
- Linux下解析json字符串
- 求1+2+3+...+n
- 充分使用树莓派SD卡容量
- Codeforces gym 101350G 数学
- python学习笔记先导
- 信号量(生产者和消费者模型)
- Linux下如何使用awk解析json数据
- 安卓自定义弧形刻度选择器
- c语言基础(三)
- 关于Java序列化与反序列化
- nginx的反向代理与负载均衡
- Hibernate 出现Unsupported major.minor version 52.0 [duplicate]
- c/c++ static extern const
- BZOJ 1434: [ZJOI2009]染色游戏 博弈
- Memory Network简单理解