Shell and Python 3: Word Frequency (leetcode192-t11.sh)
Source: Internet  Published: sqlserver certification  Editor: 程序博客网  Time: 2024/06/06 14:08
Word Frequency
Total Accepted: 5884 Total Submissions: 22927 Difficulty: Medium Contributors: Admin
Write a bash script to calculate the frequency of each word in a text file words.txt.
For simplicity's sake, you may assume:
- words.txt contains only lowercase characters and space ' ' characters.
- Each word must consist of lowercase characters only.
- Words are separated by one or more whitespace characters.
For example, assume that words.txt has the following content:
the day is sunny the the
the sunny is is
Your script should output the following, sorted by descending frequency:
the 4
is 3
sunny 2
day 1
Note:
Don't worry about handling ties; it is guaranteed that each word's frequency count is unique.
#!/bin/bash
declare -A HashWord
File="words.txt"

function ReadTxtFile
{
    while read -r Line
    do
        Word=(${Line})    # split the line into words on whitespace
        for Var in "${Word[@]}"
        do
            # Append one '1' per occurrence; the length of the string is the count.
            # Equivalent to HashWord+=( [${Var}]='1' )
            HashWord[${Var}]=${HashWord[${Var}]}'1'
            echo "Hashword datagroup $Var : ${HashWord[${Var}]}"
        done
    done < "${File}"

    # ${!HashWord[*]} or ${!HashWord[@]} returns all subscripts (keys)
    for Key in "${!HashWord[@]}"
    do
        echo "${Key} ${#HashWord[${Key}]}"
    done
}

### Main Logic
ReadTxtFile
Output:
root@ubuntu:~/test# ./t11.sh
Hashword datagroup the : 1
Hashword datagroup day : 1
Hashword datagroup is : 1
Hashword datagroup sunny : 1
Hashword datagroup the : 11
Hashword datagroup the : 111
Hashword datagroup the : 1111
Hashword datagroup sunny : 11
Hashword datagroup is : 11
Hashword datagroup is : 111
day 1
is 3
sunny 2
the 4
Or:
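#!/bin/bash
declare -A HW
File=$1    # input file is passed as the first argument

while read -r line
do
    word=${line[*]}
    for var in ${word[*]}    # unquoted on purpose: split the line on whitespace
    do
        HW[$var]=${HW[$var]}'1'    # one '1' per occurrence
    done
done < "$File"

for key in "${!HW[@]}"
do
    echo "${key} ${#HW[$key]}"    # string length = word count
done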
Output:
root@ubuntu:~/test# ./t11-1.sh words.txt
day 1
is 3
sunny 2
the 4
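Both scripts above print the keys in whatever order bash iterates the associative array, while the problem statement asks for descending frequency. The following is a minimal sketch of one way to close that gap, assuming bash 4 or later. The function name count_words is hypothetical, and it keeps integer counters instead of strings of '1' characters; the final sort on the second column is an addition that is not in the original scripts.

```shell
#!/bin/bash
# count_words is a hypothetical helper name, not part of the original scripts.
# It reads text on stdin and prints "word count" lines, most frequent first.
count_words() {
    local -A count    # associative array: word -> integer count (bash 4+)
    local -a words
    local w key

    # read -r -a splits each input line on whitespace into the array "words"
    while read -r -a words; do
        for w in "${words[@]}"; do
            count[$w]=$(( ${count[$w]:-0} + 1 ))
        done
    done

    # "${!count[@]}" expands to the keys; sort numerically on column 2, descending
    for key in "${!count[@]}"; do
        echo "$key ${count[$key]}"
    done | sort -k2,2nr
}

# Usage: only runs if words.txt exists in the current directory.
if [ -f words.txt ]; then
    count_words < words.txt
fi
```

Like the original loops, this relies on the input ending with a newline; a final line without one would be skipped by read.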
Reference:
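The expansion ${!arry[@]} returns all subscripts (keys) of the array: http://blog.csdn.net/baiwz/article/details/25078551
while read line reads one line per iteration and stores it in line; you can add echo "Word : ${Word[*]}" inside the loop to verify this.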
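The expansions used in the scripts are easy to mix up, so here is a small sketch that puts them side by side, assuming bash 4 or later. The function name show_expansions and the sample data in tally are made up for illustration.

```shell
#!/bin/bash
# show_expansions is a hypothetical demo function; tally holds made-up data
# in the same "string of 1s" form that the scripts above build up.
show_expansions() {
    local -A tally=( [the]=1111 [is]=111 )

    # ${!tally[@]} expands to the keys; key order is unspecified, so sort it
    printf '%s\n' "${!tally[@]}" | sort

    # ${#tally[@]} is the number of keys in the array
    echo "${#tally[@]}"

    # ${#tally[the]} is the length of one value, i.e. the word's count
    echo "${#tally[the]}"
}

show_expansions
```

The last expansion is what makes the '1'-appending trick work: the count is never stored as a number, only recovered as a string length.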
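Python 3:
import pprint

message = 'the day is sunny the the \n the sunny is is'
print(message)
a = []
count = {}
# Alternative: lines = message.replace('\n', '').split(' ')
# (not identical: it leaves an empty string '' in the list instead of '\n')
# strip('\n') only trims newlines at the ends of the string; split(' ') then
# cuts the text into list items at every single space.
lines = message.strip('\n').split(' ')
a.extend(lines)
print(a)
for word in a:
    count.setdefault(word, 0)    # start each new word at 0
    count[word] = count[word] + 1
pprint.pprint(count)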
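Output:
================== RESTART: /Users/valen/Documents/test.py ==================
the day is sunny the the 
 the sunny is is
['the', 'day', 'is', 'sunny', 'the', 'the', '\n', 'the', 'sunny', 'is', 'is']
{'\n': 1, 'day': 1, 'is': 3, 'sunny': 2, 'the': 4}
>>>
Note that '\n' is counted as a word here, because strip('\n') only removes newlines at the two ends of the string and split(' ') splits on single spaces. Calling message.split() with no arguments splits on any run of whitespace and avoids the stray entry.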
https://zhidao.baidu.com/question/1690382694635348108.html
http://blog.csdn.net/huguangshanse00/article/details/14639871