Shell and Python 3: Word Frequency (LeetCode 192, t11.sh)

Source: Internet | Published by: sqlserver 认证 | Editor: 程序博客网 | Date: 2024/06/06 14:08

Word Frequency

 Total Accepted: 5884 Total Submissions: 22927 Difficulty: Medium Contributors: Admin

Write a bash script to calculate the frequency of each word in a text file words.txt.

For simplicity's sake, you may assume:

words.txt contains only lowercase characters and space ' ' characters.
Each word must consist of lowercase characters only.
Words are separated by one or more whitespace characters.

For example, assume that words.txt has the following content:

the day is sunny the the
the sunny is is

Your script should output the following, sorted by descending frequency:

the 4
is 3
sunny 2
day 1

Note:
Don't worry about handling ties; it is guaranteed that each word's frequency count is unique.

#!/bin/bash
declare -A HashWord        # associative array: word -> string of '1's
File="words.txt"

function ReadTxtFile
{
    while read Line
    do
        Word=(${Line})     # split the line into an array of words
        for Var in ${Word[@]}
        do
            # Append one '1' per occurrence; equivalent to HashWord[${Var}]+='1'
            HashWord[${Var}]=${HashWord[${Var}]}'1'
            echo "Hashword datagroup $Var : ${HashWord[${Var}]}"
        done
    done < ${File}
    # ${!HashWord[*]} or ${!HashWord[@]} expands to all keys (subscripts)
    for Key in ${!HashWord[*]}
    do
        # The length of the '1' string is the word's count
        echo "${Key} ${#HashWord[${Key}]}"
    done
}

### Main Logic
ReadTxtFile

Output:

root@ubuntu:~/test# ./t11.sh
Hashword datagroup the : 1
Hashword datagroup day : 1
Hashword datagroup is : 1
Hashword datagroup sunny : 1
Hashword datagroup the : 11
Hashword datagroup the : 111
Hashword datagroup the : 1111
Hashword datagroup sunny : 11
Hashword datagroup is : 11
Hashword datagroup is : 111
day 1
is 3
sunny 2
the 4

Alternatively:

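#!/bin/bash
declare -A HW              # associative array: word -> string of '1's
File=$1                    # input file is passed as the first argument

while read line
do
    # Unquoted ${line} is split on whitespace into individual words
    for var in ${line}
    do
        HW[$var]=${HW[$var]}'1'
    done
done < "$File"

for key in ${!HW[*]}
do
    # The length of the '1' string is the word's count
    echo "${key} ${#HW[$key]}"
done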

Output:

root@ubuntu:~/test# ./t11-1.sh words.txt
day 1
is 3
sunny 2
the 4
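
Both scripts print the counts in whatever order Bash enumerates the keys of the associative array, whereas the problem asks for descending frequency. In Bash this is usually handled by piping the final loop through something like `sort -k2,2nr`. Since the post also covers Python, here is a minimal sketch of the same ordering step in Python; the function name `format_by_frequency` is made up for illustration and does not appear in the original scripts.

```python
def format_by_frequency(counts):
    """Return 'word count' lines ordered by descending count."""
    # Sorting on the negated count puts the most frequent word first.
    ordered = sorted(counts.items(), key=lambda item: -item[1])
    return [f"{word} {count}" for word, count in ordered]


# The counts printed by the scripts above.
counts = {"day": 1, "is": 3, "sunny": 2, "the": 4}
for line in format_by_frequency(counts):
    print(line)
```

With the counts shown, this prints the 4, is 3, sunny 2, day 1, one pair per line, matching the expected output of the problem.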

Reference:
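The expansion ${!array[@]} returns all subscripts (keys) of an array: http://blog.csdn.net/baiwz/article/details/25078551
while read line reads one line at a time and stores it in the variable line; you can add echo "Word : ${Word[*]}" inside the loop to verify this.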
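
The counting trick shared by both scripts is unary: each occurrence of a word appends a '1' to the string stored under that word, and the string length (`${#HW[$key]}`) recovers the count. A rough Python analogue may make this easier to follow. It assumes the same two-line input as the example, and the names `tally_words` and `text` are illustrative only.

```python
def tally_words(lines):
    """Mimic the Bash scripts: append '1' per occurrence, count by length."""
    tally = {}  # plays the role of the associative array (declare -A)
    for line in lines:                 # like `while read line`
        for word in line.split():      # like word splitting of unquoted ${line}
            tally[word] = tally.get(word, "") + "1"
    # len() of the tally string corresponds to ${#HW[$key]}
    return {word: len(marks) for word, marks in tally.items()}


text = ["the day is sunny the the", "the sunny is is"]
print(tally_words(text))
```

Storing an integer and incrementing it would be the more conventional choice; the string version is shown only because it mirrors what the scripts do.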

Python 3:

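import pprint

message = 'the day is sunny the the \n the sunny is is'
print(message)
a = []
count = {}
# Alternative: lines = message.replace('\n', '').split(' ')
# (that variant leaves an empty-string token where the newline was)
# Note: strip('\n') only removes leading/trailing newlines, so the interior
# '\n' survives as its own token after splitting on single spaces.
lines = message.strip('\n').split(' ')
a.extend(lines)
print(a)
for word in a:
    count.setdefault(word, 0)   # start each new word at 0
    count[word] = count[word] + 1
pprint.pprint(count)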

Output:

================== RESTART: /Users/valen/Documents/test.py ==================
the day is sunny the the
 the sunny is is
['the', 'day', 'is', 'sunny', 'the', 'the', '\n', 'the', 'sunny', 'is', 'is']
{'\n': 1, 'day': 1, 'is': 3, 'sunny': 2, 'the': 4}
>>>
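
The stray '\n' key in this output comes from `strip('\n')`, which only removes newlines at the two ends of the string, combined with `split(' ')`, which splits on single spaces only. One way around it is sketched below, using `str.split()` with no argument (it splits on any run of whitespace) together with `collections.Counter`. This is a variation on the code above, not the original author's version.

```python
from collections import Counter

message = 'the day is sunny the the \n the sunny is is'

# split() with no argument treats spaces and newlines alike and
# never yields empty strings.
words = message.split()
count = Counter(words)

# most_common() lists (word, count) pairs from highest to lowest count.
for word, n in count.most_common():
    print(word, n)
```

Here the newline no longer shows up as a word, and the pairs come out already ordered by descending frequency.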

https://zhidao.baidu.com/question/1690382694635348108.html
http://blog.csdn.net/huguangshanse00/article/details/14639871

0 0