LeetCode Algorithm Problem: Top K Frequent Words

Source: Internet | Editor: 程序博客网 | Time: 2024/05/17 21:47

Problem Overview

Given a non-empty list of words, return the k most frequent elements.
Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.

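Analysis

This problem mainly requires two things:
- Count how many times each word occurs
- When sorting, order words that have the same frequency lexicographically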

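The first part is easy: use a map and traverse the word list once to build a mapping from each word to its number of occurrences. For the second part, the solution pushes every (word, count) pair into a priority_queue with a custom comparator and pops the top k elements.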
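The counting step can be sketched on its own as follows. This is a minimal illustration, not part of the original solution; the helper name `countWords` is hypothetical.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper (name chosen for illustration): counts how many
// times each word occurs in the list with a single pass.
std::unordered_map<std::string, int> countWords(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> freq;
    for (const std::string& w : words) {
        ++freq[w];  // operator[] inserts a missing key with value 0 first
    }
    return freq;
}
```

For the example input above, the resulting map would hold four entries, with "i" and "love" each mapped to 2.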

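    #include <queue>
    #include <string>
    #include <unordered_map>
    #include <utility>
    #include <vector>
    using namespace std;

    // Custom comparator for priority_queue: returns true when a should sit below b
    struct cmp {
        bool operator()(const pair<string, int>& a, const pair<string, int>& b) const {
            if (a.second == b.second) return a.first > b.first; // same frequency: lexicographically smaller word on top
            return a.second < b.second; // otherwise: higher frequency on top
        }
    };

    class Solution {
    public:
        vector<string> topKFrequent(vector<string>& words, int k) {
            unordered_map<string, int> freq;
            priority_queue<pair<string, int>, vector<pair<string, int>>, cmp> que;
            vector<string> res;
            for (const string& s : words) freq[s]++; // count the occurrences of each word
            for (const auto& p : freq) que.push(p); // push every (word, count) pair into the priority_queue
            for (int i = 0; i < k; i++) {
                res.emplace_back(que.top().first); // take the top k elements of the queue
                que.pop();
            }
            return res;
        }
    };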

The time complexity is O(N log N), where N is the length of the word list; the space complexity is O(N).
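As a cross-check on that bound, the same ordering can also be produced by sorting all (word, count) pairs instead of using a heap. The sketch below is an alternative technique, not the approach used in this article; the function name `topKFrequentBySort` is hypothetical. Sorting the distinct words also costs O(N log N) in the worst case.

```cpp
#include <algorithm>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical sort-based variant (not the article's heap-based solution).
std::vector<std::string> topKFrequentBySort(const std::vector<std::string>& words, int k) {
    std::unordered_map<std::string, int> freq;
    for (const std::string& w : words) ++freq[w];

    // Copy the (word, count) pairs into a vector so they can be sorted.
    std::vector<std::pair<std::string, int>> entries(freq.begin(), freq.end());
    std::sort(entries.begin(), entries.end(),
              [](const std::pair<std::string, int>& a,
                 const std::pair<std::string, int>& b) {
                  if (a.second != b.second) return a.second > b.second;  // higher count first
                  return a.first < b.first;  // tie: lexicographically smaller word first
              });

    std::vector<std::string> res;
    for (int i = 0; i < k && i < static_cast<int>(entries.size()); ++i) {
        res.push_back(entries[i].first);
    }
    return res;
}
```

Note that the sort comparator states the final order directly, whereas the priority_queue comparator in the solution above is inverted, because it describes which element sits lower in the heap.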

Summary

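This problem combines several data structures. Studying the code carefully yields a good deal of knowledge about them, and once they are mastered the code becomes concise, easy to read, and efficient.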

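The mapping is built with a map structure. Note that since C++11, unordered_map is often the recommended choice: it is a hash table that does not keep its keys sorted, so insertion and lookup take O(1) on average, compared with O(log n) for the ordered map. In exchange, unordered_map typically uses somewhat more memory than map.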
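The difference in ordering can be seen in a small sketch. The helper names `sortedKeys` and `countOf` are hypothetical and exist only for this illustration.

```cpp
#include <map>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical helper: counts words with an ordered std::map and returns
// the keys in iteration order, which is always ascending key order.
std::vector<std::string> sortedKeys(const std::vector<std::string>& words) {
    std::map<std::string, int> ordered;
    for (const std::string& w : words) ++ordered[w];

    std::vector<std::string> keys;
    for (const auto& entry : ordered) keys.push_back(entry.first);
    return keys;
}

// Hypothetical helper: counts words with std::unordered_map. Its iteration
// order is unspecified, so only a lookup of a single key is shown here.
int countOf(const std::vector<std::string>& words, const std::string& target) {
    std::unordered_map<std::string, int> hashed;
    for (const std::string& w : words) ++hashed[w];

    auto it = hashed.find(target);
    return it == hashed.end() ? 0 : it->second;
}
```

Since this problem only needs the counts and the ordering is handled later by the priority_queue, the sorted iteration of map brings no benefit here.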

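priority_queue takes three template parameters: the element type, the underlying container, and the comparison functor (the last two have defaults). By default it is a max-heap, i.e. the top element is the largest. In this problem the top must be the word with the highest frequency and, among words with equal frequency, the lexicographically smallest one, so a custom comparator is required. Because the comparator is passed as a type, the simplest way to supply it is a struct that overloads operator().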
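The three cases can be compared side by side in a minimal sketch. The names `maxTop`, `minTop`, `ByFreqThenWord`, and `bestWord` are hypothetical, and every function assumes a non-empty input, since calling top() on an empty queue is undefined.

```cpp
#include <functional>
#include <queue>
#include <string>
#include <utility>
#include <vector>

// Default comparator (std::less): max-heap, the largest element is on top.
int maxTop(const std::vector<int>& v) {
    std::priority_queue<int> pq(v.begin(), v.end());
    return pq.top();
}

// std::greater as the third template argument: min-heap, smallest on top.
int minTop(const std::vector<int>& v) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> pq(v.begin(), v.end());
    return pq.top();
}

// Custom functor: returns true when a should sit below b in the heap.
struct ByFreqThenWord {
    bool operator()(const std::pair<std::string, int>& a,
                    const std::pair<std::string, int>& b) const {
        if (a.second != b.second) return a.second < b.second;  // higher count on top
        return a.first > b.first;  // tie: lexicographically smaller word on top
    }
};

// Returns the word that ends up on top under ByFreqThenWord.
std::string bestWord(const std::vector<std::pair<std::string, int>>& entries) {
    std::priority_queue<std::pair<std::string, int>,
                        std::vector<std::pair<std::string, int>>,
                        ByFreqThenWord> pq(entries.begin(), entries.end());
    return pq.top().first;
}
```

A lambda can also serve as the comparator if its type is passed with decltype, but a named struct keeps the queue declaration shorter.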

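vector's emplace_back can be viewed as an alternative to push_back: it constructs the new element in place at the end of the array from the arguments it is given, whereas push_back takes an already-built object and copies or moves it in. The saving appears when constructor arguments are passed directly; when an existing object is passed, as with que.top().first here, both perform a single copy.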
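A short sketch of the difference, using a vector of (word, count) pairs. The alias `Entry` and the function `buildEntries` are hypothetical names for this illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

using Entry = std::pair<std::string, int>;  // hypothetical alias

std::vector<Entry> buildEntries() {
    std::vector<Entry> entries;

    // push_back: a temporary Entry is built first, then moved into the vector.
    entries.push_back(Entry("love", 2));

    // emplace_back: the arguments are forwarded to Entry's constructor and
    // the element is built directly in the vector's storage.
    entries.emplace_back("i", 2);

    // Passing an existing object: emplace_back copy-constructs it, which is
    // the same amount of work push_back would do.
    Entry existing("coding", 1);
    entries.emplace_back(existing);

    return entries;
}
```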
