nittoco · nittoco · Apr 25, 2025 · Apr 25, 2025 · oda · Apr 25, 2025
diff --git a/347. Top K Frequent Elements.md b/347. Top K Frequent Elements.md
@@ -0,0 +1,173 @@
+## Step1
+
+- This loops several times, which feels inefficient and a bit redundant, but as a first attempt:
+
+```python
+from collections import defaultdict
+from typing import List
+
+class Solution:
+    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
+        num_to_frequency = defaultdict(int)
+        for num in nums:
+            num_to_frequency[num] += 1
+        frequencies_and_nums = []
+        for num, frequency in num_to_frequency.items():
+            frequencies_and_nums.append((frequency, num))
+        result = []
+        elems_count = 0
+        for frequency, num in sorted(frequencies_and_nums, reverse=True):
+            result.append(num)
+            elems_count += 1
+            if elems_count == k:
+                return result
+```
+
+## Step2
+
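+- Self-review first
+    - What to do in the following edge cases
+        - k is larger than n
+            - Currently nothing is returned (the function falls through and returns None), which seems bad. For now, maybe return all n elements.
+            - Should it emit a warning or something?
+        - k is 0
+            - Returning nothing might be acceptable. Actually, I would rather return an empty list.
+        - nums is empty
+            - An empty list is probably fine.
+    - Broadly, the flow is: count the frequency of each number (build num_to_frequency) → sort by frequency → take k of them. Would it make sense to extract the "sort by frequency" part into a function?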
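The edge cases listed in the self-review could be handled roughly as follows. This is only a sketch under assumptions the notes leave open: `k <= 0` yields an empty list, an oversized `k` yields every distinct number without a warning, and the function name `top_k_frequent` is hypothetical.

```python
from collections import defaultdict
from typing import List


def top_k_frequent(nums: List[int], k: int) -> List[int]:
    # Assumption: a non-positive k means "no elements requested".
    if k <= 0:
        return []
    num_to_frequency = defaultdict(int)
    for num in nums:
        num_to_frequency[num] += 1
    frequencies_and_nums = [
        (frequency, num) for num, frequency in num_to_frequency.items()
    ]
    frequencies_and_nums.sort(reverse=True)
    # Slicing never raises, so k larger than the number of distinct
    # values simply returns all of them; empty input returns [].
    return [num for _, num in frequencies_and_nums[:k]]
```

Because the slice replaces the manual counter, every path returns a list, which removes the implicit `None` return of the Step1 version.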
+- https://github.com/fuga-98/arai60/pull/10/files
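+    - There is an approach using Counter. most_common in particular makes this easy.
+        - **Look at the documentation and the internal implementation later**
+        - Documentation
+            - A subclass of dict (so only hashable elements work)
+            - elements() returns each element repeated as many times as its count, in first-insertion order
+            - subtract() subtracts counts. total() gives the sum of all counts.
+            - +, -, &, | and so on are also defined
+    - I am not a big fan of using heapq, since sorting gets the job done.
+    - So sorted(num_count, key=num_count.get, reverse=True) alone yields the keys ordered by value. In that case there is no need to extract "sort by frequency" into a function.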
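The Counter features noted above can be tried out in a few lines. This is an illustrative sketch with made-up sample data; `sum(counter.values())` is used instead of `total()` only because `total()` requires Python 3.10 or later.

```python
from collections import Counter

counter = Counter("abracadabra")  # a:5, b:2, r:2, c:1, d:1

# elements(): each key repeated count times, in first-insertion order.
elements = "".join(counter.elements())

# Sum of all counts; Counter.total() returns the same on Python 3.10+.
total = sum(counter.values())

# subtract() works in place and keeps zero or negative counts.
left = Counter(a=3, b=1)
left.subtract(Counter(a=1, b=2))

# The binary operators return new Counters and drop non-positive counts.
added = Counter(a=3, b=1) + Counter(a=1, b=2)
difference = Counter(a=3, b=1) - Counter(a=1, b=2)
intersection = Counter(a=3, b=1) & Counter(a=1, b=2)  # per-key minimum
union = Counter(a=3, b=1) | Counter(a=1, b=2)  # per-key maximum

# Keys ordered by value, without a separate helper function.
keys_by_count = sorted(counter, key=counter.get, reverse=True)
```

Note the difference between `subtract()` and `-`: the former keeps `b: -1`, the latter drops `b` entirely.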
+- https://github.com/fhiyo/leetcode/pull/12/files
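+    - I see, turning it into a dataclass and defining the ordering with `__lt__` could also be nice
+    - There is something called quick select. It is roughly the same as quick sort, but after partitioning you check which side holds the k-th element and narrow the range that still needs processing, which makes it somewhat more efficient. **Implement later.**
+    - Heavy use of map can be hard to read, because the eyes have to jump around.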
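As a point of comparison for the dataclass idea, `dataclasses` can also generate the ordering methods itself via `order=True`, comparing instances as tuples of their fields in declaration order. The sketch below is not the approach from the linked PR; the class name `FreqAndNum` is hypothetical, and the frequency field is placed first on purpose so that it dominates the comparison.

```python
from dataclasses import dataclass


@dataclass(order=True)
class FreqAndNum:
    # Field order matters: instances compare like (freq, num) tuples.
    freq: int
    num: int


items = [
    FreqAndNum(freq=2, num=7),
    FreqAndNum(freq=3, num=1),
    FreqAndNum(freq=2, num=9),
]
ordered_nums = [item.num for item in sorted(items, reverse=True)]
```

A hand-written `__lt__` is still the better fit when the ordering should not follow the field order.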
+- https://discord.com/channels/1084280443945353267/1195700948786491403/1199312856806600805
+    - Popping right after pushing feels slightly off?
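Assuming the pattern in question is `heappush` immediately followed by `heappop` to keep a heap at size k, the standard library offers `heapq.heappushpop`, which does both in one call. A minimal sketch, with the hypothetical function name `top_k_frequent_with_heap`:

```python
import heapq
from collections import Counter
from typing import List


def top_k_frequent_with_heap(nums: List[int], k: int) -> List[int]:
    if k <= 0:
        return []
    heap = []  # min-heap of (frequency, num), never larger than k
    for num, frequency in Counter(nums).items():
        if len(heap) < k:
            heapq.heappush(heap, (frequency, num))
        else:
            # Push the new entry, then pop the smallest, in one step.
            heapq.heappushpop(heap, (frequency, num))
    return [num for _, num in sorted(heap, reverse=True)]
```

Filling the heap first and switching to `heappushpop` afterwards makes the "replace the weakest candidate" intent explicit.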
+- [https://github.com/irohafternoon/LeetCode/pull/11](https://github.com/irohafternoon/LeetCode/pull/11#discussion_r2029796289)
+    - In C++, set is a balanced binary tree, so it works without calling sort()
+    - Sharing a list or map through class variables sacrifices quite a lot
+- [https://github.com/plushn/SWE-Arai60/pull/9](https://github.com/plushn/SWE-Arai60/pull/9#discussion_r2016235753)
+    - Negating the values to use priority_queue felt a little hard to follow.
+- [https://github.com/mura0086/arai60/pull/14](https://github.com/mura0086/arai60/pull/14#discussion_r2009686710)
+    - Similar variable names get confusing
+- https://github.com/Fuminiton/LeetCode/pull/9#pullrequestreview-2626752734
+- https://github.com/quinn-sasha/leetcode/pull/9
+- https://github.com/olsen-blue/Arai60/pull/9
+- https://github.com/t0hsumi/leetcode/pull/9#discussion_r1880366187
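+- Tried the implementation below, using QuickSelect and a dataclass that defines special methods. (Note: this is for practice; I would not write this in a real setting.)
+    - Read the documentation for special methods such as `__lt__` and `__eq__` (https://docs.python.org/ja/3.12/reference/datamodel.html#object.__lt__)
+    - Default implementation: `__eq__` uses `is` and returns `NotImplemented` when the objects are not identical. `__ne__` delegates to `__eq__` and inverts the result.
+    - The other comparison operators have no default behavior. `x < y or x == y` being true does not imply `x <= y`. Without the comparison methods defined, `<` and `<=` raise TypeError.
+    - I did not quite understand "reflection", but from asking ChatGPT, it seems that for `a < b`, if `a.__lt__(b)` does not work (returns `NotImplemented`), `b.__gt__(a)` is tried?
+        - Is this kind of terminology common knowledge?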
+
+    ```python
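+
+    from collections import defaultdict
+    from dataclasses import dataclass
+    from functools import total_ordering
+    from typing import List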
+
+    @total_ordering
+    @dataclass
+    class NumWithFreq:
+        num: int
+        freq: int
+
+        def __lt__(self, other):
+            if self.freq == other.freq:
+                return self.num < other.num
+            return self.freq < other.freq
+
+        def __eq__(self, other):
+            return self.freq == other.freq and self.num == other.num
+
+    class QuickSelect:
+        def __init__(self, nums):
+            self.nums = nums
+            self.all_length = len(nums)
+
+        def _swap(self, index1, index2):
+            self.nums[index1], self.nums[index2] = self.nums[index2], self.nums[index1]
+
+        def _get_median_of_three_and_its_index(self, index1, index2, index3):
+            sorted_three = sorted(
+                [
+                    (self.nums[index1], index1),
+                    (self.nums[index2], index2),
+                    (self.nums[index3], index3),
+                ]
+            )
+            return sorted_three[1]
+
+        # Invariant: every element at or after the pivot's index has a value >= the pivot
+        def _partition(self, left_index, right_index, k):
+            if left_index == right_index:
+                return self.nums[left_index]
+            pivot_value, pivot_index = self._get_median_of_three_and_its_index(
+                left_index, right_index, (left_index + right_index) // 2
+            )
+            self._swap(pivot_index, right_index)
+            num_less_than_pivot = left_index
+            for i in range(left_index, right_index):
+                if self.nums[i] < pivot_value:
+                    self._swap(i, num_less_than_pivot)
+                    num_less_than_pivot += 1
+            self._swap(right_index, num_less_than_pivot)
+            pivot_rank = self.all_length - num_less_than_pivot
+            if pivot_rank == k:
+                return self.nums[num_less_than_pivot]
+            if pivot_rank > k:
+                return self._partition(num_less_than_pivot + 1, right_index, k)
+            return self._partition(left_index, num_less_than_pivot - 1, k)
+
+        def get_kth_largest_value(self, k):
+            return self._partition(0, self.all_length - 1, k)
+
+        def get_sorted_value_until_kth_greatest(self, k):
+            self.get_kth_largest_value(k)
+            return self.nums[-k:]
+
+    class Solution:
+        def topKFrequent(self, nums: List[int], k: int) -> List[int]:
+            num_to_frequency = defaultdict(int)
+            for num in nums:
+                num_to_frequency[num] += 1
+            nums_with_freqs = []
+            for num, freq in num_to_frequency.items():
+                nums_with_freqs.append(NumWithFreq(num=num, freq=freq))
+            quick_select = QuickSelect(nums_with_freqs)
+            ordered_by_kth = quick_select.get_sorted_value_until_kth_greatest(k)
+            result = []
+            for num_with_freq in ordered_by_kth:
+                result.append(num_with_freq.num)
+            return result
+    ```
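The default comparison behavior and the "reflection" fallback described in the notes can be observed directly. The sketch below uses a hypothetical class `OnlyGt` that defines nothing but `__gt__`.

```python
class OnlyGt:
    """Hypothetical class that defines only __gt__."""

    def __init__(self, value):
        self.value = value

    def __gt__(self, other):
        if not isinstance(other, OnlyGt):
            return NotImplemented
        return self.value > other.value


a, b = OnlyGt(1), OnlyGt(2)

# a.__lt__(b) is the default and returns NotImplemented,
# so Python falls back to the reflected call b.__gt__(a).
lt_result = a < b

# Neither __le__ nor the reflected __ge__ is defined, so <= raises,
# even though < works: "<" or "==" does not imply "<=".
try:
    a <= b
    le_raises_type_error = False
except TypeError:
    le_raises_type_error = True

# The default __eq__ compares identity, not the stored value.
eq_result = OnlyGt(1) == OnlyGt(1)
```

This is also why `functools.total_ordering` is useful: given `__eq__` and one ordering method, it fills in the remaining ones.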
+
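+- Implementation with Counter and most_common. In real-world work this would be fine.
+    - Looking at CPython's [implementation](https://github.com/python/cpython/blob/17718b0503e5d1c987253641893cab98e01f4535/Lib/collections/__init__.py#L625) of most_common, it is curious that whenever n is not None it imports heapq on the spot and uses nlargest(). Plain sorted seems like it would have less overhead.
+    - Counter inheriting from dict was as expected. I also skimmed the `__init__` implementation, and it was mostly as expected.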
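The two code paths can be compared side by side. The `heapq` documentation describes `nlargest(n, iterable, key=key)` as equivalent to `sorted(iterable, key=key, reverse=True)[:n]` and notes that it performs best for small n, which may be why it is used here; that motivation is a guess, not something stated in the linked source.

```python
import heapq
from collections import Counter
from operator import itemgetter

counter = Counter([1, 1, 1, 2, 2, 3])
k = 2

via_most_common = counter.most_common(k)
# Keeps only k candidates while scanning the items once.
via_nlargest = heapq.nlargest(k, counter.items(), key=itemgetter(1))
# Sorts every item, then discards all but the first k.
via_sorted = sorted(counter.items(), key=itemgetter(1), reverse=True)[:k]
```

All three produce the same list of `(num, count)` pairs for this input; they differ only in how much work is done when k is much smaller than the number of distinct keys.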
+
+```python
+from collections import Counter
+from typing import List
+
+class Solution:
+    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
+        return [num for num, _ in Counter(nums).most_common(k)]
+
+```
+
+- Implementation that passes a well-chosen key to sorted (curious that Counter's most_common is not implemented this way)
+
+```python
+from collections import defaultdict
+from typing import List
+
+class Solution:
+    def topKFrequent(self, nums: List[int], k: int) -> List[int]:
+        num_to_frequency = defaultdict(int)
+        for num in nums:
+            num_to_frequency[num] += 1
+        return sorted(num_to_frequency, key=num_to_frequency.get, reverse=True)[:k]
+```