Questions
===================================================


Smallest Range Covering Elements from K Lists
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You have k lists of sorted integers in non-decreasing order. Find the smallest range that includes at least one number from each of the k lists.

Example Input: nums = [[4,10,15,24,26],[0,9,12,20],[5,18,22,30]] Output: [20,24]


This problem can be solved by K-Way Merge. k-way merge algorithms or multiway merges are a specific type of sequence merge algorithms that specialize in taking in k sorted lists and merging them into a single sorted list. These merge algorithms generally refer to merge algorithms that take in a number of sorted lists greater than two. Two-way merges are also referred to as binary merges.


.. literalinclude:: smallest_range_cover_klists.h
  :language: c++
  :linenos:
  :lines: 13-38
  :emphasize-lines: 2-8

Refer to :numref:`kwaymerge` for K-way merge of linked lists.


Top K Frequent Elements
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

.. code-block:: none

  Given a non-empty array of integers, return the k most frequent elements.

  For example,
  Given [1,1,1,2,2,3] and k = 2, return [1,2].

  Note:
  You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
  Your algorithm’s time complexity must be better than O(NlogN), where n is the array’s size.

.. only:: html

  - https://leetcode.com/problems/top-k-frequent-elements/


Remove Smallest Peaks in Order
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You are given a list of unique integers nums with length of N. Return a new list created by repeatedly removing the smallest peak from nums and appending to a resulting list. A number nums[i] is called a peak if:

.. code-block:: none

    nums[i] > nums[i + 1] if i = 0
    nums[i - 1] < nums[i] if i = N - 1
    nums[i - 1] < nums[i] > nums[i + 1]
    (Constraints: 0 ≤ N ≤ 100,000 where N is the length of nums)

Otherwise, if a list has one element, that element is considered to be a peak.

Example:

.. code-block:: none

  Input: nums = [3, 5, 1, 4, 2]
  Output: [4, 2, 5, 3, 1]
  Explanation:
  We remove 4 and get [3, 5, 1, 2]
  We remove 2 and get [3, 5, 1]
  We remove 5 and get [3, 1]
  We remove 3 and get [1]
  We remove 1 and get []

Other examples:

.. code-block:: none

  Input  : [3, 5, 1, 4, 2]
  Output : [4, 2, 5, 3, 1]

  Input  : [3]
  Output : [3]

  Input  : [1, 2, 3, 4, 5]
  Output : [5, 4, 3, 2, 1]

  Input  : [5, 4, 3, 2, 1]
  Output : [5, 4, 3, 2, 1]

----

It is easy to find the :math:`O(N^2)` solution. To reduce the time complexity, we can use priority_queue. To be more efficient on removing item, we need to construct a ``doubly linked list``.

The node definition is:

.. literalinclude:: remove_smallest_peaks_in_order_test.cpp
  :language: c++
  :linenos:
  :lines: 6-13
  :emphasize-lines: 5

The algorithm goes like this:

#. In first for loop, create a doubly linked list and push all peaks into a priority queue.
#. In the second for loop, pop the minimum heap and remove it from the linked list and add its predecessor or successor if one of them is peak.

.. literalinclude:: remove_smallest_peaks_in_order_test.cpp
  :language: c++
  :linenos:
  :lines: 14-60
  :emphasize-lines: 27-46


Minimize Max Distance to Gas Station
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

On a horizontal number line, we have gas stations at positions stations[0], stations[1], ... stations[N-1], where N is the number of the stations. Now, you need to add K more gas stations so that D, the maximum distance between adjacent gas stations, is minimized. Return the smallest possible value of D.

.. code-block:: none

    stations = [0, 1, 2, 3, 5]

         s[0]---------s[1]---------s[2]----------s[3]------------------------s[4]      D
    K=2:  0 -----x---- 1 ---------- 2 ----------- 3 ------------x------------ 5 -----> 1
    K=3:  0 -----x---- 1 ---------- 2 ----------- 3 ------x----------x------- 5 -----> 1
    K=5:  0 -----x---- 1 -----x---- 2 -----x----- 3 ------x----------x------- 5 -----> 2/3

    function signature: double minmaxGasDist(vector<int> v, int k);

----

.. literalinclude:: code/gas_station_test.cpp
  :language: c++
  :linenos:

.. only:: html

  - https://leetcode.com/problems/minimize-max-distance-to-gas-station/


Customer Total Waiting Time
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Calculate the total waiting time for a customer C to speak to an agent given N agents, M customers, and T[] time for an agent to serve a customer. T[i] represents the amount of time it takes for an agent i to serve one customer. One agent can serve one customer at a time. All N agents can serve concurrently. The customer chooses the agent with the lowest wait time.

.. code-block:: none

    N = 2, M = 2, T = [4, 5]
    First customer chooses agent 1. Second customer chooses agent 2. Customer C will wait 4 minutes.

    N = 2, M = 4, T = [4, 5]
    First customer chooses agent 1. Second customer chooses agent 2. Third customer chooses agent 1. Forth customer chooses agent 2. Customer C will wait 8 minutes.

----

A minheap is ued to solve this problem.

.. literalinclude:: code/wait_time_test.cpp
  :language: c++
  :linenos:


.. only:: html

  - https://www.1point3acres.com/bbs/thread-891913-1-1.html
  - https://leetcode.com/problems/process-tasks-using-servers/


.. _Beam-Search:

Beam Search
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

You are developing a game where players navigate a grid-based map to find treasure. Obstacles block certain cells, and each cell has a cost to move through. We define our cost function as below:

.. code-block:: c++

    int cost_function(const vector<Pos> &path, const VVI &grid) {
        int cost = path.size(); // Base cost on length
        for (auto pos: path) {
            cost += grid[pos.first][pos.second]; // Add obstacle cost
        }
        return cost;
    }

For the below grid, from the top-left cell to bottom-right cell, the top-2 most efficient paths are as below:

.. code-block:: none

     {0, 0, 1, 0, 0}         {x, 0, 1, 0, 0}        {x, 0, 1, 0, 0}
     {0, 1, 0, 1, 0}         {x, 1, 0, 1, 0}        {x, 1, 0, 1, 0}
     {0, 0, 0, 0, 1}   ==>   {x, x, x, x, 1}   and  {x, x, x, x, 1}
     {1, 0, 1, 0, 0}         {1, 0, 1, x, x}        {1, 0, 1, x, x}
     {0, 0, 0, 0, 0}         {0, 0, 0, 0, x}        {0, 0, 0, 0, x}

       input grid           top-2 most efficient paths indicated by x

Implement a function using **beam search** to find the top-K most efficient paths from a starting point to the treasure location.

.. code-block:: c++

    // Assuming VVI is defined somewhere as:
    // using VVI = std::vector<std::vector<int>>;
    using Pos = std::pair<int, int>; // row, column definition
    /**
     * Finds the top-K most efficient paths using beam search.
     *
     * @param grid A 2D vector representing the map, where grid[i][j] can be:
     *             0 (empty), 1 (obstacle), or a positive integer indicating cost.
     * @param start A pair (row, col) representing the starting position.
     * @param goal A pair (row, col) representing the treasure location.
     * @param k The number of top paths to return.
     * @param beam_width The beam width for the beam search.
     * @return A vector of the top-K paths, where each path is a vector of (row, col) pairs.
     */
    std::vector<std::vector<Pos>> find_top_paths(const VVI& grid, const Pos& start,
                                                 const Pos& goal, int k, int beam_width) {
        // Implementation here
    }

----

We can use priority queue to implement beam search, but before implementing the solution, there are two questions:

- Should we use `vistied_set` like in many other graph traversal solutions(eg. BFS, DFS, Dijkstra)?
- Should we reuse cost value saved in priority queue?

One feasible implementation would be as below:

.. literalinclude:: code/beam_search_test.cpp
  :language: c++
  :linenos:
  :lines: 22-65
  :emphasize-lines: 24-26

.. note::

    **No Visited Set**

    - Beam Search Nature: Beam search, by its nature, keeps a narrow focus on a set of the most promising candidates (paths) at each step. Unlike breadth-first or depth-first searches, which might exhaustively explore all paths and thus benefit more directly from a visited set to prevent re-exploration, beam search's limited "beam width" inherently restricts its exploration scope. However, this doesn't mean revisiting nodes is always avoided; beam search can revisit nodes if doing so is part of paths within the current best candidates.

    - Path Uniqueness: Each path in the priority queue is unique, and extending a path with new neighbors naturally avoids immediate cycles. But since the algorithm aims to find multiple top paths (k paths), completely blocking revisited nodes could prematurely prune potentially optimal or diverse paths that share initial segments but diverge later.

    - Dynamic Programming vs. Beam Search: In dynamic programming approaches (like Dijkstra's algorithm), marking nodes as visited is crucial because the goal is to find the single best path to each node based on cumulative cost. In contrast, beam search in this context seeks multiple high-quality paths, potentially with shared segments, making a global visited set less applicable.

    - Visited Set Reconsideration: For specific cases or larger grids, reintroducing some form of visited state management (potentially at a granular level, e.g., per path or with nuanced conditions) might optimize performance by avoiding redundant explorations.

    **Use of ignored_cost**

    - Priority Queue Elements: The priority queue stores pairs of integers (cost) and paths (vectors of positions). In the context where the priority queue is polled (candidates.pop()), the cost (ignored_cost) has already served its purpose for sorting within the priority queue. The focus shifts to examining and extending the path associated with that cost.

    - Cost Recalculation: When a path is extended, its new cost is calculated based on the entire path plus the heuristic for the next step. This recalculation makes the previous cost (the one that was used for priority queue ordering) obsolete for paths still in contention, hence it can be "ignored" or not used directly in further computations within that iteration of expanding paths.

    - Simplification and Focus on Path Extension: The use of ignored_cost simplifies the code by not distracting with cost management once paths are being extended. It clarifies that the decision-making process at that point is focused on path exploration rather than cost comparison.