Rotate a Square Matrix

(Referring to this leetcode problem.)

Pictorially, my approach consists of:

Given an NxN (N=6 here) matrix, number its four corners:

1 . . . . 2
. . . . . .
. . . . . .
. . . . . .
. . . . . .
4 . . . . 3

Rotate elements at positions 1, 2, 3, 4 clockwise:

4 . . . . 1
. . . . . .
. . . . . .
. . . . . .
. . . . . .
3 . . . . 2

Next, number the elements immediately clockwise of each corner:

# 1 . . . #
. . . . . 2
. . . . . .
. . . . . .
4 . . . . .
# . . . 3 #

(Elements already dealt with are marked with #.)

Repeat the previous steps N-1 times (in total):

# # 1 . . #        # # # 1 . #
. . . . . #        . . . . . #
. . . . . 2  --->  4 . . . . #
4 . . . . .  --->  # . . . . 2
# . . . . .        # . . . . .
# . . 3 # #        # . 3 # # #
                //
              //
             ↙
# # # # 1 #        # # # # # #
4 . . . . #        # . . . . #
# . . . . #  --->  # . . . . #
# . . . . #  --->  # . . . . #
# . . . . 2        # . . . . #
# 3 # # # #        # # # # # #

Notice how we moved through the ‘outer layer’ of the matrix in a ‘spiral’ shape, rotating its elements as we went.

By applying the same procedure to the inner layers, we arrive at a fully rotated matrix.

The code follows:

struct Pos { int i, j; };

// forward declarations (definitions follow):
void rotateLayer(vector<vector<int>>& m, int layer, int size, int step);
int& at(vector<vector<int>>& m, Pos p);

void rotate(vector<vector<int>>& m) {
  // for every k-th layer:
  for (int k = 0; k < m.size() / 2; ++k) {
    // the layer is the border of an n-by-n submatrix:
    int n = m.size() - k * 2;
    // apply our 'spiral' rotating procedure n-1 times:
    for (int s = 0; s < n - 1; ++s) {
      rotateLayer(m, k, n, s);
    }
  }
}

void rotateLayer(vector<vector<int>> &m, int layer, int size, int step) {
  int k = layer;
  int n = size;
  int s = step;

  // top-left corner,  moving right   (j += s)
  Pos p1 { k,             k + s };
  // top-right corner, moving down    (i += s)
  Pos p2 { k + s,         k + n - 1 };
  // bottom-right corner, moving left (j -= s)
  Pos p3 { k + n - 1,     k + n - 1 - s };
  // bottom-left corner,  moving up   (i -= s)
  Pos p4 { k + n - 1 - s, k };
  
  // swap clockwise
  tie(at(m, p1), at(m, p2), at(m, p3), at(m, p4))
    = make_tuple(at(m, p4), at(m, p1), at(m, p2), at(m, p3));
}

int& at(vector<vector<int>>& m, Pos p) {
  return m[p.i][p.j];
}
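
As a quick sanity check, here's a small driver sketch (not part of the LeetCode submission; paste the definitions above where indicated):

#include <iostream>
#include <tuple>
#include <vector>
using namespace std;

// ... struct Pos, rotate, rotateLayer, at from above ...

int main() {
    vector<vector<int>> m {{1, 2, 3},
                           {4, 5, 6},
                           {7, 8, 9}};
    rotate(m);
    // expected clockwise rotation:
    // 7 4 1
    // 8 5 2
    // 9 6 3
    for (auto& row : m) {
        for (int x : row) cout << x << ' ';
        cout << '\n';
    }
}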

Note that this algorithm is not exactly cache-friendly (:

RE: The long arrow operator in C++

I came across Ivan Čukić’s post on the long arrow operator for C++, where he jokingly proposes the use of --->, ----->, etc. for dereferencing nested pointer-like structures.

Like most jokes, there’s a nugget of truth behind it, and I decided to tackle the problem of nested structures with my own crazy C++.

My take allows the user to write the following code (vec2 here is just a struct with x and y members):

auto hello1(wrap<vec2> w) {
    return w->x + w->y;
}

auto hello2(wrap<wrap<vec2>> w) {
    return w->x + w->y;
}

auto hello3(wrap<wrap<wrap<vec2>>> w) {
    return w->x + w->y;
}

// ... etc

Essentially, the -> collapses an arbitrary number of wrap applications into a single one. But how?

Here’s how.

template <typename T>
struct wrap {
    auto operator->() {
        if constexpr (is_wrap_v<T>) {  // impl of is_wrap_v below
            return value.operator->();
        } else {
            return &value;
        }
    }

    T value;
};

We make use of C++17’s if constexpr feature to statically determine whether we’ve reached a “pure” value or we need to continue unwrapping.
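
The is_wrap_v trait referenced in the comment is deferred to the full code; a minimal sketch of how such a trait could look (the linked version may differ) is shown below. Note that it must be declared before wrap’s definition so the name is visible inside operator->, hence the forward declaration:

#include <type_traits>

template <typename T> struct wrap;  // forward declaration

// primary template: not a wrap
template <typename>   struct is_wrap          : std::false_type {};
// partial specialization: any wrap<T> is a wrap
template <typename T> struct is_wrap<wrap<T>> : std::true_type  {};

template <typename T>
constexpr bool is_wrap_v = is_wrap<T>::value;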

The full code can be found here, and thanks to the amazing Compiler Explorer you can try it for yourself here.

One possible generalization is to replace is_wrap_v<T> with a has_operator_arrow_v<T> which returns true if T provides operator->.
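
Such a trait isn’t shown in the post, but one way to sketch it is with the void_t detection idiom (this detects a member operator-> only):

#include <type_traits>
#include <utility>

template <typename, typename = void>
struct has_operator_arrow : std::false_type {};

// matches only when T has a callable member operator->
template <typename T>
struct has_operator_arrow<T, std::void_t<decltype(std::declval<T&>().operator->())>>
    : std::true_type {};

template <typename T>
constexpr bool has_operator_arrow_v = has_operator_arrow<T>::value;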

Modern C++ is fun!

Pascal's triangle in Haskell

(My answer to this Stack Overflow question.)

Start out with the triangle itself:

     1
    1 1
   1 2 1
  1 3 3 1
 1 4 6 4 1
    ...

Notice that to write down the next row, you apply this rule: sum the previous row’s adjacent elements, using a 0 for the lonely edge elements. Visually:

    0   1   0
     \+/ \+/
  0   1   1   0
   \+/ \+/ \+/
0   1   2   1   0
 \+/ \+/ \+/ \+/
  1   3   3   1
       ...

Operationally, that looks like this:

For row 0:
[1]  (it's a given; i.e. base case)

For row 1:
[0, 1]   <- row 0 with a zero prepended ([0] ++ row 0)
 +  +
[1, 0]   <- row 0 with a zero appended  (row 0 ++ [0])
 =  =
[1, 1]   <- element-wise addition

For row 2:
[0, 1, 1]
 +  +  +
[1, 1, 0]
 =  =  =
[1, 2, 1]

Generally, for row N:

element-wise addition of:
  [0] ++ row(N-1)
  row(N-1) ++ [0]

Remember that element-wise addition of lists in Haskell is zipWith (+).

Thus we arrive at the following Haskell definition:

pascal 0 = [1]
pascal n = zipWith (+) ([0] ++ pascal (n-1)) (pascal (n-1) ++ [0])

Or in a fashion similar to the famous “lazy fibs”:

pascals = [1] : map (\xs -> zipWith (+) ([0] ++ xs) (xs ++ [0])) pascals

Binary search on the integer number line

Consider a predicate p(x) such that:

  • forall x in [1..k-1]: p(x)
  • forall x in [k..n]: !p(x)

In other words, [1..n] is partitioned by p(x) at k. For example, with n = 10 and p(x) = (x*x <= 50), the partition point is k = 8.

In problems such as this one (First Bad Version), we are asked to implement an algorithm analogous to git’s bisect command. Based on the problem statement, it is straightforward to model it as the above partitioning scheme.

Then there are problems such as this other one (Arranging Coins) where the connection to partitioning is not exactly obvious, but it’s still there.

A naive approach to finding k is to iterate over [1..n] sequentially and stop when p(x) no longer holds. The time-complexity of this algorithm is O(n) (ignoring the cost of computing p(x)).
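
A sketch of that naive scan (partition_point_naive is just an illustrative name, with the predicate passed in as a parameter):

// Linear scan over [1..n]: return the first x for which p(x) no longer holds.
template <typename Pred>
int partition_point_naive(int n, Pred p) {
    int x = 1;
    while (x <= n && p(x)) ++x;  // O(n) calls to p
    return x;
}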

A more efficient approach is to perform a binary search over [1..n]. This brings down the time-complexity to O(lg n). Visually:

     [xxxxxxxxxxooo]
     /     /\      \
    /     /  \      \
   /     /    \      \
a:[xxxxxx]  b:[xxxxooo]

p(x) holds for a's last element.
This means that the partition point (k) is not in this range.
We are safe to discard range `a`, halving our search space.

The algorithm works and it’s quite efficient, but implementing binary search can be quite a hassle. I think that playing around with indices is a pain, and it’s very easy to forget when to +1 something or when not to do it. I can rarely get it right on the first try.
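
For comparison, a hand-rolled version over the integer range could look like the sketch below; it works, but the invariant and the ±1s are exactly the fiddly parts:

// Find the partition point in [lo, hi]: p holds on [lo, k-1] and fails on [k, hi].
// Invariant: p holds for everything below lo; p fails for everything above hi.
template <typename Pred>
int find_partition_point(int lo, int hi, Pred p) {
    while (lo <= hi) {
        int mid = lo + (hi - lo) / 2;
        if (p(mid))
            lo = mid + 1;  // partition point is to the right of mid
        else
            hi = mid - 1;  // mid is at or past the partition point
    }
    return lo;  // first value for which p fails
}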

When implementing something is too costly, we must first turn our attention to libraries. Thankfully, binary search is a common enough algorithm that it’s not a big surprise to find an implementation in the standard library of our language of choice.

For example, C++’s algorithm library provides the lower_bound operation with the following signature:

template< class ForwardIt, class T >
ForwardIt lower_bound( ForwardIt first, ForwardIt last, const T& value );

// -- or --

template< class ForwardIt, class T, class Compare >
ForwardIt lower_bound( ForwardIt first, ForwardIt last, const T& value, Compare comp );

However, that’s not quite the droid we’re looking for. The function assumes that we know the element we’re looking for and only tries to find its position (roughly) in the range. We could jump through some hoops to make lower_bound work (e.g. pass a comparator that evaluates our predicate on the element and ignores the value argument), but C++11 has sneaked in a function that better suits our needs: partition_point. Take a look at its signature and description:

template< class ForwardIt, class UnaryPredicate >
ForwardIt partition_point( ForwardIt first, ForwardIt last, UnaryPredicate p );

Examines the partitioned (as if by std::partition) range [first, last) and locates the end of the first partition, that is, the first element that does not satisfy p or last if all elements satisfy p.

That’s exactly what we need… or is it?

If we decide to reify the [1..n] range into some collection (e.g. a std::vector<int>), then yes it is. But that’s an obvious waste of space.
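
To make that concrete, the reified route for First Bad Version would look something like this sketch (firstBadVersionReified is an illustrative name; isBadVersion is the API given by the problem; needs <vector>, <numeric> and <algorithm>):

// The wasteful, reified approach: materialize the whole range just to get iterators.
int firstBadVersionReified(int n) {
    std::vector<int> versions(n);
    std::iota(versions.begin(), versions.end(), 1);   // fill with 1..n -- O(n) memory
    auto it = std::partition_point(versions.begin(), versions.end(),
                                   [](int v) { return !isBadVersion(v); });
    return *it;  // first version that is bad
}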

Can we do away with reifying the range? Yes we can, but it’s not pretty. We need to implement RandomAccessIterator by essentially lifting int into its interface:

// An "iterator" over the integers themselves: it stores the int it points at,
// and dereferencing simply yields that int.
struct number_iter {
    int val;
    bool operator ==(number_iter rhs) {
        return val == rhs.val;
    }
    bool operator !=(number_iter rhs) {
        return val != rhs.val;
    }
    number_iter& operator++() {
        ++val;
        return *this;
    }
    int operator*() {
        return val;
    }
    int operator-(number_iter rhs) {
        return val - rhs.val;
    }
    number_iter& operator+=(int x) {
        val += x;
        return *this;
    }
};

// This is the tricky part: partition_point walks the range via std::distance
// and std::advance, which consult iterator_traits, so we must tell the library
// that number_iter behaves like a random access iterator:

namespace std {
    template <>
    struct iterator_traits<number_iter> {
        using difference_type = int;
        using iterator_category = random_access_iterator_tag;
    };
}

Now we can solve First Bad Version like so:

// search x in [0, n): version x+1 is still good as long as x+1 < first bad
auto pred = [](int x) { return !isBadVersion(x+1); };
auto it = std::partition_point(number_iter{0}, number_iter{n}, pred);
return *it + 1;  // *it == (first bad version) - 1

And similarly with Arranging Coins:

if (n == 1) return 1;  // with the range [0, n) below, n == 1 would be misreported
// x complete rows use 1 + 2 + ... + x = x*(x+1)/2 coins
auto pred = [n](long long x) { return x*(x+1)/2 <= n; };
auto it = std::partition_point(number_iter{0}, number_iter{n}, pred);
return *it - 1;  // first x that doesn't fit, minus one

Okay, okay. I agree that this approach is suboptimal in terms of implementability. You could argue that it’s harder to implement than binary search. But, hey, at least we learned a couple of things about the STL.

Don’t forget to carefully scan through the C++ algorithm library.

Edit (2017-07-16):

Yet another problem: Single Element in a Sorted Array.

int singleNonDuplicate(vector<int>& nums) {
    int n = nums.size();
    // pairs (2i, 2i+1) match before the single element and mismatch from it onwards
    auto pred = [&nums] (int i) { return nums[2*i] == nums[2*i+1]; };
    auto it = partition_point(number_iter{0}, number_iter{n/2}, pred);
    return nums[2 * (*it)];
}

Tree traversal order

(Referring to this leetcode problem.)

My first solution was, of course, recursive:

class Solution {
public:
    vector<int> postorderTraversal(TreeNode* root) {
        vector<int> ans;
        traverse(ans, root);
        return ans;
    }

    void traverse(vector<int>& ans, TreeNode* node) {
        if (!node) return;          // *
        traverse(ans, node->left);  // *
        traverse(ans, node->right); // *
        ans.push_back(node->val);   // *
    }
};

However, the problem suggests that the reader try an iterative solution instead.

In doing so, I tried to keep the essential parts of the algorithm (marked with *) identical, only modeling the recursion with an explicit stack:

// Pattern: discriminated union
struct Action {
    enum Tag { TRAVERSE, SHOW };
    
    Tag tag;
    union {
        TreeNode *node;
        int value;
    };
    
    Action(TreeNode *node)
        : tag{TRAVERSE}, node{node}
    {}
    
    Action(int value)
        : tag{SHOW}, value{value}
    {}
};

class Solution {
public:
    vector<int> postorderTraversal(TreeNode* root) {
        vector<int> ans;
        
        stack<Action> actions;
        actions.push(Action{root});
        
        while (!actions.empty()) {
            auto action = actions.top();
            actions.pop();

            stack<Action> tmp;

            switch (action.tag) {
            case Action::SHOW:
                ans.push_back(action.value);
                break;
            case Action::TRAVERSE:
                auto node = action.node;
                if (!node) continue;           // *
                tmp.push(Action{node->left});  // *
                tmp.push(Action{node->right}); // *
                tmp.push(Action{node->val});   // **
                break;
            }
            
            while (!tmp.empty())
                actions.push(tmp.top()), tmp.pop();
        }
        
        return ans;
    }
};

Note that I didn’t do the following:

actions.push(Action{node->left});  // *
actions.push(Action{node->right}); // *
ans.push_back(node->val);          // **

The problem with doing that is that it no longer models our initial recursive form: the push_back would happen immediately, so the node’s value would no longer be “deferred” until after its subtrees have been traversed. For a root r with a single left child l, that variant outputs [r.val, l.val], whereas the post-order answer is [l.val, r.val].

What’s also nice about this approach is that, just like in the recursive form, we can trivially get pre-order and in-order traversals by reordering the starred lines of the TRAVERSE case:

// pre-order
if (!node) continue;           // *
tmp.push(Action{node->val});   // **
tmp.push(Action{node->left});  // *
tmp.push(Action{node->right}); // *

// in-order
if (!node) continue;           // *
tmp.push(Action{node->left});  // *
tmp.push(Action{node->val});   // **
tmp.push(Action{node->right}); // *

Contrast that with some community solutions (e.g. this most-upvoted one), which completely distort the underlying algorithm.

(Note: from this point on, I’m just playing around.)

Now, if we extract the traversal algorithm into its own function like so:

template <typename F>
void traverse(F&& f, TreeNode *node) {
    if (!node) return;      // *
    f(Action{node->left});  // *
    f(Action{node->right}); // *
    f(Action{node->val});   // **
}

Then, we can reuse it with either a recursive driver:

void recursively_driver(vector<int>& ans, Action action);  // defined below

vector<int> recursively(TreeNode* root) {
    vector<int> ans;
    recursively_driver(ans, Action{root});
    return ans;
}

void recursively_driver(vector<int>& ans, Action action) {
    switch (action.tag) {
    case Action::SHOW:
        ans.push_back(action.value);
        break;
    case Action::TRAVERSE:
        traverse([&ans] (Action a) { recursively_driver(ans, a); }, action.node);
        break;
    }
}

or an iterative driver:

vector<int> iteratively(TreeNode* root) {
    vector<int> ans;
    
    stack<Action> actions;
    actions.push(Action{root});
    
    while (!actions.empty()) {
        auto action = actions.top();
        actions.pop();

        stack<Action> tmp;

        switch (action.tag) {
        case Action::SHOW:
            ans.push_back(action.value);
            break;
        case Action::TRAVERSE:
            traverse([&tmp] (Action a) { tmp.push(a); }, action.node);
            break;
        }

        while (!tmp.empty())
            actions.push(tmp.top()), tmp.pop();
    }
    
    return ans;
}