I finally made it back to HackerRank last week! The overall score wasn’t as good as in the past (135th place out of ~10K participants), but I feel it was definitely good in terms of the solutions I came up with. Also, there are way more players on HackerRank now than there were 4 years ago, so the competition is very strong these days.
And finally, I definitely had room to do better: I came up with seemingly very good approaches to 2 of the 3 problems where I didn’t get a 100% score, but those solutions still didn’t pass all the tests. I wish I had finished at least one of them in time to improve it after the nightly test pass: HackerRank runs the full test suite only once per day, i.e. the preliminary results you see for your submissions might vanish in the morning. So it’s great to have at least one extra day to re-submit a solution in case it fails; somehow I understood this too late :)
There are many guys among my Facebook friends who love competitive programming (Dima, Leonid Volkov, Den Raskovalov, Kah Keng Tay, + many others), so I am going to ask you for some help with the last problem: I am stuck finding a better solution there, though I really want to know it :) Besides that, I am sharing my analysis of why my other 2 solutions didn’t get a 100% score, and I’d be happy to hear your feedback on that too.
Contest link: https://www.hackerrank.com/contests/w23/challenges
I was using Python, F#, C#, and C to submit solutions there. C# and C mostly because I was desperate with the last problem, so I was simply checking what I could get by pushing my solution to its limits.
The solution I found requires:
- O(n * log(tree_depth)) on preprocessing, i.e. O(n * log(n)) in the worst case
- O(log(tree_depth)) on any query, i.e. O(k * log(n)) in the worst case for all queries.
The key idea is: you can compute the force from a subtree v to a node u in O(1), assuming you know their lowest common ancestor, and the LCA can be computed in O(log(tree_depth)).
F# solution: https://goo.gl/vNukuq
As you may notice, I compute the LCA in O(log(tree_depth)²) there, i.e. slower than I suggested above. I knew I should use RMQ or something similar, but decided to save on implementation time by implementing the LCA as a binary search over getParent(depth), where getParent is implemented with a node.Parents[int(log2(jumpSize))] skip list in every node.
Interestingly, it was a kind of “weighted decision”: I made an assumption that since it’s F#, this is going to be fast enough anyway, and I feel this is exactly where I was wrong: if tree_depth is 100K, I’m spending ~17x more per query (log2(100K) ~= 17, and I spend 17*17), and an extra 17x is definitely too much for contest problems.
The obvious fix without involving RMQ is to switch from true depth-based binary splits for the LCA to skip list-based splits: this would make it O(log(tree_depth)) as well.
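For reference, that skip list trick is usually called binary lifting. Here is a minimal Python sketch of it (the names build_lca/lca are mine, not from the contest code, and the tree is assumed to come as a plain parent array with parent[root] == root):

```python
def build_lca(parent):
    """Binary-lifting table: up[j][v] is the 2^j-th ancestor of v.
    parent[root] == root by convention."""
    n = len(parent)
    LOG = max(1, (n - 1).bit_length())
    up = [parent[:]]
    for j in range(1, LOG):
        prev = up[j - 1]
        up.append([prev[prev[v]] for v in range(n)])
    depth = [-1] * n
    for v in range(n):          # fill depths by walking up to a known node
        path, u = [], v
        while depth[u] < 0 and parent[u] != u:
            path.append(u)
            u = parent[u]
        if depth[u] < 0:        # u is the root
            depth[u] = 0
        d = depth[u]
        for w in reversed(path):
            d += 1
            depth[w] = d
    return up, depth

def lca(u, v, up, depth):
    """O(log depth): equalize depths via the skip table, then lift in sync."""
    if depth[u] < depth[v]:
        u, v = v, u
    diff, j = depth[u] - depth[v], 0
    while diff:
        if diff & 1:
            u = up[j][u]
        diff >>= 1
        j += 1
    if u == v:
        return u
    for j in range(len(up) - 1, -1, -1):
        if up[j][u] != up[j][v]:
            u, v = up[j][u], up[j][v]
    return up[0][u]
```

Equalizing the depths and then lifting both nodes in sync each take O(log(tree_depth)) steps, so the extra 17x factor goes away.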
F# solution: https://goo.gl/L8Q05K
I came up with an O(n * log(max_radius / desirable_precision)) solution, i.e. basically O(10⁴ * log(10⁴*10⁴*10³)) ~= O(350000), which seems totally enough. Not surprisingly, it failed on some of the overnight tests, not with a “timeout” but with a “wrong answer”.
The idea is very simple: the answer must be a convex polygon, and all of its vertices must lie on a circle for the area to be maximal. So all you need is to find the radius of this circle, which you can do with binary search: the rotation angle of the last vertex you fit is a monotonically decreasing function of the radius.
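That binary search can be sketched as follows. This is a simplified sketch, assuming the circumcenter lies inside the polygon (a single side long enough to push the center outside the polygon would need separate handling):

```python
import math

def max_polygon_area(sides, iters=100):
    """Maximum area of a polygon with the given side lengths: put all
    vertices on one circle and binary-search its radius. Assumes the
    circumcenter lies inside the polygon (no single dominant side)."""
    lo = max(sides) / 2.0       # the radius can't be below half the longest side
    hi = sum(sides)             # at a huge radius the total angle is ~0 < 2*pi

    def total_angle(r):
        # a chord of length a subtends a central angle of 2*asin(a / (2r))
        return sum(2.0 * math.asin(min(1.0, a / (2.0 * r))) for a in sides)

    for _ in range(iters):      # total_angle decreases monotonically in r
        mid = (lo + hi) / 2.0
        if total_angle(mid) > 2.0 * math.pi:
            lo = mid            # total angle too large: grow the radius
        else:
            hi = mid
    r = (lo + hi) / 2.0
    # the polygon is a fan of isosceles triangles (r, r, side)
    return sum(0.5 * r * r * math.sin(2.0 * math.asin(min(1.0, a / (2.0 * r))))
               for a in sides)
```

For example, four unit sides converge to the unit square (area 1), and six unit sides to the regular hexagon (area ≈ 2.598). Note the sketch runs a fixed number of bisection iterations instead of comparing the total angle against 2*pi with an epsilon.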
I have two guesses on why my solution failed:
- Simple: I didn’t use an epsilon when comparing the last vertex angle with 2*pi to decide where to go (to the left half or to the right half) after the binary split. In fact, if the difference is very close to zero, I shouldn’t prefer either side, but rather narrow down the search range around this point by e.g. 50%.
- Complex: since all the point coordinates were computed sequentially, I am accumulating error while moving from the first vertex to the last one, and moreover, the error is probably exploding. I guess I should at least fit the vertices going in both directions around the circle (50% of them in either direction) and compare the angle difference at the meeting point; this alone could decrease the error a lot, assuming it really grows exponentially (if not, this is probably not a big deal).
- Any other ideas on why it fails or on other approaches are welcome :)
Sasha and the Swaps II
This is where I got stuck. This is what you get when you see your name right in the problem title :)
I came up with an O(n²) DP solution here, but apparently this isn’t enough: max n is 10⁵ for this problem.
DP solution is based on the following facts:
- You need only n to solve the problem, not the sequence itself: map the original sequence of items to the [1 … n] sequence, and the answer is always the same for the same n. So let’s assume from here on that our target sequence is always [1 … n].
- Let’s say m is the length of the shortest sequence of swaps converting permutation A to permutation B. Then you also need m steps to convert B back to A, i.e. 2m steps in total for a dummy sequence of swaps that doesn’t change A. Also, the smallest such dummy sequence is a double swap (the same swap applied twice), which works for any permutation A. This proves that if the shortest sequence converting A to B has m steps, any other such sequence has a length of m + 2k.
- Now let’s count only the shortest swap sequences: let s(n, k) be the number of permutations of [1 … n] whose shortest swap sequence has length k. The solution for (n, k) can be obtained from the solutions for (n-1, k) and (n-1, k-1):
a) the anyPermutation(n-1) + [n] case (i.e. when the n-th item is appended at the tail) requires no extra swaps; there is exactly one such permutation for each anyPermutation(n-1)
b) the anyPermutation(n-1) with [n] in the middle / at the beginning case (i.e. when the n-th item is not at the tail) requires 1 extra swap to move n to the tail, and there are n-1 such permutations for each anyPermutation(n-1)
Thus s(n, k) = s(n-1, k) + (n-1) * s(n-1, k-1), and computing the n-th row of this matrix requires O(n²) time and O(n) memory.
- If you know s(n, i) for i = [0 … n-1], you can compute the final solution p(n, k) = sum(s(n, i) | i = k, k-2, k-4, …). This can be done in O(n) for all k at once, since p(n, k) = p(n, k-2) + s(n, k), so you just accumulate the result.
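Here is a minimal Python sketch of the whole DP; the 10⁹+7 modulus is my assumption (the post doesn’t state the problem’s modulus):

```python
def count_swap_sequences(n, k):
    """p(n, k): permutations of [1..n] reachable with exactly k swaps.
    First the O(n^2) DP for s(n, j), the shortest-sequence counts, then
    the parity sum. MOD is an assumption, not from the problem statement."""
    MOD = 10**9 + 7
    s = [1]                      # n = 1: only the identity, 0 swaps
    for m in range(2, n + 1):
        ns = [0] * m
        for j, cnt in enumerate(s):
            ns[j] = (ns[j] + cnt) % MOD                    # m appended at the tail
            ns[j + 1] = (ns[j + 1] + (m - 1) * cnt) % MOD  # m in one of m-1 other spots
        s = ns
    # p(n, k) = s(n, k) + s(n, k-2) + s(n, k-4) + ...
    total = 0
    for j in range(k, -1, -2):
        if j < len(s):
            total = (total + s[j]) % MOD
    return total
```

For n = 3, exactly one swap reaches the 3 transpositions; exactly two swaps reach the two 3-cycles plus the identity (via a double swap), i.e. 3 permutations in each case.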
So overall, it’s an O(n²) solution. Moreover, s(n, k) are the unsigned Stirling numbers of the first kind (I didn’t know this; I googled a subset of the numbers when I was desperate, to check if I’d get something well-known). And I quickly found an answer on MathOverflow claiming there is basically no way to compute a row of these numbers faster than in O(n²).
Update: Den Raskovalov suggested this improvement:
Stirling numbers are the coefficients of a polynomial which is a product of n linear polynomials, right? So we can start with n linear polynomials, multiply pairs of them at each step, and end up with one polynomial of degree n. Multiplication of the polynomials could be done via Karatsuba in pow(n, 1.5) (approximately), right? That gives us an asymptotic better than n².
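A sketch of Den’s scheme in Python, with plain schoolbook multiplication standing in for Karatsuba: s(n, k) is the coefficient of x^(n-k) in x(x+1)…(x+n-1), so multiplying the n linear factors pairwise, round by round, produces the whole row at once:

```python
def poly_mul(a, b, mod):
    """Schoolbook O(len(a)*len(b)) multiplication; this is the part Den
    suggests replacing with Karatsuba (or FFT) to beat the O(n^2) DP."""
    res = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            res[i + j] = (res[i + j] + x * y) % mod
    return res

def stirling_row(n, mod=10**9 + 7):
    """Coefficients of x(x+1)...(x+n-1), lowest degree first:
    result[n - k] equals s(n, k) from the DP above.
    The modulus is an assumption, as in the sketch above."""
    polys = [[i, 1] for i in range(n)]     # the linear factors (x + i)
    while len(polys) > 1:                  # multiply pairs, round by round
        nxt = []
        for i in range(0, len(polys) - 1, 2):
            nxt.append(poly_mul(polys[i], polys[i + 1], mod))
        if len(polys) % 2:
            nxt.append(polys[-1])          # odd one out goes to the next round
        polys = nxt
    return polys[0]
```

Swapping poly_mul for Karatsuba (or an FFT-based multiply, as Jonny suggests below) is what changes the asymptotics.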
Another one is from Jonny Ho:
Divide & conquer adds log(n) to the above, so pow(n, 1.5) * log(n) is hard to get under the time limit. On the other hand, FFT, which gives n*log²(n), is annoying to get right under mods.
I clearly understand how to apply Karatsuba here, + a simple calculation shows that log(n) won’t dramatically increase the run time.
And I’ve never used FFT with polynomials in practice, so I guess I should try to implement the faster approach suggested by Jonny Ho some day :)
P.S.: Den Raskovalov, the C# solution for this problem is just 25% slower than C++ (.NET/Windows vs g++/Ubuntu), so maybe this is a good topic for another “.NET vs C++”-style post. Based on this data, it was reasonable to stick to C# there (the time limit is 3s for it, vs 2s for C/C++), but since HackerRank runs C# on Mono, it actually performs much worse there.