CAP in a nutshell

CAP stands for Consistency, Availability and Partition Tolerance and specifies one can have only two of the above three attributes in a distributed data store system such as an SQL or NoSQL system. The Designer or Architect has to pick the poison based on the requirements. Further explanation of CAP attributes follows.

If one desires high availability (A-Availability) of a system, one need to have partitioning – more than 1 server. When there are multiple servers, the connectivity between them (to sync data) may become broken (a.k.a. network partition) occasionally. While the connectivity between the servers is broken, the servers may not have their data synched across each other. This leads to data inconsistency, when read from a server that has stale data, which will eventually become consistent (C-Consistency) when the network connectivity recovers.

Hence, given a highly available multi server data store system, you can only have one of Consistency(C) or Partition Tolerance(P), not both.

This kind of reminds me of the choices we have while doing things – Cheap, Good and Fast.

HTTP 2.0

HTTP 2.0 is the most recent major revision of the HTTP protocol that powers the internet.

One significant advantage HTTP 2.0 brings to the table is its efficient parallelism in executing requests on a connection to download remote resources. While the previous versions of HTTP can only have one request pending on a connection at any given point in time, HTTP 2.0 can multiplex many requests at the same time and fetch the requested resources out of order and hence is not slowed down by the slowness of prior requests – no head-of-line blocking. In addition, it uses a binary protocol with header compression that can lead to a reduction in the data downloaded over the wire. While HTTP 1.1 allows submitting requests (via pipelining) even before the previous requests are completed, the responses can be fetched only in-order and hence can slow down due to head-of-line blocking. In addition, HTTP 2.0 allows the server to push resources even without being asked by the client – for example, the server could proactively push CSS, js, and images on a page immediately after the page was delivered – this can speed up the page load time.

Most browsers and some web servers already support HTTP 2.0 today. Though HTTPS is not mandatory by the specification, most implementations today do it over HTTPS, which means HTTPS is a must for HTTP 2.0.

In terms of some of the practices that are used today to workaround limitations in prior HTTP versions, such as domain sharding to overcome “only one simultaneous request on a connection”, becomes obsolete with HTTP 2.0 and is, in fact, an anti-pattern as it leads to multiple connections establishment overhead.

Curious to see how fast the web adapts HTTP 2.0.

Dynamic Programming

Dynamic Programming is a rage in interviews these days, not many get to use them in their daily programming lives, though.

In Dynamic Programming the idea is simple – if you have counted a million items in a basket and if somebody added another item to the basket then the no of items in the basket currently is a million+1. You do not recount again, just add to the work you already did. You build solutions for problems using solutions for the sub-problems.

Here’s a sample problem from LeetCode.

Problem:
Say you have an array for which the ith element is the price of a given stock on day i.

If you were only permitted to complete at most one transaction (ie, buy one and sell one share of the stock), design an algorithm to find the maximum profit.

Example 1:
Input: [7, 1, 5, 3, 6, 4]
Output: 5

max. difference = 6-1 = 5 (not 7-1 = 6, as selling price needs to be larger than buying price)
Example 2:
Input: [7, 6, 4, 3, 1]
Output: 0

In this case, no transaction is done, i.e. max profit = 0.

Solution:

class Solution {
    public int maxProfit(int[] prices) {
        int currentMaxProfit = 0,
            currentMinPrice = prices.length>0?prices[0]:0;

        for (int i=1; i<prices.length; i++) {
            currentMaxProfit =
                Math.max(currentMaxProfit, prices[i] - currentMinPrice);
            currentMinPrice = Math.min(currentMinPrice, prices[i]);
        }

        return currentMaxProfit;
    }
}

As you can see, the minimum price is not recomputed by scanning all elements of the array from start to current element – 1 to compute max profit for the current element – converting this algorithm to an O(n) from O(n*n).

Invert a Binary Tree

Apparently, the author of Homebrew couldn’t solve this problem while interviewing for a job at Google, I gave it a shot with the comfort a living room and no time pressure.

Problem:
https://leetcode.com/problems/invert-binary-tree/

Solution:

/**
 * Definition for a binary tree node.
 * public class TreeNode {
 *     int val;
 *     TreeNode left;
 *     TreeNode right;
 *     TreeNode(int x) { val = x; }
 * }
 */
public class Solution {
    public TreeNode invertTree(TreeNode root) {
        if (root == null)
	        return root;
        TreeNode temp = root.left;
        root.left = root.right;
        root.right = temp;
        invertTree(root.right);
        invertTree(root.left);
        return root;
    }
}

CSRF and XSS

In this post, I decided to dissect two common security vulnerabilities that exist on the web and along the way highlight some browser security model intricacies.

CSRF (Cross Site Request Forgery) can be explained with a simple example as follows:

1. User visits a malicious website M.

2. M embeds an image tag on its page, that invokes, say, a URL that transfers money to another account from the user’s account. If the user was logged into his bank account when he loaded M’s page, the transfer would happen.

Of course, this is a contrived example of using GET requests, usually these things are done via embedded forms and via POST requests which still works. One may ask, how can a page from one domain post to a different domain, isn’t it cross domain? Surprisingly, this is possible. What is not possible is javascript posting cross domain.

How do you prevent such a CSRF attack? One way is to embed some data in the FORMs, like a hidden parameter and verify the presence of it when the form is received on submission. This will ensure only forms generated by the site, bank – in this case, will be accepted.

XSS (Cross Site Scripting) can be explained with an example as follows:

There exists a website V which reflects the input parameters as is on the web pages. For example, V spits out any javascript that is passed as input.

1. A malicious user crafts a URL to V with javascript in the input that injects a js file from a malicious website.

2. The URL is sent to the targetted user via email or through other means.

3. When the user clicks the URL,  the reflected javascript is executed which in turn loads the js file from the malicious website. This js file executes and sends any sensitive data on the page such as authentication cookie to the malicious user who then has access to the targeted user’s account by copying the authentication cookie to his browser.

You may ask, how can a js file downloaded from a third party site have access to V’s DOM or page, isn’t it cross domain which is not allowed? No, this is allowed as the js download request was on the page downloaded from V. This is how things like optimizely or CDNs work, the script downloaded from optimizely or CDNs can modify content on the page downloaded from V.

How do we prevent XSS? Simple, clean up any suspicious input such as javascript, SQL, etc. from user supplied inputs and do not reflect them back on the rendered page.