Zürich from a Scientific Lens

Note: I don’t usually publish what I write in my journal but this just has to be an exception.

July 8, 2017

Image may contain: sky and outdoor

Entry 1 As I head to Zürich, I cannot help but feel an elevating sense of excitement. I have been there once before on Ankit’s invitation to visit ETH Zürich’s computer science department but I barely explored anything else in that visit. This time, however, I have fully planned the trip. As soon as my train reaches Zürich in 20 minutes, I will put it into execution.

I will begin by visiting the ETH Bibliothek (ETH Zürich’s library). Image may contain: people sitting and indoorThen I will wander across the building of ETH Zürich. By 1045, I will enter the Swiss National Museum, followed by a visit to the Kunsthaus Zürich (the Zürich Museum of Fine Arts) at 1230. At 1415, I will take a walk in the Chinese Garden in Zürich followed by a visit to Lake Zürich. At the end of the day, I plan to take the train on platform 31 at 1802 back to Lausanne and reach my room by 2054. Here’s to an educational, informative and exciting adventure!

Image may contain: one or more peopleEntry 2 As I sit here in the main wing of ETH Bibliothek, I cannot hide my admiration and amazement at the cultural wealth and beauty of the ETH Zürich campus. I wonder how Einstein must have felt while studying here and later in his life teaching here. How Von Neumann attended lectures in these auditoriums while pursuing his undergraduate degree here. Twenty one Nobel laureates have been part of this wonderful place. An even larger number have set foot in it.

No automatic alt text available.Entry 3 Once again, I find myself in the main building of ETH Zürich but this time after having visited and marvelled at the splendour of the Universitat Zürich Zoological Museum, the Swiss National Museum and the Zürich Museum of Fine Arts (I visited them in the order I mentioned them). As a science enthusiast, it is unneccessary to mention the fact that I liked the Zoological Museum most. It contained the skeleton of a mammoth and the fossilized skeleton of one of  the first fish species to evolve tiny legs for crawling onto land. That skeleton (or the individual it belonged to) may as well be my ancestor! (and an ancestor to the homo sapiens, our species).

No automatic alt text available.

The Swiss National Museum contained a display on migrants including Albert Einstein mentioning how he moved to Zürich after completing school, studied at ETH Zürich and kater in developed the general theory of relativity here before migrating to the United States. The Zürich Museum of Fine Arts technically had a ticket not covered by my Swiss Pass but the nice attendants there (secretly) gave me a free ticket. No doubt it contained amazing stuff. Unfortunately, I have not learnt to understand or critique art yet but I took lots of pictures (of all three museums) so maybe some day when/if I manage to learn the skill of comprehending artistic expression I can take a look at the pictures and experience some of the admiration and awe that comes with comprehension and which I sadly missed today.

Image may contain: 3 people, people standing

I returned to the ETH Zürich main building because my phone’s battery was almost completely drained due to all the camera, mobile data and GPS usage. Since I need my phone to find my way in Zürich, it was a necessary delay. There are only two things left in today’s plan. the Chinese Garden and Lake Zürich. Due to the extra stop made at the Zoological Museum, I might end up having to give up on one of the spots if I intern to catch the 1802 train back to Renens but I will try my best to make the most of the remaining time.

Image may contain: 1 person, textEntry 4 Change of plans. I just fiund and ETH Bibliothek pamphlet titled “Einstein’s Zürich”. The pamphlet mentions ten places relevant to Einstein’s life and work in ETH Zürich’s many buildings spread all over Zürich and other places. My new plan is to spend the remaining time I have before the train arrives in visiting as many of these ten places as possible.

Entry 5 I succeeded in visiting four of the ten places. The first being ETH Zürich itself. The second Einstein’s residence between 1886 to 1888 and from 1889 to 1900 at Unionstrasse 4. It even contained a memory plaque commemorating Einstein. The third was Einstein’s residence from 1900 to 1901 and fourth place was the site of ETH Zürich’s old physics department where Einstein was a full professor of theoretical physics from 1912 to 1914. The site now has the ETH Zürich electrical engineering department. It started raining after that so I was forced to hurry to the Zürich central train station. Overall, this was a truly amazing day. Quite possibly, one of the finest days of my life.

Image may contain: plant

Fluttering For A Day – An Elegy

Image result for we are like butterflies who flutter for a day and think it is forever

No matter how eloquently put by Carl Sagan, the thought is still agonizing. We have too little time. Eighty years? Ninety years? Is that enough to live a happy and meaningful life? I cannot say. Is it enough to gain an adequate amount of knowledge? No. Let alone research, it is not even enough to complete the current past records of our own species. It is not even enough to complete one form of the records of our species.

According to Google, in 2010 there were 129, 864, 880 unique books (~130 million) in the world. (source) Imagine if a person lives for 80 years and begins reading one entire book a day ever since he learns to read at the age of 10. That translates to a mere 70 \times 365 = 25550 books. That is a little less than 0.02 percent of the books our species currently possesses. Of course this is only a ballpark figure as a lot of books probably containing similar information but it is not inaccurate to claim that even a life spent in the pursuit of information will result in an individual gaining an almost negligible slice of what is present.

How do we resolve this tragedy? My own take is not to think of knowledge as something to be completely acquired by an individual but more like a network of ideas and information collectively owned by the entire species. Somewhat akin to a distributed file system or a cloud architecture like the one used for Google’s page ranking or Facebook’s TAO abstraction. A researcher caches a small slice of this entity in his or her lifetime and uses it to (if he or she is fortunate) add a yet undiscovered and exciting piece to the entity, leaving it larger and more elegant than what it was before.

How Carpal Tunnel improved my Code


“Waterfall”. M.C Escher.1961.

When I started waking up with annoying and persistent tingling every morning around one and a half years ago, I didn’t take it very seriously which probably was the wrong thing to do. By the middle of 2016 however, the pain had gotten annoying enough to affect my concentration which is when it was evident that I had carpal tunnel. With help from my father who is luckily a medical doctor and told me about wrist braces and various lifestyle changes, the pain became a little manageable however I was still left with an engineering constraint.

Here is what usually happens when I sit and type: After the first ten minutes the area between the thumb and index finger of my left hand begins tingling (I am left handed so it starts with the left hand). Around the seventeen minute mark, my right hand starts feeling similar tingling. After around half and hour, my hands are pretty much in pain and I also begin feeling pain around my shoulders. After forty minutes, I simply have to stand up and either walk around or lie down for a bit before resuming work. The pain subsides more or less completely after ten minutes of rest but my breaks typically last twenty minutes as I am lazy.

To summarize, every forty minutes of typing incurs a cost of 20 minutes. Initially, I was pretty depressed as I made the mistaken and naive assumption of time spent writing code was directly proportional to my productivity as a programmer. With that assumption, carpal tunnel meant a ~33.3% loss in productivity. I have been measuring my roughly productivity by assigning difficulty levels to tasks I store on Google Tasks. This includes research projects, course projects, assignments, reading textbooks/documentation etc. Surprisingly, after eight months of this, it seems my initial assumption of a ~33.3% loss in productivity could not have been more inaccurate. In fact, it seems to me that my productivity has actually  increased over this period.

Yes, I am aware that correlation does not imply causality. Understandably, there are a lot of factors involved in this increase in productivity from the fact that I gain more knowledge and experience with time to the fact that my caffeine consumption has also increased significantly. However, I am still convinced that carpal tunnel is playing a significant role in this trend. Here are a few reasons why I think that is the case.

1- The 20 minute “break” isn’t time wasted

In fact, during the 20 minute break I think about the code I have written and what I plan to do in my next 40 minute work session. In other words, this has turned into a kind of planning session that precedes every work session. The key advantage to this is that, earlier on if I had say ways to approach subproblem X, I would first mentally sort them in descending order with an order principle like: (w_{1} * ease of implementation + w_{2} * probability of working – w_{3} * computational resources required) where the w_{i}s are weights. Then, I would mentally execute the following algorithm:

approach_list = []

while not tired:
    new_approach = think_of_approach()


while not working_fine(current_approach):
    current_approach = approach_list.pop()

until I managed to come up with one that worked fine.

Now what the planning session allows me to do is prune the approach_list before the implementation and testing steps by simply thinking through all the approaches and removing the ones I can deduce (or when I’m really unsure, mathematically prove) will not work. In other words, now I execute the following in between the two while loops (after sorting, although it doesn’t matter):

approach_list.filter(approach => deduce_correct(approach))

Since the deduce_correct function takes less time in the average case than implementtest, this method of pruning the dataset more than offsets the 20 minutes “wasted” in the planning session.

2- Pain incentivizes cleaner code

When every key-press hurts your hands, it is not difficult to motivate yourself to write code that is more:

  • Clear: as altering, say, variable values to figure out what a piece of code is doing later in case I forget will result in more agony.
  • Concise: the less code I write to address every problem, the less it hurts.
  • Reliant on the standard library and third party APIs: reinventing the wheel hurts too much.
  • Better commented: The last thing I want to do is rewrite a routine because I’ve forgotten how it works. The clearer (and more concise) comments I write, the less painful my future.

3- Picking the right tool

This is in some ways a corollary of reason 2. I have now also become more inclined to begin projects using programming languages and technologies that will help me get them done with the least amount of boilerplate code etc. While admittedly this can add performance and scalability challenges as things get more complex, I realize I often over-complicated my life by thinking that far ahead resulting in even bare-bones functionality taking a long time to get implemented and in me often resorting to hacky solutions. For example, it is almost redundant to mention that a simple web-crawler written quickly and beautifully with python + urllib or node-js + https/http libs (or even bash + wget) can take over twice the time (and twice the code) if implemented in certain other languages (Captain obvious: “He means Java”).

Statistics: Writing this took me 1.575 writing sessions and 2 planning sessions. 🙂

What Next – A Research Game Plan


Evariste Galois

The French mathematician Évariste Galois solved a three and a half century old standing problem on the solvability of polynomials. He also laid the foundations of group theory and galois theory somewhere between that time and his untimely death at 20. [More on Galois]

On the other hand, I spent the night of my twentieth birthday studying (and struggling to understand) very simple undergraduate linear algebra miles away from any of Galois’ accomplishments. Not all of us are Galois or anything close to him. That doesn’t mean we can’t strive to reach his level. If not at twenty, then maybe by the time we are fifty.

Luckily, there are plenty of exciting problems to be worked on in my research area and solving them will require a solid knowledge-base and understanding. Here is how I would rate my knowledge of some of the sub-fields I need to know at this moment in time:

  • Internet Architecture: Working Knowledge
  • Datacenters: Working Knowledge
  • Internet Measurements: Good Working Knowledge
  • Distributed Systems: Some Knowledge
  • Network Security: No Knowledge
  • Wireless Networks: Some Knowledge
  • Internet Censorship: Working Knowledge

I am currently in the third year of my undergraduate programme. In the ideal case, before the start of the first year of my graduate programme (if any grad school miraculously agrees to take me in as a PhD student), I should have a “Good Working Knowledge” of all the above categories.


Needless to say this requires developing adequate background in Distributed Systems, Network Security and Wireless Networks (which hopefully will pave the way for perhaps some interesting Internet of Things research) along with brushing up more on Datacenters. Let’s see how things work out. Lots of exciting work ahead. As Carl Sagan would say, “Somewhere, something incredible is waiting to be known”. 🙂

Side note: The frequency of my posts has undoubtedly decreased as I focus more on exciting research projects and graduate-level courses. In fact, the only reason I had the time to post this today was because I have a mid-semester break. Sorry about that.

PageRank: A Case Study for Discrete-time Markov Chains and Graph Theory

Def: Let a webpage be a vertex in a directed graph such that

  1. the weight of an outgoing edge connected to the vertex can only take values in the interval (0, 1]
  2. the sum of the weights of all outgoing edge connected to the vertex equals 1
  3. the weight of an outgoing edge connected to the vertex is \frac{1}{k} where k is the total number of outgoing edges connected to the vertex.

Def: Let a hyperlink be any outgoing edge connected to a webpage.

Def: Let an internet be a graph I = (V, E) such that all vertices v in V are webpages and all edges e in E are hyperlinks. The following definition is taken from [1]:


Seem familiar? If not, I will write it in an equivalent but more recognizable form [3]:
These are the global balance equations of a Discrete-time Markov Chain!
Let $latex I$ be an internet represented by the following diagram. [2]
The above equations can be used to calculate the ranks of the four webpages in internet I. The ranks of webpage 1, 2, 3 and 4 are 0.38, 0.12, 0.29 and 0.19 respectively. [2]
Algorithm [3] (slightly modified):
  1. For an internet, determine all webpages and links between them.
  2. Create a directed graph.
  3. If page i has k > 0 outgoing links, then the probability of each outgoing link is 1/k.
  4. The vertices of the graph are states and its links are transitions of a Markov Chain. Solve the Markov Chain to get its limiting probabilities. The limiting probabilities are the pageranks of the webpages.
It is important to note that not every internet is conveniently ergodic (i.e aperiodic and irreducible. All internets are finite so positive recurrence is not a problem). See page 4 and 5 of [1] to see how Brin and Page solve this problem by creating artificial rank sources in rank sinks. A better explanation is provided in [2] with the title “The solution of Page and Brin”. Of course, to test the pagerank algorithm, Brin and Page created the Google Search Engine initially hosted at google.stanford.edu and then self-hosted at google.com. You might have heard of it.
[1] Paper: “The PageRank Citation Ranking: Bringing Order to the Web” by Brin, Page.
[2] “The Mathematics of Web Search” lecture notes (Winter, 2009) Cornell University
[3] Notes from Dr. Zartash’s lectures on DTMCs and Queueing theory and Notes from Dr. Ihsan’s lecture on the PageRank Algorithm.

Interesting Academic Genealogy

Def: let a → b be a directed edge in which the predecessor a is the advisor and the successor b is the advisee.
Jacob Bernoulli (Doctorate in Theology, 1676, Universität Basel – Switzerland)
→ Johann Bernoulli (Medicinae Dr. Universität Basel 1690 – Switzerland)
→ Leonhard Euler (Ph.D. Universität Basel 1726 – Switzerland)
→ Joseph Louis Lagrange (B.A. Università di Torino 1754 Italy)*
+ Pierre-Simon Laplace (unknown)
→ Simeon Denis Poisson (Ph.D. École Polytechnique 1800 France)^
→ Michel Chasles (Ph.D. École Polytechnique 1814 France)
→ Hubert Anson Newton (B.A. Yale University 1850 United States)
→ Eliakim Hastings Moore (Ph.D. Yale University 1885 United States)
→ Oswald Veblen (Ph.D. The University of Chicago 1903 United States)
→ Alonzo Church (Ph.D. Princeton University 1927 United States)
→ Alan Mathison Turing (Ph.D. Princeton University 1938 United States)
* Not officially. He didn’t get a doctorate degree.
^ co-advised by both.

Networks Notes: RFC 1149 – IP over Avian Carriers

This is a series containing notes I made while reading RFCs.

Link to this RFC.


Avian carrier

A Standard for the Transmission of IP Datagrams on Avian Carriers.

Note: this was the April Fool’s RFC for the year 1990. 🙂

  • Describes an experimental method for sending IP datagrams over “avian carriers” (pigeons!)
  • Primarily useful in Metropolitan Area Networks
  • Experimental, not recommended standard.

Overview and Rational

  • Avian carriers provide:
    • high delay
    • low throughput
    • low altitude service
  • Connection path limited to a single point-to-point path for each carrier.
  • Many carriers can be used without significant interference with each other, outside of early spring.

Frame Format

  • Scroll of paper wrapped around avian carrier’s leg.
  • Bandwidth depends on leg length.
  • MTU typically 256 mg. Paradoxically, generally increases as carrier ages.


  • Prioritized pecking order can be used for multiple types of service.
  • Built-in worm detection and eradication
  • Storms can cause dataloss.
  • Persistent delivery retry, until the carrrier drops.

Security Considerations

  • Not generally a problem in normal operation
  • Data encryption needed in tactical environments.

Networks Notes: RFC 792 – Internet Control Message Protocol

This is a series containing notes I made while reading RFCs.

Link to this RFC.

Internet Control Message Protocol (ICMP)

  • Used by a gateway or destination to communicate with a source host. To report an error, for example.
  • Uses the basic support of IP as if it was a higher-level protocol. Actually a part of IP, is implemented in every IP module.
  • Control messages provide feedback about problems in the communication environment.
  • Not designed to make IP reliable, IP is not designed to be completely reliable. For main purpose, see previous point.
  • No guarantees that a control message will be returned. Datagrams can still be undelivered without being reported by a control message.
  • Reliability is implemented in higher-level protocols that use IP (e.g TCP).
  • ICMP messages typically report errors in the processing of datagrams.
  • No ICMP messages are sent about ICMP messages (to avoid infinite loop).
  • ICMP messages only sent about errors in handing fragment zero of fragmented datagrams (the fragment with the fragment offset equal to zero).

Message Formats

  • Sent using the basic IP header.
  • First octet of the data portion of the datagram is a ICMP type field. Its value determines format of remaining data.
  • Protocol number of ICMP is 1.

For values of an ICMP message’s IP header, see the “Message Formats” section of the linked RFC.

ICMP Fields

  • Type
  • Code
  • Checksum

For details on ICMP message types see pages 4 to 19 of the linked RFC.


Networks Notes: RFC791 -Internet Protocol

This is a series containing notes I made while reading RFCs.

Link to this RFC.

Internet Protocol

  • Implements two basic functions:
    • Addressing
    • Fragmentation
  • Addresses carried in the internet header are used to transmit internet datagrams to their destinations.
  • The selection of a path for transmission is called routing.
  • Internet modules use fields in the internet header to fragment and reassemble internet datagrams when necessary in “small packet” networks.
  • An internet module resides in:
    • each host engaged in communication
    • each gateway interconnecting networks
  • Modules share common rules for interpreting the address fields and for fragmenting and assembling internet datagrams.
  • Modules have procedures for making routing decisions and other functions.
  • The protocol treats each internet datagram as an independent entity unrelated to any other internet datagram (i.e no connections or logical circuits).
  • The protocol uses four key mechanisms to provide service:
    • Type of Service
    • Time to Live
    • Options
    • Header Checksum


  • Type of Service
    • Indicates the quality of the service desired.
    • Used by gateways for:
      • selecting tranmission parameters for a particular network
      • selecting the network to be used for the next hop
      • selecting the next gateway when routing the internet datagram.
  • Time to Live
    • Indication of an upper bound on the lifetime of an internet datagram.
    • Set by sender of the datagram.
    • Decremented at points in the route where it is processed.
    • The internet datagram is destroyed if this  reaches zero before reaching the destination.
    • A “self-destruct” time limit.
  • Options
    • provide for control functions needed or sometimes useful.
    • unnecessary in most common communications.
    • can include provisions for timestamps, security, routing etc.
  • Header Checksum
    • provides a verification that the internet datagram has been transmitted correctly.
    • the internet datagram is discarded if the header checksum fails.
  • The internet protocol does not provide a reliable communication facility. There are no:
    • acknowledgements
    • error control for data, aside from the header checksum
    • retranmissions
    • flow control
  • Errors detected can be reported using the Internet Control Message Protocol (ICMP), implemented in the IP module.


  • Name – what we seek
  • Address – where it is
  • Route – how to get there

On addresses…

  • Addresses have a fixed length of four octets (32 bits).
  • An address:
    • begins with a network number
    • followed by the local address (“rest” field)
  • Addresses have three formats or classes.
    • Class A: high order bit is zero. Next 7 bits are the network. Last 24 bits are the local address.
    • Class B: high order two bits are one-zero. Next 14 bits are the network. Last 16 bits are the local address.
    • Class C: high order three bits are one-one-zero. Next 21 bits are the network. Last 8 bits are the local address.


  • Necessary when large packets have to pass through a local network that limits packets to a smaller size.
  • An internet datagram can be marked “don’t fragment”.
    • It will not be fragmented under any circumstances.
    • If it cannot be delivered without fragmentation, it will be discarded.
  • Needs to be able to break a datagram into an almost arbitrary number of pieces that can later be reassembled.
  • Receiver uses identification field to ensure fragments of different datagrams are not mixed.
  • Fragment offset field tells receiver the position of a fragment in the original datagram.
  • Fragment offset and length determine portion of the original datagram covered by the fragment.
  • More-fragments flag indicates (by being reset) the last fragment
  • Fields that provide sufficient information to reassemble datagrams:
    • identification field
    • fragment offset field
    • length
    • more-fragments flag
  • See RFC page 8-9 for a very well-written description of how fragmentation works.


  • Forward datagrams between networks
  • Also implement the Gateway to Gateway Protocol (GGP) to coordinate routing and other internet control information.
  • Higher level protocols need not be implemented in a gateway. GGP functions are added to the IP module.

The details on specification are very well written and I don’t think notes for them are needed. To read up the specification, refer to section 3 of the above linked RFC.