I maintain a mirror of some classic papers because the originals tend to be poorly formatted for the modern web. I also add my notes and highlights.
A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence
In 1955, four computer scientists wrote the following proposal for a workshop to lay the groundwork for Artificial Intelligence. This workshop is considered the founding event of AI as a field. Ideas from this proposal remain highly relevant to this day. I added side notes with my thoughts and the connections I could trace to more modern AI ideas.
Learning Representations by Back-propagating Errors
This is the classic paper that rediscovered back-propagation. Conceptually, back-propagation is quite simple: it is just a repeated application of the chain rule. However, the results of applying backprop to multi-layer neural networks have been spectacular. This paper reads like a very brief tutorial on deep learning.
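To make the "repeated chain rule" point concrete, here is a minimal sketch on a toy network with one input, one hidden unit and one output unit. The network shape, the squared-error loss and the function names are my own assumptions for illustration, not the paper's notation. Each line of `backprop` is one application of the chain rule, and `numeric_grad` checks the result against finite differences.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w1, w2, x):
    """Toy network (an assumption): h = sigmoid(w1*x), y = sigmoid(w2*h)."""
    h = sigmoid(w1 * x)
    y = sigmoid(w2 * h)
    return h, y

def loss(w1, w2, x, target):
    _, y = forward(w1, w2, x)
    return 0.5 * (y - target) ** 2

def backprop(w1, w2, x, target):
    """Gradients of the loss w.r.t. w1 and w2, one chain-rule step per line."""
    h, y = forward(w1, w2, x)
    dL_dy = y - target                  # derivative of 0.5*(y - target)^2
    delta_out = dL_dy * y * (1.0 - y)   # through the output sigmoid
    dL_dw2 = delta_out * h              # output pre-activation is w2*h
    dL_dh = delta_out * w2              # error sent back to the hidden unit
    delta_hid = dL_dh * h * (1.0 - h)   # through the hidden sigmoid
    dL_dw1 = delta_hid * x              # hidden pre-activation is w1*x
    return dL_dw1, dL_dw2

def numeric_grad(w1, w2, x, target, eps=1e-6):
    """Central finite differences, used only as a sanity check."""
    g1 = (loss(w1 + eps, w2, x, target) - loss(w1 - eps, w2, x, target)) / (2 * eps)
    g2 = (loss(w1, w2 + eps, x, target) - loss(w1, w2 - eps, x, target)) / (2 * eps)
    return g1, g2
```

Calling `backprop(0.5, -1.2, 0.8, 0.3)` and `numeric_grad(0.5, -1.2, 0.8, 0.3)` should give matching gradients up to small numerical error.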
What is Life? The Physical Aspect of the Living Cell
This short book, written by Schrödinger, one of the fathers of quantum mechanics, influenced the discovery of the structure of DNA. In this book, based on lectures he gave in 1943, he predicted the presence of ‘an aperiodic crystal’ that serves as a ‘code-script’ for how organisms work. Within 10 years, in 1953, the structure of DNA was worked out, and DNA is both aperiodic and the code-script of life! He also shows how life looks unintuitive under the known physical laws, how it feeds on ‘negative entropy’ and how we need a different type of law to explain it. The epilogue, On Determinism and Free Will, is the cherry on top and gives a physicist’s view on the topic. Written for dilettantes like myself (Schrödinger himself being one), this book is highly readable and relevant.
Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid
This paper, originally published in Nature in 1953, unraveled the structure of DNA. This was such a huge contribution to biology that it won the authors the Nobel Prize just 9 years after publication. It’s amazing how DNA became such commonplace knowledge in less than 50 years. The paper is surprisingly readable, short and to the point.
Evolutionary Rate at the Molecular Level
This paper, published in 1968, is a prelude to the highly influential book The Neutral Theory of Molecular Evolution, published later in 1983. The paper computes the rate of mutations at the DNA level. These numbers turn out to be so high that we have no choice but to accept that most mutations are selectively neutral. This is in contrast to the widely held notion that evolution, and therefore mutations, happen by natural selection. Selection must be just one of the many evolutionary forces that shape an organism.
The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information
This is a classic 1956 paper marrying information theory and psychology. It is an easy read, yet cited 33000 (!) times. Preview: humans have limited judgement and limited memory, and use recursion to process so much information.
Judgment under Uncertainty: Heuristics and Biases
This landmark cognitive science paper led to prospect theory, behavioral economics and eventually a Nobel Prize in Economics for one of its authors, Daniel Kahneman. The hypothesis of this paper is presented very well: humans rely on a set of heuristics for decision-making, and these useful yet incomplete heuristics lead to cognitive biases in judgement. These heuristics are (i) Representativeness: the probability of an event which resembles a class is judged to be high. This leads to insensitivity to priors, sample size etc. (ii) Availability: the probability of an event is judged by its imaginability. This leads to biases such as illusory correlation. (iii) Anchoring and adjustment: people adjust estimates from an initial anchor. Insufficient adjustment leads to under- or overestimation. Note that these heuristics and biases are distinct from motivational biases such as wishful thinking.
Self-Organized Criticality: An Explanation of 1/f Noise
Power-law distributions are all around us: in cities, the internet, genes, earthquakes and even the brain. Physicists call this 1/f noise. This paper suggests a simple cellular automaton model which can generate the power law. Once the model evolves into a minimally stable state, it is said to be in self-organized criticality. From this state, small disturbances do nothing most of the time but sometimes create ‘black-swan’ avalanches which destroy the whole state.
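A rough sketch of a sandpile-style automaton in the spirit of this model follows. The square grid, the toppling threshold of 4, the open boundaries and the function names are assumptions made for illustration. Grains are dropped one at a time; any cell holding 4 or more grains topples and passes one grain to each neighbour, which can trigger further topplings. The avalanche size is the number of topplings caused by a single grain.

```python
import random

def add_grain(grid, row, col, threshold=4):
    """Drop one grain at (row, col), relax the pile, return the avalanche size."""
    n = len(grid)
    grid[row][col] += 1
    unstable = [(row, col)]
    topplings = 0
    while unstable:
        r, c = unstable.pop()
        if grid[r][c] < threshold:
            continue                      # already relaxed by an earlier toppling
        grid[r][c] -= threshold
        topplings += 1
        if grid[r][c] >= threshold:
            unstable.append((r, c))       # still overloaded, topple again later
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < n and 0 <= nc < n:   # grains pushed off the edge are lost
                grid[nr][nc] += 1
                if grid[nr][nc] >= threshold:
                    unstable.append((nr, nc))
    return topplings

def simulate(n=20, grains=5000, seed=0):
    """Drop grains at random sites and record the size of each avalanche."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(n)]
    sizes = []
    for _ in range(grains):
        sizes.append(add_grain(grid, rng.randrange(n), rng.randrange(n)))
    return grid, sizes
```

After the pile has built up, a histogram of `sizes` should show many tiny avalanches and a few very large ones, which is the heavy-tailed behaviour the blurb describes.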
Collective dynamics of 'small-world' networks
Have you heard of six degrees of separation or the Erdős number? Have you ever met a stranger and realized you both actually have a mutual friend? This is the small-world phenomenon. It has been observed in a wide variety of natural and artificial networks including the internet, genes and neural networks. This paper, cited >40k times, rekindled interest in the small-world phenomenon. It uses a simple construction to demonstrate the phenomenon and shows how computation happens remarkably quickly on small-world networks. In particular, the authors show how an infectious disease (like COVID-19) spreads extremely fast on these networks.
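Here is a small sketch of the kind of construction involved, as I understand it: start from a ring where every node is linked to its nearest neighbours, then rewire each edge to a random node with a small probability. The function names, the adjacency-set representation and the parameter values are my assumptions. Even a few rewired "shortcut" edges should cut the average path length sharply.

```python
import random
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    return adj

def rewire(adj, k, p, rng):
    """Return a copy where each lattice edge is rewired with probability p."""
    n = len(adj)
    new = {i: set(nbrs) for i, nbrs in adj.items()}
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if old in new[i] and rng.random() < p:
                # pick a new endpoint, avoiding self-loops and duplicate edges
                candidates = [v for v in range(n) if v != i and v not in new[i]]
                if candidates:
                    target = rng.choice(candidates)
                    new[i].discard(old)
                    new[old].discard(i)
                    new[i].add(target)
                    new[target].add(i)
    return new

def average_path_length(adj):
    """Mean shortest-path length over all reachable ordered pairs (BFS)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs
```

For example, comparing `average_path_length(ring_lattice(200, 4))` with the same lattice after `rewire(..., 4, 0.1, random.Random(0))` should show a much shorter average path in the rewired network.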
Emergence of Scaling in Random Networks
Another important paper in the network science field (cited >35k times). Just as with small-world networks, the authors observed a universal property across many natural and artificial networks: a scale-free power-law distribution of vertex connectivity. What this means is that a few nodes are highly connected while most nodes are connected to only a few others. The authors show that this phenomenon arises when networks add new nodes over time and these nodes connect preferentially to already well-connected nodes. Maybe this is how networks evolve, and how genetic and neural networks end up being scale-free.
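The growth-plus-preferential-attachment idea can be sketched in a few lines. This is an illustrative toy under my own assumptions (a fully connected starting core of `m + 1` nodes, `m` links per new node, hypothetical function name), not the authors' exact procedure. The trick is the `stubs` list: every node appears in it once per edge endpoint, so a uniform draw from it picks nodes in proportion to their degree.

```python
import random

def preferential_attachment(n, m, seed=0):
    """Grow a network to n nodes; each new node links to m existing nodes,
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # assumed starting point: a fully connected core of m + 1 nodes
    degree = {i: m for i in range(m + 1)}
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(stubs))   # degree-proportional choice
        degree[new] = 0
        for t in targets:
            edges.append((new, t))
            degree[new] += 1
            degree[t] += 1
            stubs.extend((new, t))
    return degree, edges
```

With something like `preferential_attachment(2000, 2)`, most nodes should end up with only two or three links while a handful of early nodes become hubs with far more.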
A Critique on Intellectual Property
In 2001, Linus Torvalds wrote this essay as an appendix to his book Just for Fun. I thought this essay deserved a wider readership. I made some edits to it to make it more readable as a paper. I hope Torvalds doesn’t kill me for this!
The Unacknowledged Convergence of Open Source, Open Access and Open Science
This is just a reformatted mirror of a 2005 paper by John Willinsky, because the original link renders slowly.
Absorptive Capacity: A New Perspective on Learning and Innovation
It’s easier to learn new things in a topic we already know. It’s hard to appreciate advances in a topic if we don’t know much about that topic. Therefore, the more knowledge we have, the easier it is to absorb new knowledge and innovate. There is also an exploration vs exploitation tradeoff: hierarchy, increased communication and specialization lead to exploitation of existing knowledge. Flat structure and diversity encourage exploration of new knowledge and increase the absorptive capacity in new topics. Another consequence of the non-linearity of knowledge acquisition is that a firm’s evolutionary history both encourages and constrains the kind of knowledge it can acquire in the future.
Firm Resources and Sustained Competitive Advantage
This work is pivotal for the emergence of the resource-based view of the firm, the dominant framework for analyzing competitive strategy. Previous work in this space had assumed homogeneity of firms to emphasize the effects of the competitive environment. This work breaks down these assumptions and highlights the heterogeneity and immobility of firm attributes and resources. In fact, no firm can have a sustained competitive advantage if firm resources are uniform and/or can be bought in a market. For a firm resource to hold the potential for sustained competitive advantage, it has to be (a) valuable, (b) rare, (c) imperfectly imitable and (d) not substitutable by an equivalent resource. Imperfect imitability of a resource can arise because it is (i) history dependent, (ii) causally ambiguous or (iii) socially complex.
Exploration and Exploitation in Organizational Learning
This paper, published originally in Organization Science in 1991, is highly influential (cited >20k times) on the topics of innovation and organizational learning. It is particularly influential on the concept of the ambidextrous organization. This paper shows the myopia of learning, or exploitation, and emphasizes the importance of exploring and trying out new things. The ideas are developed through a simple but revealing mathematical model. The results presented can be thought-provoking and may look counter-intuitive at first glance. So, this paper makes for a great read.
A Dynamic Theory of Organizational Knowledge Creation
This paper, originally published in Organization Science in 1994, is highly influential on the knowledge-based view of the organization. Given how modern organizations sell software services (and ideas) as opposed to goods, this view is highly relevant. This paper starts with an epistemological view of knowledge or innovation, emphasizes the processes which create knowledge and presents an organizational design incorporating these processes. Highly interesting is the perspective of embodiment of knowledge and how innovation arises as an interaction between tacit and explicit knowledge.
Dynamic Capabilities and Strategic Management
This paper introduces a new framework which emphasizes variables endogenous to the organization to explain the competitive advantage of a firm. Three dimensions important to this approach are (1) production, learning and transformational processes, (2) technological, structural and reputational positions and (3) firm-historical paths that lead to the current processes and positions. Firms are seen as unique: hard for the firm itself to replicate for expansion and hard for competitors to imitate. In fact, the evolutionary paths that the firm has taken constrain the future paths available to it. In light of this, the capability of an organization to dynamically reconfigure in the face of technological and market changes bestows a significant competitive advantage. This approach is in contrast to earlier approaches which see firms as homogeneous entities and strategy as ‘blocking’ competition from the market.
Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?
This paper introduces the famous Einstein-Podolsky-Rosen (EPR) paradox. Soon after the mainstream acceptance of quantum mechanics in the late 1920s, EPR pointed out the problem of quantum entanglement implied by the formulation of quantum mechanics. The problem is that a measurement on one system can mysteriously affect another system even if the two are no longer interacting. This counter-intuitive prediction of quantum mechanics has actually been verified to be true! In fact, the concept of entanglement remains central to quantum computing.
The Synthesis of Algorithmic Systems
Computers Then and Now
This is the 1967 Turing award lecture by Maurice V. Wilkes. From Wikipedia, Wilkes is best known as the builder and designer of the EDSAC, the first computer with an internally stored program. This lecture gives a wide survey of computers as they were in 1967 and makes many accurate predictions.
One Man's View of Computer Science
This is the 1968 Turing award lecture by Richard Hamming. His citation reads “For his work on numerical methods, automatic coding systems, and error-detecting and error-correcting codes.”
The Nature of the Firm
The price mechanism and markets are a way to coordinate diverse factors of production. If so, why is every transaction not done on the market, and why do firms exist? A worker in a firm moves between departments not because of a change in relative prices but because he is asked to. This is because engaging in a market is not free and comes with transaction costs. An entrepreneur, by employing and organizing factors of production in a firm, saves these marketing costs. Then why do markets exist in the first place? As a firm gets larger, there are diminishing returns to management, and organization costs rise to the level of marketing costs. This makes it profitable to carry out the transaction on the market and limits the size of the firm.
Elements of Diffusion of Innovation
The following is the first chapter of the landmark communication studies book Diffusion of Innovations. This book, one of the most cited in the social sciences, is essential reading for anyone in a startup. Engineers usually assume technological innovations speak for themselves and will be adopted automatically and rapidly. Nothing could be farther from the truth: many innovations take forever to get adopted, or never do. Diffusion of innovation is inherently a social process where communication and social structure play as much of a role as the innovation itself. This chapter gives an overview of this process.
The Strength of Weak Ties
This is a landmark paper in sociology. The idea is to use social networks to link the micro and macro levels of sociology. The core idea of the paper is this: strong friendships and ties form cliques, while weak ties act as ‘bridges’ that connect different cliques. Therefore, information from strong friendships/ties is strongly correlated with your own, while weak ties give you information unknown to you. This makes weak ties crucial in explaining many phenomena including the diffusion of ideas, rumours and innovations, an individual’s network and even community cohesion and organization.
The Forms of Capital
Capital is accumulated labour, and its inertia makes the games of society less like roulette. The distribution of capital represents the social structure. If we use such a broad definition of capital, it is important to consider forms other than those recognized by economists. Cultural capital, essentially knowledge or technology, exists in an embodied state (in you), an objectified state (in your laptop) or an institutionalized state (in your college degree). Social capital is your connections or network, which allow you to mobilize resources and can be institutionalized by a family name, an elite school or the like. Transfer, reproduction and conversion of these forms of capital are not as straightforward as they are for economic capital and incur significant transaction costs.
Go To Statement Considered Harmful
This is a classic paper by Edsger W. Dijkstra, originally published in Communications of the ACM in 1968. This copy is a mirror of another mirror. Yes, that Dijkstra of the graph theory algorithm. I just learnt that he is a big systems guy too.
A Protocol For Packet Network Intercommunication
This paper, published originally in IEEE Transactions on Communications in 1974, set the stage for the internet. The protocol described here, TCP/IP, is basically how the internet works to this day. Work around this paper earned the authors a Turing award. The paper remains highly relevant and readable to this day.
The UNIX Time-Sharing System
This is the 1974 paper that started it all, and it is highly relevant to this day. The design choices described in this paper significantly influenced modern OS design. Best of all, it’s a breeze to read and it flows like a tutorial.
End-To-End Arguments in System Design
This paper is essentially the design philosophy of the internet and its protocols. In the context of layered systems, the end-to-end principle is that you should not make lower layers feature-rich and should leave the features to higher-level subsystems. For example, there’s no point in having your communication channel do the encryption. Lower levels should be simple and need not be perfect. The design tradeoffs needed to make a low-level subsystem perfect are usually not worth it. This design principle applies not only to networks but also to file systems, operating systems and even processor design. It’s strikingly similar to the Unix design philosophy: do one thing and do it well.
Embracing Change with Extreme Programming
In the late 90s, right before the dotcom bubble, people started finding software engineering hard. Many software engineering methodologies were published, but Kent Beck’s Extreme Programming was one of the most influential. The current paper was originally published in IEEE Computer in 1999. Beck followed up this paper with a highly cited book, Extreme Programming Explained: Embrace Change.