Classic Papers

I maintain a mirror of some classic papers because the originals tend not to be well formatted for the modern web. I also add my notes and highlights.

Biology

1943, Erwin Schrodinger, What is Life? The Physical Aspect of the Living Cell
This short book by Schrodinger, one of the fathers of quantum mechanics, influenced the discovery of the structure of DNA. In it, in 1943, he predicted the existence of ‘an aperiodic crystal’ that acts as a ‘code-script’ for how organisms work. Within 10 years, in 1953, the structure of DNA was worked out, and it is both aperiodic and the code-script of life! He also shows how life looks unintuitive under the known physical laws, how it feeds on ‘negative entropy’ and how we may need a different type of law to explain it. The epilogue, On Determinism and Free Will, is the cherry on top and gives a physicist’s view on the topic. Written for dilettantes like myself (Schrodinger himself being one), this book is highly readable and relevant.
1953, J.D. Watson and F.H.C. Crick, Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid
This paper, originally published in Nature in 1953, unraveled the structure of DNA. This was such a huge contribution to biology that it won the authors a Nobel Prize just 9 years after publication. It’s amazing how DNA became such commonplace knowledge in less than 50 years. The paper is surprisingly readable, short and to the point.
1968, Motoo Kimura, Evolutionary Rate at the Molecular Level
This paper, published in 1968, is a prelude to the highly influential book The Neutral Theory of Molecular Evolution, published later in 1983. The paper does some computations on the rate of mutation at the DNA level. These numbers turn out to be so high that we have no choice but to accept that most mutations are selectively neutral. This is in contrast to the widely held notion that evolution, and therefore the spread of mutations, happens by natural selection. Selection must be just one of the many evolutionary forces that shape an organism.

Cognitive Science

1956, George A. Miller, The Magical Number Seven, Plus or Minus Two: Some Limits on Our Capacity for Processing Information
This is a classic 1956 paper marrying information theory and psychology. It is an easy read, yet it has been cited 33000 (!) times. Preview: humans have limited judgement and limited memory, and use recoding (chunking) to process so much information.
1974, Amos Tversky and Daniel Kahneman, Judgment under Uncertainty: Heuristics and Biases
This landmark cognitive science paper led to prospect theory, behavioral economics and eventually a Nobel Prize in Economics for Kahneman (Tversky had died in 1996). The hypothesis of this paper is presented very well: humans rely on a set of heuristics for decision-making, and these useful yet incomplete heuristics lead to cognitive biases in judgement. These heuristics are (i) Representativeness: the probability of an event which resembles a class is judged to be high. This leads to insensitivity to priors, sample size etc. (ii) Availability: the probability of an event is judged by its imaginability. This leads to biases such as illusory correlation. (iii) Anchoring and adjustment: people adjust estimates from an initial anchor. Insufficient adjustment leads to under- or over-estimation. Note that these heuristics and biases are distinct from motivational biases such as wishful thinking.

Computer Science

1955, J. McCarthy, M.L. Minsky, N. Rochester, C.E. Shannon, A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence
In 1955, four computer scientists wrote the following proposal for a workshop to lay the groundwork for Artificial Intelligence. This workshop is considered to be the founding event of AI as a field. Ideas from this proposal remain highly relevant to this day. I added side notes with my thoughts and the connections I could trace to more modern AI ideas.
1960, J.C.R. Licklider, Man-Computer Symbiosis
There have always been two competing schools of thought on automation and artificial intelligence in the computing world: replacement and symbiosis. The author of this paper falls firmly into the symbiosis school, advocating close and cooperative interaction between humans and computers. He suggests that computers can help humans formulate their questions better through trial-and-error analysis. Humans are better at setting goals, hypotheses and motivations, while computers are good at converting these hypotheses into testable models that are verified against data. Many technological challenges to this vision existed when the paper was written, and many of them are now solved.
1968, Edsger W. Dijkstra, Go To Statement Considered Harmful
This is a classic paper by Edsger W. Dijkstra, originally published in Communications of the ACM in 1968; this copy is a mirror of another mirror. Yes, that Dijkstra of the graph theory algorithm. I just learnt that he was a big systems guy too.
1974, Vinton G. Cerf and Robert E. Kahn, A Protocol For Packet Network Intercommunication
This paper, originally published in IEEE Transactions on Communications in 1974, set the stage for the internet. The protocol described here, which evolved into TCP/IP, is basically how the internet works to this day. Work around this paper earned the authors a Turing Award. The paper remains highly relevant and readable to this day.
1974, Dennis M. Ritchie and Ken Thompson, The UNIX Time-Sharing System
This is the 1974 paper that started it all, and it remains highly relevant to this day. The design choices described in this paper significantly influenced modern OS design. Best of all, it’s a breeze to read and it flows like a tutorial.
1981, J. H. Saltzer, D. P. Reed and D. D. Clark, End-To-End Arguments in System Design
This paper is essentially the design philosophy of the internet and its protocols. In the context of layered systems, the end-to-end principle is that you should not make lower layers feature-rich and should leave the features to higher-level subsystems. For example, there’s no point in having your communication channel do the encryption. Lower levels should be simple and need not be perfect. The design tradeoffs needed to make a low-level subsystem perfect are usually not worth it. This design principle applies not only to networks but also to file systems, operating systems and even processor design. It’s strikingly similar to the Unix design philosophy: do one thing and do it well.
1986, David E. Rumelhart, Geoffrey E. Hinton and Ronald J. Williams, Learning Representations by Back-propagating Errors
This is the classic paper that rediscovered back-propagation. Conceptually, back-propagation is quite simple: it is just repeated application of the chain rule. However, the results of applying backprop to multi-layer neural networks have been spectacular. This paper reads like a very brief tutorial on deep learning.
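To make the "repeated application of the chain rule" concrete, here is a minimal sketch of my own, not code from the paper. It assumes a toy network with a single sigmoid hidden unit and a linear output; the names `forward`, `backward` and `numeric_grad` are hypothetical. The finite-difference check is only there to confirm the hand-derived gradients.

```python
import math

def forward(w1, w2, x, t):
    """Toy two-layer network (an assumption for illustration):
    one sigmoid hidden unit followed by one linear output."""
    h = 1.0 / (1.0 + math.exp(-w1 * x))  # hidden activation
    y = w2 * h                           # network output
    loss = 0.5 * (y - t) ** 2            # squared error against target t
    return h, y, loss

def backward(w1, w2, x, t):
    """Gradients of the loss w.r.t. w1 and w2, one chain-rule step per layer."""
    h, y, _ = forward(w1, w2, x, t)
    dloss_dy = y - t                          # derivative of the squared error
    dloss_dw2 = dloss_dy * h                  # because y = w2 * h
    dloss_dh = dloss_dy * w2                  # error passed back to the hidden unit
    dloss_dw1 = dloss_dh * h * (1.0 - h) * x  # sigmoid'(z) = h * (1 - h)
    return dloss_dw1, dloss_dw2

def numeric_grad(w1, w2, x, t, eps=1e-6):
    """Central finite differences, used only to check the analytic gradients."""
    g1 = (forward(w1 + eps, w2, x, t)[2] - forward(w1 - eps, w2, x, t)[2]) / (2 * eps)
    g2 = (forward(w1, w2 + eps, x, t)[2] - forward(w1, w2 - eps, x, t)[2]) / (2 * eps)
    return g1, g2
```

Calling `backward(0.5, -0.3, 1.5, 1.0)` and `numeric_grad(0.5, -0.3, 1.5, 1.0)` should give matching values up to numerical error. A real network repeats the same two steps (local derivative, then pass the error back) across many units and layers.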
1999, Kent Beck, Embracing Change with Extreme Programming
In the late 90s, right before the dot-com bubble, people started finding software engineering hard. Many software engineering methodologies were published, but Kent Beck’s Extreme Programming was one of the most influential. This paper was originally published in IEEE Computer in 1999. Beck followed it up with a highly cited book, Extreme Programming Explained: Embrace Change.

Computer Science (Misc)

1967, Alan J. Perlis, The Synthesis of Algorithmic Systems
1968, Maurice V. Wilkes, Computers Then and Now
This is the 1967 Turing Award lecture by Maurice V. Wilkes. According to Wikipedia, Wilkes is best known as the builder and designer of the EDSAC, the first computer with an internally stored program. The lecture gives a wide survey of computers as they were at the time and makes many accurate predictions.
1969, Richard W. Hamming, One Man's View of Computer Science
This is the 1968 Turing Award lecture by Richard Hamming. His citation reads “For his work on numerical methods, automatic coding systems, and error-detecting and error-correcting codes.”
2001, Linus Torvalds, A Critique on Intellectual Property
In 2001, Linus Torvalds wrote this essay as an appendix to his book Just for Fun. I thought this essay deserved a wider readership, so I made some edits to make it more readable as a paper. I hope Torvalds doesn’t kill me for this!
2005, John Willinsky, The Unacknowledged Convergence of Open Source, Open Access and Open Science
This is just a reformatted mirror of a 2005 paper by John Willinsky, because the original link renders slowly.

Management

1931, Neil McElroy, Brand Man: Origin of Product Management
Quoting Wikipedia on the history of product management: ‘The concept of product management originates from a 1931 memo by Procter & Gamble President Neil H. McElroy. McElroy, requesting additional employees focused on brand management, needed “Brand Men” who would take on the role of managing products, packaging, positioning, distribution, and sales performance.’ What follows is that 1931 memo.
1979, Michael E. Porter, The Five Competitive Forces That Shape Strategy
In this highly influential paper at the intersection of economics and business, the author shows that industry-level profitability is a function of five competitive forces. Classical microeconomic theory sees competition between industry rivals as the only force; here it is only one of the five. The other four forces, which shape a firm’s bargaining power and thus its profitability, are the bargaining power of suppliers, the bargaining power of buyers, the threat of new entrants and the threat of substitute products. Understanding these five forces allows the firm to formulate a strategy for higher profitability. Such strategies may include positioning the company to exploit the weaker forces, exploiting changes in the structural forces or changing the balance itself. Antecedents of the five forces can be seen in classical economics, new institutional economics and Schumpeterian evolutionary economics.
1989, Fred D. Davis, Perceived Usefulness, Perceived Ease of Use, and User Acceptance of Information Technology
This paper was written in the late 80s during the adoption of foundational information technologies like email. It deals with how users perceive and accept these technologies. It shows that perceived usefulness and perceived ease of use are great predictors of usage. While usefulness is more strongly correlated with usage, usefulness itself depends on ease of use, i.e. a technology is not seen as useful if it is not easy to use. Therefore the following causal chain is observed: perceived ease of use → perceived usefulness → usage. I have illustrated the process with a simple case study on how my mom adopted an instant-delivery platform.
1990, Wesley M. Cohen and Daniel A. Levinthal, Absorptive Capacity: A New Perspective on Learning and Innovation
It’s easier to learn new things in a topic we already know. It’s hard to appreciate advances in a topic if we don’t know much about that topic. Therefore, the more knowledge we have, the easier it is to absorb new knowledge and innovate. There is also an exploration vs exploitation tradeoff: hierarchy, increased communication and specialization lead to exploitation of existing knowledge. Flat structure and diversity encourage exploration of new knowledge and increase the absorptive capacity in new topics. Another consequence of the non-linearity of knowledge acquisition is that a firm’s evolutionary history both encourages and constrains the kind of knowledge it can acquire in the future.
1991, Jay Barney, Firm Resources and Sustained Competitive Advantage
This work is pivotal for the emergence of the resource-based view of the firm, the dominant framework for analyzing competitive strategy. Previous work in this space assumed homogeneity of firms in order to emphasize the effects of the competitive environment. This work breaks down these assumptions and highlights the heterogeneity and immobility of firm attributes and resources. In fact, no firm can have sustained competitive advantage if firm resources are uniform and/or can be bought in a market. For a firm resource to hold the potential for sustained competitive advantage, it has to be (a) valuable, (b) rare, (c) imperfectly imitable and (d) not substitutable by an equivalent resource. Imperfect imitability of a resource can arise because it is (i) history dependent, (ii) causally ambiguous or (iii) socially complex.
1991, James G. March, Exploration and Exploitation in Organizational Learning
This paper, originally published in Organization Science in 1991, is highly influential (cited >20k times) on the topics of innovation and organizational learning. It is particularly influential on the concept of the ambidextrous organization. The paper shows the myopia of learning, or exploitation, and emphasizes the importance of exploring and trying out new things. The ideas are developed through a simple but revealing mathematical model. The results can be thought-provoking and may look counter-intuitive at first glance. So, this paper makes for a great read.
1994, Ikujiro Nonaka, A Dynamic Theory of Organizational Knowledge Creation
This paper, originally published in Organization Science in 1994, is highly influential on the knowledge-based view of the organization. Given how modern organizations sell software services (and ideas) as opposed to goods, this view is highly relevant. The paper starts with an epistemological view of knowledge and innovation, emphasizes the processes which create knowledge and presents an organizational design incorporating these processes. Especially interesting are the perspective on the embodiment of knowledge and the account of how innovation arises from the interaction between tacit and explicit knowledge.
1995, John P. Kotter, Leading Change: Why Transformation Efforts Fail
Change and renewal are important for organizational survival because circumstances change. Usually the ‘status quo’ is built into the structure and systems of the organization. Therefore, transformation is not easy and most efforts fail. In this ‘manual of change’, the author shows that the change process happens in a series of phases which, put together, take significant time. He lays out these phases and the common errors in each of them. The phases are creating a sense of urgency, forming a guiding coalition, creating a vision, communicating that vision, removing obstacles, creating short-term wins, being persistent and anchoring the change in the culture. Mistakes in any of these phases can derail the change effort and let tradition take over.
1997, David J. Teece, Gary Pisano and Amy Shuen, Dynamic Capabilities and Strategic Management
This paper introduces a new framework which emphasizes variables endogenous to the organization to explain the competitive advantage of a firm. Three dimensions important to this approach are (1) production, learning and transformational processes, (2) technological, structural and reputational positions and (3) firm-historical paths that lead to the current processes and positions. Firms are seen as unique: they are hard for the firm itself to replicate for expansion and hard for competitors to imitate. In fact, the evolutionary paths that the firm has taken constrain the future paths available to it. In light of this, the capability of an organization to dynamically reconfigure itself in the face of technological and market changes bestows a significant competitive advantage. This approach is in contrast to earlier approaches which see firms as homogeneous entities and strategy as ‘blocking’ competition from the market.

Math

1935, A. Einstein, B. Podolsky and N. Rosen, Can Quantum-Mechanical Description of Physical Reality Be Considered Complete?
This paper introduces the famous Einstein-Podolsky-Rosen (EPR) paradox. Right after the mainstream acceptance of quantum mechanics in the late 1920s, EPR pointed out the problem of what is now called quantum entanglement, which is implied by the formulation of quantum mechanics. The problem is that measurement of one system can mysteriously affect another system even if they are no longer interacting. This counter-intuitive prediction of quantum mechanics has since been verified experimentally! In fact, the concept of entanglement remains central to quantum computing.
1987, Per Bak, Chao Tang, and Kurt Wiesenfeld, Self-Organized Criticality: An Explanation of 1/f Noise
Power-law distributions are all around us: in cities, the internet, genes, earthquakes and even the brain. Physicists call this 1/f noise. This paper suggests a simple cellular automaton model which can generate the power law. Once the model evolves into a minimally stable state, it is said to be in self-organized criticality. From this state, small disturbances do nothing most of the time but sometimes create ‘black-swan’ avalanches which destroy the whole state.
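As an illustration of the kind of model described above, here is a minimal sketch of a two-dimensional sandpile. It uses the commonly taught height-based rule (a cell holding 4 or more grains topples and gives one grain to each neighbour, with grains falling off the edges), which is my assumption and may differ in detail from the rule in the paper. The names `relax` and `run_sandpile`, the grid size and the grain count are all hypothetical.

```python
import random

def relax(grid, n, i, j):
    """Topple unstable cells starting from (i, j) until the grid is stable.
    Returns the number of topplings, i.e. the avalanche size."""
    size = 0
    stack = [(i, j)]
    while stack:
        a, b = stack.pop()
        if grid[a][b] < 4:
            continue                  # already stable, nothing to do
        grid[a][b] -= 4               # the cell topples
        size += 1
        if grid[a][b] >= 4:
            stack.append((a, b))      # still unstable, revisit later
        for c, d in ((a - 1, b), (a + 1, b), (a, b - 1), (a, b + 1)):
            if 0 <= c < n and 0 <= d < n:   # grains pushed off the edge are lost
                grid[c][d] += 1
                if grid[c][d] >= 4:
                    stack.append((c, d))
    return size

def run_sandpile(n=20, grains=20000, seed=0):
    """Drop grains one at a time on random cells and record each avalanche size."""
    rng = random.Random(seed)
    grid = [[0] * n for _ in range(n)]
    sizes = []
    for _ in range(grains):
        i, j = rng.randrange(n), rng.randrange(n)
        grid[i][j] += 1
        sizes.append(relax(grid, n, i, j))
    return grid, sizes
```

A histogram of the returned `sizes` on log-log axes is the usual way to look for the power law; the early part of the run is a transient before the pile reaches its critical state, so it is reasonable to discard it.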
1998, Duncan J. Watts and Steven H. Strogatz, Collective dynamics of 'small-world' networks
Have you heard of six degrees of separation or the Erdős number? Have you ever met a stranger and realized you both actually have a mutual friend? This is the small-world phenomenon. It has been observed in a wide variety of natural and artificial networks including the internet, genes and neural networks. This paper, cited >40k times, rekindled interest in the small-world phenomenon. It uses a simple construction to demonstrate the phenomenon and shows how computation happens remarkably quickly on small-world networks. In particular, the authors show how an infectious disease (like COVID-19) spreads extremely fast on these networks.
1999, Albert-László Barabási and Réka Albert, Emergence of Scaling in Random Networks
Another important paper in the network science field (cited >35k times). Just as with small-world networks, the authors observed a universal property among a lot of natural and artificial networks: a scale-free power-law distribution of vertex connectivity. What this means is that there are a few nodes which are highly connected, while most nodes are connected to only a few others. The authors show that this phenomenon arises when networks add new nodes over time and these nodes connect preferentially to already well-connected nodes. Maybe this is how networks evolve, and why genetic and neural networks end up being scale-free.
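The growth-plus-preferential-attachment mechanism described above can be sketched in a few lines. This is my own minimal illustration, not the authors' code: the function names, the fully connected starting core and the parameters `n` (final number of nodes) and `m` (links added per new node) are assumptions.

```python
import random

def preferential_attachment(n, m, seed=0):
    """Grow a graph to n nodes; each new node links to m distinct existing
    nodes chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Assumed starting point: a small fully connected core of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Every node appears in `stubs` once per edge end, so a uniform draw
    # from this list is a degree-proportional draw over nodes.
    stubs = [v for edge in edges for v in edge]
    for new in range(m + 1, n):
        chosen = set()
        while len(chosen) < m:
            chosen.add(rng.choice(stubs))
        for old in chosen:
            edges.append((new, old))
            stubs.extend((new, old))
    return edges

def degrees(edges, n):
    """Degree of every node, given the edge list."""
    deg = [0] * n
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg
```

With something like `degrees(preferential_attachment(2000, 2), 2000)`, most nodes stay close to the minimum degree while a handful of early nodes become hubs, which is the heavy-tailed pattern the paper describes.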

Social Science

1937, Ronald H. Coase, The Nature of the Firm
The price mechanism and markets are a way to coordinate diverse factors of production. If so, why is every transaction not done on the market, and why do firms exist? A worker in a firm moves between departments not because of a change in relative prices but because he is asked to. This is because engaging in a market is not free and comes with transaction costs. An entrepreneur, by employing and organizing factors of production in a firm, saves these marketing costs. Then why do markets exist in the first place? As a firm gets larger, there are diminishing returns to management, and organization costs increase to the level of marketing costs. This makes it profitable to carry out the transaction on the market and limits the size of the firm.
1962, Everett M. Rogers, Elements of Diffusion of Innovation
The following is the first chapter of the landmark communication studies book Diffusion of Innovations. This book, one of the most cited in the social sciences, is essential reading for anyone in a startup. Engineers usually assume that technological innovations speak for themselves and will be adopted automatically and rapidly. Nothing could be farther from the truth: many innovations take forever to get adopted, or never are. Diffusion of innovation is inherently a social process where communication and social structure play as much of a role as the innovation itself. This chapter gives an overview of this process.
1973, Mark S. Granovetter, The Strength of Weak Ties
This is a landmark paper in sociology. The idea is to use social networks to link the micro and macro levels of sociology. The core idea of the paper is this: strong friendships and ties form cliques, while weak ties act as ‘bridges’ that connect different cliques. Therefore, information from strong friendships/ties is strongly correlated with your own, while weak ties give you information unknown to you. This makes weak ties crucial in explaining many phenomena including the diffusion of ideas, rumours and innovations, an individual’s network and even community cohesion and organization.
1981, Oliver E. Williamson, The Economics of Organization: The Transaction Cost Approach
The transaction cost approach is an inter-disciplinary approach to organizations that combines economics, sociology and law. Transaction cost, the economic counterpart of friction, occurs when a good or service cannot be transferred between parties easily. It is illustrated by a dilemma: is it cheaper to make components yourself or to buy them from the market? Transaction costs arise from the fact that humans are computationally limited while motivationally complex. Transaction costs can be dimensionalized on uncertainty, frequency and asset specificity. These dimensions can be used to explain different organizational structures.
1986, Pierre Bourdieu, The Forms of Capital
Capital is accumulated labour, and its inertia makes the games of society less like roulette. The distribution of capital represents the social structure. If we use such a broad definition of capital, it is important to consider forms other than those recognized by economists. Cultural capital – essentially knowledge or technology – exists in an embodied state (in you), an objectified state (in your laptop) or an institutionalized state (in your college degree). Social capital is your connections or network, which allow you to mobilize resources, and it can be institutionalized by a family name, an elite school or the like. The transfer, reproduction and conversion of these forms of capital are not as straightforward as they are for economic capital and incur significant transaction costs.
1988, James S. Coleman, Social Capital in the Creation of Human Capital
In a theoretical attempt to bring the economists’ rational action model into social systems, the author introduces the concept of social capital, analogous to financial, physical and human capital. Social capital, which inheres in the strength of social structure and obligations, allows actors to leverage it to achieve certain ends, both economic and non-economic. The three forms of social capital are obligations, information channels and the norms of the society. Closure of the social structure further facilitates the accumulation of social capital. Empirical data shows that social capital, both within and outside the family, is an important negative predictor of the school dropout rate. Thus social capital plays an important role in the creation of human capital.
1989, Francis Fukuyama, The End of History?
Written on the eve of the end of the Cold War, this political science essay can be interpreted as the summation of the victory of the US and liberalism in the ideological realm. The author starts his essay by presenting Hegelian idealism, which holds that consciousness and ideals are the cause of changes in the material world, not the other way around. The main ideological competitors to liberalism, fascism and communism, were destroyed by World War II and the Cold War respectively. The rise of liberalism in China, Russia and many parts of Asia is presented as evidence of the victory of liberal democracy. With this ‘end of history’, the author believes international life will be more about economics and less about ideology.