How I Created the Theory

He hath shook hands with time.
JOHN FORD, The Broken Heart 
Energy has mass and mass represents energy.
ALBERT EINSTEIN, The Evolution of Physics 
Margaret Fuller: I accept the Universe.
Thomas Carlyle: Gad! she'd better!
(Attributed to Thomas Carlyle)
Creating a new theory is not like destroying an old barn and erecting a
skyscraper in its place. It is rather like climbing a mountain, gaining
new and wider views, discovering unexpected connections between our
starting point and its rich environment. But the point from which we
started out still exists and can be seen, although it appears smaller and
forms a tiny part of our broad view gained by the mastery of the
obstacles on our adventurous way up.
ALBERT EINSTEIN, The Evolution of Physics 
IT IS NOT easy to talk about how I reached the idea of the theory of relativity; there were so many hidden complexities to motivate my thought, and the impact of each thought was different at different stages in the development of the idea. I will not mention them all here. Nor will I count the papers I have written on this subject. Instead I will briefly describe the development of my thought directly connected with this problem.
It was more than seventeen years ago that I first had the idea of developing the theory of relativity. While I cannot say exactly where that thought came from, I am certain that it was contained in the problem of the optical properties of moving bodies. Light propagates through the sea of ether, in which the Earth is moving. In other words, the ether is moving with respect to the Earth. I tried to find clear experimental evidence for the flow of the ether in the literature of physics, but in vain.
Then I myself wanted to verify the flow of the ether with respect to the Earth, in other words, the motion of the Earth. When I first thought about this problem, I did not doubt the existence of the ether or the motion of the Earth through it. I thought of the following experiment using two thermocouples: Set up mirrors so that the light from a single source is reflected in two different directions, one parallel to the motion of the Earth and the other antiparallel. If we assume that there is an energy difference between the two reflected beams, we can measure the difference in the generated heat using two thermocouples. Although the idea of this experiment is very similar to that of Michelson, I did not put this experiment to the test.
While I was thinking of this problem in my student years, I came to know the strange result of Michelson's experiment. Soon I came to the conclusion that our idea about the motion of the Earth with respect to the ether is incorrect, if we admit Michelson's null result as a fact. This was the first path which led me to the special theory of relativity. Since then I have come to believe that the motion of the Earth cannot be detected by any optical experiment, though the Earth is revolving around the Sun.
I had a chance to read Lorentz's monograph of 1895. He discussed and solved completely the problem of electrodynamics within the first approximation, namely neglecting terms of order higher than v/c, where v is the velocity of a moving body and c is the velocity of light. Then I tried to discuss the Fizeau experiment on the assumption that the Lorentz equations for electrons should hold in the frame of reference of the moving body as well as in the frame of reference of the vacuum as originally discussed by Lorentz. At that time I firmly believed that the electrodynamic equations of Maxwell and Lorentz were correct. Furthermore, the assumption that these equations should hold in the reference frame of the moving body leads to the concept of the invariance of the velocity of light, which, however, contradicts the addition rule of velocities used in mechanics.
Why do these two concepts contradict each other? I realized that this difficulty was really hard to resolve. I spent almost a year in vain trying to modify the idea of Lorentz in the hope of resolving this problem.
By chance a friend of mine in Bern (Michele Besso) helped me out. It was a beautiful day when I visited him with this problem. I started the conversation with him in the following way: "Recently I have been working on a difficult problem. Today I come here to battle against that problem with you." We discussed every aspect of this problem. Then suddenly I understood where the key to this problem lay. Next day I came back to him again and said to him, without even saying hello, "Thank you. I've completely solved the problem." An analysis of the concept of time was my solution. Time cannot be absolutely defined, and there is an inseparable relation between time and signal velocity. With this new concept, I could resolve all the difficulties completely for the first time.
Within five weeks the special theory of relativity was completed. I did not doubt that the new theory was reasonable from a philosophical point of view. I also found that the new theory was in agreement with Mach's argument. Contrary to the case of the general theory of relativity in which Mach's argument was incorporated in the theory, Mach's analysis had [only] indirect implication in the special theory of relativity. This is the way the special theory of relativity was created.
My first thought on the general theory of relativity was conceived two years later, in 1907. The idea occurred suddenly. I was dissatisfied with the special theory of relativity, since the theory was restricted to frames of reference moving with constant velocity relative to each other and could not be applied to the general motion of a reference frame. I struggled to remove this restriction and wanted to formulate the problem in the general case.
In 1907 Johannes Stark asked me to write a monograph on the special theory of relativity for the journal Jahrbuch der Radioaktivität. While I was writing this, I came to realize that all the natural laws except the law of gravity could be discussed within the framework of the special theory of relativity. I wanted to find out the reason for this, but I could not attain this goal easily.
The most unsatisfactory point was the following: Although the relationship between inertia and energy was explicitly given by the special theory of relativity, the relationship between inertia and weight, or the energy of the gravitational field, was not clearly elucidated. I felt that this problem could not be resolved within the framework of the special theory of relativity.
The breakthrough came suddenly one day. I was sitting on a chair in my patent office in Bern. Suddenly a thought struck me: If a man falls freely, he would not feel his weight. I was taken aback. This simple thought experiment made a deep impression on me. This led to the theory of gravity. I continued my thought: A falling man is accelerated. Then what he feels and judges is happening in the accelerated frame of reference. I decided to extend the theory of relativity to the reference frame with acceleration. I felt that in doing so I could solve the problem of gravity at the same time. A falling man does not feel his weight because in his reference frame there is a new gravitational field which cancels the gravitational field due to the Earth. In the accelerated frame of reference, we need a new gravitational field.
I could not solve this problem completely at that time. It took me eight more years until I finally obtained the complete solution. During these years I obtained partial answers to this problem.
Ernst Mach was a person who insisted on the idea that systems that have acceleration with respect to each other are equivalent. This idea contradicts Euclidean geometry, since in the frame of reference with acceleration Euclidean geometry cannot be applied. Describing the physical laws without reference to geometry is similar to describing our thought without words. We need words in order to express ourselves. What should we look for to describe our problem? This problem was unsolved until 1912, when I hit upon the idea that the surface theory of Carl Friedrich Gauss might be the key to this mystery. I found that Gauss's surface coordinates were very meaningful for understanding this problem. Until then I did not know that Bernhard Riemann [who was a student of Gauss's] had discussed the foundation of geometry deeply. I happened to remember the lecture on geometry in my student years [in Zurich] by Carl Friedrich Geiser who discussed the Gauss theory. I found that the foundations of geometry had deep physical meaning in this problem.
When I came back to Zurich from Prague, my friend the mathematician Marcel Grossmann was waiting for me. He had helped me before in supplying me with mathematical literature when I was working at the patent office in Bern and had some difficulties in obtaining mathematical articles. First he taught me the work of Gregorio Ricci-Curbastro and later the work of Riemann. I discussed with him whether the problem could be solved using Riemann theory, in other words, by using the concept of the invariance of line elements. We wrote a paper on this subject in 1913, although we could not obtain the correct equations for gravity. I studied Riemann's equations further only to find many reasons why the desired results could not be attained in this way.
After two years of struggle, I found that I had made mistakes in my calculations. I went back to the original equation using the invariance theory and tried to construct the correct equations. In two weeks the correct equations appeared in front of me!
Concerning my work after 1915, I would like to mention only the problem of cosmology. This problem is related to the geometry of the universe and to time. The foundation of this problem comes from the boundary conditions of the general theory of relativity and the discussion of the problem of inertia by Mach. Although I did not exactly understand Mach's idea about inertia, his influence on my thought was enormous.
I solved the problem of cosmology by imposing invariance on the boundary condition for the gravitational equations. I finally eliminated the boundary by considering the Universe to be a closed system. As a result, inertia emerges as a property of interacting matter and it should vanish if there were no other matter to interact with. I believe that with this result the general theory of relativity can be satisfactorily understood epistemologically.