Appendix 1. Weighing the evidence for physical time dilation.
The main argument of this paper does not depend on the evidence for physical time dilation. The main argument is that even though Einstein’s and Lorentz’s interpretations of the Lorentz transformations are generally (but in some key ways not) empirically equivalent, we should nevertheless interpret the Lorentz transformations as Lorentz himself did: that relativistic effects are due to interaction with space/ether, based on the larger empirical findings that are discussed in the paper. This core argument does not rest on particular evidence regarding physical time dilation because time dilation occurs, as either a coordinate effect or a physical effect, under both interpretations of the Lorentz transformations.
The core of the debate comes down to how we interpret the relativistic effects that are contained in the equations, as discussed in the body of my paper: do we rely on “spacetime structure” to explain relativistic effects (as Einstein did with his postulated isotropic speed of light and the combined spacetime that flows from this assumption) or physical interaction with space/ether (as Lorentz did)?
However, since Lorentz suggested that time dilation was not a real physical phenomenon but, rather, a “mathematical fiction,” or coordinate effect only (Galison 2004), evidence showing that physical time dilation is a real phenomenon rather than a coordinate effect only would weigh in favor of the Einstein approach rather than the Lorentz approach, all else equal. I argue, of course, in the paper that all else is not equal, and this is why the argument does not hinge on the evidence with respect to time dilation.
I offer in this appendix, however, some considerations on the evidence collected thus far on physical time dilation – do clocks actually measure different elapsed times in different moving frames? – and I conclude that this evidence is too weak, at this time, to establish physical time dilation as a real phenomenon. If I am right, this further weighs in favor of the Lorentzian interpretation of the transformations. If physical time dilation is real, however, this weighs more in favor of the Einstein interpretation.
I will look at three key papers that examine time dilation. This is obviously not a comprehensive examination of the evidence – time and space will not permit that kind of examination. By examining three key papers instead I hope to provide a reasonable overview of the state of the science in this area. It also turned out, serendipitously, that each of these three papers suffers from different types of issues that, in different ways, cast serious doubt on their purported support for SR.
First, I’ll examine the well-known Hafele-Keating experiment (Hafele and Keating 1972, “Around the world atomic clocks: observed relativistic time gains”), which was one of the first experiments to find significant time dilation effects, and which has received significant media attention both at the time and since. The experiment involved flying four cesium clocks on jetliners traveling in different directions around the world and then comparing their readings.
The HK experiment had many serious issues from the outset, as their 1972 paper itself describes. The authors identify two main experimental accuracy issues: 1) the fact that they were measuring effects on the order of 0.1 microseconds per day and their machinery’s accuracy was only within 1 microsecond per day; 2) in correcting the data for this issue they needed to also correct for unpredictability in expected drift in each clock, which they attempted to do with two different methods discussed.
With respect to the first method for correcting for naturally occurring time drift in the four clocks employed for the experiment, “the average rate method,” the authors state (p. 169): “Reliability of results with the average rate method, however, depends on the unlikely chance that only one rate change occurred during each trip and that it occurred at the midpoints. Furthermore, there is no obvious method for estimating the experimental error. Nevertheless, the average rate method does produce convincing qualitative results.”
The last sentence is rather incredible given the first two sentences.
With respect to the second method, the authors state (p. 177): “An analysis of these data revealed the times and magnitudes for correlated rate changes during each trip. Thus significant rate changes were identified and ascribed to each clock. A piecewise extrapolation of the time trace for each clock relative to MEAN(USNO), with proper accounting for these identified rate changes, then produced the relativistic time differences [observed].”
We have to dig a bit deeper to find why this method, rather than being an appropriate adjustment, seems instead to be a strong example of cherry-picking the data. Kelly 2000 looks at the original data collected by HK from the four cesium clocks used in the experiment (this data was not published in the original paper), after the author requested the original report from the US Naval Observatory, and concludes (emphasis added):
The [US Naval Observatory] standard station had some years previously adopted a practice of replacing at intervals whichever clock was giving the worst performance. On a similar basis, the results of Clock 120 [one of the four used by HK] should have been disregarded. That erratic clock had contributed all of the alteration in time on the Eastward test and on the Westward test, as given in the 1971 report. Discounting this one totally unreliable clock, the results would have been within 5ns and 28ns of zero on the Eastward and Westward tests respectively. This is a result that could not be interpreted as proving any difference whatever between the two directions of flight.
Accordingly, under Kelly 2000’s re-examination of the raw data, it seems that we should accord little to no weight to this now iconic experiment purporting to find strong evidence of physical time dilation – that is, real differences in the elapsed time of traveling clocks.
Turning to the second paper, Reinhardt et al. 2007 conducts a complex experiment, the latest in a long line of Ives-Stilwell-type experiments, specifically using lithium ion resonance frequencies and saturation spectroscopy in ion storage rings. The experiment measured the frequency of similarly accelerated lithium ion groups, at 3.0% and 6.4% of the speed of light, respectively. By comparing the resonance frequency of the two groups to the frequency of the measurement lasers, the time dilation prediction of SR can be tested. SR predicts that the product of the frequencies of the two measurement lasers (parallel and anti-parallel to the direction of the ions) will match the product of the ions’ resonance frequencies in the laboratory rest frame.
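The frequency-product prediction described above can be checked numerically. The following is a minimal sketch, assuming only the standard SR longitudinal Doppler formula. The function names and the normalized rest frequency `NU0 = 1.0` are illustrative choices of mine, not taken from Reinhardt et al. 2007, and a single rest frequency is used for simplicity in place of the two transitions of the actual experiment.

```python
import math


def gamma(beta):
    """Lorentz factor for a speed given as a fraction beta of c."""
    return 1.0 / math.sqrt(1.0 - beta * beta)


def resonant_laser_frequencies(nu0, beta):
    """Lab-frame laser frequencies that are resonant with an ion whose
    rest-frame transition frequency is nu0 and whose speed is beta * c.

    Assumes the standard SR Doppler shift: an ion sees a co-propagating
    laser red-shifted and a counter-propagating laser blue-shifted.
    """
    g = gamma(beta)
    nu_parallel = nu0 / (g * (1.0 - beta))      # laser travelling with the ions
    nu_antiparallel = nu0 / (g * (1.0 + beta))  # laser travelling against the ions
    return nu_parallel, nu_antiparallel


def product_ratio(nu0, beta):
    """Ratio (nu_parallel * nu_antiparallel) / nu0**2; SR predicts exactly 1."""
    nu_p, nu_a = resonant_laser_frequencies(nu0, beta)
    return nu_p * nu_a / (nu0 * nu0)


if __name__ == "__main__":
    NU0 = 1.0  # normalized rest frequency (illustrative assumption)
    for beta in (0.030, 0.064):  # the two ion speeds quoted in the text
        print(f"beta = {beta:.3f}  gamma = {gamma(beta):.6f}  "
              f"product ratio = {product_ratio(NU0, beta):.12f}")
```

Under these assumptions the ratio is 1 at any ion speed, because the factors of \(\gamma^2(1-\beta^2)\) cancel. A measured departure from 1 is what the test-theory parameter discussed below is meant to quantify.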
The paper states: “Time dilation is one of the most fascinating aspects of special relativity as it abolishes the notion of absolute time. … Here we report on a method, based on fast optical atomic clocks with large, but different Lorentz boosts, that tests relativistic time dilation with unprecedented precision.” There are no traditional clocks involved, however; the “clocks” mentioned refer to the frequency of the accelerated lithium ions, which will change with acceleration when compared to the rest frame frequencies. While not a traditional clock, this change in frequency functions as a clock under the same principles as any clock: by measuring a certain type of periodic motion.
The paper briefly discusses the need for a test theory in order to examine the purported relativistic effects and settles on the Robertson-Mansouri-Sexl (RMS) test framework, which is the most common test theory for measuring relativistic effects. RMS assumes an arbitrarily chosen rest frame and, if there is deviation from expected results in the rest frame, this deviation is interpreted as support for the Einsteinian no-rest-frame approach.
Reinhardt et al. 2007 resulted in the most accurate measurements of time dilation at the time of the experiment (there is a similar paper, Botermann et al. 2014, that finds even more accurate results), a value of \(|\hat{\alpha}| \leq 8.4 \times 10^{-8}\). This indicates, if the results are accurate, that any deviation from the expected time dilation of Einstein’s theory is small indeed, at less than one in ten million. The paper states that within the “RMS framework, this result constrains the existence of a preferred reference frame in the universe (for example, the cosmic-microwave-background frame).”
This is an apparently strong empirical result, but, importantly, it does not distinguish between Lorentz’s ether-interaction interpretation and Einstein’s spacetime-structure interpretation of the Lorentz transformations. This is the case because the experimenters, in evaluating the results within the RMS framework, used the lab itself as the rest frame, Σ, which is permissible under the RMS test theory (any frame can be chosen as the rest frame in RMS). Thus, the conclusion that the results constrain a CMB reference frame (or other basis for a background reference frame) doesn’t match up with the measured results.
Since the measured result occurs as a result of using the Lorentz transformations, regardless of whether we follow the Einstein interpretation or the Lorentz interpretation, the RMS test framework, and this experiment specifically, cannot be used to distinguish between the two interpretations. Accordingly, this experiment is not necessarily a test of physical time dilation because it can equally validly be interpreted as finding time dilation as a coordinate effect only. Indeed, Mansouri and Sexl 1977 states: “Thus the much debated question concerning the empirical equivalence of special relativity and an ether theory taking into account time dilatation and length contraction but maintaining absolute simultaneity can be answered affirmatively.” In other words, Lorentz’s ether-based approach and Einstein’s approach are, according to Mansouri and Sexl, empirically equivalent – in terms of measuring the relativistic effects of time dilation and length contraction. And experiments that use the RMS test theory to evaluate results aren’t able to distinguish between these two approaches.
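The equivalence Mansouri and Sexl describe can be made concrete with the one-dimensional form of their test-theory transformation. What follows is a sketch of the standard presentation as I understand it, not a quotation from their paper. The symbols are assumptions of notation: \((T, X)\) are coordinates in the postulated rest frame Σ, \((t, x)\) are coordinates in a frame moving at velocity \(v\), and \(a\), \(b\) and \(\varepsilon\) are the test-theory functions.

```latex
% General one-dimensional test-theory transformation
\[
  t = a(v)\,T + \varepsilon(v)\,x, \qquad x = b(v)\,(X - vT)
\]

% Time dilation and length contraction fix a and b:
\[
  a(v) = \sqrt{1 - v^2/c^2} = \frac{1}{\gamma}, \qquad
  b(v) = \frac{1}{\sqrt{1 - v^2/c^2}} = \gamma
\]

% Einstein synchronization, \varepsilon = -v/c^2, gives the Lorentz transformation:
\[
  t = \gamma\left(T - \frac{vX}{c^2}\right), \qquad x = \gamma\,(X - vT)
\]

% Absolute synchronization, \varepsilon = 0, gives the ether-theory form:
\[
  t = \frac{T}{\gamma}, \qquad x = \gamma\,(X - vT)
\]
```

On this reading, the two forms share the same \(a(v)\) and \(b(v)\) and differ only in the synchronization function \(\varepsilon(v)\), which is the sense in which the quoted passage calls them empirically equivalent.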
A little more explanation may be helpful in terms of why the Reinhardt et al. experiment, and related Ives-Stilwell experiments that use the RMS test theory, are not able to distinguish between these different interpretations. Reinhardt et al. 2007 assumes the lab as the rest frame for comparison against the expected SR results. If, however, relativistic effects were in fact due to interactions with the ether/field rest frame (as Lorentz supposed) the RMS test theory cannot make this distinction. The physical core of the Lorentz interpretation is that length contraction results from interaction with the ether as physical objects move through the ether. But time dilation was, for Lorentz, a mathematical artifact (coordinate effect only) – a result of mathematically reconciling Maxwell’s equations with dynamics – and not a real physical effect. The lab rest frame is obviously not the same as the actual ether frame, the underlying fabric/field of space, so we would not under Lorentz’s approach expect to find any physical length contraction or other dynamical interactions with the ether when using the lab rest frame.
Mansouri and Sexl 1977 define the “ether system” as follows: “This ether system is defined by the requirements that the Einstein [synchronization technique] and the transport synchronization of clocks agree and that, furthermore, light propagation is isotropic in the ether system.” Einstein synchronization and slow clock transport synchronization procedures would agree in Lorentz’s ether frame but wouldn’t agree in the lab frame posited as rest frame because this is not Lorentz’s ether/field frame. Accordingly, the RMS test theory approach that substitutes the moving lab inertial frame as the rest frame (Σ) cannot distinguish between Lorentz’s and Einstein’s interpretations of the Lorentz transformations.
Looking at our third paper, both Reinhardt et al. 2007 and Botermann et al. 2014 (a follow-up to the 2007 paper that finds slightly more accurate results) cite Wolf and Petit 1997 as one of the previous best tests of time dilation and as an example of “non-storage-ring experiments” (p. 864): “The new upper limit of \(|\hat{\alpha}| \leq 8.4 \times 10^{-8}\) is more than an order of magnitude smaller than that obtained from non-storage-ring experiments.” Reinhardt et al. 2007 also states, again citing Wolf and Petit 1997: “We also provide the only test of time dilation more sensitive than that derived from the global positioning system.” Accordingly, let’s examine this third paper purporting to test relativistic effects.
Wolf and Petit 1997, looking at possible deviations from the constant speed of light between ground-based maser clocks and moving GPS satellite-based atomic clocks, found no deviation from the isotropic speed of light at the unprecedented (in 1997) accuracy of \(5 \times 10^{-9}\), accounting for systematic errors, and at \(2 \times 10^{-8}\) without accounting for such errors.
The authors warn of the risk of presupposing the validity of SR in testing the assumptions and predictions of relativity, and they make a number of methodological adjustments to avoid doing so:
Additionally one has to ensure that corrections applied to the raw timing data used for orbit determination and the measurement of T  do not presuppose the validity of special relativity. In fact, two corrections are routinely applied to GPS timing data, which are of relativistic origin and therefore do imply δc = 0: the correction for the gravitational redshift and the second-order Doppler shift of the rate of the satellite clock with respect to coordinate time, and the correction for the so-called Sagnac effect, which is due to the rotation of the Earth during signal transmission.
Nevertheless, they fall into the trap of tautologically presupposing the validity of SR by their use of slow clock synchronization and Einstein clock synchronization as an ongoing re-synchronization technique to maintain synchronization during the operation of the GPS system (indirectly in both cases, since they simply used available data from the GPS system rather than conducting their own experiment). This is a fatal flaw. Results that are tautologically determined are by definition unscientific and invalid.
The paper states: “δc is the deviation from c of the observed velocity of a light signal traveling one way along a particular spatial direction with the measuring clocks synchronized using slow clock transport.” Slow clock transport is by definition equivalent to Einstein synchronization in the same inertial frame. And under Einstein synchronization the constant speed of light, regardless of the motion of the observer, is assumed. This is an operational assumption made in order to provide a simple and reliable way to synchronize distant clocks. It is important to note also that ongoing re-synchronization cannot, of course, be done using slow clock transport; Einstein synchronization (using light signals) must be used. Einstein states in his well-known book on SR and GR (p. 27 of the 1952 edition, emphasis in the original):
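Within SR itself, the agreement between slow clock transport and Einstein synchronization in a single inertial frame can be illustrated numerically. The following is a minimal sketch, assuming the standard SR proper-time formula for a clock carried at constant speed. The function name, the 10,000 km distance and the sample speeds are hypothetical values chosen for illustration, not drawn from Wolf and Petit 1997.

```python
import math

C = 299_792_458.0  # speed of light in m/s (exact SI value)


def transport_lag(distance_m, speed_m_s):
    """Seconds by which a clock carried at constant speed over distance_m
    falls behind the frame's Einstein-synchronized clocks, according to SR.

    lag = (L / u) * (1 - sqrt(1 - u^2 / c^2))
    """
    x = (speed_m_s / C) ** 2
    travel_time = distance_m / speed_m_s
    # 1 - sqrt(1 - x) is rewritten as x / (1 + sqrt(1 - x)) to avoid
    # catastrophic cancellation when x is far below machine precision.
    return travel_time * x / (1.0 + math.sqrt(1.0 - x))


if __name__ == "__main__":
    distance = 1.0e7  # 10,000 km (illustrative assumption)
    for speed in (3.0e5, 3.0e3, 30.0):
        lag = transport_lag(distance, speed)
        print(f"u = {speed:g} m/s -> lag = {lag:.3e} s")
```

Under these assumptions the lag scales roughly as \(Lu/(2c^2)\) and so shrinks toward zero as the transport speed is reduced. This is the sense in which the two procedures coincide within one inertial frame in SR. The sketch presupposes SR throughout and therefore cannot serve as an independent test of it.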
There is only one demand to be made of the definition of simultaneity, namely, that in every real case it must supply us with an empirical decision as to whether or not the conception that has to be defined is fulfilled. That my definition satisfies this demand is indisputable. That light requires the same time to traverse [a given path] is in reality neither a supposition nor a hypothesis about the physical nature of light, but a stipulation which I can make of my own free will in order to arrive at a definition of simultaneity.
This technique does provide a concrete method for defining simultaneity and thus for synchronizing distant clocks, but we must be careful to not use this technique and then forget that we have from the outset assumed an isotropic c in order to achieve synchronization. Unfortunately, Wolf & Petit overlooked this issue in their methodology.
The 1997 paper is often cited (over 100 citations) as strong support for relativistic effects. While the methodological tautology in this paper is not readily apparent to the casual reader, it is surprising that no other physicists or philosophers have noticed this fatal flaw in this well-known paper.
In sum, based on this admittedly non-comprehensive review of key time dilation papers, the evidence for physical time dilation doesn’t seem to be very strong. This conclusion weighs further in favor of the Lorentzian ether-based interpretation of the Lorentz transformations and the view that apparent time dilation effects are better interpreted as coordinate effects only rather than physical time dilation.