Int J Sports Med. 2001 May;22(4):270-4.
Reliability of mean power recorded during indoor and outdoor self-paced 40 km cycling time-trials.
Smith MF, Davison RC, Balmer J, Bird SR.
Abstract
The purpose of this study was to assess the reliability of both indoor and outdoor 40 km time-trial cycling performance. Eight trained cyclists completed three indoor 40 km time-trials on an air-braked ergometer (Kingcycle) and three outdoor 40 km time-trials on a local course. Power output was measured for all trials using the SRM powermeter. Mean performance time across three indoor trials was 54.21 +/- 2.59 (min:sec) and was significantly different (P < 0.05) from mean time across three outdoor trials (57.29 +/- 3.22 min:sec). However, there was no significant difference (P = 0.34) in mean power across three indoor trials (303 +/- 35 W) when compared to outdoor performances (312 +/- 23 W). Within-subject variation for mean power output, expressed as a coefficient of variation (CV), improved both indoors and outdoors for trials 2 and 3 (CV = 1.9%, 95% CI 1.0-3.4 and CV = 2.1%, 95% CI 1.1-3.8) when compared to trials 1 and 2 (CV = 2.1%, 95% CI 1.2-3.8 and CV = 2.4%, 95% CI 1.3-4.3). These findings indicate that power output measured using the SRM powermeter is highly reproducible for both laboratory-based and actual 40 km time-trial cycling performance.
http://www.ncbi.nlm.nih.gov/pubmed/11414669
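The abstract above reports within-subject variation as a CV between consecutive trials but does not say how it was calculated. The sketch below shows one common approach, assumed here purely for illustration: take the standard deviation of the log-transformed difference scores between two trials, divide by the square root of 2 to obtain the typical error, and back-transform to a percentage. The function name within_subject_cv and the power values are hypothetical and are not data from the study.

```python
import math
import statistics


def within_subject_cv(trial_a, trial_b):
    """Within-subject CV (%) between two trials of the same riders.

    Assumed method (not confirmed by the abstract): typical error of
    log-transformed mean power, back-transformed to a percentage.
    """
    if len(trial_a) != len(trial_b) or len(trial_a) < 2:
        raise ValueError("need two equal-length trials with at least 2 riders")
    # Difference in log power for each rider between the two trials.
    diffs = [math.log(b) - math.log(a) for a, b in zip(trial_a, trial_b)]
    # Typical error = SD of the difference scores / sqrt(2).
    typical_error = statistics.stdev(diffs) / math.sqrt(2)
    # Back-transform from the log scale to a percentage.
    return 100 * (math.exp(typical_error) - 1)


# Hypothetical mean power (W) for eight riders in two consecutive trials.
trial_2 = [265, 280, 295, 300, 310, 320, 335, 350]
trial_3 = [270, 276, 301, 297, 318, 315, 340, 346]
print(f"CV = {within_subject_cv(trial_2, trial_3):.1f}%")
```

Working on the log scale treats the error as proportional to each rider's power, so the result is unaffected if every rider improves by the same percentage between trials.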
Sports Med. 2001;31(7):489-96.
Tests of cycling performance.
Paton CD, Hopkins WG.
Abstract
Performance tests are an integral component of assessment for competitive cyclists in practical and research settings. Cycle ergometry is the basis of most of these tests. Most cycle ergometers are stationary devices that measure power while a cyclist pedals against sliding friction (e.g. Monark), electromagnetic braking (e.g. Lode), or air resistance (e.g. Kingcycle). Mobile ergometers (e.g. SRM cranks) allow measurement of power through the drive train of the cyclist's own bike in real or simulated competitions on the road, in a velodrome or in the laboratory. The manufacturers' calibration of all ergometers is questionable; dynamic recalibration with a special rig is therefore desirable for comparison of cyclists tested on different ergometers. For monitoring changes in performance of a cyclist, an ergometer should introduce negligible random error (variation) in its measurements; in this respect, SRM cranks appear to be the best ergometer, but more comparison studies of ergometers are needed. Random error in the cyclist's performance should also be minimised by choice of an appropriate type of test. Tests based on physiological measures (e.g. maximum oxygen uptake, anaerobic threshold) and tests requiring self-selection of pace (e.g. constant-duration and constant-distance tests) usually produce random error of at least approximately 2 to 3% in the measure of power output. Random error as low as approximately 1% is possible for measures of power in 'all-out' sprints, incremental tests, constant-power tests to exhaustion and probably also time trials in an indoor velodrome. Measures with such low error might be suitable for tracking the small changes in competitive performance that matter to elite cyclists.
http://www.ncbi.nlm.nih.gov/pubmed/11428686