The elephant in the room theory

Recently, I have been investigating statistical trends in the seasonal performances of professional cyclists. I have tried to build a model of a clean athlete who has never engaged in doping. It feels as though I am a neural network or some other statistical AI algorithm, tasked with determining, based on race results alone, which cyclists competed fairly.

Certainly, I recognize that this type of research does not provide definitive proof; however, it encourages a more discerning perspective. This is the reason behind the title of this topic. I must admit that I lack the motivation to present extensive statistical analyses here, and I doubt that the majority here will find value in an abundance of endless numerical tables. Nevertheless, should there be interest in this subject, I will consider sharing some summarized findings in the future.

I wish to convey the consistent trends I have noticed:

1. The overwhelming majority of activities that may be linked to doping pertain to a single event, the Tour de France. In recent decades, the Olympic Games have been involved as well, to a lesser degree. If I may draw a comparison, since the early nineties the Tour de France has turned into an event where, as at a sabbath, all the witches gather once a year to consume the magical potions they have concocted throughout the year.

The positive aspect is that the remainder of the calendar features an entirely distinct, less depraved sport. Furthermore, even the usual suspects generally compete in a more 'normal' way in the "Not the Tour de France" races. This could account for the unexpected reluctance that some ostensibly clean riders show towards mandatory participation in this central race of the year.

2. Among the principles that can be used to predict abnormal outcomes, four factors merit consideration: the rider is at his peak in only one race a year and performs unexpectedly poorly in other competitions; the rider shows numerous irregular fluctuations in results from year to year, deviating from the typical bell-shaped curve of a sports career; the rider is part of a team that includes several other riders with questionable reputations; the rider was later disqualified for doping violations.

3. Not all suspicious riders warrant the same level of suspicion, and not every rider employs the same method of cheating. One can distinguish several levels of technique for improving results through dishonest means. If I may return to the metaphor of witchcraft, it is appropriate to mention at least three levels. Firstly, there is basic witchcraft, where one slightly improves one's results in a single race. This typically doubles the PCS points earned in that race. Most cheaters use this type of magic. Secondly, there is black magic. This can even bring the dead back to life. Generally, one scores five times more PCS points than one should and can compete at a professional level until the age of forty or even longer. Thirdly, there is a pact with Satan. This transforms a second-rate racer into a superman, invincible in high mountains and short stages. The only vulnerability of the pact with Satan is that one cannot defeat the angels of God, the riders born once every hundred years with genetics bestowed by God.

4. Contrary to the statements made by cheaters who have been caught doping and have admitted their wrongdoing in interviews, there are entirely honest competitors in the Tour de France. Former cheaters try to partially absolve themselves by claiming that everyone dopes or that doping does not transform a poor rider into a champion, but these claims are likely false. Even during the worst years, there were individuals who competed with integrity and whom, perhaps, the majority of fans will never acknowledge. Furthermore, amid the generally understandable hysteria, some people manage to disparage these most honest riders, both figuratively and literally, on the roadside. For instance, based on my research on the 2012 Tour de France, at least three of the top sixteen riders competed entirely clean when judged by the four principles I outlined above: Nibali Vincenzo, Roche Nicolas, and Monfort Maxime. It is also probable that several others were clean as well. To summarize, I have to repeat that this is only a model and not 100% proof of anything.
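
For anyone who wants to play with the idea, here is a rough sketch of how the four factors from point 2 could be turned into a checklist. It is not the actual model behind this post: the field names, the thresholds (0.6, 1, 2), and the sample numbers are all assumptions chosen for illustration, and the result is a list of red flags, not proof of anything.

```python
from dataclasses import dataclass


@dataclass
class RiderSeason:
    """Hypothetical per-rider summary; every field name is an assumption."""
    race_points: dict        # PCS-style points per race in one season
    yearly_points: list      # season totals across the career, in order
    flagged_teammates: int   # teammates with questionable reputations
    later_disqualified: bool # rider was later disqualified for doping


def peak_share(race_points):
    """Fraction of the season's points that came from the single best race."""
    total = sum(race_points.values())
    return max(race_points.values()) / total if total > 0 else 0.0


def direction_changes(yearly_points):
    """Count how often the year-to-year trend flips between rising and falling.

    A bell-shaped career rises and then falls, so it flips at most once.
    """
    diffs = [b - a for a, b in zip(yearly_points, yearly_points[1:]) if b != a]
    return sum(1 for d1, d2 in zip(diffs, diffs[1:]) if (d1 > 0) != (d2 > 0))


def red_flags(rider, share_limit=0.6, flip_limit=1, teammate_limit=2):
    """Return the names of the factors that fire; thresholds are illustrative."""
    flags = []
    if peak_share(rider.race_points) > share_limit:
        flags.append("single-race peak")
    if direction_changes(rider.yearly_points) > flip_limit:
        flags.append("irregular career curve")
    if rider.flagged_teammates >= teammate_limit:
        flags.append("suspect team")
    if rider.later_disqualified:
        flags.append("later disqualified")
    return flags


# Made-up numbers, not real results.
spiky = RiderSeason({"Race A": 700, "Race B": 50, "Race C": 50},
                    [100, 400, 150, 600, 200], 3, False)
steady = RiderSeason({"Race A": 100, "Race B": 120, "Race C": 110},
                     [100, 200, 300, 250, 150], 0, False)
print(red_flags(spiky))
print(red_flags(steady))
```

A rider with no flags is simply one the checklist has nothing to say about; the thresholds would need tuning against real PCS data before they mean anything.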
 
Tip: follow the money, and you'll realize that the Tour is THE race of the year in many respects, so the bigger the prize, the more likely it is that guys will use all the substances available. Reading cycling history will also open your eyes to the wonderful world of professional cycling. There's not a lot new under the big top. (Recently there was a drug bust of guys selling EPO, etc. to mainly 45 to 60 year olds, and they were raking in the money. Think about that.)
 
Interesting. Is it based on output (e.g. PCS points) or on a physiological model of a rider (e.g. weight, size, est. VO2max, etc.)? The former will be limited in what you can conclude. I do wonder how you account for the natural progress of riders, improvements in material/food/training, changes in a rider's environment (e.g. a team change), injuries and their impact, and the training load and altitude that determine when a rider is at peak shape depending on his goals. It looks like your model may demonstrate that on average the TdF stands out as a doping period, but I think it's difficult to say anything definitive about individual riders. You probably end up with a probability of doping per rider with a much bigger error than for the average?

Your model can flag riders that make unusual jumps and, if it's based on a physiological model, flag those that cross boundaries that are considered natural limits. That could be interesting.
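
Just to make the "unusual jumps" idea concrete, something like the sketch below would do it on output data alone. The function name, the 2.0 ratio and the 50-point floor are my own assumptions, not anything taken from the model in the first post, and the numbers are invented.

```python
def unusual_jumps(yearly_points, max_ratio=2.0, min_base=50.0):
    """Return (season_index, ratio) for seasons whose total jumped past max_ratio.

    Seasons that follow a very low total (below min_base) are skipped, because
    an injury or a partial season makes the ratio meaningless.
    """
    jumps = []
    for i in range(1, len(yearly_points)):
        prev, cur = yearly_points[i - 1], yearly_points[i]
        if prev >= min_base and cur / prev > max_ratio:
            jumps.append((i, round(cur / prev, 2)))
    return jumps


# Invented season totals: the fourth season triples the third.
print(unusual_jumps([200, 260, 300, 900, 850]))
```

A flagged jump still needs a human look, since a team change or a switch of season goals can produce the same pattern.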
 
I appreciate your response to my message. Indeed, I used only the data from PCS. This is a substantial volume of data, and it is certainly feasible to train a neural network on it. It is a well-established principle that the larger and more varied the data, the better the outcome. The technical, physical, and physiological factors you mentioned would significantly improve the performance of my algorithm. However, this entails a considerable amount of work that I am not yet prepared to undertake. For now, it looks like an excellent project for a long-term study by a scientific institution with adequate funding.

Research based solely on result statistics, like mine, can also yield useful findings; the primary challenge, however, is overfitting the model, by which I mean placing undue emphasis on certain trends over others. For instance, both Armstrong and LeMond excelled in the early phases of their careers, achieving results across various race types. Subsequently, both faced significant life challenges, for different reasons, which cast doubt on their capacity to continue in professional sport. Their potential for good results was considerably diminished. Nevertheless, both made a triumphant return to cycling, emerging much stronger than they had been before their hiatus. Each began training for only one race per year, excelling in that event while underperforming for much of the rest of the season. An overfitted model would immediately place Armstrong and LeMond in the same cluster, with the same suspicion parameter. However, this may not accurately reflect reality.
 
Sorry, but it's quite obvious that you cannot rule any riders in as clean based on such a model. At best, you can rule some riders out as unlikely to be clean.
 