Jesus Fernando Lopez Fidalgo, Full Professor of Statistics, Institute for Culture and Society, University of Navarra

The keys to quality research

Mon, 27 Feb 2017 18:06:00 +0000 Published in El Español

Several scientists have recently published a manifesto in Nature criticizing the lack of validity of current scientific research

This problem has been denounced for some time and is a matter of concern in the scientific world. Other studies show that about 2 million articles are published worldwide each year. Each article is read by an average of only 10 people, including the co-authors themselves and the editors and reviewers involved. 82% of articles in the Humanities do not receive a single citation. And of the articles that are cited, only 20% have actually been read by the authors citing them. There is great pressure to publish a lot and fast, and this may be one of the causes of this apparent disaster. But the problem is not as new as it may seem. Years ago it was said that, of the ten most relevant articles in the history of mathematics, seven had initially been rejected.

In any case, no study can be one hundred percent replicable. Only mathematical results, proved with the laws of logic from perfectly defined and unquestionable principles or assumptions, can be said to be completely reproducible. When we prove mathematically that any continuous line that takes both negative and positive values must necessarily equal zero at some point, the result is infallible. No matter how many times we repeat the experiment of drawing continuous lines that start at a negative value and end at a positive one, they will always be forced to pass through zero. The experiment will never fail. But almost everything else is reproducible only with some probability. Reproducing the result in front of a group of experts, or making the data available, offers some assurance, but it does not add certainty.
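As a small illustration of the continuous-line argument, the following Python sketch repeatedly halves an interval on which a curve goes from negative to positive and always closes in on a zero. The function `curve` and the helper `find_zero` are hypothetical names chosen for this example, and a numerical search like this illustrates the theorem rather than proving it.

```python
def find_zero(f, a, b, tol=1e-9):
    """Bisection search for a zero of a continuous f with f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("f must be negative at a and positive at b")
    # Keep the half of the interval where the sign change still occurs.
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) < 0:
            a = mid
        else:
            b = mid
    return (a + b) / 2


# Hypothetical continuous curve, chosen only for illustration:
# negative at x = 0, positive at x = 2.
def curve(x):
    return x**3 - 2


root = find_zero(curve, 0.0, 2.0)
print(root, curve(root))
```

Whatever continuous curve is supplied, as long as it starts below zero and ends above it, the search cannot fail to find a crossing, which is the sense in which the mathematical result is fully reproducible.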

On the other hand, it is obvious that recent scientific research has, for example, increased life expectancy spectacularly in recent years, and that there have been scientific advances that contribute very clearly to well-being. That is why I am no friend of alarmism, which does not help to improve the situation. For something relevant to emerge, we have to try many times and therefore produce research that will never be relevant, even if it has been done in good faith. This does not justify doing things badly. The manifesto cited above gives some clues as to what good research should look like, and I think this is very important and to be welcomed.

From the point of view of statistical methodology, the following basic principles, among others, should be taken into account in order to conduct a research study correctly. The hypotheses to be tested should be established at the beginning, not only before collecting the data, but also before planning the collection. This is precisely what allows us to design an adequate and effective data collection plan (an experimental or sampling design). Experiments, surveys, etc. must be carried out under the supervision of the person who designed the plan. The correct treatment of non-response or other difficulties encountered in collecting the data can be a key issue in a given study. When analyzing the data, the assumptions on which the analysis rests should be checked, rather than settling for the results produced automatically by statistical software. To give an example for people outside these fields, if we collect data on the type of literature that people prefer to read, we could code the answers as 1 (science fiction), 2 (essay), 3 (historical)... Statistical software will calculate the average of these codes if we ask it to, but the result does not make any sense.

No one would think of doing this, but there are more subtle errors of this kind, which go unnoticed when the appropriate knowledge or help is missing. When presenting the results in a publication, the drawbacks and difficulties encountered throughout the process, and everything else that can help readers evaluate the study in a balanced way, should be clearly stated.
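The literature-coding example can be made concrete with a short Python sketch. The survey answers below are invented purely for illustration, and the helper names `mean_code` and `genre_counts` are hypothetical. The point is only that software will happily average the codes, whereas a frequency count is a summary that respects what the codes mean.

```python
from collections import Counter
from statistics import mean

# Coding from the example: the numbers are labels, not quantities.
GENRES = {1: "science fiction", 2: "essay", 3: "historical"}

# Invented answers for illustration: three readers of science fiction,
# three readers of historical literature, nobody chose essay.
answers = [1, 1, 3, 3, 1, 3]


def mean_code(codes):
    """What the software computes if asked: the average of the codes."""
    return mean(codes)


def genre_counts(codes):
    """A meaningful summary for categories: how often each one occurs."""
    return {GENRES[code]: n for code, n in Counter(codes).items()}


# The average code is 2, the code for "essay", a genre no respondent chose.
print(mean_code(answers))
print(genre_counts(answers))
```

The average depends entirely on the arbitrary numbering of the genres. Renumbering the categories would change it, while the frequency counts would stay the same.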

The statistical model that has been used never perfectly matches reality and should therefore always be taken with due caution. George Box, one of the most prominent contemporary statisticians, who died in 2013, used to say that all models are wrong, but some are useful. At least statistical models are able to pin down the error that is made, always under the basic assumptions of the model that has been employed. Other procedures, for example nature-inspired algorithms, work in practice, but they do not allow that error to be controlled. This does not mean that they should not be used. In fact they are used, for example in Internet search engines or automatic translators, and they work, but this should always be done with caution.

We must continue to do a great deal of research, and to the best of our ability. Great progress has been made in our country. Research must be supported, not just in words, but with deeds and tangible funding that guarantees generational renewal. It is also necessary to calmly rethink how research is assessed, but without lowering the standards or making false excuses.