Different Perspectives on Scientific Misconduct
Markus Pössel
For me, the most interesting takeaway from the panel discussion on scientific integrity at the 12th Heidelberg Laureate Forum was that scientific misconduct means rather different things when viewed from different perspectives, and that these perspectives shape how seriously, or not, we take different forms of misconduct.
The first perspective is that of scientific progress. Present-day research builds on what is already there; if that foundation is flawed, e.g. because somebody falsified the data they then pretended to analyse, whatever is built on top of it suffers. After all, as one of the panelists, the mathematician Yukari Ito (Tokyo University), put it: Papers are meant to document what scientists within a field believe to be true.

The second perspective is that of career building. When scientists apply for a postdoc position, or a professorship, or for tenure that gives them long-term security, their publication record plays a key role. In some institutions, this goes as far as setting a minimum number of publications per year as a requirement for continued employment. There have even been (and probably still are) institutions that offer their researchers cash bonuses for publications in prestigious journals.
With this in mind, consider two forms of scientific misconduct. First, taking shortcuts by falsifying your data, e.g. by fabricating a table or an image without going through the time-consuming process of actually doing the experiment. Sleuthing out this kind of falsification is the contribution to scientific integrity of another panelist, Lonni Besançon (Linköping University): “sanitation work” that he does in his spare time.
Second, consider the possibilities of generative AI. Beyond ethical uses (and I will later use DeepL to produce a first draft for the German translation of this blog entry), there are numerous ways to abuse AI. In an extreme case, a researcher might clandestinely use generative AI to produce a whole paper, either with the help of genuine data or from scratch. Falsification of data, whether “by hand” or via AI hallucination, definitely hurts research. On the other hand, if an AI-generated paper could adhere to all the standards a field sets for its research methods (and no, in important respects AI capabilities do not appear to be at that point, by a long shot), it might not impede progress in the field. But it could still represent a dishonest attempt by the author to game a system where career advancement demands numerous publications.
The decoupling of these perspectives was very clear in the live online survey that the moderator, Benjamin Skuse, had the HLF audience complete: which item from a given list did we consider the most significant threat to research integrity? Somewhat to the moderator’s surprise, “Fabricating/falsifying data” came in first, relegating “Mass-produced genAI papers” to second place. Presumably, the audience was taking the perspective of a field’s progress and decided that mass-produced generative AI papers might amount to clutter – but that clutter of this kind, widely ignored by serious researchers, poses little danger of “polluting” a field.
From the perspective of the third panelist, Eunsang Lee from the research integrity group at the Springer Nature publishing company, ignoring the onslaught of generative-AI papers is sadly not an option. Mass submissions are real: Lee mentioned five full papers by the same author in a single month, and that in mathematics, a field traditionally considered a “slow science.”
So, what to do? And yes, like the rest of the audience I laughed at examples of “tortured phrases,” which arise when someone asks generative AI to paraphrase an article (e.g. in order to cover up plagiarism). When the “immune system” morphs into the “invulnerable framework,” that certainly is a red flag, and looking for this and other indicators will hopefully help fight fraudulent submissions.
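To make that idea a bit more concrete, here is a minimal sketch of what such screening might look like: a dictionary of known tortured phrases matched against submitted text. This is my own toy illustration, not Besançon’s actual tooling or any publisher’s real pipeline; only the first entry comes from the panel example, the second is a specimen frequently reported in the tortured-phrases literature.

```python
# Toy screening for "tortured phrases": flag a manuscript if it contains
# known paraphrase artefacts. Illustrative sketch only, not a real pipeline.

TORTURED_PHRASES = {
    "invulnerable framework": "immune system",               # example from the panel
    "counterfeit consciousness": "artificial intelligence",  # widely reported specimen
}

def flag_tortured_phrases(text: str) -> list[tuple[str, str]]:
    """Return (tortured phrase, expected term) pairs found in the text."""
    lowered = text.lower()
    return [(phrase, term) for phrase, term in TORTURED_PHRASES.items() if phrase in lowered]

sample = "The invulnerable framework of the mice was compromised."
print(flag_tortured_phrases(sample))
# [('invulnerable framework', 'immune system')]
```

Real screening tools work with far longer phrase lists and more robust matching, but the underlying principle, a dictionary of red flags, is the same.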
But I think that the key to dealing with scientific misconduct lies elsewhere, namely in a statement that Lonni Besançon made: If you make a metric out of it, people will game that metric. Following that statement to its logical conclusion means taking a long, hard look at the various convenient shortcuts within the scientific ecosystem. If you want to evaluate a prospective colleague’s scientific career, you will need to read and understand their papers, and have in-depth conversations with them. No shorthand proxy, such as counting first-author papers or computing metrics like the h-index, can be an adequate substitute for this sort of in-depth process.
Even at the first stage of filling a position, namely producing a shortlist from the applicant pool, a process that relies on gameable metrics gives an advantage to those who are doing their best (worst?) to game them.
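As an aside, for readers who have never computed an h-index: here is a minimal sketch of how such a metric works, with citation counts invented purely for illustration. The point is simply how easily an entire career collapses into one number that rewards volume.

```python
# Minimal h-index computation; the citation counts below are invented for illustration.

def h_index(citations: list[int]) -> int:
    """Largest h such that at least h papers have h or more citations each."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h

careful_author = [120, 45, 30, 8]          # few, well-cited papers  -> h = 4
prolific_author = [9, 9, 9, 8, 8, 8, 8, 8]  # many modestly cited papers -> h = 8

print(h_index(careful_author), h_index(prolific_author))  # 4 8
```

Nothing in that calculation distinguishes a substantive contribution from salami-sliced or mass-produced papers, which is exactly why such shortcuts invite gaming.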
Generative AI exacerbates problems of this kind, although the problems themselves, from paper mills to citation cartels, are not new. But if we’re lucky, maybe the sheer scale at which generative AI can be misused within science will provide the wake-up call we need to finally eliminate our convenient shortcuts and make science gaming-proof.
The post Different Perspectives on Scientific Misconduct originally appeared on the HLFF SciLogs blog.