Hallucinations may not always be bad: they can enhance LLM performance in drug discovery.
This paper finds that hallucinated molecule descriptions improve LLM performance on molecular property prediction tasks.
-----
https://arxiv.org/abs/2501.13824
Original Problem 🤔:
→ LLMs hallucinate, raising concerns about reliability in drug discovery.
→ Drug discovery is complex, time-consuming, and expensive.
-----
Solution in this Paper 💡:
→ LLMs generate textual descriptions of molecules from SMILES strings.
→ These descriptions, including hallucinations, are added to prompts for drug discovery tasks.
→ The LLM then predicts molecular properties based on this augmented prompt.
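A minimal sketch of this two-step prompting pipeline, assuming the OpenAI chat API. The model name, prompt wording, and the aspirin example are illustrative assumptions, not the paper's exact setup:

```python
# Sketch of the hallucination-augmented prompting pipeline described above.
# Model name and prompt templates are illustrative; the paper's exact prompts differ.
from openai import OpenAI

client = OpenAI()

def describe_molecule(smiles: str) -> str:
    """Ask an LLM for a natural-language description of a molecule.
    The description may contain hallucinated (factually wrong) details."""
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user",
                   "content": f"Describe the molecule with SMILES: {smiles}"}],
    )
    return resp.choices[0].message.content

def predict_property(smiles: str, description: str, property_name: str) -> str:
    """Predict a molecular property from the SMILES string plus the
    (possibly hallucinated) description appended to the prompt."""
    prompt = (
        f"SMILES: {smiles}\n"
        f"Description: {description}\n"
        f"Is this molecule {property_name}? Answer Yes or No."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"  # aspirin, as a toy example
desc = describe_molecule(smiles)
print(predict_property(smiles, desc, "blood-brain-barrier permeable"))
```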
-----
Key Insights from this Paper 🔎:
→ Hallucinations, while factually incorrect, can contain useful information.
→ This information may improve the LLM's ability to differentiate molecules and predict their properties.
→ Larger models benefit more, and hallucinations generated by GPT-4o yield the largest improvements.
→ The language of the hallucinated text also affects performance, with Chinese-language hallucinations yielding surprisingly large gains.
-----
Results 📊:
→ Llama-3.1-8B achieves an 18.35% ROC-AUC gain over the SMILES baseline.
→ It also gains 13.79% in ROC-AUC over the MolT5 baseline.
→ GPT-4o hallucinations provide the most consistent improvements across different LLMs.
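For reference, here is how such a baseline-vs-augmented ROC-AUC comparison might be computed with scikit-learn. The labels and scores below are placeholder values for illustration, not data from the paper:

```python
# Minimal sketch of the evaluation: ROC-AUC for baseline vs.
# hallucination-augmented predictions. All values are placeholders.
from sklearn.metrics import roc_auc_score

y_true = [1, 0, 1, 1, 0, 0, 1, 0]                             # ground-truth labels
baseline_scores = [0.6, 0.4, 0.5, 0.7, 0.5, 0.3, 0.4, 0.6]    # SMILES-only prompt
augmented_scores = [0.8, 0.3, 0.7, 0.9, 0.4, 0.2, 0.6, 0.5]   # + hallucinated text

auc_base = roc_auc_score(y_true, baseline_scores)
auc_aug = roc_auc_score(y_true, augmented_scores)
print(f"baseline ROC-AUC:  {auc_base:.3f}")
print(f"augmented ROC-AUC: {auc_aug:.3f}")
print(f"gain: {auc_aug - auc_base:+.3f}")
```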