"SciPIP: An LLM-based Scientific Paper Idea Proposer"

The podcast on this paper is generated with Google's Illuminate.

Stuck for research ideas? This AI has read 48k papers and wants to help!

A smart research assistant that reads papers and suggests new directions worth exploring

SciPIP, the system proposed in this paper, combines literature analysis and brainstorming to help researchers generate novel, feasible paper ideas.

📚 https://arxiv.org/abs/2410.23166

🎯 Original Problem:

Researchers face challenges in generating novel research ideas due to information overload and complex interdisciplinary requirements. Existing LLM-based idea generators struggle with comprehensive literature retrieval and balancing novelty with feasibility.

-----

🔧 Solution in this Paper:

→ Built SciPIP: A system combining literature retrieval with dual-path idea generation

→ Created database of 48,895 NLP papers with multi-dimensional info extraction using GLM-4

→ Implemented SEC-based (semantic, entity, citation co-occurrence) retrieval combining:

- Semantic matching using SentenceBERT embeddings

- Entity-based matching with expanded key terms

- Citation co-occurrence patterns

→ Developed 3 idea generation variants:

- SciPIP-A: Pure literature-inspired

- SciPIP-B: Dual-path with separate brainstorming

- SciPIP-C: Enhanced dual-path using brainstorming for retrieval
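
The three retrieval signals listed above can be combined in many ways; the summary does not say how the paper weights them. The following is a minimal sketch, assuming embeddings are already computed (for example with a SentenceBERT model) and that the signals are merged with a simple weighted sum. The function names, the weights, the seed-based co-citation boost, and the toy data are all illustrative assumptions, not the paper's implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def jaccard(a, b):
    """Overlap between two collections of entity strings."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def sec_rank(query_vec, query_entities, papers, cocite,
             weights=(0.5, 0.3, 0.2), seeds=1):
    """Rank paper ids by a weighted sum of three signals (hypothetical scheme).

    papers: {paper_id: {"vec": [...], "entities": [...]}}
    cocite: {(id_a, id_b): count}, keys stored as sorted tuples
    """
    w_sem, w_ent, w_cit = weights
    # Semantic + entity score for every paper.
    base = {
        pid: w_sem * cosine(query_vec, p["vec"])
             + w_ent * jaccard(query_entities, p["entities"])
        for pid, p in papers.items()
    }
    # The best papers so far act as seeds for the citation signal.
    seed_ids = sorted(base, key=base.get, reverse=True)[:seeds]
    max_count = max(cocite.values(), default=0)
    scores = {}
    for pid in papers:
        # How often is this paper cited together with a seed paper?
        count = max(
            (cocite.get(tuple(sorted((pid, s))), 0) for s in seed_ids if s != pid),
            default=0,
        )
        boost = count / max_count if max_count else 0.0
        scores[pid] = base[pid] + w_cit * boost
    return sorted(scores, key=scores.get, reverse=True)

# Toy data (made up for illustration).
papers = {
    "p1": {"vec": [1.0, 0.0], "entities": ["retrieval", "llm"]},
    "p2": {"vec": [0.0, 1.0], "entities": ["vision"]},
    "p3": {"vec": [0.1, 0.9], "entities": ["parsing"]},
}
cocite = {("p1", "p2"): 4}
ranking = sec_rank([1.0, 0.0], ["llm", "retrieval"], papers, cocite)
```

In this toy example, `p2` shares no wording or entities with the query, yet it ranks above `p3` because it is frequently co-cited with the top match `p1`. That is the kind of paper a purely semantic search would miss.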

-----

💡 Key Insights:

→ Literature retrieval needs semantic, entity and citation-based approaches for completeness

→ Brainstorming complements literature-based ideation for better novelty

→ Clustering retrieved papers reduces redundancy and noise

→ Entity expansion helps catch papers using different terms for same concepts
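
Two of the insights above, entity expansion and clustering, are easy to picture in code. The sketch below is a rough illustration under stated assumptions: the synonym table is a hand-written stand-in for whatever expansion step the paper uses, and the clustering is a simple greedy threshold ("leader") scheme chosen for brevity, not necessarily the paper's algorithm. All names and the 0.9 threshold are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def expand_entities(entities, synonyms):
    """Add known alternative terms so papers using different wording still match.

    synonyms: {lowercase term: [alternative terms]} (hypothetical lookup table)
    """
    expanded = set()
    for ent in entities:
        key = ent.lower()
        expanded.add(key)
        expanded.update(s.lower() for s in synonyms.get(key, []))
    return expanded

def cluster_representatives(ranked_papers, threshold=0.9):
    """Greedy threshold clustering: keep a paper only if it is not too
    similar to any paper already kept.

    ranked_papers: list of (paper_id, vector), best match first
    """
    kept = []
    for pid, vec in ranked_papers:
        if all(cosine(vec, kept_vec) < threshold for _, kept_vec in kept):
            kept.append((pid, vec))
    return [pid for pid, _ in kept]

# Toy data (made up for illustration).
terms = expand_entities(["LLM"], {"llm": ["large language model"]})
ranked = [("a", [1.0, 0.0]), ("b", [0.99, 0.05]), ("c", [0.0, 1.0])]
representatives = cluster_representatives(ranked)
```

Here `b` is nearly identical to `a` and is dropped as redundant, while `c` covers a different direction and is kept.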

-----

📊 Results:

→ For every 100 research backgrounds given as input, generated 4-5 ideas that match ideas in actual ACL 2024 papers

→ Achieved 41.9% recall@10 for literature retrieval vs 38.1% baseline

→ Produced 92 highly novel ideas (score 9/10) vs 12 from baseline

→ Maintained feasibility across different novelty levels (19.1-25.5% win rate)
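
For readers unfamiliar with the recall@10 figure above, here is a minimal sketch of how recall@k is commonly computed. What counts as a "relevant" paper in the paper's evaluation is not stated in this summary, so the ground-truth set below is an assumption and the ids are made up.

```python
def recall_at_k(retrieved, relevant, k=10):
    """Fraction of relevant papers that appear in the top-k retrieved list."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = set(retrieved[:k]) & relevant
    return len(hits) / len(relevant)

# Toy example: 3 relevant papers, 2 of them found in the top 4.
retrieved_ids = ["a", "b", "c", "d"]
relevant_ids = ["a", "d", "z"]
score = recall_at_k(retrieved_ids, relevant_ids, k=4)
```

A recall@10 of 41.9% would mean that, on average, about 42% of the papers treated as ground truth show up among the first ten retrieved results.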
