"Pron vs Prompt: Can Large Language Models already Challenge a World-Class Fiction Author at Creative Text Writing?"

The podcast is generated with Google's Illuminate, a tool trained on AI & science-related arXiv papers.

Jan 01, 2025

As of now, GPT-4's creative writing consistently falls short of a world-class novelist's capabilities.

However, I am pretty sure that will change soon.

So this paper set up a contest between Patricio Pron (an award-winning novelist, considered one of the best of his generation) and GPT-4.

• 95% of GPT-4's style ratings and 83% of its theme ratings fell in the unattractive range (scores of 0-1)

• Only 24% of GPT-4's texts were rated creative (scores of 2-3) vs 88% for Pron
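
To make the percentages above concrete, here is a minimal sketch of how a "share of ratings in a score range" can be computed. It assumes integer scores on a 0-3 scale (inferred from the 0-1 and 2-3 ranges quoted above). The helper name `share_in_range` and the toy ratings are made up for illustration and are not data from the paper.

```python
def share_in_range(scores, low, high):
    """Return the fraction of scores s with low <= s <= high."""
    if not scores:
        raise ValueError("scores must be non-empty")
    hits = sum(1 for s in scores if low <= s <= high)
    return hits / len(scores)


# Made-up ratings on an assumed 0-3 scale, for illustration only.
toy_scores = [0, 1, 1, 2, 3, 0, 1, 2]

# Share of ratings in the low (0-1) range and in the high (2-3) range.
low_share = share_in_range(toy_scores, 0, 1)
high_share = share_in_range(toy_scores, 2, 3)

print(f"low range (0-1): {low_share:.1%}")
print(f"high range (2-3): {high_share:.1%}")
```

Because every score falls in exactly one of the two ranges, the two shares add up to 100% for any given set of ratings.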

----

Key Insights from this Paper 💡:

• First comprehensive comparison between GPT-4 and a world-class novelist

• The prompt significantly influences the quality of AI-generated text

• GPT-4 performs better in English than Spanish for creative writing

• AI-generated text style becomes recognizable to experts over time

• Boden's creativity framework is effective for evaluating AI-generated texts

Solution in this Paper 🧠:

• Designed 60 creative writing tasks (movie synopses) based on titles from both the AI and the human author

• Developed evaluation rubric based on Boden's creativity dimensions (attractiveness, originality, creativity)

• Collected 5,400 expert assessments from 6 literature critics/scholars

• Analyzed performance differences between AI and human, impact of prompts, language differences

• Statistically validated correlation between Boden's dimensions and perceived creativity

Results 📊:

• GPT-4 texts improved when using Pron's titles (e.g., style originality +57%)

• Experts' accuracy in detecting AI-generated texts increased over time

🗞️ https://arxiv.org/pdf/2407.01119


Rohan's Bytes