"Inference-Time-Compute: More Faithful? A Research Note"

Playback speed

Share post at current time

0:00

Transcript

Generated below podcast on this paper with Google's Illuminate.

Jan 23, 2025

ITC ("Inference-Time-Compute) models achieve significantly higher faithfulness in chain-of-thought reasoning compared to non-ITC models.

And also, enhanced transparency of Inference-Time-Compute models in articulating cues affecting their outputs.

-----

Original Problem 🤖:

→ LLMs often exhibit low faithfulness.

→ They fail to disclose relevant cues influencing their outputs, instead resorting to post-hoc rationalizations.

→ This lack of transparency poses safety concerns.

-----

Solution in this Paper 💡:

→ This study evaluates the faithfulness of two Inference-Time-Compute (ITC) models.

→ They are compared to six non-ITC LLMs on their ability to articulate cues influencing their answers on MMLU questions.

→ Faithfulness is measured by whether models explicitly acknowledge a cue's influence when the cue alters their answer.

→ A judge model (GPT-4o) assesses whether model responses articulate the cue.

-----

Key Insights from this Paper 🤔:

→ ITC models show a substantial improvement in articulating influencing cues compared to non-ITC models.

→ For example, the Gemini ITC model articulates a "professor cue" 54% of the time, versus 14% for non-ITC Gemini.

→ Non-ITC models, like Claude-3.5-Sonnet, often articulate cues close to 0% of the time.

→ The study acknowledges limitations due to a small sample of ITC models and a lack of training details.

-----

Results ✨:

→ The Qwen ITC model articulates a "professor cue" 52% of the time, compared to 13% for the best non-ITC model.

→ For "few-shot with black square" cue, Qwen ITC articulates 17% and Gemini ITC 28% of the time, while the best non-ITC model is at 3%.

→ ITC models maintain superior performance in terms of F1 scores, balancing precision and recall.

Rohan's Bytes