wassname 54e86da9dd init
2024-01-02 10:11:54 +08:00

Experiment using LLMs to detect BS writing

An experiment to see if I can detect "BS" using LLMs in a robust way.

It's impossible to detect AI-generated text with 100% accuracy, but we should have more success detecting bad, empty, or contentless text.

Why might this work? As with Schmidhuber's definition of curiosity, good writing should initially surprise the reader but become less surprising as they learn from it. Empty writing is not surprising, and confusing writing stays confusing even after you have read it. In other words, good writing should start with a high perplexity that goes down once the content has been learned.
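To make the idea concrete, here is the standard token-level definition of perplexity, written for the text alone and for the text conditioned on a summary. The symbols (x for the text of N tokens, s for the summary, p for the LLM's next-token probability) are notation introduced here for illustration, not taken from the notebook.

```latex
\mathrm{PPL}(x) = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid x_{<i})\right),
\qquad
\mathrm{PPL}(x \mid s) = \exp\!\left(-\frac{1}{N}\sum_{i=1}^{N}\log p(x_i \mid s, x_{<i})\right)
```

Under this reading, good writing has a high PPL(x) and a noticeably lower PPL(x | s). Empty writing has a low PPL(x) to begin with, and confusing writing keeps PPL(x | s) close to PPL(x).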

The ideal way to do this would be by fine-tuning, but that would be memory intensive, so I will try to do it with prompts. I ask:

  • Is the text surprising (high perplexity)?
  • Is it less surprising when given a summary (lower perplexity)?

If yes/yes, then it may be surprising new information. If either is not true, then it's probably BS.
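The two questions and the yes/yes rule can be sketched in a few lines of Python. This is a minimal sketch, not the code in main.ipynb: it assumes you already have per-token log-probabilities for the text from some LLM, once without and once with the summary in the prompt. The function names and both thresholds (`surprise_threshold`, `max_ratio`) are hypothetical placeholders that would need tuning.

```python
import math


def perplexity(token_logprobs):
    """Perplexity: exp of the mean negative log-probability per token."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def classify(logprobs_alone, logprobs_given_summary,
             surprise_threshold=20.0, max_ratio=0.5):
    """Apply the yes/yes rule to two sets of token log-probabilities.

    logprobs_alone:         log p(token) for the text on its own
    logprobs_given_summary: log p(token) for the same text when the
                            summary is placed in the prompt first
    surprise_threshold and max_ratio are assumed values, not from the source.
    """
    ppl_before = perplexity(logprobs_alone)
    ppl_after = perplexity(logprobs_given_summary)

    # Question 1: is the text surprising (high perplexity)?
    is_surprising = ppl_before > surprise_threshold
    # Question 2: is it less surprising once the summary is known?
    becomes_less_surprising = ppl_after < ppl_before * max_ratio

    if is_surprising and becomes_less_surprising:
        return "maybe informative"
    return "probably BS"
```

For example, a text whose tokens each get probability 0.01 alone but 0.5 given the summary passes both checks, while a text that stays at 0.01 (confusing) or starts at 0.5 (empty) does not.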

See main.ipynb

Description
Can we measure how good a text is by how much an LLM learns from it?