Open-Assistant/model/reward/instructor/TODO.md
2023-01-01 20:57:35 +00:00

Some other reward features we can use

  1. Finish classification feature

  2. Summaries from human feedback

  • Incorporate the labeler confidence score into RM training, and ensure the output rank score correlates with confidence

  • Each label comes with a labeling note (free-text comments by the labeler); not sure what else we can use it for

  • Use the scores for "overall", "accuracy", "coverage", and "coherence" from axis/evals to train an additional model (one that ranks additional aspects of the policy model)

    • this should be placed under experimental_dataset.py
  3. Add support for the Anthropic dataset
  • The Anthropic dataset is more like a conversation tree, which is much more complex than a simple question-answer schema

    • This is basically the kind of tree that MCTS explores in AlphaZero.
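
To make the conversation-tree point concrete, here is a minimal sketch of how such a tree could be flattened into ranking samples. The `Turn` class and `iter_samples` helper are hypothetical names invented for illustration, and the tree layout is an assumption, not the actual schema of the Anthropic dataset.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Turn:
    """Hypothetical tree node: one message plus its alternative continuations."""
    role: str
    text: str
    children: List["Turn"] = field(default_factory=list)


def iter_samples(
    node: Turn, history: Tuple[str, ...] = ()
) -> Iterator[Tuple[Tuple[str, ...], List[str]]]:
    """Yield (context, candidate_replies) wherever the tree branches."""
    context = history + (f"{node.role}: {node.text}",)
    # Only branching points give the reward model something to rank.
    if len(node.children) >= 2:
        yield context, [child.text for child in node.children]
    for child in node.children:
        yield from iter_samples(child, context)


# Toy tree (made-up data): two branching points at different depths.
tree = Turn("human", "How do I boil an egg?", [
    Turn("assistant", "Simmer it for about ten minutes.", [
        Turn("human", "And a soft one?", [
            Turn("assistant", "Around six minutes."),
            Turn("assistant", "Eggs are great."),
        ]),
    ]),
    Turn("assistant", "I don't know."),
])

samples = list(iter_samples(tree))
```

Each sample pairs the full dialogue prefix with the sibling replies that compete at that point, which is the extra structure a flat question-answer schema cannot express.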
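
For the confidence-score idea under item 2, one possible approach (an assumption, not something the existing code is known to do) is to weight each pairwise ranking loss term by the labeler's confidence. The sketch below uses plain Python for clarity; `weighted_pairwise_loss` is a hypothetical name, and it assumes confidence has already been rescaled to a positive weight such as a value in (0, 1].

```python
import math
from typing import Sequence


def weighted_pairwise_loss(
    preferred: Sequence[float],
    rejected: Sequence[float],
    confidence: Sequence[float],
) -> float:
    """Confidence-weighted mean of -log(sigmoid(preferred - rejected))."""
    total = 0.0
    for pos, neg, weight in zip(preferred, rejected, confidence):
        margin = pos - neg
        # softplus(-margin) equals -log(sigmoid(margin)); this form avoids overflow.
        total += weight * (max(-margin, 0.0) + math.log1p(math.exp(-abs(margin))))
    return total / sum(confidence)


# Toy scores: the second pair is mis-ranked by the model.
loss_uniform = weighted_pairwise_loss([1.0, -1.0], [0.0, 0.0], [1.0, 1.0])
loss_weighted = weighted_pairwise_loss([1.0, -1.0], [0.0, 0.0], [1.0, 0.1])
```

With this weighting, pairs the labeler was unsure about contribute less to the gradient, so the model is pushed hardest to separate the pairs labeled with high confidence. Whether that is enough to make the rank score correlate with confidence would still need to be checked empirically.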