# Data Augmentation
(Pull requests welcome.)
## What is data augmentation
Data augmentation is a technique we can use to get better data faster. Here, it
means using machine learning models to analyze long texts (such as an essay) and
compress them into instructions.
## How to contribute
To contribute to data augmentation you can write a short Python script that uses
a model from HuggingFace to analyze the text.
[Here](https://docs.google.com/document/d/13a188pPvqnlvuVa3e_suVz4YO5s-JWeiOOrpp0odImg/edit)
are examples of what you can do.
And here are example implementations:
[Idea 3](https://colab.research.google.com/drive/1GllCN5PgSYxBxINZsv3A2r0SpdznHlbT?usp=sharing),
[Idea 4](https://colab.research.google.com/drive/1nZx5LRjO61fYprFyqtrwPDLOis6ctR4p#scrollTo=1EE8CriiaCXj)
To contribute, simply choose one of the ideas from the document above and
implement it.
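
As a rough starting point, a contribution script might look like the minimal
sketch below. It is not taken from the linked ideas document. The prompt
wording, the output record format, and the helper names `build_prompt`,
`augment`, and `huggingface_generator` are all assumptions made for
illustration, and any model name you pass in is your own choice.

```python
from typing import Callable, Dict, List


def build_prompt(text: str) -> str:
    """Ask the model for an instruction that the given text would answer.

    The wording is an assumption; adapt it to the idea you implement.
    """
    return (
        "Write an instruction that the following text would be a good "
        f"response to.\n\nText: {text}\n\nInstruction:"
    )


def augment(
    texts: List[str], generate: Callable[[str], str]
) -> List[Dict[str, str]]:
    """Turn each long text into an instruction/response record.

    `generate` is any function mapping a prompt string to generated text,
    so the logic can be tried out without downloading a model.
    """
    records = []
    for text in texts:
        instruction = generate(build_prompt(text)).strip()
        if instruction:  # skip texts for which the model returned nothing
            records.append({"instruction": instruction, "response": text})
    return records


def huggingface_generator(model_name: str) -> Callable[[str], str]:
    """Wrap a Hugging Face text-generation pipeline as a str -> str function."""
    # Imported lazily so the rest of the script works without transformers.
    from transformers import pipeline

    pipe = pipeline("text-generation", model=model_name)

    def generate(prompt: str) -> str:
        # return_full_text=False drops the prompt from the returned text.
        result = pipe(prompt, max_new_tokens=32, return_full_text=False)
        return result[0]["generated_text"]

    return generate
```

Calling `augment(texts, huggingface_generator("distilgpt2"))` would run the
sketch against a small real model. An instruction-tuned model is likely to
produce far more usable instructions.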