mirror of
https://github.com/wassname/Open-Assistant.git
synced 2026-06-27 16:10:30 +08:00
24 lines
855 B
Markdown
# Data Augmentation

(pull request welcome)

## What is data augmentation

Data augmentation is a technique we can use to get better data faster: we use
machine learning models to analyze long text (like an essay) and compress it
into instructions.

## How to contribute

To contribute to data augmentation, write a short Python script that uses a
model from Hugging Face to analyze the text.
[This document](https://docs.google.com/document/d/13a188pPvqnlvuVa3e_suVz4YO5s-JWeiOOrpp0odImg/edit)
lists examples of what you can do.

And here are example implementations:
[Idea 3](https://colab.research.google.com/drive/1GllCN5PgSYxBxINZsv3A2r0SpdznHlbT?usp=sharing),
[Idea 4](https://colab.research.google.com/drive/1nZx5LRjO61fYprFyqtrwPDLOis6ctR4p#scrollTo=1EE8CriiaCXj)

To contribute, simply choose one of the ideas from the document above and
implement it.