Fixed readme file for style checks

Szymon Ożóg
2022-12-31 15:00:28 +01:00
parent 5bd4cc84f1
commit 11241c9895
+22 -33
@@ -1,33 +1,25 @@
# Detoxify evaluation
[Detoxify](https://github.com/unitaryai/detoxify) is an open-source model used to identify toxic prompts.
<img src="https://raw.githubusercontent.com/unitaryai/detoxify/master/examples.png" alt="Image from detoxify github that shows the example input/output of their model" />
It contains 3 different models that vary in transformer type and in the data they were trained on:
| Model name | Transformer type | Data from |
| :----------: | :---------------: | :----------------------------------------: |
| original | bert-base-uncased | Toxic Comment Classification Challenge |
| unbiased | roberta-base | Unintended Bias in Toxicity Classification |
| multilingual | xlm-roberta-base | Multilingual Toxic Comment Classification |
The unbiased and original models also have a 'small' version, but since the normal models are not memory heavy and the small models perform noticeably worse, the small versions are only described in the notebook.
## All tests below were run on a 3090 Ti
# Inference and training times and memory usage
Charts showing detailed memory usage and times for different sentence lengths and batch sizes are in the notebook.
Quick overview: batch size 16 and sentence length 4k for training; batch size 128 and sentence length 4k for inference.
| Model name | Training memory | Training time | Inference memory | Inference time |
| :---: | :---: | :---: | :---: | :---: |
| original | 11.8GB | 2.40s | 4.8GB | 16.48s |
@@ -35,25 +27,27 @@ Quick overview batch size 16, sentence length 4k for training, batch size 128 se
| multilingual | 14GB | 1.00s | 5.5GB | 4.89s |
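
The notebook's benchmarking code is not reproduced here. As a rough sketch of how wall-clock timings like the ones above could be collected, the helper below times an arbitrary callable. `time_call` and `fake_inference` are hypothetical names used only for illustration; a real run would pass the model's prediction call instead, and GPU memory measurement is left out.

```python
import time
from statistics import mean


def time_call(fn, *args, repeats=3, **kwargs):
    """Call fn(*args, **kwargs) `repeats` times and return the mean wall-clock seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        durations.append(time.perf_counter() - start)
    return mean(durations)


def fake_inference(batch):
    # Stand-in workload (assumption): replace with the real model call.
    # For GPU work, the device would also need to be synchronized before
    # the clock is read, otherwise only the kernel launch is timed.
    return [len(text) for text in batch]


batch = ["example sentence"] * 128  # batch size 128, as in the inference overview
seconds = time_call(fake_inference, batch)
print(f"mean time per batch: {seconds:.6f}s")
```

Averaging over a few repeats smooths out one-off costs such as warm-up, which otherwise dominate short measurements.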
# Filtering quality
Detoxify was tested on 4 different types of inputs
- Not obviously toxic
- Not obviously non-toxic
- Obviously toxic
- Obviously non-toxic
### Sentences used for testing and rating are contained inside the .ipynb
| Model name | Not obviously toxic | Not obviously non-toxic | Obviously toxic | Obviously non-toxic |
| :---: | :---: | :---: | :---: | :---: |
| original | Failed completely; readily accepted racist, sexist, and otherwise toxic prompts that were well formulated | Very sensitive to swear words; failed to recognize context | Good performance | Good performance |
| unbiased | Managed to find some hidden toxicity, but not in all sentences | Very sensitive to explicit language, but showed an ability to recognize context | Did well, but failed to recognize some gender-stereotype mockery | Good performance |
| multilingual | Managed to find some hidden toxicity, but not in all sentences | Very sensitive to explicit language, but showed an ability to recognize context | Did well, but failed to recognize some gender-stereotype mockery | Good performance |
Subjectively, 'unbiased' looks like the best-performing model.
I don't think it would do well as a security layer in a live version of Open Assistant unless we do some fine-tuning first, because it can be fooled into passing toxic content if that content is presented in formal language.
With some caution it can be used to filter prompts, but I would suggest also having a person verify messages that are marked as toxic with less than 90% confidence.
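
The review rule suggested above can be sketched as a small triage function. This is only an illustration: the 90% cut-off comes from the text, while the 0.5 threshold for "marked as toxic", the `triage` name, and the three labels are assumptions.

```python
def triage(toxicity, toxic_threshold=0.5, auto_threshold=0.9):
    """Route a message based on a toxicity score in [0, 1].

    toxic_threshold is an assumed cut-off for "marked as toxic";
    auto_threshold is the 90% confidence level mentioned in the text.
    """
    if not 0.0 <= toxicity <= 1.0:
        raise ValueError("toxicity must be between 0 and 1")
    if toxicity >= auto_threshold:
        return "reject"   # confidently toxic: filter automatically
    if toxicity >= toxic_threshold:
        return "review"   # marked toxic but below 90%: send to a human
    return "accept"       # not marked as toxic


for score in (0.95, 0.7, 0.1):
    print(score, triage(score))
```

Note that this only catches uncertain positives. Toxic prompts phrased in formal language may still score low and be accepted, which is the weakness described above.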
# Licensing
@@ -71,16 +65,12 @@ With some caution it can be used to filter prompts but I would suggest also usin
- Private use
#### You cannot
- Hold the owner liable
- Use the owner's trademark
#### You must
- Include Copyright
@@ -91,21 +81,20 @@ With some caution it can be used to filter prompts but I would suggest also usin
- Include notice
This is obviously not legal advice.
# Hosting
The model is currently available on [Hugging Face](https://huggingface.co/unitary) and Torch Hub:
```
import torch
torch.hub.load('unitaryai/detoxify', model)
```
where `model` is one of:
- toxic_bert
- unbiased_toxic_roberta
- multilingual_toxic_xlm_r
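
As a small sketch, the entry points above can be tied to the model names used earlier in this README. The mapping is an assumption inferred from the names, and `HUB_ENTRYPOINTS` and `load_detoxify` are hypothetical helpers, not part of Detoxify.

```python
# Assumed mapping from model names in this README to Torch Hub entry points.
HUB_ENTRYPOINTS = {
    "original": "toxic_bert",
    "unbiased": "unbiased_toxic_roberta",
    "multilingual": "multilingual_toxic_xlm_r",
}


def load_detoxify(name):
    """Load a Detoxify model from Torch Hub by its README model name."""
    if name not in HUB_ENTRYPOINTS:
        raise ValueError(
            f"unknown model {name!r}; expected one of {sorted(HUB_ENTRYPOINTS)}"
        )
    # Imported lazily so the name check works without PyTorch installed.
    import torch

    return torch.hub.load("unitaryai/detoxify", HUB_ENTRYPOINTS[name])
```

Calling `load_detoxify("unbiased")` would download the repository and weights on first use, so it needs network access and PyTorch installed.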