CarperAI
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
Lines changed: 14 additions & 7 deletions
diff --git a/.readthedocs.yml b/.readthedocs.yml
Lines changed: 19 additions & 3 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 56 additions & 9 deletions
diff --git a/README.md b/README.md
Lines changed: 15 additions & 41 deletions
diff --git a/docs/Makefile b/docs/Makefile
Lines changed: 1 addition & 1 deletion
diff --git a/docs/requirements.txt b/docs/requirements.txt
Lines changed: 19 additions & 10 deletions
diff --git a/docs/source/conf.py b/docs/source/conf.py
Lines changed: 0 additions & 54 deletions
@@ -1,7 +1,7 @@
 # See https://pre-commit.com for more information
 # See https://pre-commit.com/hooks.html for more hooks
 repos:
--   repo: https://github.com/pre-commit/pre-commit-hooks
+  - repo: https://github.com/pre-commit/pre-commit-hooks
     rev: v4.4.0
     hooks:
         - id: check-case-conflict
@@ -18,17 +18,24 @@ repos:
           args: [--fix=lf]
         - id: requirements-txt-fixer
         - id: trailing-whitespace
--   repo: https://github.com/psf/black
+  - repo: https://github.com/psf/black
     rev: 23.1.0
     hooks:
-    -   id: black
+      - id: black
         files: ^(trlx|examples|tests|setup.py)/
--   repo: https://github.com/pycqa/isort
+  - repo: https://github.com/pycqa/isort
     rev: 5.12.0
     hooks:
-    -   id: isort
+      - id: isort
         name: isort (python)
--   repo: https://github.com/pycqa/flake8
+  - repo: https://github.com/pycqa/flake8
     rev: 6.0.0
     hooks:
-    -   id: flake8
+      - id: flake8
+  - repo: https://github.com/codespell-project/codespell
+    rev: v2.2.2
+    hooks:
+      - id: codespell
+        args: [--ignore-words, dictionary.txt]
+        additional_dependencies:
+          - tomli
@@ -1,9 +1,25 @@
+# .readthedocs.yml
+# Read the Docs configuration file
+# See https://docs.readthedocs.io/en/stable/config-file/v2.html for details
+
+# Required
 version: 2
 
+build:
+  os: "ubuntu-20.04"
+  tools:
+    python: "3.8"
+
+# Build documentation in the docs/ directory with Sphinx
 sphinx:
-  configuration: docs/source/conf.py
+  configuration: docs/conf.py
+  fail_on_warning: false
+
+# Optionally build your docs in additional formats such as PDF and ePub
+formats:
+  - htmlzip
 
+# Optionally set the version of Python and requirements required to build your docs
 python:
-  version: 3.9
   install:
-  - requirements: docs/requirements.txt
+    - requirements: docs/requirements.txt
@@ -2,7 +2,11 @@
 
 Looking to improve `trlX`? Thanks for considering!
 
-There are many ways to contribute, from writing tutorials in [Colab notebooks](https://colab.research.google.com) to improving the project's [documentation](https://trlx.readthedocs.io), submitting bug reports and feature requests, or even implementing new features themselves. See the outstanding [issues](https://github.com/CarperAI/trlx/issues) for ideas on where to begin.
+There are many ways to contribute, from writing tutorials in [Colab notebooks](https://colab.research.google.com) to improving the project's [documentation](https://trlx.readthedocs.io), to submitting bug reports and feature requests, or even implementing new features yourself. See the outstanding [issues](https://github.com/CarperAI/trlx/issues) for ideas on where to begin.
+
+- [Documentation Issues](https://github.com/CarperAI/trlx/issues?q=is%3Aissue+is%3Aopen+label%3Adocumentation)
+- [Bug Fixes](https://github.com/CarperAI/trlx/issues?q=is%3Aissue+is%3Aopen+label%3Abug)
+- [Feature Requests](https://github.com/CarperAI/trlx/issues?q=is%3Aissue+is%3Aopen+label%3A%22feature+request%22)
 
 Here are some guidelines to help you get started 🚀.
 
@@ -16,40 +20,83 @@ To submit a bug report or a feature request, please open an [issue](https://gith
 
 Follow these steps to start contributing code:
 
+1. Set up your environment:
+
+```bash
+conda create -n trlx python=3.8 pytorch pytorch-cuda=11.7 -c pytorch -c nvidia
+git clone https://github.com/CarperAI/trlx
+cd trlx
+pip install -e ".[dev]"
+pre-commit install
+```
+
 1. Create your own [fork](https://docs.github.com/en/get-started/quickstart/fork-a-repo#forking-a-repository) of the repository and clone it to your local machine.
+
     ```bash
     git clone https://github.com/<YOUR-USERNAME>/trlx.git
     cd trlx
     git remote add upstream https://github.com/CarperAI/trlx.git
     ```
-2. Create a new branch for your changes and give it a concise name that reflects your contribution.
+
+1. Create a new branch for your changes and give it a concise name that reflects your contribution.
+
     ```bash
     git checkout -b <BRANCH-NAME>
     ```
-2. Install the development dependencies in a Python environment.
+
+1. Install the development dependencies in a Python environment.
+
     ```bash
     pip install -e ".[dev]"
     pre-commit install
     ```
-4. Implement your changes. Make small, independent, and well documented commits along the way (check out [these](https://cbea.ms/git-commit/) tips).
-5. Add unit tests whenever appropriate and ensure that the tests pass. To run the entire test suite, use the following command from within the project root directory.
+
+Install pre-commit:
+
+```bash
+pip install pre-commit
+pre-commit install
+```
+
+Bonus: run pre-commit on all files:
+
+```bash
+pre-commit run --all-files
+```
+
+1. Implement your changes. Make small, independent, and well-documented commits along the way (check out [these](https://cbea.ms/git-commit/) tips).
+
+1. Add unit tests whenever appropriate and ensure that the tests pass. To run the entire test suite, use the following command from within the project root directory.
+
     ```bash
     pytest
     ```
+
     For changes with minimal project scope (e.g. a simple bug fix), you might want to run the unit tests for just a specific test file instead:
+
     ```bash
     pytest -vv -k "<TEST-FILE-NAME>"
     ```
-5. Commit your final changes. Our `pre-commit` hooks will automatically run before each commit and will prevent you from committing code that does not pass our style and linter checks. They'll also automatically format your code! To run these manually, use the following command:
+
+1. Commit your final changes. Our `pre-commit` hooks will automatically run before each commit and will prevent you from committing code that does not pass our style and linter checks. They'll also automatically format your code! To run these manually, use the following command:
+
     ```bash
     pre-commit run --all-files
     ```
 
-6. Push the changes to your fork.
+1. Push the changes to your fork.
 
 Finally ... 🥁 ... Create a [pull request](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/creating-a-pull-request) to the `trlX` repository! Make sure to include a description of your changes and link to any relevant issues.
 
-> __Tip__: If you're looking to introduce an experimental feature, we suggest testing the behavior of your proposed feature on some of the existing [examples](https://github.com/CarperAI/trlx/tree/master/examples), such as [random walks](https://github.com/CarperAI/trlx/blob/master/examples/randomwalks). This will help you get a better sense of how the feature would work in practice and will also help you identify any potential flaws in the implementation.
+> **Tip**: If you're looking to introduce an experimental feature, we suggest testing the behavior of your proposed feature on some of the existing [examples](https://github.com/CarperAI/trlx/tree/master/examples), such as [random walks](https://github.com/CarperAI/trlx/blob/master/examples/randomwalks). This will help you get a better sense of how the feature would work in practice and will also help you identify any potential flaws in the implementation.
+
+## Tips & Tricks
+
+Set the `transformers` verbosity level:
+
+```bash
+export TRANSFORMERS_VERBOSITY=error
+```
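
The same setting can also be applied from inside a script. This is a minimal sketch, assuming the variable is set before `transformers` is first imported, since the library reads `TRANSFORMERS_VERBOSITY` when it configures its logging:

```python
import os

# Set the variable before importing transformers so the library
# sees it when it first configures its logger.
os.environ["TRANSFORMERS_VERBOSITY"] = "error"

# import transformers  # import only after the variable is set
```

Setting it in the script keeps the behavior reproducible for anyone who runs the file without the shell variable exported.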
 
 ## Asking questions
 
@@ -63,4 +110,4 @@ This project adheres to the [Contributor Covenant Code of Conduct](https://githu
 
 By contributing, you agree that your contributions will be licensed under its MIT License.
 
-# Thank you for your contribution 🐠!
+## Thank you for your contribution! 🐠
@@ -1,3 +1,5 @@
+![TRLX](./docs/_static/apple-touch-icon-114x114.png)
+
 [docs-image]: https://readthedocs.org/projects/trlX/badge/?version=latest
 [docs-url]: https://trlX.readthedocs.io/en/latest/?badge=latest
 
@@ -12,6 +14,7 @@ You can read more about trlX in our [documentation](https://trlX.readthedocs.io)
 Want to collect human annotations for your RL application? Check out [CHEESE!](https://github.com/carperai/cheese), our library for HiTL data collection.
 
 ## Installation
+
 ```bash
 git clone https://github.com/CarperAI/trlx.git
 cd trlx
@@ -20,31 +23,37 @@ pip install -e .
 ```
 
 ## Examples
+
 For more usage see [examples](./examples). You can also try the colab notebooks below:
 | Description      | Link |
 | ----------- | ----------- |
 | Simulacra Example | [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/1vrmCLoHNlKvDVqJjMig-8tKDCfIEoym4?usp=sharing)|
 
 
-
 ## How to Train
+
 You can train a model using a reward function or a reward-labeled dataset.
 
-#### Using a reward function
+### Using a reward function
+
 ```python
 trainer = trlx.train('gpt2', reward_fn=lambda samples, **kwargs: [sample.count('cats') for sample in samples])
 ```
-#### Using a reward-labeled dataset
+
+### Using a reward-labeled dataset
+
 ```python
 trainer = trlx.train('EleutherAI/gpt-j-6B', dataset=[('dolphins', 'geese'), (1.0, 100.0)])
 ```
 
-#### Trainers provide a wrapper over their underlying model
+### Trainers provide a wrapper over their underlying model
+
 ```python
 trainer.generate(**tokenizer('Q: Who rules the world? A:', return_tensors='pt'), do_sample=True)
 ```
 
-#### Save the resulting model to a Hugging Face pretrained language model. (Ready to upload to the Hub!)
+### Save the resulting model to a Hugging Face pretrained language model. (Ready to upload to the Hub!)
+
 ```python
 trainer.save_pretrained('/path/to/output/folder/')
 ```
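
The inline lambda in the reward-function example can also be written as a named function, which is easier to test in isolation. This is a minimal sketch; `count_cats_reward` is a hypothetical name, and the `**kwargs` catch-all mirrors the lambda's signature:

```python
from typing import List


def count_cats_reward(samples: List[str], **kwargs) -> List[float]:
    """Score each sample by how many times 'cats' appears in it."""
    return [float(sample.count("cats")) for sample in samples]


# Quick check that needs no model or GPU.
print(count_cats_reward(["cats and more cats", "dogs only"]))
```

It could then be passed as `reward_fn=count_cats_reward` in the `trlx.train` call from the README.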
@@ -69,46 +78,11 @@ python examples/nemo_ilql_sentiments.py
 For more usage see the [NeMo README](./trlx/trainer/nemo)
 
 #### Use Ray Tune to launch hyperparameter sweep
+
 ```bash
 python -m trlx.sweep --config configs/sweeps/ppo_sweep.yml examples/ppo_sentiments.py
 ```
 
-## Logging
-
-trlX uses the standard Python `logging` library to log training information to the console. The default logger is set to the `INFO` level, which means that `INFO`, `WARNING`, `ERROR`, and `CRITICAL` level messages will be printed to standard output.
-
-To change the log level directly, you can use the verbosity setter. For example, to set the log level to `WARNING` use:
-
-```python
-import trlx
-
-trlx.logging.set_verbosity(trlx.logging.WARNING)
-```
-
-This will suppress `INFO` level messages, but still print `WARNING`, `ERROR`, and `CRITICAL` level messages.
-
-You can also control logging verbosity by setting the `TRLX_VERBOSITY` environment variable to one of the standard logging [level names](https://docs.python.org/3/library/logging.html#logging-levels):
-
-* `CRITICAL` (`trlx.logging.CRITICAL`)
-* `ERROR` (`trlx.logging.ERROR`)
-* `WARNING` (`trlx.logging.WARNING`)
-* `INFO` (`trlx.logging.INFO`)
-* `DEBUG` (`trlx.logging.DEBUG`)
-
-```sh
-export TRLX_VERBOSITY=WARNING
-```
-
-By default, [`tqdm`](https://tqdm.github.io/docs/tqdm/) progress bars are used to display training progress. You can disable them by calling `trlx.logging.disable_progress_bar()`, otherwise `trlx.logging.enable_progress_bar()` to enable.
-
-Messages can be formatted with greater detail by setting `trlx.logging.enable_explicit_format()`. This will inject call-site information into each log which may be helpful for debugging.
-
-```sh
-[2023-01-01 05:00:00,000] [INFO] [ppo_orchestrator.py:63:make_experience] [RANK 0] Message...
-```
-
-> 💡 Tip: To reduce the amount of logging output, you might find it helpful to change log levels of third-party libraries used by trlX. For example, try adding `transformers.logging.set_verbosity_error()` to the top of your trlX scripts to silence verbose messages from the `transformers` library (see their [logging docs](https://huggingface.co/docs/transformers/main_classes/logging#logging) for more details).
-
 ## Contributing
 
 For development check out these [guidelines](./CONTRIBUTING.md)
 
@@ -5,7 +5,7 @@
 # from the environment for the first two.
 SPHINXOPTS    ?=
 SPHINXBUILD   ?= sphinx-build
-SOURCEDIR     = source
+SOURCEDIR     = .
 BUILDDIR      = build
 
 # Put it first so that "make" without argument is like "make help".
 
@@ -1,11 +1,20 @@
-accelerate==0.12.0
-datasets==2.4.0
-deepspeed==0.7.3
-einops==0.4.1
-numpy==1.23.2
-sphinx==4.0.0
-sphinx_rtd_theme
+accelerate
+commonmark
+datasets
+deepspeed
+docutils
+jupyter-sphinx
+matplotlib
+myst-nb
+nbsphinx
+Pygments
+ray
+readthedocs-sphinx-ext
+rich
+sphinx-autodoc-typehints
+sphinx-book-theme
+sphinx-copybutton
+sphinx-design
+sphinx-remove-toctrees
 torchtyping
-tqdm==4.64.0
-transformers==4.21.2
-wandb==0.13.2
+transformers
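
The new documentation dependencies suggest a Sphinx setup built around the book theme and notebook rendering. The `docs/conf.py` referenced by `.readthedocs.yml` is not shown in this diff, so the following is only a hypothetical sketch of how those packages are typically enabled, not the file from this change:

```python
# docs/conf.py (hypothetical sketch, not the file from this change)
project = "trlX"

# Extensions corresponding to packages listed in docs/requirements.txt.
extensions = [
    "sphinx.ext.autodoc",        # ships with Sphinx
    "sphinx_autodoc_typehints",  # from sphinx-autodoc-typehints
    "sphinx_copybutton",         # from sphinx-copybutton
    "sphinx_design",             # from sphinx-design
    "myst_nb",                   # from myst-nb, renders notebooks
]

# Theme provided by the sphinx-book-theme package.
html_theme = "sphinx_book_theme"
```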