hamel winglian committed on
Commit
6d342b5
1 Parent(s): b502392

Add section for debugging with Docker (#1104)

* add docker debug

* Update docs/debugging.md

Co-authored-by: Wing Lian <wing.lian@gmail.com>

* explain editable install

* explain editable install

* upload new video

---------

Co-authored-by: Wing Lian <wing.lian@gmail.com>

Files changed (1)
  1. docs/debugging.md +79 -2
docs/debugging.md CHANGED
@@ -10,6 +10,10 @@ This document provides some tips and tricks for debugging Axolotl. It also prov
10
  - [Configuration](#configuration)
11
  - [Customizing your debugger](#customizing-your-debugger)
12
  - [Video Tutorial](#video-tutorial)
13
 
14
  ## General Tips
15
 
@@ -18,7 +22,8 @@ While debugging it's helpful to simplify your test scenario as much as possible.
18
  > [!Important]
19
  > All of these tips are incorporated into the [example configuration](#configuration) for debugging with VSCode below.
20
 
21
- 1. **Eliminate Concurrency**: Restrict the number of processes to 1 for both training and data preprocessing:
 
22
  - Set `CUDA_VISIBLE_DEVICES` to a single GPU, ex: `export CUDA_VISIBLE_DEVICES=0`.
23
  - Set `dataset_processes: 1` in your axolotl config or run the training command with `--dataset_processes=1`.
24
  2. **Use a small dataset**: Construct or use a small dataset from HF Hub. When using a small dataset, you will often have to make sure `sample_packing: False` and `eval_sample_packing: False` to avoid errors. If you are in a pinch and don't have time to construct a small dataset but want to use one from the HF Hub, you can shard the data (this will still tokenize the entire dataset, but will only use a fraction of the data for training). For example, to shard the dataset into 20 pieces, add the following to your axolotl config:
@@ -56,6 +61,21 @@ datasets:
56
  >[!Tip]
57
  > If you prefer to watch a video, rather than read, you can skip to the [video tutorial](#video-tutorial) below (but doing both is recommended).
58
 
59
  ### Configuration
60
 
61
  The easiest way to get started is to modify the [.vscode/launch.json](../.vscode/launch.json) file in this project. This is just an example configuration, so you may need to modify or copy it to suit your needs.
@@ -150,7 +170,7 @@ The following video tutorial walks through the above configuration and demonstra
150
 
151
  <div style="text-align: center; line-height: 0;">
152
 
153
- <a href="https://youtu.be/xUUB11yeMmc?si=z6Ea1BrRYkq6wsMx" target="_blank"
154
  title="How to debug Axolotl (for fine tuning LLMs)"><img
155
  src="https://i.ytimg.com/vi/xUUB11yeMmc/maxresdefault.jpg"
156
  style="border-radius: 10px; display: block; margin: auto;" width="560" height="315" /></a>
@@ -160,6 +180,63 @@ style="border-radius: 10px; display: block; margin: auto;" width="560" height="3
160
  </div>
161
  <br>
162
 
163
 
164
 
165
  [^1]: The config actually mimics the command `CUDA_VISIBLE_DEVICES=0 python -m accelerate.commands.launch -m axolotl.cli.train devtools/sharegpt.yml`, but this is the same thing.
 
 
 
10
  - [Configuration](#configuration)
11
  - [Customizing your debugger](#customizing-your-debugger)
12
  - [Video Tutorial](#video-tutorial)
13
+ - [Debugging With Docker](#debugging-with-docker)
14
+ - [Setup](#setup)
15
+ - [Attach To Container](#attach-to-container)
16
+ - [Video - Attaching To Docker On Remote Host](#video---attaching-to-docker-on-remote-host)
17
 
18
  ## General Tips
19
 
 
22
  > [!Important]
23
  > All of these tips are incorporated into the [example configuration](#configuration) for debugging with VSCode below.
24
 
25
+ 1. **Make sure you are using the latest version of axolotl**: This project changes often and bugs get fixed fast. Check your git branch and make sure you have pulled the latest changes from `main`.
26
+ 1. **Eliminate concurrency**: Restrict the number of processes to 1 for both training and data preprocessing:
27
  - Set `CUDA_VISIBLE_DEVICES` to a single GPU, ex: `export CUDA_VISIBLE_DEVICES=0`.
28
  - Set `dataset_processes: 1` in your axolotl config or run the training command with `--dataset_processes=1`.
29
  2. **Use a small dataset**: Construct or use a small dataset from HF Hub. When using a small dataset, you will often have to make sure `sample_packing: False` and `eval_sample_packing: False` to avoid errors. If you are in a pinch and don't have time to construct a small dataset but want to use one from the HF Hub, you can shard the data (this will still tokenize the entire dataset, but will only use a fraction of the data for training). For example, to shard the dataset into 20 pieces, add the following to your axolotl config:
 
61
  >[!Tip]
62
  > If you prefer to watch a video, rather than read, you can skip to the [video tutorial](#video-tutorial) below (but doing both is recommended).
63
 
64
+ ### Setup
65
+
66
+ Make sure you have an [editable install](https://setuptools.pypa.io/en/latest/userguide/development_mode.html) of Axolotl, which ensures that changes you make to the code are reflected at runtime. Run the following commands from the root of this project:
67
+
68
+ ```bash
69
+ pip3 install packaging
70
+ pip3 install -e '.[flash-attn,deepspeed]'
71
+ ```
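
To confirm that the install really is editable, a quick check like the following can help. This is a minimal sketch, not part of the Axolotl tooling: `is_editable` is a hypothetical helper name, and it assumes a recent `pip` whose `pip show` output includes an `Editable project location:` line for editable installs.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if `pip show` reports the package as an
# editable install. Assumes a recent pip that prints "Editable project location:".
is_editable() {
  pip3 show "$1" 2>/dev/null | grep -q '^Editable project location:'
}

if is_editable axolotl; then
  echo "axolotl is installed in editable mode"
else
  echo "axolotl is not an editable install (or pip is too old to report it)"
fi
```

If the check fails, re-running the `pip3 install -e` command above from the project root is the first thing to try.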
72
+
73
+ #### Remote Hosts
74
+
75
+ If you are developing on a remote host, you can use VSCode to debug remotely. To do so, follow this [Remote - SSH guide](https://code.visualstudio.com/docs/remote/ssh). You can also see the video below on [Docker and Remote SSH debugging](#video---attaching-to-docker-on-remote-host).
76
+
78
+
79
  ### Configuration
80
 
81
  The easiest way to get started is to modify the [.vscode/launch.json](../.vscode/launch.json) file in this project. This is just an example configuration, so you may need to modify or copy it to suit your needs.
 
170
 
171
  <div style="text-align: center; line-height: 0;">
172
 
173
+ <a href="https://youtu.be/xUUB11yeMmc" target="_blank"
174
  title="How to debug Axolotl (for fine tuning LLMs)"><img
175
  src="https://i.ytimg.com/vi/xUUB11yeMmc/maxresdefault.jpg"
176
  style="border-radius: 10px; display: block; margin: auto;" width="560" height="315" /></a>
 
180
  </div>
181
  <br>
182
 
183
+ ## Debugging With Docker
184
+
185
+ Using [official Axolotl Docker images](https://hub.docker.com/r/winglian/axolotl/tags) is a great way to debug your code, and is a very popular way to use Axolotl. Attaching VSCode to Docker takes a few more steps.
186
+
187
+ ### Setup
188
+
189
+ On the host that is running axolotl (ex: if you are using a remote host), clone the axolotl repo and change your current directory to the root:
190
+
191
+ ```bash
192
+ git clone https://github.com/OpenAccess-AI-Collective/axolotl
193
+ cd axolotl
194
+ ```
195
+
196
+ >[!Tip]
197
+ > If you already have axolotl cloned on your host, make sure you have the latest changes and change into the root of the project.
198
+
199
+ Next, run the desired docker image and mount the current directory. Below is a docker command you can run to do this:[^2]
200
+
201
+ ```bash
202
+ docker run --privileged --gpus '"all"' --shm-size 10g --rm -it --name axolotl --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 --mount type=bind,src="${PWD}",target=/workspace/axolotl -v ${HOME}/.cache/huggingface:/root/.cache/huggingface winglian/axolotl:main-py3.10-cu118-2.0.1
203
+ ```
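
The one-line command above is dense, so here is the same invocation split into a bash array with a comment per flag. This is only a sketch for readability: the `IMAGE` and `DRY_RUN` variables are assumptions introduced for this example, and by default the script prints the command instead of running it.

```bash
#!/usr/bin/env bash
# IMAGE and DRY_RUN are hypothetical variables used only in this sketch.
IMAGE="${IMAGE:-winglian/axolotl:main-py3.10-cu118-2.0.1}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print the command only, 0 = actually run it

docker_args=(
  --privileged                 # give the container extended privileges
  --gpus '"all"'               # expose all host GPUs to the container
  --shm-size 10g               # size of /dev/shm inside the container
  --rm -it                     # remove the container on exit; interactive TTY
  --name axolotl               # the name you will see when attaching from VSCode
  --ipc=host                   # share the host IPC namespace
  --ulimit memlock=-1          # no limit on locked memory
  --ulimit stack=67108864      # 64 MiB stack size limit
  --mount "type=bind,src=${PWD},target=/workspace/axolotl"   # repo from the host
  -v "${HOME}/.cache/huggingface:/root/.cache/huggingface"   # reuse the HF cache
)

if [ "$DRY_RUN" = "1" ]; then
  printf '%s ' docker run "${docker_args[@]}" "$IMAGE"
  printf '\n'
else
  docker run "${docker_args[@]}" "$IMAGE"
fi
```

Run it from the root of the axolotl checkout so that `${PWD}` points at the repo; set `DRY_RUN=0` to start the container.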
204
+
205
+ >[!Tip]
206
+ > To understand which containers are available, see the [Docker section of the README](../README.md#docker) and the [DockerHub repo](https://hub.docker.com/r/winglian/axolotl/tags). For details of how the Docker containers are built, see axolotl's [Docker CI builds](../.github/workflows/main.yml).
207
+
208
+ You will now be in the container. Next, perform an editable install of Axolotl:
209
+
210
+ ```bash
211
+ pip3 install packaging
212
+ pip3 install -e '.[flash-attn,deepspeed]'
213
+ ```
214
+
215
+ ### Attach To Container
216
+
217
+ Next, if you are using a remote host, [Remote into this host with VSCode](https://code.visualstudio.com/docs/remote/ssh). If you are using a local host, you can skip this step.
218
+
219
+ Next, select `Dev Containers: Attach to Running Container...` using the command palette (`CMD + SHIFT + P` on macOS, `CTRL + SHIFT + P` on Windows and Linux) in VSCode. You will be prompted to select a container to attach to. Select the container you just created. You will now be in the container with a working directory that is at the root of the project. Any changes you make to the code will be reflected both in the container and on the host.
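
If the container does not show up in the attach list, it may help to check from a shell on the host that it is actually running. The sketch below assumes the container was started with `--name axolotl` as in the Setup step; `container_running` is a hypothetical helper name, not an Axolotl or Docker command.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a running container has exactly this name.
container_running() {
  docker ps --format '{{.Names}}' 2>/dev/null | grep -qx -- "$1"
}

if container_running axolotl; then
  echo "container 'axolotl' is running; it should appear in the attach list"
else
  echo "no running container named 'axolotl'; re-run the docker run command from Setup"
fi
```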
220
+
221
+ Now you are ready to debug as described above (see [Debugging with VSCode](#debugging-with-vscode)).
222
+
223
+ ### Video - Attaching To Docker On Remote Host
224
 
225
+ Here is a short video that demonstrates how to attach to a Docker container on a remote host:
226
+
227
+ <div style="text-align: center; line-height: 0;">
228
+
229
+ <a href="https://youtu.be/A_A2CEHj4ew" target="_blank"
230
+ title="Debugging Axolotl Part 2: Attaching to Docker on a Remote Host"><img
231
+ src="https://i.ytimg.com/vi/A_A2CEHj4ew/maxresdefault.jpg"
232
+ style="border-radius: 10px; display: block; margin: auto;" width="560" height="315" /></a>
233
+
234
+ <figcaption style="font-size: smaller;"><a href="https://hamel.dev">Hamel Husain's</a> tutorial: <a href="https://youtu.be/A_A2CEHj4ew">Debugging Axolotl Part 2: Attaching to Docker on a Remote Host
235
+ </a></figcaption>
236
+
237
+ </div>
238
+ <br>
239
 
240
  [^1]: The config actually mimics the command `CUDA_VISIBLE_DEVICES=0 python -m accelerate.commands.launch -m axolotl.cli.train devtools/sharegpt.yml`, but this is the same thing.
241
+
242
+ [^2]: Many of the flags in this command are best practices recommended by NVIDIA when using nvidia-container-toolkit. You can read more about these flags [here](https://docs.nvidia.com/deeplearning/frameworks/user-guide/index.html).