Bingleng committed
Commit ec45c45
1 Parent(s): c086748

Update README.md

Files changed (1): README.md (+54, -1)
README.md CHANGED
---
license: apache-2.0
---

# Chinese Vision-Language Multimodal Models for Captions and Robot Actions

## Release
- [9/22] 🔥 We release two models: the *CN-caption* model for accurate Chinese image captioning and the *robot action* model for demo-level robot actions.

## Contents
- [Install](#install)
- [Demo](#demo)
- [Inference](#inference)

## Install

1. Install the package
```Shell
conda create -n llava python=3.10 -y
conda activate llava
pip install --upgrade pip
pip install -e .
```

2. Install additional packages for training
```Shell
pip install ninja
pip install flash-attn --no-build-isolation
```
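
A quick sanity check that the editable install worked (assuming the package is importable as `llava`, which the serving commands below rely on):
```Shell
# Prints the installed module path if the package resolves correctly.
python -c "import llava; print(llava.__file__)"
```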

## Demo

To run our demo, you need to prepare LLaVA checkpoints locally. Please follow the instructions [here](#llava-weights) to download the checkpoints.

### Gradio Web UI

To launch a Gradio demo locally, please run the following commands one by one. If you plan to launch multiple model workers to compare different checkpoints, you only need to launch the controller and the web server *ONCE*.

#### Launch a controller
```Shell
python -m llava.serve.controller --host 0.0.0.0 --port 10000
```

#### Launch a Gradio web server
```Shell
python -m llava.serve.gradio_web_server --controller http://localhost:10000 --model-list-mode reload
```
You have just launched the Gradio web interface. You can now open it at the URL printed on the screen. You may notice that the model list is empty; do not worry, no model worker has been launched yet. The list is updated automatically once you launch a model worker.

#### Launch a model worker

This is the actual *worker* that performs the inference on the GPU. Each worker is responsible for a single model specified in `--model-path`.
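
The commit does not include the launch command here; the sketch below follows the standard LLaVA serving setup, with `--model-path` as a placeholder for wherever you stored the downloaded checkpoints:
```Shell
# Hypothetical checkpoint path; replace with your local model directory.
python -m llava.serve.model_worker --host 0.0.0.0 --controller http://localhost:10000 \
    --port 40000 --worker http://localhost:40000 \
    --model-path ./checkpoints/cn-caption
```
Once the worker finishes loading the model and registers with the controller, refresh the web UI and the model should appear in the list.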

## Inference

### Server
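
This section is still a stub in this commit. As a placeholder sketch: the controller and model worker launched for the demo already act as the inference server, since the worker runs an HTTP (FastAPI/uvicorn) app. A quick liveness check, with the endpoint name assumed from upstream LLaVA's serving code:
```Shell
# Assumed endpoint; in upstream LLaVA it returns the worker's model names and queue length.
curl -X POST http://localhost:40000/worker_get_status
```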

### Client
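
Also a stub; below is a hedged sketch of a client call, assuming the server exposes upstream LLaVA's `/get_worker_address` and `/worker_generate_stream` endpoints and that the model is registered under the hypothetical name `cn-caption` (image payload omitted for brevity):
```Shell
# Ask the controller which worker serves the model (endpoint assumed from LLaVA).
curl -X POST http://localhost:10000/get_worker_address \
    -H "Content-Type: application/json" \
    -d '{"model": "cn-caption"}'

# Stream a generation from that worker; the prompt is "describe this image" in Chinese.
curl -X POST http://localhost:40000/worker_generate_stream \
    -H "Content-Type: application/json" \
    -d '{"model": "cn-caption", "prompt": "描述这张图片。", "temperature": 0.2, "max_new_tokens": 256}'
```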