
Commit 80e3e2a

XinyaoWang and pre-commit-ci[bot] authored Aug 13, 2024

Update manifest for FaqGen (#582)

* update tgi version
* add k8s for faq
* add benchmark for faq
* refine k8s for faq
* add tuning for faq
* add prompts with different length for faq
* add tgi docker for llama3.1
* remove useless code
* remove nodeselector
* remove hf token
* refine code structure
* [pre-commit.ci] auto fixes from pre-commit.com hooks; for more information, see https://pre-commit.ci
* fix readme

Signed-off-by: Xinyao Wang <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
1 parent 8c384e0 commit 80e3e2a

File tree

6 files changed: +308 -330 lines
 

‎FaqGen/docker/gaudi/README.md

+2 -2

````diff
@@ -16,7 +16,7 @@ cd GenAIComps
 As TGI Gaudi has been officially published as a Docker image, we simply need to pull it:
 
 ```bash
-docker pull ghcr.io/huggingface/tgi-gaudi:1.2.1
+docker pull ghcr.io/huggingface/tgi-gaudi:2.0.1
 ```
 
 ### 2. Build LLM Image
@@ -56,7 +56,7 @@ docker build -t opea/faqgen-react-ui:latest --build-arg https_proxy=$https_proxy
 
 Then run the command `docker images`, you will have the following Docker Images:
 
-1. `ghcr.io/huggingface/tgi-gaudi:1.2.1`
+1. `ghcr.io/huggingface/tgi-gaudi:2.0.1`
2. `opea/llm-faqgen-tgi:latest`
3. `opea/faqgen:latest`
4. `opea/faqgen-ui:latest`
````
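When scripting a tag bump like this one, shell parameter expansion is enough to pull the tag out of an image reference for a version check (a generic sketch, not part of this commit):

```shell
# Extract the tag from a container image reference.
image="ghcr.io/huggingface/tgi-gaudi:2.0.1"
tag="${image##*:}"   # strip everything through the last ':'
echo "$tag"          # prints 2.0.1
```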

‎FaqGen/docker/gaudi/compose.yaml

+4 -2

```diff
@@ -17,12 +17,14 @@ services:
       https_proxy: ${https_proxy}
       HABANA_VISIBLE_DEVICES: all
       OMPI_MCA_btl_vader_single_copy_mechanism: none
-      HF_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+      HUGGING_FACE_HUB_TOKEN: ${HUGGINGFACEHUB_API_TOKEN}
+      PREFILL_BATCH_BUCKET_SIZE: 1
+      BATCH_BUCKET_SIZE: 8
     runtime: habana
     cap_add:
       - SYS_NICE
     ipc: host
-    command: --model-id ${LLM_MODEL_ID} --max-input-length 1024 --max-total-tokens 2048
+    command: --model-id ${LLM_MODEL_ID} --max-input-length 2048 --max-total-tokens 4096 --max-batch-total-tokens 65536 --max-batch-prefill-tokens 4096
   llm_faqgen:
     image: opea/llm-faqgen-tgi:latest
     container_name: llm-faqgen-server
```
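A rough sanity check on the new batching flags (a sketch; TGI's scheduler also accounts for the separate prefill budget, so this is only an upper-bound estimate, not TGI's exact math):

```python
# Upper bound on concurrently scheduled sequences implied by the new flags.
max_total_tokens = 4096          # per-sequence budget (--max-total-tokens)
max_batch_total_tokens = 65536   # whole-batch budget (--max-batch-total-tokens)

print(max_batch_total_tokens // max_total_tokens)  # -> 16
```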

‎FaqGen/kubernetes/manifests/README.md

+12 -1

````diff
@@ -23,13 +23,24 @@ sed -i "s/insert-your-huggingface-token-here/${HUGGINGFACEHUB_API_TOKEN}/g" faqg
 kubectl apply -f faqgen.yaml
 ```
 
+## Deploy UI
+
+```
+cd GenAIExamples/FaqGen/kubernetes/manifests/
+kubectl get svc # get ip address
+ip_address="" # according to your svc address
+sed -i "s/insert_your_ip_here/${ip_address}/g" ui.yaml
+kubectl apply -f ui.yaml
+```
+
 ## Verify Services
 
 Make sure all the pods are running, and restart the faqgen-xxxx pod if necessary.
 
 ```
 kubectl get pods
-curl http://${host_ip}:8888/v1/faqgen -H "Content-Type: application/json" -d '{
+port=7779 # 7779 for gaudi, 7778 for xeon
+curl http://${host_ip}:7779/v1/faqgen -H "Content-Type: application/json" -d '{
   "messages": "Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."
 }'
 ```
````
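The `sed` placeholder substitution used in the README can be tried safely on a scratch file first (a sketch against a hypothetical temp file, not part of the repo):

```shell
# Demonstrate the ip-address substitution the README performs on ui.yaml.
tmp=$(mktemp)
printf 'value: http://insert_your_ip_here:7779/v1/faqgen\n' > "$tmp"
ip_address="10.0.0.5"   # example address taken from `kubectl get svc` output
sed -i "s/insert_your_ip_here/${ip_address}/g" "$tmp"
result=$(cat "$tmp")
echo "$result"
rm -f "$tmp"
```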
+133 -163

```diff
@@ -1,216 +1,186 @@
 ---
-# Source: codegen/charts/llm-uservice/charts/tgi/templates/service.yaml
 # Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
 
-apiVersion: v1
-kind: Service
-metadata:
-  name: faqgen-tgi
-  labels:
-    helm.sh/chart: tgi-0.1.0
-    app.kubernetes.io/name: tgi
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.4"
-    app.kubernetes.io/managed-by: Helm
-spec:
-  type: ClusterIP
-  ports:
-    - port: 80
-      targetPort: 80
-      protocol: TCP
-      name: tgi
-  selector:
-    app.kubernetes.io/name: tgi
-    app.kubernetes.io/instance: faqgen
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: faqgen-llm-uservice
-  labels:
-    helm.sh/chart: llm-uservice-0.1.0
-    app.kubernetes.io/name: llm-uservice
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.0.0"
-    app.kubernetes.io/managed-by: Helm
-spec:
-  type: ClusterIP
-  ports:
-    - port: 9000
-      targetPort: 9000
-      protocol: TCP
-      name: llm-uservice
-  selector:
-    app.kubernetes.io/name: llm-uservice
-    app.kubernetes.io/instance: faqgen
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: faqgen
-  labels:
-    helm.sh/chart: faqgen-0.1.0
-    app.kubernetes.io/name: faqgen
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.0.0"
-    app.kubernetes.io/managed-by: Helm
-spec:
-  type: ClusterIP
-  ports:
-    - port: 8888
-      targetPort: 8888
-      protocol: TCP
-      name: faqgen
-  selector:
-    app.kubernetes.io/name: faqgen
-    app.kubernetes.io/instance: faqgen
----
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: faqgen-tgi
-  labels:
-    helm.sh/chart: tgi-0.1.0
-    app.kubernetes.io/name: tgi
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.4"
-    app.kubernetes.io/managed-by: Helm
+  name: faq-tgi-deploy
+  namespace: default
 spec:
   replicas: 1
   selector:
     matchLabels:
-      app.kubernetes.io/name: tgi
-      app.kubernetes.io/instance: faqgen
+      app: faq-tgi-deploy
   template:
     metadata:
+      annotations:
+        sidecar.istio.io/rewriteAppHTTPProbers: 'true'
       labels:
-        app.kubernetes.io/name: tgi
-        app.kubernetes.io/instance: faqgen
+        app: faq-tgi-deploy
     spec:
-      securityContext: {}
+      hostIPC: true
       containers:
-        - name: tgi
-          env:
-            - name: MODEL_ID
-              value: Intel/neural-chat-7b-v3-3
-            - name: PORT
-              value: "80"
-            - name: http_proxy
-              value:
-            - name: https_proxy
-              value:
-            - name: no_proxy
-              value:
-          securityContext: {}
-          image: "ghcr.io/huggingface/text-generation-inference:1.4"
-          imagePullPolicy: IfNotPresent
-          volumeMounts:
-            - mountPath: /data
-              name: model-volume
-          ports:
-            - name: http
-              containerPort: 80
-              protocol: TCP
-          resources: {}
+        - name: faq-tgi-deploy-demo
+          env:
+            - name: HUGGING_FACE_HUB_TOKEN
+              value: "insert-your-huggingface-token-here"
+            - name: OMPI_MCA_btl_vader_single_copy_mechanism
+              value: none
+            - name: PT_HPU_ENABLE_LAZY_COLLECTIVES
+              value: 'true'
+            - name: runtime
+              value: habana
+            - name: HABANA_VISIBLE_DEVICES
+              value: all
+            - name: PREFILL_BATCH_BUCKET_SIZE
+              value: "1"
+            - name: BATCH_BUCKET_SIZE
+              value: "8"
+            - name: PORT
+              value: "80"
+          image: ghcr.io/huggingface/tgi-gaudi:2.0.1
+          imagePullPolicy: IfNotPresent
+          securityContext:
+            capabilities:
+              add:
+                - SYS_NICE
+          args:
+            - --model-id
+            - 'meta-llama/Meta-Llama-3-8B-Instruct'
+            - --max-input-length
+            - '3096'
+            - --max-total-tokens
+            - '4096'
+            - --max-batch-total-tokens
+            - '65536'
+            - --max-batch-prefill-tokens
+            - '4096'
+          volumeMounts:
+            - mountPath: /data
+              name: model-volume
+            - mountPath: /dev/shm
+              name: shm
+          ports:
+            - containerPort: 80
+          resources:
+            limits:
+              habana.ai/gaudi: 1
+      serviceAccountName: default
       volumes:
-        - name: model-volume
-          hostPath:
-            path: /mnt
-            type: Directory
+        - name: model-volume
+          hostPath:
+            path: /home/sdp/cesg
+            type: Directory
+        - name: shm
+          emptyDir:
+            medium: Memory
+            sizeLimit: 1Gi
+---
+kind: Service
+apiVersion: v1
+metadata:
+  name: faq-tgi-svc
+spec:
+  type: ClusterIP
+  selector:
+    app: faq-tgi-deploy
+  ports:
+    - name: service
+      port: 8010
+      targetPort: 80
 ---
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: faqgen-llm-uservice
-  labels:
-    helm.sh/chart: llm-uservice-0.1.0
-    app.kubernetes.io/name: llm-uservice
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.0.0"
-    app.kubernetes.io/managed-by: Helm
+  name: faq-micro-deploy
+  namespace: default
 spec:
   replicas: 1
   selector:
     matchLabels:
-      app.kubernetes.io/name: llm-uservice
-      app.kubernetes.io/instance: faqgen
+      app: faq-micro-deploy
   template:
     metadata:
+      annotations:
+        sidecar.istio.io/rewriteAppHTTPProbers: 'true'
       labels:
-        app.kubernetes.io/name: llm-uservice
-        app.kubernetes.io/instance: faqgen
+        app: faq-micro-deploy
     spec:
-      securityContext: {}
+      hostIPC: true
      containers:
-        - name: faqgen
+        - name: faq-micro-deploy
           env:
             - name: TGI_LLM_ENDPOINT
-              value: "http://faqgen-tgi:80"
+              value: "http://faq-tgi-svc.default.svc.cluster.local:8010"
             - name: HUGGINGFACEHUB_API_TOKEN
               value: "insert-your-huggingface-token-here"
-            - name: http_proxy
-              value:
-            - name: https_proxy
-              value:
-            - name: no_proxy
-              value:
-          securityContext: {}
-          image: "opea/llm-faqgen-tgi:latest"
+          image: opea/llm-faqgen-tgi:latest
           imagePullPolicy: IfNotPresent
+          args: null
           ports:
-            - name: llm-uservice
-              containerPort: 9000
-              protocol: TCP
-          startupProbe:
-            exec:
-              command:
-                - curl
-                - http://faqgen-tgi:80
-            initialDelaySeconds: 5
-            periodSeconds: 5
-            failureThreshold: 120
-          resources: {}
+            - containerPort: 9000
+      serviceAccountName: default
+---
+kind: Service
+apiVersion: v1
+metadata:
+  name: faq-micro-svc
+spec:
+  type: ClusterIP
+  selector:
+    app: faq-micro-deploy
+  ports:
+    - name: service
+      port: 9003
+      targetPort: 9000
 ---
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: faqgen
-  labels:
-    helm.sh/chart: faqgen-0.1.0
-    app.kubernetes.io/name: faqgen
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.0.0"
-    app.kubernetes.io/managed-by: Helm
+  name: faq-mega-server-deploy
+  namespace: default
 spec:
   replicas: 1
   selector:
     matchLabels:
-      app.kubernetes.io/name: faqgen
-      app.kubernetes.io/instance: faqgen
+      app: faq-mega-server-deploy
   template:
     metadata:
+      annotations:
+        sidecar.istio.io/rewriteAppHTTPProbers: 'true'
       labels:
-        app.kubernetes.io/name: faqgen
-        app.kubernetes.io/instance: faqgen
+        app: faq-mega-server-deploy
     spec:
-      securityContext: null
+      hostIPC: true
       containers:
-        - name: faqgen
+        - name: faq-mega-server-deploy
           env:
             - name: LLM_SERVICE_HOST_IP
-              value: faqgen-llm-uservice
-            - name: http_proxy
-              value:
-            - name: https_proxy
-              value:
-            - name: no_proxy
-              value:
-          securityContext: null
-          image: "opea/faqgen:latest"
+              value: faq-micro-svc
+            - name: LLM_SERVICE_PORT
+              value: "9003"
+            - name: MEGA_SERVICE_HOST_IP
+              value: faq-mega-server-svc
+            - name: MEGA_SERVICE_PORT
+              value: "7777"
+          image: opea/faqgen:latest
           imagePullPolicy: IfNotPresent
+          args: null
           ports:
-            - name: faqgen
-              containerPort: 8888
-              protocol: TCP
-          resources: null
+            - containerPort: 7777
+      serviceAccountName: default
+---
+kind: Service
+apiVersion: v1
+metadata:
+  name: faq-mega-server-svc
+spec:
+  type: NodePort
+  selector:
+    app: faq-mega-server-deploy
+  ports:
+    - name: service
+      port: 7779
+      targetPort: 7777
+      nodePort: 30779
```
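The renamed Services wire the tiers together through in-cluster DNS; the `TGI_LLM_ENDPOINT` value in the manifest follows the standard Kubernetes service-name form, which can be sketched as:

```python
# Build the cluster-internal URL for a Service, as used for TGI_LLM_ENDPOINT.
def svc_url(name: str, port: int, namespace: str = "default") -> str:
    return f"http://{name}.{namespace}.svc.cluster.local:{port}"

print(svc_url("faq-tgi-svc", 8010))
# -> http://faq-tgi-svc.default.svc.cluster.local:8010 (the value in the manifest)
```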

‎FaqGen/kubernetes/manifests/ui.yaml

+46 -0

```diff
@@ -0,0 +1,46 @@
+---
+# Copyright (C) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: faq-mega-ui-deploy
+  namespace: default
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: faq-mega-ui-deploy
+  template:
+    metadata:
+      annotations:
+        sidecar.istio.io/rewriteAppHTTPProbers: 'true'
+      labels:
+        app: faq-mega-ui-deploy
+    spec:
+      hostIPC: true
+      containers:
+        - name: faq-mega-ui-deploy
+          env:
+            - name: DOC_BASE_URL
+              value: http://{insert_your_ip_here}:7779/v1/faqgen
+          image: opea/faqgen-ui:latest
+          imagePullPolicy: IfNotPresent
+          args: null
+          ports:
+            - containerPort: 5173
+      serviceAccountName: default
+---
+kind: Service
+apiVersion: v1
+metadata:
+  name: faq-mega-ui-svc
+spec:
+  type: NodePort
+  selector:
+    app: faq-mega-ui-deploy
+  ports:
+    - name: service
+      port: 5175
+      targetPort: 5173
+      nodePort: 30175
```
+111 -162

```diff
@@ -1,216 +1,165 @@
 ---
-# Source: codegen/charts/llm-uservice/charts/tgi/templates/service.yaml
 # Copyright (C) 2024 Intel Corporation
 # SPDX-License-Identifier: Apache-2.0
 
-apiVersion: v1
-kind: Service
-metadata:
-  name: faqgen-tgi
-  labels:
-    helm.sh/chart: tgi-0.1.0
-    app.kubernetes.io/name: tgi
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.4"
-    app.kubernetes.io/managed-by: Helm
-spec:
-  type: ClusterIP
-  ports:
-    - port: 80
-      targetPort: 80
-      protocol: TCP
-      name: tgi
-  selector:
-    app.kubernetes.io/name: tgi
-    app.kubernetes.io/instance: faqgen
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: faqgen-llm-uservice
-  labels:
-    helm.sh/chart: llm-uservice-0.1.0
-    app.kubernetes.io/name: llm-uservice
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.0.0"
-    app.kubernetes.io/managed-by: Helm
-spec:
-  type: ClusterIP
-  ports:
-    - port: 9000
-      targetPort: 9000
-      protocol: TCP
-      name: llm-uservice
-  selector:
-    app.kubernetes.io/name: llm-uservice
-    app.kubernetes.io/instance: faqgen
----
-apiVersion: v1
-kind: Service
-metadata:
-  name: faqgen
-  labels:
-    helm.sh/chart: faqgen-0.1.0
-    app.kubernetes.io/name: faqgen
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.0.0"
-    app.kubernetes.io/managed-by: Helm
-spec:
-  type: ClusterIP
-  ports:
-    - port: 8888
-      targetPort: 8888
-      protocol: TCP
-      name: faqgen
-  selector:
-    app.kubernetes.io/name: faqgen
-    app.kubernetes.io/instance: faqgen
----
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: faqgen-tgi
-  labels:
-    helm.sh/chart: tgi-0.1.0
-    app.kubernetes.io/name: tgi
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.4"
-    app.kubernetes.io/managed-by: Helm
+  name: faq-tgi-cpu-deploy
+  namespace: default
 spec:
   replicas: 1
   selector:
     matchLabels:
-      app.kubernetes.io/name: tgi
-      app.kubernetes.io/instance: faqgen
+      app: faq-tgi-cpu-deploy
   template:
     metadata:
+      annotations:
+        sidecar.istio.io/rewriteAppHTTPProbers: 'true'
       labels:
-        app.kubernetes.io/name: tgi
-        app.kubernetes.io/instance: faqgen
+        app: faq-tgi-cpu-deploy
     spec:
+      hostIPC: true
       securityContext: {}
       containers:
-        - name: tgi
-          env:
-            - name: MODEL_ID
-              value: Intel/neural-chat-7b-v3-3
-            - name: PORT
-              value: "80"
-            - name: http_proxy
-              value:
-            - name: https_proxy
-              value:
-            - name: no_proxy
-              value:
-          securityContext: {}
-          image: "ghcr.io/huggingface/text-generation-inference:1.4"
-          imagePullPolicy: IfNotPresent
-          volumeMounts:
-            - mountPath: /data
-              name: model-volume
-          ports:
-            - name: http
-              containerPort: 80
-              protocol: TCP
-          resources: {}
+        - name: faq-tgi-cpu-deploy-demo
+          env:
+            - name: HUGGING_FACE_HUB_TOKEN
+              value: "insert-your-huggingface-token-here"
+            - name: PORT
+              value: "80"
+          image: ghcr.io/huggingface/text-generation-inference:1.4
+          imagePullPolicy: IfNotPresent
+          securityContext: {}
+          args:
+            - --model-id
+            - 'meta-llama/Meta-Llama-3-8B-Instruct'
+            - --max-input-length
+            - '3096'
+            - --max-total-tokens
+            - '4096'
+          volumeMounts:
+            - mountPath: /data
+              name: model-volume
+            - mountPath: /dev/shm
+              name: shm
+          ports:
+            - containerPort: 80
+      serviceAccountName: default
       volumes:
-        - name: model-volume
-          hostPath:
-            path: /mnt
-            type: Directory
+        - name: model-volume
+          hostPath:
+            path: /home/sdp/cesg
+            type: Directory
+        - name: shm
+          emptyDir:
+            medium: Memory
+            sizeLimit: 1Gi
+---
+kind: Service
+apiVersion: v1
+metadata:
+  name: faq-tgi-cpu-svc
+spec:
+  type: ClusterIP
+  selector:
+    app: faq-tgi-cpu-deploy
+  ports:
+    - name: service
+      port: 8011
+      targetPort: 80
 ---
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: faqgen-llm-uservice
-  labels:
-    helm.sh/chart: llm-uservice-0.1.0
-    app.kubernetes.io/name: llm-uservice
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.0.0"
-    app.kubernetes.io/managed-by: Helm
+  name: faq-micro-cpu-deploy
+  namespace: default
 spec:
   replicas: 1
   selector:
     matchLabels:
-      app.kubernetes.io/name: llm-uservice
-      app.kubernetes.io/instance: faqgen
+      app: faq-micro-cpu-deploy
   template:
     metadata:
+      annotations:
+        sidecar.istio.io/rewriteAppHTTPProbers: 'true'
       labels:
-        app.kubernetes.io/name: llm-uservice
-        app.kubernetes.io/instance: faqgen
+        app: faq-micro-cpu-deploy
     spec:
-      securityContext: {}
+      hostIPC: true
       containers:
-        - name: faqgen
+        - name: faq-micro-cpu-deploy
           env:
             - name: TGI_LLM_ENDPOINT
-              value: "http://faqgen-tgi:80"
+              value: "http://faq-tgi-cpu-svc.default.svc.cluster.local:8011"
             - name: HUGGINGFACEHUB_API_TOKEN
               value: "insert-your-huggingface-token-here"
-            - name: http_proxy
-              value:
-            - name: https_proxy
-              value:
-            - name: no_proxy
-              value:
-          securityContext: {}
-          image: "opea/llm-faqgen-tgi:latest"
+          image: opea/llm-faqgen-tgi:latest
           imagePullPolicy: IfNotPresent
+          args: null
           ports:
-            - name: llm-uservice
-              containerPort: 9000
-              protocol: TCP
-          startupProbe:
-            exec:
-              command:
-                - curl
-                - http://faqgen-tgi:80
-            initialDelaySeconds: 5
-            periodSeconds: 5
-            failureThreshold: 120
-          resources: {}
+            - containerPort: 9000
+      serviceAccountName: default
+---
+kind: Service
+apiVersion: v1
+metadata:
+  name: faq-micro-cpu-svc
+spec:
+  type: ClusterIP
+  selector:
+    app: faq-micro-cpu-deploy
+  ports:
+    - name: service
+      port: 9004
+      targetPort: 9000
 ---
 apiVersion: apps/v1
 kind: Deployment
 metadata:
-  name: faqgen
-  labels:
-    helm.sh/chart: faqgen-0.1.0
-    app.kubernetes.io/name: faqgen
-    app.kubernetes.io/instance: faqgen
-    app.kubernetes.io/version: "1.0.0"
-    app.kubernetes.io/managed-by: Helm
+  name: faq-mega-server-cpu-deploy
+  namespace: default
 spec:
   replicas: 1
   selector:
     matchLabels:
-      app.kubernetes.io/name: faqgen
-      app.kubernetes.io/instance: faqgen
+      app: faq-mega-server-cpu-deploy
   template:
     metadata:
+      annotations:
+        sidecar.istio.io/rewriteAppHTTPProbers: 'true'
       labels:
-        app.kubernetes.io/name: faqgen
-        app.kubernetes.io/instance: faqgen
+        app: faq-mega-server-cpu-deploy
     spec:
-      securityContext: null
+      hostIPC: true
       containers:
-        - name: faqgen
+        - name: faq-mega-server-cpu-deploy
           env:
             - name: LLM_SERVICE_HOST_IP
-              value: faqgen-llm-uservice
-            - name: http_proxy
-              value:
-            - name: https_proxy
-              value:
-            - name: no_proxy
-              value:
-          securityContext: null
-          image: "opea/faqgen:latest"
+              value: faq-micro-cpu-svc
+            - name: LLM_SERVICE_PORT
+              value: "9004"
+            - name: MEGA_SERVICE_HOST_IP
+              value: faq-mega-server-cpu-svc
+            - name: MEGA_SERVICE_PORT
+              value: "7777"
+          image: opea/faqgen:latest
           imagePullPolicy: IfNotPresent
+          args: null
           ports:
-            - name: faqgen
-              containerPort: 8888
-              protocol: TCP
-          resources: null
+            - containerPort: 7777
+      serviceAccountName: default
+---
+kind: Service
+apiVersion: v1
+metadata:
+  name: faq-mega-server-cpu-svc
+spec:
+  type: NodePort
+  selector:
+    app: faq-mega-server-cpu-deploy
+  ports:
+    - name: service
+      port: 7778
+      targetPort: 7777
+      nodePort: 30778
```
