Support Embedding Microservice with Llama Index (opea-project#150)
* fix stream=false doesn't work issue
* support embedding comp with llama_index
* [pre-commit.ci] auto fixes from pre-commit.com hooks (for more information, see https://pre-commit.ci)
* Add More Contents to the Table of MicroService (opea-project#141)
  * Add More Contents to the Table of MicroService
  * reorder
  * Update README.md
  * refine structure
  * fix model
  * refine table
  * put llm to the ground
* Use common security content for OPEA projects (opea-project#151)
  * add python coverage
  * docs update
  * Revert "add python coverage" (reverts commit 69615b1)
* Enable vLLM Gaudi support for LLM service based on the official Habana vLLM release (opea-project#137)
* add test script for embedding llama_index
* remove conflict requirements
* update test script
* update
* fix ut issue

Signed-off-by: letonghan <[email protected]>
Signed-off-by: zehao-intel <[email protected]>
Signed-off-by: chensuyue <[email protected]>
Signed-off-by: tianyil1 <[email protected]>
Signed-off-by: sharanshirodkar7 <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: zehao-intel <[email protected]>
Co-authored-by: Sihan Chen <[email protected]>
Co-authored-by: chen, suyue <[email protected]>
Co-authored-by: Tianyi Liu <[email protected]>
1 parent fae330b · commit c3a0dca

Showing 8 changed files with 217 additions and 1 deletion.
New file (+2 lines):

```python
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
```
New file: comps/embeddings/llama_index/docker/Dockerfile (+30 lines)

```dockerfile
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

FROM ubuntu:22.04

RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
    libgl1-mesa-glx \
    libjemalloc-dev \
    vim \
    python3 \
    python3-pip

RUN useradd -m -s /bin/bash user && \
    mkdir -p /home/user && \
    chown -R user /home/user/

USER user

COPY comps /home/user/comps

RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r /home/user/comps/embeddings/llama_index/requirements.txt

ENV PYTHONPATH=$PYTHONPATH:/home/user

WORKDIR /home/user/comps/embeddings/llama_index

ENTRYPOINT ["python3", "embedding_tei_gaudi.py"]
```
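To build this image, the test script at the end of this commit uses a command of the following shape, run from the repository root; the `latest` tag here is chosen to match what the compose file below expects:

```bash
# Build the embedding microservice image from the GenAIComps repo root.
docker build -t opea/embedding-tei:latest \
  --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy \
  -f comps/embeddings/llama_index/docker/Dockerfile .
```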
New file: comps/embeddings/llama_index/docker/docker_compose_embedding.yaml (+23 lines)
```yaml
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

version: "3.8"

services:
  embedding:
    image: opea/embedding-tei:latest
    container_name: embedding-tei-server
    ports:
      - "6000:6000"
    ipc: host
    environment:
      http_proxy: ${http_proxy}
      https_proxy: ${https_proxy}
      TEI_EMBEDDING_ENDPOINT: ${TEI_EMBEDDING_ENDPOINT}
      TEI_EMBEDDING_MODEL_NAME: ${TEI_EMBEDDING_MODEL_NAME}
      LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
    restart: unless-stopped

networks:
  default:
    driver: bridge
```
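A minimal sketch of bringing the service up with this compose file; the endpoint and model values are only illustrative defaults, taken from `embedding_tei_gaudi.py` below:

```bash
# Point the microservice at a running TEI (text-embeddings-inference) server.
export TEI_EMBEDDING_ENDPOINT="http://localhost:8090"
export TEI_EMBEDDING_MODEL_NAME="BAAI/bge-large-en-v1.5"
docker compose -f comps/embeddings/llama_index/docker/docker_compose_embedding.yaml up -d
```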
New file: comps/embeddings/llama_index/embedding_tei_gaudi.py (+34 lines; the path follows from the Dockerfile's WORKDIR and ENTRYPOINT)

```python
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

import os

from langsmith import traceable
from llama_index.embeddings.text_embeddings_inference import TextEmbeddingsInference

from comps import EmbedDoc768, ServiceType, TextDoc, opea_microservices, register_microservice


@register_microservice(
    name="opea_service@embedding_tgi_gaudi",
    service_type=ServiceType.EMBEDDING,
    endpoint="/v1/embeddings",
    host="0.0.0.0",
    port=6000,
    input_datatype=TextDoc,
    output_datatype=EmbedDoc768,
)
@traceable(run_type="embedding")
def embedding(input: TextDoc) -> EmbedDoc768:
    embed_vector = embeddings._get_query_embedding(input.text)
    embed_vector = embed_vector[:768]  # Keep only the first 768 elements
    res = EmbedDoc768(text=input.text, embedding=embed_vector)
    return res


if __name__ == "__main__":
    tei_embedding_model_name = os.getenv("TEI_EMBEDDING_MODEL_NAME", "BAAI/bge-large-en-v1.5")
    tei_embedding_endpoint = os.getenv("TEI_EMBEDDING_ENDPOINT", "http://localhost:8090")
    embeddings = TextEmbeddingsInference(model_name=tei_embedding_model_name, base_url=tei_embedding_endpoint)
    print("TEI Gaudi Embedding initialized.")
    opea_microservices["opea_service@embedding_tgi_gaudi"].start()
```
New file (+28 lines), the local embedding variant of the service:

```python
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

from langsmith import traceable
from llama_index.embeddings.huggingface import HuggingFaceEmbedding

from comps import EmbedDoc1024, ServiceType, TextDoc, opea_microservices, register_microservice


@register_microservice(
    name="opea_service@local_embedding",
    service_type=ServiceType.EMBEDDING,
    endpoint="/v1/embeddings",
    host="0.0.0.0",
    port=6000,
    input_datatype=TextDoc,
    output_datatype=EmbedDoc1024,
)
@traceable(run_type="embedding")
def embedding(input: TextDoc) -> EmbedDoc1024:
    embed_vector = embeddings.get_text_embedding(input.text)
    res = EmbedDoc1024(text=input.text, embedding=embed_vector)
    return res


if __name__ == "__main__":
    embeddings = HuggingFaceEmbedding(model_name="BAAI/bge-large-en-v1.5")
    opea_microservices["opea_service@local_embedding"].start()
```
New file: comps/embeddings/llama_index/requirements.txt (+9 lines)

```text
docarray[full]
fastapi
huggingface_hub
langsmith
llama-index-embeddings-text-embeddings-inference
opentelemetry-api
opentelemetry-exporter-otlp
opentelemetry-sdk
shortuuid
```
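Outside Docker, the same dependencies can be installed directly, mirroring the `RUN pip install` step in the Dockerfile above:

```bash
pip install --no-cache-dir --upgrade pip
pip install --no-cache-dir -r comps/embeddings/llama_index/requirements.txt
```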
New file (+70 lines), the CI test script for the llama_index embedding service:

```bash
#!/bin/bash
# Copyright (C) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

set -xe

WORKPATH=$(dirname "$PWD")
LOG_PATH="$WORKPATH/tests"
ip_address=$(hostname -I | awk '{print $1}')

function build_docker_images() {
    cd $WORKPATH
    echo $(pwd)
    docker build --no-cache -t opea/embedding-tei:comps --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/embeddings/llama_index/docker/Dockerfile .
}

function start_service() {
    tei_endpoint=5001
    model="BAAI/bge-large-en-v1.5"
    revision="refs/pr/5"
    # Start the TEI backend, then the microservice under test pointing at it.
    docker run -d --name="test-comps-embedding-tei-endpoint" -p $tei_endpoint:80 -v ./data:/data -e http_proxy=$http_proxy -e https_proxy=$https_proxy --pull always ghcr.io/huggingface/text-embeddings-inference:cpu-1.2 --model-id $model --revision $revision
    export TEI_EMBEDDING_ENDPOINT="http://${ip_address}:${tei_endpoint}"
    tei_service_port=5010
    docker run -d --name="test-comps-embedding-tei-server" -e http_proxy=$http_proxy -e https_proxy=$https_proxy -p ${tei_service_port}:6000 --ipc=host -e TEI_EMBEDDING_ENDPOINT=$TEI_EMBEDDING_ENDPOINT opea/embedding-tei:comps
    sleep 3m
}

function validate_microservice() {
    tei_service_port=5010
    URL="http://${ip_address}:$tei_service_port/v1/embeddings"
    docker logs test-comps-embedding-tei-server >> ${LOG_PATH}/embedding.log
    HTTP_STATUS=$(curl -s -o /dev/null -w "%{http_code}" -X POST -d '{"text":"What is Deep Learning?"}' -H 'Content-Type: application/json' "$URL")
    if [ "$HTTP_STATUS" -eq 200 ]; then
        echo "[ embedding - llama_index ] HTTP status is 200. Checking content..."
        local CONTENT=$(curl -s -X POST -d '{"text":"What is Deep Learning?"}' -H 'Content-Type: application/json' "$URL" | tee ${LOG_PATH}/embedding.log)

        # Verify the response body contains the expected text and embedding fields.
        if echo "$CONTENT" | grep -q '"text":"What is Deep Learning?","embedding":\['; then
            echo "[ embedding - llama_index ] Content is as expected."
        else
            echo "[ embedding - llama_index ] Content does not match the expected result: $CONTENT"
            docker logs test-comps-embedding-tei-server >> ${LOG_PATH}/embedding.log
            exit 1
        fi
    else
        echo "[ embedding - llama_index ] HTTP status is not 200. Received status was $HTTP_STATUS"
        docker logs test-comps-embedding-tei-server >> ${LOG_PATH}/embedding.log
        exit 1
    fi
}

function stop_docker() {
    cid=$(docker ps -aq --filter "name=test-comps-embedding-*")
    if [[ ! -z "$cid" ]]; then docker stop $cid && docker rm $cid && sleep 1s; fi
}

function main() {

    stop_docker

    build_docker_images
    start_service

    validate_microservice

    stop_docker
    echo y | docker system prune

}

main
```
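A sketch of running it locally; the script's filename is an assumption (this view does not show it), and since `WORKPATH` resolves to the parent of the current directory, the script should be launched from the repository's `tests` directory:

```bash
cd tests
bash test_embeddings_llama_index.sh   # hypothetical filename
```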