diff --git a/comps/cores/mega/README.md b/comps/cores/mega/README.md
deleted file mode 100644
index e69de29bb2..0000000000
diff --git a/comps/knowledgegraphs/langchain/README.md b/comps/knowledgegraphs/langchain/README.md
deleted file mode 100755
index e69de29bb2..0000000000
diff --git a/comps/llms/summarization/tgi/README.md b/comps/llms/summarization/tgi/README.md
index e69de29bb2..9e5858b4b7 100644
--- a/comps/llms/summarization/tgi/README.md
+++ b/comps/llms/summarization/tgi/README.md
@@ -0,0 +1,96 @@
+# Document Summary TGI Microservice
+
+In this microservice, we utilize LangChain to implement summarization strategies and facilitate LLM inference using Text Generation Inference on Intel Xeon and Gaudi2 processors.
+[Text Generation Inference](https://github.com/huggingface/text-generation-inference) (TGI) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and more.
+
+# 🚀1. Start Microservice with Python (Option 1)
+
+To start the LLM microservice, you need to install the required Python packages first.
+
+## 1.1 Install Requirements
+
+```bash
+pip install -r requirements.txt
+```
+
+## 1.2 Start LLM Service
+
+```bash
+export HF_TOKEN=${your_hf_api_token}
+docker run -p 8008:80 -v ./data:/data --name llm-docsum-tgi --shm-size 1g ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id ${your_hf_llm_model}
+```
+
+## 1.3 Verify the TGI Service
+
+```bash
+curl http://${your_ip}:8008/generate \
+  -X POST \
+  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
+  -H 'Content-Type: application/json'
+```
+
+## 1.4 Start LLM Service with Python Script
+
+```bash
+export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
+python llm.py
+```
+
+# 🚀2. Start Microservice with Docker (Option 2)
+
+If you start the LLM microservice with Docker, the `docker_compose_llm.yaml` file will automatically start a TGI service with Docker.
+
+## 2.1 Setup Environment Variables
+
+To start the TGI and LLM services, you need to set up the following environment variables first.
+
+```bash
+export HF_TOKEN=${your_hf_api_token}
+export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
+export LLM_MODEL_ID=${your_hf_llm_model}
+```
+
+## 2.2 Build Docker Image
+
+```bash
+cd ../../
+docker build -t opea/llm-docsum-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/summarization/tgi/Dockerfile .
+```
+
+To start a Docker container, you have two options:
+
+- A. Run Docker with CLI
+- B. Run Docker with Docker Compose
+
+You can choose one as needed.
+
+## 2.3 Run Docker with CLI (Option A)
+
+```bash
+docker run -d --name="llm-docsum-tgi-server" -p 9000:9000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TGI_LLM_ENDPOINT=$TGI_LLM_ENDPOINT -e HF_TOKEN=$HF_TOKEN opea/llm-docsum-tgi:latest
+```
+
+## 2.4 Run Docker with Docker Compose (Option B)
+
+```bash
+docker compose -f docker_compose_llm.yaml up -d
+```
+
+# 🚀3. Consume LLM Service
+
+## 3.1 Check Service Status
+
+```bash
+curl http://${your_ip}:9000/v1/health_check \
+  -X GET \
+  -H 'Content-Type: application/json'
+```
+
+## 3.2 Consume LLM Service
+
+```bash
+curl http://${your_ip}:9000/v1/chat/docsum \
+  -X POST \
+  -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
+  -H 'Content-Type: application/json'
+```
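For longer inputs, the `query` field of the `/v1/chat/docsum` request above can be built from a local file instead of an inline string. This is a minimal sketch, assuming `jq` is installed on the client; `my_document.txt` and `${your_ip}` are placeholders.

```bash
# Minimal sketch: build the /v1/chat/docsum payload from a local file.
# Assumes jq is installed; my_document.txt is a placeholder path.
DOC_TEXT=$(cat my_document.txt)
curl http://${your_ip}:9000/v1/chat/docsum \
  -X POST \
  -H 'Content-Type: application/json' \
  -d "$(jq -n --arg query "$DOC_TEXT" '{query: $query}')"
```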
diff --git a/comps/llms/summarization/tgi/docker_compose_llm.yaml b/comps/llms/summarization/tgi/docker_compose_llm.yaml
index 41ae5d0768..2b517333e1 100644
--- a/comps/llms/summarization/tgi/docker_compose_llm.yaml
+++ b/comps/llms/summarization/tgi/docker_compose_llm.yaml
@@ -5,7 +5,7 @@ version: "3.8"
 
 services:
   tgi_service:
-    image: ghcr.io/huggingface/text-generation-inference:1.4
+    image: ghcr.io/huggingface/text-generation-inference:2.1.0
     container_name: tgi-service
     ports:
       - "8008:80"
@@ -16,8 +16,8 @@ services:
     shm_size: 1g
     command: --model-id ${LLM_MODEL_ID}
   llm:
-    image: opea/gen-ai-comps:llm-tgi-server
-    container_name: llm-tgi-server
+    image: opea/llm-docsum-tgi:latest
+    container_name: llm-docsum-tgi-server
     ports:
       - "9000:9000"
     ipc: host
@@ -27,7 +27,6 @@ services:
      https_proxy: ${https_proxy}
      TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
      HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN}
-      LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
    restart: unless-stopped
 
 networks:
diff --git a/comps/vectorstores/langchain/chroma/README.md b/comps/vectorstores/langchain/chroma/README.md
index e69de29bb2..d7399b8fb7 100644
--- a/comps/vectorstores/langchain/chroma/README.md
+++ b/comps/vectorstores/langchain/chroma/README.md
@@ -0,0 +1,36 @@
+# Introduction
+
+Chroma is an AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0. Chroma runs in various modes; it can be deployed as a server running on your local machine or in the cloud.
+
+# Getting Started
+
+## Start Chroma Server
+
+To start the Chroma server on your local machine, follow these steps:
+
+```bash
+git clone https://github.com/chroma-core/chroma.git
+cd chroma
+docker compose up -d
+```
+
+## Expected Log Output
+
+Upon starting the server, you should see log output similar to the following:
+
+```log
+server-1 | Starting 'uvicorn chromadb.app:app' with args: --workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30
+server-1 | INFO: [02-08-2024 07:03:19] Set chroma_server_nofile to 65536
+server-1 | INFO: [02-08-2024 07:03:19] Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
+server-1 | DEBUG: [02-08-2024 07:03:19] Starting component System
+server-1 | DEBUG: [02-08-2024 07:03:19] Starting component OpenTelemetryClient
+server-1 | DEBUG: [02-08-2024 07:03:19] Starting component SqliteDB
+server-1 | DEBUG: [02-08-2024 07:03:19] Starting component QuotaEnforcer
+server-1 | DEBUG: [02-08-2024 07:03:19] Starting component Posthog
+server-1 | DEBUG: [02-08-2024 07:03:19] Starting component LocalSegmentManager
+server-1 | DEBUG: [02-08-2024 07:03:19] Starting component SegmentAPI
+server-1 | INFO: [02-08-2024 07:03:19] Started server process [1]
+server-1 | INFO: [02-08-2024 07:03:19] Waiting for application startup.
+server-1 | INFO: [02-08-2024 07:03:19] Application startup complete.
+server-1 | INFO: [02-08-2024 07:03:19] Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
+```
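Once `Application startup complete.` appears in the log above, the Chroma server can be sanity-checked from the host. This is a minimal sketch assuming the default port 8000 and Chroma's v1 REST heartbeat endpoint; the exact path may differ between Chroma releases.

```bash
# Minimal sketch: confirm the Chroma server is reachable on the default port.
# The heartbeat path follows Chroma's v1 REST API and may vary by release.
curl http://localhost:8000/api/v1/heartbeat
```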