Update readme and remove empty readme (#396)
* remove habana

Signed-off-by: lvliang-intel <[email protected]>

* Update README and remove empty README

Signed-off-by: lvliang-intel <[email protected]>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Signed-off-by: lvliang-intel <[email protected]>
Co-authored-by: Sihan Chen <[email protected]>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
3 people authored Aug 6, 2024
1 parent da19c5d commit a61e434
Showing 5 changed files with 135 additions and 4 deletions.
Empty file removed comps/cores/mega/README.md
96 changes: 96 additions & 0 deletions comps/llms/summarization/tgi/README.md
@@ -0,0 +1,96 @@
# Document Summary TGI Microservice

In this microservice, we utilize LangChain to implement summarization strategies and facilitate LLM inference using Text Generation Inference on Intel Xeon and Gaudi2 processors.
[Text Generation Inference](https://github.com/huggingface/text-generation-inference) (TGI) is a toolkit for deploying and serving Large Language Models (LLMs). TGI enables high-performance text generation for the most popular open-source LLMs, including Llama, Falcon, StarCoder, BLOOM, GPT-NeoX, and more.
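
Under the hood, the microservice pairs a LangChain summarization chain with a TGI-backed LLM. The sketch below illustrates the pattern rather than the exact code in `llm.py`; it assumes a local TGI endpoint on port 8008, and import paths may differ across LangChain versions.

```python
# Illustrative sketch: LangChain map-reduce summarization over a TGI endpoint.
from langchain.chains.summarize import load_summarize_chain
from langchain.docstore.document import Document
from langchain_community.llms import HuggingFaceEndpoint

# Point LangChain at the TGI server started in section 1.2.
llm = HuggingFaceEndpoint(
    endpoint_url="http://localhost:8008",
    max_new_tokens=512,
)

chain = load_summarize_chain(llm, chain_type="map_reduce")
docs = [Document(page_content="Long document text to summarize ...")]
result = chain.invoke({"input_documents": docs})
print(result["output_text"])
```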

# 🚀1. Start Microservice with Python (Option 1)

To start the LLM microservice, you first need to install the required Python packages.

## 1.1 Install Requirements

```bash
pip install -r requirements.txt
```

## 1.2 Start LLM Service

```bash
export HF_TOKEN=${your_hf_api_token}
docker run -p 8008:80 -v ./data:/data -e HF_TOKEN=${HF_TOKEN} --name llm-docsum-tgi --shm-size 1g ghcr.io/huggingface/text-generation-inference:2.1.0 --model-id ${your_hf_llm_model}
```
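
On first start, TGI downloads the model weights, which can take several minutes; you can follow progress in the container logs:

```bash
docker logs -f llm-docsum-tgi
```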

## 1.3 Verify the TGI Service

```bash
curl http://${your_ip}:8008/generate \
-X POST \
-d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
-H 'Content-Type: application/json'
```
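
If TGI is serving correctly, it returns a JSON body along these lines (the generated text itself will vary):

```json
{"generated_text": " Deep Learning is a subset of machine learning that uses neural networks"}
```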

## 1.4 Start LLM Service with Python Script

```bash
export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
python llm.py
```

# 🚀2. Start Microservice with Docker (Option 2)

If you start the LLM microservice with Docker, the `docker_compose_llm.yaml` file will automatically start a TGI service in Docker as well.

## 2.1 Setup Environment Variables

To start the TGI and LLM services, you need to set up the following environment variables first.

```bash
export HF_TOKEN=${your_hf_api_token}
export TGI_LLM_ENDPOINT="http://${your_ip}:8008"
export LLM_MODEL_ID=${your_hf_llm_model}
```
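
For example (illustrative values only; substitute your own token, host IP, and model):

```bash
export HF_TOKEN="hf_..."                          # your Hugging Face access token
export TGI_LLM_ENDPOINT="http://192.168.1.2:8008" # host running the TGI service
export LLM_MODEL_ID="Intel/neural-chat-7b-v3-3"   # any TGI-supported model
```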

## 2.2 Build Docker Image

```bash
cd ../../
docker build -t opea/llm-docsum-tgi:latest --build-arg https_proxy=$https_proxy --build-arg http_proxy=$http_proxy -f comps/llms/summarization/tgi/Dockerfile .
```

To start a Docker container, you have two options:

- A. Run Docker with CLI
- B. Run Docker with Docker Compose

You can choose one as needed.

## 2.3 Run Docker with CLI (Option A)

```bash
docker run -d --name="llm-docsum-tgi-server" -p 9000:9000 --ipc=host -e http_proxy=$http_proxy -e https_proxy=$https_proxy -e TGI_LLM_ENDPOINT=$TGI_LLM_ENDPOINT -e HF_TOKEN=$HF_TOKEN opea/llm-docsum-tgi:latest
```

## 2.4 Run Docker with Docker Compose (Option B)

```bash
docker compose -f docker_compose_llm.yaml up -d
```
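
To confirm both containers started cleanly, check their status and logs:

```bash
docker ps --filter "name=llm-docsum-tgi-server"
docker logs llm-docsum-tgi-server
```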

# 🚀3. Consume LLM Service

## 3.1 Check Service Status

```bash
curl http://${your_ip}:9000/v1/health_check \
-X GET \
-H 'Content-Type: application/json'
```

## 3.2 Consume LLM Service

```bash
curl http://${your_ip}:9000/v1/chat/docsum \
-X POST \
-d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
-H 'Content-Type: application/json'
```
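
The same request can also be issued from Python, for example with `requests` (assuming the service is reachable on localhost as configured above):

```python
# Equivalent of the curl call above; the response format depends on the service.
import requests

url = "http://localhost:9000/v1/chat/docsum"
payload = {
    "query": "Text Embeddings Inference (TEI) is a toolkit for deploying "
    "and serving open source text embeddings and sequence classification models."
}
resp = requests.post(url, json=payload, timeout=120)
resp.raise_for_status()
print(resp.text)
```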
7 changes: 3 additions & 4 deletions comps/llms/summarization/tgi/docker_compose_llm.yaml
@@ -5,7 +5,7 @@ version: "3.8"

 services:
   tgi_service:
-    image: ghcr.io/huggingface/text-generation-inference:1.4
+    image: ghcr.io/huggingface/text-generation-inference:2.1.0
     container_name: tgi-service
     ports:
       - "8008:80"
@@ -16,8 +16,8 @@ services:
     shm_size: 1g
     command: --model-id ${LLM_MODEL_ID}
   llm:
-    image: opea/gen-ai-comps:llm-tgi-server
-    container_name: llm-tgi-server
+    image: opea/llm-docsum-tgi:latest
+    container_name: llm-docsum-tgi-server
     ports:
       - "9000:9000"
     ipc: host
@@ -27,7 +27,6 @@ services:
       https_proxy: ${https_proxy}
       TGI_LLM_ENDPOINT: ${TGI_LLM_ENDPOINT}
       HUGGINGFACEHUB_API_TOKEN: ${HF_TOKEN}
-      LANGCHAIN_API_KEY: ${LANGCHAIN_API_KEY}
     restart: unless-stopped

 networks:
36 changes: 36 additions & 0 deletions comps/vectorstores/langchain/chroma/README.md
@@ -0,0 +1,36 @@
# Introduction

Chroma is an AI-native open-source vector database focused on developer productivity and happiness. Chroma is licensed under Apache 2.0. Chroma runs in various modes; you can deploy it as a server running on your local machine or in the cloud.

# Getting Started

## Start Chroma Server

To start the Chroma server on your local machine, follow these steps:

```bash
git clone https://github.com/chroma-core/chroma.git
cd chroma
docker compose up -d
```

## Startup Log Output

Upon starting the server, you should see log output similar to the following:

```log
server-1 | Starting 'uvicorn chromadb.app:app' with args: --workers 1 --host 0.0.0.0 --port 8000 --proxy-headers --log-config chromadb/log_config.yml --timeout-keep-alive 30
server-1 | INFO: [02-08-2024 07:03:19] Set chroma_server_nofile to 65536
server-1 | INFO: [02-08-2024 07:03:19] Anonymized telemetry enabled. See https://docs.trychroma.com/telemetry for more information.
server-1 | DEBUG: [02-08-2024 07:03:19] Starting component System
server-1 | DEBUG: [02-08-2024 07:03:19] Starting component OpenTelemetryClient
server-1 | DEBUG: [02-08-2024 07:03:19] Starting component SqliteDB
server-1 | DEBUG: [02-08-2024 07:03:19] Starting component QuotaEnforcer
server-1 | DEBUG: [02-08-2024 07:03:19] Starting component Posthog
server-1 | DEBUG: [02-08-2024 07:03:19] Starting component LocalSegmentManager
server-1 | DEBUG: [02-08-2024 07:03:19] Starting component SegmentAPI
server-1 | INFO: [02-08-2024 07:03:19] Started server process [1]
server-1 | INFO: [02-08-2024 07:03:19] Waiting for application startup.
server-1 | INFO: [02-08-2024 07:03:19] Application startup complete.
server-1 | INFO: [02-08-2024 07:03:19] Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)
```
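
Once the server is up, you can also verify connectivity from Python with the `chromadb` client (`pip install chromadb`). A minimal sketch, assuming the default port 8000:

```python
# Connect to the Chroma server started above and run a round-trip sanity check.
import chromadb

client = chromadb.HttpClient(host="localhost", port=8000)
print(client.heartbeat())  # returns a nanosecond timestamp if the server is reachable

collection = client.get_or_create_collection("demo")
collection.add(ids=["1"], documents=["Chroma is an AI-native vector database."])
print(collection.query(query_texts=["vector database"], n_results=1))
```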
