
Support Long context for DocSum #1255

Merged: 24 commits merged into opea-project:main on Dec 20, 2024
Conversation

@XinyaoWa (Collaborator) commented Dec 17, 2024

Description

Support Long context for DocSum with five modes (a minimal sketch follows the list):
• Auto (default mode): use "stuff" mode if the input token count is below max_input_tokens, otherwise switch to "refine" mode
• Stuff: pass the actual input tokens through as-is; increase max_input_tokens if you want to use a large context
• Truncate: truncate the tokens that exceed the limit
• Map_reduce: split the input into multiple chunks, map each chunk to an individual summary, then consolidate those summaries into a single global summary
• Refine: split the input into multiple chunks, generate a summary for the first one, combine it with the second, then loop over every remaining chunk to get the final summary
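
A minimal, self-contained sketch of the five modes in Python. The helpers here (count_tokens, split_into_chunks, llm_summarize) are hypothetical stand-ins, not the actual DocSum implementation: token counting is a whitespace approximation, and llm_summarize is a stub where a real deployment would call the LLM serving microservice.

```python
def count_tokens(text: str) -> int:
    # Crude whitespace proxy for a real tokenizer.
    return len(text.split())

def split_into_chunks(text: str, max_tokens: int) -> list[str]:
    # Fixed-size word chunks; a real implementation would split on
    # semantic boundaries and may overlap chunks.
    words = text.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]

def llm_summarize(text: str) -> str:
    # Stub: a real implementation would call the LLM serving endpoint.
    return text[:200]

def summarize(text: str, mode: str = "auto", max_input_tokens: int = 2048) -> str:
    if mode == "auto":
        # Default: "stuff" when the input fits, otherwise "refine".
        mode = "stuff" if count_tokens(text) < max_input_tokens else "refine"
    if mode == "stuff":
        return llm_summarize(text)  # full input; raise max_input_tokens for large contexts
    if mode == "truncate":
        return llm_summarize(" ".join(text.split()[:max_input_tokens]))
    if mode == "map_reduce":
        # Map each chunk to a partial summary, then reduce to a global one.
        partials = [llm_summarize(c) for c in split_into_chunks(text, max_input_tokens)]
        return llm_summarize("\n".join(partials))
    if mode == "refine":
        # Summarize the first chunk, then fold in each remaining chunk.
        chunks = split_into_chunks(text, max_input_tokens)
        if not chunks:
            return ""
        summary = llm_summarize(chunks[0])
        for chunk in chunks[1:]:
            summary = llm_summarize(summary + "\n" + chunk)
        return summary
    raise ValueError(f"unknown summary mode: {mode!r}")
```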

Related PRs: opea-project/GenAIComps#981, opea-project/GenAIComps#1046

Issues

List the issue or RFC link this PR is working on. If there is no such link, please mark it as n/a.

Type of change

List the type of change as below. Please delete options that are not relevant.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds new functionality)
  • Breaking change (fix or feature that would break existing design and interface)
  • Others (enhancement, documentation, validation, etc.)

Dependencies

List any newly introduced 3rd-party dependency, if one exists.

Tests

Describe the tests that you ran to verify your changes.
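
For example, a hypothetical smoke test against a deployed DocSum gateway could look like the sketch below. The port (8888), endpoint path (/v1/docsum), and the "summary_type" field are assumptions based on the related GenAIComps PRs, not confirmed by this PR text; adjust them to the actual deployment.

```python
# Hypothetical smoke test: host, port, path, and payload fields are assumptions.
import requests

resp = requests.post(
    "http://localhost:8888/v1/docsum",
    json={
        "messages": "Text of the document to summarize ...",
        "summary_type": "auto",  # assumed field name; one of the five modes
    },
    timeout=300,
)
resp.raise_for_status()
print(resp.json())
```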

github-actions bot commented Dec 17, 2024

Dependency Review

✅ No vulnerabilities or license issues found.


@eero-t (Contributor) commented Dec 17, 2024

@lvliang-intel The Dockerfile check fails due to a pre-existing "MultimodalQnA" docs issue; this PR does not touch those files:

Missing Dockerfile: GenAIComps/comps/retrievers/multimodal/redis/langchain/Dockerfile (Referenced in GenAIExamples/./MultimodalQnA/docker_compose/intel/cpu/xeon/README.md:127)
Missing Dockerfile: GenAIComps/comps/retrievers/multimodal/redis/langchain/Dockerfile (Referenced in GenAIExamples/./MultimodalQnA/docker_compose/intel/hpu/gaudi/README.md:78)
Error: Process completed with exit code 1.

"DocSum, xeon" CI test failed to:

...
[ docsum-xeon-backend-server ] Content is as expected.
...
[ docsum-gaudi-backend-server ] Content is as expected.
...
[ docsum-gaudi-backend-server ] Content is as expected.
...
[ docsum-gaudi-backend-server ] Content does not match the expected result: 
Error response from daemon: No such container: docsum-gaudi-backend-server

That is rather surprising, given that this test is supposed to run on Xeon, and there is a separate test that runs the same things.

"DocSum, gaudi" test fails due to backend error exit:

...
[ docsum-gaudi-backend-server ] Content is as expected.
...
[ docsum-gaudi-backend-server ] Content is as expected.
...
[ docsum-gaudi-backend-server ] Content does not match the expected result: 
 Error: Process completed with exit code 1.

"DocSum, rocm" test fails to something that looks like CI issue:
aiohttp.client_exceptions.ClientConnectorError: Cannot connect to host 0.0.0.0:7079 ssl:default [Connect call failed ('0.0.0.0', 7079)]

Signed-off-by: Xinyao Wang <[email protected]>
@XinyaoWa (Collaborator, Author) commented:

CICD pending for this PR: opea-project/GenAIComps#1046

@XinyaoWa mentioned this pull request on Dec 20, 2024
@lvliang-intel lvliang-intel merged commit 50dd959 into opea-project:main Dec 20, 2024
29 checks passed
@mkbhanda (Collaborator) commented Jan 8, 2025

@XinyaoWa thank you for this PR! I am curious about a couple of issues:
1) What happens in map_reduce mode if the summaries together total more input tokens than max_input_tokens: hierarchical map-reduce, or a fall-back to refine? (A sketch of one option follows below.)
2) If possible, could you also explain the formulas used (the 50, etc.)?
3) Are there speed/accuracy trade-offs to using smaller chunks?
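
For context on question 1, one possible answer is a hierarchical reduce that re-chunks and re-summarizes until the combined summaries fit in a single call. This is a hedged sketch reusing the hypothetical helpers from the earlier mode sketch, not a claim about the PR's actual behavior:

```python
# Hypothetical hierarchical reduce; reuses count_tokens, split_into_chunks,
# and llm_summarize from the earlier sketch. Assumes each llm_summarize call
# shrinks its input, so the loop terminates.
def reduce_summaries(partials: list[str], max_input_tokens: int) -> str:
    combined = "\n".join(partials)
    # Keep collapsing until the combined summaries fit in one LLM call.
    while count_tokens(combined) >= max_input_tokens:
        chunks = split_into_chunks(combined, max_input_tokens)
        combined = "\n".join(llm_summarize(c) for c in chunks)
    return llm_summarize(combined)
```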
