Standardize CRITIC/Self-Refine Few-shots for Math, Standardize error types in Reflexion with CRITIC/SR, Code for Self-Refine (HumanEval, MBPP) #228
Conversation
Walkthrough
The recent changes enhance the self-refinement tasks in the
Changes
Codecov Report
All modified and coverable lines are covered by tests.
Actionable comments posted: 2
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (7)
- agential/cog/reflexion/agent.py (2 hunks)
- agential/cog/reflexion/strategies/code.py (1 hunks)
- agential/cog/reflexion/strategies/math.py (1 hunks)
- agential/cog/reflexion/strategies/qa.py (1 hunks)
- tests/cog/reflexion/strategies/test_code.py (1 hunks)
- tests/cog/reflexion/strategies/test_math.py (1 hunks)
- tests/cog/reflexion/strategies/test_qa.py (1 hunks)
Additional comments not posted (5)
tests/cog/reflexion/strategies/test_qa.py (1)
72-72: LGTM! The change to set `max_trials` to `3` aligns with the updated default value in the strategy class.
tests/cog/reflexion/strategies/test_math.py (1)
74-74: LGTM! The change aligns with the updated class definition. The `max_trials` parameter is now correctly set to `3` in the test, reflecting the changes made in the `ReflexionCoTMathStrategy` class.
agential/cog/reflexion/strategies/math.py (1)
95-103: LGTM! The change to update `max_trials` from `1` to `3` is consistent and aligns with the provided context.
tests/cog/reflexion/strategies/test_code.py (1)
97-97: LGTM! The change to update `max_trials` from `1` to `3` in the test function is consistent and aligns with the provided context.
agential/cog/reflexion/strategies/code.py (1)
95-103: LGTM! The change to update `max_trials` from `1` to `3` is consistent and aligns with the provided context.
@@ -99,7 +99,7 @@ def generate(
     additional_keys: Dict[str, str] = {},
     reflect_additional_keys: Dict[str, str] = {},
     fewshot_type: str = "",
-    patience: int = 1,
+    patience: int = 3,
Update the docstring to reflect the new default value for `patience`.
The `patience` parameter's default value has been changed from `1` to `3`, and this should be updated in the docstring to maintain consistency.
- patience (int, optional): The patience for the agent. Defaults to 1.
+ patience (int, optional): The patience for the agent. Defaults to 3.
Also applies to: 123-123
@@ -60,15 +60,15 @@ class ReflexionCoTQAStrategy(ReflexionCoTBaseStrategy):
     llm (BaseChatModel): The language model used for generating answers and critiques.
     reflector (Optional[ReflexionCoTReflector]): The reflector used for generating reflections. Defaults to None.
     max_reflections (int): The maximum number of reflections allowed. Defaults to 3.
-    max_trials (int): The maximum number of trials allowed. Defaults to 1.
+    max_trials (int): The maximum number of trials allowed. Defaults to 3.
Update the docstring to reflect the new default value for `max_trials`.
The `max_trials` parameter's default value has been changed from `1` to `3`, and this should be updated in the docstring to maintain consistency.
- max_trials (int): The maximum number of trials allowed. Defaults to 1.
+ max_trials (int): The maximum number of trials allowed. Defaults to 3.
Also applies to: 71-71
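Both docstring comments above describe the same kind of drift: a default changes in the signature while the docstring still states the old value. A minimal sketch of a check that could catch this automatically, assuming Google-style docstrings that end each entry with "Defaults to X."; the `generate` function here is a hypothetical stand-in, not the agent's real method, and the helper names are illustrative only:

```python
import inspect
import re

# Matches docstring entries such as:
#   patience (int, optional): The patience for the agent. Defaults to 3.
_DEFAULT_RE = re.compile(
    r"^\s*(\w+) \(.*?\): .*Defaults to (.+?)\.\s*$", re.MULTILINE
)


def generate(question: str, patience: int = 3) -> str:
    """Hypothetical stand-in for the method under review.

    Args:
        question (str): The question to answer.
        patience (int, optional): The patience for the agent. Defaults to 3.
    """
    return question


def documented_defaults(func):
    """Return {parameter name: default as written in the docstring}."""
    doc = inspect.getdoc(func) or ""
    return dict(_DEFAULT_RE.findall(doc))


def mismatched_defaults(func):
    """List parameters whose documented default differs from the signature."""
    documented = documented_defaults(func)
    mismatches = []
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty or name not in documented:
            continue
        # Compare against repr(), so 3 matches "3" and None matches "None".
        if repr(param.default) != documented[name]:
            mismatches.append(name)
    return mismatches
```

Because the comparison uses `repr()`, string defaults documented with double quotes would need extra normalization before this check is reliable for them.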
Actionable comments posted: 0
Outside diff range, codebase verification and nitpick comments (6)
agential/cog/self_refine/strategies/code.py (2)
5-5: The import statement is not used in the file.
- The import `from langchain_core.language_models.chat_models import BaseChatModel` is present but not used in `agential/cog/self_refine/strategies/code.py`.
Consider adding tests to cover the imports or remove the unused import if it is not necessary.
Analysis chain
Add tests to cover the imports.
The import statement is not covered by tests.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the imported modules are used in the file.
# Test: Search for the usage of imported modules. Expect: All imported modules are used.
rg --type python -A 5 $'from langchain_core.language_models.chat_models import BaseChatModel'
Length of output: 125
Script:
#!/bin/bash
# Description: Verify if the imported module BaseChatModel is used in the file agential/cog/self_refine/strategies/code.py
# Test: Search for the usage of BaseChatModel in the file.
ast-grep --lang python --pattern 'BaseChatModel' agential/cog/self_refine/strategies/code.py
Length of output: 207
Tools
GitHub Check: codecov/patch
[warning] 5-5: agential/cog/self_refine/strategies/code.py#L5
Added line #L5 was not covered by tests
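The scripts above look for usages of an imported name with text search. As an alternative illustration, a minimal sketch of the same check done on the syntax tree with Python's standard `ast` module; `unused_imports` is a hypothetical helper written for this example, not part of the repository or of the review tooling:

```python
import ast
from typing import List


def unused_imports(source: str) -> List[str]:
    """Return imported names that are never referenced in the source."""
    tree = ast.parse(source)

    # Collect every name bound by an import statement.
    imported = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            for alias in node.names:
                # "import a.b" binds the top-level name "a".
                imported.add(alias.asname or alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom):
            for alias in node.names:
                imported.add(alias.asname or alias.name)

    # Every bare identifier used anywhere, including in annotations.
    used = {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}
    return sorted(imported - used)
```

This sketch ignores names re-exported through `__all__` and names used only inside string annotations, so a linter remains the more thorough option.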
12-13: Add tests to cover the imports or remove unused imports.
The imported modules `SelfRefineBaseStrategy` and `EM` are not used in the file `agential/cog/self_refine/strategies/code.py`. Consider adding tests to cover these imports or removing them if they are unnecessary.
agential/cog/self_refine/strategies/code.py: Lines 12-13
Analysis chain
Add tests to cover the imports.
The import statement is not covered by tests.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the imported modules are used in the file.
# Test: Search for the usage of imported modules. Expect: All imported modules are used.
rg --type python -A 5 $'from agential.cog.self_refine.strategies.base import SelfRefineBaseStrategy\nfrom agential.eval.em import EM'
Length of output: 325
Script:
#!/bin/bash
# Description: Verify if the imported modules are used in the file.
# Test: Search for the usage of imported modules. Expect: All imported modules are used.
rg --type python --multiline -A 5 $'from agential.cog.self_refine.strategies.base import SelfRefineBaseStrategy\nfrom agential.eval.em import EM'
Length of output: 176
Script:
#!/bin/bash
# Description: Verify if the imported modules are used in the file `agential/cog/self_refine/strategies/code.py`.
# Test: Search for the usage of imported modules. Expect: All imported modules are used.
rg --multiline -A 5 $'from agential.cog.self_refine.strategies.base import SelfRefineBaseStrategy\nfrom agential.eval.em import EM' agential/cog/self_refine/strategies/code.py
Length of output: 284
Tools
GitHub Check: codecov/patch
[warning] 12-13: agential/cog/self_refine/strategies/code.py#L12-L13
Added lines #L12 - L13 were not covered by tests
agential/cog/self_refine/prompts.py (4)
2358-2358: Add a TODO comment for HUMANEVAL instructions.
The instruction set for HUMANEVAL is currently empty. Consider adding a TODO comment to indicate that content needs to be added.
+ SELF_REFINE_INSTRUCTION_HUMANEVAL = "" # TODO: Add instructions for HUMANEVAL
2376-2376: Add a TODO comment for MBPP instructions.
The instruction set for MBPP is currently empty. Consider adding a TODO comment to indicate that content needs to be added.
+ SELF_REFINE_INSTRUCTION_MBPP = "" # TODO: Add instructions for MBPP
2361-2370: Add TODO comments for HUMANEVAL few-shot examples and critique instructions.
The few-shot examples and critique instructions for HUMANEVAL are currently empty. Consider adding TODO comments to indicate that content needs to be added.
+ HUMANEVAL_CRITIQUE_FEWSHOT_EXAMPLES = "" # TODO: Add few-shot examples for HUMANEVAL
+ SELF_REFINE_CRITIQUE_INSTRUCTION_HUMANEVAL = "" # TODO: Add critique instructions for HUMANEVAL
+ HUMANEVAL_REFINE_FEWSHOT_EXAMPLES = "" # TODO: Add refine few-shot examples for HUMANEVAL
+ SELF_REFINE_REFINE_INSTRUCTION_HUMANEVAL = "" # TODO: Add refine instructions for HUMANEVAL
2379-2388: Add TODO comments for MBPP few-shot examples and critique instructions.
The few-shot examples and critique instructions for MBPP are currently empty. Consider adding TODO comments to indicate that content needs to be added.
+ MBPP_CRITIQUE_FEWSHOT_EXAMPLES = "" # TODO: Add few-shot examples for MBPP
+ SELF_REFINE_CRITIQUE_INSTRUCTION_MBPP = "" # TODO: Add critique instructions for MBPP
+ MBPP_REFINE_FEWSHOT_EXAMPLES = "" # TODO: Add refine few-shot examples for MBPP
+ SELF_REFINE_REFINE_INSTRUCTION_MBPP = "" # TODO: Add refine instructions for MBPP
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- agential/cog/self_refine/prompts.py (3 hunks)
- agential/cog/self_refine/strategies/code.py (1 hunks)
- notebooks/self_refine.ipynb (3 hunks)
Additional context used
GitHub Check: codecov/patch
agential/cog/self_refine/strategies/code.py
[warning] 3-3: agential/cog/self_refine/strategies/code.py#L3
Added line #L3 was not covered by tests
[warning] 5-5: agential/cog/self_refine/strategies/code.py#L5
Added line #L5 was not covered by tests
[warning] 7-7: agential/cog/self_refine/strategies/code.py#L7
Added line #L7 was not covered by tests
[warning] 12-13: agential/cog/self_refine/strategies/code.py#L12-L13
Added lines #L12 - L13 were not covered by tests
Additional comments not posted (15)
agential/cog/self_refine/strategies/code.py (4)
7-7: Add tests to cover the imports.
The import statement is not covered by tests.
Tools
GitHub Check: codecov/patch
[warning] 7-7: agential/cog/self_refine/strategies/code.py#L7
Added line #L7 was not covered by tests
Line range hint 21-35: Add tests to cover the `generate` method.
The `generate` method is not covered by tests.
Tools
GitHub Check: codecov/patch
[warning] 3-3: agential/cog/self_refine/strategies/code.py#L3
Added line #L3 was not covered by tests
[warning] 5-5: agential/cog/self_refine/strategies/code.py#L5
Added line #L5 was not covered by tests
[warning] 7-7: agential/cog/self_refine/strategies/code.py#L7
Added line #L7 was not covered by tests
[warning] 12-13: agential/cog/self_refine/strategies/code.py#L12-L13
Added lines #L12 - L13 were not covered by tests
Line range hint 15-19: Add tests to cover the `__init__` method.
The `__init__` method is not covered by tests.
Verification successful
Add tests to cover the `__init__` method.
The `__init__` method of the `SelfRefineCodeStrategy` class is not covered by tests. Ensure that tests are added to verify the initialization parameters and behavior.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the `__init__` method is tested.
# Test: Search for the usage of `SelfRefineCodeStrategy`. Expect: The `__init__` method is tested.
rg --type python -A 5 $'SelfRefineCodeStrategy'
Length of output: 77
Script:
#!/bin/bash
# Description: Verify if the `__init__` method is tested.
# Test: Search for the usage of `SelfRefineCodeStrategy`. Expect: The `__init__` method is tested.
rg --type py -A 5 $'SelfRefineCodeStrategy'
Length of output: 42
Script:
#!/bin/bash
# Description: Locate potential test files and search for references to `SelfRefineCodeStrategy`.
# Step 1: Locate potential test files.
test_files=$(fd --type f --extension py test)
# Step 2: Search within those files for references to `SelfRefineCodeStrategy`.
if [ -n "$test_files" ]; then
  rg --type py -A 5 $'SelfRefineCodeStrategy' $test_files
else
  echo "No test files found."
fi
Length of output: 3665
Tools
GitHub Check: codecov/patch
[warning] 3-3: agential/cog/self_refine/strategies/code.py#L3
Added line #L3 was not covered by tests
[warning] 5-5: agential/cog/self_refine/strategies/code.py#L5
Added line #L5 was not covered by tests
[warning] 7-7: agential/cog/self_refine/strategies/code.py#L7
Added line #L7 was not covered by tests
[warning] 12-13: agential/cog/self_refine/strategies/code.py#L12-L13
Added lines #L12 - L13 were not covered by tests
3-3: Add tests to cover the imports.
The import statement is not covered by tests.
Verification successful
The imports are used and necessary.
The imported modules `BaseChatModel`, `_prompt_agent`, `_prompt_critique`, `_prompt_refine`, `SelfRefineBaseStrategy`, and `EM` are utilized within the codebase, including in tests and functional files. Therefore, the import statement is valid and necessary.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the imported modules are used in the file.
# Test: Search for the usage of imported modules. Expect: All imported modules are used.
rg --type python -A 5 $'from typing import Any, Dict\nfrom langchain_core.language_models.chat_models import BaseChatModel\nfrom agential.cog.self_refine.functional import (\n _prompt_agent,\n _prompt_critique,\n _prompt_refine,\n)\nfrom agential.cog.self_refine.strategies.base import SelfRefineBaseStrategy\nfrom agential.eval.em import EM'
Length of output: 536
Script:
#!/bin/bash
# Description: Verify if the imported modules are used in the file.
# Test: Search for the usage of imported modules. Expect: All imported modules are used.
rg --multiline --type python -A 5 $'from typing import Any, Dict\nfrom langchain_core.language_models.chat_models import BaseChatModel\nfrom agential.cog.self_refine.functional import (\n _prompt_agent,\n _prompt_critique,\n _prompt_refine,\n)\nfrom agential.cog.self_refine.strategies.base import SelfRefineBaseStrategy\nfrom agential.eval.em import EM'
Length of output: 387
Script:
#!/bin/bash
# Description: Verify if the imported modules are used in the file.
# Test: Search for the usage of imported modules. Expect: All imported modules are used.
rg --multiline -A 5 $'from typing import Any, Dict\nfrom langchain_core.language_models.chat_models import BaseChatModel\nfrom agential.cog.self_refine.functional import (\n _prompt_agent,\n _prompt_critique,\n _prompt_refine,\n)\nfrom agential.cog.self_refine.strategies.base import SelfRefineBaseStrategy\nfrom agential.eval.em import EM'
Length of output: 342
Script:
#!/bin/bash
# Description: Verify if the imported modules are used in the file.
# Test: Search for the usage of each imported module individually.
# Check usage of 'Any' and 'Dict' from 'typing'
rg --multiline -A 5 'Any\|Dict'
# Check usage of 'BaseChatModel' from 'langchain_core.language_models.chat_models'
rg --multiline -A 5 'BaseChatModel'
# Check usage of '_prompt_agent', '_prompt_critique', and '_prompt_refine' from 'agential.cog.self_refine.functional'
rg --multiline -A 5 '_prompt_agent\|_prompt_critique\|_prompt_refine'
# Check usage of 'SelfRefineBaseStrategy' from 'agential.cog.self_refine.strategies.base'
rg --multiline -A 5 'SelfRefineBaseStrategy'
# Check usage of 'EM' from 'agential.eval.em'
rg --multiline -A 5 'EM'
Length of output: 132746
Tools
GitHub Check: codecov/patch
[warning] 3-3: agential/cog/self_refine/strategies/code.py#L3
Added line #L3 was not covered by tests
notebooks/self_refine.ipynb (9)
98-98: Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is not covered by tests.
97-97: Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is not covered by tests.
Verification successful
Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is already covered by tests in `tests/cog/self_refine/test_agent.py`.
`tests/cog/self_refine/test_agent.py` contains multiple instances where `SelfRefineAgent` is initialized and tested.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Test: Search for the usage of `SelfRefineAgent`. Expect: The initialization is tested.
rg --type python -A 5 $'SelfRefineAgent'
Length of output: 70
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Test: Search for the usage of `SelfRefineAgent`. Expect: The initialization is tested.
rg --glob '*.py' -A 5 $'SelfRefineAgent'
Length of output: 3676
92-92: Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is not covered by tests.
Verification successful
Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is already covered by tests in `tests/cog/self_refine/test_agent.py`.
`tests/cog/self_refine/test_agent.py`: Multiple instances of `SelfRefineAgent` initialization and corresponding assertions are present.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Test: Search for the usage of `SelfRefineAgent`. Expect: The initialization is tested.
rg --type python -A 5 $'SelfRefineAgent'
Length of output: 70
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Test: Search for the usage of `SelfRefineAgent`. Expect: The initialization is tested.
rg -A 5 'SelfRefineAgent'
Length of output: 6219
96-96: Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is not covered by tests.
95-95: Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is not covered by tests.
91-91: Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is not covered by tests.
93-93: Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is not covered by tests.
94-94: Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is not covered by tests.
Verification successful
Add tests to cover the `SelfRefineAgent` initialization.
The initialization of `SelfRefineAgent` is already covered by tests in the `tests/cog/self_refine/test_agent.py` file. Here are the relevant instances:
tests/cog/self_refine/test_agent.py: agent = SelfRefineAgent(llm=FakeListChatModel(responses=[]), benchmark="gsm8k")
tests/cog/self_refine/test_agent.py: agent = SelfRefineAgent(llm=FakeListChatModel(responses=responses), benchmark="gsm8k")
These lines indicate that the initialization of `SelfRefineAgent` is indeed tested.
Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Test: Search for the usage of `SelfRefineAgent`. Expect: The initialization is tested.
rg --type python -A 5 $'SelfRefineAgent'
Length of output: 70
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Find all Python files and search for the usage of `SelfRefineAgent`.
fd -e py -x rg 'SelfRefineAgent' {}
Length of output: 414
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Find all Python test files and search for the usage of `SelfRefineAgent`.
fd -e py -p 'test' -x rg 'SelfRefineAgent' {}
# Additionally, search for `SelfRefineAgent` within functions that follow test naming conventions.
rg -e 'def test_' -A 5 'SelfRefineAgent'
Length of output: 485
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Find all Python test files in typical test directories and search for the usage of `SelfRefineAgent`.
fd -e py --search-path tests -x rg 'SelfRefineAgent' {}
# Additionally, search for `SelfRefineAgent` within functions that follow test naming conventions.
rg 'def test_' -A 5 'SelfRefineAgent'
Length of output: 494
Script:
#!/bin/bash
# Description: Verify if the `SelfRefineAgent` initialization is tested.
# Find all occurrences of `SelfRefineAgent` within the entire codebase and include more context lines.
rg 'SelfRefineAgent' -A 10
Length of output: 11979
Line range hint 1-1: Add tests to cover the imports.
The import statements are not covered by tests.
agential/cog/self_refine/prompts.py (2)
1148-1150: Improved clarity in instructions.
The updated instruction for GSM8K now includes specific guidelines for writing Python code and storing the result in a variable named 'answer'. This enhances clarity.
1554-1556: Improved clarity in instructions.
The updated instruction for SVAMP now includes specific guidelines for writing Python code and storing the result in a variable named 'answer'. This enhances clarity.
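Asking the model to store its result in a variable named `answer` gives the caller one predictable place to read the outcome after running the generated program. A minimal sketch of that idea, assuming the generated code is plain Python; `run_and_read_answer` is a hypothetical helper for illustration and is not taken from the repository:

```python
def run_and_read_answer(code: str):
    """Execute generated code and return the value bound to `answer`.

    Returns None when the code never assigns `answer`.
    NOTE: exec() runs arbitrary code; real use needs a sandbox.
    """
    namespace = {}
    exec(code, namespace)  # illustrative only, unsafe for untrusted input
    return namespace.get("answer")


# A GSM8K-style program following the "store the result in `answer`" rule.
generated = """
eggs_per_day = 16
eaten = 3
baked = 4
price = 2
answer = (eggs_per_day - eaten - baked) * price
"""
result = run_and_read_answer(generated)
```

Without the naming convention, the caller would have to guess which variable holds the final result or parse printed output instead.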
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (3)
- agential/cog/critic/prompts.py (4 hunks)
- agential/cog/self_refine/prompts.py (7 hunks)
- docs/docs/index.md (2 hunks)
Files skipped from review due to trivial changes (1)
- agential/cog/critic/prompts.py
Files skipped from review as they are similar to previous changes (1)
- agential/cog/self_refine/prompts.py
Additional comments not posted (1)
docs/docs/index.md (1)
111-111: LGTM! Ensure consistency and clarity.
The new section "CRITIC, Self-Refine" provides detailed error types for various datasets. The changes are clear and consistent with the rest of the document.
Actionable comments posted: 10
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- agential/cog/critic/prompts.py (5 hunks)
- agential/cog/self_refine/prompts.py (28 hunks)
Files not summarized due to errors (1)
- agential/cog/self_refine/prompts.py: Error: Message exceeds token limit
Files skipped from review as they are similar to previous changes (1)
- agential/cog/critic/prompts.py
Additional comments not posted (2)
agential/cog/self_refine/prompts.py (2)
1148-1149: Approved: Improved instruction clarity.
The addition of the guideline to store the result in a variable named 'answer' enhances clarity and consistency.
1564-1565: Approved: Improved instruction clarity.
The addition of the guideline to store the result in a variable named 'answer' enhances clarity and consistency.
agential/cog/self_refine/prompts.py
Outdated
# ======================================================================== HUMANEVAL ======================================================================== #

SELF_REFINE_INSTRUCTION_HUMANEVAL = """"""
Add content to the instruction.
The `SELF_REFINE_INSTRUCTION_HUMANEVAL` is currently empty and needs to be populated with appropriate instructions.
agential/cog/self_refine/prompts.py
Outdated
SELF_REFINE_INSTRUCTION_HUMANEVAL = """"""

HUMANEVAL_CRITIQUE_FEWSHOT_EXAMPLES = """"""
Add content to the few-shot examples.
The `HUMANEVAL_CRITIQUE_FEWSHOT_EXAMPLES` is currently empty and needs to be populated with appropriate few-shot examples.
agential/cog/self_refine/prompts.py
Outdated
HUMANEVAL_CRITIQUE_FEWSHOT_EXAMPLES = """"""

SELF_REFINE_CRITIQUE_INSTRUCTION_HUMANEVAL = """"""
Add content to the critique instruction.
The `SELF_REFINE_CRITIQUE_INSTRUCTION_HUMANEVAL` is currently empty and needs to be populated with appropriate critique instructions.
agential/cog/self_refine/prompts.py
Outdated
SELF_REFINE_CRITIQUE_INSTRUCTION_HUMANEVAL = """"""

HUMANEVAL_REFINE_FEWSHOT_EXAMPLES = """"""
Add content to the few-shot examples.
The `HUMANEVAL_REFINE_FEWSHOT_EXAMPLES` is currently empty and needs to be populated with appropriate few-shot examples.
agential/cog/self_refine/prompts.py
Outdated
HUMANEVAL_REFINE_FEWSHOT_EXAMPLES = """"""

SELF_REFINE_REFINE_INSTRUCTION_HUMANEVAL = """"""
Add content to the refinement instruction.
The `SELF_REFINE_REFINE_INSTRUCTION_HUMANEVAL` is currently empty and needs to be populated with appropriate refinement instructions.
agential/cog/self_refine/prompts.py
Outdated
# ======================================================================== MBPP ======================================================================== #

SELF_REFINE_INSTRUCTION_MBPP = """"""
Add content to the instruction.
The `SELF_REFINE_INSTRUCTION_MBPP` is currently empty and needs to be populated with appropriate instructions.
agential/cog/self_refine/prompts.py
Outdated
SELF_REFINE_INSTRUCTION_MBPP = """"""

MBPP_CRITIQUE_FEWSHOT_EXAMPLES = """"""
Add content to the few-shot examples.
The `MBPP_CRITIQUE_FEWSHOT_EXAMPLES` is currently empty and needs to be populated with appropriate few-shot examples.
agential/cog/self_refine/prompts.py
Outdated
MBPP_CRITIQUE_FEWSHOT_EXAMPLES = """"""

SELF_REFINE_CRITIQUE_INSTRUCTION_MBPP = """"""
Add content to the critique instruction.
The `SELF_REFINE_CRITIQUE_INSTRUCTION_MBPP` is currently empty and needs to be populated with appropriate critique instructions.
agential/cog/self_refine/prompts.py
Outdated
SELF_REFINE_CRITIQUE_INSTRUCTION_MBPP = """"""

MBPP_REFINE_FEWSHOT_EXAMPLES = """"""
Add content to the few-shot examples.
The `MBPP_REFINE_FEWSHOT_EXAMPLES` is currently empty and needs to be populated with appropriate few-shot examples.
agential/cog/self_refine/prompts.py
Outdated
MBPP_REFINE_FEWSHOT_EXAMPLES = """"""

SELF_REFINE_REFINE_INSTRUCTION_MBPP = """"""
Add content to the refinement instruction.
The `SELF_REFINE_REFINE_INSTRUCTION_MBPP` is currently empty and needs to be populated with appropriate refinement instructions.
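The ten comments above all flag the same condition: an upper-case prompt constant that is still an empty string. A minimal sketch of a single check that could report every such constant at once, assuming prompts are module-level string constants with upper-case names; `empty_string_constants` and the demo module below are hypothetical and stand in for `agential/cog/self_refine/prompts.py`:

```python
import types


def empty_string_constants(module) -> list:
    """Return names of upper-case string constants that are empty or blank."""
    return sorted(
        name
        for name, value in vars(module).items()
        if name.isupper() and isinstance(value, str) and not value.strip()
    )


# Stand-in module so the sketch runs without the real package installed.
demo_prompts = types.ModuleType("demo_prompts")
demo_prompts.SELF_REFINE_INSTRUCTION_GSM8K = "Write Python code and store the result in `answer`."
demo_prompts.SELF_REFINE_INSTRUCTION_HUMANEVAL = """"""
demo_prompts.MBPP_REFINE_FEWSHOT_EXAMPLES = """"""

missing = empty_string_constants(demo_prompts)
```

In a test suite, asserting that this list is empty for the real prompts module would turn the placeholder problem into a single failing test rather than ten separate review comments.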
Actionable comments posted: 6
Outside diff range, codebase verification and nitpick comments (1)
docs/docs/index.md (1)
127-139
: Reword repetitive sentences for better readability.Consider rewording the sentences to avoid repetitive beginnings and improve readability.
- | **HotpotQA** | 1. Misinterpretation<br>2. Incorrect assumption<br>3. Misinterpretation<br>4. Misinterpretation<br>5. Misinterpretation | 1. Misled action<br>2. Misled action<br>3. Misread context<br>4. Wrong answer<br>5. Logical error | + | **HotpotQA** | 1. Misinterpretation<br>2. Incorrect assumption<br>3. Misinterpretation<br>4. Misinterpretation<br>5. Misinterpretation | 1. Misled action<br>2. Misled action<br>3. Misread context<br>4. Wrong answer<br>5. Logical error | + | **FEVER** | 1. Insufficient info<br>2. Misinterpretation<br>3. Insufficient info<br>4. Insufficient info<br>5. Misinterpretation | 1. Ignored context<br>2. Insufficient info<br>3. Insufficient info<br>4. Ignore context<br>5. Ignore context | + | **AmbigNQ** | 1. Knowledge error<br>2. Knowledge error<br>3. Knowledge error<br>4. Misinterpret question<br>5. Knowledge error | 1. Incorrect assumption/Insufficient info<br>2. Insufficient info<br>3. Knowledge error<br>4. Incorrect answer format<br>5. Misread context | + | **TriviaQA** | 1. Incorrect assumption<br>2. Incorrect assumption<br>3. Incorrect assumption<br>4. Misinterpretation<br>5. Incorrect assumption | 1. Ignore context<br>2. Ignore context<br>3. Ignore context<br>4. Ignore context<br>5. Ignore context | + | **GSM8K** | 1. Logical error<br>2. Logical error<br>3. Misinterpret question<br>4. Logical error<br>5. Misinterpret question | 1. Logical error/Misinterpret question<br>2. Logical error/Misinterpret question<br>3. Logical error/Re-calculation error<br>4. Logical error/Re-calculation error<br>5. Logical error/Misinterpret question | + | **SVAMP** | 1. Logical error<br>2. Logical error<br>3. Logical error<br>4. Logical error<br>5. Logical error | 1. Misinterpret question<br>2. Logical error<br>3. Logical error<br>4. Logical error<br>5. Logical error | + | **TabMWP** | 1. Incorrect operator<br>2. Incorrect operator<br>3. Misinterpret question<br>4. Incorrect operator<br>5. Logical error | 1. Misinterpret question<br>2. 
Logical error<br>3. Logical error<br>4. Re-calculation error<br>5. Logical error | + | **HumanEval** | 1. Conceptual error<br>2. Logical error<br>3. Logical error<br>4. Logical error<br>5. Logical error | 1. Logical error<br>2. Logical error<br>3. Logical error<br>4. Logical error<br>5. Logical error | + | **MBPP** | 1. Logical error<br>2. Logical error<br>3. Incorrect function usage<br>4. Logical error<br>5. Logical error | 1. Incorrect function implementation<br>2. Logical error<br>3. Incorrect function usage<br>4. Logical error<br>5. Logical error |Tools
LanguageTool
[style] ~131-~131: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...pretation<br>4. Misinterpretation<br>5. Misinterpretation | 1. Misled action<br>2. Misled action<...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~133-~133: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ledge error<br>2. Knowledge error<br>3. Knowledge error<br>4. Misinterpret question<br>5....(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~134-~134: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...mption<br>2. Incorrect assumption<br>3. Incorrect assumption<br>4. Misinterpretation<br>5...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~134-~134: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...nore context<br>2. Ignore context<br>3. Ignore context<br>4. Ignore context<br>5. Igno...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~134-~134: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...nore context<br>3. Ignore context<br>4. Ignore context<br>5. Ignore context | | **GSM8...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~134-~134: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...nore context<br>4. Ignore context<br>5. Ignore context | | GSM8K | 1. Logical ...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~135-~135: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...gical error/Misinterpret question<br>3. Logical error/Re-calculation error<br>4. Logica...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~135-~135: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ogical error/Re-calculation error<br>4. Logical error/Re-calculation error<br>5. Logica...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~135-~135: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...ogical error/Re-calculation error<br>5. Logical error/Misinterpret question | | **SVAMP...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~136-~136: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... SVAMP | 1. Logical error<br>2. Logical error<br>3. Logical error<br>4. Logical...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~136-~136: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
2. Logical error
3. Logical error
4. Logical error
5. Logical...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~136-~136: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
3. Logical error
4. Logical error
5. Logical error | 1. Misinter...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~136-~136: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
4. Logical error
5. Logical error | 1. Misinterpret question
2. ...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~136-~136: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
3. Logical error
4. Logical error
5. Logical error | | *TabMWP...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~136-~136: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
4. Logical error
5. Logical error | | TabMWP | 1. Incorrect ...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~138-~138: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
3. Logical error
4. Logical error
5. Logical error | 1. Logical ...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~138-~138: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
4. Logical error
5. Logical error | 1. Logical error
2. Logical ...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~138-~138: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... Logical error
5. Logical error | 1. Logical error
2. Logical error
3. Logical...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~138-~138: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... Logical error | 1. Logical error
2. Logical error
3. Logical error
4. Logical...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~138-~138: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
2. Logical error
3. Logical error
4. Logical error
5. Logical...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~138-~138: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
3. Logical error
4. Logical error
5. Logical error | | MBPP ...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~138-~138: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...Logical error
4. Logical error
5. Logical error | | MBPP | 1. Logical er...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
[style] ~139-~139: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ... MBPP | 1. Logical error
2. Logical error
3. Incorrect function usage<br...(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (5)
- agential/cog/reflexion/prompts.py (3 hunks)
- agential/cog/self_refine/functional.py (3 hunks)
- agential/cog/self_refine/prompts.py (28 hunks)
- docs/docs/index.md (2 hunks)
- notebooks/self_refine.ipynb (3 hunks)
Files not summarized due to errors (1)
- agential/cog/self_refine/prompts.py: Error: Message exceeds token limit
Files skipped from review as they are similar to previous changes (1)
- notebooks/self_refine.ipynb
Additional comments not posted (6)
docs/docs/index.md (1)
Line range hint 111-125: LGTM!
The section on CRITIC and Self-Refine error types is clear and well-organized.
Tools
Markdownlint
109-109: Multiple headings with the same content (MD024, no-duplicate-heading)
agential/cog/self_refine/prompts.py (2)
1148-1150: Improved Instructional Clarity.
The updated SELF_REFINE_INSTRUCTION_GSM8K now includes specific guidelines for writing Python code and storing the result in a variable named 'answer'. This enhances clarity and standardization.
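To illustrate why a fixed variable name helps, here is a minimal sketch of how a harness might run generated code and read the result back. The helper name `run_and_read_answer` and the sample program are hypothetical, not taken from this PR, and `exec` is used only for illustration; untrusted model output should be sandboxed in practice.

```python
from typing import Any, Optional, Tuple


def run_and_read_answer(code: str) -> Tuple[Optional[Any], Optional[str]]:
    """Execute generated code in a fresh namespace and return (answer, error).

    Hypothetical helper: assumes the prompt told the model to store its
    result in a variable named `answer`.
    """
    namespace: dict = {}
    try:
        # NOTE: exec is unsafe for untrusted code; sandbox it in real use.
        exec(code, namespace)
    except Exception as exc:
        return None, f"{type(exc).__name__}: {exc}"
    if "answer" not in namespace:
        return None, "no variable named 'answer' was defined"
    return namespace["answer"], None


# Illustrative generated program (made up for this sketch).
generated = (
    "lollipops_initial = 20\n"
    "lollipops_after = 12\n"
    "answer = lollipops_initial - lollipops_after\n"
)
value, error = run_and_read_answer(generated)
```

Because the variable name is standardized, the harness does not need to parse the program's printed output or guess which name holds the result.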
1564-1566: Improved Instructional Clarity.
The updated SELF_REFINE_INSTRUCTION_SVAMP now includes specific guidelines for writing Python code and storing the result in a variable named 'answer'. This enhances clarity and standardization.
agential/cog/reflexion/prompts.py (3)
110-110: Ensure the example is comprehensive and accurate.
The few-shot example for HOTPOTQA using the Reflexion-React-Reflect strategy should be comprehensive and accurate to guide the model effectively.
Ensure that the example covers a variety of scenarios and provides clear reflections to improve the model's reasoning.
1716-1716: Ensure the example is comprehensive and accurate.
The few-shot example for HUMANEVAL using the Reflexion-React-Reflect strategy should be comprehensive and accurate to guide the model effectively.
Ensure that the example covers a variety of scenarios and provides clear reflections to improve the model's reasoning.
1726-1726: Ensure the example is comprehensive and accurate.
The few-shot example for MBPP using the Reflexion-React-Reflect strategy should be comprehensive and accurate to guide the model effectively.
Ensure that the example covers a variety of scenarios and provides clear reflections to improve the model's reasoning.
print("<OUT CRITIQUE=======================================================>")
print(repr(out))
print("<OUT CRITIQUE=======================================================>")
Consider using logging instead of print statements.
While print statements help with debugging, using a logging framework would be more flexible and appropriate for production code.
- print("<OUT CRITIQUE=======================================================>")
- print(repr(out))
- print("<OUT CRITIQUE=======================================================>")
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.debug("<OUT CRITIQUE=======================================================>")
+ logger.debug(repr(out))
+ logger.debug("<OUT CRITIQUE=======================================================>")
Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
print("<OUT AGENT=======================================================>")
print(repr(out))
print("<OUT AGENT=======================================================>")
Consider using logging instead of print statements.
While print statements help with debugging, using a logging framework would be more flexible and appropriate for production code.
- print("<OUT AGENT=======================================================>")
- print(repr(out))
- print("<OUT AGENT=======================================================>")
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.debug("<OUT AGENT=======================================================>")
+ logger.debug(repr(out))
+ logger.debug("<OUT AGENT=======================================================>")
Committable suggestion was skipped due to low confidence.
print("<PROMPT REFINE=======================================================>")
print(prompt)
print("<PROMPT REFINE=======================================================>")
Consider using logging instead of print statements.
While print statements help with debugging, using a logging framework would be more flexible and appropriate for production code.
- print("<PROMPT REFINE=======================================================>")
- print(prompt)
- print("<PROMPT REFINE=======================================================>")
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.debug("<PROMPT REFINE=======================================================>")
+ logger.debug(prompt)
+ logger.debug("<PROMPT REFINE=======================================================>")
print("<PROMPT CRITIQUE=======================================================>")
print(prompt)
print("<PROMPT CRITIQUE=======================================================>")
Consider using logging instead of print statements.
While print statements help with debugging, using a logging framework would be more flexible and appropriate for production code.
- print("<PROMPT CRITIQUE=======================================================>")
- print(prompt)
- print("<PROMPT CRITIQUE=======================================================>")
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.debug("<PROMPT CRITIQUE=======================================================>")
+ logger.debug(prompt)
+ logger.debug("<PROMPT CRITIQUE=======================================================>")
print("<PROMPT AGENT=======================================================>")
print(prompt)
print("<PROMPT AGENT=======================================================>")
Consider using logging instead of print statements.
While print statements help with debugging, using a logging framework would be more flexible and appropriate for production code.
- print("<PROMPT AGENT=======================================================>")
- print(prompt)
- print("<PROMPT AGENT=======================================================>")
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.debug("<PROMPT AGENT=======================================================>")
+ logger.debug(prompt)
+ logger.debug("<PROMPT AGENT=======================================================>")
print("<OUT REFINE=======================================================>")
print(repr(out))
print("<OUT REFINE=======================================================>")
Consider using logging instead of print statements.
While print statements help with debugging, using a logging framework would be more flexible and appropriate for production code.
- print("<OUT REFINE=======================================================>")
- print(repr(out))
- print("<OUT REFINE=======================================================>")
+ import logging
+ logger = logging.getLogger(__name__)
+ logger.debug("<OUT REFINE=======================================================>")
+ logger.debug(repr(out))
+ logger.debug("<OUT REFINE=======================================================>")
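The six suggestions above share one shape, so they could be folded into a single helper with the logger created once at module level rather than re-imported next to every call. The following is a sketch only; `log_block` is a hypothetical name, not part of the PR or of the `logging` module.

```python
import logging

# Created once at import time; the application configures handlers and levels.
logger = logging.getLogger(__name__)


def log_block(label: str, payload: object) -> None:
    """Emit a labelled debug record, e.g. label="OUT CRITIQUE".

    Hypothetical helper: replaces the banner/print/banner triple with a
    single record so the label and payload stay together in the log.
    """
    # Lazy %-style arguments are only formatted if DEBUG is enabled.
    logger.debug("<%s>\n%s", label, payload)


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    log_block("OUT CRITIQUE", repr("example output\n"))
    log_block("PROMPT AGENT", "example prompt")
```

Callers would pass `repr(out)` where the original code printed `repr(out)` and the raw `prompt` elsewhere, so the logged text matches what the print statements showed.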
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- agential/cog/self_refine/strategies/code.py (1 hunks)
Additional context used
GitHub Check: codecov/patch
agential/cog/self_refine/strategies/code.py
[warning] Added lines not covered by tests: #L3, #L5, #L7, #L12-L13, #L16, #L25, #L27-L31, #L33, #L53, #L60, #L62, #L64, #L87, #L96-L99, #L101, #L103, #L105, #L115, #L117, #L139, #L148, #L150, #L152, #L160, #L162, #L170-L172, #L175, #L178, #L181, #L184
Additional comments not posted (9)
agential/cog/self_refine/strategies/code.py (9)
25-31: Constructor Initialization
The constructor properly initializes the SelfRefineCodeStrategy class, setting default values for patience, _prev_code_answer, patience_counter, and _halt. The attributes are well-documented.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L25 and #L27-L31 were not covered by tests
33-62: Verify the format of the answer string
The generate method extracts Python code from the answer string by splitting on the opening "```python" fence and the closing "```" fence. Ensure that the format of the answer string is consistent and that edge cases are handled.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L33, #L53, #L60, and #L62 were not covered by tests
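One way to cover the edge cases mentioned above is sketched below, assuming the model wraps its code in a Markdown python fence. `extract_code` is a hypothetical helper, not the method in this PR; it falls back to the raw text when no fence is present and tolerates a fence that is never closed.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the example readable

# Complete block: opening fence (optionally tagged python), body, closing fence.
_BLOCK = re.compile(FENCE + r"(?:python)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)
# Opening fence only, used when the model never closes the block.
_OPEN = re.compile(FENCE + r"(?:python)?[ \t]*\n")


def extract_code(answer: str) -> str:
    """Return the code inside the first fenced block of `answer`.

    Hypothetical helper illustrating three cases:
    1. a complete fenced block -> its body,
    2. an unterminated fence   -> everything after the opening fence,
    3. no fence at all         -> the stripped answer unchanged.
    """
    block = _BLOCK.search(answer)
    if block:
        return block.group(1).strip()
    opening = _OPEN.search(answer)
    if opening:
        return answer[opening.end():].strip()
    return answer.strip()


fenced = "Here is the fix:\n" + FENCE + "python\ndef f():\n    return 1\n" + FENCE + "\nDone."
code = extract_code(fenced)
```

Tests for the strategy could then feed each of the three shapes through the extraction step, which would also address part of the uncovered-line warnings.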
64-103: Verify the usage of EM and halting logic
The generate_critique method uses EM to check if the answer remains the same and increments the patience_counter. Verify that EM is the appropriate method for this comparison and that the halting logic works as intended.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L64, #L87, #L96-L99, #L101, and #L103 were not covered by tests
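The halting behaviour under review can be pictured with a small stand-alone sketch. Everything here is an assumption for illustration: `PatienceTracker` and `exact_match` are hypothetical names, the comparison is a plain stripped string equality rather than the project's EM function, and resetting the counter when the answer changes is one possible design choice that the PR may not share.

```python
from typing import Optional


def exact_match(a: str, b: str) -> bool:
    """Hypothetical stand-in for EM: equality after trimming outer whitespace."""
    return a.strip() == b.strip()


class PatienceTracker:
    """Halt once the answer has stayed unchanged for `patience` consecutive rounds."""

    def __init__(self, patience: int = 1) -> None:
        self.patience = patience
        self._prev_answer: Optional[str] = None
        self.patience_counter = 0
        self._halt = False

    def update(self, answer: str) -> bool:
        """Record a new answer and return True when refinement should stop."""
        if self._prev_answer is not None and exact_match(answer, self._prev_answer):
            self.patience_counter += 1
            if self.patience_counter >= self.patience:
                self._halt = True
        else:
            # Assumption: a changed answer restarts the count.
            self.patience_counter = 0
        self._prev_answer = answer
        return self._halt

    def reset(self) -> None:
        """Clear all state so the tracker can be reused for a new question."""
        self._prev_answer = None
        self.patience_counter = 0
        self._halt = False


tracker = PatienceTracker(patience=2)
history = [tracker.update(a) for a in ["x = 1", "x = 1", "x = 1"]]
```

For code answers, whether two semantically identical programs with different formatting should count as "the same" is exactly the question the comment raises about EM; a string comparison like the one above treats them as different.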
105-115: LGTM!
The create_output_dict method is straightforward and correctly implemented.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L105 and #L115 were not covered by tests
117-150: Verify the format of the updated answer string
The update_answer_based_on_critique method extracts Python code from the updated answer string by splitting on the opening "```python" fence and the closing "```" fence. Ensure that the format of the updated answer string is consistent and that edge cases are handled.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L117, #L139, #L148, and #L150 were not covered by tests
152-160: LGTM!
The halting_condition method is straightforward and correctly implemented.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L152 and #L160 were not covered by tests
162-172: LGTM!
The reset method is straightforward and correctly implemented.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L162 and #L170-L172 were not covered by tests
175-178: LGTM!
The SelfRefineHEvalStrategy class inherits from SelfRefineCodeStrategy and does not add any new functionality.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L175 and #L178 were not covered by tests
181-184: LGTM!
The SelfRefineMBPPStrategy class inherits from SelfRefineCodeStrategy and does not add any new functionality.
Tools
GitHub Check: codecov/patch
[warning] Added lines #L181 and #L184 were not covered by tests
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- agential/cog/self_refine/factory.py (6 hunks)
Additional comments not posted (6)
agential/cog/self_refine/factory.py (6)
43-52: Imports for HUMANEVAL and MBPP constants look good. The added import statements for HUMANEVAL and MBPP related constants are correct and necessary for the functionality.
66-69: Imports for HUMANEVAL and MBPP strategies look good. The added import statements for HUMANEVAL and MBPP strategies are correct and necessary for the functionality.
80-81: Additions to SELF_REFINE_BENCHMARK_FEWSHOTS look good. The HUMANEVAL and MBPP benchmarks were correctly added to the SELF_REFINE_BENCHMARK_FEWSHOTS dictionary.
121-128: Additions to SELF_REFINE_PROMPTS look good. The HUMANEVAL and MBPP benchmarks were correctly added to the SELF_REFINE_PROMPTS dictionary with their respective prompts.
161-168: Additions to SELF_REFINE_FEWSHOTS look good. The HUMANEVAL and MBPP benchmarks were correctly added to the SELF_REFINE_FEWSHOTS dictionary with their respective few-shot examples.
179-180: Additions to SELF_REFINE_STRATEGIES look good. The HUMANEVAL and MBPP strategies were correctly added to the SELF_REFINE_STRATEGIES dictionary.
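The dictionary additions approved above follow a registry pattern: a benchmark name maps to the strategy class that handles it. The sketch below illustrates that pattern only. The stand-in classes, the `STRATEGIES` mapping, the lowercase keys, and the `get_strategy` helper are assumptions made for illustration and are not taken from `factory.py`.

```python
from typing import Any, Dict, Type


class SelfRefineCodeStrategy:
    """Stand-in for the real strategy class; it only stores its arguments."""

    def __init__(self, llm: Any, patience: int = 1) -> None:
        self.llm = llm
        self.patience = patience


class SelfRefineHEvalStrategy(SelfRefineCodeStrategy):
    """Stand-in subclass for the HumanEval benchmark."""


class SelfRefineMBPPStrategy(SelfRefineCodeStrategy):
    """Stand-in subclass for the MBPP benchmark."""


# Hypothetical registry keyed by benchmark name (the keys are illustrative).
STRATEGIES: Dict[str, Type[SelfRefineCodeStrategy]] = {
    "humaneval": SelfRefineHEvalStrategy,
    "mbpp": SelfRefineMBPPStrategy,
}


def get_strategy(benchmark: str, **kwargs: Any) -> SelfRefineCodeStrategy:
    """Look up the strategy class for a benchmark and instantiate it."""
    if benchmark not in STRATEGIES:
        raise ValueError(f"Unsupported benchmark: {benchmark}")
    return STRATEGIES[benchmark](**kwargs)
```

With this shape, supporting a new benchmark means adding one entry to the mapping, which matches the kind of change reviewed here.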
Actionable comments posted: 12
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (6)
- agential/cog/self_refine/factory.py (6 hunks)
- agential/cog/self_refine/prompts.py (28 hunks)
- agential/cog/self_refine/strategies/code.py (1 hunks)
- notebooks/critic.ipynb (1 hunks)
- notebooks/reflexion.ipynb (1 hunks)
- tests/cog/self_refine/strategies/test_code.py (1 hunks)
Files not summarized due to errors (1)
- agential/cog/self_refine/prompts.py: Error: Message exceeds token limit
Files skipped from review due to trivial changes (2)
- notebooks/critic.ipynb
- notebooks/reflexion.ipynb
Files skipped from review as they are similar to previous changes (1)
- agential/cog/self_refine/factory.py
Additional context used
GitHub Check: codecov/patch
agential/cog/self_refine/strategies/code.py
[warning] 27-31: agential/cog/self_refine/strategies/code.py#L27-L31
Added lines #L27 - L31 were not covered by tests
[warning] 53-53: agential/cog/self_refine/strategies/code.py#L53
Added line #L53 was not covered by tests
[warning] 60-60: agential/cog/self_refine/strategies/code.py#L60
Added line #L60 was not covered by tests
[warning] 62-62: agential/cog/self_refine/strategies/code.py#L62
Added line #L62 was not covered by tests
[warning] 87-87: agential/cog/self_refine/strategies/code.py#L87
Added line #L87 was not covered by tests
[warning] 96-99: agential/cog/self_refine/strategies/code.py#L96-L99
Added lines #L96 - L99 were not covered by tests
[warning] 101-101: agential/cog/self_refine/strategies/code.py#L101
Added line #L101 was not covered by tests
[warning] 103-103: agential/cog/self_refine/strategies/code.py#L103
Added line #L103 was not covered by tests
[warning] 115-115: agential/cog/self_refine/strategies/code.py#L115
Added line #L115 was not covered by tests
[warning] 139-139: agential/cog/self_refine/strategies/code.py#L139
Added line #L139 was not covered by tests
[warning] 148-148: agential/cog/self_refine/strategies/code.py#L148
Added line #L148 was not covered by tests
[warning] 150-150: agential/cog/self_refine/strategies/code.py#L150
Added line #L150 was not covered by tests
[warning] 160-160: agential/cog/self_refine/strategies/code.py#L160
Added line #L160 was not covered by tests
[warning] 170-172: agential/cog/self_refine/strategies/code.py#L170-L172
Added lines #L170 - L172 were not covered by tests
Additional comments not posted (10)
tests/cog/self_refine/strategies/test_code.py (1)
1-16: LGTM! Import statements are correct. The import statements are necessary for the functionality of the tests and are correctly included.
agential/cog/self_refine/strategies/code.py (1)
105-116: LGTM! The method correctly creates the output dictionary. The create_output_dict method is straightforward and correctly creates the output dictionary.
Tools
GitHub Check: codecov/patch
[warning] 115-115: agential/cog/self_refine/strategies/code.py#L115
Added line #L115 was not covered by tests
agential/cog/self_refine/prompts.py (8)
1148-1150: Approved: Improved clarity in instructions. The updated instruction provides clearer guidance for writing Python code to solve the questions.
1161-1206: Approved: Correct identification of inefficiency in example code. The critique correctly identifies the inefficiency in the example code and suggests a better solution.
1214-1222: Approved: Correct identification of variable naming error. The critique correctly identifies the variable naming error in the example code and suggests a better solution.
1232-1245: Approved: Correct identification of logical error. The critique correctly identifies the logical error in the example code and suggests a better solution.
1261-1289: Approved: Correct identification of logical error. The critique correctly identifies the logical error in the example code and suggests a better solution.
1303-1318: Approved: Correct identification of logical error. The critique correctly identifies the logical error in the example code and suggests a better solution.
1338-1383: Approved: Correct identification of inefficiency in example code. The critique correctly identifies the inefficiency in the example code and suggests a better solution.
1399-1407: Approved: Correct identification of variable naming error. The critique correctly identifies the variable naming error in the example code and suggests a better solution.
def test_init() -> None:
    """Test SelfRefineCodeStrategy initialization."""
Implement the test for initialization.
The function test_init is defined but not implemented. It should be implemented to test the initialization of SelfRefineCodeStrategy.
def test_init() -> None:
    """Test SelfRefineCodeStrategy initialization."""
    llm = FakeListChatModel(responses=[])
    strategy = SelfRefineCodeStrategy(llm=llm, patience=2)
    assert strategy.llm == llm
    assert strategy.patience == 2
    assert strategy._prev_code_answer == ""
    assert strategy.patience_counter == 0
    assert strategy._halt is False
def test_generate() -> None:
    """Tests SelfRefineCodeStrategy generate."""
Implement the test for generate method.
The function test_generate is defined but not implemented. It should be implemented to test the generate method of SelfRefineCodeStrategy.
def test_generate() -> None:
    """Tests SelfRefineCodeStrategy generate."""
    # The fake model needs a canned response for the assertion to be deterministic.
    llm = FakeListChatModel(responses=["4"])
    strategy = SelfRefineCodeStrategy(llm=llm)
    question = "What is 2 + 2?"
    examples = "Example: 1 + 1 = 2"
    prompt = "Solve the following math problem."
    additional_keys = {}
    answer = strategy.generate(question, examples, prompt, additional_keys)
    assert answer == "4"
def test_generate_critique() -> None:
    """Tests SelfRefineCodeStrategy generate_critique."""
Implement the test for generate_critique method.
The function test_generate_critique is defined but not implemented. It should be implemented to test the generate_critique method of SelfRefineCodeStrategy.
def test_generate_critique() -> None:
    """Tests SelfRefineCodeStrategy generate_critique."""
    # The fake model needs a canned response for the assertion to be deterministic.
    llm = FakeListChatModel(responses=["The answer is correct."])
    strategy = SelfRefineCodeStrategy(llm=llm)
    question = "What is 2 + 2?"
    examples = "Example: 1 + 1 = 2"
    answer = "4"
    prompt = "Critique the following answer."
    additional_keys = {}
    critique = strategy.generate_critique(question, examples, answer, prompt, additional_keys)
    assert critique == "The answer is correct."
def test_create_output_dict() -> None:
    """Tests SelfRefineCodeStrategy create_output_dict."""
Implement the test for create_output_dict method.
The function test_create_output_dict is defined but not implemented. It should be implemented to test the create_output_dict method of SelfRefineCodeStrategy.
def test_create_output_dict() -> None:
    """Tests SelfRefineCodeStrategy create_output_dict."""
    llm = FakeListChatModel(responses=[])
    strategy = SelfRefineCodeStrategy(llm=llm)
    answer = "4"
    critique = "The answer is correct."
    output_dict = strategy.create_output_dict(answer, critique)
    assert output_dict == {"answer": answer, "critique": critique}
def test_update_answer_based_on_critique() -> None:
    """Tests SelfRefineCodeStrategy update_answer_based_on_critique."""
Implement the test for update_answer_based_on_critique method.
The function test_update_answer_based_on_critique is defined but not implemented. It should be implemented to test the update_answer_based_on_critique method of SelfRefineCodeStrategy.
def test_update_answer_based_on_critique() -> None:
    """Tests SelfRefineCodeStrategy update_answer_based_on_critique."""
    # The fake model needs a canned response for the assertion to be deterministic.
    llm = FakeListChatModel(responses=["4"])
    strategy = SelfRefineCodeStrategy(llm=llm)
    question = "What is 2 + 2?"
    examples = "Example: 1 + 1 = 2"
    answer = "4"
    critique = "The answer is correct."
    prompt = "Refine the following answer."
    additional_keys = {}
    new_answer = strategy.update_answer_based_on_critique(question, examples, answer, critique, prompt, additional_keys)
    assert new_answer == "4"
def test_instantiate_strategies() -> None:
    """Test instantiate all Code strategies."""
Implement the test for instantiating strategies.
The function test_instantiate_strategies is defined but not implemented. It should be implemented to test the instantiation of all code strategies.
def test_instantiate_strategies() -> None:
    """Test instantiate all Code strategies."""
    llm = FakeListChatModel(responses=[])
    heval_strategy = SelfRefineHEvalStrategy(llm=llm)
    mbpp_strategy = SelfRefineMBPPStrategy(llm=llm)
    assert isinstance(heval_strategy, SelfRefineHEvalStrategy)
    assert isinstance(mbpp_strategy, SelfRefineMBPPStrategy)
def __init__(self, llm: BaseChatModel, patience: int = 1) -> None:
    """Initialization."""
    super().__init__(llm)
    self.patience = patience
    self._prev_code_answer = ""
    self.patience_counter = 0
    self._halt = False
Add type check for llm argument.
The __init__ method correctly initializes the attributes. However, a type check for the llm argument should be added to ensure it is an instance of BaseChatModel.
def __init__(self, llm: BaseChatModel, patience: int = 1) -> None:
    """Initialization."""
    if not isinstance(llm, BaseChatModel):
        raise TypeError("llm must be an instance of BaseChatModel")
    super().__init__(llm)
    self.patience = patience
    self._prev_code_answer = ""
    self.patience_counter = 0
    self._halt = False
Tools
GitHub Check: codecov/patch
[warning] 27-31: agential/cog/self_refine/strategies/code.py#L27-L31
Added lines #L27 - L31 were not covered by tests
def generate(
    self,
    question: str,
    examples: str,
    prompt: str,
    additional_keys: Dict[str, str],
    **kwargs: Dict[str, Any],
) -> str:
    """Generates an answer for the given question using the provided prompt and examples.

    Args:
        question (str): The math question to generate an answer for.
        examples (str): Few-shot examples to guide the language model.
        prompt (str): The prompt to generate an answer.
        additional_keys (Dict[str, str]): Additional keys for the prompt.
        **kwargs (Dict[str, Any]): Additional arguments.

    Returns:
        str: The generated answer.
    """
    answer = _prompt_agent(
        llm=self.llm,
        question=question,
        examples=examples,
        prompt=prompt,
        additional_keys=additional_keys,
    )
    answer = answer.split("```python")[-1].split("```")[0].strip()

    return answer
Improve answer extraction logic.
The generate method processes the answer to extract the Python code. The extraction logic can be improved for clarity and robustness.
def generate(
    self,
    question: str,
    examples: str,
    prompt: str,
    additional_keys: Dict[str, str],
    **kwargs: Dict[str, Any],
) -> str:
    """Generates an answer for the given question using the provided prompt and examples."""
    answer = _prompt_agent(
        llm=self.llm,
        question=question,
        examples=examples,
        prompt=prompt,
        additional_keys=additional_keys,
    )
    # Extract the Python code from the answer
    python_code = answer.partition("```python")[-1].partition("```")[0].strip()
    return python_code
Tools
GitHub Check: codecov/patch
[warning] 53-53: agential/cog/self_refine/strategies/code.py#L53
Added line #L53 was not covered by tests
[warning] 60-60: agential/cog/self_refine/strategies/code.py#L60
Added line #L60 was not covered by tests
[warning] 62-62: agential/cog/self_refine/strategies/code.py#L62
Added line #L62 was not covered by tests
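One point to weigh when comparing the two extraction approaches above: `split` returns the whole response when no fenced block is present, whereas `partition` returns an empty string in that case. A regex-based helper can make the fallback explicit. The sketch below is one possible approach under that assumption, not the project's implementation; the name `extract_python_code` is hypothetical.

```python
import re

# Matches a python-labelled fenced block; tolerates a missing closing fence.
_PYTHON_FENCE = re.compile(
    r"```python[ \t]*\n(.*?)(?:```|\Z)", re.DOTALL | re.IGNORECASE
)


def extract_python_code(text: str) -> str:
    """Return the last python-fenced block in text, or the stripped text if none exists."""
    matches = _PYTHON_FENCE.findall(text)
    if matches:
        # Keep the last block, mirroring split("```python")[-1] in the original code.
        return matches[-1].strip()
    # Fallback: no fenced block found, so treat the whole response as code.
    return text.strip()
```

Taking the last block preserves the behavior of the original `split`-based code when the model emits several fenced blocks in one response.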
def generate_critique(
    self,
    question: str,
    examples: str,
    answer: str,
    prompt: str,
    additional_keys: Dict[str, str],
) -> str:
    """Generates a critique for the provided answer using the given prompt and examples.

    Stops early if patience is reached and answer remains the same.

    Args:
        question (str): The math question that was answered.
        examples (str): Few-shot examples to guide the language model in generating the critique.
        answer (str): The answer to be critiqued.
        prompt (str): The prompt to generate a critique.
        additional_keys (Dict[str, str]): Additional keys for the prompt.

    Returns:
        str: The generated critique. If the same incorrect answer is repeated for the number of
            interactions specified by patience, the halting condition is triggered.
    """
    critique = _prompt_critique(
        llm=self.llm,
        question=question,
        examples=examples,
        answer=answer,
        prompt=prompt,
        additional_keys=additional_keys,
    )

    if EM(answer.strip(), self._prev_code_answer, normalize=False):
        self.patience_counter += 1
        if self.patience_counter == self.patience:
            self._halt = True
    else:
        self._prev_code_answer = answer.strip()

    return critique
Improve halting condition logic.
The generate_critique method checks for halting conditions based on the patience attribute. The logic for checking the halting condition can be improved for clarity.
def generate_critique(
    self,
    question: str,
    examples: str,
    answer: str,
    prompt: str,
    additional_keys: Dict[str, str],
) -> str:
    """Generates a critique for the provided answer using the given prompt and examples."""
    critique = _prompt_critique(
        llm=self.llm,
        question=question,
        examples=examples,
        answer=answer,
        prompt=prompt,
        additional_keys=additional_keys,
    )
    # Check for halting condition
    if EM(answer.strip(), self._prev_code_answer, normalize=False):
        self.patience_counter += 1
        if self.patience_counter >= self.patience:
            self._halt = True
    else:
        self._prev_code_answer = answer.strip()
        self.patience_counter = 0  # Reset counter if the answer changes
    return critique
Tools
GitHub Check: codecov/patch
[warning] 87-87: agential/cog/self_refine/strategies/code.py#L87
Added line #L87 was not covered by tests
[warning] 96-99: agential/cog/self_refine/strategies/code.py#L96-L99
Added lines #L96 - L99 were not covered by tests
[warning] 101-101: agential/cog/self_refine/strategies/code.py#L101
Added line #L101 was not covered by tests
[warning] 103-103: agential/cog/self_refine/strategies/code.py#L103
Added line #L103 was not covered by tests
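The halting logic suggested above can be exercised in isolation, without an LLM. The sketch below pulls the counter into a small standalone class; `PatienceTracker` and its field names are hypothetical, and a plain string comparison stands in for the `EM(..., normalize=False)` call, which is an assumption about that helper's behavior.

```python
from dataclasses import dataclass


@dataclass
class PatienceTracker:
    """Tracks consecutive repeated answers and signals when to halt."""

    patience: int = 1
    prev_answer: str = ""
    counter: int = 0
    halt: bool = False

    def update(self, answer: str) -> bool:
        """Record an answer and return True once it has repeated `patience` times."""
        answer = answer.strip()
        if answer == self.prev_answer:
            self.counter += 1
            # >= rather than == so the flag is still set if the counter overshoots.
            if self.counter >= self.patience:
                self.halt = True
        else:
            self.prev_answer = answer
            self.counter = 0  # A changed answer restarts the count.
        return self.halt
```

Because `prev_answer` starts as an empty string, an empty first answer counts as a repeat here, as it would with the `_prev_code_answer = ""` initial value in the code under review.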
def update_answer_based_on_critique(
    self,
    question: str,
    examples: str,
    answer: str,
    critique: str,
    prompt: str,
    additional_keys: Dict[str, str],
) -> str:
    """Updates the answer based on the given critique.

    Args:
        question: The question that was answered by the language model.
        examples: Few-shot examples to guide the language model.
        answer: The answer provided by the language model.
        critique: The critique of the answer.
        prompt: The prompt to be used for generating the updated answer.
        additional_keys: Additional context or parameters to include in the critique prompt.

    Returns:
        str: The updated answer.
    """
    new_answer = _prompt_refine(
        llm=self.llm,
        question=question,
        examples=examples,
        answer=answer,
        critique=critique,
        prompt=prompt,
        additional_keys=additional_keys,
    )
    new_answer = new_answer.split("```python")[-1].split("```")[0].strip()

    return new_answer
Improve new answer extraction logic.
The update_answer_based_on_critique method processes the new answer to extract the Python code. The extraction logic can be improved for clarity and robustness.
def update_answer_based_on_critique(
    self,
    question: str,
    examples: str,
    answer: str,
    critique: str,
    prompt: str,
    additional_keys: Dict[str, str],
) -> str:
    """Updates the answer based on the given critique."""
    new_answer = _prompt_refine(
        llm=self.llm,
        question=question,
        examples=examples,
        answer=answer,
        critique=critique,
        prompt=prompt,
        additional_keys=additional_keys,
    )
    # Extract the Python
Tools
GitHub Check: codecov/patch
[warning] 139-139: agential/cog/self_refine/strategies/code.py#L139
Added line #L139 was not covered by tests
[warning] 148-148: agential/cog/self_refine/strategies/code.py#L148
Added line #L148 was not covered by tests
[warning] 150-150: agential/cog/self_refine/strategies/code.py#L150
Added line #L150 was not covered by tests
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (2)
- notebooks/self_refine.ipynb (3 hunks)
- tests/cog/self_refine/strategies/test_code.py (1 hunks)
Files skipped from review as they are similar to previous changes (2)
- notebooks/self_refine.ipynb
- tests/cog/self_refine/strategies/test_code.py
Actionable comments posted: 0
Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Files selected for processing (1)
- tests/cog/self_refine/strategies/test_code.py (1 hunks)
Files skipped from review as they are similar to previous changes (1)
- tests/cog/self_refine/strategies/test_code.py
🤔 Reasoning
Explain the purpose of this PR...
🚧 Changes
Describe the changes made...
✅ PR Checklist
Summary by CodeRabbit
New Features
HUMANEVAL, MBPP, and improved instructions for AMBIGNQ.
Tests