diff --git a/changelog.md b/changelog.md index d64ae92a2..f3d69bef4 100644 --- a/changelog.md +++ b/changelog.md @@ -2,6 +2,20 @@ ## Unversioned - In Main, Not Released +### Added + +- None + +### Fixed + +- None + +### Changed + +- None + +## Version 0.9.16 - Date: 2024-01-20 + This release is going to focus on getting the feature list complete for a version 1.0 release in early 2024. To a large extent, this involves adding the "fix" feature for some rules, and double checking diff --git a/docs/rules/rule_md014.md b/docs/rules/rule_md014.md index a272d3f8b..c3578ab51 100644 --- a/docs/rules/rule_md014.md +++ b/docs/rules/rule_md014.md @@ -33,7 +33,7 @@ commands provided. ### Failure Scenarios This rule triggers if every line within a Code Block element begins with -the `$` indicator, after any leading whitespace has been removed. +the `$` indicator, after any leading space characters have been removed. ````Markdown ```shell diff --git a/docs/rules/rule_md025.md b/docs/rules/rule_md025.md index 1d8425952..e83729f25 100644 --- a/docs/rules/rule_md025.md +++ b/docs/rules/rule_md025.md @@ -103,9 +103,13 @@ to check against for multiples. | Value Name | Type | Default | Description | | -- | -- | -- | -- | | `enabled` | `boolean` | `True` | Whether the plugin rule is enabled. | -| `front_matter_title` | `string` | `title` | Name of the front-matter field that has the title associated with the document. | +| `front_matter_title` | `string` | `title` | Name of the front-matter field that has the title associated with the document.** | | `level` | `integer` | `1` | Heading level to be considered as the top-level. | +** Any leading or trailing space characters are removed from the `front_matter_title` +during processing. This value is expected not to have the `:` at the end. Therefore, +a header value of `subject:` would be entered as `subject`. 
+
 ## Origination of Rule
 
 This rule is largely inspired by the MarkdownLint rule
diff --git a/docs/rules/rule_md033.md b/docs/rules/rule_md033.md
index f8f1aff1e..572f42434 100644
--- a/docs/rules/rule_md033.md
+++ b/docs/rules/rule_md033.md
@@ -67,15 +67,17 @@ image tags than the default `!--` (HTML comment) are strongly discouraged.
 | Value Name | Type | Default | Description |
 | -- | -- | -- | -- |
 | `enabled` | `boolean` | `True` | Whether the plugin rule is enabled. |
-| `allowed_elements` | `string` | `!--,![CDATA[,!DOCTYPE` | Comma separated list of tag starts that are allowable. |
+| `allowed_elements` | `string` | `!--,![CDATA[,!DOCTYPE` | Comma separated list of tag starts that are allowable.** |
 | `allow_first_image_element` | `boolean` | `True` | Whether to allow an image HTML block. |
 
-To be clear, if using the `allowed_elements` configuration value, the supplied
-value is a comma separated list of allowable element sequences. Those
-element names are derived by taking the start of the tag and skipping
-over the start character `<`.
-From that point, the parser collects the contents of the tag up to one of the
-following:
+** The comma-separated list of items is a string with a format of `{item},...,{item}`.
+Any leading or trailing space characters surrounding the `{item}` are trimmed during
+processing. Empty `{item}` values after this trimming has been applied will generate
+a configuration error.
+
+The element names in the list are derived by taking the start of the tag and skipping
+over the start character `<`.
From that point, the parser collects the contents +of the tag up to one of the following: - the first whitespace character - the close HTML tag character (`/`) diff --git a/docs/rules/rule_md035.md b/docs/rules/rule_md035.md index 698d1f906..31380b416 100644 --- a/docs/rules/rule_md035.md +++ b/docs/rules/rule_md035.md @@ -75,8 +75,10 @@ is made, so that the following example will not trigger this rule: | `enabled` | `boolean` | `True` | Whether the plugin rule is enabled. | | `style` | `string` | `consistent` | `consistent` for consistent, or a specific marker** | -** If a specific marker is configured, it must be valid multiples (three or more) of either the -`-` character, the `_` character, or the `*` character, with optional whitespace between them. +** If a specific marker is configured, it must be valid multiples (three or more) +of either the `-` character, the `_` character, or the `*` character, with optional +whitespace between them. The specific marker cannot start or end with a space +character. ## Origination of Rule diff --git a/docs/rules/rule_md037.md b/docs/rules/rule_md037.md index b2ab618b4..9ae5e897d 100644 --- a/docs/rules/rule_md037.md +++ b/docs/rules/rule_md037.md @@ -31,7 +31,7 @@ such as `***` for combining an italics emphasis with a bold emphasis. ### Failure Scenarios This rule triggers if a pair of matching emphasis characters occur -within the same paragraph with space around either of the emphasis +within the same paragraph with unicode whitespace around either of the emphasis characters. ```Markdown diff --git a/docs/rules/rule_md041.md b/docs/rules/rule_md041.md index ce8f970e7..04833da44 100644 --- a/docs/rules/rule_md041.md +++ b/docs/rules/rule_md041.md @@ -103,7 +103,11 @@ document will not trigger this rule: | -- | -- | -- | -- | | `enabled` | `boolean` | `True` | Whether the plugin rule is enabled. | | `level` | `integer` | `1` | Level that is expected from the first heading (Atx or SetExt) in the document. 
|
-| `front_matter_title` | `string` | `title` | Name of the front-matter field that has the title associated with the document. |
+| `front_matter_title` | `string` | `title` | Name of the front-matter field that has the title associated with the document.** |
+
+** Any leading or trailing space characters are removed from the `front_matter_title`
+during processing. This value is expected not to have the `:` at the end. Therefore,
+a header value of `subject:` would be entered as `subject`.
 
 ## Origination of Rule
 
diff --git a/docs/rules/rule_md043.md b/docs/rules/rule_md043.md
index a87848cad..b83b78666 100644
--- a/docs/rules/rule_md043.md
+++ b/docs/rules/rule_md043.md
@@ -100,7 +100,12 @@ sequence is not followed by anything; it cannot be followed by any headings.
 | Value Name | Type | Default | Description |
 | -- | -- | -- | -- |
 | `enabled` | `boolean` | `True` | Whether the plugin rule is enabled. |
-| `required_headings` | `string` | `""` | Comma separated list of headings to require the document to have. |
+| `required_headings` | `string` | `""` | Comma separated list of headings to require the document to have.** |
+
+** The comma-separated list of items is a string with a format of `{item},...,{item}`.
+Any leading or trailing space characters surrounding the `{item}` are trimmed during
+processing. Empty `{item}` values after this trimming has been applied will generate
+a configuration error.
 
 For the `required_headings` list, each element is expected to be in one of two
 forms. The first form is that of a uncomplicated text Atx Heading, such as
diff --git a/docs/rules/rule_md044.md b/docs/rules/rule_md044.md
index 176e61f63..a6a902af6 100644
--- a/docs/rules/rule_md044.md
+++ b/docs/rules/rule_md044.md
@@ -87,9 +87,14 @@ this is a reparagraph
 | Value Name | Type | Default | Description |
 | -- | -- | -- | -- |
 | `enabled` | `boolean` | `True` | Whether the plugin rule is enabled.
|
-| `names` | `string` | None | Comma-separated list of proper nouns to preserve capitalization on. |
+| `names` | `string` | None | Comma-separated list of proper nouns to preserve capitalization on.** |
 | `code_blocks` | `boolean` | `True` | Search in Fenced Code Block elements and Indented Code Block elements. |
 
+** The comma-separated list of items is a string with a format of `{item},...,{item}`.
+Any leading or trailing space characters surrounding the `{item}` are trimmed during
+processing. Empty `{item}` values after this trimming has been applied will generate
+a configuration error.
+
 ## Origination of Rule
 
 This rule is largely inspired by the MarkdownLint rule
diff --git a/docs/rules/rule_md045.md b/docs/rules/rule_md045.md
index b1e2841b7..eee98b98c 100644
--- a/docs/rules/rule_md045.md
+++ b/docs/rules/rule_md045.md
@@ -28,7 +28,9 @@ sight impaired people.
 ### Failure Scenarios
 
 This rule triggers when the link label for an image has no characters or only
-whitespace characters:
+whitespace characters. As the focus of this rule is to provide text to help
+identify the image, the whitespace characters compared against are the set
+of Unicode whitespace characters.
````Markdown [](/url) diff --git a/publish/coverage.json b/publish/coverage.json index becab3292..2fbaf49c9 100644 --- a/publish/coverage.json +++ b/publish/coverage.json @@ -2,12 +2,12 @@ "projectName": "pymarkdown", "reportSource": "pytest", "branchLevel": { - "totalMeasured": 4783, - "totalCovered": 4783 + "totalMeasured": 4785, + "totalCovered": 4785 }, "lineLevel": { - "totalMeasured": 19312, - "totalCovered": 19312 + "totalMeasured": 19327, + "totalCovered": 19327 } } diff --git a/publish/test-results.json b/publish/test-results.json index b8b03ef80..75bbda23b 100644 --- a/publish/test-results.json +++ b/publish/test-results.json @@ -236,7 +236,7 @@ }, { "name": "test.extensions.test_markdown_front_matter", - "totalTests": 29, + "totalTests": 33, "failedTests": 0, "errorTests": 0, "skippedTests": 0, @@ -244,7 +244,7 @@ }, { "name": "test.extensions.test_markdown_pragma_parsing", - "totalTests": 12, + "totalTests": 14, "failedTests": 0, "errorTests": 0, "skippedTests": 0, @@ -252,7 +252,7 @@ }, { "name": "test.extensions.test_markdown_pragmas", - "totalTests": 31, + "totalTests": 32, "failedTests": 0, "errorTests": 0, "skippedTests": 0, @@ -332,7 +332,7 @@ }, { "name": "test.gfm.test_markdown_code_spans", - "totalTests": 37, + "totalTests": 38, "failedTests": 0, "errorTests": 0, "skippedTests": 0, @@ -532,7 +532,7 @@ }, { "name": "test.gfm.test_markdown_list_blocks", - "totalTests": 133, + "totalTests": 135, "failedTests": 0, "errorTests": 0, "skippedTests": 4, @@ -572,7 +572,7 @@ }, { "name": "test.gfm.test_markdown_reference_links", - "totalTests": 97, + "totalTests": 104, "failedTests": 0, "errorTests": 0, "skippedTests": 0, @@ -1364,7 +1364,7 @@ }, { "name": "test.rules.test_md033", - "totalTests": 17, + "totalTests": 18, "failedTests": 0, "errorTests": 0, "skippedTests": 0, @@ -1444,7 +1444,7 @@ }, { "name": "test.rules.test_md043", - "totalTests": 30, + "totalTests": 32, "failedTests": 0, "errorTests": 0, "skippedTests": 0, @@ -1460,7 +1460,7 @@ }, { 
"name": "test.rules.test_md045", - "totalTests": 6, + "totalTests": 7, "failedTests": 0, "errorTests": 0, "skippedTests": 0, @@ -1596,7 +1596,7 @@ }, { "name": "test.test_markdown_extra", - "totalTests": 110, + "totalTests": 114, "failedTests": 0, "errorTests": 0, "skippedTests": 0, diff --git a/pymarkdown/block_quotes/block_quote_non_fenced_helper.py b/pymarkdown/block_quotes/block_quote_non_fenced_helper.py index ce5a85117..615f53728 100644 --- a/pymarkdown/block_quotes/block_quote_non_fenced_helper.py +++ b/pymarkdown/block_quotes/block_quote_non_fenced_helper.py @@ -6,6 +6,7 @@ from pymarkdown.block_quotes.block_quote_count_helper import BlockQuoteCountHelper from pymarkdown.block_quotes.block_quote_data import BlockQuoteData +from pymarkdown.general.constants import Constants from pymarkdown.general.parser_helper import ParserHelper from pymarkdown.general.parser_logger import ParserLogger from pymarkdown.general.parser_state import ParserState @@ -195,7 +196,7 @@ def __handle_non_fenced_code_section_no_requeue( ) POGGER.debug("text_removed_by_container=[$]", removed_text) POGGER.debug("removed_text=[$]", removed_text) - if line_to_parse.strip(): + if line_to_parse.strip(Constants.ascii_whitespace): return ( line_to_parse, start_index, diff --git a/pymarkdown/block_quotes/block_quote_processor.py b/pymarkdown/block_quotes/block_quote_processor.py index 6845b9942..7d4dce407 100644 --- a/pymarkdown/block_quotes/block_quote_processor.py +++ b/pymarkdown/block_quotes/block_quote_processor.py @@ -10,6 +10,7 @@ BlockQuoteNonFencedHelper, ) from pymarkdown.container_blocks.container_grab_bag import ContainerGrabBag +from pymarkdown.general.constants import Constants from pymarkdown.general.parser_logger import ParserLogger from pymarkdown.general.parser_state import ParserState from pymarkdown.general.position_marker import PositionMarker @@ -326,10 +327,9 @@ def __handle_block_quote_block_kludges( POGGER.debug( "token_stack[x]>$", 
parser_state.token_stack[adjusted_current_count] ) - if ( - parser_state.token_stack[adjusted_current_count].is_list - and adjusted_text_to_parse.strip() - ): + if parser_state.token_stack[ + adjusted_current_count + ].is_list and adjusted_text_to_parse.strip(Constants.ascii_whitespace): POGGER.debug("\n\nBOOM\n\n") parser_state.nested_list_start = cast( ListStackToken, parser_state.token_stack[adjusted_current_count] diff --git a/pymarkdown/container_blocks/container_block_nested_processor.py b/pymarkdown/container_blocks/container_block_nested_processor.py index e9933b62d..debe2cb73 100644 --- a/pymarkdown/container_blocks/container_block_nested_processor.py +++ b/pymarkdown/container_blocks/container_block_nested_processor.py @@ -10,6 +10,7 @@ from pymarkdown.block_quotes.block_quote_data import BlockQuoteData from pymarkdown.container_blocks.container_grab_bag import ContainerGrabBag from pymarkdown.container_blocks.container_indices import ContainerIndices +from pymarkdown.general.constants import Constants from pymarkdown.general.parser_helper import ParserHelper from pymarkdown.general.parser_logger import ParserLogger from pymarkdown.general.parser_state import ParserState @@ -428,7 +429,9 @@ def __check_for_nested_list_start( POGGER.debug( "parser_state.token_document>>$<<", parser_state.token_document ) - if parser_state.nested_list_start and grab_bag.adj_line_to_parse.strip(): + if parser_state.nested_list_start and grab_bag.adj_line_to_parse.strip( + Constants.ascii_whitespace + ): ( grab_bag.start_index, indent_level, diff --git a/pymarkdown/container_blocks/container_block_non_leaf_processor.py b/pymarkdown/container_blocks/container_block_non_leaf_processor.py index 05a731a2f..21ebeace7 100644 --- a/pymarkdown/container_blocks/container_block_non_leaf_processor.py +++ b/pymarkdown/container_blocks/container_block_non_leaf_processor.py @@ -215,7 +215,9 @@ def __handle_trailing_indent_with_block_quote( if inner_token.is_block_quote_start: 
block_quote_token = cast(BlockQuoteMarkdownToken, inner_token) assert block_quote_token.bleading_spaces is not None - split_spaces = block_quote_token.bleading_spaces.split("\n") + split_spaces = block_quote_token.bleading_spaces.split( + ParserHelper.newline_character + ) grab_bag.indent_already_processed = len(split_spaces[-1]) else: assert inner_token.is_list_start diff --git a/pymarkdown/extensions/disallowed_raw_html.py b/pymarkdown/extensions/disallowed_raw_html.py index d0373400d..3138ef069 100644 --- a/pymarkdown/extensions/disallowed_raw_html.py +++ b/pymarkdown/extensions/disallowed_raw_html.py @@ -85,7 +85,7 @@ def apply_configuration( if modify_tag_names is not None: tag_config_name = f"extensions.{self.get_identifier()}.change_tag_names" for next_tag_part in modify_tag_names.split(","): - next_tag_part = next_tag_part.strip() + next_tag_part = next_tag_part.strip(" ") if not next_tag_part: raise ValueError( f"Configuration item '{tag_config_name}' contains at least one empty string." diff --git a/pymarkdown/extensions/front_matter_extension.py b/pymarkdown/extensions/front_matter_extension.py index 43d9b7239..c962ca2b0 100644 --- a/pymarkdown/extensions/front_matter_extension.py +++ b/pymarkdown/extensions/front_matter_extension.py @@ -14,6 +14,7 @@ ) from pymarkdown.extension_manager.parser_extension import ParserExtension from pymarkdown.extensions.front_matter_markdown_token import FrontMatterMarkdownToken +from pymarkdown.general.constants import Constants from pymarkdown.general.parser_logger import ParserLogger from pymarkdown.general.position_marker import PositionMarker from pymarkdown.general.source_providers import SourceProvider @@ -77,7 +78,7 @@ def process_header_if_present( Take care of processing eligibility and processing for front matter support. 
""" start_char, extracted_index = ThematicLeafBlockProcessor.is_thematic_break( - first_line_in_document.rstrip(), + first_line_in_document.rstrip(Constants.ascii_whitespace), 0, "", whitespace_allowed_between_characters=False, @@ -111,7 +112,7 @@ def __handle_document_front_matter( Optional[str], Optional[FrontMatterMarkdownToken], int, Optional[List[str]] ]: starting_line = token_to_use - clean_starting_line = starting_line.rstrip() + clean_starting_line = starting_line.rstrip(Constants.ascii_whitespace) repeat_again = True have_closing = False collected_lines: List[str] = [] @@ -119,15 +120,17 @@ def __handle_document_front_matter( next_line = None while repeat_again: next_line = source_provider.get_next_line() - if next_line and next_line.rstrip(): + if next_line and next_line.rstrip(Constants.ascii_whitespace): start_char, _ = ThematicLeafBlockProcessor.is_thematic_break( - next_line.rstrip(), + next_line.rstrip(Constants.ascii_whitespace), 0, "", whitespace_allowed_between_characters=False, ) - have_closing = ( - bool(start_char) and clean_starting_line == next_line.rstrip() + have_closing = bool( + start_char + ) and clean_starting_line == next_line.rstrip( + Constants.ascii_whitespace ) repeat_again = not have_closing elif not self.__allow_blank_lines: diff --git a/pymarkdown/extensions/pragma_token.py b/pymarkdown/extensions/pragma_token.py index b46e9b034..216bb3068 100644 --- a/pymarkdown/extensions/pragma_token.py +++ b/pymarkdown/extensions/pragma_token.py @@ -15,6 +15,7 @@ ExtensionManagerConstants, ) from pymarkdown.extension_manager.parser_extension import ParserExtension +from pymarkdown.general.constants import Constants from pymarkdown.general.parser_helper import ParserHelper from pymarkdown.general.parser_logger import ParserLogger from pymarkdown.general.position_marker import PositionMarker @@ -110,7 +111,9 @@ def look_for_pragmas( else PragmaToken.pragma_prefix ), ) - remaining_line = line_to_parse[start_index:].rstrip().lower() + 
remaining_line = ( + line_to_parse[start_index:].rstrip(Constants.ascii_whitespace).lower() + ) if remaining_line.startswith( PragmaToken.pragma_title ) and remaining_line.endswith(PragmaToken.pragma_suffix): @@ -146,12 +149,13 @@ def compile_single_pragma( prefix_length = len(PragmaToken.pragma_alternate_prefix) actual_line_number = -next_line_number - line_after_prefix = pragma_lines[next_line_number][prefix_length:].rstrip() + line_after_prefix = pragma_lines[next_line_number][prefix_length:] after_whitespace_index, _ = ParserHelper.extract_spaces(line_after_prefix, 0) assert after_whitespace_index is not None + xx = after_whitespace_index + len(PragmaToken.pragma_title) + after_whitespace_index, _ = ParserHelper.extract_spaces(line_after_prefix, xx) command_data = line_after_prefix[ - after_whitespace_index - + len(PragmaToken.pragma_title) : -len(PragmaToken.pragma_suffix) + after_whitespace_index : -len(PragmaToken.pragma_suffix) ] after_command_index, command = ParserHelper.extract_until_spaces( command_data, 0 @@ -211,7 +215,7 @@ def __handle_disable_next_line( ids_to_disable = command_data[after_command_index:].split(",") processed_ids = set() for next_id in ids_to_disable: - next_id = next_id.strip().lower() + next_id = next_id.strip(" ").lower() if not next_id: log_pragma_failure( scan_file, @@ -317,7 +321,7 @@ def __handle_disable_num_lines( ids_to_disable = command_data[after_number_index:].split(",") processed_ids = set() for next_id in ids_to_disable: - next_id = next_id.strip().lower() + next_id = next_id.strip(" ").lower() if not next_id: log_pragma_failure( scan_file, diff --git a/pymarkdown/general/tokenized_markdown.py b/pymarkdown/general/tokenized_markdown.py index 9107ebaf1..732d93a56 100644 --- a/pymarkdown/general/tokenized_markdown.py +++ b/pymarkdown/general/tokenized_markdown.py @@ -18,6 +18,7 @@ from pymarkdown.extensions.front_matter_extension import FrontMatterExtension from pymarkdown.extensions.pragma_token import PragmaToken from 
pymarkdown.general.bad_tokenization_error import BadTokenizationError +from pymarkdown.general.constants import Constants from pymarkdown.general.parser_helper import ParserHelper from pymarkdown.general.parser_logger import ParserLogger from pymarkdown.general.parser_state import ParserState @@ -337,7 +338,9 @@ def __main_pass_did_not_start_close( ) -> Tuple[Optional[List[MarkdownToken]], Optional[RequeueLineInfo]]: POGGER.debug(">>>>$", self.__tokenized_document) - if not next_line_in_document or not next_line_in_document.strip(): + if not next_line_in_document or not next_line_in_document.strip( + Constants.ascii_whitespace + ): POGGER.debug("call __parse_blocks_pass>>handle_blank_line") ( tokens_from_line, @@ -760,9 +763,10 @@ def __handle_blank_line_init( POGGER.debug("hbl>>close_only_these_blocks>>$", close_only_these_blocks) POGGER.debug("hbl>>do_include_block_quotes>>$", do_include_block_quotes) - non_whitespace_index, extracted_whitespace = ParserHelper.extract_spaces( - input_line, 0 - ) + ( + non_whitespace_index, + extracted_whitespace, + ) = ParserHelper.extract_ascii_whitespace(input_line, 0) assert extracted_whitespace is not None assert non_whitespace_index is not None return ( diff --git a/pymarkdown/inline/inline_backtick_helper.py b/pymarkdown/inline/inline_backtick_helper.py index 8608d34da..1ec3fcdd2 100644 --- a/pymarkdown/inline/inline_backtick_helper.py +++ b/pymarkdown/inline/inline_backtick_helper.py @@ -235,7 +235,7 @@ def __calculate_backtick_between_text( ] ): stripped_between_attempt = between_text[1:-1] - if len(stripped_between_attempt.strip()) != 0: + if len(stripped_between_attempt.strip(" ")) != 0: leading_whitespace, trailing_whitespace = ( between_text[0], between_text[-1], diff --git a/pymarkdown/links/link_parse_helper.py b/pymarkdown/links/link_parse_helper.py index 0c357a640..da4d95de6 100644 --- a/pymarkdown/links/link_parse_helper.py +++ b/pymarkdown/links/link_parse_helper.py @@ -101,10 +101,12 @@ def 
normalize_link_label(link_label: str) -> str: ) # Fold multiple spaces into a single space character. - link_label = ParserHelper.space_character.join(link_label.split()) + # x = link_label.split(ParserHelper.space_character) + split_label = [s for s in link_label.split(ParserHelper.space_character) if s] + link_label = ParserHelper.space_character.join(split_label) # Fold the case of any characters to their lower equivalent. - return link_label.casefold().strip() + return link_label.casefold().strip(ParserHelper.space_character) @staticmethod def look_up_link( diff --git a/pymarkdown/plugins/rule_md_014.py b/pymarkdown/plugins/rule_md_014.py index c3862686f..917f3dd66 100644 --- a/pymarkdown/plugins/rule_md_014.py +++ b/pymarkdown/plugins/rule_md_014.py @@ -55,5 +55,7 @@ def next_token(self, context: PluginScanContext, token: MarkdownToken) -> None: split_token_text = text_token.token_text.split( ParserHelper.newline_character ) - if all(next_line.strip().startswith("$") for next_line in split_token_text): + if all( + next_line.strip(" ").startswith("$") for next_line in split_token_text + ): self.report_next_token_error(context, token) diff --git a/pymarkdown/plugins/rule_md_025.py b/pymarkdown/plugins/rule_md_025.py index cf8242c4b..001f93911 100644 --- a/pymarkdown/plugins/rule_md_025.py +++ b/pymarkdown/plugins/rule_md_025.py @@ -44,7 +44,7 @@ def __validate_configuration_level(cls, found_value: int) -> None: @classmethod def __validate_configuration_title(cls, found_value: str) -> None: - found_value = found_value.strip() + found_value = found_value.strip(" ") if not found_value: raise ValueError("Empty strings are not allowable values.") if ":" in found_value: diff --git a/pymarkdown/plugins/rule_md_033.py b/pymarkdown/plugins/rule_md_033.py index c108a2fd0..e42aeaae6 100644 --- a/pymarkdown/plugins/rule_md_033.py +++ b/pymarkdown/plugins/rule_md_033.py @@ -54,9 +54,14 @@ def initialize_from_config(self) -> None: default_value="!--,![CDATA[,!DOCTYPE", ) 
self.__allowed_elements = [] - for next_element in allowed_elements.split(","): - if next_element := next_element.strip(): - self.__allowed_elements.append(next_element) + if allowed_elements := allowed_elements.strip(" "): + for next_element in allowed_elements.split(","): + if next_element := next_element.strip(" "): + self.__allowed_elements.append(next_element) + else: + raise ValueError( + "Elements in the comma-separated list cannot be empty." + ) def starting_new_file(self) -> None: """ diff --git a/pymarkdown/plugins/rule_md_035.py b/pymarkdown/plugins/rule_md_035.py index 097c16cf0..247ae2576 100644 --- a/pymarkdown/plugins/rule_md_035.py +++ b/pymarkdown/plugins/rule_md_035.py @@ -41,7 +41,7 @@ def get_details(self) -> PluginDetailsV2: def __validate_configuration_style(cls, found_value: str) -> None: if found_value == RuleMd035.__consistent_style: return - if found_value != found_value.strip(): + if found_value != found_value.strip(" "): raise ValueError( "Allowable values cannot including leading or trailing spaces." 
) diff --git a/pymarkdown/plugins/rule_md_037.py b/pymarkdown/plugins/rule_md_037.py index d4b8e4b2f..e098a068c 100644 --- a/pymarkdown/plugins/rule_md_037.py +++ b/pymarkdown/plugins/rule_md_037.py @@ -3,6 +3,7 @@ """ from typing import List, Optional, Tuple, cast +from pymarkdown.general.constants import Constants from pymarkdown.plugin_manager.plugin_details import PluginDetailsV2 from pymarkdown.plugin_manager.plugin_scan_context import PluginScanContext from pymarkdown.plugin_manager.rule_plugin import RulePlugin @@ -56,9 +57,13 @@ def __fix( assert start_token is not None adjusted_token_text = start_token.token_text if did_first_start_with_space: - adjusted_token_text = adjusted_token_text.lstrip() + adjusted_token_text = adjusted_token_text.lstrip( + Constants.unicode_whitespace.value() + ) if did_last_end_with_space: - adjusted_token_text = adjusted_token_text.rstrip() + adjusted_token_text = adjusted_token_text.rstrip( + Constants.unicode_whitespace.value() + ) self.register_fix_token_request( context, start_token, @@ -69,7 +74,9 @@ def __fix( else: if did_first_start_with_space: assert start_token is not None - adjusted_token_text = start_token.token_text.lstrip() + adjusted_token_text = start_token.token_text.lstrip( + Constants.unicode_whitespace.value() + ) self.register_fix_token_request( context, start_token, @@ -79,7 +86,9 @@ def __fix( ) if did_last_end_with_space: assert end_token is not None - adjusted_token_text = end_token.token_text.rstrip() + adjusted_token_text = end_token.token_text.rstrip( + Constants.unicode_whitespace.value() + ) self.register_fix_token_request( context, end_token, diff --git a/pymarkdown/plugins/rule_md_039.py b/pymarkdown/plugins/rule_md_039.py index 7e0fa68de..3e24f8de9 100644 --- a/pymarkdown/plugins/rule_md_039.py +++ b/pymarkdown/plugins/rule_md_039.py @@ -3,6 +3,7 @@ """ from typing import cast +from pymarkdown.general.constants import Constants from pymarkdown.plugin_manager.plugin_details import PluginDetailsV2 
from pymarkdown.plugin_manager.plugin_scan_context import PluginScanContext from pymarkdown.plugin_manager.rule_plugin import RulePlugin @@ -35,7 +36,9 @@ def next_token(self, context: PluginScanContext, token: MarkdownToken) -> None: """ if token.is_inline_link or token.is_inline_image: link_token = cast(LinkStartMarkdownToken, token) - stripped_text_from_blocks = link_token.text_from_blocks.strip() + stripped_text_from_blocks = link_token.text_from_blocks.strip( + Constants.ascii_whitespace + ) if link_token.text_from_blocks != stripped_text_from_blocks: if context.in_fix_mode: self.register_fix_token_request( diff --git a/pymarkdown/plugins/rule_md_040.py b/pymarkdown/plugins/rule_md_040.py index bae488c5f..79192f8ae 100644 --- a/pymarkdown/plugins/rule_md_040.py +++ b/pymarkdown/plugins/rule_md_040.py @@ -3,6 +3,7 @@ """ from typing import cast +from pymarkdown.general.constants import Constants from pymarkdown.plugin_manager.plugin_details import PluginDetails from pymarkdown.plugin_manager.plugin_scan_context import PluginScanContext from pymarkdown.plugin_manager.rule_plugin import RulePlugin @@ -39,5 +40,5 @@ def next_token(self, context: PluginScanContext, token: MarkdownToken) -> None: if token.is_fenced_code_block: fenced_token = cast(FencedCodeBlockMarkdownToken, token) # print(f":::>>{fenced_token.extracted_text}<<") - if not fenced_token.extracted_text.strip(): + if not fenced_token.extracted_text.strip(Constants.ascii_whitespace): self.report_next_token_error(context, token) diff --git a/pymarkdown/plugins/rule_md_041.py b/pymarkdown/plugins/rule_md_041.py index e278359bf..8f3e61a8c 100644 --- a/pymarkdown/plugins/rule_md_041.py +++ b/pymarkdown/plugins/rule_md_041.py @@ -46,7 +46,7 @@ def __validate_configuration_level(cls, found_value: int) -> None: @classmethod def __validate_configuration_title(cls, found_value: str) -> None: - found_value = found_value.strip() + found_value = found_value.strip(" ") if ":" in found_value: raise ValueError("Colons 
(:) are not allowed in the value.") @@ -66,7 +66,7 @@ def initialize_from_config(self) -> None: valid_value_fn=self.__validate_configuration_title, ) .lower() - .strip() + .strip(" ") ) def starting_new_file(self) -> None: @@ -96,7 +96,7 @@ def next_token(self, context: PluginScanContext, token: MarkdownToken) -> None: elif self.__seen_html_block_start: assert token.is_text text_token = cast(TextMarkdownToken, token) - html_block_contents = text_token.token_text.strip() + html_block_contents = text_token.token_text.strip(" ") if not html_block_contents.startswith( "

"): diff --git a/pymarkdown/plugins/rule_md_042.py b/pymarkdown/plugins/rule_md_042.py index ec886b71e..2ba5689fb 100644 --- a/pymarkdown/plugins/rule_md_042.py +++ b/pymarkdown/plugins/rule_md_042.py @@ -3,6 +3,7 @@ """ from typing import cast +from pymarkdown.general.constants import Constants from pymarkdown.plugin_manager.plugin_details import PluginDetails from pymarkdown.plugin_manager.plugin_scan_context import PluginScanContext from pymarkdown.plugin_manager.rule_plugin import RulePlugin @@ -35,6 +36,8 @@ def next_token(self, context: PluginScanContext, token: MarkdownToken) -> None: """ if token.is_inline_link or token.is_inline_image: link_token = cast(LinkStartMarkdownToken, token) - stripped_link_uri = link_token.active_link_uri.strip() + stripped_link_uri = link_token.active_link_uri.strip( + Constants.ascii_whitespace + ) if not stripped_link_uri or stripped_link_uri == "#": self.report_next_token_error(context, token) diff --git a/pymarkdown/plugins/rule_md_043.py b/pymarkdown/plugins/rule_md_043.py index 0552fdbfb..09351bc44 100644 --- a/pymarkdown/plugins/rule_md_043.py +++ b/pymarkdown/plugins/rule_md_043.py @@ -44,7 +44,7 @@ def get_details(self) -> PluginDetails: @classmethod def __validate_heading_pattern(cls, found_value: str) -> None: - if found_value.strip(): + if found_value.strip(" "): _, _, compile_error = cls.__compile(found_value) if compile_error: raise ValueError(f"Heading format not valid: {compile_error}") @@ -57,6 +57,7 @@ def __compile( compiled_lines: List[Union[str, Tuple[int, str]]] = [] are_any_wildcards = False for next_part in found_parts: + next_part = next_part.strip(" ") if next_part == "*": if compiled_lines and compiled_lines[-1] == "*": return ( @@ -83,17 +84,15 @@ def __compile( new_index, extracted_whitespace = ParserHelper.extract_ascii_whitespace( next_part, new_index ) - if not extracted_whitespace: + if ( + not extracted_whitespace + or len(extracted_whitespace) != 1 + or len(next_part) == new_index + ): return ( 
[], False, - "Element must have at least one space character after any hash characters (#).", - ) - if len(next_part) == new_index: - return ( - [], - False, - "Element must have at least one non-space character after any space characters.", + "Element must have exactly one space character and one non-space character after any hash characters (#).", ) compiled_lines.append((count, next_part[new_index:])) return compiled_lines, are_any_wildcards, None diff --git a/pymarkdown/plugins/rule_md_044.py b/pymarkdown/plugins/rule_md_044.py index d1c1bf0af..0c614e2be 100644 --- a/pymarkdown/plugins/rule_md_044.py +++ b/pymarkdown/plugins/rule_md_044.py @@ -64,10 +64,10 @@ def initialize_from_config(self) -> None: if names := self.plugin_configuration.get_string_property( "names", default_value="", - ).strip(): + ).strip(" "): lower_list: List[str] = [] for next_name in names.split(","): - next_name = next_name.strip() + next_name = next_name.strip(" ") if not next_name: raise ValueError( "Elements in the comma-separated list cannot be empty." 
diff --git a/pymarkdown/plugins/rule_md_045.py b/pymarkdown/plugins/rule_md_045.py index 9c250ebf1..64e4593b5 100644 --- a/pymarkdown/plugins/rule_md_045.py +++ b/pymarkdown/plugins/rule_md_045.py @@ -3,6 +3,7 @@ """ from typing import cast +from pymarkdown.general.constants import Constants from pymarkdown.plugin_manager.plugin_details import PluginDetails from pymarkdown.plugin_manager.plugin_scan_context import PluginScanContext from pymarkdown.plugin_manager.rule_plugin import RulePlugin @@ -35,5 +36,7 @@ def next_token(self, context: PluginScanContext, token: MarkdownToken) -> None: """ if token.is_inline_image: image_token = cast(ImageStartMarkdownToken, token) - if not image_token.text_from_blocks.strip(): + if not image_token.text_from_blocks.strip( + Constants.unicode_whitespace.value() + ): self.report_next_token_error(context, token) diff --git a/pymarkdown/plugins/rule_pml_100.py b/pymarkdown/plugins/rule_pml_100.py index 15e49ff30..f33009632 100644 --- a/pymarkdown/plugins/rule_pml_100.py +++ b/pymarkdown/plugins/rule_pml_100.py @@ -65,7 +65,7 @@ def initialize_from_config(self) -> None: if modify_tag_names is not None: tag_config_name = "plugins.disallowed-html.change_tag_names" for next_tag_part in modify_tag_names.split(","): - next_tag_part = next_tag_part.strip() + next_tag_part = next_tag_part.strip(" ") if not next_tag_part: raise ValueError( f"Configuration item '{tag_config_name}' contains at least one empty string." 
diff --git a/pymarkdown/tokens/text_markdown_token.py b/pymarkdown/tokens/text_markdown_token.py index 7aef34003..316856701 100644 --- a/pymarkdown/tokens/text_markdown_token.py +++ b/pymarkdown/tokens/text_markdown_token.py @@ -577,13 +577,17 @@ def __handle_text_token_normal( ) arrays_to_combine: List[List[str]] = [] if newlines_in_adjusted == newlines_in_whitespace: - arrays_to_combine.append(adjusted_text_token.split("\n")) + arrays_to_combine.append( + adjusted_text_token.split(ParserHelper.newline_character) + ) else: TextMarkdownToken.__handle_text_token_normal_enhanced( arrays_to_combine, text_token ) - arrays_to_combine.append(resolved_whitespace.split("\n")) + arrays_to_combine.append( + resolved_whitespace.split(ParserHelper.newline_character) + ) assert len(arrays_to_combine[0]) == len(arrays_to_combine[1]) POGGER.debug("arrays_to_combine>:$:<", arrays_to_combine) final_parts: List[str] = [] diff --git a/test/api/test_api_scan.py b/test/api/test_api_scan.py index c441385ce..8a7499cdd 100644 --- a/test/api/test_api_scan.py +++ b/test/api/test_api_scan.py @@ -234,7 +234,7 @@ def test_api_scan_recursive_for_directory(): for i in scan_result.scan_failures: itemized_scan_failures = itemized_scan_failures + "\n" + str(i) print(itemized_scan_failures) - assert len(scan_result.scan_failures) == 53 + assert len(scan_result.scan_failures) == 54 scan_failures = [] for i in scan_result.scan_failures: diff --git a/test/extensions/test_markdown_front_matter.py b/test/extensions/test_markdown_front_matter.py index 225fa9bbd..a3735e907 100644 --- a/test/extensions/test_markdown_front_matter.py +++ b/test/extensions/test_markdown_front_matter.py @@ -407,7 +407,7 @@ def test_front_matter_13(): @pytest.mark.gfm -def test_front_matter_14(): +def test_front_matter_14x(): """ Any whitespace after the three - characters in the start boundary is acceptable. 
""" @@ -431,6 +431,55 @@ def test_front_matter_14(): ) +@pytest.mark.gfm +def test_front_matter_14a(): + """ + 14 - variant + """ + + # Arrange + source_markdown = """---\x0c\x0c +Title: my document +--- +""" + expected_tokens = [ + "[front-matter(1,1):---\x0c\x0c:---:['Title: my document']:{'Title': 'my document'}]", + "[BLANK(4,1):]", + ] + expected_gfm = """""" + + # Act & Assert + act_and_assert( + source_markdown, expected_gfm, expected_tokens, config_map=config_map + ) + + +@pytest.mark.gfm +def test_front_matter_14b(): + """ + 14 - variant, but with \u00a0 which is unicode ws, but not normal whitespace + """ + + # Arrange + source_markdown = """---\u00a0\u00a0 +Title: my document +--- +""" + expected_tokens = [ + "[setext(3,1):-:3::(1,1)]", + "[text(1,1):---\u00a0\u00a0\nTitle: my document::\n]", + "[end-setext::]", + "[BLANK(4,1):]", + ] + expected_gfm = """
<h2>---\u00a0\u00a0 +Title: my document</h2>
""" + + # Act & Assert + act_and_assert( + source_markdown, expected_gfm, expected_tokens, config_map=config_map + ) + + @pytest.mark.gfm def test_front_matter_15(): """ @@ -456,6 +505,55 @@ def test_front_matter_15(): ) +@pytest.mark.gfm +def test_front_matter_15a(): + """ + Any whitespace after the three - characters in the end boundary is acceptable. + """ + + # Arrange + source_markdown = """--- +Title: my document +---\x0c\x0c +""" + expected_tokens = [ + "[front-matter(1,1):---:---\x0c\x0c:['Title: my document']:{'Title': 'my document'}]", + "[BLANK(4,1):]", + ] + expected_gfm = """""" + + # Act & Assert + act_and_assert( + source_markdown, expected_gfm, expected_tokens, config_map=config_map + ) + + +@pytest.mark.gfm +def test_front_matter_15b(): + """ + Any whitespace after the three - characters in the end boundary is acceptable. + """ + + # Arrange + source_markdown = """--- +Title: my document +---\u00a0\u00a0 +""" + expected_tokens = [ + "[tbreak(1,1):-::---]", + "[para(2,1):\n]", + "[text(2,1):Title: my document\n---\u00a0\u00a0::\n]", + "[end-para:::True]", + "[BLANK(4,1):]", + ] + expected_gfm = """
<hr />\n<p>Title: my document\n---\u00a0\u00a0</p>
""" + + # Act & Assert + act_and_assert( + source_markdown, expected_gfm, expected_tokens, config_map=config_map + ) + + @pytest.mark.gfm def test_front_matter_16(): """ diff --git a/test/extensions/test_markdown_pragma_parsing.py b/test/extensions/test_markdown_pragma_parsing.py index 5fec6dae9..68c437986 100644 --- a/test/extensions/test_markdown_pragma_parsing.py +++ b/test/extensions/test_markdown_pragma_parsing.py @@ -276,3 +276,52 @@ def test_pragma_parsing_011(): # Act & Assert act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_pragma_parsing_011a(): + """ + Test case 11: Pragma heading, but with extra spacing after the closing comment. + """ + + # Arrange + source_markdown = """\x0c\x0c\x0c +this is a paragraph +""" + expected_tokens = [ + "[para(2,1):]", + "[text(2,1):this is a paragraph:]", + "[end-para:::True]", + "[BLANK(3,1):]", + "[pragma:1:\x0c\x0c\x0c]", + ] + expected_gfm = "
<p>this is a paragraph</p>
" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_pragma_parsing_011b(): + """ + Test case 11: Pragma heading, but with extra spacing after the closing comment. + """ + + # Arrange + source_markdown = """\u00a0\u00a0 +this is a paragraph +""" + expected_tokens = [ + "[html-block(1,1)]", + "[text(1,1):\u00a0\u00a0:]", + "[end-html-block:::False]", + "[para(2,1):]", + "[text(2,1):this is a paragraph:]", + "[end-para:::True]", + "[BLANK(3,1):]", + ] + expected_gfm = """\u00a0\u00a0 +
<p>this is a paragraph</p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) diff --git a/test/extensions/test_markdown_pragmas.py b/test/extensions/test_markdown_pragmas.py index 82d2daaf3..51ae17eb6 100644 --- a/test/extensions/test_markdown_pragmas.py +++ b/test/extensions/test_markdown_pragmas.py @@ -120,6 +120,44 @@ def test_pragmas_disable_next_line_no_id(): ) +@pytest.mark.gfm +def test_pragmas_disable_next_line_no_id_more_spaces(): + """ + Test the case where we specify a 'disable-next-line' pragma, but specify no id to disable. + """ + + # Arrange + scanner = MarkdownScanner() + source_path = os.path.join( + "test", + "resources", + "pragmas", + "atx_heading_with_multiple_spaces_disable_with_no_id_ms.md", + ) + supplied_arguments = [ + "scan", + source_path, + ] + + expected_return_code = 1 + expected_output = ( + f"{source_path}:2:1: " + + "MD019: Multiple spaces are present after hash character on Atx Heading. (no-multiple-space-atx)\n" + ) + expected_error = ( + f"{source_path}:1:1: " + + "INLINE: Inline configuration command 'disable-next-line' specified a plugin with a blank id.\n" + ) + + # Act + execute_results = scanner.invoke_main(arguments=supplied_arguments) + + # Assert + execute_results.assert_results( + expected_output, expected_error, expected_return_code + ) + + @pytest.mark.gfm def test_pragmas_disable_next_line_bad_id(): """ @@ -414,14 +452,9 @@ def test_pragmas_disable_next_line_valid_id_extra_ws_after_pragma(): source_path, ] - expected_return_code = 1 - expected_output = """{source_path}:2:1: MD019: Multiple spaces are present after hash character on Atx Heading. 
(no-multiple-space-atx) -""".replace( - "{source_path}", source_path - ) - expected_error = "{source_path}:1:1: INLINE: Inline configuration specified without command.".replace( - "{source_path}", source_path - ) + expected_return_code = 0 + expected_output = "" + expected_error = "" # Act execute_results = scanner.invoke_main(arguments=supplied_arguments) @@ -450,14 +483,9 @@ def test_pragmas_disable_next_line_valid_id_extra_ws_after(): source_path, ] - expected_return_code = 1 - expected_output = """[pso[[psf[{source_path}:2:1: MD019: Multiple spaces are present after hash character on Atx Heading. (no-multiple-space-atx)]]]] -""".replace( - "{source_path}", source_path - ) - expected_error = "[pse[[ppf[{source_path}:1:1: INLINE: Inline configuration specified without command.]]]]".replace( - "{source_path}", source_path - ) + expected_return_code = 0 + expected_output = "" + expected_error = "" # Act execute_results = scanner.invoke_main(arguments=supplied_arguments) diff --git a/test/gfm/test_markdown_code_spans.py b/test/gfm/test_markdown_code_spans.py index 5d1f67805..65072742d 100644 --- a/test/gfm/test_markdown_code_spans.py +++ b/test/gfm/test_markdown_code_spans.py @@ -123,6 +123,25 @@ def test_code_spans_343(): act_and_assert(source_markdown, expected_gfm, expected_tokens) +@pytest.mark.gfm +def test_code_spans_343a(): + """ + Test case 343: variant + """ + + # Arrange + source_markdown = """`\u000cb\u000c`""" + expected_tokens = [ + "[para(1,1):]", + "[icode-span(1,1):\u000cb\u000c:`::]", + "[end-para:::True]", + ] + expected_gfm = """
<p><code>\u000cb\u000c</code></p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + @pytest.mark.gfm def test_code_spans_344(): """ diff --git a/test/gfm/test_markdown_list_blocks.py b/test/gfm/test_markdown_list_blocks.py index 99f62b8e3..5ebd6bc0a 100644 --- a/test/gfm/test_markdown_list_blocks.py +++ b/test/gfm/test_markdown_list_blocks.py @@ -8,7 +8,7 @@ @pytest.mark.gfm -def test_list_blocks_231(): +def test_list_blocks_231x(): """ Test case 231: If the list item is ordered, then it is also assigned a start number, based on the ordered list marker. """ @@ -47,6 +47,75 @@ def test_list_blocks_231(): act_and_assert(source_markdown, expected_gfm, expected_tokens) +@pytest.mark.gfm +def test_list_blocks_231a(): + """ + Test case 231: If the list item is ordered, then it is also assigned a start number, based on the ordered list marker. + """ + + # Arrange + source_markdown = """A paragraph +with two lines. +\x0c + indented code +\x0c +> A block quote.""" + expected_tokens = [ + "[para(1,1):\n]", + "[text(1,1):A paragraph\nwith two lines.::\n]", + "[end-para:::True]", + "[BLANK(3,1):\x0c]", + "[icode-block(4,5): :]", + "[text(4,5):indented code:]", + "[end-icode-block:::True]", + "[BLANK(5,1):\x0c]", + "[block-quote(6,1)::> ]", + "[para(6,3):]", + "[text(6,3):A block quote.:]", + "[end-para:::True]", + "[end-block-quote:::True]", + ] + expected_gfm = """
<p>A paragraph +with two lines.</p> +<pre><code>indented code +</code></pre> +<blockquote> +<p>A block quote.</p> +</blockquote>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_list_blocks_231b(): + """ + Test case 231: variant + """ + + # Arrange + source_markdown = """A paragraph +with two lines. +\u00a0 + indented code +\u00a0 +> A block quote.""" + expected_tokens = [ + "[para(1,1):\n\n\n \n]", + "[text(1,1):A paragraph\nwith two lines.\n\u00a0\nindented code\n\u00a0::\n\n\n\n]", + "[end-para:::True]", + "[block-quote(6,1)::> ]", + "[para(6,3):]", + "[text(6,3):A block quote.:]", + "[end-para:::True]", + "[end-block-quote:::True]", + ] + expected_gfm = """
<p>A paragraph\nwith two lines.\n\u00a0\nindented code\n\u00a0</p>\n<blockquote>\n<p>A block quote.</p>\n</blockquote>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + @pytest.mark.gfm def test_list_blocks_232(): """ diff --git a/test/gfm/test_markdown_reference_links.py b/test/gfm/test_markdown_reference_links.py index c0c8c42be..1fb5c238d 100644 --- a/test/gfm/test_markdown_reference_links.py +++ b/test/gfm/test_markdown_reference_links.py @@ -480,6 +480,190 @@ def test_reference_links_549(): act_and_assert(source_markdown, expected_gfm, expected_tokens) +@pytest.mark.gfm +def test_reference_links_549ax(): + """ + Test case 549: variant + """ + + # Arrange + source_markdown = """[ Foo bar ]: /url + +[Baz][Foo bar]""" + expected_tokens = [ + "[link-ref-def(1,1):True::foo bar: Foo bar : :/url:::::]", + "[BLANK(2,1):]", + "[para(3,1):]", + "[link(3,1):full:/url::::Foo bar:Baz:False::::]", + "[text(3,2):Baz:]", + "[end-link::]", + "[end-para:::True]", + ] + expected_gfm = """
<p><a href="/url">Baz</a></p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_reference_links_549aa(): + """ + Test case 549: variant + """ + + # Arrange + source_markdown = """[\x0cFoo\x0c\x0cbar\x0c]: /url + +[Baz][Foo bar]""" + expected_tokens = [ + "[link-ref-def(1,1):True::foo bar:\x0cFoo\x0c\x0cbar\x0c: :/url:::::]", + "[BLANK(2,1):]", + "[para(3,1):]", + "[link(3,1):full:/url::::Foo bar:Baz:False::::]", + "[text(3,2):Baz:]", + "[end-link::]", + "[end-para:::True]", + ] + expected_gfm = """
<p><a href="/url">Baz</a></p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_reference_links_549bx(): + """ + Test case 549: variant + """ + + # Arrange + source_markdown = """[Foo bar]: /url + +[Baz][ Foo bar ]""" + expected_tokens = [ + "[link-ref-def(1,1):True::foo bar:Foo bar: :/url:::::]", + "[BLANK(2,1):]", + "[para(3,1):]", + "[link(3,1):full:/url:::: Foo bar :Baz:False::::]", + "[text(3,2):Baz:]", + "[end-link::]", + "[end-para:::True]", + ] + expected_gfm = """
<p><a href="/url">Baz</a></p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_reference_links_549ba(): + """ + Test case 549: variant + """ + + # Arrange + source_markdown = """[Foo bar]: /url + +[Baz][\x0cFoo\x0c\x0cbar\x0c]""" + expected_tokens = [ + "[link-ref-def(1,1):True::foo bar:Foo bar: :/url:::::]", + "[BLANK(2,1):]", + "[para(3,1):]", + "[link(3,1):full:/url::::\x0cFoo\x0c\x0cbar\x0c:Baz:False::::]", + "[text(3,2):Baz:]", + "[end-link::]", + "[end-para:::True]", + ] + expected_gfm = """
<p><a href="/url">Baz</a></p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_reference_links_549c(): + """ + Test case 549: variant + """ + + # Arrange + source_markdown = """[Foo bar]: /url + +[Baz][\u00a0Foo bar ]""" + expected_tokens = [ + "[link-ref-def(1,1):True::foo bar:Foo bar: :/url:::::]", + "[BLANK(2,1):]", + "[para(3,1):]", + "[text(3,1):[:]", + "[text(3,2):Baz:]", + "[text(3,5):]:]", + "[text(3,6):[:]", + "[text(3,7):\u00a0Foo bar :]", + "[text(3,17):]:]", + "[end-para:::True]", + ] + expected_gfm = """
<p>[Baz][\u00a0Foo bar ]</p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_reference_links_549d(): + """ + Test case 549: variant + """ + + # Arrange + source_markdown = """[Foo bar]: /url + +[Baz][ Foo bar\u00a0]""" + expected_tokens = [ + "[link-ref-def(1,1):True::foo bar:Foo bar: :/url:::::]", + "[BLANK(2,1):]", + "[para(3,1):]", + "[text(3,1):[:]", + "[text(3,2):Baz:]", + "[text(3,5):]:]", + "[text(3,6):[:]", + "[text(3,7): Foo bar\u00a0:]", + "[text(3,17):]:]", + "[end-para:::True]", + ] + expected_gfm = """
<p>[Baz][ Foo bar\u00a0]</p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_reference_links_549e(): + """ + Test case 549: variant + """ + + # Arrange + source_markdown = """[Foo bar]: /url + +[Baz][ Foo\u00a0\u00a0bar ]""" + expected_tokens = [ + "[link-ref-def(1,1):True::foo bar:Foo bar: :/url:::::]", + "[BLANK(2,1):]", + "[para(3,1):]", + "[text(3,1):[:]", + "[text(3,2):Baz:]", + "[text(3,5):]:]", + "[text(3,6):[:]", + "[text(3,7): Foo\u00a0\u00a0bar :]", + "[text(3,17):]:]", + "[end-para:::True]", + ] + expected_gfm = """
<p>[Baz][ Foo\u00a0\u00a0bar ]</p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + @pytest.mark.gfm def test_reference_links_550(): """ diff --git a/test/resources/pragmas/atx_heading_with_multiple_spaces_disable_with_no_id_ms.md b/test/resources/pragmas/atx_heading_with_multiple_spaces_disable_with_no_id_ms.md new file mode 100644 index 000000000..8d8a4e0a2 --- /dev/null +++ b/test/resources/pragmas/atx_heading_with_multiple_spaces_disable_with_no_id_ms.md @@ -0,0 +1,4 @@ + +# My Section + +one line paragraph diff --git a/test/resources/test-issue-945.md b/test/resources/test-issue-945.md new file mode 100644 index 000000000..e85d4436a --- /dev/null +++ b/test/resources/test-issue-945.md @@ -0,0 +1,7 @@ +# Title + +The next line contains UTF characters c2a0 (NO-BREAK SPACE): + +                                                                   + +This page should break pymarkdownlnt diff --git a/test/rules/test_md033.py b/test/rules/test_md033.py index 0132d24aa..e85f6fbca 100644 --- a/test/rules/test_md033.py +++ b/test/rules/test_md033.py @@ -43,6 +43,42 @@ def test_md033_bad_configuration_allowed_elements(): ) +@pytest.mark.rules +def test_md033_bad_configuration_allowed_elements_with_empty(): + """ + Test to verify that a configuration error is thrown when supplying the + allowed_elements value with an integer that is not a string. + """ + + # Arrange + scanner = MarkdownScanner() + source_path = os.path.join( + "test", "resources", "rules", "md004", "good_list_asterisk_single_level.md" + ) + supplied_arguments = [ + "--set", + "plugins.md033.allowed_elements=html,,a", + "--strict-config", + "scan", + source_path, + ] + + expected_return_code = 1 + expected_output = "" + expected_error = ( + "BadPluginError encountered while configuring plugins:\n" + + "Elements in the comma-separated list cannot be empty." 
+ ) + + # Act + execute_results = scanner.invoke_main(arguments=supplied_arguments) + + # Assert + execute_results.assert_results( + expected_output, expected_error, expected_return_code + ) + + @pytest.mark.rules def test_md033_bad_configuration_allow_first_image_element(): """ diff --git a/test/rules/test_md043.py b/test/rules/test_md043.py index e97302b68..a98d2e5af 100644 --- a/test/rules/test_md043.py +++ b/test/rules/test_md043.py @@ -235,7 +235,7 @@ def test_md043_bad_configuration_headings_bad_whitespace(): expected_error = ( "BadPluginError encountered while configuring plugins:\n" + "The value for property 'plugins.md043.headings' is not valid: Heading format not valid: " - + "Element must have at least one space character after any hash characters (#)." + + "Element must have exactly one space character and one non-space character after any hash characters (#)." ) # Act @@ -274,9 +274,50 @@ def test_md043_bad_configuration_headings_bad_text(): expected_return_code = 1 expected_output = "" expected_error = ( - "BadPluginError encountered while configuring plugins:\n" + "\n\nBadPluginError encountered while configuring plugins:\n" + "The value for property 'plugins.md043.headings' is not valid: " - + "Heading format not valid: Element must have at least one non-space character after any space characters." + + "Heading format not valid: Element must have exactly one space character and one non-space character after any hash characters (#)." + ) + + # Act + execute_results = scanner.invoke_main( + arguments=supplied_arguments, suppress_first_line_heading_rule=False + ) + + # Assert + execute_results.assert_results( + expected_output, expected_error, expected_return_code + ) + + +@pytest.mark.rules +def test_md043_bad_configuration_headings_bad_text_2(): + """ + Test to make sure this rule does trigger with a document that + contains multiple headings and a pattern with no text. 
+ """ + + # Arrange + scanner = MarkdownScanner() + source_path = os.path.join( + "test", "resources", "rules", "md043", "good_simple_headings.md" + ) + supplied_arguments = [ + "--disable-rules", + "md024", + "--set", + "plugins.md043.headings=###### a", + "--strict-config", + "scan", + source_path, + ] + + expected_return_code = 1 + expected_output = "" + expected_error = ( + "\n\nBadPluginError encountered while configuring plugins:\n" + + "The value for property 'plugins.md043.headings' is not valid: " + + "Heading format not valid: Element must have exactly one space character and one non-space character after any hash characters (#)." ) # Act @@ -472,6 +513,41 @@ def test_md043_good_double_heading_atx_with_double_rule(): ) +@pytest.mark.rules +def test_md043_good_double_heading_atx_with_double_rule_with_spaces_in_config(): + """ + Test to make sure this rule does trigger with a document that + contains two headings and a pattern of those two headings. + """ + + # Arrange + scanner = MarkdownScanner() + source_path = os.path.join( + "test", "resources", "rules", "md043", "good_double_heading_atx.md" + ) + supplied_arguments = [ + "--set", + "plugins.md043.headings= # This is a single heading , ## Another heading ", + "--strict-config", + "scan", + source_path, + ] + + expected_return_code = 0 + expected_output = "" + expected_error = "" + + # Act + execute_results = scanner.invoke_main( + arguments=supplied_arguments, suppress_first_line_heading_rule=False + ) + + # Assert + execute_results.assert_results( + expected_output, expected_error, expected_return_code + ) + + @pytest.mark.rules def test_md043_bad_double_heading_atx_with_double_rule_bad_level(): """ diff --git a/test/rules/test_md045.py b/test/rules/test_md045.py index e2319dded..8818ae028 100644 --- a/test/rules/test_md045.py +++ b/test/rules/test_md045.py @@ -3,6 +3,7 @@ """ import os from test.markdown_scanner import MarkdownScanner +from test.utils import create_temporary_configuration_file import 
pytest @@ -87,6 +88,7 @@ def test_md045_bad_inline_image_whitespace_only(): "test", "resources", "rules", "md045", "bad_inline_image_whitespace_only.md" ) supplied_arguments = [ + "--stack-trace", "--disable-rules", "md039", "--set", @@ -111,6 +113,44 @@ def test_md045_bad_inline_image_whitespace_only(): ) +@pytest.mark.rules +def test_md045_bad_inline_image_whitespace_only_2(): + """ + Test to make sure this rule does trigger with a document that + contains inline images with alt text that is whitespace only. + """ + + # Arrange + scanner = MarkdownScanner() + source_contents = """![\u00a0](image.png]) +""" + + with create_temporary_configuration_file( + source_contents, file_name_suffix=".md" + ) as source_path: + supplied_arguments = [ + "--disable-rules", + "md039", + "scan", + source_path, + ] + + expected_return_code = 1 + expected_output = ( + f"{source_path}:1:1: " + + "MD045: Images should have alternate text (alt text) (no-alt-text)" + ) + expected_error = "" + + # Act + execute_results = scanner.invoke_main(arguments=supplied_arguments) + + # Assert + execute_results.assert_results( + expected_output, expected_error, expected_return_code + ) + + @pytest.mark.rules def test_md045_good_full_image(): """ diff --git a/test/test_markdown_extra.py b/test/test_markdown_extra.py index 95d36ef86..ab5e34c2f 100644 --- a/test/test_markdown_extra.py +++ b/test/test_markdown_extra.py @@ -1,6 +1,7 @@ """ Extra tests. """ +import os from test.markdown_scanner import MarkdownScanner from test.utils import act_and_assert, create_temporary_configuration_file @@ -1818,7 +1819,7 @@ def test_extra_019b(): @pytest.mark.gfm -def test_extra_020(): +def test_extra_020x(): """ TBD """ @@ -1955,6 +1956,96 @@ def test_extra_020b(): act_and_assert(source_markdown, expected_gfm, expected_tokens) +@pytest.mark.gfm +def test_extra_020c(): + """ + TBD + """ + + # Arrange + source_markdown = """> 1. list +> this +> \x0c +> [abc]: /url +> 1. 
that +""" + expected_tokens = [ + "[block-quote(1,1)::> \n> \n> \n> \n> \n]", + "[olist(1,3):.:1:5:: \n\n \n]", + "[para(1,6):\n]", + "[text(1,6):list\nthis::\n]", + "[end-para:::True]", + "[BLANK(3,3):\x0c]", + "[link-ref-def(4,6):True::abc:: :/url:::::]", + "[li(5,3):5::1]", + "[para(5,6):]", + "[text(5,6):that:]", + "[end-para:::True]", + "[BLANK(6,1):]", + "[end-olist:::True]", + "[end-block-quote:::True]", + ] + expected_gfm = """
<blockquote> +<ol> +<li> +<p>list +this</p> +</li> +<li> +<p>that</p> +</li> +</ol> +</blockquote>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + +@pytest.mark.gfm +def test_extra_020d(): + """ + TBD + """ + + # Arrange + source_markdown = """> 1. list +> this +> \u00a0 +> [abc]: /url +> 1. that +""" + expected_tokens = [ + "[block-quote(1,1)::> \n> \n> \n> \n> ]", + "[olist(1,3):.:1:5:: \n\n \n]", + "[para(1,6):\n\n\n]", + "[text(1,6):list\nthis\n\u00a0\n::\n\n\n]", + "[text(4,1):[:]", + "[text(4,2):abc:]", + "[text(4,5):]:]", + "[text(4,6):: /url:]", + "[end-para:::True]", + "[li(5,3):5::1]", + "[para(5,6):]", + "[text(5,6):that:]", + "[end-para:::True]", + "[BLANK(6,1):]", + "[end-olist:::True]", + "[end-block-quote:::True]", + ] + expected_gfm = """
<blockquote> +<ol> +<li>list +this +\u00a0 +[abc]: /url</li> +<li>that</li> +</ol> +</blockquote>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + @pytest.mark.gfm def test_extra_021x(): """ @@ -4642,6 +4733,55 @@ def test_extra_034e(): ) +@pytest.mark.gfm +def test_extra_035x(): + """ + TBD - from https://github.com/jackdewinter/pymarkdown/issues/945 + """ + + # Arrange + scanner = MarkdownScanner() + input_path = os.path.join("test", "resources", "test-issue-945.md") + + supplied_arguments = ["--stack-trace", "scan", input_path] + + expected_return_code = 1 + expected_output = f"{input_path}:1:2: MD010: Hard tabs [Column: 2] (no-hard-tabs)" + expected_error = "" + + # Act + execute_results = scanner.invoke_main(arguments=supplied_arguments) + + # Assert + execute_results.assert_results( + expected_output, expected_error, expected_return_code + ) + + +@pytest.mark.gfm +def test_extra_035a(): + """ + This may look like a blank line, but according to GFM, no space characters above x20 + unless using emphasis. + """ + + # Arrange + source_markdown = """The next line contains UTF characters c2a0 (NO-BREAK SPACE): +\u00a0 +This page should not break pymarkdown""" + expected_tokens = [ + "[para(1,1):\n\n]", + "[text(1,1):The next line contains UTF characters c2a0 (NO-BREAK SPACE):\n\u00a0\nThis page should not break pymarkdown::\n\n]", + "[end-para:::True]", + ] + expected_gfm = """
<p>The next line contains UTF characters c2a0 (NO-BREAK SPACE): +\u00a0 +This page should not break pymarkdown</p>
""" + + # Act & Assert + act_and_assert(source_markdown, expected_gfm, expected_tokens) + + @pytest.mark.gfm def test_extra_999(): """ diff --git a/test/utils.py b/test/utils.py index f05aa7116..3f8a2f1eb 100644 --- a/test/utils.py +++ b/test/utils.py @@ -295,6 +295,7 @@ def write_temporary_configuration( dir=directory, suffix=file_name_suffix, prefix=file_name_prefix, + encoding="utf-8", ) as outfile: if isinstance(supplied_configuration, str): outfile.write(supplied_configuration)