-
Notifications
You must be signed in to change notification settings - Fork 32
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add C++ autosummary extension #117
Conversation
b16593f
to
4266663
Compare
This is somewhat working, but perhaps we can work together to refine it. |
Is this related to overloads sharing the same docstring? |
203fbe4
to
5852fdf
Compare
Yes --- the result is a single I added an example of this: |
This is now updated to use |
I also noticed that |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First impressions.
|
||
.. rst:directive:: .. cpp-apigen-entity-summary:: entity-name | ||
|
||
Generates a summary of a single Python entity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a copy-n-paste artifact. 😄
Generates a summary of a single Python entity. | |
Generates a summary of a single C++ entity. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah this example still needs to be fixed.
Also at the moment the entity is specified by the clang usr-derived id, but it should probably be something else friendlier.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I couldn't find any documentation about the USR generation, so I often look to the src code. Some USR ids use the line number, but IIRC that is only applied to locally scoped members (like param decls and vars).
input_path, | ||
unsaved_files=[(input_path, input_content)], | ||
args=tuple(config.compiler_flags) + ("-ferror-limit=0",), | ||
options=( # TranslationUnit.PARSE_SKIP_FUNCTION_BODIES + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not ignore the function bodies?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should add a comment. If function bodies are skipped, the extent does not include the body. That interferes with detecting if there are multiple entities defined with no space between (so that they will all be documented together).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok. Yeah, a comment would've satisfied that inquiry.
Macros matching any of these patterns are undocumented. | ||
""" | ||
|
||
ignore_diagnostics: List[Pattern] = dataclasses.field(default_factory=list) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this could be made an arbitrary bool. In my research, I had a conf.py option to output these diags to the build log (defaulting to false) using the severity of the diag as a logging level. If users really want to use static analysis, then they can use clang-tidy separately.
Diagnostics can also be specified to the compiler_flags
option (which are really just arbitrary clang args); the compilation database's path might be used in this way also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have found that any compiler errors are usually an indication that the result of parsing will be wrong, sometimes in subtle ways, and should normally not be ignored. However, for tensorstore I ran into some errors related to an undefined __builtin in a standard library header that I was not able to avoid, and did not seem to cause any problems, so I added this mechanism to suppress them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know that clang will fail fast if it can't find an included lib, thus no parsing at all. That's why I mentioned the compilation database path (-p
I think) can be used when third party libs are involved.
I haven't yet looked at how you're normalizing the path relative to the conf.py path, but that might also be a factor when passing certain args to clang.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Paths are normalized using include_directory_map
. We should probably try to support compilation databases, but one tricky thing about a compilation database is that it doesn't include the compiler built-in include paths. An alternative is to have the user preprocess the desired headers first using their normal compiler. The original file/line information is preserved by the added #line
directives.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think users can specify -working-directory
to clang. Or, maybe we should provide an option to do it.
Going forward, can we change NONITPICK to NO_NIT_PICK? I keep reading it as NON_IT_PICK 🤣 |
Makes sense --- or could be: nowarnxref or something --- nitpick is a sphinx-specific term that may be confusing to readers of the C++ source code. |
ok, I rebased this branch on latest main, and the CI is failing because it now tests for Windows and the fix for that is in my other branch (based on this one). |
I wonder if we should merge this as an experimental feature, since this PR also contains the default-literal-role and highlight changes that also apply to Python and JSON. |
We could do that. I've been think on ways to refactor this ext. My priorities are
The database and diagnostics are a lesser priority for me. And I assume unit testing will just develop organically. |
To be clear, by better parsing mechanism, is that in regards to the doxygen syntax, or in regards to the C++ source itself? For specifying an entity, we could allow it to be specified by its "object name", e.g. |
Yes. I'm not sure how the C++ parsing would need improvement. I still want to provide a config option to turn off the doxygen syntax parsing, so the docstrings can be written in pure RST.
This was my thinking as well, but it would be more natural to use the parameter datatypes types for overloads (instead of a docstring specific ID). I find the ability to document multiple overloads with a single docstring very appealing - not even doxygen is capable of that. |
Sounds good.
That could be done using the C++ domain symbol resolution used for cross-references, but is a bit tricky because currently the symbol tree only gets populated as the c++ domain directives are actually run, which is too late. We would need to somehow create a separate one and pre-populate it. I also think it is often inconvenient to specify the full signature if you have a lot of template or function parameters. |
I noticed that. Is there a reason you don't just save the parsed API to a builder env var (on a config-inited event)? |
There wasn't a need for the symbol tree except for normal xrefs so I just tried to keep it working closely to how it normally does. Note that the c++ domain symbol lookup is slow if you have a lot of symbols in the same namespace because it just does a linear search over every name --- it would be better if it used a dict. |
Well I'm ok with this getting merged but there should be a warning on both config and demo doc pages. Possibly even a warning in the build log like you did for graphviz, but that seems a little excessive to me -- you went above and beyond my expectations there (again). |
Instead of a task list in the thread OP, you could use a github project to track the progress of this extension. |
Also adds facilities for saving/restoring default role state.
… options This also ensures that the default roles and highlight language are restored after generated JSON/C++/Python object descriptions are inserted.
Co-authored-by: Brendan <[email protected]>
* fix cross-refs in docs * enable cpp.apigen verbosity and show number of found decls * adjust clang imports for quicker edits * fix windows path separator compensation * support `\` and `@` as cmd prefixes - support for param direction - support for retval cmd - add new strip_comment() to strip all forms of C++ comment syntax from comment tokens' text - use new strip_comment() instead of text.lstrip() This will also preserve consistent indentation that is needed for code-blocks (which isn't supported yet) - modify index_interva.h to test some of the new features * use explicit role for cross-refs * admonition importance of Linux path separators * change admonished text (per request) * change erroneous admonition text * allow blank docstr lines to get normalized - add multiline flag for re.sub(brief/details) call - ran black on api_parser.py * add some unit tests * add a blank line to array.h docstr * change array.h until we support multline comments The indentation removed here (on a subsequent line) is interpretted as a blockquote while still considered within the :param: field. This change also satisfies what the the Doxygen parser expects; meaning unexpected indentation is prohibited amidst single a multi-lined paragraph. * requested changes * try to get raw_comment first * update tests about comment stripping * only dedent multiline comments during stripping adjust tests about leading whitespace for single line comments * [no ci] remove outdated conditional statement * latest review requests - reverted xrefs in Config docs (now that roles are fixed) - removed old algorithm for parsing docstring line-by-line - added regex to remove non-docstring comments from a raw_comment - revised docstring parsing test with pytest.mark.paramtrize - added 2 tests to check for proper removal of non-docstring comments * return None when no raw_comment exists * replace non-doc comments with blank lines also added IDs to the growing parametrized test_comment_styles() * satisfy review request about demo src (`\returns`)
I added a warning that it is experimental. |
Did I mess up the rebase? |
Not sure what happened with |
I thought it was weird when the rebased push didn't trigger the CI (only RTD's CI ran). |
Fixes #92.
This is based on #116.
TODO:
explicit(bool)
auto-conversion:order:
option as in Python apigenpython_apigen_default_groups
/python_apigen_default_order
:group:
and:relates:
in addition to doxygen-style commands@
-prefixed doxygen commands in addition to\
-prefixed ones (e.g.@param
in addition to\param
)