-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
text: add checkbox to control server-side markdown conversion #5378
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another compromise we are making is that we cannot turn on/off markdown for a specific card since this is more of a global setting.
decode = lambda bs: bs.decode("utf-8") if isinstance(bs, bytes) else bs | ||
text_arr_str = np.array( | ||
[decode(bs) for bs in text_arr.reshape(-1)] | ||
).reshape(text_arr.shape) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do note that the markdown version makes sure it accounts for null characters resulting from decoding mismatch.
In tensorboard/plugin_util.py.
source_decoded = source.decode("utf-8")
# Remove null bytes and warn if there were any, since it probably means
# we were given a bad encoding.
source = source_decoded.replace("\x00", "")
total_null_bytes += len(source_decoded) - len(source)
...
warning = ""
if total_null_bytes:
warning = (
"<!-- WARNING: discarded %d null bytes in markdown string "
"after UTF-8 decoding -->\n"
) % total_null_bytes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I considered adding that to safe_html
but wasn't sure it was warranted since the sanitizer strips the null bytes anyway (confirmed with a unit test). The reason given in the markdown version for doing the stripping is that otherwise, the markdown conversion gets confused by the null bytes:
tensorboard/tensorboard/plugin_util_test.py
Lines 175 to 181 in ee9d15e
def test_null_bytes_stripped_before_markdown_processing(self): | |
# If this function is mistakenly called with UTF-16 or UTF-32 encoded text, | |
# there will probably be a bunch of null bytes. These would be stripped by | |
# the sanitizer no matter what, but make sure we remove them before markdown | |
# interpretation to avoid affecting output (e.g. middle-word underscores | |
# would generate erroneous <em> tags like "un<em>der</em>score") and add an | |
# HTML comment with a warning. |
So the only difference really is whether we insert the diagnostic HTML comment or not. That didn't quite seem worth any potential overhead to me, but I'm happy to add it back if you prefer?
PTAL re: null byte stripping
Yeah, this is a fair point, but I guess it seemed acceptable to me for now - I suspect most users will probably log either plain or markdown consistently depending on use case, and for those users, having to toggle each card separately would be a pain. For users who really do log several different varieties of text summary, it seems to me that they'd be best served anyway by introducing a write-time parameter to summary-metadata, since that way they don't have to go through and manually toggle the individual tags to the desired state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I definitely missed the null byte test. Thanks for pointing it out.
…flow#5378) * define plugin_util.safe_html() with no markdown interpretation * make plugin_util.safe_html() handle unicode vs bytes clearly * add markdown=false request parameter to disable markdown interpretation * fix bytes to str conversion bug in text plugin no-markdown codepath * text: add checkbox to control server-side markdown conversion * yarn fix-lint
…flow#5378) * define plugin_util.safe_html() with no markdown interpretation * make plugin_util.safe_html() handle unicode vs bytes clearly * add markdown=false request parameter to disable markdown interpretation * fix bytes to str conversion bug in text plugin no-markdown codepath * text: add checkbox to control server-side markdown conversion * yarn fix-lint
Workaround for #5369. Fixes #830.
This adds a simple
[√] Enable Markdown
checkbox to the Polymer text plugin, which if deselected (it's selected by default to preserve existing behavior) will skip the markdown conversion step on the server. To do this, we introduce amarkdown
query parameter to the/data/plugin/text/text
endpoint, and add aplugin_util.safe_html()
function that does only the sanitization step ofmarkdown_to_safe_html
.I've made the checkbox persist to localstorage (like a few other Polymer properties) since I expect most people will either want it always on or always off (and for working around the slow rendering, having to re-disable it on load each time would make this basically worthless as a workaround).
There are some limitations to this approach (it's Polymer-only; client-side markdown rendering would still be nicer). Ultimately, it might make more sense to add a summary API option to specify at write time whether the data should be treated as markdown or not (as suggested in #832 (comment)). But that requires more in-depth thought since it's persisted forever, and wouldn't help users on older versions of TensorFlow or users who might want to occasionally disable the Markdown rendering for debugging, etc. So even if we do add that option, I think we might still want to keep a UI control, and have it just become a try-state of
auto
(use summary metadata) in addition totrue
andfalse
.Test plan: tested with text demo data. Confirmed I can toggle it on and off and it appropriately re-fetches and re-renders the text summary data, and that local storage persistence behaves as expected. Googlers, see cl/402948742 for internal test sync and test case.
Screenshots
Enabled

Disabled
