-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WordCount: Add types #22077
WordCount: Add types #22077
Conversation
Size Change: -3.63 kB (0%) Total Size: 1.19 MB
ℹ️ View Unchanged
|
Thank you for the thorough review and comments, @aduth – I really appreciate it 🤩 Will be looking through it and addressing the feedback in the next day or two. |
@aduth the feedback should be address, feel free to have another look when you get a chance 🙇 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've left a few thoughts for consideration, nothing major or blocking. Thanks for joining us on this typed adventure, @nb! Great to have you along 😁
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like where you've taken this! It's so nice to see where we can refactor the internal interfaces supported by the type system. I've left some notes and comments, I'll follow up with a larger comment and a diff of some changes I've been exploring.
Thanks for continuing to iterate here!
I'll share some work I did exploring some ideas I alluded to in my review #22077 (review). Some notes:
/**
* @param {string} text The text being processed
* @param {WPWordCountStrategy} type The type of count. Accepts 'words', 'characters_excluding_spaces', or 'characters_including_spaces'.
* @param {WPWordCountUserSettings} userSettings Custom settings object.
* @return {number} The word or character count.
*/
export function count( text, type, userSettings ) {
// Here, the `type` parameter has a strict type:
// 'words' | 'characters_excluding_spaces' | 'characters_including_spaces'
// But this is the JavaScript public API… we don't _really_ know what we get.
// We _do_ want our public interface to claim the correct type, however!
// Widening the public interface to `string` is undesirable.
// We can widen this type internally:
/** @type {string} */
const userType = type;
// Then we let the type system go from the wider `string` type back to our well known type.
const validType =
userType === 'characters_excluding_spaces' ||
userType === 'characters_including_spaces'
? userType
: 'words';
// Now our strict `loadSettings` function happily accepts our well-typed `validType`.
const settings = loadSettings( validType, userSettings );
// snip…
} This is exploratory and I'm happy to discuss further. There are already some very nice changes here and I'm happy to do a more narrow review if you'd prefer. You did ask for all the nitpicking 🙂 Diff without spacediff --git a/packages/wordcount/src/defaultSettings.js b/packages/wordcount/src/defaultSettings.js
index 020f44162..90394832c 100644
--- a/packages/wordcount/src/defaultSettings.js
+++ b/packages/wordcount/src/defaultSettings.js
@@ -1,9 +1,13 @@
/** @typedef {import('./index').WPWordCountStrategy} WPWordCountStrategy */
-/** @typedef {Partial<{type: WPWordCountStrategy, shortcodes: string[]}>} WPWordCountL10n */
+/**
+ * @typedef WPWordCountL10n
+ * @property {string[]} shortcodes Localized shortcodes.
+ */
/**
- * @typedef WPWordCountSettingsFields
+ * @typedef WPWordCountSettings
+ *
* @property {RegExp} HTMLRegExp Regular expression that matches HTML tags
* @property {RegExp} HTMLcommentRegExp Regular expression that matches HTML comments
* @property {RegExp} spaceRegExp Regular expression that matches spaces in HTML
@@ -14,27 +18,18 @@
* @property {RegExp} wordsRegExp Regular expression that matches words
* @property {RegExp} characters_excluding_spacesRegExp Regular expression that matches characters excluding spaces
* @property {RegExp} characters_including_spacesRegExp Regular expression that matches characters including spaces
- * @property {RegExp} shortcodesRegExp Regular expression that matches WordPress shortcodes
+ * @property {RegExp|undefined} [shortcodesRegExp] Regular expression that matches WordPress shortcodes
* @property {string[]} shortcodes List of all shortcodes
* @property {WPWordCountStrategy} type Describes what and how are we counting
- * @property {WPWordCountL10n} l10n Object with human translations
+ * @property {WPWordCountL10n|undefined} [l10n] Object with human translations
*/
-// Disable reason: JSDoc linter doesn't seem to parse the union (`&`) correctly: https://github.com/jsdoc/jsdoc/issues/1285
-/* eslint-disable jsdoc/valid-types */
/**
* Lower-level settings for word counting that can be overridden.
*
- * @typedef {Partial<WPWordCountSettingsFields>} WPWordCountUserSettings
+ * @typedef {Partial<WPWordCountSettings>} WPWordCountUserSettings
*/
-/**
- * Word counting settings that include non-optional values we set if missing
- *
- * @typedef {WPWordCountUserSettings & typeof defaultSettings} WPWordCountDefaultSettings
- */
-/* eslint-enable jsdoc/valid-types */
-
export const defaultSettings = {
HTMLRegExp: /<\/?[a-z][^>]*?>/gi,
HTMLcommentRegExp: /<!--[\s\S]*?-->/g,
@@ -108,7 +103,4 @@ export const defaultSettings = {
* \u2029 = paragraph separator
*/
characters_including_spacesRegExp: /[^\f\n\r\t\v\u00AD\u2028\u2029]/g,
- l10n: {
- type: 'words',
- },
};
diff --git a/packages/wordcount/src/index.js b/packages/wordcount/src/index.js
index f1ccce2f7..eceb8d3c9 100644
--- a/packages/wordcount/src/index.js
+++ b/packages/wordcount/src/index.js
@@ -18,7 +18,7 @@ import stripSpaces from './stripSpaces';
import transposeHTMLEntitiesToCountableChars from './transposeHTMLEntitiesToCountableChars';
/**
- * @typedef {import('./defaultSettings').WPWordCountDefaultSettings} WPWordCountSettings
+ * @typedef {import('./defaultSettings').WPWordCountSettings} WPWordCountSettings
* @typedef {import('./defaultSettings').WPWordCountUserSettings} WPWordCountUserSettings
*/
@@ -37,26 +37,18 @@ import transposeHTMLEntitiesToCountableChars from './transposeHTMLEntitiesToCoun
* @return {WPWordCountSettings} The combined settings object to be used.
*/
function loadSettings( type, userSettings ) {
- const settings = extend( defaultSettings, userSettings );
+ const settings = extend( {}, defaultSettings, userSettings, {
+ shortcodes: userSettings.l10n?.shortcodes ?? [],
+ type,
+ } );
- settings.shortcodes = settings.l10n?.shortcodes ?? [];
-
- if ( settings.shortcodes && settings.shortcodes.length ) {
+ if ( settings.shortcodes?.length ) {
settings.shortcodesRegExp = new RegExp(
'\\[\\/?(?:' + settings.shortcodes.join( '|' ) + ')[^\\]]*?\\]',
'g'
);
}
- settings.type = type || settings.l10n?.type;
-
- if (
- settings.type !== 'characters_excluding_spaces' &&
- settings.type !== 'characters_including_spaces'
- ) {
- settings.type = 'words';
- }
-
return settings;
}
@@ -121,7 +113,16 @@ function countCharacters( text, regex, settings ) {
* @return {number} The word or character count.
*/
export function count( text, type, userSettings ) {
- const settings = loadSettings( type, userSettings );
+ /** @type {string} */
+ const userType = type;
+
+ const validType =
+ userType === 'characters_excluding_spaces' ||
+ userType === 'characters_including_spaces'
+ ? userType
+ : 'words';
+
+ const settings = loadSettings( validType, userSettings );
let matchRegExp;
switch ( settings.type ) {
case 'words': Full diffdiff --git a/packages/wordcount/src/defaultSettings.js b/packages/wordcount/src/defaultSettings.js
index 020f44162..90394832c 100644
--- a/packages/wordcount/src/defaultSettings.js
+++ b/packages/wordcount/src/defaultSettings.js
@@ -1,40 +1,35 @@
/** @typedef {import('./index').WPWordCountStrategy} WPWordCountStrategy */
-/** @typedef {Partial<{type: WPWordCountStrategy, shortcodes: string[]}>} WPWordCountL10n */
-
/**
- * @typedef WPWordCountSettingsFields
- * @property {RegExp} HTMLRegExp Regular expression that matches HTML tags
- * @property {RegExp} HTMLcommentRegExp Regular expression that matches HTML comments
- * @property {RegExp} spaceRegExp Regular expression that matches spaces in HTML
- * @property {RegExp} HTMLEntityRegExp Regular expression that matches HTML entities
- * @property {RegExp} connectorRegExp Regular expression that matches word connectors, like em-dash
- * @property {RegExp} removeRegExp Regular expression that matches various characters to be removed when counting
- * @property {RegExp} astralRegExp Regular expression that matches astral UTF-16 code points
- * @property {RegExp} wordsRegExp Regular expression that matches words
- * @property {RegExp} characters_excluding_spacesRegExp Regular expression that matches characters excluding spaces
- * @property {RegExp} characters_including_spacesRegExp Regular expression that matches characters including spaces
- * @property {RegExp} shortcodesRegExp Regular expression that matches WordPress shortcodes
- * @property {string[]} shortcodes List of all shortcodes
- * @property {WPWordCountStrategy} type Describes what and how are we counting
- * @property {WPWordCountL10n} l10n Object with human translations
+ * @typedef WPWordCountL10n
+ * @property {string[]} shortcodes Localized shortcodes.
+ */
+
+/**
+ * @typedef WPWordCountSettings
+ *
+ * @property {RegExp} HTMLRegExp Regular expression that matches HTML tags
+ * @property {RegExp} HTMLcommentRegExp Regular expression that matches HTML comments
+ * @property {RegExp} spaceRegExp Regular expression that matches spaces in HTML
+ * @property {RegExp} HTMLEntityRegExp Regular expression that matches HTML entities
+ * @property {RegExp} connectorRegExp Regular expression that matches word connectors, like em-dash
+ * @property {RegExp} removeRegExp Regular expression that matches various characters to be removed when counting
+ * @property {RegExp} astralRegExp Regular expression that matches astral UTF-16 code points
+ * @property {RegExp} wordsRegExp Regular expression that matches words
+ * @property {RegExp} characters_excluding_spacesRegExp Regular expression that matches characters excluding spaces
+ * @property {RegExp} characters_including_spacesRegExp Regular expression that matches characters including spaces
+ * @property {RegExp|undefined} [shortcodesRegExp] Regular expression that matches WordPress shortcodes
+ * @property {string[]} shortcodes List of all shortcodes
+ * @property {WPWordCountStrategy} type Describes what and how are we counting
+ * @property {WPWordCountL10n|undefined} [l10n] Object with human translations
*/
-// Disable reason: JSDoc linter doesn't seem to parse the union (`&`) correctly: https://github.com/jsdoc/jsdoc/issues/1285
-/* eslint-disable jsdoc/valid-types */
/**
* Lower-level settings for word counting that can be overridden.
*
- * @typedef {Partial<WPWordCountSettingsFields>} WPWordCountUserSettings
+ * @typedef {Partial<WPWordCountSettings>} WPWordCountUserSettings
*/
-/**
- * Word counting settings that include non-optional values we set if missing
- *
- * @typedef {WPWordCountUserSettings & typeof defaultSettings} WPWordCountDefaultSettings
- */
-/* eslint-enable jsdoc/valid-types */
-
export const defaultSettings = {
HTMLRegExp: /<\/?[a-z][^>]*?>/gi,
HTMLcommentRegExp: /<!--[\s\S]*?-->/g,
@@ -108,7 +103,4 @@ export const defaultSettings = {
* \u2029 = paragraph separator
*/
characters_including_spacesRegExp: /[^\f\n\r\t\v\u00AD\u2028\u2029]/g,
- l10n: {
- type: 'words',
- },
};
diff --git a/packages/wordcount/src/index.js b/packages/wordcount/src/index.js
index f1ccce2f7..eceb8d3c9 100644
--- a/packages/wordcount/src/index.js
+++ b/packages/wordcount/src/index.js
@@ -18,8 +18,8 @@ import stripSpaces from './stripSpaces';
import transposeHTMLEntitiesToCountableChars from './transposeHTMLEntitiesToCountableChars';
/**
- * @typedef {import('./defaultSettings').WPWordCountDefaultSettings} WPWordCountSettings
- * @typedef {import('./defaultSettings').WPWordCountUserSettings} WPWordCountUserSettings
+ * @typedef {import('./defaultSettings').WPWordCountSettings} WPWordCountSettings
+ * @typedef {import('./defaultSettings').WPWordCountUserSettings} WPWordCountUserSettings
*/
/**
@@ -37,26 +37,18 @@ import transposeHTMLEntitiesToCountableChars from './transposeHTMLEntitiesToCoun
* @return {WPWordCountSettings} The combined settings object to be used.
*/
function loadSettings( type, userSettings ) {
- const settings = extend( defaultSettings, userSettings );
+ const settings = extend( {}, defaultSettings, userSettings, {
+ shortcodes: userSettings.l10n?.shortcodes ?? [],
+ type,
+ } );
- settings.shortcodes = settings.l10n?.shortcodes ?? [];
-
- if ( settings.shortcodes && settings.shortcodes.length ) {
+ if ( settings.shortcodes?.length ) {
settings.shortcodesRegExp = new RegExp(
'\\[\\/?(?:' + settings.shortcodes.join( '|' ) + ')[^\\]]*?\\]',
'g'
);
}
- settings.type = type || settings.l10n?.type;
-
- if (
- settings.type !== 'characters_excluding_spaces' &&
- settings.type !== 'characters_including_spaces'
- ) {
- settings.type = 'words';
- }
-
return settings;
}
@@ -121,7 +113,16 @@ function countCharacters( text, regex, settings ) {
* @return {number} The word or character count.
*/
export function count( text, type, userSettings ) {
- const settings = loadSettings( type, userSettings );
+ /** @type {string} */
+ const userType = type;
+
+ const validType =
+ userType === 'characters_excluding_spaces' ||
+ userType === 'characters_including_spaces'
+ ? userType
+ : 'words';
+
+ const settings = loadSettings( validType, userSettings );
let matchRegExp;
switch ( settings.type ) {
case 'words': |
@sirreal I am having a hard time understanding what problem does this diff solve? I see that we simplified one function for the sake of moving the same logic to another at the same level of abstraction (even if one level down in indirection). We had the extra safety for the |
Thanks for your patience, I've had a lot come up that has prevented me from giving this the focus it deserves. I'll try to clarify my reasoning, but I'll preface a lengthy response by saying that, after additional scrutiny, it doesn't work as well as I expected 🙂 I'll elaborate…
My goals were:
The significant improvement I see is that we describe a single interface and use it to control several types. Rather than Unfortunately, while this appeared to work very well in the implementation I shared, I've continued testing and the type system doesn't provide the safety I'd hoped. For example: import { extend } from "lodash";
export type Options = {
type: string;
enabled: boolean;
};
// ⚠️ This type includes `{ enabled: string }`, it should be incompatible with `Options`
const defaultOptions = {
type: "default-type",
enabled: "baz"
};
// I'd expect a type error here, but there are none 😫
export function parseOptions(options: Partial<Options>): Options {
return extend({}, defaultOptions, options);
} The crux of this seems to be that function assignIn<TObject, TSource1, TSource2>(
object: TObject,
source1: TSource1,
source2: TSource2
): TObject & TSource1 & TSource2; It's surprising to me that (irrelevant properties omitted) This may be related to microsoft/TypeScript#12936. |
Regarding
We did: gutenberg/packages/wordcount/src/index.js Lines 51 to 58 in aa335aa
However, the type of Code sketchingThis will log:
TypeScript module: type CountType = "A" | "B";
interface Settings {
type: CountType;
}
// Public API
// This is great as long as it's invoked by well-typed code.
// However, we have very weak guarantees about JavaScript consumers.
export function takesType(type: CountType): CountType {
// Our type is already infected if type isn't actually CountType 😱
const s: Settings = makeSettings(type);
switch (s.type) {
case "A":
case "B":
return typeFromSettings(s);
}
// We're don't need to provide a return here according to the type system because all possible branches have returns!
// This section is flagged as unreachable!
// I'll include a bit more here…
console.log("If you see this, something unexpected happened…");
return typeFromSettings(s);
}
// Public API still claims type is CountType
export function safeTakesType(type: CountType): CountType {
// Here, we're defensive. We don't trust the user provided `type`, so we cast to `unknown`
// and rely on the type system to narrow the type.
const userType: unknown = type;
const safeType = userType === "A" ? userType : "B";
const s: Settings = makeSettings(safeType);
switch (s.type) {
case "A":
case "B":
return typeFromSettings(s);
}
}
function makeSettings(type: CountType): Settings {
return { type };
}
function typeFromSettings(s: Settings): CountType {
return s.type;
} Consumer (JavaScript) import { safeTakesType, takesType } from "./type";
console.log("Unsafe:", takesType("foo"));
console.log("Safe:", safeTakesType("foo")); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @nb! The changes are nice and I'm looking forward to ticking another package off the "typed" list.
This does need a rebase, but the changes look good and the tests continue to pass without modifications. All the code changes seem to be for the better 🙂
I engaged in more academic exercises in this review than I usually would and I appreciate your patience, I learned a few things and I hope it was informative for you as well. None of my recent comments are blocking; they're part of the investigative and exploratory review.
Please, land this at your leisure 👍
switch ( settings.type ) { | ||
case 'words': | ||
matchRegExp = settings.wordsRegExp; | ||
return countWords( text, matchRegExp, settings ); | ||
case 'characters_including_spaces': | ||
matchRegExp = settings.characters_including_spacesRegExp; | ||
return countCharacters( text, matchRegExp, settings ); | ||
case 'characters_excluding_spaces': | ||
matchRegExp = settings.characters_excluding_spacesRegExp; | ||
return countCharacters( text, matchRegExp, settings ); | ||
default: | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like how this section turned out. It's nice that the count…()
functions return a number
result.
I don't think there's anything blocking. I'll try to circle around and finish this in the next few days. @nb let me know if you'd prefer to handle it. |
It’s a constant set of strings.
For consistency – all the rest of the strip* functions work like this. Those aren’t exported outside of the wordcount package, so should be safe to change the semantics.
Since the functions don't use this, we can safely use null. Also helps with type-checking :-) We still need the bind, even if it's not a class, because of `settings`. Not sure if syntax like `text => stripTags( settings, text )` will be more readable, considered it, but decided to liimt the surface of my changes.
This hides their complexity (including complex return types) inside themselves and makes makes the final `count()` function simpler. As for the `count()` function, I considered adding a TypeScript ignore, though the chance for adding a fourth type and not dealing with it here is real. Also, the three `if` clauses aren't so hard to read.
Honestly, mostly to make the linter happy, not sure how useful they are.
I totally forgot about the convention in this context. Of course, I would have expected some of the tools to have mentioned it along the way, but alas…
In order to do that the type descriptions moved to the source, instead of some of them living next to the import. One trade-off for me was that VSCode now doesn’t pick up the descriptions of the imported types.
Has better rhythm than the `if`s (for me, it’s subjective). I had tried the `switch` before, but I didn’t think of skipping the `break`s. We don’t need them with all the `returns. Technically we don’t need the `return 0` since we’re exhausting all options for `type` union. However, I’d rather err on the side of caution for cases when the codebase isn’t type checked. Props @aduth.
See https://make.wordpress.org/core/handbook/best-practices/inline-documentation-standards/javascript/#aligning-comments For the record, because of some really long strings it’s not more readable to me.
The idea came from @sirreal ([0], [1]) to use the already defined keys in defaultSettings as a natural way to have a stronger type guarantee. Once we have the narrower type, we can distinguish between settings object we expect from the user, where everything is optional and a settings object that comes from `loadSettings()` where we know that a massive amount of the properties have been set as part of `defaultSettings`. [0] https://git.io/JfVWj [1] https://git.io/JfVle
Since we have a typecheck guarantee that most properties are there, we don’t need to do all those checks. Note that we still do the checks for `shortcodesRegExp` and `type`. Since the guarantee for them doesn’t come from `defaultSettings`, but from `loadSettings()`, we would have needed yet another layer of type to describe the settings coming from `defaultSettings` + `type`, `shortcodes`, and `shortcodesRegExp`. Personally, this crossed a line, and I decided that leaving the checks in `stripShortcodes.js` and the `default` in the `switch` in `count()` were a smaller price to pay than adding yet another type and dealing with type trickery (would’ve needed a `Pick` and to disable `jsdoc/valid-types` again).
Doesn’t change functionality much, but it was more confusing spanning multiple declarations.
`extend` is an alias for [assignIn](https://lodash.com/docs/4.17.15#assignIn): > This method is like _.assign except that it iterates over own and inherited source properties. > Note: This method mutates object. Props @sirreal. Co-authored-by: Jon Surrell <[email protected]>
We know that `type` can’t be missing or falsy, so we can skip the check.
aa335aa
to
983b4d4
Compare
Rebased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving. Merge pending CI 👍
I've read through your suggestions @sirreal but I'm afraid I'll need a bit longer to let them sink in 😬 and I'd rather get this un-stuck. We can iterate later if we decide that some types can use some more tightening 👍
Part of #18838.
Add types to
wordcount
package.Testing Instructions
npm run build:package-types
–– shouldn't be any errors.I am quite new to TypeScript and adding types to Gutenberg, so for the sake of learning, bring all of your nitpicking :-)