Add Capabilities queries #2322
@tomayac looks like you’ve lots of linting errors. You can fix most of these automatically if you have your Python env set up in your src directory by running:
Gimme a shout if you need a hand. |
Oh and similarly you can run:
to make sure you’re all clean before committing if bored of waiting for GitHub Action to complete. |
All checks green now. I formatted via the BigQuery front-end, but looks like we have other preferences here, which is fine, too. I like the lint rules here better than the front-end's… |
Agree with most of ours but not a big fan of |
Looks great now. A couple of minor comments/nits.
Also is this the only query for the Capabilities chapter or will you be adding more? Noticed you had some blink counter queries last year. If adding more it would be good to add a checklist to the initial comment (see the other PRs as examples) so we can track progress for this chapter and see how far through we are in writing the queries.
Co-authored-by: Barry Pollard <[email protected]>
Thanks for the nits :-) This will be the only query then, since the queries based on use counters already exist and are available in an evergreen report. We can reference specific APIs if we need to (example). This year the idea was to focus less on quantitative analysis but focus more on qualitative aspects like how apps use these APIs. |
Actually, literally as I hit the Comment button, I recalled we wanted to do a fun analysis that would be quantitative: we wanted to determine the most Fugu page on the Internet by ordering pages by the number of Fugu APIs they use. Could you help with this? My SQL-fu is a bit rusty (as you have no doubt noticed). |
OK then think it's good to merge. Can you copy the results into the official sheet for this chapter: https://docs.google.com/spreadsheets/d/1b4moteB9EiLYkH1Ln9qfi1tnU-E4N2UQ87uayWytDKw/edit?usp=sharing |
What about something like this:

CREATE TEMP FUNCTION getFuguAPIs(data STRING)
RETURNS ARRAY<STRING>
LANGUAGE js AS '''
const $ = JSON.parse(data);
return Object.keys($);
''';

SELECT
  _TABLE_SUFFIX AS client,
  url,
  COUNT(DISTINCT fuguAPI) AS fuguAPIs
FROM
  `httparchive.pages.2021_07_01_*`,
  UNNEST(getFuguAPIs(JSON_QUERY(payload, '$."_fugu-apis"'))) AS fuguAPI
WHERE
  JSON_QUERY(payload, '$."_fugu-apis"') != "[]"
GROUP BY
  client,
  url
HAVING
  COUNT(DISTINCT fuguAPI) >= 1
ORDER BY
  fuguAPIs DESC,
  url,
  client
LIMIT 100;

First 15 results are:
|
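For anyone following along without BigQuery access, here is a rough JavaScript sketch of what the query above does: take the keys of each page's `_fugu-apis` object, count the distinct ones, and sort pages by that count. The `rankPagesByFuguAPIs` helper, the `fuguApisJson` field name, the example URLs, and the `true` values in the sample objects are all made up for illustration; only `getFuguAPIs` mirrors the UDF body from the query. It also assumes one row per client and URL, which stands in for the `GROUP BY`.

```javascript
// Same body as the UDF in the query: the API names are the object's keys.
function getFuguAPIs(data) {
  const $ = JSON.parse(data);
  return Object.keys($);
}

// Plain string comparison, used for the url and client tie-breakers.
function compareStrings(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

// Hypothetical helper: rank pages by number of distinct Fugu APIs.
// Assumes each row is already unique per (client, url).
function rankPagesByFuguAPIs(rows, limit = 100) {
  return rows
    .map(({ client, url, fuguApisJson }) => ({
      client,
      url,
      // COUNT(DISTINCT fuguAPI)
      fuguAPIs: new Set(getFuguAPIs(fuguApisJson)).size,
    }))
    // HAVING COUNT(DISTINCT fuguAPI) >= 1; this also drops "[]" payloads,
    // because an empty array has no keys.
    .filter((row) => row.fuguAPIs >= 1)
    // ORDER BY fuguAPIs DESC, url, client
    .sort(
      (a, b) =>
        b.fuguAPIs - a.fuguAPIs ||
        compareStrings(a.url, b.url) ||
        compareStrings(a.client, b.client)
    )
    // LIMIT 100
    .slice(0, limit);
}

// Illustrative rows only, not real HTTP Archive data.
const sampleRows = [
  { client: "mobile", url: "https://one.example/", fuguApisJson: '{"WebShare":true}' },
  { client: "desktop", url: "https://two.example/", fuguApisJson: '{"WebShare":true,"Gamepad":true}' },
  { client: "desktop", url: "https://three.example/", fuguApisJson: "[]" },
];

console.log(rankPagesByFuguAPIs(sampleRows));
```

The sketch is only meant to make the ranking logic easy to poke at locally; the SQL stays the source of truth for the chapter.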
Amazing, thanks for that! Just committed this query to the repo. |
BTW, on an only slightly related point, I recently updated those reports to add the rank lenses and also got them working with these blink usage queries. So you can see if the top 1,000 websites use Fugu APIs more than the average internet (they don't), or whatever. Never bothered fixing it for the CMS lenses (WordPress, Drupal and Magento) as that's a bit trickier and, particularly for these APIs, they're unlikely to be used on those sites anyway. |
Don't see it. Did you push? |
…hive.org into fugu-queries
Ooops, sorry. I pushed, but didn't see that I had to pull first in order for it to go through. |
Results for both queries added to the official results sheet: https://docs.google.com/spreadsheets/d/1b4moteB9EiLYkH1Ln9qfi1tnU-E4N2UQ87uayWytDKw/edit?usp=sharing. |
Cheers. I added the rest of the 100 since I had that tab open still. Also added a pivot table. Why is GamePad used sooo much? Does some popular embed (YouTube?) use it? Anyway think you can tick off chapter item 3 - Validate Results! 🎉 |
Oh, thanks!
Looking at ChromeStatus, it doesn't look as popular (but it's looking at concrete events, not just the API per se).
Woohoo! |
It may be worth investigating these further. For the PWA queries we specifically excluded YouTube embeds for some queries (see #2272 ) as we felt they were incorrectly clouding the stats for events that were never used. @demianrenzulli has more details. |
I found a better ChromeStatus stats entry (this was hidden in plain sight). It looks like the |
Adds queries for mobile and desktop for all the APIs we have detection for.
Related to #2152.