Bug report: "Extract Domains" actually extracts hostnames #618

Closed
gwittel opened this issue Aug 23, 2019 · 3 comments
@gwittel

gwittel commented Aug 23, 2019

Describe the bug

'Extract Domains' behaves more like 'Extract Hostnames'. Please rename it, implement the claimed functionality, or offer both as separate operations.

To Reproduce
Steps to reproduce the behavior or a link to the recipe / input used to cause the bug:

  1. Add 'Extract Domains' to a recipe
  2. Add input like http://www.host.com or http://www.host.co.uk (or leave out the schemes)
  3. Output has www.host.com or www.host.co.uk. Those would be the hostnames whereas host.com and host.co.uk are the domains.
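The distinction in step 3 can be illustrated outside the tool. The following is a minimal Python sketch, not CyberChef's implementation; `hostname_of` is a hypothetical helper that only shows that the host component of these inputs is the full `www.…` name.

```python
from urllib.parse import urlsplit


def hostname_of(text: str) -> str:
    """Return the host component of a URL-like string (hypothetical helper)."""
    # urlsplit only recognises a host when a "//" authority marker is present,
    # so prepend one for input that was given without a scheme.
    if "://" not in text:
        text = "//" + text
    return urlsplit(text).hostname or ""


for sample in ("http://www.host.com", "http://www.host.co.uk", "www.host.co.uk"):
    # Each line shows the input and the host component parsed from it.
    print(sample, "->", hostname_of(sample))
```

In each case the result keeps the leading `www` label, which matches the output described in the report.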

Expected behavior

The domain for www.host.com is host.com.

Screenshots

n/a

Desktop (if relevant, please complete the following information):

  • Mac
  • Firefox
  • 68


@gwittel gwittel added the bug label Aug 23, 2019
@n1474335
Member

n1474335 commented Aug 27, 2019

In this case, 'Domains' is being used to refer to 'Domain names'. A domain name is any name assigned via DNS to represent an IP address. In your example above, com is the Top Level Domain (TLD), host is the second level domain (a subdomain of com), and www is the third level domain (a subdomain of host). The whole thing together, www.host.com, is technically a Fully Qualified Domain Name (FQDN).
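The level terminology above can be sketched in a few lines of Python. This is only an illustration of the naming; `domain_levels` is a hypothetical helper, not part of the operation.

```python
def domain_levels(fqdn: str) -> list[str]:
    """Split a domain name into labels, ordered from the TLD downwards."""
    # Drop an optional trailing root dot, then split on the label separator.
    labels = fqdn.rstrip(".").split(".")
    # Reverse so that index 0 is the top level domain.
    return list(reversed(labels))


for depth, label in enumerate(domain_levels("www.host.com"), start=1):
    # depth 1 is the TLD, depth 2 the second level domain, and so on.
    print(f"level {depth}: {label}")
```

For www.host.com this lists com at level 1, host at level 2 and www at level 3.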

A hostname is a label that identifies a networked device. FQDNs are a type of hostname, but a hostname doesn't necessarily have to use the DNS structure. On a local network you might just assign single word hostnames to identify your devices because globally unique, fully qualified DNS is not really required at that scale.

So the 'Extract Domains' operation isn't really looking for hostnames; it's looking for fully qualified domain names, or as much of the domain name as it can find. There is a potential argument for renaming it to 'Extract Domain names'; however, I think it's reasonably clear from the context that 'Domains' means 'Domain names' in this scenario.

Does that make sense?

@n1474335 n1474335 removed the bug label Aug 27, 2019
n1474335 added a commit that referenced this issue Aug 27, 2019
@gwittel
Author

gwittel commented Aug 27, 2019

Yes, the FQDN clarification absolutely makes sense. My first intuition given the naming was toward registerable domains (one hop above public suffix). Feel free to close this out unless you want to move this toward a feature request for extracting the registerable name portion.

@n1474335
Member

I'll close it for now. Programmatically extracting only the publicly registrable domain would be quite difficult.

Consider the domain m.en.host.co.uk. Determining which label is the correct cutoff is not trivial, and I'm not sure there are many use cases that would make it worthwhile.
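To make the cutoff problem concrete, here is a rough Python sketch of the "one label above the public suffix" idea mentioned earlier in the thread. It is not something CyberChef does. `PUBLIC_SUFFIXES` is a hypothetical, hard-coded stand-in for the real Public Suffix List, and the sketch ignores that list's wildcard and exception rules, so it shows where the difficulty lies rather than solving it.

```python
# Hypothetical, deliberately tiny stand-in for the real Public Suffix List.
PUBLIC_SUFFIXES = {"com", "uk", "co.uk"}


def registrable_domain(name: str):
    """Return the public suffix plus one label, or None if there is none."""
    labels = name.lower().rstrip(".").split(".")
    # Walk from the longest candidate suffix to the shortest so that
    # "co.uk" is preferred over the shorter match "uk".
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the name is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # no known suffix matched


for sample in ("m.en.host.co.uk", "www.host.com", "co.uk"):
    # Each line shows the input and the registrable domain found for it.
    print(sample, "->", registrable_domain(sample))
```

The result is only as good as the suffix set: without an entry for co.uk, the same code would cut m.en.host.co.uk at co.uk rather than host.co.uk.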
