URLs using IP or other hostname for images in meta bundle #672

Closed
mizziness opened this issue Jun 12, 2020 · 12 comments

@mizziness

An example of this is live on our website at https://applause.com. If you view the source, the plugin outputs a JSON blob with meta bundle data. However, under graph, we see the following incorrect URLs:

...
         "@id":"@web#creator",
         "@type":"LocalBusiness",
         "alternateName":"Applause",
         "image":{
            "@type":"ImageObject",
            "height":"462",
            "url":"http://64.62.135.231/images/global/applause_brand_image.jpg",
            "width":"820"
         },
         "name":"Applause App Quality, Inc",
         "priceRange":"$",
         "url":"https://prod-website.cloud.applause.com"
...

I would expect both of these URLs to use either the baseUrl or the siteUrl. I've also tried deleting and re-adding all of the images under Global settings, but to no avail.

The data seems to be stored in the database under seomatic_metabundles with the sourceType (et al) set to __GLOBAL_BUNDLE__.

Possibly related to #577

@khalwat
Collaborator

khalwat commented Jun 16, 2020

How are your asset volumes set up? It looks to me like they may not be explicitly declared?

Also how is your siteUrl set? Or are you using the @web built in alias?

craftcms/cms#3559

@khalwat khalwat added the need info Need more information on the issue label Jun 16, 2020
@khalwat
Collaborator

khalwat commented Jun 18, 2020

I'm going to assume it's this:

https://twitter.com/nystudio107/status/1074429502615379968

https://docs.craftcms.com/v3/sites.html#creating-a-site

Don’t ever use the @web alias when defining your sites’ Base URLs. It could introduce a cache poisoning vulnerability, and Craft won’t be able to reliably determine which site is being requested

Basically, if you use the @web alias for your siteUrl or Asset Volume URLs without explicitly defining it, Yii2 tries to dynamically determine it, but it can be set to whatever the client wants it set to.

So explicitly set the @web alias or use some other explicitly defined alias or environment variable for your site's siteUrl and Asset Volume URLs.
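
For reference, a minimal sketch of what explicitly defining the alias might look like in config/general.php. It assumes a Craft 3 setup; PRIMARY_SITE_URL is a hypothetical environment variable name used only for illustration and does not come from this thread:

```php
<?php
// config/general.php (sketch only)
// PRIMARY_SITE_URL is a hypothetical environment variable, e.g. defined in .env.
return [
    'aliases' => [
        // Pin @web to a fixed value so it is not derived from the incoming request.
        '@web' => getenv('PRIMARY_SITE_URL'),
    ],
];
```

With the alias pinned this way, anything that references @web (site Base URLs, Asset Volume URLs) should resolve to the configured value rather than to whatever host the client sent.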

@khalwat khalwat closed this as completed Jun 18, 2020
@mizziness
Author

Apologies for the late reply! I do have my @web and siteUrl set definitively.

@khalwat
Collaborator

khalwat commented Jul 8, 2020

Can you show me how you're setting this @mizziness ?

@khalwat
Collaborator

khalwat commented Jul 8, 2020

Also I'd need to see how you're setting the URL for your Asset Volumes.

@mizziness
Author

mizziness commented Jul 15, 2020

Here we go! This is in my config/general.php file:

'siteUrl' => [
    'english' => 'https://' . $domain,
    'german'  => 'https://' . $domain . '/de',
    'french'  => 'https://' . $domain . '/fr',
],

It checks for $domain this way:

$prodWhitelist = array(
  'applause.com',                               // Production (Cloudfront) PostLaunch
  'www.applause.com',                           // Production (Cloudfront) PostLaunch
  'prod-www.cloud.applause.com',                // Production (Cloudfront) PreLaunch
  'prod-website.cloud.applause.com',
);

$domainWhitelist = array(
  'stage-website.devcloud.applause.com',
  'integration-website.devcloud.applause.com',
  'stage-www.devcloud.applause.com', 
  'integration-www.devcloud.applause.com',
);

if (in_array($_SERVER['HTTP_HOST'], $prodWhitelist)) {
  $domain = 'www.applause.com';
} elseif (in_array($_SERVER['HTTP_HOST'], $domainWhitelist)) {
  $domain = $_SERVER['HTTP_HOST'];
} else {
  $domain = getenv('LOCALHOST_DOMAIN');
}

The default fallback should always be one of the fully qualified domains. But even on a site whose siteUrl is hard-coded to the official domain, I still see the server IP address and the other domain.
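
The host check above could also be written as a small pure function, which makes the fall-through case easier to reason about and test. This is only a sketch: resolveDomain() and its parameter names are hypothetical, and treating a missing HTTP_HOST as null (for example in a console request) is an assumption, not something confirmed in this thread.

```php
<?php
/**
 * Sketch: pick the domain used to build siteUrl from the request host.
 * resolveDomain() is a hypothetical helper, not part of Craft or SEOmatic.
 */
function resolveDomain(
    ?string $host,
    array $prodHosts,
    array $devHosts,
    string $canonical,
    string $fallback
): string {
    // Any known production host maps to the single canonical domain.
    if ($host !== null && in_array($host, $prodHosts, true)) {
        return $canonical;
    }
    // Known staging/integration hosts are used as-is.
    if ($host !== null && in_array($host, $devHosts, true)) {
        return $host;
    }
    // Anything else (unknown host, bare IP address, no host) gets the fallback.
    return $fallback;
}

$prodWhitelist = ['applause.com', 'www.applause.com'];
$domainWhitelist = ['stage-website.devcloud.applause.com'];

$domain = resolveDomain(
    $_SERVER['HTTP_HOST'] ?? null,
    $prodWhitelist,
    $domainWhitelist,
    'www.applause.com',
    (string) getenv('LOCALHOST_DOMAIN')
);
```

Note that a request whose Host header is a bare IP address matches neither list, so it ends up with whatever LOCALHOST_DOMAIN is set to.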

Asset Volume Info:

Base Url: @defaultSiteUrl/images/global
File Path: @webroot/images/global

@defaultSiteUrl always returns https://www.applause.com. @webroot returns the correct path. Opening any of the images on https://prod-website.cloud.applause.com actually loads as https://www.applause.com when viewing the image itself.

@mizziness
Author

@khalwat Info fully provided - would you mind taking a look when you can? It's still an issue.

@khalwat
Collaborator

khalwat commented Jul 23, 2020

@mizziness I don't know exactly what would be causing this issue, except to say that all SEOmatic does is use the built-in Craft siteUrl() or asset.getUrl() methods for obtaining the full URL to a page or image.

So this is very likely going to be some kind of Craft config setup issue.

What you're showing me is a URL for an image that seems incorrect... is it possible to see the Asset Volume setup/config for the Asset Volume that this image is coming from?

Assets don't use your siteUrl or baseUrl, they use their own Asset Volume URL, because they can be hosted separately from the Craft CMS website.

@mizziness
Author

@khalwat I had edited my previous post with more info, sorry! Let me try and explain better:

@khalwat
Collaborator

khalwat commented Jul 23, 2020

The Site URL Override is not something you'd normally use. It is only intended for "headless" setups where Craft isn't serving the content.

Show me how the alias @defaultSiteUrl is defined?

@mizziness
Author

mizziness commented Jul 23, 2020

@khalwat

The Site URL Override is not something you'd normally use. It is only intended for "headless" setups where Craft isn't serving the content.

Okay, good, I was hoping that was an edge-case setting.

Show me how the alias @defaultSiteUrl is defined?

Long story short, it's in the .env file like so: DEFAULT_SITE_URL="https://www.applause.com" And then set to the alias via the general config file:

'aliases' => [
    ...
    '@defaultSiteUrl' => getenv('DEFAULT_SITE_URL'),
  ], ...

P.S. Thank you for putting up with me :)

@khalwat
Collaborator

khalwat commented Jul 23, 2020

hmmmm. Well, that certainly should always be defined then. I still think this will end up being some kind of config setting, but happy to set up a video conference day/time with you to have a look if you like
