python flickr_scraper.py --search 'honeybees on flowers' --n 10 --download #34

qiyangchennrel · 2024-07-11T07:20:52Z

When I tried to download the images, I got the errors below:

```
nargs ['honeybees on flowers']
0/10 error...
1/10 error...
2/10 error...
3/10 error...
4/10 error...
5/10 error...
6/10 error...
7/10 error...
8/10 error...
9/10 error...
10/10 error...
Done. (4.4s)
```

pderrenger · 2024-07-11T11:30:40Z

@qiyangchennrel hello!

Thank you for reaching out and providing details about the issue you're encountering. To help us diagnose and resolve the problem effectively, could you please provide a minimum reproducible example of your code? This will allow us to better understand the context and pinpoint the issue. You can find guidance on creating a reproducible example here: Minimum Reproducible Example.

Additionally, please ensure that you are using the latest versions of all relevant packages, as updates often include important bug fixes and improvements.

Looking forward to your response so we can assist you further! 😊

nzhang95120 · 2024-08-11T23:58:50Z

After following all the steps, and even running them in a Google Colab terminal, I am also getting the error...

glenn-jocher · 2024-08-13T15:08:21Z

Hello @nzhang95120,

Thank you for providing the screenshot and additional details about the issue you're encountering. It looks like you're running into some trouble with the flickr_scraper.py script.

Here are a few steps you can take to troubleshoot and potentially resolve the issue:

Verify Package Versions: Ensure that you are using the latest versions of all relevant packages. Sometimes, issues are resolved in newer releases. You can update your packages using:
```
pip install --upgrade <package_name>
```
Check Dependencies: Make sure all dependencies required by the script are installed. You can usually find these in the requirements.txt file or documentation of the repository.
Error Logs: The error messages you provided are quite generic. If possible, try to capture more detailed error logs. This can often be done by running the script with increased verbosity or debug flags.
Internet Connection: Ensure that your internet connection is stable, as the script needs to download images from Flickr.
API Keys: If the script requires API keys for accessing Flickr, ensure that they are correctly set up and have the necessary permissions.
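
The `N/10 error...` lines in the original report hide the underlying exception, so capturing it is the quickest way to narrow things down. Below is a minimal sketch of that idea, assuming the script wraps each download in a `try`/`except`. The names `download_all` and `download_one` are hypothetical and are not taken from flickr_scraper.py.

```python
import traceback


def download_all(urls, download_one):
    """Try each URL and report the real exception instead of a bare 'error...'."""
    failures = []
    for i, url in enumerate(urls):
        try:
            download_one(url)  # hypothetical per-image download callable
            print(f"{i}/{len(urls)} {url}")
        except Exception as exc:
            # Show the exception type and message, then the full traceback on stderr
            print(f"{i}/{len(urls)} error: {type(exc).__name__}: {exc}")
            traceback.print_exc()
            failures.append((url, exc))
    return failures
```

The returned list pairs each failing URL with its exception, which makes it easy to paste the relevant details into a follow-up comment.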

Example Code: Here is a minimal example to ensure everything is set up correctly:

```
import os
import urllib.request

import flickrapi

# Replace with your own Flickr API key and secret
api_key = 'YOUR_API_KEY'
api_secret = 'YOUR_API_SECRET'

flickr = flickrapi.FlickrAPI(api_key, api_secret, format='parsed-json')
query = 'honeybees on flowers'
num_images = 10

# Create the output directory first; urlretrieve fails if it does not exist
os.makedirs('downloads', exist_ok=True)

photos = flickr.photos.search(text=query, per_page=num_images, media='photos', sort='relevance')
for i, photo in enumerate(photos['photos']['photo']):
    # Build the static image URL from the fields returned by the search
    url = f"https://farm{photo['farm']}.staticflickr.com/{photo['server']}/{photo['id']}_{photo['secret']}.jpg"
    urllib.request.urlretrieve(url, os.path.join('downloads', f"{i}.jpg"))
    print(f"Downloaded {i+1}/{num_images}")

print("Done.")
```

If you have verified all the above and the issue persists, please let us know with any additional error logs or details. This will help us assist you more effectively.

Thank you for your patience and cooperation! 😊

stawiski · 2025-01-30T06:54:59Z

Same here:

```
Traceback (most recent call last):
  File "/flickr_scraper/flickr_scraper.py", line 67, in <module>
    get_urls(search=search, n=opt.n, download=opt.download)
  File "/flickr_scraper/flickr_scraper.py", line 35, in get_urls
    for i, photo in enumerate(photos):
  File "/lib/python3.9/site-packages/flickrapi/core.py", line 690, in data_walker
    photoset = rsp.getchildren()[0]
AttributeError: 'xml.etree.ElementTree.Element' object has no attribute 'getchildren'
```

pderrenger · 2025-01-31T12:53:13Z

The error occurs because getchildren() was removed from xml.etree.ElementTree in Python 3.9 (it had been deprecated since Python 3.2). This is a known compatibility issue in the flickrapi dependency. Let's resolve it:

First update your packages:

```
pip install --upgrade flickrapi ultralytics
```

If errors persist, add this workaround before your FlickrAPI initialization:

```
import xml.etree.ElementTree as ET
ET.Element.getchildren = lambda self: list(self)  # Compatibility patch
```

This should resolve the XML parsing issue. Let us know if you still encounter any errors after applying these fixes.
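
For reference, iterating an `Element` directly yields its children, which is what `getchildren()` used to return. The sketch below illustrates this with a made-up XML snippet; the `<rsp>` payload is an assumption for illustration, not an actual Flickr response.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body, only shaped loosely like a Flickr reply
rsp = ET.fromstring("<rsp><photos><photo id='1'/><photo id='2'/></photos></rsp>")

# list(rsp) gives the child elements, so list(rsp)[0] plays the role
# that rsp.getchildren()[0] played before Python 3.9
photoset = list(rsp)[0]
photo_ids = [photo.get("id") for photo in photoset]
print(photoset.tag, photo_ids)
```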

amerk12 · 2025-02-04T21:27:07Z

I observe the following error with the above compatibility patch (Python 3.10.14, ultralytics 8.3.71, flickrapi 2.4.0):

```
import xml.etree.ElementTree as ET
ET.Element.getchildren = lambda self: list(self)

TypeError: cannot set 'getchildren' attribute of immutable type 'xml.etree.ElementTree.Element'
```
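
The assignment is rejected because `Element` is reported as an immutable type, so it cannot be monkeypatched this way. One possible way around the `data_walker` code path from the traceback, offered as an untested assumption rather than a confirmed fix, is to request search results page by page in parsed-json form. In the sketch below, `iter_photos` and `search_page` are hypothetical names, and the response shape (`photos.photo`, `photos.pages`) is assumed.

```python
def iter_photos(search_page, n, per_page=100):
    """Yield up to n photo dicts by requesting result pages one at a time.

    search_page(page, per_page) is a hypothetical callable returning a dict
    shaped like a parsed-json photos.search response.
    """
    page, count = 1, 0
    while count < n:
        rsp = search_page(page, per_page)
        photos = rsp["photos"]["photo"]
        if not photos:
            return  # no more results
        for photo in photos:
            yield photo
            count += 1
            if count >= n:
                return
        if page >= int(rsp["photos"]["pages"]):
            return  # last page reached
        page += 1


# Assumed wiring with flickrapi (not run here):
# flickr = flickrapi.FlickrAPI(key, secret, format="parsed-json")
# search_page = lambda page, per_page: flickr.photos.search(
#     text="honeybees on flowers", page=page, per_page=per_page)
```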

amerk12 · 2025-02-04T21:43:41Z

To the extent it still helps @qiyangchennrel, @nzhang95120

I also observed the same error and traced it to #L16 in utils/general.py. I was able to clear the error by changing

f = dir + os.path.basename(uri) # filename
to
f = os.path.join(dir, os.path.basename(uri))
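
To illustrate why the change matters, here is a small sketch assuming `dir` is passed without a trailing path separator. The values of `dir` and `uri` are made up for the example and are not taken from the repository.

```python
import os

dir = "images"  # hypothetical output directory, no trailing separator
uri = "https://live.staticflickr.com/65535/123_abc.jpg"  # hypothetical image URL

# Plain concatenation glues the directory name onto the filename
broken = dir + os.path.basename(uri)

# os.path.join inserts the separator that was missing
fixed = os.path.join(dir, os.path.basename(uri))

print(broken)
print(fixed)
```

With concatenation the file lands next to the directory under a merged name instead of inside it, which would explain a failure on every download.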


UltralyticsAssistant · 2025-02-04T22:51:31Z

A potential fix for this issue has been merged in PR #42! 🎉

Key Changes in the PR:

Switched to pathlib for File Path Handling: Replaced the use of the os module with pathlib to improve readability, maintainability, and cross-platform compatibility.
Enhanced Filename Sanitization: Systematically removes or renames problematic file name characters to ensure cleaner, predictable file naming.
Improved Handling of Missing File Extensions: Utilizes pathlib features for more robust and simplified suffix management.
Code Refactoring: Streamlined the logic to improve clarity and future-proof the code for easier maintenance.

These changes address potential issues with file path handling, filename conflicts, and stability, which align with resolving this issue.
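
As a rough illustration of the kind of pathlib-based handling described above, here is a minimal sketch. It is not the code from the PR: the helper name `safe_path`, the character whitelist, and the `.jpg` default suffix are all assumptions.

```python
import re
from pathlib import Path


def safe_path(directory, url, default_suffix=".jpg"):
    """Build a local path for url inside directory (illustrative only)."""
    name = Path(url.split("?")[0]).name    # drop the query string, keep the last component
    name = re.sub(r"[^\w.\-]", "_", name)  # replace characters that are risky in filenames
    path = Path(directory) / name          # join with the proper separator
    if not path.suffix:                    # add an extension when the URL has none
        path = path.with_suffix(default_suffix)
    return path
```

For example, `safe_path("images", "https://example.com/a/b c.jpg?x=1")` yields a path ending in `b_c.jpg` inside `images`.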

If possible, please try these steps and let us know if the fix resolves the issue for you! Feedback is invaluable to ensure all edge cases are addressed.

Thanks so much for raising this issue and helping improve the project! 🙏 If the problem persists, please feel free to share additional details, and we'll be happy to assist further. 🚀

glenn-jocher · 2025-02-04T22:52:56Z

@amerk12 can you try the latest fix in #42 and see if this resolved your issue? Thank you!

amerk12 · 2025-02-05T15:47:22Z

@glenn-jocher yes this fix cleared the string/pathing issue that I observed. Thanks!

glenn-jocher added a commit that referenced this issue Feb 4, 2025

Fix path joining bug

4a47a41

May resolve #34 Signed-off-by: Glenn Jocher <[email protected]>

glenn-jocher mentioned this issue Feb 4, 2025

Fix path joining bug #42

Merged

glenn-jocher closed this as completed in #42 Feb 4, 2025

UltralyticsAssistant added the fixed Bug has been resolved label Feb 4, 2025

glenn-jocher reopened this Feb 4, 2025
