Imagedl APIs

This document describes the main public API for searching and downloading images.

Main components:

  • ImageClient: the unified high-level interface

  • BaseImageClient: the base class for lower-level usage and extension

  • ImageClientCMD: CLI usage

ImageClient

ImageClient is the main entry point for searching and downloading images from one or more sources.

Module path:

imagedl.imagedl.ImageClient

Constructor:

ImageClient(
    image_sources: list | str = "BaiduImageClient",
    init_image_clients_cfg: dict = None,
    clients_threadings: dict = {},
    requests_overrides: dict = {},
    search_filters: dict = {},
)

Arguments:

  • image_sources

    Image sources to enable.

    • Type: list | str

    • Default: "BaiduImageClient"

    Examples:

    # str
    image_sources="BaiduImageClient"
    # list
    image_sources=["BaiduImageClient", "BingImageClient"]
    

    Notes:

    • A string is converted to a one-item list.

    • Invalid source names are ignored.

    • Initialization fails if no valid source remains.

  • init_image_clients_cfg

    Per-source initialization config.

    • Type: dict

    Example:

    init_image_clients_cfg={
      "BaiduImageClient": {"work_dir": "outputs/baidu"},
      "BingImageClient": {"work_dir": "outputs/bing", "max_retries": 8},
    }
    

    Common fields include:

    • work_dir

    • max_retries

    • auto_set_proxies

    • random_update_ua

    • maintain_session

    • disable_print

    • default_search_cookies

    • default_download_cookies

    Note:

    • Key the dict by source name; each value is the initialization config for that source, as in the example above.

  • clients_threadings

    Per-source thread count used by both search and download.

    • Type: dict

    • Default for missing sources: 5

    Example:

    clients_threadings={
      "BaiduImageClient": 5,
      "BingImageClient": 8,
    }
    
  • requests_overrides

    Per-source request arguments forwarded to the underlying HTTP requests.

    • Type: dict

    • Default for missing sources: {}

    Typical fields:

    • timeout

    • headers

    • proxies

    • cookies

    Example:

    requests_overrides={
      "BaiduImageClient": {"timeout": 10},
      "BingImageClient": {"timeout": 15},
    }
    
  • search_filters

    Per-source search filters.

    • Type: dict

    • Default for missing sources: {}

    Example:

    search_filters={
      "BaiduImageClient": {"type": "face", "size": "large"},
      "BingImageClient": {"license": "commercial", "date": "pastmonth"},
    }
    

    Notes:

    • Supported filter keys depend on the selected source.

    • Unsupported filters may be ignored or rejected by the source implementation.
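The argument handling described above can be pictured with a small sketch. This is not the library's code: normalize_sources, resolve_per_source, and the KNOWN_SOURCES set are hypothetical names used only to illustrate the documented behavior (a string becomes a one-item list, invalid names are ignored, an empty result is an error, and sources missing from a per-source dict fall back to a default such as 5 threads or an empty dict).

```python
import copy

# Illustrative only: a stand-in for the documented argument handling,
# not the actual ImageClient implementation.

# Hypothetical subset of valid source names, chosen for this example.
KNOWN_SOURCES = {"BaiduImageClient", "BingImageClient", "GoogleImageClient"}


def normalize_sources(image_sources):
    """Return the list of valid source names, or raise if none remain."""
    if isinstance(image_sources, str):
        image_sources = [image_sources]  # a string becomes a one-item list
    valid = [name for name in image_sources if name in KNOWN_SOURCES]  # drop invalid names
    if not valid:
        # ValueError is an assumption; the text only says initialization fails.
        raise ValueError("no valid image source remains")
    return valid


def resolve_per_source(sources, per_source, default):
    """Give every enabled source a value, using a copy of `default` when it is missing."""
    per_source = per_source or {}
    return {name: per_source.get(name, copy.deepcopy(default)) for name in sources}


sources = normalize_sources(["BaiduImageClient", "NotARealClient", "BingImageClient"])
threads = resolve_per_source(sources, {"BingImageClient": 8}, default=5)
overrides = resolve_per_source(sources, {"BaiduImageClient": {"timeout": 10}}, default={})

print(sources)
print(threads)
print(overrides)
```

Copying the default avoids two sources sharing one mutable dict, which matters if a per-source value is modified later.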

ImageClient.download()

Downloads images returned by ImageClient.search().

ImageClient.download(image_infos: list[ImageInfo] | dict[str, list[ImageInfo]]) -> list[ImageInfo]

Arguments:

  • image_infos

    Can be either:

    • a dictionary returned by ImageClient.search(), or

    • a flat list of ImageInfo

    Examples:

    # dict
    downloaded = client.download(results)
    # flat list
    flat_list = results["BaiduImageClient"]
    downloaded = client.download(flat_list)
    

Returns a flat list of successfully downloaded ImageInfo objects:

[ImageInfo, ImageInfo, ...]

Behavior:

  • If a dict is passed in, it is flattened automatically.

  • Images are regrouped internally by source before downloading.

  • Each source uses its configured thread count and request overrides.

Example:

results = client.search("puppy", search_limits_per_source=20)
downloaded = client.download(results)

print(results.keys())
print(len(downloaded))
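The flatten-then-regroup behavior listed above can be sketched in plain Python. This is a minimal illustration, not the library's implementation: FakeImageInfo is a stand-in for ImageInfo, and its source and url fields are assumptions made for the example, since the real fields are not described here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class FakeImageInfo:
    """Stand-in for ImageInfo; the `source` and `url` fields are assumptions."""
    source: str
    url: str


def flatten(image_infos):
    """Accept a dict keyed by source or a flat list; always return a flat list."""
    if isinstance(image_infos, dict):
        return [info for infos in image_infos.values() for info in infos]
    return list(image_infos)


def regroup_by_source(flat):
    """Group a flat list into {source: [infos]} so each source can use its own settings."""
    grouped = defaultdict(list)
    for info in flat:
        grouped[info.source].append(info)
    return dict(grouped)


results = {
    "BaiduImageClient": [FakeImageInfo("BaiduImageClient", "https://example.com/a.jpg")],
    "BingImageClient": [
        FakeImageInfo("BingImageClient", "https://example.com/b.jpg"),
        FakeImageInfo("BingImageClient", "https://example.com/c.jpg"),
    ],
}

flat = flatten(results)
grouped = regroup_by_source(flat)
print(len(flat), {source: len(infos) for source, infos in grouped.items()})
```

Regrouping after flattening means a caller can pass items from several sources in one flat list and each source still gets its own thread count and request overrides.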

ImageClient.startcmdui()

Starts the interactive command-line interface.

Behavior:

  • prints basic program information

  • asks for a search keyword

  • runs ImageClient.search()

  • immediately runs ImageClient.download()

Special inputs:

  • q or Q: quit

  • r or R: restart

Example:

client.startcmdui(search_limits_per_source=50)

Output Files

After searching and downloading, the library saves the result metadata as pickle files.

Typical files:

  • search_results.pkl

  • download_results.pkl

Images are saved under a timestamped directory similar to:

<work_dir>/<source>/<timestamp> <keyword>/

Example:

imagedl_outputs/BaiduImageClient/2026-03-29-10-30-15 cute cat/
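A short sketch of working with this layout follows. The helper names build_output_dir and load_pickles are hypothetical, and the exact location of the pickle files inside the output tree is an assumption, so the loader simply searches the whole work directory. The timestamp format is inferred from the example path above. The demonstration writes a throwaway pickle rather than reading real imagedl output.

```python
import pickle
import tempfile
from datetime import datetime
from pathlib import Path


def build_output_dir(work_dir, source, keyword, when):
    """Compose <work_dir>/<source>/<timestamp> <keyword> following the documented layout."""
    timestamp = when.strftime("%Y-%m-%d-%H-%M-%S")  # format inferred from the example path
    return Path(work_dir) / source / f"{timestamp} {keyword}"


def load_pickles(root):
    """Load every .pkl file found under root, keyed by its path relative to root."""
    root = Path(root)
    loaded = {}
    for path in sorted(root.rglob("*.pkl")):
        with path.open("rb") as handle:
            # Only unpickle files you trust: pickle can execute code on load.
            loaded[path.relative_to(root).as_posix()] = pickle.load(handle)
    return loaded


when = datetime(2026, 3, 29, 10, 30, 15)
out_dir = build_output_dir("imagedl_outputs", "BaiduImageClient", "cute cat", when)
print(out_dir.as_posix())

# Demonstration with a temporary directory and placeholder data.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = build_output_dir(tmp, "BaiduImageClient", "cute cat", when)
    demo_dir.mkdir(parents=True)
    with (demo_dir / "search_results.pkl").open("wb") as handle:
        pickle.dump([{"placeholder": 1}], handle)
    demo_loaded = load_pickles(tmp)

print(demo_loaded)
```

If the real files contain ImageInfo objects, imagedl will likely need to be importable when they are unpickled, because pickle restores objects by importing their classes.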

BaseImageClient

BaseImageClient is the base class for lower-level source clients. It is useful when you want to extend the project or control a specific backend directly. The available source clients include:

  • imagedl.imagedl.modules.sources.AICImageClient

  • imagedl.imagedl.modules.sources.BaiduImageClient

  • imagedl.imagedl.modules.sources.BingImageClient

  • imagedl.imagedl.modules.sources.ClevelandArtImageClient

  • imagedl.imagedl.modules.sources.DuckduckgoImageClient

  • imagedl.imagedl.modules.sources.DanbooruImageClient

  • imagedl.imagedl.modules.sources.DimTownImageClient

  • imagedl.imagedl.modules.sources.EverypixelImageClient

  • imagedl.imagedl.modules.sources.FoodiesfeedImageClient

  • imagedl.imagedl.modules.sources.FreeNatureStockImageClient

  • imagedl.imagedl.modules.sources.FreeImagesImageClient

  • imagedl.imagedl.modules.sources.FlickrImageClient

  • imagedl.imagedl.modules.sources.GoogleImageClient

  • imagedl.imagedl.modules.sources.GelbooruImageClient

  • imagedl.imagedl.modules.sources.GratisoGraphyImageClient

  • imagedl.imagedl.modules.sources.GBIFImageClient

  • imagedl.imagedl.modules.sources.HuabanImageClient

  • imagedl.imagedl.modules.sources.I360ImageClient

  • imagedl.imagedl.modules.sources.INaturalistImageClient

  • imagedl.imagedl.modules.sources.JikanImageClient

  • imagedl.imagedl.modules.sources.LifeOfPixImageClient

  • imagedl.imagedl.modules.sources.LocGovImageClient

  • imagedl.imagedl.modules.sources.MetropolitanImageClient

  • imagedl.imagedl.modules.sources.NASAImageClient

  • imagedl.imagedl.modules.sources.OpenverseImageClient

  • imagedl.imagedl.modules.sources.PixabayImageClient

  • imagedl.imagedl.modules.sources.PexelsImageClient

  • imagedl.imagedl.modules.sources.PicJumboImageClient

  • imagedl.imagedl.modules.sources.SogouImageClient

  • imagedl.imagedl.modules.sources.SafebooruImageClient

  • imagedl.imagedl.modules.sources.StockSnapImageClient

  • imagedl.imagedl.modules.sources.UnsplashImageClient

  • imagedl.imagedl.modules.sources.WeiboImageClient

  • imagedl.imagedl.modules.sources.WikipediaImageClient

  • imagedl.imagedl.modules.sources.WellcomeImageClient

  • imagedl.imagedl.modules.sources.YandexImageClient

  • imagedl.imagedl.modules.sources.YahooImageClient

  • imagedl.imagedl.modules.sources.YandeImageClient

Module path:

imagedl.imagedl.modules.sources.BaseImageClient

Constructor:

BaseImageClient(
    auto_set_proxies: bool = False,
    random_update_ua: bool = False,
    enable_search_curl_cffi: bool = False,
    enable_download_curl_cffi: bool = False,
    max_retries: int = 5,
    logger_handle = None,
    maintain_session: bool = False,
    disable_print: bool = False,
    work_dir: str = "imagedl_outputs",
    freeproxy_settings: dict = None,
    default_search_cookies: dict = None,
    default_download_cookies: dict = None,
)

Arguments:

  • work_dir

    Root output directory.

  • max_retries

    Maximum retry count for HTTP requests.

  • auto_set_proxies

    Whether to automatically fetch and use proxies.

  • random_update_ua

    Whether to randomize the User-Agent between requests.

  • maintain_session

    Whether to keep a persistent session.

  • default_search_cookies

    Default cookies used for searching.

  • default_download_cookies

    Default cookies used for downloading.
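To make the role of max_retries concrete, here is a generic retry loop. It is a sketch under stated assumptions, not the library's request logic: call_with_retries and its delay parameter are hypothetical, and max_retries is treated as the total number of attempts, which the text above does not specify.

```python
import time


def call_with_retries(func, max_retries=5, delay=0.0):
    """Call func() until it succeeds or max_retries attempts have failed."""
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return func()
        except Exception as error:  # a real client would catch narrower error types
            last_error = error
            if attempt < max_retries:
                time.sleep(delay)  # optional pause before the next attempt
    raise RuntimeError(f"gave up after {max_retries} attempts") from last_error


attempts = []


def flaky():
    """Fail twice, then succeed, to simulate a temporary network problem."""
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


result = call_with_retries(flaky, max_retries=5)
print(result, len(attempts))
```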

BaseImageClient.download()

Downloads images for a single source client.

BaseImageClient.download(
    image_infos: list[ImageInfo],
    num_threadings: int = 5,
    request_overrides: dict = None,
) -> list[ImageInfo]

Behavior:

  • Returns only successfully downloaded items

  • Detects file extension automatically

  • Saves download results to download_results.pkl

Example:

downloaded = client.download(
    image_infos=items,
    num_threadings=8,
    request_overrides={"timeout": 10},
)
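One common way to detect a file extension automatically is to inspect the leading bytes of the downloaded payload. The sketch below illustrates that technique; guess_extension is a hypothetical helper, and whether the library detects extensions this way is an assumption.

```python
def guess_extension(data, default="jpg"):
    """Guess an image file extension from the first bytes of the payload."""
    if data.startswith(b"\x89PNG\r\n\x1a\n"):
        return "png"
    if data.startswith(b"\xff\xd8\xff"):
        return "jpg"
    if data.startswith((b"GIF87a", b"GIF89a")):
        return "gif"
    # WebP files are RIFF containers with "WEBP" at byte offset 8.
    if data[:4] == b"RIFF" and data[8:12] == b"WEBP":
        return "webp"
    if data.startswith(b"BM"):
        return "bmp"
    return default  # unknown signature: fall back to the caller's default


samples = {
    "png": b"\x89PNG\r\n\x1a\n" + b"\x00" * 8,
    "jpg": b"\xff\xd8\xff\xe0" + b"\x00" * 8,
    "gif": b"GIF89a" + b"\x00" * 8,
    "webp": b"RIFF\x00\x00\x00\x00WEBPVP8 ",
}
detected = {name: guess_extension(payload) for name, payload in samples.items()}
print(detected)
```

Checking content rather than the URL suffix helps when a URL has no extension or a misleading one.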

ImageClientCMD (CLI Usage)

The project also provides a command-line entry point.

  • The basic command is imagedl. If no keyword is provided, it starts in interactive mode.

  • Search directly from CLI:

    imagedl -k "cute cat"
    
  • Multi-source Example:

    imagedl -k "mountain lake" \
      -s "BaiduImageClient,BingImageClient" \
      -c '{"BaiduImageClient": {"work_dir": "outputs"}, "BingImageClient": {"work_dir": "outputs"}}' \
      -t '{"BaiduImageClient": 5, "BingImageClient": 8}' \
      -o '{"BaiduImageClient": {"timeout": 10}, "BingImageClient": {"timeout": 10}}' \
      -f '{"BaiduImageClient": {"size": "large"}, "BingImageClient": {"date": "pastmonth"}}' \
      -l 20
    
  • CLI Options:

    • -k, --keyword: search keyword

    • -s, --image-sources: comma-separated source names

    • -c, --init-image-clients-cfg: JSON string for per-source initialization config

    • -o, --requests-overrides: JSON string for per-source request overrides

    • -t, --clients-threadings: JSON string for per-source thread counts

    • -f, --search-filters: JSON string for per-source filters

    • -l, --search-limits-per-source: max number of results per source

  • Usage Notes:

    • Pass init_image_clients_cfg as a dictionary keyed by source name (a JSON object string on the command line).

    • search() returns a dict grouped by source.

    • download() accepts either that dict or a flat list.

    • Thread counts are configured through clients_threadings.

    • Request overrides and filters are configured per source.

    • Supported filter keys depend on the source.
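Because several options take JSON strings, it can be less error-prone to build the command from Python dicts than to hand-quote JSON in a shell. The sketch below only assembles and prints the argument list using the flags listed above; build_imagedl_argv is a hypothetical helper, and nothing here runs imagedl.

```python
import json
import shlex


def build_imagedl_argv(keyword, sources, init_cfg=None, threadings=None,
                       overrides=None, filters=None, limit=None):
    """Assemble an imagedl command line, serializing per-source dicts as JSON."""
    argv = ["imagedl", "-k", keyword, "-s", ",".join(sources)]
    for flag, value in (("-c", init_cfg), ("-t", threadings),
                        ("-o", overrides), ("-f", filters)):
        if value:  # skip options that were not supplied
            argv += [flag, json.dumps(value)]
    if limit is not None:
        argv += ["-l", str(limit)]
    return argv


argv = build_imagedl_argv(
    "mountain lake",
    ["BaiduImageClient", "BingImageClient"],
    threadings={"BaiduImageClient": 5, "BingImageClient": 8},
    overrides={"BaiduImageClient": {"timeout": 10}},
    limit=20,
)

# shlex.join quotes each argument so the line can be pasted into a POSIX shell.
print(shlex.join(argv))
```

The same list could be passed to subprocess.run(argv) if imagedl is installed and on the PATH, which avoids shell quoting entirely.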