leankoala/urlextractor

There is no license information available for the latest version (dev-master) of this package.

Extract URLs from all kinds of file and stream types

dev-master 2021-08-24 17:09 UTC


Last update: 2024-12-21 23:26:08 UTC


README

The UrlExtractor library extracts URLs from a given document. Each supported document type is handled by its own adapter.

Supported types

  • PDF documents (PdfAdapter)
  • HTML documents (HtmlAnchorAdapter)
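
To illustrate what anchor-based extraction from an HTML document can look like, here is a minimal sketch that uses only PHP's built-in DOM extension. It does not use the UrlExtractor API, which this README does not document. The function name `extractAnchorUrls` and the sample HTML are assumptions made for illustration, and the actual HtmlAnchorAdapter may work differently.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper, NOT part of leankoala/urlextractor.
 * Returns the href value of every <a> element in the given HTML string.
 *
 * @return string[]
 */
function extractAnchorUrls(string $html): array
{
    $document = new DOMDocument();

    // Real-world HTML is rarely valid: collect parser warnings internally
    // instead of emitting them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $document->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $urls = [];
    foreach ($document->getElementsByTagName('a') as $anchor) {
        $href = trim($anchor->getAttribute('href'));

        // Skip anchors without a target and pure in-page fragments.
        if ($href !== '' && $href[0] !== '#') {
            $urls[] = $href;
        }
    }

    return $urls;
}

// Sample input (an assumption for this sketch, not taken from the package).
$html = '<p><a href="https://www.leankoala.com/de/features">Features</a> '
      . '<a href="#top">Top</a> <a>no link</a> '
      . '<a href="https://blog.leankoala.com">Blog</a></p>';

$urls = extractAnchorUrls($html);

printf("Found %d URLs in the given document.\n\n", count($urls));
foreach ($urls as $url) {
    echo " - {$url}\n";
}
```

The sketch keeps duplicate URLs in document order, which matches the sample output in the Examples section, where some URLs appear more than once. Relative links are returned unchanged. Resolving them against a base URL is left out here.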

Examples

# php examples/pdf.php

Found 20 URLs in the given document.

 - https://www.leankoala.com/de/features
 - https://www.leankoala.com/de/pricing
 - https://www.leankoala.com/de/akademie
 - https://blog.leankoala.com
 - https://monitor.leankoala.com/secure_area/register/de/
 - https://calendly.com/leankoala_com
 - https://www.leankoala.com/de/why
 - https://www.leankoala.com/de/why/e-commerce
 - https://www.leankoala.com/de/features/wizard
 - https://www.leankoala.com/de/akademie/videos/die-ersten-zwei-minuten
 - https://www.leankoala.com/de/why
 - https://blog.leankoala.com/leankoala-in-action/zahlpixel-uberprufen.html
 - https://blog.leankoala.com/leankoala-in-action/suchfunktion-uberwachen.html
 - https://blog.leankoala.com/leankoala-in-action/seo-eigenschaften-uberprufen.html
 - https://blog.leankoala.com/category/leankoala-in-action
 - https://www.leankoala.com/de/features/wizard
 - https://www.leankoala.com/
 - https://www.leankoala.com/de/about-us/impressum
 - https://www.leankoala.com/de/about-us/datenschutzbestimmungen
 - https://www.leankoala.com/de/about-us/terms-of-services/current