peterpostmann / parse_uri
Parse a URI and return its components
Installs: 178 890
Dependents: 4
Suggesters: 0
Security: 0
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Requires (Dev)
- phpunit/phpunit: ^5.7
- squizlabs/php_codesniffer: ~2.3
This package is not auto-updated.
Last update: 2024-10-27 05:09:48 UTC
README
Parse a URI and return its components
This function parses a URI (RFC3986 URL, URN or Windows path) and returns an associative array containing any of the various components of the URL that are present. The values of the array elements are not URL decoded.
This function is not meant to validate the given URL, it only breaks it up into the above listed parts.
Install
Via Composer
composer require peterpostmann/parse_uri
If you dont want to use composer just copy the parse_uri.php
file and include it into your project.
Why
This function extends the capabailities of parse_url. It parses rfc complient URLs and URNs and (windows) file paths (basically everything which can be passed to file functions (e.g. fopen, file_get_contents)).
Usage
use function peterpostmann\uri\parse_uri; array parse_uri ( string uri [, int $component = -2 [, bool $convertUrlToUrn = null ]])
Via option the output can be reduced to the output of parse_url
. Otherwise additonal components are provided:
URL
schema://user:pass@host:port/path?query#fragment array (size=14) 'scheme' => string 'schema' (length=6) 'host' => string 'host' (length=4) 'port' => string '' (length=0) 'user' => string 'user' (length=4) 'pass' => string 'pass' (length=4) 'path' => string 'port/path' (length=9) 'query' => string 'query' (length=5) 'fragment' => string 'fragment' (length=8) '_protocol' => string 'schema' (length=6) '_userinfo' => string 'user:pass@' (length=10) '_authority' => string 'user:pass@host:' (length=15) '_document' => string 'schema://user:pass@host:port/path' (length=33) '_ressource' => string 'schema://user:pass@host:port/path?query' (length=39) '_uri' => string 'schema://user:pass@host:port/path?query#fragment' (length=48)
URI
schema:path?query#fragment array (size=8) 'scheme' => string 'schema' (length=6) 'path' => string 'path' (length=4) 'query' => string 'query' (length=5) 'fragment' => string 'fragment' (length=8) '_protocol' => string 'schema' (length=6) '_document' => string 'schema:path' (length=11) '_ressource' => string 'schema:path?query' (length=17) '_uri' => string 'schema:path?query#fragment' (length=26)
_protocol
will return
- {schema} if present
- 'file' if it is an absolute path ('/path')
- false if it is a relative path ('path/file', 'file')
- true if it is a Windows path ('C:\path')
- null if there is no path
Example
parse URIs
use function peterpostmann\uri\parse_uri; echo "# URIs (with standard components)\n\n"; var_dump(parse_uri('/path/to/file.ext', peterpostmann\uri\PARSE_URI_DEFAULT)); var_dump(parse_uri('relative/path/to/file.ext', peterpostmann\uri\PARSE_URI_DEFAULT)); var_dump(parse_uri('fileInCwd.ext', peterpostmann\uri\PARSE_URI_DEFAULT)); var_dump(parse_uri('C:/path/to/winfile.ext', peterpostmann\uri\PARSE_URI_DEFAULT)); var_dump(parse_uri('C:\path\to\winfile.ext', peterpostmann\uri\PARSE_URI_DEFAULT)); var_dump(parse_uri('\\\\smbserver\share\path\to\winfile.ext', peterpostmann\uri\PARSE_URI_DEFAULT)); var_dump(parse_uri('file:///path/to/file.ext', peterpostmann\uri\PARSE_URI_DEFAULT)); var_dump(parse_uri('http://user:pass@example.org:8888/path/to/file', peterpostmann\uri\PARSE_URI_DEFAULT)); var_dump(parse_uri('news:comp.infosystems.www.servers.unix', peterpostmann\uri\PARSE_URI_DEFAULT)); echo "# URIs (with additional components)\n\n"; var_dump(parse_uri('C:\path\to\winfile.ext')); var_dump(parse_uri('\\\\smbserver\share\path\to\winfile.ext')); var_dump(parse_uri('file:///path/to/file.ext')); var_dump(parse_uri('http://user:pass@example.org:8888/path/to/file')); var_dump(parse_uri('news:comp.infosystems.www.servers.unix'));
The above example will output:
# URIs (with standard components) array (size=1) 'path' => string '/path/to/file.ext' (length=17) array (size=1) 'path' => string 'relative/path/to/file.ext' (length=25) array (size=1) 'path' => string 'fileInCwd.ext' (length=13) array (size=1) 'path' => string 'C:\path\to\winfile.ext' (length=22) array (size=1) 'path' => string 'C:\path\to\winfile.ext' (length=22) array (size=3) 'scheme' => string 'file' (length=4) 'host' => string 'smbserver' (length=9) 'path' => string '/share/path/to/winfile.ext' (length=26) array (size=3) 'scheme' => string 'file' (length=4) 'host' => string '' (length=0) 'path' => string '/path/to/file.ext' (length=17) array (size=6) 'scheme' => string 'http' (length=4) 'host' => string 'example.org' (length=11) 'port' => int 8888 'user' => string 'user' (length=4) 'pass' => string 'pass' (length=4) 'path' => string '/path/to/file' (length=13) array (size=2) 'scheme' => string 'news' (length=4) 'path' => string 'comp.infosystems.www.servers.unix' (length=33) # URIs (with additional components) array (size=5) 'path' => string 'C:\path\to\winfile.ext' (length=22) '_protocol' => string 'file' (length=4) '_document' => string 'C:\path\to\winfile.ext' (length=22) '_ressource' => string 'C:\path\to\winfile.ext' (length=22) '_uri' => string 'C:\path\to\winfile.ext' (length=22) array (size=8) 'scheme' => string 'file' (length=4) 'host' => string 'smbserver' (length=9) 'path' => string '/share/path/to/winfile.ext' (length=26) '_protocol' => string 'file' (length=4) '_authority' => string 'smbserver' (length=9) '_document' => string '/share/path/to/winfile.ext' (length=26) '_ressource' => string '/share/path/to/winfile.ext' (length=26) '_uri' => string '/share/path/to/winfile.ext' (length=26) array (size=7) 'scheme' => string 'file' (length=4) 'host' => string '' (length=0) 'path' => string '/path/to/file.ext' (length=17) '_protocol' => string 'file' (length=4) '_document' => string 'file:///path/to/file.ext' (length=24) '_ressource' => string 'file:///path/to/file.ext' (length=24) '_uri' => string 'file:///path/to/file.ext' (length=24) array (size=12) 'scheme' => string 'http' (length=4) 'host' => string 'example.org' (length=11) 'port' => int 8888 'user' => string 'user' (length=4) 'pass' => string 'pass' (length=4) 'path' => string '/path/to/file' (length=13) '_protocol' => string 'http' (length=4) '_userinfo' => string 'user:pass@' (length=10) '_authority' => string 'user:pass@example.org:8888' (length=26) '_document' => string 'http://user:pass@example.org:8888/path/to/file' (length=46) '_ressource' => string 'http://user:pass@example.org:8888/path/to/file' (length=46) '_uri' => string 'http://user:pass@example.org:8888/path/to/file' (length=46) array (size=6) 'scheme' => string 'news' (length=4) 'path' => string 'comp.infosystems.www.servers.unix' (length=33) '_protocol' => string 'news' (length=4) '_document' => string 'news:comp.infosystems.www.servers.unix' (length=38) '_ressource' => string 'news:comp.infosystems.www.servers.unix' (length=38) '_uri' => string 'news:comp.infosystems.www.servers.unix' (length=38)
patch URIs
use function peterpostmann\uri\parse_uri; use function peterpostmann\uri\build_uri; $uri = 'https://example.org/path/to/file?query#fragment': $patch = [ 'path' => '/path/to/otherfile'] echo build_uri($patch + parse_uri($uri));
The above example will output:
https://example.org/path/to/otherfile?query#fragment
Load References
function resolveReference($reference) { $components = parse_uri($reference); if(!isset($components['fragment'])) return null; // absolute reference if(isset($components['_document'])) { $document = $this->getLoader($components['_protocol'])->load($components['_document']); } else { // relative reference $document = $this->currentDocument(); } return $this->getReference($document, $components['fragment']); }
$components['_document']
returns the URI without the fragment. ($components['_ressource']
without fragment and query).
$components['_protocol']
indicates the protocol. It usualy is the same as scheme, except for file:/// URIs. For absolute, relative and windows path, scheme is not set, but $components['_protocol']
is set to file. For URIs without schema \\example.org\path
the variable is set, but empty.
Helper Functions
build_uri
The function creates a uri (string) from its components.
use function peterpostmann\uri\build_uri; echo build_uri([ 'scheme' => 'ssh2.sftp', 'host' => 'example.com', 'port' => 422, 'user' => 'user', 'pass' => 'pass', 'path' => '/file.php', 'query' => 'var1=val1&var2=val2', 'fragment' => 'anchor' ])."\n"; echo build_uri([ 'scheme' => 'news', 'path' => 'comp.infosystems.www.servers.unix', ])."\n";
The above example will output:
ssh2.sftp://user:pass@example.com:422/file.php?var1=val1&var2=val2#anchor news:comp.infosystems.www.servers.unix
convert_url2urn
The function converts php URLs which are not real URLs to URNs (what they should have been in first place).
use function peterpostmann\uri\build_uri; echo convert_url2urn('data://text/plain;base64,SSBsb3ZlIFBIUAo=')."\n"; echo convert_url2urn('zlib://archive.zip#dir/file.txt')."\n"; echo convert_url2urn('php://stdin')."\n";
The above example will output:
data:text/plain;base64,SSBsb3ZlIFBIUAo=' zlib:archive.zip#dir/file.txt php:stdin
If the URIs are parsed as-is, the parser will look for a host part which is not present. convert_url2urn
accepts a second parameter to control the conversion:
- null: Dont convert, except Formats from http://php.net/manual/en/wrappers.php (php://, compress.*://, zip://, zlib://, bzip2://, data://, glob://, phar://, rar://, ogg://, expect://)
- true: convert all
- false: convert nothing
License
The MIT License (MIT). Please see License File for more information.