adachsoft / directory-scanner-tool
Safe, configurable directory scanning and file content search tools for PHP with adachsoft/ai-tool-call integration
pkg:composer/adachsoft/directory-scanner-tool
Requires
- php: >=8.3
- adachsoft/ai-tool-call: ^2.0.0
- adachsoft/filesystem: ^1.3.0
- adachsoft/normalized-safe-path: ^0.1.0
Requires (Dev)
- adachsoft/php-code-style: ^0.2.1
- friendsofphp/php-cs-fixer: ^3.90
- phpstan/phpstan: ^2.1
- phpunit/phpunit: ^12.4
README
Safe, configurable directory scanner and file content search tool for PHP projects, designed to
integrate with the adachsoft/ai-tool-call
library and AI agents (e.g. Google Gemini).
It exposes two tools:
- directory_scanner – scans a configured base directory, applies exclusions and depth/entry limits, and returns a flat list of file system entries with optional metadata.
- file_content_search – uses the same safe directory scanning, but additionally filters results to files whose contents match a given pattern (plain/regex/similarity search modes).
Version numbers are managed via Git tags / Packagist and follow
Semantic Versioning. See CHANGELOG.md for notable changes.
Features
- Safe scanning strictly confined to a configured base path (no directory traversal above base path).
- Support for excluded subpaths (e.g. vendor, var/cache, .git).
- Configurable maximum recursion depth (max_allowed_depth).
- Configurable maximum number of returned entries (max_entries) with truncation indication.
- Optional inclusion of additional metadata for each entry (keys are present only when enabled via include_*/default_include_* flags and a non-null value is available from the filesystem):
  - file size (bytes),
  - last modification time (ISO 8601 string),
  - future‑ready fields for creation time and permissions.
- Flat, predictable result structure (items + summary).
- Ready‑to‑use SPI tools + factories for adachsoft/ai-tool-call:
  - DirectoryScannerTool / DirectoryScannerToolFactory,
  - FileContentSearchTool / FileContentSearchToolFactory.
- Content‑based file filtering via file_content_search with multiple search modes:
  - plain (case‑insensitive substring),
  - plain_case_sensitive,
  - regex,
  - similarity (fuzzy match using similar_text).
Requirements
- PHP 8.3 or higher
- Composer
The library depends on the following AdachSoft packages at runtime:
- adachsoft/ai-tool-call
- adachsoft/filesystem
- adachsoft/normalized-safe-path
These are installed automatically when you require this package.
Installation
composer require adachsoft/directory-scanner-tool
Concepts and architecture
The core pieces of this library are:
- DirectoryScannerTool – SPI tool implementation (AdachSoft\AiToolCall\SPI\ToolInterface) that is discovered and executed by adachsoft/ai-tool-call.
- DirectoryScannerToolFactory – factory used by AiToolCallFacadeBuilder to create configured tool instances based on a ConfigMap.
- FileContentSearchTool – SPI tool that wraps directory scanning and then filters entries by inspecting file contents using pluggable search strategies.
- FileContentSearchToolFactory – factory that wires the same DirectoryScannerService and filesystem configuration, and composes FileContentSearchService.
- DirectoryScannerService / DirectoryScanRunner – services responsible for scanning the file system and collecting results.
- FileContentSearchService – uses DirectoryScannerService plus a set of search strategies (Strategy pattern) to keep the search logic extensible and testable.
- PathNormalizationHelper – uses adachsoft/normalized-safe-path to ensure all paths stay inside the configured base path.
You typically do not construct these objects manually. Instead, you plug the factories into
AiToolCallFacadeBuilder and configure the tools using ConfigMap.
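To illustrate what "staying inside the configured base path" means, here is a minimal lexical sketch in plain PHP. It is not the API of adachsoft/normalized-safe-path or PathNormalizationHelper; the function name resolveInsideBase is hypothetical, and the sketch assumes forward-slash paths and does not follow symlinks.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: lexically resolves $relative against $basePath.
 * Returns the resulting absolute path, or null when the path would climb
 * above the base directory. Symlinks are NOT resolved in this sketch.
 */
function resolveInsideBase(string $basePath, string $relative): ?string
{
    $base = rtrim($basePath, '/');
    $segments = [];

    foreach (explode('/', str_replace('\\', '/', $relative)) as $segment) {
        if ($segment === '' || $segment === '.') {
            continue; // empty and "." segments do not change the position
        }
        if ($segment === '..') {
            if ($segments === []) {
                return null; // one more ".." would leave the base path
            }
            array_pop($segments);
            continue;
        }
        $segments[] = $segment;
    }

    if ($segments === []) {
        return $base === '' ? '/' : $base;
    }

    return $base . '/' . implode('/', $segments);
}
```

With a base path of /var/www/my-project, the request path src/Module resolves to /var/www/my-project/src/Module, while src/../../etc is rejected because it would end up above the base path.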
Configuration
The tools are configured by the host application (not by the AI agent) via
DirectoryScannerToolFactory, FileContentSearchToolFactory and ConfigMap.
Factory configuration for directory_scanner (host application)
Example of wiring the directory scanner tool with AiToolCallFacadeBuilder:
use AdachSoft\AiToolCall\PublicApi\Builder\AiToolCallFacadeBuilder;
use AdachSoft\AiToolCall\SPI\Collection\ConfigMap;
use AdachSoft\DirectoryScannerTool\DirectoryScannerToolFactory;

$factory = new DirectoryScannerToolFactory();

$facade = AiToolCallFacadeBuilder::new()
    ->withSpiFactories([$factory])
    ->withToolConfigs([
        'directory_scanner' => new ConfigMap([
            'base_path' => '/var/www/my-project',
            'excluded_paths' => ['vendor', 'var/cache', '.git'],
            'max_allowed_depth' => 10,
            'max_entries' => 5000,
            // Optional defaults for include_* flags when the agent does not specify them
            'default_include_size' => false,
            'default_include_created_at' => false,
            'default_include_modified_at' => false,
            'default_include_permissions' => false,
        ]),
    ])
    ->build();
Supported config keys (both tools)
All config keys are passed as an array to ConfigMap for tool names directory_scanner and
file_content_search:
- base_path (string, required)
  - Absolute path that acts as the root of all scans.
  - All agent‑provided paths are resolved relative to this base path.
- excluded_paths (string[], optional)
  - List of relative paths (from base path) that should be excluded from scanning.
  - Both the directory itself and all its descendants are excluded.
- max_allowed_depth (int, optional, default: 10)
  - Maximum recursion depth allowed by the host application.
  - The effective depth used for a given request is the minimum of this value and the request‑level max_depth parameter (see below).
- max_entries (int, optional, default: 5000)
  - Maximum number of entries that will be returned from a single scan.
  - If the limit is reached, scanning stops and summary.truncated_by_max_entries is set to true.
- default_include_size (bool, optional, default: false)
- default_include_created_at (bool, optional, default: false)
- default_include_modified_at (bool, optional, default: false)
- default_include_permissions (bool, optional, default: false)
  - Default values used when the agent omits corresponding request parameters.
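The excluded_paths rule ("the directory itself and all its descendants") can be pictured as a prefix check on path segments. The sketch below is an assumption about the semantics, not the library's code: isExcluded is a hypothetical name, and it assumes entries are matched only from the base path, so a nested src/vendor is not affected by the entry vendor.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: true when $relativePath is an excluded path itself
 * or lies below one.
 *
 * @param string[] $excludedPaths paths relative to the base path
 */
function isExcluded(string $relativePath, array $excludedPaths): bool
{
    $path = trim($relativePath, '/');

    foreach ($excludedPaths as $excluded) {
        $excluded = trim($excluded, '/');
        if ($excluded === '') {
            continue; // ignore empty entries instead of excluding everything
        }
        // Match the directory itself, or anything below it. The trailing
        // slash keeps "vendors" from matching the entry "vendor".
        if ($path === $excluded || str_starts_with($path, $excluded . '/')) {
            return true;
        }
    }

    return false;
}

$excluded = ['vendor', 'var/cache', '.git'];
```

Under these assumptions vendor/autoload.php and var/cache/prod are skipped, while var/log and vendors/readme.md are still scanned.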
Internally DirectoryScannerConfig keeps PHP properties in camelCase (e.g. $basePath,
$excludedPaths), but everywhere arrays/JSON are used the keys follow snake_case as shown above.
The file_content_search tool uses the same configuration, but always returns only file entries
whose contents match the request pattern.
Tool invocation (AI agent request)
Once the tools are registered, AI agents (or your own code) call them through the
AdachSoft\AiToolCall\PublicApi\AiToolCallFacade.
Request parameters – directory_scanner
The directory_scanner tool exposes the following parameters schema (as seen in
DirectoryScannerTool::getDefinition()):
- path (string, required)
  - Relative path to scan from base path (e.g. ., src, src/Module).
  - . means "start from the base path itself".
- recursive (bool, default: false)
  - Whether nested directories should be scanned recursively.
- max_depth (int|null, default: null)
  - Maximum recursion depth relative to the starting directory.
  - 1 means "only direct children".
  - The actual maximum depth used is min(max_depth, config.max_allowed_depth).
- include_size (bool, default: false)
  - Whether to include file size in bytes (for files only).
- include_created_at (bool, default: false)
  - Reserved for future use (creation time; currently may always be null depending on filesystem).
- include_modified_at (bool, default: false)
  - Whether to include last modification time as an ISO 8601 string.
- include_permissions (bool, default: false)
  - Reserved for future use (POSIX‑like permission string); may be null if unavailable.
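The depth rule min(max_depth, config.max_allowed_depth) can be written out as a small sketch. The function name effectiveMaxDepth is hypothetical, and the handling of a null max_depth (falling back to the host limit) is an assumption, since the README does not spell that case out.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper mirroring min(max_depth, config.max_allowed_depth).
 */
function effectiveMaxDepth(?int $requestedMaxDepth, int $maxAllowedDepth): int
{
    if ($requestedMaxDepth === null) {
        // Assumption: without a request-level limit only the host limit applies.
        return $maxAllowedDepth;
    }

    // The agent can lower the depth, but never exceed the host's limit.
    return min($requestedMaxDepth, $maxAllowedDepth);
}
```

For example, with max_allowed_depth = 10 a request for max_depth = 3 scans three levels, while a request for max_depth = 50 is capped at 10.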
Request parameters – file_content_search
The file_content_search tool accepts the same parameters as directory_scanner, plus:
- pattern (string, required)
  - Text or pattern to search for in file contents.
  - Must be a non‑empty string.
- search_mode (string, default: plain)
  - Controls how pattern is applied to file contents.
  - One of:
    - plain – case‑insensitive substring search,
    - plain_case_sensitive – case‑sensitive substring search,
    - regex – PHP regular expression, pattern is wrapped as "/{$pattern}/u",
    - similarity – fuzzy match using similar_text (internal threshold ~70%).
If search_mode = 'regex' and the pattern is not a valid regular expression, the tool throws
InvalidToolCallException.
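As a rough picture of how the four search modes differ, the following plain-PHP sketch maps each mode to a standard function. It is not the library's strategy implementation: contentMatches is a hypothetical name, InvalidArgumentException stands in for InvalidToolCallException, and comparing the pattern against the whole string in similarity mode is an assumption (the README only states that similar_text is used with a threshold of about 70%).

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: decides whether $content matches $pattern in $mode.
 */
function contentMatches(string $content, string $pattern, string $mode = 'plain'): bool
{
    if ($pattern === '') {
        throw new InvalidArgumentException('pattern must be a non-empty string');
    }

    switch ($mode) {
        case 'plain':
            // Case-insensitive substring search.
            return stripos($content, $pattern) !== false;

        case 'plain_case_sensitive':
            return str_contains($content, $pattern);

        case 'regex':
            // Same wrapping as described above; a "/" inside the pattern
            // would need escaping by the caller.
            $regex = "/{$pattern}/u";
            // Validate the pattern on an empty subject first, so a broken
            // pattern is reported separately from a non-matching file.
            if (@preg_match($regex, '') === false) {
                throw new InvalidArgumentException('pattern is not a valid regular expression');
            }
            return @preg_match($regex, $content) === 1;

        case 'similarity':
            $percent = 0.0;
            similar_text($content, $pattern, $percent);
            return $percent >= 70.0; // assumed threshold of roughly 70%

        default:
            throw new InvalidArgumentException("unknown search mode: {$mode}");
    }
}
```

Searching "Fix TODO later" for todo matches in plain mode but not in plain_case_sensitive mode, and the pattern ( is rejected in regex mode because it does not compile.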
Example: calling directory_scanner via Public API
use AdachSoft\AiToolCall\PublicApi\Dto\ToolCallRequestDto as PublicToolCallRequestDto;

$request = new PublicToolCallRequestDto(
    toolName: 'directory_scanner',
    parameters: [
        'path' => '.',
        'recursive' => true,
        'max_depth' => 3,
        'include_size' => true,
        'include_modified_at' => true,
    ],
);

$result = $facade->callTool($request);

// $result->toolName === 'directory_scanner'
// $result->result is an array with keys 'items' and 'summary'
$items = $result->result['items'];
$summary = $result->result['summary'];
Example: calling file_content_search via Public API
use AdachSoft\AiToolCall\PublicApi\Dto\ToolCallRequestDto as PublicToolCallRequestDto;

$request = new PublicToolCallRequestDto(
    toolName: 'file_content_search',
    parameters: [
        'path' => '.',
        'recursive' => true,
        'max_depth' => 3,
        'pattern' => 'TODO',
        'search_mode' => 'plain',
    ],
);

$result = $facade->callTool($request);

// $result->toolName === 'file_content_search'
// $result->result has the same shape as for directory_scanner
$items = $result->result['items'];
$summary = $result->result['summary'];
Only files whose contents match the given pattern (according to search_mode) are returned in
items. Directories are never included in file_content_search results.
Response structure
Both tools return a structure containing two top‑level keys: items and summary.
items
items is a flat list of scan entries:
/**
* @var array<int, array{
* path: string,
* name: string,
* is_file: bool,
* is_directory: bool,
* size?: int,
* created_at?: string,
* modified_at?: string,
* permissions?: string,
* }> $items
*/
$items = $result->result['items'];
- path – relative path from the configured base path.
- name – basename of the entry (file or directory name).
- is_file – true if the entry is a file.
- is_directory – true if the entry is a directory.
- size – file size in bytes. Present only when include_size / default_include_size is enabled for a file entry and the filesystem provides a size.
- created_at – creation time as ISO 8601 string. Present only when include_created_at / default_include_created_at is enabled and creation time is available from the filesystem.
- modified_at – last modification time as ISO 8601 string. Present only when include_modified_at / default_include_modified_at is enabled and last modification time is available from the filesystem.
- permissions – POSIX‑style permissions string. Present only when include_permissions / default_include_permissions is enabled and the filesystem exposes a permissions string.
Optional metadata keys are omitted entirely when the corresponding include flags are disabled or
not requested. The tools do not emit "...": null for fields that were not explicitly asked for.
For file_content_search, the structure is identical, but only entries with is_file === true that
match the content search criteria are present.
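Because optional metadata keys may be absent, consumer code should check for them before reading. The sketch below shows one way to do that; formatItems is a hypothetical helper and the sample $items array is made-up data in the documented shape, not real tool output.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: renders each entry as one line, adding metadata
 * only when the optional keys are present.
 *
 * @param array<int, array<string, mixed>> $items
 * @return string[]
 */
function formatItems(array $items): array
{
    $lines = [];

    foreach ($items as $item) {
        $line = ($item['is_directory'] ? '[dir]  ' : '[file] ') . $item['path'];

        // isset() covers both "key omitted" and "key present but null".
        if (isset($item['size'])) {
            $line .= sprintf(' (%d bytes)', $item['size']);
        }
        if (isset($item['modified_at'])) {
            $line .= ' modified ' . $item['modified_at'];
        }

        $lines[] = $line;
    }

    return $lines;
}

// Made-up sample data in the documented shape.
$items = [
    ['path' => 'src', 'name' => 'src', 'is_file' => false, 'is_directory' => true],
    [
        'path' => 'src/Foo.php',
        'name' => 'Foo.php',
        'is_file' => true,
        'is_directory' => false,
        'size' => 1204,
        'modified_at' => '2025-01-15T10:30:00+00:00',
    ],
];

$lines = formatItems($items);
```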
summary
summary contains metadata about the scan:
/**
* @var array{
* base_path: string,
* requested_path: string,
* recursive: bool,
* requested_max_depth: int|null,
* effective_max_depth: int,
* actual_depth_reached: int,
* total_entries_found: int,
* returned_entries_count: int,
* truncated_by_max_entries: bool,
* } $summary
*/
$summary = $result->result['summary'];
- base_path – the configured base path used for scanning.
- requested_path – the path value from the request.
- recursive – whether recursive scanning was enabled for the request.
- requested_max_depth – raw max_depth from the request (may be null).
- effective_max_depth – actual recursion depth limit used after applying config constraints.
- actual_depth_reached – deepest level reached during the scan.
- total_entries_found – total number of entries encountered (before truncation).
- returned_entries_count – number of entries actually returned in items.
- truncated_by_max_entries – true if scanning was stopped because max_entries was reached.
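A caller can use the summary to detect incomplete results before acting on items. The following sketch assumes the summary keys documented above; describeScan is a hypothetical helper and the sample values are invented for illustration.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: turns a summary array into a one-line report and
 * flags results that were cut off by the entry limit.
 *
 * @param array<string, mixed> $summary
 */
function describeScan(array $summary): string
{
    $text = sprintf(
        '%d of %d entries returned for "%s" (depth %d of %d)',
        $summary['returned_entries_count'],
        $summary['total_entries_found'],
        $summary['requested_path'],
        $summary['actual_depth_reached'],
        $summary['effective_max_depth'],
    );

    if ($summary['truncated_by_max_entries']) {
        // The listing is incomplete; a narrower path or smaller depth may help.
        $text .= '; truncated by max_entries';
    }

    return $text;
}

// Invented sample values in the documented shape.
$summary = [
    'base_path' => '/var/www/my-project',
    'requested_path' => '.',
    'recursive' => true,
    'requested_max_depth' => 3,
    'effective_max_depth' => 3,
    'actual_depth_reached' => 3,
    'total_entries_found' => 5000,
    'returned_entries_count' => 5000,
    'truncated_by_max_entries' => true,
];

$report = describeScan($summary);
```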
Error handling
The tools use exceptions from adachsoft/ai-tool-call and their own domain exceptions to signal
problems:
- InvalidToolCallException
  - Thrown when request parameters are invalid (e.g. wrong types, impossible options, invalid pattern for file_content_search).
- ToolExecutionException
  - Wraps domain and filesystem errors that occur during scanning or content search.
  - The original cause is available as the previous exception and usually contains a DirectoryScannerToolException with a more detailed message.
- DirectoryScannerDomainException
  - Used internally for invalid or unsafe path operations (e.g. attempts to escape base path).
In typical adachsoft/ai-tool-call setups, these exceptions are translated into structured error
responses returned to the AI agent.
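When handling these exceptions yourself, the detailed cause can be reached through the standard previous-exception chain. The sketch below uses only PHP's Throwable interface; exceptionChain is a hypothetical helper, and RuntimeException and DomainException are stand-ins for the library's exception classes, whose namespaces are not listed here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: flattens an exception and its previous exceptions
 * into a list, outermost first.
 *
 * @return array<int, array{class: string, message: string}>
 */
function exceptionChain(Throwable $error): array
{
    $chain = [];

    for ($current = $error; $current !== null; $current = $current->getPrevious()) {
        $chain[] = [
            'class' => $current::class,
            'message' => $current->getMessage(),
        ];
    }

    return $chain;
}

// Stand-ins: a wrapping execution error with a domain error as its cause.
$cause = new DomainException('Path escapes base path');
$wrapped = new RuntimeException('Tool execution failed', 0, $cause);

$chain = exceptionChain($wrapped);
```

In a real setup you would pass the caught ToolExecutionException to such a helper and log the last element, which carries the most specific message.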
Development
To work on the library locally, install dev dependencies and run the checks:
composer install
# Run test suite
vendor/bin/phpunit
# Run static analysis
vendor/bin/phpstan analyse
# Run coding standards fixer (dry run or fix)
PHP_CS_FIXER_IGNORE_ENV=1 vendor/bin/php-cs-fixer fix --dry-run
Versioning
This library follows Semantic Versioning. Versions are published as Git tags
and exposed on Packagist. The composer.json file does not contain an explicit version field;
Composer reads version information from VCS tags.
See CHANGELOG.md for a list of notable changes between versions.
License
This library is open‑source software licensed under the MIT License. See the LICENSE
file for full license text.
Author
- Arkadiusz Adach