paysera / lib-pagination
Paginates Doctrine QueryBuilder using cursor based or offset based pagination
Installs: 127 364
Dependents: 2
Suggesters: 0
Security: 0
Stars: 12
Watchers: 10
Forks: 14
Open Issues: 5
Requires
- php: ^7.1|^8.0
- ext-json: *
- doctrine/orm: ^2.5
- psr/log: ^1.0|^2.0
- symfony/property-access: ^3.0|^4.0|^5.0|^6.0
Requires (Dev)
- doctrine/annotations: ^1.0 || ^2.0
- phpunit/phpunit: ^7.5 || ^8.5 || ^9.5
- symfony/cache: ^3.0|^4.0|^5.0|^6.0
This package is auto-updated.
Last update: 2024-10-27 15:24:04 UTC
README
This component allows to paginate Doctrine QueryBuilder
instances using cursor-based pagination. Offset is also supported.
Why?
Performance
Cursor pagination can be much more efficient under large data-sets.
For example, to take 3000th page, consisting of 100 items, using offset you would need to execute such query:
SELECT * FROM items WHERE field = 'some value' ORDER BY created_at DESC LIMIT 100 OFFSET 29900
This type of query, even if you have indexes on the fields you are querying by (and the fields you order by), would need to iterate over 29900 items first, before giving you the requested result.
If using cursor based pagination, the resulting query can look like this:
SELECT * FROM items WHERE field = 'some value' AND created_at <= '2018-01-01 00:01:02' AND id < 12345 ORDER BY created_at DESC, id DESC LIMIT 100
This allows to add required indexes to the table to enable fast querying. Keep in mind that without any indexes, the performance can be really similar.
In this concrete example, there should be a multiple-column index on field,created_at,id
for best performance
(or just field,created_at
if created_at
is rather unique).
Iterate over all the items
When using query from previous example with offset based pagination, it's possible to get the same item twice or miss some of the items.
This is because when we're getting the pages, new items can be added or some of the items removed, which changes the position of all other items.
Let's take an example. We have pages, each of 5 items. We've already got the first page with items 1 2 3 4 5
. Before
we get the second page, these situations might occur:
Original: 1 2 3 4 5 6 7 8 9
Second page: 6 7 8 9
"0" is added: 0 1 2 3 4 5 6 7 8 9
Second page: 5 6 7 8 9 <- 5 is duplicated
"2" was removed: 1 3 4 5 6 7 8 9
Second page: 7 8 9 <- 6 was skipped
If we're using cursor based pagination, we start the page after (including or excluding) or before some concrete item, so we don't get these types of issues.
Caveats
Using cursors, it's easy to move to the next and previous pages, but it's not possible to:
- know what page you are in right now;
- to skip some number of pages directly or go to page number X.
To support mixed cases:
- cursors are always provided and available for pagination;
hasNext
andhasPrevious
are also always available to know if you're in the first or last page;- offset is available for pagination, but can be limited to some maximum. This could be used to move to N-th page, but not 100N-th, which is not needed that often;
- if you need to skip to the last page, you should offer (or do that automatically) to reverse the ordering and go to first page instead;
- to display number of pages, total count must be calculated. This can be done but is disabled by default, again, for performance reasons. You should avoid giving this with each of the pages – instead provide total count only when needed or at least cache it.
Installation
composer require paysera/lib-pagination
Basic Usage
Use ResultProvider
class (service) to provide the result.
To get the result, two arguments are needed:
ConfiguredQuery
. This is wrapper aroundQueryBuilder
and has configuration for available ordering fields, maximum offset, result item transformation and whether total count should be calculated;Pager
. This holds parameters provided by the user: offset or after/before cursor, limit and ordering directions.
Generally ConfiguredQuery
holds internals related to QueryBuilder
so it's recommended to return this from the
Repository
directly – let's hold implementation details together.
Pager
should be created somewhere in the input layer, usually in the controller or similar place.
Example usage:
<?php use Paysera\Pagination\Entity\Doctrine\ConfiguredQuery; use Paysera\Pagination\Service\Doctrine\ResultProvider; use Paysera\Pagination\Service\CursorBuilder; use Paysera\Pagination\Service\Doctrine\QueryAnalyser; use Paysera\Pagination\Entity\OrderingConfiguration; use Paysera\Pagination\Entity\Pager; use Paysera\Pagination\Entity\OrderingPair; use Symfony\Component\PropertyAccess\PropertyAccess; use Doctrine\ORM\EntityManagerInterface; $resultProvider = new ResultProvider( new QueryAnalyser(), new CursorBuilder(PropertyAccess::createPropertyAccessor()) ); /** @var EntityManagerInterface $entityManager */ $queryBuilder = $entityManager->createQueryBuilder() ->select('m') ->from('Bundle:MyEntity', 'm') ->andWhere('m.field = :param') ->setParameter('param', 'some value') ; $configuredQuery = (new ConfiguredQuery($queryBuilder)) ->addOrderingConfiguration( 'my_field_name', (new OrderingConfiguration('m.field', 'field'))->setOrderAscending(true) ) ->addOrderingConfiguration('my_other_name', new OrderingConfiguration('m.otherField', 'otherField')) ->setTotalCountNeeded(true) // total count will be returned only if this is called ->setMaximumOffset(100) // you can optionally limit maximum offset ->setItemTransformer(function ($item) { // return transformed item if needed }) ; $pager = (new Pager()) ->setLimit(10) ->setOffset(123) // set only one of offset, after or before ->setAfter('Cursor from Result::getNextCursor') ->setBefore('Cursor from Result::getPreviousCursor') ->addOrderBy(new OrderingPair('my_field_name')) // order by default direction (asc in this case) ->addOrderBy(new OrderingPair('my_other_name', true)) // or set direction here ; $result = $resultProvider->getResultForQuery($configuredQuery, $pager); $result->getItems(); // items in the page $result->getNextCursor(); // value to pass with setAfter for next page $result->getPreviousCursor(); // value to pass with setBefore for previous page $result->getTotalCount(); // available only if setTotalCountNeeded(true) was called $totalCount = $resultProvider->getTotalCountForQuery($configuredQuery); // calculate total count directly
Using iterator
There are a few classes (services) to help iterating over large result sets:
ResultIterator
– iterates using pagination, hides the pagination beneath;FlushingResultIterator
– same asResultIterator
, but flushes and clearsEntityManager
after each page. Use when you need to modify Entities managed by Doctrine. This pattern (instead of just calling$em->flush()
in the end) is needed to avoid out of memory failures and to optimize the process.
Using ResultIterator
:
<?php use Paysera\Pagination\Entity\Doctrine\ConfiguredQuery; use Paysera\Pagination\Service\Doctrine\ResultIterator; use Paysera\Pagination\Service\CursorBuilder; use Paysera\Pagination\Service\Doctrine\QueryAnalyser; use Paysera\Pagination\Service\Doctrine\ResultProvider; use Symfony\Component\PropertyAccess\PropertyAccess; use Psr\Log\NullLogger; /** @var ConfiguredQuery $configuredQuery */ $resultIterator = new ResultIterator( new ResultProvider( new QueryAnalyser(), new CursorBuilder(PropertyAccess::createPropertyAccessor()) ), new NullLogger(), $defaultLimit = 1000 ); foreach ($this->resultIterator->iterate($configuredQuery) as $item) { // process $item where flush is not needed // for example, send ID or other data to working queue, process files etc. }
Using FlushingResultIterator
:
<?php use Paysera\Pagination\Entity\Doctrine\ConfiguredQuery; use Paysera\Pagination\Service\Doctrine\FlushingResultIterator; use Paysera\Pagination\Service\CursorBuilder; use Paysera\Pagination\Service\Doctrine\QueryAnalyser; use Paysera\Pagination\Service\Doctrine\ResultProvider; use Symfony\Component\PropertyAccess\PropertyAccess; use Psr\Log\NullLogger; use Doctrine\ORM\EntityManagerInterface; /** @var ConfiguredQuery $configuredQuery */ /** @var EntityManagerInterface $entityManager */ $resultIterator = new FlushingResultIterator( new ResultProvider( new QueryAnalyser(), new CursorBuilder(PropertyAccess::createPropertyAccessor()) ), new NullLogger(), $defaultLimit = 1000, $entityManager ); foreach ($this->resultIterator->iterate($configuredQuery) as $item) { // process $item or other entities where flush will be called after each page // for example: $item->setTitle(formatTitleFor($item)); // keep in mind, that clear is also called – don't reuse other Entities outside of foreach cycle } echo "Updated successfully"; // no need to flush here anymore
If there was out of memory exception etc, search logs (INFO level) for last "Continuing with iteration" message, look at "after" in the context, then:
$lastCursor = '"123"'; // get from logs $startPager = (new Pager()) ->setLimit(500) // can also override default limit ->setAfter($lastCursor) ; foreach ($this->resultIterator->iterate($configuredQuery, $startPager) as $item) { // process $item }
Semantic versioning
This library follows semantic versioning.
See Symfony BC rules for basic information about what can be changed and what not in the API.
Running tests
composer update
composer test
Contributing
Feel free to create issues and give pull requests.
You can fix any code style issues using this command:
composer fix-cs