Memory leak when executing Doctrine query in loop
I just ran into the same issue. These are the things that fixed it for me:
--no-debug
As the OP mentioned in their answer, setting --no-debug (ex: php bin/console <my_command> --no-debug) is crucial for performance and memory in Symfony console commands. This is especially true when using Doctrine: without it, the command runs in debug mode, where information about every query is collected and memory use grows on each iteration. See the Symfony docs here and here for more info.
--env=prod
You should also always specify the environment. By default, Symfony uses the dev environment for console commands. The dev environment usually isn't optimized for memory, speed, CPU, etc. If you want to iterate over thousands of items, you should probably be using the prod environment (ex: php bin/console <my_command> --env=prod). See here and here for more info.
Tip: I created an environment called console that I specifically configured for running console commands. Here is info about how to create additional Symfony environments.
php -d memory_limit=YOUR_LIMIT
If running a big update, you should probably choose how much memory is acceptable for it to consume. This is especially important if you think there might be a leak. You can specify the memory for the command by using php -d memory_limit=x (ex: php -d memory_limit=256M). Note: you can set the limit to -1 (usually the default for the PHP CLI) to let the command run with no memory limit, but this is obviously dangerous.
A Well Formed Console Command For Batch Processing
A well formed console command for running a big update using the above tips would look like:
php -d memory_limit=256M bin/console <acme>:<your_command> --env=prod --no-debug
Use Doctrine's IterableResult
Another big win when using Doctrine's ORM in a loop is to use Doctrine's IterableResult (see the Doctrine Batch Processing docs). This won't help in the example provided, but processing like this usually runs over the results of a query.
Flush Periodically
If part of what you are doing is making changes to the data, you should flush periodically instead of on each iteration. Flushing is expensive and slow. The less often you flush, the faster your command will finish. Keep in mind, however, that Doctrine will hold the unflushed data in memory. So the less often that you flush, the more memory you will need.
You can use something like the following to flush every 100 iterations:
if ($count % 100 === 0) {
    $this->em->flush();
}
Also make sure to flush again at the end of your loop (for flushing the last < 100 entries).
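To show the periodic flush and the final flush together, here is a minimal sketch. The function name processWithPeriodicFlush and its callable parameters are made up for illustration; in a real command the $flush callable would wrap $this->em->flush().

```php
<?php
declare(strict_types=1);

/**
 * Calls $handle for each item and $flush after every $batchSize items,
 * plus once more at the end if a partial batch is left over.
 * Returns the number of times $flush was called.
 *
 * Hypothetical helper: the name and signature are assumptions, not Doctrine API.
 */
function processWithPeriodicFlush(iterable $items, int $batchSize, callable $handle, callable $flush): int
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $count = 0;
    $flushes = 0;
    foreach ($items as $item) {
        $handle($item);
        $count++;
        if ($count % $batchSize === 0) {
            $flush(); // A full batch is done.
            $flushes++;
        }
    }

    // Flush the remainder (the last < $batchSize items), if there is one.
    if ($count % $batchSize !== 0) {
        $flush();
        $flushes++;
    }

    return $flushes;
}
```

With 250 items and a batch size of 100, this flushes after items 100 and 200 and once more for the remaining 50.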
Clear the EntityManager
You may also want to clear after you flush:
$this->em->flush();
$this->em->clear(); // Detach ALL objects from Doctrine.
Or
$this->em->flush();
// Note: passing an entity class to clear() is deprecated in newer ORM versions.
$this->em->clear(MyEntity::class); // Detach all MyEntity from Doctrine.
$this->em->clear(MyRelatedEntity::class); // Detach all MyRelatedEntity from Doctrine.
Output the memory usage as you go
It can be really helpful to keep track of how much memory your command is consuming while it is running. You can do that by outputting the value returned by PHP's built-in memory_get_usage() function.
$output->writeln((string) memory_get_usage());
Example
$memUse = round(memory_get_usage() / 1000000, 2).'MB';
$this->output->writeln('Processed '.$i.' of '.$totalCount.' (mem: '.$memUse.')');
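If you print progress in more than one place, it may help to pull the formatting into small functions. This is only a sketch: formatBytesAsMb and progressLine are names I made up, and the peak figure uses PHP's built-in memory_get_peak_usage(), which the example above does not print.

```php
<?php
declare(strict_types=1);

/**
 * Formats a byte count as decimal megabytes with two decimals, e.g. "90.00MB".
 * Hypothetical helper for illustration.
 */
function formatBytesAsMb(int $bytes): string
{
    return number_format($bytes / 1000000, 2, '.', '') . 'MB';
}

/**
 * Builds one progress line with current and peak memory.
 * Hypothetical helper for illustration.
 */
function progressLine(int $processed, int $total, int $bytes, int $peakBytes): string
{
    return sprintf(
        'Processed %d of %d (mem: %s, peak: %s)',
        $processed,
        $total,
        formatBytesAsMb($bytes),
        formatBytesAsMb($peakBytes)
    );
}

// Usage: in a command you would pass this string to $output->writeln().
echo progressLine(1000, 4000, memory_get_usage(), memory_get_peak_usage()), PHP_EOL;
```

If the current figure keeps climbing from batch to batch, something is still being held in memory. If only the peak is high, a single step is the expensive one.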
Roll Your Own Batches
It may also be helpful to roll your own batches. You can do this by using a start and limit, just like you would for pagination. I was able to process 4 million rows using only 90 MB of RAM doing this.
Here's some example code:
protected function execute(InputInterface $input, OutputInterface $output) {
    /* ... */
    $totalCount = $this->getTotalCount();
    $batchSize = 10000;
    $i = 0;
    while ($i < $totalCount) {
        $next = $this->processBatch($i, $batchSize, $totalCount);
        if ($next === $i) {
            // No rows came back (e.g. rows were deleted meanwhile); avoid looping forever.
            break;
        }
        $i = $next;
    }
    /* ... */
}

private function processBatch(int $start, int $limit, int $totalCount): int {
    /** @var \Doctrine\ORM\Query $q */
    $q = $this->em->createQueryBuilder()
        ->select('e')
        ->from(MyEntity::class, 'e')
        ->orderBy('e.id', 'ASC') // A stable order keeps batches from overlapping or skipping rows.
        ->setFirstResult($start)
        ->setMaxResults($limit)
        ->getQuery();
    // Note: newer ORM versions deprecate iterate() in favor of toIterable().
    /** @var \Doctrine\ORM\Internal\Hydration\IterableResult $iterableResult */
    $iterableResult = $q->iterate(null, \Doctrine\ORM\Query::HYDRATE_SIMPLEOBJECT);
    $i = $start;
    foreach ($iterableResult as $row) {
        /** @var \App\Entity\MyEntity $myEntity */
        $myEntity = $row[0];
        $this->processOne($myEntity);
        // Report progress and memory use every 1000 rows.
        if (0 === ($i % 1000)) {
            $memUse = round(memory_get_usage() / 1000000, 2).'MB';
            $this->output->writeln('Processed '.$i.' of '.$totalCount.' (mem: '.$memUse.')');
        }
        // Detach the entity so Doctrine stops tracking it and it can be garbage collected.
        $this->em->detach($myEntity);
        $i++;
    }
    return $i;
}

private function processOne(MyEntity $myEntity): void {
    // Do entity processing here.
}

private function getTotalCount(): int {
    /** @var \Doctrine\ORM\Query $q */
    $q = $this->em
        ->createQueryBuilder()
        ->select('COUNT(e.id)')
        ->from(MyEntity::class, 'e')
        ->getQuery();
    // The scalar result may come back as a string, so cast it for the int return type.
    return (int) $q->getSingleScalarResult();
}
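The start/limit arithmetic behind the batches can be checked without a database. Here is a small sketch; batchWindows is a hypothetical name, and the pairs it yields correspond to the values you would pass to setFirstResult() and setMaxResults().

```php
<?php
declare(strict_types=1);

/**
 * Yields [offset, limit] pairs that cover $totalCount rows in chunks of $batchSize.
 * The last pair is shortened so it does not run past the end.
 * Hypothetical helper for illustration.
 */
function batchWindows(int $totalCount, int $batchSize): Generator
{
    if ($batchSize < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    for ($offset = 0; $offset < $totalCount; $offset += $batchSize) {
        yield [$offset, min($batchSize, $totalCount - $offset)];
    }
}

// Usage: 25 rows in batches of 10 gives three windows, the last one shorter.
foreach (batchWindows(25, 10) as [$offset, $limit]) {
    echo "offset=$offset limit=$limit", PHP_EOL;
}
```

With the numbers mentioned above (4 million rows, batches of 10000), this yields 400 windows.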
Good luck!
When an SQL logger is enabled, Doctrine keeps a log of every query you make. If you make lots of queries (which normally happens in loops), that log keeps growing and can look like a huge memory leak.
To overcome this, you need to disable the Doctrine SQL Logger.
I recommend doing this only for the loop part, so logging still works for the rest of the code.
Before loop, get current logger:
$sqlLogger = $em->getConnection()->getConfiguration()->getSQLLogger();
And then disable the SQL Logger:
$em->getConnection()->getConfiguration()->setSQLLogger(null);
Do loop here:
foreach() / while() / for()
After loop ends, put back the Logger:
$em->getConnection()->getConfiguration()->setSQLLogger($sqlLogger);
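One refinement worth considering is to restore the logger in a finally block, so that it is put back even if the loop throws. The sketch below assumes only the getSQLLogger() and setSQLLogger() calls used above. withoutSqlLogger is a name I made up, and StandInConfiguration is a stand-in so the sketch runs without Doctrine installed; in real code you would pass $em->getConnection()->getConfiguration().

```php
<?php
declare(strict_types=1);

/**
 * Runs $work with the SQL logger disabled, then restores the previous logger,
 * even when $work throws. Returns whatever $work returns.
 *
 * Hypothetical helper: $configuration only needs getSQLLogger() and setSQLLogger().
 */
function withoutSqlLogger(object $configuration, callable $work)
{
    $previousLogger = $configuration->getSQLLogger();
    $configuration->setSQLLogger(null);

    try {
        return $work();
    } finally {
        $configuration->setSQLLogger($previousLogger);
    }
}

// Stand-in for the Doctrine configuration object, for illustration only.
final class StandInConfiguration
{
    private ?object $logger;

    public function __construct(?object $logger)
    {
        $this->logger = $logger;
    }

    public function getSQLLogger(): ?object
    {
        return $this->logger;
    }

    public function setSQLLogger(?object $logger): void
    {
        $this->logger = $logger;
    }
}
```

The loop then goes inside the callable, for example withoutSqlLogger($config, function () { /* foreach / while / for */ });.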
I resolved this by adding --no-debug to my command. It turns out that in debug mode, the profiler was storing information about every single query in memory.