What is mview in Magento 2?
In official documentation: https://devdocs.magento.com/guides/v2.3/extension-dev-guide/indexing.html there is the stement:
`Allows tracking database changes for a certain entity (product, category and so on) and running change handler.
Emulates the materialized view technology for MySQL using triggers and separate materialization process (provides executing PHP code instead of SQL queries, which allows materializing multiple queries).`
MView stands for Materialized View which is a snapshot of the database at a point in time. https://en.wikipedia.org/wiki/Materialized_view Why would we need to duplicate tables. Indexers are costly to run, especially when there is traffic on category pages, customers place orders and admins save products. On product save the cache gets invalidated (off topic). In case of stock indexer, before it ends the execution, it sends the entity ids affected as cache tags to be cleaned (full page cache type). In Magento 2.0 categories ids of purchased products are sent. In Magento 2.1 the product ids are sent.
There are 2 MySQL tables that keep indexer codes and statuses:
indexer_state
mview_state
mview_state
works with Update by Schedule
in Admin > System > Indexer Management
Update by Schedule
makes the indexers to be run in cron.
There are 3 entries in Magento_Indexer/etc/contab.xml
:
<group id="index">
<job name="indexer_reindex_all_invalid" instance="Magento\Indexer\Cron\ReindexAllInvalid" method="execute">
<schedule>* * * * *</schedule>
</job>
<job name="indexer_update_all_views" instance="Magento\Indexer\Cron\UpdateMview" method="execute">
<schedule>* * * * *</schedule>
</job>
<job name="indexer_clean_all_changelogs" instance="Magento\Indexer\Cron\ClearChangelog" method="execute">
<schedule>0 * * * *</schedule>
</job>
</group>
indexer_reindex_all_invalid
is run onindexer_state
. There are is still the need to run 'normal' indexers in cronindexer_update_all_views
is run onmview_state
indexer_clean_all_changelogs
- clears changelogs used bymview_state
Note that cron indexer group tasks run in a separate php process, as declared in etc/contab_groups.xml
:
<use_separate_process>1</use_separate_process>
.
Changelog tables are:
[indexer name]_cl
(suffixed with _cl
).
e.g. cataloginventory_stock_cl
. If you have indexers set to Update by Schedule
and save a product in admin you'll see the entity_id
of that product in this table. It's a big circle, I'm thinking place order or create shipment will add here an entry too.
Someone provided an example in official devdoc on how to create new materialized views and what are the interface methods required (disregard the above statement about orders in the snippet bellow):
<?php
<VendorName>\Merchandizing\Model\Indexer;
class Popular implements \Magento\Framework\Indexer\ActionInterface, \Magento\Framework\Mview\ActionInterface
{
public function executeFull(); //Should take into account all placed orders in the system
public function executeList($ids); //Works with a set of placed orders (mass actions and so on)
public function executeRow($id); //Works in runtime for a single order using plugins
public function execute($ids); //Used by mview, allows you to process multiple placed orders in the "Update on schedule" mode
}
This will make sense:
//public function execute($ids); Used by mview, allows you to process multiple **entities** in the "Update on schedule" mode
}
Where $ids
parameter has the entities ids from *_cl
tables.
What is the link between cache invalidation and indexers. Categories pages are now full page cached (built-in full page cache or through Varnish).
There is \Magento\Indexer\Model\Processor\InvalidateCache::afterUpdateMview
:
/**
* Update indexer views
*
* @param \Magento\Indexer\Model\Processor $subject
* @return void
* @SuppressWarnings(PHPMD.UnusedFormalParameter)
*/
public function afterUpdateMview(\Magento\Indexer\Model\Processor $subject)
{
if ($this->moduleManager->isEnabled('Magento_PageCache')) {
$this->eventManager->dispatch('clean_cache_after_reindex', ['object' => $this->context]);
}
}
Back to Magento\Indexer\Cron\UpdateMview::execute()
:
/**
* Regenerate indexes for all invalid indexers
*
* @return void
*/
public function execute()
{
$this->processor->updateMview();
}
Magento\Indexer\Model\Processor::updateMview()
:
/**
* Update indexer views
*
* @return void
*/
public function updateMview()
{
$this->mviewProcessor->update('indexer');
}
In app/etc/di.xml
there is:
<preference for="Magento\Framework\Mview\ProcessorInterface" type="Magento\Framework\Mview\Processor" />
/**
* Materialize all views by group (all views if empty)
*
* @param string $group
* @return void
*/
public function update($group = '')
{
foreach ($this->getViewsByGroup($group) as $view) {
$view->update();
}
}
Magento\Framework\Mview\ViewInterface
/**
* Materialize view by IDs in changelog
*
* @return void
* @throws \Exception
*/
public function update();
app/etc/di.xml
<preference for="Magento\Framework\Mview\ViewInterface" type="Magento\Framework\Mview\View" />
In Magento\Framework\Mview\View::update()
there is:
$action = $this->actionFactory->get($this->getActionClass());
$this->getState()->setStatus(View\StateInterface::STATUS_WORKING)->save();
..
$action->execute($ids);
..
If you search in vendor/
directory for Magento\Framework\Mview\ActionInterface
you'll find for example this:
In \Magento\CatalogInventory\Model\Indexer
:
class Stock implements \Magento\Framework\Indexer\ActionInterface, \Magento\Framework\Mview\ActionInterface
In this class there is:
/**
* Execute materialization on ids entities
*
* @param int[] $ids
*
* @return void
*/
public function execute($ids)
{
$this->_productStockIndexerRows->execute($ids);
}
And it looks like it goes back to 'normal' class of indexers' execute` method which is used by MView.
About cache cleaning after Stock Indexer.
When an order is placed on checkout, the quantities are subtracted using this observer: \Magento\CatalogInventory\Observer\SubtractQuoteInventoryObserver
$itemsForReindex = $this->stockManagement->registerProductsSale(
$items,
$quote->getStore()->getWebsiteId()
);
Further, another observer triggers indexer (but not directly on Mview/Indexer by Schedule):
\Magento\CatalogInventory\Observer\ReindexQuoteInventoryObserver
if ($productIds) {
$this->stockIndexerProcessor->reindexList($productIds);
}
In Mview case, when the new quantities are subtracted in SubtractQuoteInventoryObserver
, the MySQL trigger (created for Mview) will insert a row in cataloginventory_stock_cl
, marking that a reindex (stock & fulltext) needs to be done to those purchased product ids.
There are many MySQL triggers created for Mview. See them all with SHOW TRIGGERS;
.
When a product gets out of stock after checkout you'll see 2 rows inserted in that table (Magento saves 2 times stock item in these 2 observers).
When cron runs stock indexer in Mview mode the affected product ids (in M2.1) or categories ids (in M2.0) are sent to cache clean as cache tags. By cache I mean full page cache type. Example: catalog_product_99
or other cache tag format depending on the Magento version. The same when Mview is not enabled.
\Magento\CatalogInventory\Model\Indexer\Stock\AbstractAction::_reindexRows
...
$this->eventManager->dispatch('clean_cache_by_tags', ['object' => $this->cacheContext]);
And Magento_PageCache has observer \Magento\PageCache\Observer\FlushCacheByTags
that will clean full page cache type by tags. It does it for buil-in full page cache. Varnish related code is in \Magento\CacheInvalidate\Observer\InvalidateVarnishObserver
.
There is a free extension that will deny cache clean on still in stock products after customer checkout:
https://github.com/daniel-ifrim/innovo-cache-improve
Cache cleaning only on out of stock products after checkout was introduced in default Magento 2.2.x. See \Magento\CatalogInventory\Model\Indexer\Stock\CacheCleaner
.
I'm thinking the cron execution for indexer in Admin > Stores > Configuration > Advanced > System > Cron configuration options for group: index
should be set to much more than 1 minute.
Reference from Magento document is already here so I'm skipping that part.
Magento implemented materialized view in 2.0 which tracks changes for all indexers. Each indexer has a _cl
table which gets entity_id
and a auto_increment
version_id
from triggers added on main tables.
When cron job executes, indexer gets last version_id
for each view from mview_state
table and index next available entities in _cl
table.
Reindexing was a headache till 1.9.x.x and with huge catalog it always slow down the system.
In Magento 2.0 indexer only update the particular entity information on indexer tables rather than reindexing whole data. This keeps ball rolling without slower down the server.
Note: Materialized View is not supported in mysql so in Magento, it is managed by PHP code and it works similar to Materialized view which is a feature in enterprise level DBMS like oracle.
The mview.xml
is used along with indexer.xml
to setup indexers.
The mview.xml
file declares:
- indexer view ID
- indexer class
- the database tables the indexer tracks
- what column data is sent to the indexer
The indexer.xml
file declares:
- indexer ID
- indexer class name
- indexer title
- indexer description
- indexer view ID
You can find more information about custom indexer declaration here: Custom indexer on Magento2
From what I understood, there's two different things here:
- The indexer from the
Magento_Indexer
module - The Mview from
Magento\Framework\Mview
which emulates the materialized view for MySQL using triggers.
Here are some englighted info from the official documentation
Indexing types
Each index can perform the following types of reindex operations:
Full reindex, which means rebuilding all the indexing-related database tables.
Full reindexing can be caused by a variety of things, including creating a new web store or new customer group. You can optionally fully reindex at any time using the command line.
Partial reindex, which means rebuilding the database tables only for the things that changed (for example, changing a single product attribute or price).
The type of reindex performed in each particular case depends on the type of changes made in the dictionary or in the system. This dependency is specific for each indexer.
Regarding the Workflow, here it is for partial reindexing: