Drupal - Finding nodes that have not been indexed
The status of node indexing is based on the search_dataset table. This table stores content keyword blobs and their associated sid, which identifies the indexed item (for nodes, the nid). When joined against the node table, it should let you see which nodes aren't indexed.
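The join described above can be sketched as follows. This is only an illustration, not Drupal's own code: it uses an in-memory SQLite database with simplified stand-ins for the node and search_dataset tables (the real tables have more columns), and the 'node_search' value for search_dataset.type is an assumption (older versions may use 'node'), so check the values in your own table before relying on it.

```python
import sqlite3

# Simplified stand-ins for Drupal's node and search_dataset tables.
# Assumption: the real schema has more columns; only those needed for the join are here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE node (nid INTEGER PRIMARY KEY, type TEXT);
    CREATE TABLE search_dataset (sid INTEGER, type TEXT, data TEXT);
    INSERT INTO node VALUES (1, 'article'), (2, 'page'), (3, 'article');
    INSERT INTO search_dataset VALUES (1, 'node_search', 'foo bar');
    INSERT INTO search_dataset VALUES (3, 'node_search', 'baz');
""")

# Anti-join: keep every node row that has no matching search_dataset row.
# Assumption: 'node_search' is the type value used for node entries.
UNINDEXED_NODES_SQL = """
    SELECT n.nid
    FROM node n
    LEFT JOIN search_dataset sd
      ON sd.sid = n.nid AND sd.type = 'node_search'
    WHERE sd.sid IS NULL
    ORDER BY n.nid
"""

unindexed = [row[0] for row in conn.execute(UNINDEXED_NODES_SQL)]
print(unindexed)  # prints [2]: node 2 has no search_dataset entry
```

The SELECT statement itself should carry over to the site's database largely unchanged, allowing for any table prefix and the actual type value in use.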
From what it sounds like, you've already spotted the problem node(s), so it's just a matter of confirmation: stop the node(s) from being indexed (e.g. by hacking NodeSearch::indexNode()) to confirm they are the problem, then find out what content in the node is blocking the indexer.