What does _doc represent in Elasticsearch?
From Elasticsearch 8.x onward, only _doc is supported, and it is just an endpoint name, not a document type.
In 7.0, _doc represents the endpoint name instead of the document type. The _doc component is a permanent part of the path for the document index, get, and delete APIs going forward, and will not be removed in 8.0.
Elasticsearch 8.x: Specifying types in requests is no longer supported. The include_type_name parameter is removed.
Schedule For Removal of Mapping Types
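To make the "endpoint name" point concrete, here is a minimal sketch of how the typeless document paths are put together. The helper name doc_endpoint is hypothetical, the twitter index and id 1 are only illustrative, and the sketch assumes ids that need no URL escaping.

```python
from typing import Optional


def doc_endpoint(index: str, doc_id: Optional[str] = None) -> str:
    """Build the typeless document path: /{index}/_doc or /{index}/_doc/{id}.

    Hypothetical helper for illustration; assumes index and id are URL-safe.
    """
    path = f"/{index}/_doc"
    if doc_id is not None:
        path += f"/{doc_id}"
    return path


# Index, get, and delete share the same path; only the HTTP verb differs.
example_requests = [
    ("PUT", doc_endpoint("twitter", "1")),     # index a document with an explicit id
    ("GET", doc_endpoint("twitter", "1")),     # fetch it
    ("DELETE", doc_endpoint("twitter", "1")),  # delete it
    ("POST", doc_endpoint("twitter")),         # index with an auto-generated id
]

for verb, path in example_requests:
    print(verb, path)
```

In every case _doc sits in the same fixed position of the path, which is why it no longer says anything about a type.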
_doc is a mapping type, which by the way is now deprecated.
A mapping type used to be a separate collection inside the same index. E.g. a twitter index could have a mapping of type user for storing all users, and a mapping of type tweet to store all tweets. Both of these types still belong to the same index, so you could search inside multiple types in the same index.
When Elasticsearch announced that mapping types would be deprecated, for several reasons, v6 users were restricted to ONLY 1 mapping type per index, i.e. you can have either user or tweet inside the twitter index, but not both. They further recommended being consistent and using _doc as the name of the mapping type. But this can literally be any string: dog, cat, etc. _doc is just the recommended name because in v7 the mapping type is going away completely. So if every index in Elasticsearch has only 1 mapping type, it is easier to migrate to v7, because you just have to remove the mapping type and all documents then come directly under the index.
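As a rough sketch of the "just remove the mapping type" step, the function below takes a v6-style mappings body with a single type and returns the typeless body that sits directly under the index. The function name strip_mapping_type and the user/message fields are made up for illustration, and the sketch assumes the input really is a typed mapping (a typeless body would be misread as a type called properties).

```python
import copy


def strip_mapping_type(typed_mappings: dict) -> dict:
    """Return the typeless mapping body from a single-type, v6-style mapping.

    Hypothetical helper: assumes the top-level key is the mapping type name.
    """
    if len(typed_mappings) != 1:
        raise ValueError("expected exactly one mapping type per index")
    # Unpack the only (type name, body) pair; the type name is simply dropped.
    [(type_name, body)] = typed_mappings.items()
    return copy.deepcopy(body)


# v6 style: the properties are nested under the type name (_doc here).
v6_mappings = {
    "_doc": {
        "properties": {
            "user": {"type": "keyword"},
            "message": {"type": "text"},
        }
    }
}

# v7 style: the same properties, directly under "mappings".
v7_mappings = strip_mapping_type(v6_mappings)
print(v7_mappings)
```

The type name never appears in the result, so it makes no difference to the migration whether the type was called _doc, dog, or cat.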
I believe these two use cases are not using the _doc terminology for the same purpose:
The keyword _doc for sorting is new in Elasticsearch 2 and is a replacement for the old scan and scroll way to efficiently paginate deep into the results of a query. There is no actual _doc field in the documents.
The _doc syntax for use in the _source portion of a search (or get, update, etc.) request was not implemented as shown at the beginning of that git discussion; the fielddata_fields field is used instead. It has nothing to do with the usage of _doc in sorting.
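For the sorting use case, a search body that sorts on _doc could look roughly like the sketch below. The helper scroll_search_body and the page size are assumptions for illustration; the body would typically be sent to a scrolling search such as POST /twitter/_search?scroll=1m.

```python
import json


def scroll_search_body(query: dict, page_size: int = 1000) -> dict:
    """Build a search body that returns hits in index order.

    Hypothetical helper: "_doc" here is a sort keyword, not a document field.
    """
    return {
        "size": page_size,
        "query": query,
        "sort": ["_doc"],  # index order, the cheapest order for deep pagination
    }


body = scroll_search_body({"match_all": {}})
print(json.dumps(body, indent=2))
```

Nothing in the documents has to be called _doc for this to work; the keyword only tells Elasticsearch not to spend effort on ordering the hits.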
In the scripting documentation you'll find a section about document field data, which is extremely fast to read because it is stored in memory, and which is accessible using a similar doc syntax (that might add to the confusion).
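To show how the scripting doc syntax differs from both usages above, here is a small sketch of a search body with a script field that reads a value through doc['...']. The field name likes, the output name likes_doubled, and the helper doubled_field_body are hypothetical, and the sketch assumes a recent version where Painless is the scripting language.

```python
def doubled_field_body(field: str, output_name: str) -> dict:
    """Build a search body with a script field that reads doc values.

    Hypothetical helper: doc['<field>'].value is the scripting accessor,
    unrelated to the _doc endpoint or the _doc sort keyword.
    """
    return {
        "query": {"match_all": {}},
        "script_fields": {
            output_name: {
                "script": {
                    "lang": "painless",
                    "source": f"doc['{field}'].value * 2",
                }
            }
        },
    }


script_body = doubled_field_body("likes", "likes_doubled")
print(script_body["script_fields"]["likes_doubled"]["script"]["source"])
```

Note that the accessor is doc without the leading underscore, which is a third, separate meaning next to the _doc endpoint and the _doc sort keyword.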