Allow access to stored term frequencies for searchable fields
Term frequencies are maintained for searchable fields in order to compute things like TF/IDF quickly. It should be possible to access those term frequencies for additional analysis.
Given the current API surface area of Azure Search, how would you actually use these term frequencies? Said another way, what other capabilities would you require in the API in order for term frequencies to actually be useful?
Thank you for your feedback. While it is unlikely we’ll address this suggestion in the near future, we’ll reassess based on the number of votes it receives.
Azure Search Product Team
Xiaolu Lu commented
Personally, I will also vote for this feature, and for access to the other statistics stored in the inverted index, such as IDF and document length.
It would be great if Azure Search provided access to these statistics, as it would give more flexibility to developers who want to build complex models, and it would make relevance tuning and debugging easier. Examples include BM25F (which is known to be effective as a field-based relevance model) and pseudo-relevance feedback (we won't have a lot of user data at the beginning).
Or, at the very least, provide some support for developing customized retrieval models.
Patrick Cox commented
Personally, I am attempting to develop a learning-to-rank service, which would require that I calculate numerous metrics such as different forms of TF/IDF, BM25, term entropy, etc.
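To make the request concrete, here is a rough sketch of how such features could be computed if the raw statistics were exposed. Nothing here is an Azure Search API: the function names and inputs (`tf`, `doc_len`, `avg_doc_len`, `doc_count`, `doc_freq`) are hypothetical stand-ins for values the index would have to return, and the BM25 formula is the common textbook variant with default `k1` and `b`, not necessarily what the service uses internally.

```python
import math

def tf_idf(tf, doc_count, doc_freq):
    """Classic TF-IDF: raw term frequency times log inverse document frequency."""
    return tf * math.log(doc_count / doc_freq)

def bm25_idf(doc_count, doc_freq):
    """Smoothed, non-negative IDF commonly paired with BM25."""
    return math.log(1 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))

def bm25_term_score(tf, doc_len, avg_doc_len, doc_count, doc_freq, k1=1.2, b=0.75):
    """BM25 contribution of one query term to one document's score."""
    # Length normalization: longer-than-average documents are penalized.
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return bm25_idf(doc_count, doc_freq) * tf * (k1 + 1) / (tf + norm)

# Hypothetical statistics for one term in one searchable field.
features = {
    "tf_idf": tf_idf(tf=3, doc_count=1000, doc_freq=10),
    "bm25": bm25_term_score(tf=3, doc_len=120, avg_doc_len=100,
                            doc_count=1000, doc_freq=10),
}
```

Every input above is something the inverted index already has to track in order to score documents, which is why exposing it would be enough to build these features client-side.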