Azure Search

Azure Search is a search-as-a-service solution that allows developers to incorporate a sophisticated search experience into web and mobile applications without having to worry about the complexities of full-text search and without having to deploy, maintain, or manage any infrastructure.

  1. OCR Cognitive Skill for both printed and handwritten text

    When parsing documents and images through the OCR cognitive skill, the 'handwritten' text extraction algorithm fails on printed documents and vice versa. This obviously isn't a bug, but it is an issue when indexing data dumps of both document types. It seems like a fix might be to have a small binary classifier model which can infer which model is appropriate for each document.
    An alternative might be an easy method of flagging documents as handwritten or printed to handle them with the appropriate model.

    10 votes
    0 comments  ·  Indexing

    Thank you for your feedback. While it is unlikely we’ll address this suggestion in the near future, we’ll reassess based on the number of votes it receives.

    While we’re currently not planning on solving this out of the box, we are exploring a new cognitive service for custom document classification that you could use to build your own classifier and then wire in as a cognitive skill in your search pipeline. Feel free to reach out to us if you’re interested in exploring this further.

    Thanks,
    Elad
    Azure Search Product Team

  2. Expose Content as searchable field even in structured (JSON) data

    If I want to index JSON files, I can either run a full-text search using parsingMode TEXT or index individual fields using fieldMappings.

    Why not make both available? That way a user can get a fast search using an indexed field (like a product number), or a slower one using the file content (like a product description).

    8 votes
    0 comments  ·  Indexing
  3. Access to queue of Azure indexers waiting to run

    It would be nice for the SDK to expose the queue of Azure indexers waiting to run, so that you can tell which ones will run next.

    7 votes
    0 comments  ·  Indexing
  4. Ability to iterate over all documents in an index

    Other search engines have the ability to examine the contents of an index by simply iterating over the document collection. This is extremely useful for maintenance operations like reconciling the contents of the index with records in a database. It is quite easy for records that are no longer relevant to be left behind in the index. The only way to accomplish this now is by searching and paging through the results. My current index has over 100,000 entries, which is the limit of the Skip directive. The recommended workaround is to filter the data on another field, however this…

    7 votes
    0 comments  ·  Indexing
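
    The filter-based workaround mentioned in this idea can be sketched as follows. This is only an illustration of the paging pattern, not an Azure Search SDK call: `fetch_page` is a hypothetical callable standing in for a search request that filters on a sortable field and orders by it, and `seq` is a hypothetical numeric field assumed to be unique and non-negative. Because each request starts from the last value seen instead of a growing skip offset, the 100,000 limit on Skip never comes into play.

```python
from typing import Callable, Dict, Iterator, List

Document = Dict[str, int]


def iterate_all(
    fetch_page: Callable[[int, int], List[Document]],
    page_size: int = 1000,
) -> Iterator[Document]:
    """Yield every document by asking for the page that follows the last seen value."""
    last_seen = -1  # assumption: the hypothetical "seq" field is never negative
    while True:
        page = fetch_page(last_seen, page_size)
        if not page:
            return
        yield from page
        last_seen = page[-1]["seq"]


# In-memory stand-in for the search service, used only to make the sketch runnable.
INDEX = [{"seq": n} for n in range(2500)]


def fake_fetch_page(after: int, top: int) -> List[Document]:
    matching = [d for d in INDEX if d["seq"] > after]  # plays the role of the filter
    matching.sort(key=lambda d: d["seq"])              # plays the role of the ordering
    return matching[:top]                              # plays the role of the page size


docs = list(iterate_all(fake_fetch_page))
print(len(docs))
```

    In a real service the stand-in would be replaced by a query against the index; whether a given field can be filtered and sorted this way depends on how the index is defined.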
  5. Add a skill to merge arrays

    Create a skill that can merge arrays. This would be useful when you extract key phrases from a set of pages and then need to merge them without duplicates.

    7 votes
    0 comments  ·  Indexing
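
    Until such a skill exists, the merge itself is simple enough to do in your own code, for example inside a custom skill. A minimal sketch, assuming the key phrases arrive as one list per page; the function name `merge_unique` is made up for this example:

```python
from typing import Hashable, Iterable, List, TypeVar

T = TypeVar("T", bound=Hashable)


def merge_unique(arrays: Iterable[Iterable[T]]) -> List[T]:
    """Flatten several arrays into one, keeping the first occurrence of each item."""
    seen = set()
    merged: List[T] = []
    for array in arrays:
        for item in array:
            if item not in seen:
                seen.add(item)
                merged.append(item)
    return merged


# One list of key phrases per page (sample data invented for illustration).
pages = [["azure search", "indexer"], ["indexer", "skillset"], ["azure search"]]
print(merge_unique(pages))  # ['azure search', 'indexer', 'skillset']
```

    The comparison is exact, so "Indexer" and "indexer" would count as different phrases; normalize the case first if that matters.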
  6. API to check availability of documents for search

    Currently there is a short delay before documents become available for search after they are created or updated.

    For better consistency, a blocking call that returns only after all of the document updates, creates, etc. have been indexed and are ready for search would be great.

    Alternatively, provide a status API for the set of document updates, creates, etc. that were sent, to tell whether they have all been indexed and are ready for search.

    7 votes
    1 comment  ·  Indexing
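
    In the absence of such an API, a client can approximate the blocking call by polling. The sketch below is not part of any Azure Search SDK: `is_searchable` is a hypothetical callable you would implement yourself (for example, by querying for the document and checking that it reflects the update), and the timeout and interval values are arbitrary.

```python
import time
from typing import Callable, Iterable, Set


def wait_until_searchable(
    keys: Iterable[str],
    is_searchable: Callable[[str], bool],
    timeout_s: float = 10.0,
    interval_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Set[str]:
    """Poll until every key is searchable or the timeout expires.

    Returns the keys that are still not searchable (an empty set means success).
    """
    pending = set(keys)
    deadline = time.monotonic() + timeout_s
    while pending:
        # Keep only the keys that the check still reports as missing.
        pending = {key for key in pending if not is_searchable(key)}
        if not pending or time.monotonic() >= deadline:
            break
        sleep(interval_s)
    return pending


# Stand-in check that starts succeeding after a few calls (illustration only).
calls = {"n": 0}


def fake_is_searchable(key: str) -> bool:
    calls["n"] += 1
    return calls["n"] > 3


still_missing = wait_until_searchable(["a", "b"], fake_is_searchable, sleep=lambda s: None)
print(still_missing)  # set()
```

    Polling adds query load to the service, so keep the interval generous when waiting on large batches.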
  7. Add a hard delete policy to all indexers

    Can we add a hard delete policy to all indexers, in the same way as the soft delete policy?
    In some scenarios, customers want source changes reflected in the index immediately. We know this can be done with a push-based approach, but it is much easier and simpler to keep a single path for data updates.

    Cosmos:
    https://docs.microsoft.com/en-us/azure/search/search-howto-index-cosmosdb#indexing-deleted-documents

    Azure Table:
    https://docs.microsoft.com/en-us/azure/search/search-howto-indexing-azure-tables#incremental-indexing-and-deletion-detection

    6 votes
    1 comment  ·  Indexing
  8. Expose write metrics not just read metrics through the portal

    At the moment the portal only shows read metrics (QPS, Latency, Throttled), but this is only half the picture of what can be impacting an Azure Search service.

    Write metrics (e.g. index updates per second) are just as important, especially for high-volume re-indexing operations.

    Getting anything like this today requires an export of logs, which produces JSON that is hard to use for analysis (CSV would be more useful), and/or a Power BI account.

    Neither is particularly useful for this scenario, where we just need a quick review of what's actually going on with our index.

    6 votes
    0 comments  ·  Indexing
  9. Index blob container metadata

    We're trying to set up an indexer for groups of related blobs (e.g. multiple formats of a single image file) using containers. Unfortunately, it seems like the blob indexer doesn't extract metadata from containers - only the blobs themselves. As a workaround, we can duplicate metadata across all blobs in a container, but it would be nice if the indexer supported indexing container metadata directly.

    6 votes
    0 comments  ·  Indexing
  10. Indexer (crawler) on SharePoint document library or any external data source

    A customer has a lot of documents in a SharePoint document library. Rather than pushing data to an index, we would like to crawl all the documents from the external data source (here, SharePoint) and create an index in Azure Search so that we can leverage cognitive services.

    We understand that SharePoint integration with Azure Search is on your roadmap, but it would be great if an indexer (crawler) could be used to index external data sources. Many clients are approaching us with this use case.

    Thanks for your time.

    6 votes
    0 comments  ·  Indexing
  11. Use an XSD schema to populate the index fields

    We have large schemas that define individual aspects of healthcare data. We would like to use these schemas to define the fields rather than entering most of them manually. For this to work, we obviously need support for crawling XML data (similar to what you have for JSON). XML provides a good structure and has many industry-standard schemas that we can leverage.

    5 votes
    0 comments  ·  Indexing
  12. Index Dynamics 365

    Add Dynamics 365 Customer Engagement to index options. Today, search is costly for smaller customers as they must create a data warehouse for search.

    4 votes
    0 comments  ·  Indexing

    Thank you for your feedback. While it is unlikely we’ll address this suggestion in the near future, we’ll reassess based on the number of votes it receives.

    Also, you might want to take a look at Azure Logic Apps to ingest data from this source and then use an HTTP connector to push data into Azure Search.

    Thanks,
    Liam
    Azure Search Product Team

  13. On bulk merge/upload, response should be 200 OK if document was successfully re-indexed

    The issue is documented here:
    https://stackoverflow.com/questions/50746146/does-mergeorupload-action-on-azure-search-index-takes-some-re-indexing-time-even

    Azure Search should respond with a success message once the merged/uploaded data has been successfully re-indexed, as opposed to the current behavior, which seems to indicate "Merge/Upload request is taken up and will be indexed after an unknown time, please keep looking".

    3 votes
    0 comments  ·  Indexing

    Thank you for your feedback. We understand the value this would bring, but there is concern about keeping an update request open the entire time it takes to index everything contained within it. What we may consider here is an API to request the status of the indexing queue to determine what has/hasn’t been processed. We’d like to know if this would be useful to you.

    While it is unlikely we’ll address this suggestion in the near future, we’ll reassess based on the number of votes it receives.

    Thanks,
    Mike
    Azure Search Product Team

  14. Detailed Request Monitoring

    We use Sitecore CMS, which now supports Azure Search indexes. During our implementation we discovered that the re-index process fails. Using Fiddler, we found the Azure Search REST requests that returned 207 (Multi-Status) responses and then reviewed the response bodies to determine the errors.

    Beyond Fiddler running as a proxy and capturing the HTTPS traffic from Sitecore - we found no other way to get insight into the requests/responses from Azure Search service.

    We enabled Operation Log monitoring but the Operation Log monitoring does not provide details about request body and response body.

    We looked into Search Traffic Analytics,…

    3 votes
    0 comments  ·  Indexing
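
    If your own code sends the indexing requests, the per-document errors in a 207 response can be pulled out without a proxy. A minimal sketch, assuming the response body has a `value` array whose items carry `key`, `status`, `errorMessage`, and `statusCode`; the sample payload below is invented for illustration, and `failed_items` is a made-up helper name:

```python
import json
from typing import Dict, List


def failed_items(response_body: str) -> List[Dict[str, object]]:
    """Return the per-document results whose status flag is false."""
    payload = json.loads(response_body)
    return [
        {
            "key": item.get("key"),
            "statusCode": item.get("statusCode"),
            "errorMessage": item.get("errorMessage"),
        }
        for item in payload.get("value", [])
        if not item.get("status", False)
    ]


# Invented sample body: one document succeeded, one failed.
sample = json.dumps(
    {
        "value": [
            {"key": "1", "status": True, "errorMessage": None, "statusCode": 200},
            {"key": "2", "status": False, "errorMessage": "Document not found.", "statusCode": 404},
        ]
    }
)

for failure in failed_items(sample):
    print(failure["key"], failure["statusCode"], failure["errorMessage"])
```

    This does not help when a third-party product such as Sitecore issues the requests, which is the gap this idea describes.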
  15. Ability to specify a skillset to be used when adding documents to Search Index via DocumentsOperations SDK and REST API

    We need to be able to specify the skillset to be used when adding documents to an Azure Search index using the SDK (or even the REST API: https://docs.microsoft.com/en-us/rest/api/searchservice/addupdate-or-delete-documents). Currently this is only possible with an indexer. Please refer to this Stack Overflow question for a scenario in which this would be useful:
    https://stackoverflow.com/questions/54529101/using-a-skillset-when-adding-documents-to-azure-search-index/54539852?noredirect=1#comment95962064_54539852

    3 votes
    0 comments  ·  Indexing
  16. Inverted Index

    After importing data, I would like to check the words registered in the indexes. However, Azure Search doesn't provide such a feature.

    3 votes
    2 comments  ·  Indexing
  17. Ensure data properties sent as input to custom skill web APIs are valid JSON

    Ensure data properties sent as input to custom skill web APIs are valid JSON.
    Some of the input JSON payloads received as per https://docs.microsoft.com/en-us/azure/search/cognitive-search-custom-skill-web-api have had unescaped double quotes, or special characters that aren't converted to Unicode.

    3 votes
    0 comments  ·  Indexing
  18. Azure Data Factory indexer data source

    Currently Azure Search is supported as one of the destinations in ADF pipelines. It would be awesome if Azure Search indexes could also act as data sources. That way, migrating from one Azure Search index to another would be easy.

    3 votes
    0 comments  ·  Indexing
  19. Reindex completely all documents

    Here is the reason for this request. Suppose there is a legacy application with a SQL view as the source for search, or an app that uses Cosmos DB. Soft delete is the only option. It could be very hard to change the application code everywhere so that it removes documents from the search index, so removed documents can stay in the index.
    I know from the feedback portal that hard delete is not planned, but it would not be needed if an indexer could be allowed to reindex all documents completely. Just a checkbox setting.

    3 votes
    1 comment  ·  Indexing
  20. Allow limiting blob documents to be indexed based on a specific metadata value

    We only want to index a subset of documents in our blob container and in order to do so now, we have to have two blob containers and manage them. Similar to how you can limit document types to be indexed or not, the ability to restrict the scope of Blob objects based on a metadata value would help reduce our operating expenses and document management overhead. We could add a new Metadata name called "AzureSearch" and if set to "true", would be picked up by the indexer. Removing it from the index would simply require changing that value and…

    1 vote
    0 comments  ·  Indexing