Add an API for text extraction from a file (or a string/stream with file content)
The blob indexer (https://azure.microsoft.com/en-us/documentation/articles/search-howto-indexing-azure-blob-storage/) extracts text from a file (document).
Instead of going through Azure Blob storage, which means extra implementation work, usage charges, and a delay while the document is processed there, the client could extract the text from a document directly and then send the content with mergeOrUpload.
We know when a file's content has changed (or a new file has been created) and would like to handle that ourselves.
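As a rough sketch of that flow, assuming a client-side extraction step existed: the Python below builds a mergeOrUpload request for the documents endpoint. The `extract_text` function is a hypothetical stand-in for the requested API (here it only decodes bytes as UTF-8), and the service name, index name, API version, and the `id`/`content` field names are placeholders that would have to match a real service and index schema.

```python
import json
import urllib.request

# Assumption: any API version that supports the documents endpoint.
API_VERSION = "2016-09-01"


def extract_text(file_bytes: bytes) -> str:
    """Hypothetical stand-in for the requested extraction API.

    A real implementation would crack PDF, Office, and similar formats;
    this placeholder only decodes the bytes as UTF-8 text.
    """
    return file_bytes.decode("utf-8", errors="replace")


def build_merge_or_upload_request(service, index, api_key, doc_id, content):
    """Build (but do not send) a mergeOrUpload request for one document."""
    url = (
        f"https://{service}.search.windows.net/indexes/{index}"
        f"/docs/index?api-version={API_VERSION}"
    )
    body = {
        "value": [
            {
                "@search.action": "mergeOrUpload",
                "id": doc_id,        # assumed key field name
                "content": content,  # assumed searchable field name
            }
        ]
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
```

A caller that detects a changed file would run `extract_text` on its bytes, pass the result to `build_merge_or_upload_request`, and send the request with `urllib.request.urlopen`, so no blob upload or indexer schedule is involved.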
Thank you for your feedback. We’re considering this for a future release of Azure Search. Essentially we need a push-API before document cracking and enrichment occurs.
Azure Search Product Team
This would also be a solution to my problem: https://feedback.azure.com/forums/263029-azure-search/suggestions/20233354-make-the-blob-indexer-faster
If this were implemented, we could scale the indexing process using Azure Batch, Hadoop, or whatever else. Please implement something, because this is seriously affecting our progress.