Using CNTK for RNN is great but making the models operational is a pain. It would be great to be able to use the CNTK libraries within the Azure ML service.
3 votes
I'd like to be able to see the performance (time taken to execute) of each Module in the Experiment, especially when running as a Web Service. I'd like to know which steps need the most tuning. And I'd like to be able to compare the performance of, for example, a Linear Regression Model vs. a Decision Forest Model.
1 vote
At the end of processing, keep the starting datetime and the time elapsed in the upper right corner of MLS
After an experiment has ended, it would be useful to see the elapsed processing time and the starting datetime in the upper right corner (where "Running" appears) of the main window.
Thank you.
1 vote
Currently, there is an option to enable logging, which can be turned on in the web service configuration page. Unfortunately, those logs are of little use to me.
What I would like is for all the logs from Execute R Script modules to be available in Azure Storage after each run with logging turned on: exactly the same ones I can access after running an experiment, including the console text.
Without proper logs I can't easily debug the script. You mentioned over a year ago that you were going to include debugging, which would also be very helpful.
3 votes
It would be really good to have the option to write T-SQL scripts in the Apply SQL Transformation module, since we are in a Microsoft environment.
Thank you.
21 votes
It'd be nice if I could restore the last experiment I deleted by mistake.
Thank you.
20 votes
The confirm button of "select columns" disappears, so the column selection cannot be saved in Chrome version 55.0.2883.95 (64-bit). When I open the experiments in Firefox, the confirm button shows correctly.
1 vote
How can I visualize more than the default 100 values in the Result dataset?
Thanks in advance for your answer.
Best Regards
3 votes
I need help deploying my Web Service.
I don't understand why I get only one numerical output value even though my model result returns 1000 numerical values.
Could you help me solve my problem?
Thanks in advance.
Best Regards
1 vote
"Save trained model" as a new version in a copied experiment should, by default, save onto the copied model, not the original model
In a copied experiment, when saving a model as a new version of an existing trained model in the "Save trained model" window, the trained model from the original experiment is selected by default as the target existing model.
However, this default selection of the existing model from the "original" experiment can cause an unintentional overwrite of the original experiment's trained model. So for any copied experiment, the default selection should be the existing model of the copied experiment.
3 votes
What are we supposed to do with this error message: Saving intermediate output failed. Unable to generate schema and visualization.
I tried to save the Vocabulary from an "Extract N-Gram Features from Text", but got this error message:-
"Saving intermediate output failed. Unable to generate schema and visualization."
What am I supposed to do with that message?
2 votes
Please enable collapsible grouping of modules (perhaps similar to SSIS?)
This functionality would enable the visual grouping of modules that belong together. When the user chooses to group the selected modules (via right click, for example), the items are put into a rectangle that can be collapsed to hide all of them.
This should have no impact on how the modules work.
The reason for this idea is to make the screen easier to navigate when working with large experiments. It could also allow easy copy/pasting of modules that belong together.
10 votes
I'm using the free workspace and I'm hitting the 10 GB storage limit. How can I free up workspace storage to get back under the limit?
2 votes
Many users have models set up to be retrained automatically when new data is available or at set times. It would make sense to calculate a Matthews correlation coefficient (MCC) on the holdout set for the current model and then, when a retrain occurs, calculate it for the retrained model as well, with the option (or automatically, if desired) to update or discard the model depending on whether its MCC score is higher or lower than that of the current model. If there were multiple trained models, the MCC would be calculated for all of them and the user given the option to pick one or have it selected automatically.
This would head towards more automated machine learning which would require less input and manual training of models.
4 votes
Having seen the output from Evaluate Model in the experiment tab, I initially thought that the better-performing model of the two algorithms would be the output of that block.
Seeing that this is not the case, I feel it would be a good idea to implement. This is especially true when the model is applied to live data through the web service, as there is no guarantee that the chosen algorithm will remain the best fit over time.
Having to check this manually and repeatedly, and then edit the model to match any changes, would get cumbersome very quickly, I imagine.
3 votes
Today, after submitting my 3rd entry to the competition named "Loan Granting Binary Classification December 2016", I noticed an error in the message that says when I can submit a new entry.
The text is placed above the API key. It says: "You cannot submit more than 3 time(s) each day. You can submit again at Invalid Date."
So there's no exact date.
1 vote
The "Remove Duplicate Rows" module currently does not seem to consider missing values to be equal. While this might sometimes be desirable (analogous to two nulls not being considered equal), there are many cases where it would be preferable to consider missing values to be equal during duplicate removal. An option that allows this would help avoid less efficient workarounds.
3 votes
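As an illustration of the requested behaviour only (not of how the module works internally), here is a small sketch that removes duplicates while treating every missing value as equal to every other. It assumes rows are plain dictionaries with hashable values and that missing values show up as `None` or NaN; `remove_duplicate_rows` and `key_columns` are hypothetical names.

```python
import math

# Single sentinel that stands in for any missing value during comparison.
_MISSING = object()


def _normalize(value):
    """Map None and NaN to one shared sentinel so they compare equal."""
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return _MISSING
    return value


def remove_duplicate_rows(rows, key_columns):
    """Keep the first row for each distinct combination of key column values."""
    seen = set()
    kept = []
    for row in rows:
        key = tuple(_normalize(row[col]) for col in key_columns)
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


if __name__ == "__main__":
    rows = [
        {"id": 1, "city": None},
        {"id": 1, "city": float("nan")},  # duplicate once missing values match
        {"id": 2, "city": None},
    ]
    print(remove_duplicate_rows(rows, ["id", "city"]))
```

Without the normalization step, two NaN values would not compare equal, which mirrors the behaviour described in the post.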
I have run an N-Gram text processing problem against a Multiclass Logistic Regression model. The "Evaluate Model" module gives me the following results:
Overall accuracy 0.004621
Average accuracy 0.995271
What does that mean? In short, can you define the following terms: Overall accuracy, Average accuracy, Micro-averaged precision, Macro-averaged precision, Micro-averaged recall, Macro-averaged recall
I can't find a definition of these anywhere on the internet!
4 votes
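For reference, here is a sketch of how these terms are commonly defined for multiclass problems, using one-vs-rest counts per class. This assumes the usual textbook definitions; it has not been checked against what the "Evaluate Model" module actually computes, and `multiclass_metrics` is a hypothetical helper name.

```python
def multiclass_metrics(y_true, y_pred):
    """Compute common multiclass summary metrics from true and predicted labels."""
    labels = sorted(set(y_true) | set(y_pred))
    n = len(y_true)
    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed

    def ratio(num, den):
        # Classes with an empty denominator contribute 0.0 (an assumption).
        return num / den if den else 0.0

    k = len(labels)
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())
    return {
        # Fraction of rows whose predicted label is exactly right.
        "overall_accuracy": total_tp / n,
        # Mean over classes of the one-vs-rest accuracy (TP + TN) / N.
        "average_accuracy": sum((n - fp[c] - fn[c]) / n for c in labels) / k,
        # Micro: pool the counts over all classes, then take the ratio.
        "micro_precision": ratio(total_tp, total_tp + total_fp),
        "micro_recall": ratio(total_tp, total_tp + total_fn),
        # Macro: take the ratio per class, then average the ratios.
        "macro_precision": sum(ratio(tp[c], tp[c] + fp[c]) for c in labels) / k,
        "macro_recall": sum(ratio(tp[c], tp[c] + fn[c]) for c in labels) / k,
    }


if __name__ == "__main__":
    print(multiclass_metrics(["a", "a", "b", "c"], ["a", "b", "b", "a"]))
```

Under these definitions a very low overall accuracy next to a very high average accuracy is possible when there are many classes: for each class, almost every row is a true negative, so the per-class accuracy stays close to 1 even if few rows get the right label.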
Right now you can access files on Azure Data Lake Store through a Hive cluster, which is both expensive and hard to set up. Azure ML should have direct access to Azure Data Lake Store files.
27 votes
I need to do a tagging project. The idea is quite similar to text classification, except that each text can be assigned to multiple classes (one text may have more than two tags). Does Microsoft Azure offer any tool to do that?
1 vote