Add Excel as source
Add excel file as source.
Data factory now supports Excel format in both copy activity and mapping data flow. Excel format is supported on all the file-based connectors as source. Learn more from https://docs.microsoft.com/en-us/azure/data-factory/format-excel.
Kalpesh Patel commented
While using excel as source in mapping data flow, it copy cell formula rather then cell value. Any idea when this will resolve?
Scott C commented
Been having issues trying to load an excel file to a staging DB SQL table and then moving the excel file to an archive directory. Unfortunately excel destination isn't an option for File System as of yet but I managed to get a work around in place using the Binary option.
On the copy data task, set the source dataset to be a binary dataset and point to your excel file directory/file. The Sink also needs to be binary and for me it was just an archive directory again using a binary file system linked service even though i use my excel file to set this up.
So i have a copy data task which loads my excel file from FTP to a DB table. I then have this second copy data task which uses a file system binary task (still select excel file to setup) to pick up the excel file from FTP and a binary archive destination (sink) to move the excel file to archive.
Hi Excel Data Set is working fine in the Copy Activity but where as in the Mapping data flow it is throwing an error in the Data Preview as Error:
DF-SYS-0 Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 7, 10.139.64.15, executor 0): com.databricks.sql.io.FileReadException: Error while reading file
Can you please look into this and help me
Vidya Salimath commented
I'm trying to create a Excel File Format for the Azure Blob storage and not able to see the Excel type.Although I created a type of DelimitedText format and changed the type to Excel then tried to map source in the Mapping data flow,seems still it's not supported.Kindly let me know when this is available so that I can plan the things.
Morten Høybye Frederiksen commented
Please add sheet identification by index (in addition to name) - sometimes files are delivered from systems with auto generated sheet names indicating e.g. version or revision, which makes it impossible to loop over multiple files automatically.
Excel as a source is now present in ADF v2 but not as a sink yet
Joseph Reisinger commented
Hi Do we have a rough date when this will be available?
Awesome! when can we try this out?
Is there any update on Excel connector in ADF?
Please let me know when we can expect in private/public preview?
Any estimation on how long this feature will take?
Francesco Milano commented
See that's started, great news! Could we expect "Excel as Destination" as well in this implementation? Thanks
David Lean commented
As this is a "planned" feature may I suggest some thoughts on improving on the existing limitations found in SSIS using the Access driver.
1. Offer 2 modes; a. Relational & b. Key/Value pair.
Key Value pair means every cell becomes a row. (Sheet, Row, Column, value ) with an option that cells with empty values do/don't produce rows. (or maybe can specify which sheet you want. )
This is useful when processing financial models. which have a few tabular sections & input sections & other bits & pieces spread around the sheet. There are millions of these in the world & currently the only way to extract the data is custom VSTO/VBS script. Ideally you could limit the output to a rectangle of "Used" cells & not the 1 million possible rows.
This format is also handy for input to PowerBI
Relational currently suffers from its attempt to guess the datatype based on the first few rows. Inevitably alpha values in row 200,001 get converted to NULL. or truncated to 256 chars.
Have the ability override & specify the exact schema if desired.
Michael Daynes commented
As Excel is the most common data source in organisations, I am surprised it wasn't the first data source to be added to Data Factory.
Rachna Verma commented
Hi Team, Any updates on this. By when we can expect this to be rolled out ? Excel is one of the very commonly used data source and should have been supported as a source long back.
Andrew David Murphy commented
This needs to be done as much as we know that Excel is the source of so many data issues - its always going to be a source that is required.
We have the same problem: Excel as input format with multiple tabs.
The suggested workaround by MaFishHH ("use wrangling data flow") has the same limitation, only csv is supported as data set format.
Don't forget to add Excel as a TARGET too !
Hey guys, you could check if the ADF feature "data flows" --> "wrangling data flow" is fine for you. This is a PowerQuery engine, this should also allow reading Excel files. You can add this data flow into your pipeline by the "data flow" activity.
Ronny Hagen commented
Any news on this topic?