How can we improve Azure SQL Database?

Implement Python bindings for azure-sqldb-spark connector

The azure-sqldb-spark connector (https://github.com/Azure/azure-sqldb-spark) supports Spark applications written in Scala, but does not currently provide Python bindings.

Python-based Spark applications can still connect to MSSQL/Azure SQL databases over a JDBC connection, but this approach does not support bulk inserts and is therefore quite slow when persisting large Spark DataFrames to MSSQL.
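For reference, the JDBC route described above might look roughly like the following from PySpark. This is only a sketch: the helper names `build_jdbc_options` and `write_dataframe`, the server, database, table, and credential values, and the chosen batch size are all hypothetical, and it assumes the Microsoft JDBC driver for SQL Server is available on the Spark classpath.

```python
def build_jdbc_options(server, database, table, user, password,
                       port=1433, batch_size=10000):
    """Build the option map for Spark's built-in JDBC data source.

    All argument values are placeholders supplied by the caller.
    """
    return {
        # Standard SQL Server JDBC URL form.
        "url": f"jdbc:sqlserver://{server}:{port};databaseName={database}",
        # Driver class shipped in Microsoft's JDBC driver for SQL Server.
        "driver": "com.microsoft.sqlserver.jdbc.SQLServerDriver",
        "dbtable": table,
        "user": user,
        "password": password,
        # Spark JDBC writer option: rows sent per round trip.
        "batchsize": str(batch_size),
    }


def write_dataframe(df, options, mode="append"):
    """Write a Spark DataFrame through the generic JDBC writer."""
    df.write.format("jdbc").options(**options).mode(mode).save()
```

A call such as `write_dataframe(df, build_jdbc_options("myserver.database.windows.net", "mydb", "dbo.events", user, password))` would then append the DataFrame to the table. Raising `batchsize` may reduce round trips, but as far as I understand the rows still go through ordinary batched INSERT statements rather than the bulk copy path, which is the gap this request is about.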

It would be useful if PySpark applications could take advantage of the bulk insert capabilities provided by the azure-sqldb-spark Scala package. From the GitHub repo README: "Comparing to the built-in Spark connector, this connector provides the ability to bulk insert data into SQL databases. It can outperform row by row insertion with 10x to 20x faster performance."

129 votes
Matthew Kravetz shared this idea

0 comments

