Provide load balancing option across regions at the server level
I have a read replica set up in a different region of the same country. Since I am paying full price for that region, I want to be able to use it to carry some of the query load. I am using the Node.js SDK and, although the preferred locations and endpoint discovery are set as documented, all queries are sent only to the primary region. Regardless, it should not be left up to client-side applications to manage load balancing; that might be OK if you only have one application. Also, we might want to use applications that (shock horror) are not coded by ourselves, Power BI for example.
At a minimum, a round-robin option should be available but preferably something a little more intelligent with queries routed to the least utilised region.
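To make the two options concrete, here is a minimal sketch of what I mean by "round-robin" versus "least utilised". It is not how Cosmos DB routes requests today; the region names, the utilisation figures and both helper functions are made up for illustration.

```python
from itertools import cycle


def round_robin(regions):
    """Return an endless iterator that rotates through the regions, ignoring load."""
    return cycle(regions)


def least_utilised(utilisation):
    """Return the region whose utilisation (fraction of provisioned RU/s in use) is lowest."""
    return min(utilisation, key=utilisation.get)


# Hypothetical region names for a primary and its read replica.
regions = ["primary-region", "replica-region"]

# Round-robin: requests simply alternate between the regions.
rotation = round_robin(regions)
first_four = [next(rotation) for _ in range(4)]

# Least utilised: hypothetical utilisation figures, the idle replica wins.
utilisation = {"primary-region": 0.82, "replica-region": 0.15}
choice = least_utilised(utilisation)
```

Round-robin needs no knowledge of load at all, which is why I see it as the minimum; the least-utilised variant needs per-region utilisation figures, which only the service is really in a position to know.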
With regard to secondary regions having different RU/s, we do not have this on our roadmap at this time. There is another User Voice item with this suggestion; please feel free to vote that item up.
For scenario 1: same answer.
For scenario 2: typically, just putting the front end closer to users is not something we see help with latency for applications that are backed by a database, as the app still needs to call the database, which is still separated by some distance.
With regard to knowing which region to point to, I would suggest conducting tests to see which regions provide the best latency and listing those, in order, as the preferred regions for each regional deployment.
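As a rough sketch of that kind of test, the snippet below orders regions by median measured latency from one deployment. The latency samples are invented, the function name is hypothetical, and how the samples are collected (for example, by timing a small read against each regional endpoint) is left out.

```python
from statistics import median


def rank_regions_by_latency(samples_ms):
    """Order regions from lowest to highest median latency (milliseconds)."""
    return sorted(samples_ms, key=lambda region: median(samples_ms[region]))


# Invented latency samples, as measured from a single application deployment.
samples_ms = {
    "East US": [182.0, 175.5, 190.2],
    "West US": [121.3, 118.9, 240.0],
    "South Central US": [150.1, 149.7, 152.4],
}

# The resulting order is a candidate preferred-regions list for that deployment.
preferred = rank_regions_by_latency(samples_ms)
```

The median is used rather than the mean so that a single slow sample (the 240.0 above) does not push an otherwise fast region down the list. The measurement would need repeating from each deployment, and again over time, since the ordering can differ per location.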
Thank you both for your suggestions and comments.
East US and West US both have Cosmos DB and my application deployed. In this case, it seems sensible that the application deployment to East US should have the following Preferred Location list: East US, West US. The deployment to West US would have the reverse. But what if 95% of my user base routes to the West US application via Traffic Manager, based on closest region? This gives minimal latency from user to application (yay, Traffic Manager win!), but then causes load distribution problems on the Cosmos DB, and my RUs are not efficiently utilized because East US will need to have enough provisioned to serve West US (side note: there is another request for provisioning region-specific RUs that could also mitigate this problem).
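To put rough numbers on that, here is a small sketch. It assumes the same RU/s must be provisioned in every region and that each deployment reads only from its first preferred location; the 10,000 RU/s total and the function name are made up for illustration.

```python
def regional_utilisation(total_read_rus, traffic_share_pct):
    """Size uniform provisioning for the busiest region and report per-region utilisation.

    traffic_share_pct maps each region to its percentage of the total read load.
    """
    demand = {region: total_read_rus * pct / 100 for region, pct in traffic_share_pct.items()}
    # Assumption: every region gets the same RU/s, so the busiest region sets the level.
    provisioned = max(demand.values())
    utilisation = {region: d / provisioned for region, d in demand.items()}
    return provisioned, utilisation


# Hypothetical total read load of 10,000 RU/s, split 95% / 5% as in the scenario above.
provisioned, utilisation = regional_utilisation(10_000, {"West US": 95, "East US": 5})
```

Under those assumptions both regions are provisioned at 9,500 RU/s, West US runs at full utilisation, and East US uses only about 5% of what is being paid for.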
Let's say I have the following Cosmos regions:
East US, West US, South Central US
And I have my application deployed across these regions:
East US, West US, South East Asia, Japan East, Brazil South, North Europe
Let's say I have good reasons not to replicate the data outside the US, but have to reach a global user base, so there is value in getting the applications closer to the user.
In this case, the deployments in the regions outside the US have to essentially be hard-coded (or at least statically configured) to specify the Preferred Locations of Cosmos DB replicas, but how would I know the best sequence of regions to specify from any given region (without guess and check, and ongoing validation)?
In general, I fully agree that the service should provide a mechanism to decide which replica should serve a read request based on availability, geo proximity, load, etc., without requiring the client to do so. The client should certainly be able to explicitly override this, but if nothing is specified the service should have that capability, rather than the current behavior of "If the PreferredLocations property is not set, all requests will be served from the current write region".
Ian Bennett commented
The replica is set up more for DR purposes, but there is no option to run the secondary at reduced RUs, so that is potentially a lot of wasted resource and money. If I am using something like Power BI Service then I don't get a lot of say in where the application runs from. You are assuming a three-tier architecture, but Cosmos is not just being used for apps and websites. Data analytics applications are often two-tier and more taxing on the database.