AzureCloud-FINAL-kb copy
>> Hello, everyone. Over the next few minutes,
I'll talk about some of the features
that make Azure Cosmos DB a good choice
for globally distributed data storage.
If you're new to Cosmos DB, you might be wondering
why you should choose it for your application
or perhaps even why your organization
or data architect has chosen to use Cosmos DB
over other data storage options
when there are so many to choose from.
[WHAT EXACTLY IS A GLOBALLY DISTRIBUTED DATABASE?]
First of all, what exactly is a globally distributed database?
Well, in a nutshell, globally
distributed databases enable your data
to live in virtually every region
of the world where it's needed.
Now that's important for two reasons.
First, if there's an outage in one region,
you have copies of your data sitting around the world
that your application can fail over to.
Second, your applications can pull data from the region
closest to your users to reduce data access latency.
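As a rough illustration of those two reasons, here is a minimal Python sketch of choosing a read region. The region names, the latency figures, and the `pick_read_region` helper are all hypothetical and are not part of any Cosmos DB SDK; the sketch only shows the idea of reading from the closest region and failing over when that region is down.

```python
from typing import Dict, Optional, Set


def pick_read_region(latency_ms: Dict[str, float], unavailable: Set[str]) -> Optional[str]:
    """Return the lowest-latency region that is not marked unavailable, or None."""
    candidates = {region: ms for region, ms in latency_ms.items() if region not in unavailable}
    if not candidates:
        return None
    return min(candidates, key=candidates.get)


# Hypothetical latencies measured from one user's location (assumed values).
latencies = {"West Europe": 18.0, "East US": 95.0, "Southeast Asia": 210.0}

# Normal case: read from the closest region.
nearest = pick_read_region(latencies, unavailable=set())

# Outage case: the closest region is down, so fail over to the next closest.
after_outage = pick_read_region(latencies, unavailable={"West Europe"})

print(nearest)
print(after_outage)
```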
The question then is if your data is distributed
and you make changes to data in one region,
how do you sync these data changes
to all other instances of your data
located in other regions across the globe?
Well, answering that question turns out
to be a key reason to choose Cosmos DB.
It gives you more control over consistency
than any other distributed database offering today.
When you're dealing with distributed data,
you need to think about how you want your data to sync between locations,
given the tradeoffs between the availability
and staleness of the data,
[AVAILABILITY AND STALENESS, LATENCY]
the latency you're willing to accept during queries,
[COST]
and the cost of throughput to get the data to its destinations.
This is something you need to consider upfront
as you begin to architect your application.
Now historically, distributed database services
have not been very flexible:
there was either no choice at all, or you had to choose
between the two extreme ends
of the continuum of consistency.
It's basically a choice between everything
always being in sync, regardless
of the cost, and everything
potentially being out of sync, so you're never really sure
you're working with the latest data.
Again, these are two extremes
along the continuum of consistency,
and thankfully Cosmos DB gives us more choices.
Azure Cosmos DB offers five consistency models
that trade off to varying degrees
between availability, latency, and throughput.
You select the level that offers
the right balance for your scenario.
Since no one size will fit all
in terms of consistency model,
this makes Cosmos DB a great choice
for distributed data.
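To make the five models a little more concrete, here is a small Python sketch. The level names (strong, bounded staleness, session, consistent prefix, eventual) come from the Azure documentation rather than from this talk, and the `choose_level` helper is a hypothetical, deliberately simplified rule of thumb, not an official recommendation.

```python
from enum import IntEnum


class ConsistencyLevel(IntEnum):
    """The five levels, ordered from the weakest (1) to the strongest (5) guarantee."""
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


def choose_level(must_see_latest_write: bool, must_read_own_writes: bool) -> ConsistencyLevel:
    """Hypothetical rule of thumb: pick the weakest level that meets the stated need."""
    if must_see_latest_write:
        # Every read must reflect the most recent committed write.
        return ConsistencyLevel.STRONG
    if must_read_own_writes:
        # A client only needs to see its own writes within its session.
        return ConsistencyLevel.SESSION
    # Stale reads are acceptable, so take the cheapest option.
    return ConsistencyLevel.EVENTUAL


levels_weakest_first = [level.name for level in sorted(ConsistencyLevel)]
print(levels_weakest_first)
print(choose_level(must_see_latest_write=False, must_read_own_writes=True).name)
```

Weaker levels generally cost less in latency and throughput, which is why the sketch returns the weakest level that still satisfies the requirement.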
We talk about the attributes of these consistency models
at length in our Architect Modules on Cosmos DB later.
So the main takeaway for now as you are getting started
is that depending on your organization's needs,
you can spend less to keep your data
in sync by allowing it to sync up eventually
or spend more to increase the throughput of the data
and keep it fully in sync at all times.
The second reason you might choose Cosmos DB
is that it is fully schema-agnostic.
[SCHEMA AGNOSTIC]
This means that, as a developer, you can iterate on the schema of your application
without worrying about database schema
and/or index management.
It enables you to use key-value, graph,
and document data together in a single service.
[AUTOMATIC INDEXING]
In addition, Cosmos DB automatically indexes all the data
it ingests without requiring you to define
any schema or indexes,
and it serves up blazing fast reads and writes
that are backed by a Service Level Agreement.
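The idea of indexing everything without a schema can be sketched with a toy in-memory store. This is only an illustration of the concept, not how Cosmos DB implements its index; the `ToyStore` class, the `flatten` helper, and the path notation are all assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, Set, Tuple


def flatten(doc: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield a (path, value) pair for every leaf value in a JSON-like document."""
    if isinstance(doc, dict):
        for key, value in doc.items():
            yield from flatten(value, f"{prefix}/{key}")
    elif isinstance(doc, list):
        for item in doc:
            yield from flatten(item, f"{prefix}/[]")
    else:
        yield prefix, doc


class ToyStore:
    """Toy document store that indexes every path of every inserted document."""

    def __init__(self) -> None:
        self.docs: Dict[str, dict] = {}
        self.index: Dict[Tuple[str, Any], Set[str]] = defaultdict(set)

    def insert(self, doc: dict) -> None:
        # Assumes each document carries a unique "id"; no schema is declared anywhere.
        self.docs[doc["id"]] = doc
        for path, value in flatten(doc):
            self.index[(path, value)].add(doc["id"])

    def find(self, path: str, value: Any) -> Set[str]:
        """Return the ids of documents holding `value` at `path`, using the index only."""
        return set(self.index.get((path, value), set()))


store = ToyStore()
# Two documents with different shapes go into the same store.
store.insert({"id": "1", "name": "Ada", "address": {"city": "London"}})
store.insert({"id": "2", "title": "Engineer", "tags": ["azure", "nosql"],
              "address": {"city": "London"}})

print(sorted(store.find("/address/city", "London")))
print(sorted(store.find("/tags/[]", "nosql")))
```

Because every path is indexed at insert time, a lookup on any field works without anyone having declared that field in advance.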
And finally, the third reason
you might choose Cosmos DB
is that it supports different APIs for accessing the data.
[MANY APIS SAME DATA STORE]
In other words, if you come from a SQL Server background,
you can think of the organization
of your data in a relational way,
and query it using
a familiar SQL-like syntax.
On the other hand, if you're coming
from a MongoDB background, there's an API
that you'll feel right at home using.
It's the same data in storage,
but the way you access it and think
about it can be dramatically different.
Cosmos DB currently
supports five different APIs.
[SQL, MONGO DB, CASSANDRA API, GREMLIN API, TABLE API]
In addition to SQL and MongoDB, there's a Cassandra API, a Gremlin API, and a Table API
for those who want
to move on from Azure Table Storage.
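Here is a toy Python sketch of the idea that the same stored data can be reached through different access styles. It does not use any Cosmos DB SDK; the `STORE` layout and the `table_get` and `document_find` helpers are hypothetical stand-ins for a key-value style API and a document style API. In Cosmos DB itself, the API is typically selected when the account is created.

```python
from typing import Any, Dict, List, Optional, Tuple

# One shared in-memory store keyed by (partition_key, row_key); an assumed layout.
STORE: Dict[Tuple[str, str], Dict[str, Any]] = {
    ("customers", "c1"): {"name": "Ada", "city": "London"},
    ("customers", "c2"): {"name": "Grace", "city": "New York"},
    ("customers", "c3"): {"name": "Alan", "city": "London"},
}


def table_get(partition_key: str, row_key: str) -> Optional[Dict[str, Any]]:
    """Key-value style access: look up one entity by its keys."""
    return STORE.get((partition_key, row_key))


def document_find(criteria: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Document style access: return every item whose fields equal the criteria."""
    return [
        item for item in STORE.values()
        if all(item.get(field) == value for field, value in criteria.items())
    ]


by_key = table_get("customers", "c2")
by_filter = document_find({"city": "London"})

print(by_key)
print([item["name"] for item in by_filter])
```

Both functions read the same `STORE`; only the way the caller thinks about the data differs.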
So to recap, Azure Cosmos DB
is a globally distributed data store.
It gives you five consistency models supported
by the Cosmos DB replication protocol
that provide a clear tradeoff
between specific consistency guarantees
and performance, and therefore, cost.
It also supports
multiple data models
and popular APIs for accessing and querying data.
And finally, it's schema-less, yet indexes everything automatically
for extremely fast data retrieval
with money-back performance guarantees.
I hope you found this overview helpful.
Thanks for watching.