AzureCloud-FINAL-kb
>> Hello, everyone.
Over the next few minutes,
I'll talk about some of the features
that make Azure Cosmos DB
a good choice for globally distributed
data storage.
If you're new to Cosmos DB,
you might be wondering
why you should choose it
for your application,
or perhaps even why your organization
or data architect has chosen to use Cosmos DB
over other data storage options
when there are so many to choose from.
First of all,
[WHAT EXACTLY IS A GLOBALLY DISTRIBUTED DATABASE?]
what exactly is a globally distributed database?
Well, in a nutshell,
globally distributed databases
enable your data to live
in virtually every region of the world
where it's needed.
Now that's important for two reasons.
First, if there's an outage in one region,
you have copies of your data
sitting around the world
that your application can fail over to.
Second, your applications
can pull data from the region
closest to your users
to reduce data access latency.
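To make those two reasons concrete, here is a minimal sketch of picking a read region. It is an illustration only, not how Cosmos DB routes requests: the function `pick_read_region`, the region names, and the latency figures are all hypothetical.

```python
from typing import Dict, Optional, Set


def pick_read_region(latency_ms: Dict[str, float],
                     unavailable: Set[str]) -> Optional[str]:
    """Return the available region with the lowest latency, or None."""
    # Drop regions that are currently down (the failover case).
    candidates = {r: ms for r, ms in latency_ms.items() if r not in unavailable}
    if not candidates:
        return None
    # Among the remaining regions, prefer the closest one (lowest latency).
    return min(candidates, key=candidates.get)


# Hypothetical round-trip latencies measured from one user's location.
latencies = {"West Europe": 18.0, "East US": 95.0, "Southeast Asia": 210.0}

nearest = pick_read_region(latencies, unavailable=set())
after_outage = pick_read_region(latencies, unavailable={"West Europe"})
print(nearest)
print(after_outage)
```

With every region healthy, the closest region is chosen; when that region is marked unavailable, the same call falls back to the next closest copy of the data.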
The question then is
if your data is distributed
and you make changes to data in one region,
how do you sync these changes
to all other instances of your data
located in other regions across the globe?
Well, answering that question turns out
to be a key reason to choose Cosmos DB.
It gives you more control over consistency
than any other distributed database
offering today.
When you're dealing with distributed data,
you need to think about how you want
your data to sync between locations,
given the tradeoffs among
the availability and staleness of the data,
[AVAILABILITY AND STALENESS]
[LATENCY]
the latency you're willing to accept during queries,
and the cost of throughput
[COST]
to get the data to its destinations.
This is something
you need to consider upfront
as you begin to architect your application.
Now historically,
distributed database services
have not been very flexible.
There was either no choice at all,
or you had to choose between
the two extreme ends
of the continuum of consistency.
It's basically the difference between
choosing whether
everything is always in sync
regardless of the cost
versus everything
potentially being out of sync
and never really being sure
you're working with the latest data.
Again, these are two extremes
along the continuum of consistency,
and thankfully Cosmos DB
gives us more choices.
Azure Cosmos DB offers
five consistency models
that trade off to varying degrees
between availability,
latency, and throughput.
You select the level that offers
the right balance for your scenario.
Since no one size will fit all
in terms of consistency model,
this makes Cosmos DB
a great choice for distributed data.
We talk about the attributes
of these consistency models
at length in our Architect Modules
on Cosmos DB later.
So the main takeaway for now
as you are getting started
is that depending
on your organization's needs,
you can spend less to keep your data in sync
by allowing it to sync up eventually
or spend more to increase
the throughput of the data
and keep it fully in sync at all times.
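The tradeoff can be sketched with a toy model. The five level names below are the ones Cosmos DB documents (strong, bounded staleness, session, consistent prefix, eventual); everything else, including the `ToyTwoRegionStore` class and its replication queue, is a hypothetical simplification and not the Cosmos DB replication protocol.

```python
from enum import IntEnum
from typing import List, Optional


class ConsistencyLevel(IntEnum):
    """The five levels, ordered from weakest (1) to strongest (5)."""
    EVENTUAL = 1
    CONSISTENT_PREFIX = 2
    SESSION = 3
    BOUNDED_STALENESS = 4
    STRONG = 5


class ToyTwoRegionStore:
    """Toy model: one write region and one read region with a sync queue."""

    def __init__(self) -> None:
        self.write_region: Optional[str] = None
        self.read_region: Optional[str] = None
        self._in_flight: List[str] = []
        self.sync_operations = 0  # stands in for the cost of staying in sync

    def write(self, value: str) -> None:
        # The write lands locally; replication to the other region is deferred.
        self.write_region = value
        self._in_flight.append(value)

    def _replicate(self) -> None:
        for value in self._in_flight:
            self.read_region = value
            self.sync_operations += 1
        self._in_flight.clear()

    def read(self, level: ConsistencyLevel) -> Optional[str]:
        # Assumption: only STRONG forces a sync before reading. In this toy,
        # the three intermediate levels behave like EVENTUAL.
        if level is ConsistencyLevel.STRONG:
            self._replicate()
        return self.read_region


store = ToyTwoRegionStore()
store.write("v1")
stale = store.read(ConsistencyLevel.EVENTUAL)  # cheap, may miss the write
fresh = store.read(ConsistencyLevel.STRONG)    # pays for a sync first
print(stale, fresh, store.sync_operations)
```

The eventual read returns before the write has arrived, while the strong read pays for replication and then sees the latest value. The real levels in between offer finer-grained guarantees than this toy shows.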
The second reason you might choose Cosmos DB
is that it is fully schema-agnostic.
[SCHEMA AGNOSTIC]
This means that, as a developer,
you can iterate the schema
of your application
without worrying about database schema
and/or index management.
It enables you to use key-value, graph,
and document data together
in a single service.
In addition, Cosmos DB automatically indexes
[AUTOMATIC INDEXING]
all the data it ingests
without requiring any schema or indexes,
and serves up blazing fast reads and writes
that are backed
by a Service Level Agreement.
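As a rough illustration of indexing without a declared schema, the sketch below indexes every path of every document on insert. It is a toy inverted index under stated assumptions, not Cosmos DB's indexing engine: `ToyAutoIndex`, `iter_paths`, and the `/[]` notation for array elements are hypothetical names chosen for this example.

```python
from collections import defaultdict
from typing import Any, Dict, Iterator, Set, Tuple


def iter_paths(value: Any, prefix: str = "") -> Iterator[Tuple[str, Any]]:
    """Yield (path, leaf value) pairs for every leaf in a JSON-like value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, f"{prefix}/{key}")
    elif isinstance(value, list):
        for child in value:
            yield from iter_paths(child, f"{prefix}/[]")
    else:
        yield prefix, value


class ToyAutoIndex:
    """Indexes every path of every inserted document; no schema is declared."""

    def __init__(self) -> None:
        self._docs: Dict[str, Dict[str, Any]] = {}
        self._index: Dict[Tuple[str, Any], Set[str]] = defaultdict(set)

    def insert(self, doc_id: str, doc: Dict[str, Any]) -> None:
        self._docs[doc_id] = doc
        for path, leaf in iter_paths(doc):
            self._index[(path, leaf)].add(doc_id)

    def find(self, path: str, value: Any) -> Set[str]:
        # Look up by (path, value) instead of scanning every document.
        return set(self._index.get((path, value), set()))


index = ToyAutoIndex()
# The two documents have different shapes; neither was declared up front.
index.insert("1", {"name": "Ada", "address": {"city": "London"}})
index.insert("2", {"name": "Lin", "tags": ["vip"],
                   "address": {"city": "London", "zip": "N1"}})

in_london = index.find("/address/city", "London")
vips = index.find("/tags/[]", "vip")
print(sorted(in_london), sorted(vips))
```

Because every path is indexed as documents arrive, a field that appears for the first time in the second document (`tags`) is queryable right away without any index definition.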
And finally, the third reason
you might choose Cosmos DB
is that it supports different APIs
for accessing the data.
[MANY APIS SAME DATA STORE]
In other words, if you come
from a SQL Server background,
you can think of the organization
of your data in a relational way,
and query it using a familiar
SQL-like syntax.
On the other hand, if you're coming
from a MongoDB background,
there's an API that you'll feel
right at home using.
It's the same data in storage,
but the way you access it
and think about it can be
dramatically different.
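The idea of one data store with several access styles can be sketched as follows. This is a toy under stated assumptions: `query_sql_like` and `find_mongo_like` are hypothetical helpers over an in-memory list, not calls from any Cosmos DB SDK, and the SQL-like helper understands only one query shape.

```python
import re
from typing import Any, Dict, List

# One shared set of stored documents (hypothetical sample data).
DOCS: List[Dict[str, Any]] = [
    {"id": "1", "name": "Ada", "city": "London"},
    {"id": "2", "name": "Lin", "city": "Seattle"},
]


def query_sql_like(query: str) -> List[Dict[str, Any]]:
    """Answer a single SQL-like shape: SELECT * FROM c WHERE c.<field> = "<value>"."""
    match = re.fullmatch(r'SELECT \* FROM c WHERE c\.(\w+) = "([^"]*)"', query)
    if match is None:
        raise ValueError("this toy only understands one query shape")
    field, value = match.groups()
    return [doc for doc in DOCS if doc.get(field) == value]


def find_mongo_like(filter_doc: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Answer a MongoDB-style equality filter such as {"city": "London"}."""
    return [doc for doc in DOCS
            if all(doc.get(key) == value for key, value in filter_doc.items())]


via_sql = query_sql_like('SELECT * FROM c WHERE c.city = "London"')
via_mongo = find_mongo_like({"city": "London"})
print(via_sql == via_mongo)
```

Both calls read the same stored documents and return the same result; only the way the question is phrased differs.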
Cosmos DB currently supports
five different APIs.
[SQL, MONGODB]
In addition to SQL and MongoDB,
[CASSANDRA API]
there's a Cassandra API,
[GREMLIN API]
a Gremlin API,
[TABLE API]
and a Table API
for those who want to move on
from Azure Table Storage.
So to recap, Azure Cosmos DB
is a globally distributed data store.
It gives you five consistency models
supported by the Cosmos DB
replication protocol
that provide a clear tradeoff
between specific consistency
guarantees and performance,
and therefore, cost.
It also supports multiple data models
and popular APIs
for accessing and querying data.
And finally, it's schema-less,
yet indexes everything automatically
for extremely fast data retrieval
with money-back performance guarantees.
I hope you found this overview helpful.
Thanks for watching.