GCP Google Cloud Platform - Storage and OLTP Databases

October 02, 2018

The Series

Big Data

Basic Concepts TL;DR
Disks? Not so simple…
The common disks in the cloud are called Block Storage; they are usually made of slices of several physical disks, “virtually” seen as one,
and replicated behind the scenes.
You may choose the type (SSD, HDD), the speed and the location (latency); disks can be zonal or regional.
Costs and performance differ.
But you may also have RAM disks and distributed file systems.
Key-value object storage (similar to Drive or Dropbox) → Cloud Storage.
Cloud Storage (similar to S3) may contain only files (not running applications, for example);
it is inexpensive and full of nice functions (versioning, lifecycle, archiving...).

We are talking about GCP Managed Services, that is, a prêt-à-porter DB.
They usually come with automated backups and failover capabilities, with minimal or no effort.
Amazing! But pay attention... there is no free lunch: DBs are usually the most expensive services.
Main Features/Products:
Cloud SQL (MySQL and PostgreSQL): small (MySQL) to medium (PostgreSQL) DBs, inside a Region.
Spanner: big, multi-region and highly scalable RDBMS.

Cloud Datastore: economical document NoSQL DB (like MongoDB)
with an SQL-like query language; scalable, but not always strongly consistent.
Bigtable: a DB and a Big Data tool at the same time. Unique of its kind.
A key-value NoSQL DB, but the values are structured in hierarchical columns.
Used by Google for Gmail, Maps, etc.
Petabytes in ms.

Ask yourself
What do RPO and RTO mean? Why are they so important?
Take a WordPress site, a gaming startup and a big corporation or institution.
What are their requirements for database and security? Which products are suitable? Why?
How can I build a static website with Cloud Storage? What are the benefits?

No big explanations but only the relevant info with links. More info than you actually need for certification. Just pick what you need.

Regional persistent disk and regional SSD persistent disk:  max 64 TB replicated in two zones. IOPS: 3,000/60,000 - greater latency
Local SSD: more expensive ephemeral max 3TB Instance IOPS: 280,000/680,000
Create a file server or distributed file system on Compute Engine to use as a network file system with NFSv3 and SMB3 capabilities.
Mount a RAM disk within instance memory to create a block storage volume with high throughput and low latency.
Snapshots - Backups: incremental; encryption with system-defined keys or with customer-supplied keys
Objects and  Buckets  99.999999999% durability
gsutil tool  
encryption at rest  
Storage Transfer Service transfers data from an online data source to a data sink. Your data source can be an Amazon Simple Storage Service (Amazon S3) bucket, an HTTP/HTTPS location, or a Cloud Storage bucket. Your data sink (the destination) is always a Cloud Storage bucket.
Coldline Storage: 99.9%, $0.007 per GB per month
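To get a feeling for two of the figures above (eleven nines of durability and the Coldline price), here is a small back-of-the-envelope sketch. It assumes the durability figure is an annual, per-object probability and uses the 2018 price quoted in these notes; the function names are made up for illustration.

```python
# Figures copied from the notes above. Treating durability as an annual,
# per-object probability is an assumption of this sketch.
DURABILITY = 0.99999999999           # 99.999999999% (eleven nines)
COLDLINE_USD_PER_GB_MONTH = 0.007    # price quoted in the notes (2018)


def expected_objects_lost_per_year(object_count: int) -> float:
    """Expected number of objects lost in one year, given the durability."""
    return object_count * (1.0 - DURABILITY)


def coldline_monthly_cost_usd(gigabytes: float) -> float:
    """Storage-only monthly cost (no retrieval or operation fees)."""
    return gigabytes * COLDLINE_USD_PER_GB_MONTH


if __name__ == "__main__":
    # 10 million objects: roughly 0.0001 expected losses per year,
    # that is, about one object every 10,000 years.
    print(expected_objects_lost_per_year(10_000_000))
    # 10 TB (10,000 GB) in Coldline: about 70 USD per month for storage alone.
    print(coldline_monthly_cost_usd(10_000))
```

Keep in mind that Coldline also charges for retrieval, so the storage price alone understates the cost of data that is read often.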

Useful functions:

Regional 1st Generation up to 5.5 - 2nd Generation 5.7
Max 16 GB of RAM and 500 GB data storage
sql-proxy → secure external connections
Data replication + zones; MySQL client; external applications
Kubernetes Engine → connect to
HA configuration Cluster: primary instance failover replica (different zone only one)

up to 416 GB of RAM and 64 CPUs
Regional 2nd Generation 9.6. Data replication + Zones
no Point-in-time recovery (PITR)
HA configuration Regional instance located in a primary and secondary zone + standby instance
synchronous replication

NoSQL document DB; GQL → SQL-like query language; cheap + free tier
Table → Kind - Row → Entity - Field → Property - Primary key → Key
ACID properties

Cloud Bigtable is a key/value store. It does not support joins, nor does it support transactions except within a single row
ZONAL regional
Cloud Bigtable performs best with 1 TB or more of data.
A NoSQL DB that scales to billions of rows and thousands of columns, enabling you to store terabytes or even petabytes of data. Managed, but not so easy to configure/optimize.
Instance is a container for your clusters and nodes
Single-keyed data with very low latency; integrates with the Apache Big Data tools through the Apache HBase library for Java
Replication → add a second cluster; the rest is automatic
Column families; data → raw byte strings (counters used for increments: 64-bit integers encoded as 8-byte big-endian)
Front-end server pool → node (pointers) → data (Colossus)
sharded into blocks of contiguous rows, called tablets  
Schema design for time series: avoid hotspotting
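Two of the Bigtable notes above are easier to see in code: the 8-byte big-endian encoding of counters, and why a row key that starts with a timestamp creates a hotspot. This is only a sketch in plain Python with no Bigtable client involved; the sensor names and the `#` separator are assumptions made for the example.

```python
import struct


def encode_counter(value: int) -> bytes:
    """Encode a signed 64-bit integer as 8 bytes, big-endian."""
    return struct.pack(">q", value)


def decode_counter(raw: bytes) -> int:
    """Inverse of encode_counter."""
    return struct.unpack(">q", raw)[0]


def timestamp_first_key(sensor_id: str, epoch_seconds: int) -> bytes:
    """Anti-pattern: every new write sorts at the end of the table."""
    return f"{epoch_seconds}#{sensor_id}".encode()


def sensor_first_key(sensor_id: str, epoch_seconds: int) -> bytes:
    """Sensor id first: new writes are spread across the sensor prefixes."""
    return f"{sensor_id}#{epoch_seconds}".encode()


if __name__ == "__main__":
    print(encode_counter(1).hex())

    sensors = ["sensor-a", "sensor-b", "sensor-c"]   # hypothetical ids
    times = [1538438400, 1538438401]                 # two consecutive seconds

    # Rows are stored sorted by key, so sorting shows where writes land.
    for make_key in (timestamp_first_key, sensor_first_key):
        keys = sorted(make_key(s, t) for s in sensors for t in times)
        print(make_key.__name__, keys)
```

With the timestamp first, all rows for the latest second sit next to each other at the end of the key space, so they fall into the same tablet and hit the same node. With the sensor id first, each sensor's rows stay contiguous and simultaneous writes land in different parts of the table.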

Fully managed, mission-critical relational database; SQL (ANSI 2011 with extensions); automatic, synchronous replication for HA.
Replication: globally synchronous, so reads always get the most up-to-date data. Replica types: read-write replicas, read-only replicas, and witness replicas.
Cloud Spanner divides your data into chunks called "splits", where individual splits can move independently from each other and get assigned to different servers, which can be in different physical locations.
Read-only replicas are only used in multi-region instances, are not eligible to become a leader and may serve stale reads without needing a round-trip to the default leader region.
Regional configurations contain exactly three read-write replicas.
Multi-region configurations contain more replicas, but not all of them have voting rights: read-only replicas do not vote, while witness replicas vote on commits but don’t hold a full copy of the data.
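The voting rules above come down to simple majority arithmetic: a write commits when a majority of the voting replicas (read-write plus witness) agree. The sketch below only illustrates that arithmetic; the multi-region layout with four read-write replicas and one witness is an assumed example, and the function names are invented for it.

```python
def quorum_size(voting_replicas: int) -> int:
    """Smallest number of votes that forms a majority."""
    return voting_replicas // 2 + 1


def can_commit(voting_replicas: int, available_voters: int) -> bool:
    """True if enough voting replicas are reachable to commit a write."""
    return available_voters >= quorum_size(voting_replicas)


if __name__ == "__main__":
    # Regional configuration: exactly three read-write replicas.
    print(quorum_size(3))      # 2 votes needed
    print(can_commit(3, 2))    # True: one replica may be down
    print(can_commit(3, 1))    # False: majority lost

    # Assumed multi-region example: 4 read-write + 1 witness = 5 voters.
    # Read-only replicas are not counted because they do not vote.
    print(quorum_size(5))      # 3 votes needed
    print(can_commit(5, 3))    # True
```

The witness is what makes the voter count odd in such a layout, so a majority can still be reached when one region becomes unreachable.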

Study Material
Links may not work if you are not enrolled in Coursera

AWS Storage short and free

Demos Videos
Don’t get scared: many videos last just 1 minute...Only a little demo. If you are in a hurry, they can replace labs.

Labs Qwiklabs
Migrate a MySQL Database to Google Cloud SQL
Loading Data into Google Cloud SQL
Cloud SQL for PostgreSQL: Qwik Start
Cloud Spanner: Qwik Start
Bigtable: Qwik Start - Command Line

Practice Tests
For any doubt you may refer to the doc Building Blocks
