AWS Amazon DynamoDB
Amazon DynamoDB is a fully managed proprietary NoSQL database service. It uses synchronous replication across multiple datacenters for high durability and availability.
As a NoSQL DB, it is usually compared with Hadoop or MongoDB (DynamoDB vs. Hadoop vs. MongoDB).
Features:
- Managed NoSQL database.
- Provisioned throughput.
- Fast, predictable performance.
- Fully distributed, fault tolerant.
- JSON support.
This tutorial is based on Getting Started with Amazon DynamoDB.
We'll work through this tutorial using the downloadable version of DynamoDB, including an interactive JavaScript shell.
This lets us learn about the DynamoDB API for free, without having to pay any fees for throughput, storage, or data transfer.
DynamoDB is available as an executable .jar file.
- Download DynamoDB for free and extract the archive:
$ wget http://dynamodb-local.s3-website-us-west-2.amazonaws.com/dynamodb_local_latest.tar.gz
$ gunzip dynamodb_local_latest.tar.gz
$ tar xvf dynamodb_local_latest.tar
- To start DynamoDB:
$ java -Djava.library.path=./DynamoDBLocal_lib -jar DynamoDBLocal.jar -sharedDb
Initializing DynamoDB Local with the following configuration:
Port: 8000
InMemory: false
DbPath: null
SharedDb: true
shouldDelayTransientStatuses: false
CorsParams: *
We can now access the built-in JavaScript shell.
URL: http://localhost:8000/shell
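Before running any programs against the local endpoint, it may help to confirm that something is actually listening on port 8000. The following is a minimal Ruby sketch that uses only the standard library; the helper name port_open? is our own and is not part of DynamoDB or the AWS SDK.

```ruby
require "socket"

# Returns true if a TCP connection to host:port succeeds within the timeout.
# port_open? is a hypothetical helper written for this tutorial.
def port_open?(host, port, timeout_secs = 2)
  Socket.tcp(host, port, connect_timeout: timeout_secs) { true }
rescue SystemCallError, SocketError
  # Connection refused, timed out, or the host could not be resolved.
  false
end

if port_open?("localhost", 8000)
  puts "Something is listening on port 8000."
else
  puts "Nothing is listening on port 8000; is DynamoDB Local running?"
end
```

This only checks that the port accepts connections; it does not verify that the process behind it is DynamoDB.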
The tutorial can be followed in several languages: Java, .NET, Node.js, PHP, Python, or Ruby.
In this tutorial, our choice is Ruby.
Set up the AWS SDK for Ruby:
$ sudo apt-get install ruby-full
AWS SDK for Ruby is modularized into multiple gems, each of which offers specific functionality:
$ gem install aws-sdk
'aws-sdk' is the main gem of the SDK. It bundles two gems, 'aws-sdk-core' and 'aws-sdk-resources', which offer two different styles of programming over AWS APIs. Each can also be installed on its own:
$ gem install aws-sdk-core
The Core gem, 'aws-sdk-core', provides a full one-to-one mapping to AWS APIs in an RPC-style programming model. It also has a number of built-in features such as automatic response paging, waiters, parameter validation, and Ruby type support in the Amazon DynamoDB client.
$ gem install aws-sdk-resources
The Resources gem, 'aws-sdk-resources', provides an object-oriented abstraction over the "low-level" or RPC-style interface in the Core, for a simpler and more intuitive coding experience.
A resource object is a reference to an AWS resource (such as an Amazon EC2 instance or an Amazon S3 object) that exposes the resource's attributes and actions as instance variables and methods.
Supported services include Amazon EC2, Amazon S3, Amazon SNS, Amazon SQS, AWS IAM, Amazon Glacier, AWS OpsWorks, and AWS CloudFormation, and more services will continue to be added.
We'll create a table named Movies. The primary key for the table is composed of the following two attributes:
- year - The partition key.
- title - The sort key.
We set the endpoint (endpoint: "http://localhost:8000") to indicate that we are creating the table in DynamoDB on our computer.
In the create_table call, we specify the table name, the primary key attributes, and their data types.
The provisioned_throughput parameter is required; however, the downloadable version of DynamoDB ignores it. (Provisioned throughput is beyond the scope of this exercise.)
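The listing for MoviesCreateTable.rb is not reproduced here, so the following is only a sketch of what such a program might look like, based on the description above. The helper names movies_table_params and create_movies_table are our own, and the throughput values are placeholders. The client is passed in as an argument so the parameter hash can be inspected without a running database.

```ruby
# Builds the create_table parameters for the Movies table described above.
# (movies_table_params is a hypothetical helper, not part of the SDK.)
def movies_table_params
  {
    table_name: "Movies",
    key_schema: [
      { attribute_name: "year",  key_type: "HASH"  }, # partition key
      { attribute_name: "title", key_type: "RANGE" }  # sort key
    ],
    attribute_definitions: [
      { attribute_name: "year",  attribute_type: "N" }, # number
      { attribute_name: "title", attribute_type: "S" }  # string
    ],
    # Required by the API but ignored by the downloadable version;
    # the values here are arbitrary placeholders.
    provisioned_throughput: {
      read_capacity_units: 10,
      write_capacity_units: 10
    }
  }
end

# Creates the table using any object that responds to create_table,
# for example an Aws::DynamoDB::Client.
def create_movies_table(client)
  result = client.create_table(movies_table_params)
  puts "Created table. Status: #{result.table_description.table_status}"
  result
end

# Usage against the local endpoint (needs the aws-sdk-core gem):
#   require "aws-sdk-core"
#   Aws.config.update(region: "us-west-2", endpoint: "http://localhost:8000")
#   create_movies_table(Aws::DynamoDB::Client.new)
```

A real program would probably also rescue Aws::DynamoDB::Errors::ServiceError around the call, as the other listings in this tutorial do.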
$ ruby MoviesCreateTable.rb
Created table. Status: ACTIVE
Now we want to populate the Movies table with sample data.
We use a sample data file that contains information about a few thousand movies from the Internet Movie Database (IMDb). The movie data is in JSON format, as shown in the following example.
For each movie, there is a year, a title, and a JSON map named info.
In the JSON data, note the following:
- We use the year and title as the primary key attribute values for our Movies table.
- We store the rest of the info values in a single attribute called info. This program illustrates how we can store JSON in a DynamoDB attribute.
The following is an example of movie data:
{
    "year" : 2013,
    "title" : "Turn It Down, Or Else!",
    "info" : {
        "directors" : [
            "Alice Smith",
            "Bob Jones"
        ],
        "release_date" : "2013-01-18T00:00:00Z",
        "rating" : 6.2,
        "genres" : [
            "Comedy",
            "Drama"
        ],
        "image_url" : "http://ia.media-imdb.com/images/N/O9ERWAU7FS797AJ7LU8HN09AMUP908RLlo5JF90EWR7LJKQ7@@._V1_SX400_.jpg",
        "plot" : "A rock band plays their music at high volumes, annoying the neighbors.",
        "rank" : 11,
        "running_time_secs" : 5215,
        "actors" : [
            "David Matthewman",
            "Ann Thomas",
            "Jonathan G. Neff"
        ]
    }
}
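To make the split between key attributes and the info map concrete, the following sketch parses a shortened copy of the record above with Ruby's standard json library and separates the two parts. The record is trimmed for brevity, and the variable names are our own.

```ruby
require "json"

# A shortened copy of the sample record (some info fields omitted).
record = JSON.parse(<<~MOVIE)
  {
    "year": 2013,
    "title": "Turn It Down, Or Else!",
    "info": {
      "rating": 6.2,
      "genres": ["Comedy", "Drama"],
      "rank": 11
    }
  }
MOVIE

# The primary key is the pair (year, title); everything else lives in info.
key  = record.slice("year", "title")
info = record["info"]

puts "Partition key (year): #{key["year"]}"
puts "Sort key (title):     #{key["title"]}"
puts "Attributes inside info: #{info.keys.join(', ')}"
```

Because the whole info value is stored as one attribute, nested fields such as rating or genres are not part of the key and cannot be used in a key condition.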
Download the Sample Data File : moviedata.zip.
$ wget http://docs.aws.amazon.com/amazondynamodb/latest/gettingstartedguide/samples/moviedata.zip
After downloading the archive and extracting moviedata.json from it (for example with unzip moviedata.zip), we can run the following program to populate the Movies table.
MoviesLoadData.rb:
require "aws-sdk-core"
require "json"

Aws.config.update({
  region: "us-west-2",
  endpoint: "http://localhost:8000"
})

dynamodb = Aws::DynamoDB::Client.new

tableName = 'Movies'

file = File.read('moviedata.json')
movies = JSON.parse(file)
movies.each do |movie|
  params = {
    table_name: tableName,
    item: movie
  }

  begin
    result = dynamodb.put_item(params)
    puts "Added movie: #{movie["year"]} #{movie["title"]}"
  rescue Aws::DynamoDB::Errors::ServiceError => error
    puts "Unable to add movie:"
    puts error.message
  end
end
Type the following command to run the program:
$ ruby MoviesLoadData.rb
...
Added movie: 2010 The Clinic
...
We can use the query method to retrieve data from a table. We must specify a partition key value, but the sort key is optional.
To query all movies released in a year we may want to run MoviesQuery01.rb:
require "aws-sdk-core"

Aws.config.update({
  region: "us-west-2",
  endpoint: "http://localhost:8000"
})

dynamodb = Aws::DynamoDB::Client.new

tableName = "Movies"

# "year" is a DynamoDB reserved word, so it is aliased as #yr.
params = {
  table_name: tableName,
  key_condition_expression: "#yr = :yyyy",
  expression_attribute_names: {
    "#yr" => "year"
  },
  expression_attribute_values: {
    ":yyyy" => 1985
  }
}

puts "Querying for movies from 1985."

begin
  result = dynamodb.query(params)
  puts "Query succeeded."

  result.items.each do |movie|
    puts "#{movie["year"].to_i} #{movie["title"]}"
  end
rescue Aws::DynamoDB::Errors::ServiceError => error
  puts "Unable to query table:"
  puts error.message
end
Run the code:
$ ruby MoviesQuery01.rb
Querying for movies from 1985.
Query succeeded.
1985 A Nightmare on Elm Street Part 2: Freddy's Revenge
1985 A Room with a View
...
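The query above supplies only the partition key. As a sketch of how the optional sort key might be added, the helper below builds the same parameters and, when a title range is given, extends the key condition with a between clause on title. The helper name movies_query_params and its keyword arguments are our own, and the year and letters in the usage comment are arbitrary examples.

```ruby
# Builds query parameters for all movies from a given year. If both
# title_from and title_to are given, the sort key (title) is restricted
# to that range as well. (Hypothetical helper, not part of the SDK.)
def movies_query_params(year, title_from: nil, title_to: nil)
  params = {
    table_name: "Movies",
    key_condition_expression: "#yr = :yyyy",
    # "year" is a DynamoDB reserved word, so it is aliased as #yr.
    expression_attribute_names: { "#yr" => "year" },
    expression_attribute_values: { ":yyyy" => year }
  }

  if title_from && title_to
    params[:key_condition_expression] += " and title between :t1 and :t2"
    params[:expression_attribute_values][":t1"] = title_from
    params[:expression_attribute_values][":t2"] = title_to
  end

  params
end

# Usage with the client from the listing above:
#   result = dynamodb.query(movies_query_params(1992, title_from: "A", title_to: "L"))
#   result.items.each { |movie| puts "#{movie["year"].to_i} #{movie["title"]}" }
```

Only the partition key condition is mandatory; the sort key condition narrows the result within that partition.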