Amazon RedShift - install, data load, and query
bogotobogo.com site search:
Launch Cluster










Connecting to Redshift



We'll use a web-based PostgreSQL client (pgweb, ip-address = 34.216.21.87) to connect to Redshift.


Loading data
In this section, we'll create a Redshift table to store flight information.
Paste the following above the "Run Query":
CREATE TABLE flights ( year smallint, month smallint, day smallint, carrier varchar(80) DISTKEY, origin char(3), dest char(3), aircraft_code char(3), miles int, departures int, minutes int, seats int, passengers int, freight_pounds int );

Click "Run Query":

We can now load our data. The data has already been uploaded to S3 and all we have to do is COPY.
Paste the following above the "Run Query":
COPY flights FROM 's3://us-west-2-aws-training/awsu-spl/spl17-redshift/static/data/flights-usa' IAM_ROLE 'arn:aws:iam::113125727155:role/Redshift-Role' GZIP DELIMITER ',' REMOVEQUOTES REGION 'us-west-2';
Click "Run Query". It may take about 3 minutes to load the data.

Running queries



Appendix - Redshift VPC cloudformation template
The Redshift cloudformation creates the following resources and output:


The "publicIp" is the address of the PostgreSQL server (linux) that's running by the CF stack:

AWSTemplateFormatVersion: 2010-09-09 Description: SPL-17 Working with Amazon Redshift Parameters: AmazonLinuxAMIID: Type: AWS::SSM::Parameter::Value<AWS::EC2::Image::Id> Default: /aws/service/ami-amazon-linux-latest/amzn-ami-hvm-x86_64-gp2 Resources: VPC: Type: AWS::EC2::VPC Properties: CidrBlock: 10.0.0.0/16 EnableDnsSupport: true EnableDnsHostnames: true Tags: - Key: Name Value: Lab VPC IGW: Type: AWS::EC2::InternetGateway Properties: Tags: - Key: Name Value: Lab IGW VPCtoIGWConnection: Type: AWS::EC2::VPCGatewayAttachment DependsOn: - IGW - VPC Properties: InternetGatewayId: !Ref IGW VpcId: !Ref VPC InternetRouteTable: Type: AWS::EC2::RouteTable DependsOn: VPC Properties: VpcId: !Ref VPC Tags: - Key: Name Value: Public Route Table InternetRoute: Type: AWS::EC2::Route DependsOn: - InternetRouteTable - IGW Properties: DestinationCidrBlock: 0.0.0.0/0 GatewayId: !Ref IGW RouteTableId: !Ref InternetRouteTable PublicSubnet: Type: AWS::EC2::Subnet DependsOn: VPC Properties: VpcId: !Ref VPC MapPublicIpOnLaunch: true CidrBlock: 10.0.0.0/24 AvailabilityZone: !Select - 0 - !GetAZs Ref: AWS::Region Tags: - Key: Name Value: Public Subnet PublicRouteTable: Type: AWS::EC2::SubnetRouteTableAssociation DependsOn: - InternetRouteTable - PublicSubnet Properties: RouteTableId: !Ref InternetRouteTable SubnetId: !Ref PublicSubnet pgwebSecurityGroup: Type: AWS::EC2::SecurityGroup DependsOn: VPC Properties: GroupName: pgweb Security Group GroupDescription: Enable access to pgweb VpcId: !Ref VPC SecurityGroupIngress: - IpProtocol: tcp FromPort: 80 ToPort: 80 CidrIp: 0.0.0.0/0 Tags: - Key: Name Value: pgweb Security Group pgwebServer: Type: AWS::EC2::Instance DependsOn: - InternetRoute - PublicSubnet - pgwebSecurityGroup Properties: InstanceType: t2.micro SubnetId: !Ref PublicSubnet ImageId: !Ref AmazonLinuxAMIID SecurityGroupIds: - !Ref pgwebSecurityGroup UserData: Fn::Base64: !Sub | #!/bin/bash yum update -y yum install -y docker service docker start docker run -d -p 80:8081 awsu/pgweb Tags: - Key: Name Value: pgweb RedshiftRole: Type: AWS::IAM::Role Properties: RoleName: Redshift-Role AssumeRolePolicyDocument: Version: 2012-10-17 Statement: - Effect: Allow Principal: Service: - redshift.amazonaws.com Action: - sts:AssumeRole Path: / Policies: - PolicyName: root PolicyDocument: Version: 2012-10-17 Statement: - Effect: Allow Action: s3:* Resource: '*' RedshiftClusterSubnetGroup: Type: AWS::Redshift::ClusterSubnetGroup Properties: Description: LabClusterSubnet SubnetIds: - !Ref PublicSubnet RedshiftSG: Type: AWS::EC2::SecurityGroup DependsOn: - VPC - RedshiftRole Properties: GroupName: Redshift Security Group GroupDescription: Enable access to redshift VpcId: !Ref VPC SecurityGroupIngress: - IpProtocol: tcp FromPort: 5439 ToPort: 5439 CidrIp: 0.0.0.0/0 Tags: - Key: Name Value: Redshift Security Group Outputs: Region: Value: !Ref AWS::Region Description: Lab Region RedshiftRole: Value: !GetAtt - RedshiftRole - Arn Description: Redshift Role ARN pgweb: Value: !GetAtt - pgwebServer - PublicIp Description: pgwebIPAddress
AWS (Amazon Web Services)
- AWS : EKS (Elastic Container Service for Kubernetes)
- AWS : Creating a snapshot (cloning an image)
- AWS : Attaching Amazon EBS volume to an instance
- AWS : Adding swap space to an attached volume via mkswap and swapon
- AWS : Creating an EC2 instance and attaching Amazon EBS volume to the instance using Python boto module with User data
- AWS : Creating an instance to a new region by copying an AMI
- AWS : S3 (Simple Storage Service) 1
- AWS : S3 (Simple Storage Service) 2 - Creating and Deleting a Bucket
- AWS : S3 (Simple Storage Service) 3 - Bucket Versioning
- AWS : S3 (Simple Storage Service) 4 - Uploading a large file
- AWS : S3 (Simple Storage Service) 5 - Uploading folders/files recursively
- AWS : S3 (Simple Storage Service) 6 - Bucket Policy for File/Folder View/Download
- AWS : S3 (Simple Storage Service) 7 - How to Copy or Move Objects from one region to another
- AWS : S3 (Simple Storage Service) 8 - Archiving S3 Data to Glacier
- AWS : Creating a CloudFront distribution with an Amazon S3 origin
- AWS : Creating VPC with CloudFormation
- AWS : WAF (Web Application Firewall) with preconfigured CloudFormation template and Web ACL for CloudFront distribution
- AWS : CloudWatch & Logs with Lambda Function / S3
- AWS : Lambda Serverless Computing with EC2, CloudWatch Alarm, SNS
- AWS : Lambda and SNS - cross account
- AWS : CLI (Command Line Interface)
- AWS : CLI (ECS with ALB & autoscaling)
- AWS : ECS with cloudformation and json task definition
- AWS Application Load Balancer (ALB) and ECS with Flask app
- AWS : Load Balancing with HAProxy (High Availability Proxy)
- AWS : VirtualBox on EC2
- AWS : NTP setup on EC2
- AWS: jq with AWS
- AWS & OpenSSL : Creating / Installing a Server SSL Certificate
- AWS : OpenVPN Access Server 2 Install
- AWS : VPC (Virtual Private Cloud) 1 - netmask, subnets, default gateway, and CIDR
- AWS : VPC (Virtual Private Cloud) 2 - VPC Wizard
- AWS : VPC (Virtual Private Cloud) 3 - VPC Wizard with NAT
- DevOps / Sys Admin Q & A (VI) - AWS VPC setup (public/private subnets with NAT)
- AWS - OpenVPN Protocols : PPTP, L2TP/IPsec, and OpenVPN
- AWS : Autoscaling group (ASG)
- AWS : Setting up Autoscaling Alarms and Notifications via CLI and Cloudformation
- AWS : Adding a SSH User Account on Linux Instance
- AWS : Windows Servers - Remote Desktop Connections using RDP
- AWS : Scheduled stopping and starting an instance - python & cron
- AWS : Detecting stopped instance and sending an alert email using Mandrill smtp
- AWS : Elastic Beanstalk with NodeJS
- AWS : Elastic Beanstalk Inplace/Rolling Blue/Green Deploy
- AWS : Identity and Access Management (IAM) Roles for Amazon EC2
- AWS : Identity and Access Management (IAM) Policies, sts AssumeRole, and delegate access across AWS accounts
- AWS : Identity and Access Management (IAM) sts assume role via aws cli2
- AWS : Creating IAM Roles and associating them with EC2 Instances in CloudFormation
- AWS Identity and Access Management (IAM) Roles, SSO(Single Sign On), SAML(Security Assertion Markup Language), IdP(identity provider), STS(Security Token Service), and ADFS(Active Directory Federation Services)
- AWS : Amazon Route 53
- AWS : Amazon Route 53 - DNS (Domain Name Server) setup
- AWS : Amazon Route 53 - subdomain setup and virtual host on Nginx
- AWS Amazon Route 53 : Private Hosted Zone
- AWS : SNS (Simple Notification Service) example with ELB and CloudWatch
- AWS : Lambda with AWS CloudTrail
- AWS : SQS (Simple Queue Service) with NodeJS and AWS SDK
- AWS : Redshift data warehouse
- AWS : CloudFormation
- AWS : CloudFormation Bootstrap UserData/Metadata
- AWS : CloudFormation - Creating an ASG with rolling update
- AWS : Cloudformation Cross-stack reference
- AWS : OpsWorks
- AWS : Network Load Balancer (NLB) with Autoscaling group (ASG)
- AWS CodeDeploy : Deploy an Application from GitHub
- AWS EC2 Container Service (ECS)
- AWS EC2 Container Service (ECS) II
- AWS Hello World Lambda Function
- AWS Lambda Function Q & A
- AWS Node.js Lambda Function & API Gateway
- AWS API Gateway endpoint invoking Lambda function
- AWS API Gateway invoking Lambda function with Terraform
- AWS API Gateway invoking Lambda function with Terraform - Lambda Container
- Amazon Kinesis Streams
- AWS: Kinesis Data Firehose with Lambda and ElasticSearch
- Amazon DynamoDB
- Amazon DynamoDB with Lambda and CloudWatch
- Loading DynamoDB stream to AWS Elasticsearch service with Lambda
- Amazon ML (Machine Learning)
- Simple Systems Manager (SSM)
- AWS : RDS Connecting to a DB Instance Running the SQL Server Database Engine
- AWS : RDS Importing and Exporting SQL Server Data
- AWS : RDS PostgreSQL & pgAdmin III
- AWS : RDS PostgreSQL 2 - Creating/Deleting a Table
- AWS : MySQL Replication : Master-slave
- AWS : MySQL backup & restore
- AWS RDS : Cross-Region Read Replicas for MySQL and Snapshots for PostgreSQL
- AWS : Restoring Postgres on EC2 instance from S3 backup
- AWS : Q & A
- AWS : Security
- AWS : Security groups vs. network ACLs
- AWS : Scaling-Up
- AWS : Networking
- AWS : Single Sign-on (SSO) with Okta
- AWS : JIT (Just-in-Time) with Okta
Ph.D. / Golden Gate Ave, San Francisco / Seoul National Univ / Carnegie Mellon / UC Berkeley / DevOps / Deep Learning / Visualization