Deriving inferences from datasets sourced from the national data archive through visualizations and AWS-supported processing.

VidyutChakrabarti/GOI-Statistics

Study on Household Consumption and Inflation data

Deriving inferences from datasets sourced from the national data archive.

Repository Structure

repo structure diagram

Current Datasets

  1. Household Consumption Dataset

  2. Consumer Price Index and Inflation Data

To get the cleaned dataset, download the S3-Data and explore it.

The cleaning script is available here: Open in Colab

YouTube video: Link

Pipeline overview:

pipeline diagram

Our final database schema looks like this:

Schema diagram

To deploy on AWS:

  1. Create a terraform.tfvars file with the access keys for your AWS account (the account must have an appropriate IAM role).
  2. Deploy GOIStats/bootstrap first, starting with:
terraform init

If a credentials error comes up, configure your environment with:

aws configure

Finally, run:

terraform plan
terraform apply -auto-approve
  • Upload psycopg2-layer.zip as a layer for the Lambda function. This layer was packaged on an AWS Linux OS; a package zipped on Windows produces an error.
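
Before uploading a layer, it can help to inspect the zip. The following is a minimal sketch, not part of this repository: `check_layer_zip` is a hypothetical helper. It relies on two general facts: Lambda makes a layer's `python/` directory importable for Python runtimes, and psycopg2 ships a compiled extension that is a `.so` file on Linux and a `.pyd` file on Windows. A Windows-built extension is one plausible cause of the error mentioned above, though the exact failure in this project is not documented here.

```python
import zipfile


def check_layer_zip(zip_file):
    """Return a list of problems found in a Lambda layer zip (hypothetical helper).

    `zip_file` is a path or a file-like object. An empty list only means the
    checks below passed; it does not guarantee the layer imports correctly.
    """
    problems = []
    with zipfile.ZipFile(zip_file) as zf:
        names = zf.namelist()

    # For Python runtimes, Lambda adds the layer's python/ directory to sys.path.
    if not any(name.startswith("python/") for name in names):
        problems.append("no top-level python/ directory")

    # psycopg2's compiled extension (_psycopg) is platform specific.
    compiled = [n for n in names if "psycopg2/" in n and "_psycopg" in n]
    if not any(n.endswith(".so") for n in compiled):
        problems.append("no Linux (.so) build of the psycopg2 extension")
    if any(n.endswith(".pyd") for n in compiled):
        problems.append("Windows (.pyd) build of the psycopg2 extension found")

    return problems
```

Calling `check_layer_zip("psycopg2-layer.zip")` returns an empty list when none of these problems is detected.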

Note: Additional charges may apply if large datasets are transferred within the VPC using Lambda. We avoided AWS Glue due to cost constraints.

Power BI dashboard:

powerbi dashboard
