This repo supplements the YouTube video on PySpark for Glue. It includes a CloudFormation template that creates the S3 bucket, Glue tables, IAM roles, and CSV data files. Below are the schemas ...
A custom sink provider for Apache Spark that sends the contents of a DataFrame to AWS SQS. It takes the value of the first column of each row and sends it as a message to an SQS queue. It needs the ...
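
The "first column to SQS" behaviour can be illustrated with a minimal Python sketch. This is not the provider's own code: the names `first_column_batches` and `send_partition` are hypothetical, and the sketch assumes the client object exposes boto3's `send_message_batch(QueueUrl=..., Entries=...)` call, which accepts at most 10 entries per request.

```python
from typing import Any, Iterable, Iterator, List, Sequence

# SQS accepts at most 10 entries in a single SendMessageBatch request.
SQS_MAX_BATCH = 10


def first_column_batches(
    rows: Iterable[Sequence[Any]], batch_size: int = SQS_MAX_BATCH
) -> Iterator[List[dict]]:
    """Group the first column of each row into SQS batch entries.

    Hypothetical helper: every entry carries the first column as the
    message body and the row index as an Id that is unique in its batch.
    """
    batch: List[dict] = []
    for index, row in enumerate(rows):
        batch.append({"Id": str(index), "MessageBody": str(row[0])})
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final, possibly smaller, batch.
        yield batch


def send_partition(rows: Iterable[Sequence[Any]], sqs_client: Any, queue_url: str) -> int:
    """Send the first column of every row to the queue; return the count sent.

    Assumption: sqs_client behaves like boto3.client("sqs"). Per-entry
    failures reported in the response are ignored in this sketch.
    """
    sent = 0
    for entries in first_column_batches(rows):
        sqs_client.send_message_batch(QueueUrl=queue_url, Entries=entries)
        sent += len(entries)
    return sent
```

From PySpark, something like `df.foreachPartition(lambda rows: send_partition(rows, boto3.client("sqs"), queue_url))` would apply this per partition, assuming `boto3` is available on the executors and `queue_url` points at an existing queue. Creating the client inside the partition function avoids having to serialize it from the driver.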