In-depth knowledge of and experience with GCP data services: BigQuery, Dataproc, Composer, Pub/Sub, Dataflow, GCS, Bigtable
Must have proficiency with GCP databases: Bigtable, Spanner, Cloud SQL, AlloyDB
Solid understanding of relational database concepts and technologies such as SQL, MySQL, PostgreSQL, or Oracle.
Hands-on experience with other cloud platforms and services such as AWS RDS or Azure SQL Database.
Experience with NoSQL databases such as MongoDB, Scylla, Cassandra, or DynamoDB is a plus.
Familiarity with database performance tuning, optimization, and troubleshooting techniques is a plus.
Strong working experience with one or more big data/Hadoop distributions or ecosystems, such as Cloudera/Hortonworks, MapR, Azure HDInsight, IBM Open Platform, Kafka, Hive, and Spark
Good understanding of the following AWS data services: Redshift, RDS, Athena, or SQS/Kinesis
Good understanding of native and external tables and of common file formats (Avro, ORC, Parquet); see the BigQuery sketch after this list
CI/CD pipelines for data workloads using Cloud Build, Artifact Registry, and Terraform
Data governance solutioning using GCP governance tooling (Dataplex, Data Catalog)
Must have programming knowledge and willingness to be hands-on: Python, Java
Specialization in streaming technologies such as Pub/Sub, Kafka, or equivalent; a Pub/Sub publishing sketch follows this list
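
For the native/external table item above, a minimal sketch using the google-cloud-bigquery Python client; the project, dataset, table, and bucket names are hypothetical, and application-default credentials are assumed:

    from google.cloud import bigquery

    # Hypothetical project; assumes application-default credentials.
    client = bigquery.Client(project="my-project")

    # External table: BigQuery queries the Parquet files in place on GCS.
    client.query(
        """
        CREATE EXTERNAL TABLE IF NOT EXISTS analytics.events_ext
        OPTIONS (format = 'PARQUET',
                 uris = ['gs://my-bucket/events/*.parquet'])
        """
    ).result()

    # Native table: data is copied into BigQuery-managed storage.
    client.query(
        "CREATE TABLE IF NOT EXISTS analytics.events AS "
        "SELECT * FROM analytics.events_ext"
    ).result()

The same pattern applies to Avro and ORC by changing the format option.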
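
For the streaming item, a minimal Pub/Sub publishing sketch in Python; the project and topic names are hypothetical:

    from google.cloud import pubsub_v1

    # Hypothetical project and topic.
    publisher = pubsub_v1.PublisherClient()
    topic_path = publisher.topic_path("my-project", "events")

    # publish() is asynchronous; result() blocks until the server
    # returns the message ID.
    future = publisher.publish(topic_path, data=b'{"event": "signup"}')
    print(future.result())
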
Good to Have
Experience with BigQuery, Presto, or equivalent
Experience with open-source ecosystems and distributions such as Hadoop, Spark, and Cloudera/Hortonworks, and with frameworks such as Oozie, Kafka, and HBase; see the PySpark sketch at the end of this list
Understanding of and experience with NoSQL databases such as HBase, MongoDB, and Cassandra
Knowledge of cloud databases such as Spanner, Bigtable, and Cloud SQL, and of database migrations
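
As one concrete instance of the Spark experience listed above, a minimal PySpark sketch; the input path and column names are hypothetical, and a local Spark installation is assumed (on GCP this would typically run on Dataproc):

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("events-rollup").getOrCreate()

    # Hypothetical input path; Parquet files carry their own schema.
    df = spark.read.parquet("gs://my-bucket/events/")

    # Daily event counts per event type.
    (df.groupBy(F.to_date("event_ts").alias("day"), "event_type")
       .count()
       .orderBy("day")
       .show())

    spark.stop()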