Teja
PROFESSIONAL SUMMARY:
Around 10 years of experience as a data engineer and Python developer, with a proven track record of orchestrating, optimizing, and maintaining data pipelines using technologies such as Apache Spark, Kafka, and AWS services.
Proficient in architecting large-scale data solutions, with expertise in AWS Redshift, Snowflake, and real-time data processing systems.
Adept at leveraging Informatica, Teradata, and Apache Airflow for streamlined data integration and orchestration, with a strong foundation in data warehousing and ETL processes.
Committed to delivering high-quality solutions, with a deep understanding of data engineering practices and methodologies and experience in Agile environments.
Experienced in Big Data/Hadoop, data analysis, and data modeling, with an applied information technology background.
Strong experience working with HDFS, MapReduce, Spark, Hive, Sqoop, Flume, Kafka, Oozie, Pig, and HBase.
IT experience spanning Big Data technologies, Spark, and database development.
Good experience with Amazon Web Services (AWS) offerings such as EMR and EC2, which provide fast, efficient processing for Teradata Big Data analytics.
Solid expertise in cloud platforms, including AWS (IAM, S3, EC2, etc.), and strong programming skills in Python and Java.
Experience with Apache Spark, Spark Streaming, Spark SQL, and NoSQL databases such as HBase, Cassandra, and MongoDB.
Establishes and executes the Data Quality Governance Framework, including an end-to-end process and data quality framework for assessing decisions that ensure the suitability of data for its intended purpose.
Proficiency in Big Data practices and technologies such as HDFS, MapReduce, Hive, Pig, HBase, Sqoop, Spark, and Kafka.
Experience in implementing security practices within Airflow, including user authentication, access controls, and
encryption, ensuring data privacy and compliance.
Extensive experience in loading and analyzing large datasets with the Hadoop framework (MapReduce, HDFS, Pig, Hive, Flume, Sqoop, Spark, Impala, Scala) and NoSQL databases such as MongoDB, HBase, and Cassandra.
Integrated Kafka with Spark Streaming for real-time data processing (see the PySpark sketch at the end of this summary).
Strong experience in the analysis, design, development, testing, and implementation of Business Intelligence solutions using data warehouse/data mart design, ETL, BI, and client/server applications, including writing ETL scripts with regular expressions and tools such as Informatica and Pentaho.
Orchestrated intricate ETL processes with Airflow, ensuring seamless execution and monitoring of tasks within defined schedules (a minimal Airflow DAG sketch appears at the end of this summary).
Implemented best practices for logging, monitoring, and alerting to maintain high availability and reliability of data pipelines.
Deep expertise in advanced SQL, data modeling, and distributed data processing frameworks such as Spark.
Expertise in transforming business requirements into analytical models: designing algorithms, building models, and developing solutions for data mining, data acquisition, data preparation, data manipulation, feature engineering, and machine learning.
Experience with data analytics, data reporting, ad-hoc reporting, graphs, scales, PivotTables, and OLAP reporting.
Skilled in data parsing, data manipulation, and data preparation, including profiling and describing data contents.
Experienced in writing complex SQL queries, including stored procedures, triggers, joins, and subqueries (a short SQL example appears at the end of this summary).
Proficiency in Business Intelligence tools, with a preference for Power BI, facilitating the creation of insightful and visually appealing data visualizations for effective communication and decision-making.
Extensive experience in generating data visualizations using R and Python, and creating dashboards using tools like Tableau.
Experience designing data models and ensuring they align with business objectives.
Extensive working experience with Databricks for data engineering and analytics.
Skilled in crafting efficient data models for seamless integration and reporting in Tableau, with strength in Python scripting, PySpark, and SQL for data manipulation and cleansing (a PySpark cleansing sketch appears at the end of this summary).
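
For illustration, a minimal PySpark sketch of the Kafka-to-Spark-Streaming integration mentioned above; the broker address (localhost:9092) and topic name (events) are assumed placeholders, and running it requires Spark's Kafka connector package (spark-sql-kafka) on the classpath.

# Minimal sketch: consuming a Kafka topic with Spark Structured Streaming.
# The broker address and topic name below are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("kafka-stream-demo").getOrCreate()

# Read the Kafka topic as a streaming DataFrame; Kafka delivers key/value as bytes.
stream = (spark.readStream
          .format("kafka")
          .option("kafka.bootstrap.servers", "localhost:9092")  # assumed broker
          .option("subscribe", "events")                        # assumed topic
          .load())

# Decode the message payload to a string column for downstream processing.
messages = stream.select(col("value").cast("string").alias("payload"))

# Write results to the console sink; a real pipeline would target S3, a table, etc.
query = (messages.writeStream
         .format("console")
         .outputMode("append")
         .start())
query.awaitTermination()

Structured Streaming treats the topic as an unbounded table, so the same DataFrame operations used in batch jobs apply directly to the live stream.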
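
A minimal Airflow DAG sketch of the scheduled ETL orchestration described above, assuming Airflow 2.x; the dag_id, schedule, and extract/load callables are illustrative placeholders, not a specific production pipeline.

# Minimal sketch of a daily Airflow DAG; task names and callables are
# hypothetical placeholders for real extract/load logic.
from datetime import datetime
from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pulling source data")  # placeholder for a real extract step

def load():
    print("loading into the warehouse")  # placeholder for a real load step

with DAG(
    dag_id="example_etl",            # assumed name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    load_task = PythonOperator(task_id="load", python_callable=load)
    extract_task >> load_task  # run extract before load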
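
A small runnable example of the join-plus-subquery style of SQL noted above, executed from Python against an in-memory SQLite database; the customers/orders schema is invented purely for illustration.

# Join customers to orders and filter with a subquery over an assumed schema.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Acme'), (2, 'Globex');
    INSERT INTO orders VALUES (1, 1, 250.0), (2, 1, 90.0), (3, 2, 40.0);
""")

# Keep customers whose total spend exceeds the overall average order value,
# which is computed by the subquery in the HAVING clause.
rows = conn.execute("""
    SELECT c.name, SUM(o.total) AS spend
    FROM customers c
    JOIN orders o ON o.customer_id = c.id
    GROUP BY c.name
    HAVING SUM(o.total) > (SELECT AVG(total) FROM orders)
""").fetchall()
print(rows)  # [('Acme', 340.0)]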
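
A minimal PySpark data-cleansing sketch of the kind of manipulation described above; the column names, sample rows, and fill defaults are assumptions for illustration only.

# Trim strings, drop duplicate rows, and fill nulls on a toy DataFrame.
from pyspark.sql import SparkSession
from pyspark.sql.functions import trim, col

spark = SparkSession.builder.appName("cleanse-demo").getOrCreate()

df = spark.createDataFrame(
    [(" alice ", 34), ("bob", None), ("bob", None)],  # invented sample data
    ["name", "age"],
)

cleaned = (df
           .withColumn("name", trim(col("name")))  # strip stray whitespace
           .dropDuplicates()                        # remove exact duplicate rows
           .fillna({"age": 0}))                     # default missing ages to 0

cleaned.show()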
TECHNICAL SKILLS:
Big Data Systems: Amazon Web Services (AWS), Azure, Google Cloud Platform (GCP), Cloudera Hadoop, Hortonworks Hadoop, Apache Spark, Spark Streaming, Apache Kafka, Pig, Hive, Amazon S3, AWS Kinesis
Databases: Cassandra, HBase, DynamoDB, MongoDB, BigQuery, SQL, Hive, MySQL, Oracle, PL/SQL, RDBMS, AWS Redshift, Amazon RDS, Teradata, Snowflake
Programming & Scripting: Python, R, Scala, PySpark, SQL, Java, Bash
Web Programming: HTML, CSS, JavaScript, XML
ETL Data Pipelines: Apache Airflow, Sqoop, Flume, Apache Kafka, dbt, Pentaho, SSIS
Visualization: Tableau, Power BI, Amazon QuickSight, Looker
Cloud Platforms: AWS, GCP, Azure
Scheduler Tools: Apache Airflow, Azure Data Factory, AWS Glue, AWS Step Functions
Spark Framework: Spark API, Spark Streaming, Spark Structured Streaming, Spark SQL
CI/CD Tools: Jenkins, GitHub, GitLab
Operating Systems: Windows, Linux, Unix, Mac OS X
PROFESSIONAL EXPERIENCE:
Environment: AWS (EC2, S3, EBS, ELB, RDS, SNS, SQS, VPC, IAM, CloudFormation, CloudWatch, ELK Stack), Ansible, Python, Shell Scripting, PowerShell, Git, Jira, JBoss, Bamboo, SnapLogic, Docker, WebLogic, GCP, Maven, WebSphere, Unix/Linux, AWS X-Ray, DynamoDB, Kinesis, Snowflake, dbt, Data Modeling, Data Warehouse, Power Query, Splunk, SonarQube, Java, Databricks, Bitbucket, Kafka, Spark, Lambda, Hadoop, Tableau, Hive, SQL, Oracle, scheduling tool.
Environment: IBM InfoSphere DataStage 9.1/11.5, Oracle 11g, Flat files, Snowflake, Autosys, GCP, UNIX, Erwin, TOAD, MS SQL Server database, XML files, AWS, MS Access database.
Environment: Python, SQL Server, Oracle, HDFS, HBase, AWS, MapReduce, Hive, Impala, Pig, Sqoop, NoSQL, Tableau, RNN, LSTM, Unix/Linux, Core Java.
Environment: Python, Django, Pandas, REST API, HTML, CSS, JavaScript, AngularJS, Oracle DB, PostgreSQL, MySQL Connector/Python.
Environment: MS SQL Server 2008, SQL Server Business Intelligence Development Studio, R, SAS, Tableau, SSIS 2008, SSRS 2008, Report Builder, Office, Excel, Flat Files, .NET, T-SQL.
Environment: Python, PySpark, Kafka, GitLab, PyCharm, Hadoop, AWS S3, Tableau, Hive, Impala, Flume, Apache NiFi, Java, Shell Scripting, SQL, Sqoop, Oozie, Oracle, SQL Server, HBase, Power BI, Agile Methodology.
Education: