Resume Ankit Anand
SUMMARY
Dedicated and highly skilled Data Engineer with 7 years of experience in designing, implementing, and optimizing
data pipelines and systems. Proficient in data integration, ETL processes, and data warehousing. Adept at leveraging a
wide range of technologies including Spark, HDFS, SQL, Python, Hadoop, and the AWS stack.
PROFESSIONAL EXPERIENCE
Data Engineer | Absolutdata Analytics, Gurugram, India [Jan 2021 - July 2021 ]
• Performed data modelling for a healthcare organisation involving 20+ tables.
• Migrated the database from SQL Server to Azure SQL database.
• Used Alteryx to automate ingestion of data arriving in different file formats (xlsx, csv, ad-hoc master data).
• Built a data ingestion framework in Hive for daily, weekly, and quarterly runs, handling over 70M records/day
across multiple tables.
• Wrote data curation scripts in shell and Hive to extract data and make it readily available for analytics.
Data Engineer | Valiance Analytics, Noida, India [June 2020 - Jan 2021 ]
• Built a Spark SFTP-based pipeline to encrypt data on HDFS, partitioned by year/month/day, with the encryption
module residing on a secure Unix box.
• Deprecated the above pipeline in favour of a new NiFi-based pipeline that encrypts on the fly, integrating the
encryption module as a custom processor.
• Saved terabytes of storage by enabling compression and choosing appropriate compression algorithms and file formats.
• Worked with the schema registry in NiFi to take control over CSV schemas.
• Implemented ELK stack dashboards to monitor logs from the daily automated data-ingestion runs.
• Built a Spark-based solution to validate data correctness and consistency after ingestion.
Data Engineer | Tata Consultancy Services, Gurgaon, India [Dec 2016 - June 2020 ]
• Implemented a lambda-architecture (real-time & batch) pipeline framework using Spark, Kafka, and HBase for the
ingestion of finance data.
• Developed generic Sqoop jobs to handle almost 200 GB of data, leading to major cost savings for the customer.
• Used ORC-format tables with UTF-8 encoding so that all special characters are handled correctly, even in
Hive tables.
• Independently designed a Spark-based solution to parse raw data, populate staging tables, and store processed data
in the data lake, scheduled with Apache Oozie.
• Worked on a production application to ingest multi-line JSON and XML files.
• Wrote complex SQL queries using joins and subqueries to retrieve data from the databases involved.
EDUCATION
Jamia Hamdard