Data Engineering Mock Interview
This video presents a mock interview with a skilled and experienced Data Engineer. Through targeted questions, we explore the techniques, tools, and technologies the interviewee has used in her work, offering a look into the complex and ever-evolving world of data engineering.
The discussion ranges from designing and implementing scalable, high-performance batch processing architectures to working with data processing frameworks and related topics such as #apachespark, #spark, #sql, #aws, #machinelearning, #coding, and #systemdesign.
Our expert guest interviewer, Kuldeep Pal, shares his hard-won knowledge and expertise, offering valuable advice for aspiring data engineers and seasoned professionals alike.
The interviewee, Rashmika, completed her Master of Science at the University of Washington. Previously, she worked at Microsoft as a Software Engineer. She is skilled in big data frameworks, data science, and cloud technologies such as AWS and Azure.
Whether you're just starting out in your career or looking to take your skills to the next level, this interview is an essential resource for anyone interested in the fascinating world of data processing and engineering. So don't miss out - tune in now and discover the secrets of success in this dynamic and exciting field!
To book a mock interview - https://topmate.io/ankur_ranjan
LinkedIn - / thebigdatashow
Instagram - / ranjan_anku
Kuldeep Pal (Interviewer)'s LinkedIn profile - / kuldeep27396
Rashmika Reddy Vookanti (Interviewee)'s LinkedIn profile - / rashmika-reddy
Chapters:
00:00 - Introduction
02:16 - Architecture of Spark
03:56 - Lineage in Spark
05:14 - Directed Acyclic Graph (DAG) in Spark
06:58 - Optimization Technique to avoid Shuffle in Spark
09:18 - Broadcast Variable in Spark and its usage
10:18 - Working of Catalyst Optimizer
11:13 - Difference between RDD, DataFrame & Dataset
12:06 - Which data structure should we use for large data: RDD or DataFrame?
12:50 - Adaptive Query Execution (AQE) in Spark
13:35 - Issues faced while handling Large Dataset
15:10 - What is data quality and ways to check the Data Quality?
19:55 - Why do you choose EMR over AWS Glue?
21:10 - Difference between Airflow and MWAA
22:22 - Working of TF-IDF in Machine Learning
24:12 - Why did you choose ARIMA over other time series models such as LSTM?
26:32 - Process to Debug any code
28:18 - When do we use Window Functions in SQL?
29:12 - SQL Problem 1
35:32 - SQL Problem 2
52:27 - DSA Problem 1
1:12:20 - System Design Question
1:13:18 - Design of SplitWise Application
#interview #dataengineering #bigdata #apachespark #careerswitch #job #mockinterview
#datastructures