The Big Data Challenge
Unravel the mystery of a powerful data processing framework.
by Smt. K. Swathi, Assistant Professor

1. The Big Data Challenge
   Hints:
   1. Its name comes from a toy elephant owned by the child of one of its creators.
   2. It uses a distributed storage system called HDFS.
   3. This framework is known for handling large datasets.

2. The Art of Duplication
   Hints:
   1. Think of DNA and how it makes copies of itself.
   2. Often used in science and technology.
   3. It's the process of making identical copies.

3. Custom functions for specific tasks in Pig.

4. I take key-value pairs from the Map phase and aggregate them. What am I?

5. I remove records that don’t meet certain criteria. Who am I?

6. I am a distributed dataset that supports fault tolerance and parallel processing. What am I?

7. I define how data flows in Spark, but I don’t compute until an action is called. Who am I?

8. I am a distributed collection of data with named columns, often used in Spark SQL. What am I?

9. I am a temporary SQL result that doesn't store data but lets you reuse queries. Who am I?

10. I am a framework designed for fast, distributed data processing. What am I?