Spark RDD to Task mapping
Understanding how RDDs are converted to Tasks
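A minimal sketch of the core rule, assuming the Spark Java API and a local master (both assumptions, not stated in this list): within a stage, Spark schedules one task per partition of the RDD, so the partition count you choose is the task count you get.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.Arrays;

public class RddToTasks {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("RddToTasks").setMaster("local[4]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // Four partitions -> the stage computing this RDD runs as four tasks,
            // one per partition (visible in the Spark UI's stage view).
            JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5, 6, 7, 8), 4);
            System.out.println("partitions = tasks per stage: " + rdd.getNumPartitions());
            System.out.println("sum: " + rdd.reduce(Integer::sum));
        }
    }
}
```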
What's the difference between groupByKey and reduceByKey in Spark?
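A minimal sketch of the contrast, with hypothetical inline data: reduceByKey pre-aggregates values on the map side before the shuffle, while groupByKey ships every raw value across the network and aggregates only afterwards.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

import java.util.Arrays;

public class GroupVsReduce {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("GroupVsReduce").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaPairRDD<String, Integer> pairs = sc.parallelizePairs(Arrays.asList(
                    new Tuple2<>("a", 1), new Tuple2<>("b", 2),
                    new Tuple2<>("a", 3), new Tuple2<>("b", 4)));

            // reduceByKey: values are combined per partition before the shuffle.
            pairs.reduceByKey(Integer::sum)
                 .collect()
                 .forEach(t -> System.out.println("reduceByKey " + t));

            // groupByKey: all raw values are shuffled, then summed afterwards.
            pairs.groupByKey()
                 .mapValues(vals -> {
                     int sum = 0;
                     for (int v : vals) sum += v;
                     return sum;
                 })
                 .collect()
                 .forEach(t -> System.out.println("groupByKey  " + t));
        }
    }
}
```

Both produce the same sums; the difference is that reduceByKey shuffles one partial result per key per partition, which usually makes it the cheaper choice for aggregations.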
Use combineByKey and a map transformation to find the max value for every key in Spark
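A minimal sketch under the same assumptions (Java API, hypothetical "key,value" input lines): a map-style transformation shapes records into pairs, then combineByKey keeps a running max per key.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

import java.util.Arrays;

public class MaxPerKey {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("MaxPerKey").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            // mapToPair: parse "key,value" lines into (key, value) pairs.
            JavaPairRDD<String, Integer> pairs = sc
                    .parallelize(Arrays.asList("a,5", "a,9", "b,3", "b,7", "a,1"))
                    .mapToPair(line -> {
                        String[] parts = line.split(",");
                        return new Tuple2<>(parts[0], Integer.parseInt(parts[1]));
                    });

            // combineByKey: seed each key with its first value, keep the running
            // max within a partition, then max the partial maxima across partitions.
            JavaPairRDD<String, Integer> maxPerKey = pairs.combineByKey(
                    v -> v,        // createCombiner
                    Math::max,     // mergeValue
                    Math::max);    // mergeCombiners

            maxPerKey.collect().forEach(System.out::println);
        }
    }
}
```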
Use groupByKey, then apply custom logic to the aggregated values of each key inside mapPartitions in Spark
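A minimal sketch of that pattern (the average here is a stand-in for whatever the custom logic is): after groupByKey, mapPartitions walks each partition's (key, values) entries, so any per-partition setup can be shared across all the keys it processes.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class GroupThenMapPartitions {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("GroupThenMapPartitions").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaPairRDD<String, Iterable<Integer>> grouped = sc.parallelizePairs(Arrays.asList(
                    new Tuple2<>("a", 1), new Tuple2<>("a", 3),
                    new Tuple2<>("b", 10), new Tuple2<>("b", 20)))
                    .groupByKey();

            // mapPartitions sees every grouped entry in the partition at once;
            // the loop body is where the per-key custom logic goes.
            List<Tuple2<String, Double>> averages = grouped.mapPartitions(it -> {
                List<Tuple2<String, Double>> out = new ArrayList<>();
                while (it.hasNext()) {
                    Tuple2<String, Iterable<Integer>> entry = it.next();
                    long sum = 0, count = 0;
                    for (int v : entry._2()) { sum += v; count++; }
                    out.add(new Tuple2<>(entry._1(), (double) sum / count));
                }
                return out.iterator();
            }).collect();

            averages.forEach(System.out::println);
        }
    }
}
```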
For a given key, collect all the values, which can later be used to apply custom logic (average, max, min, top N, expression evaluation) in Spark
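A minimal sketch with top N as the example logic (any of the others would slot into the same mapValues body): groupByKey gathers every value of a key into one place, and the materialized list is then free game for arbitrary computation.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import scala.Tuple2;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CollectValuesPerKey {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("CollectValuesPerKey").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            sc.parallelizePairs(Arrays.asList(
                    new Tuple2<>("a", 5), new Tuple2<>("a", 9), new Tuple2<>("a", 1),
                    new Tuple2<>("b", 7), new Tuple2<>("b", 3)))
              .groupByKey()
              // All values of the key are now local to this function; swap in
              // average, max, min, or expression evaluation as needed.
              .mapValues(vals -> {
                  List<Integer> list = new ArrayList<>();
                  vals.forEach(list::add);
                  list.sort(Comparator.reverseOrder());
                  return new ArrayList<>(list.subList(0, Math.min(2, list.size()))); // top 2
              })
              .collect()
              .forEach(System.out::println);
        }
    }
}
```

Note the usual caveat: groupByKey materializes every value of a key in memory, so this pattern suits logic that genuinely needs all the values at once.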
Read from Aerospike in a Spark application via mapPartitions
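A minimal sketch using the Aerospike Java client, assuming a cluster at localhost:3000 with a hypothetical namespace "test", set "users", and bin "name" (none of these are given in the topic). mapPartitions lets one AerospikeClient serve every key in a partition instead of opening a connection per record.

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class AerospikeMapPartitionsRead {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("AerospikeMapPartitionsRead").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            List<String> userIds = Arrays.asList("u1", "u2", "u3", "u4");

            List<String> names = sc.parallelize(userIds, 2)
                .mapPartitions(ids -> {
                    // One client per partition, created on the executor side.
                    AerospikeClient client = new AerospikeClient("localhost", 3000);
                    List<String> out = new ArrayList<>();
                    try {
                        while (ids.hasNext()) {
                            Record rec = client.get(null, new Key("test", "users", ids.next()));
                            out.add(rec == null ? null : rec.getString("name"));
                        }
                    } finally {
                        client.close(); // safe: results are already materialized in "out"
                    }
                    return out.iterator();
                })
                .collect();

            names.forEach(System.out::println);
        }
    }
}
```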
Read from Aerospike using Spark via a map transformation
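A minimal sketch of the map-based variant, with the same hypothetical namespace/set/bin as above. Because map runs once per record, the client is held in a static field so each executor JVM opens only one connection (static fields are never serialized with the task closure).

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Key;
import com.aerospike.client.Record;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.Arrays;

public class AerospikeMapRead {
    // Created lazily on each executor JVM; not shipped from the driver.
    private static AerospikeClient client;

    private static synchronized AerospikeClient getClient() {
        if (client == null) {
            client = new AerospikeClient("localhost", 3000);
        }
        return client;
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("AerospikeMapRead").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            sc.parallelize(Arrays.asList("u1", "u2", "u3"))
              .map(id -> {
                  // One get per record; the shared client amortizes connection cost.
                  Record rec = getClient().get(null, new Key("test", "users", id));
                  return id + " -> " + (rec == null ? "missing" : rec.getString("name"));
              })
              .collect()
              .forEach(System.out::println);
        }
    }
}
```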
Read from HDFS and write to Aerospike from Spark via a map transformation
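A minimal sketch assuming an HDFS file of "userId,name" lines at a hypothetical path, writing to the same hypothetical Aerospike cluster as above. Because the put happens inside a map transformation (which is lazy), an action such as count() is needed to actually force the writes.

```java
import com.aerospike.client.AerospikeClient;
import com.aerospike.client.Bin;
import com.aerospike.client.Key;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class HdfsToAerospike {
    private static AerospikeClient client; // one shared client per executor JVM

    private static synchronized AerospikeClient getClient() {
        if (client == null) client = new AerospikeClient("localhost", 3000);
        return client;
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("HdfsToAerospike");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            long written = sc.textFile("hdfs:///data/users.csv") // hypothetical path
                .map(line -> {
                    String[] parts = line.split(",");
                    // Write each record as it flows through the map transformation.
                    getClient().put(null,
                            new Key("test", "users", parts[0]),
                            new Bin("name", parts[1]));
                    return parts[0];
                })
                .count(); // the action that triggers the lazy map (and the writes)
            System.out.println("records written: " + written);
        }
    }
}
```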
Understanding Spark serialization, and along the way when to use lambda functions, static and anonymous classes, and transient references
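A minimal sketch of the pitfalls this topic covers, with hypothetical class and field names: an anonymous class always captures its enclosing instance (which must then be Serializable), a lambda captures the enclosing instance only if it reads one of its fields, a static nested class captures nothing extra, and transient fields are skipped during serialization and must be rebuilt on the executor.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

import java.io.Serializable;
import java.util.Arrays;

public class SerializationDemo implements Serializable {
    private final int offset = 10;

    // transient: not shipped with the closure; null after deserialization,
    // so it would have to be re-created lazily on the executor if used.
    private transient StringBuilder scratch;

    // Static nested class: serializes only its own fields, never an outer instance.
    static class AddOne implements Function<Integer, Integer> {
        @Override
        public Integer call(Integer v) { return v + 1; }
    }

    public void run(JavaSparkContext sc) {
        sc.parallelize(Arrays.asList(1, 2, 3))
          // Lambda: captures "this" only because it reads the "offset" field.
          .map(v -> v + offset)
          // Anonymous class: always drags in the enclosing instance, so the
          // outer class must implement Serializable (it does here).
          .map(new Function<Integer, Integer>() {
              @Override
              public Integer call(Integer v) { return v + offset; }
          })
          // Static nested class: serializes nothing beyond its own state.
          .map(new AddOne())
          .collect()
          .forEach(System.out::println);
    }

    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("SerializationDemo").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            new SerializationDemo().run(sc);
        }
    }
}
```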
Spark map vs mapPartitions
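A minimal sketch of the difference, with hypothetical inline data: map is invoked once per element, while mapPartitions is invoked once per partition with an iterator over all of its elements, which is where per-partition setup cost gets amortized.

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MapVsMapPartitions {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("MapVsMapPartitions").setMaster("local[2]");
        try (JavaSparkContext sc = new JavaSparkContext(conf)) {
            JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 4, 5, 6), 2);

            // map: the function body runs once for every element (6 times here).
            List<Integer> doubled = rdd.map(v -> v * 2).collect();

            // mapPartitions: the function body runs once per partition, so any
            // setup placed here executes twice (2 partitions), not six times.
            List<Integer> tripled = rdd.mapPartitions(it -> {
                List<Integer> out = new ArrayList<>(); // per-partition setup
                while (it.hasNext()) out.add(it.next() * 3);
                return out.iterator();
            }).collect();

            System.out.println("map:           " + doubled);
            System.out.println("mapPartitions: " + tripled);
        }
    }
}
```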