apache-spark apache-flink

What do 3G & 4G of Big Data mean, and what is the difference?


I've read a page comparing Apache Spark and Apache Flink. I don't know what "3G" and "4G" of Big Data mean.

Please explain to me!


Solution

  • They mean 3rd generation and 4th generation. Many publications and websites use these 3G or 4G labels to promote or denigrate a technology by assigning it a certain "generation". Each tool has points for and against it, depending on the problem you are facing. From Hadoop to Flink (and there are many more: Samza, Spark, Storm ...), each has brought something new to the world of Big Data:

    • Calculation on huge volumes of data

    • Ease of use

    • Support for efficient iterative calculation
    • Unification of batch and streaming APIs
    • Support for CEP
    • Full streaming processing
    • Complete compatibility with the Hadoop ecosystem
    • Exactly-once processing guarantees
    • ...

    What others have recommended is true: you should not let these 3G or 4G labels guide your choice of technology. Study your problem thoroughly and get to know the available technologies and tools, or at least classify them by their philosophy and use case. Something old but illustrative is this book. You will form your own idea and classify each tool by your own criteria :) One thing is certain: each tool arrives earlier or later, and each stands out because it takes a different or more appropriate approach to certain problems.