Tags: .net, apache-spark, logging, console, log4j

How to disable all logging info to the spark console from .net application


How can I show my results without all this logging in the console? When I run the application I get output like this, but with many more rows:

20/08/28 13:35:27 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
20/08/28 13:35:27 INFO SparkEnv: Registering OutputCommitCoordinator
20/08/28 13:35:27 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/08/28 13:35:27 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://
20/08/28 13:35:27 INFO SparkContext: Added JAR file:/C:/
20/08/28 13:35:27 INFO Executor: Starting executor ID driver on host localhost

Unfortunately, nothing I tried had any effect. I found the file log4j.properties.template, changed every level in it to WARN, and still have the same issue. I also changed the date-time format to check whether the file was being read at all, but the format stayed the same, i.e. log4j.appender.console.layout.ConversionPattern=%d{yy:MM:dd HH:mm:ss} %p %c{1}: %m%n. That is why I concluded the file was not being read. I also added the following to my .cs file:

var sc = new SparkContext(new SparkConf());
sc.SetLogLevel("WARN");

placed in my program like this:

using Microsoft.Spark;
using Microsoft.Spark.Sql;

namespace mySparkApp
{
    class Program
    {
        static void Main(string[] args)
        {

            //Logger.getLogger("org").setLevel(Level.OFF);
            //Logger.getLogger("akka").setLevel(Level.OFF);

            var sc = new SparkContext(new SparkConf());
            sc.SetLogLevel("WARN");

            // Create a Spark session
            SparkSession spark = SparkSession
                .Builder()
                .AppName("word_count_sample")
                .GetOrCreate();

            // Create initial DataFrame
            DataFrame dataFrame = spark.Read().Text("input.txt");

            // Count words
            DataFrame words = dataFrame
                .Select(Functions.Split(Functions.Col("value"), " ").Alias("words"))
                .Select(Functions.Explode(Functions.Col("words"))
                .Alias("word"))
                .GroupBy("word")
                .Count()
                .OrderBy(Functions.Col("count").Desc());

            // Show results
            words.Show();

            // Stop Spark session
            spark.Stop();
        }
    }
}

I even rebooted my machine, but the problem remains.


Solution

  • You need to rename/copy the file log4j.properties.template in Spark's conf directory to log4j.properties; Spark ignores the .template file.

    If you want to see fewer logs, you can set the logging level to ERROR instead of WARN.

    You can also add these lines to suppress logging of the other errors you got:

    log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
    log4j.logger.org.apache.spark.SparkEnv=ERROR
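
    Put together, the renamed log4j.properties might look like the sketch below. It is based on the console appender from Spark's default template; the pattern and levels are just one reasonable choice, so adjust them to taste.

    # Set everything to ERROR by default, logged to the console appender
    log4j.rootCategory=ERROR, console
    log4j.appender.console=org.apache.log4j.ConsoleAppender
    log4j.appender.console.target=System.err
    log4j.appender.console.layout=org.apache.log4j.PatternLayout
    log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n
    # Silence the specific noisy loggers mentioned above
    log4j.logger.org.apache.spark.util.ShutdownHookManager=OFF
    log4j.logger.org.apache.spark.SparkEnv=ERROR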
    

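
    If Spark still seems to ignore the file, you can point the driver at it explicitly when submitting the app. A sketch, assuming the usual .NET for Apache Spark launch via spark-submit (the file path, jar version, and app DLL name below are placeholders for your own values):

    spark-submit \
      --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=file:/path/to/log4j.properties" \
      --class org.apache.spark.deploy.dotnet.DotnetRunner \
      --master local \
      microsoft-spark-<version>.jar \
      dotnet mySparkApp.dll

    The -Dlog4j.configuration system property tells log4j 1.x exactly which file to load, which removes any doubt about where Spark is looking for its configuration.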