ClassNotFoundException ParquetOutputFormat


I want to create a Parquet file from a payload on CentOS. Here is what I did:

parquetDataSet.write().mode(SaveMode.Append).parquet(tempFile.getAbsolutePath());

Here is the dependency I used:

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.12</artifactId>
    <version>2.4.1</version>
</dependency>

I get the following error. Can you help me with it?

 Servlet.service() for servlet [dispatcherServlet] in context with path [] threw exception [Handler dispatch failed; nested exception is java.util.ServiceConfigurationError: org.apache.spark.sql.sources.DataSourceRegister: Provider org.apache.spark.sql.execution.datasources.parquet.ParquetFileFormat could not be instantiated] with root cause



java.lang.ClassNotFoundException: org.apache.parquet.hadoop.ParquetOutputFormat$JobSummaryLevel
        at java.net.URLClassLoader.findClass(URLClassLoader.java:382) ~[na:1.8.0_212]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424) ~[na:1.8.0_212]
        at org.springframework.boot.loader.LaunchedURLClassLoader.loadClass(LaunchedURLClassLoader.java:92) ~[app.jar:0.0.1-SNAPSHOT]
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357) ~[na:1.8.0_212]
        at java.lang.Class.getDeclaredConstructors0(Native Method) ~[na:1.8.0_212]
        at java.lang.Class.privateGetDeclaredConstructors(Class.java:2671) ~[na:1.8.0_212]
        at java.lang.Class.getConstructor0(Class.java:3075) ~[na:1.8.0_212]
        at java.lang.Class.newInstance(Class.java:412) ~[na:1.8.0_212]

Solution

  • Add the dependency below:

    <dependency>
        <groupId>org.apache.parquet</groupId>
        <artifactId>parquet-hadoop</artifactId>
        <version>1.11.0</version>
    </dependency>
    

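    The missing class, ParquetOutputFormat$JobSummaryLevel, lives in the parquet-hadoop artifact, so the error most likely means that jar is absent from the packaged application or is being resolved to an older version. Declaring it explicitly should get rid of the ClassNotFoundException.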

    I hope this helps.
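
If you want to confirm what is actually on the classpath before or after changing the pom, a small check like the one below may help. It is only a sketch: the class name ClasspathCheck and the method locate are made up for illustration, and it assumes you run it inside the same application (same class loader) that throws the error. It tries to load the outer class and the nested JobSummaryLevel enum without initializing them, and prints which jar each one came from.

```java
import java.security.CodeSource;

// Hypothetical helper, not part of Spark or Parquet.
public class ClasspathCheck {

    /**
     * Returns the location (usually a jar URL) that className was loaded from,
     * "JDK/bootstrap" if the class has no code source, or "NOT FOUND" if the
     * class cannot be loaded.
     */
    public static String locate(String className) {
        try {
            // initialize=false: load the class only, do not run static initializers
            Class<?> cls = Class.forName(className, false,
                    ClasspathCheck.class.getClassLoader());
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            if (src == null || src.getLocation() == null) {
                return "JDK/bootstrap";
            }
            return src.getLocation().toString();
        } catch (ClassNotFoundException | LinkageError e) {
            return "NOT FOUND";
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "org.apache.parquet.hadoop.ParquetOutputFormat",
            "org.apache.parquet.hadoop.ParquetOutputFormat$JobSummaryLevel"
        };
        for (String name : names) {
            System.out.println(name + " -> " + locate(name));
        }
    }
}
```

If the outer class is found but the nested one is not, an older parquet-hadoop jar is probably winning on the classpath, and the printed location tells you which one. If both are "NOT FOUND", the jar is not being packaged at all.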