pyspark, palantir-foundry, graphframes

Using the GraphFrames library in Palantir Foundry


I want to use the GraphFrames package with PySpark in my Foundry code repository, following the documentation here: https://www.palantir.com/docs/foundry/transforms-python/environment-troubleshooting/#packages-which-require-both-a-conda-package-and-a-jar

I added the graphframes package to the list of conda libraries to be installed, but I also need the server-side JAR to be installed when the Spark session is initialized. So I went to transforms-python/build.gradle, and I have the following code:

// DO NOT MODIFY THIS FILE
buildscript {
    repositories {
        maven {
            credentials {
                username ''
                password project.transformsBearerToken
            }
            authentication {
                basic(BasicAuthentication)
            }
            url project.transformsMavenProxyRepoUri
        }
    }

    dependencies {
        classpath "com.palantir.transforms:transforms-gradle-plugin:${transformsVersion}"
    }
}

apply plugin: 'com.palantir.transforms-defaults'

dependencies { 
    condaJars 'graphframes:graphframes:0.8.1-spark3.0-s_2.12' 
}

I then saved the changes and reloaded the page to apply them, but I get a Code Assist error:

FAILURE: Build failed with an exception.

* Where:
Build file '/scratch/standalone/1c8fbb49-de4d-4c21-8081-47c92748189a/code-assist/contents/build.gradle' line: 24

* What went wrong:
A problem occurred evaluating root project 'feature-generation'.
> Could not find method condaJars() for arguments [graphframes:graphframes:0.8.1-spark3.0-s_2.12] on object of type org.gradle.api.internal.artifacts.dsl.dependencies.DefaultDependencyHandler.

* Try:
Run with --info or --debug option to get more log output. Run with --scan to get full insights.

Does anyone have any idea why and how to fix this?


Solution

  • Almost there. The link you provided has the clue:

    "Select the option to Show hidden files and folders in the Settings cog, and select the inner transforms-python/build.gradle file. At the bottom of the file, add the following block:"

    The file you edited is the root build.gradle (the one marked DO NOT MODIFY THIS FILE, and the one named in the error message), which is why Gradle cannot find condaJars there. So:

    1. Check that the version of the JAR matches the version you added through Conda/PyPI: conda has the most recent version of 0.7.32, whereas PyPI has 0.6.
    2. Move the dependencies entry to the inner transforms-python/build.gradle.

    As you are probably aware, the Maven coordinates can be found here: https://mvnrepository.com/artifact/graphframes/graphframes
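
As a rough sketch of step 2, the end of the inner transforms-python/build.gradle might look something like the following. The coordinate is simply the one from the question; it is an assumption that this version matches the conda package and the Spark/Scala version of your environment, so adjust it as needed. The rest of the file is not shown here and should stay as it is.

```groovy
// Inner transforms-python/build.gradle, NOT the root build.gradle
// that starts with "DO NOT MODIFY THIS FILE".
//
// ... existing contents of the file stay unchanged above this point ...

// Added at the bottom of the file: pulls in the GraphFrames JAR
// alongside the conda package. The version shown is the one from the
// question and is an assumption; it should match the conda package
// version you installed.
dependencies {
    condaJars 'graphframes:graphframes:0.8.1-spark3.0-s_2.12'
}
```

The root build.gradle should then be restored to its original state, without the dependencies block.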