

pyspark window-functions

How to run arbitrary / DDL SQL statements or stored procedures using AWS Glue...


pyspark aws-glue py4j

Is there a way to use a map/dict in Pyspark to avoid CASE WHEN condition equals pairs?...


apache-spark pyspark apache-spark-sql configparser

pyspark : NameError: name 'spark' is not defined...


apache-spark machine-learning pyspark distributed-computing apache-spark-ml

Importing RIDs from a dataset with column RIDs with Palantir Foundry Code Repository...


python pyspark palantir-foundry foundry-code-repositories

Python default dictionary seems to be giving duplicate key - what is happening?...


python python-2.7 pyspark defaultdict

Get pyspark corrupt records reason...


json pyspark

How to modify pyspark dataframe nested struct column...


dataframe apache-spark pyspark struct apache-spark-sql

How to update a value in the nested column of struct using pyspark...


python apache-spark pyspark apache-spark-sql

Is there a way to see TQDM progress bars while using PySpark?...


python pyspark jupyter-lab tqdm

How to achieve Column mapping just like in ADF in Databricks...


apache-spark pyspark azure-data-factory databricks azure-databricks

PySpark withColumn() function doesn't recognize hierarchical structure...


json apache-spark pyspark databricks

Read multiple CSV files with different number of columns for each CSV file...


pyspark apache-spark-sql

Is there a temporary folder that I can access while using AWS Glue?...


amazon-web-services pyspark aws-glue

Not able to write spark dataframe. Error Found nested NullType in column 'colname' which is ...


python pandas apache-spark pyspark apache-spark-sql

String type order change and remove a specific character using Pyspark...


pyspark

XGBoost model running out of memory in Databricks/PySpark...


pyspark databricks xgboost

How does spark show the output of a dataframe even though the table from which the df is based on is...


sql dataframe pyspark databricks

alias for count in Pyspark...


count pyspark alias

Pyspark code to remove a column within a complex Json schema...


dataframe pyspark databricks

How can I interpolate missing values based on the sum of the gap using pyspark?...


dataframe pyspark data-cleaning linear-interpolation

Replace rows with nearest time using pyspark...


python apache-spark pyspark apache-spark-sql

reusing the same dataframe via cache...


apache-spark pyspark

Replace parts of dataframe values based on values in another dataframe...


dataframe apache-spark pyspark replace databricks

'spark.jars.packages' not working as expected in AWS Glue and Spark...


pyspark jar snowflake-cloud-data-platform aws-glue aws-glue-connection

How to sum row wise data using single column in pyspark...


pyspark databricks

Pyspark - Repeat value until change in column...


python dataframe apache-spark pyspark apache-spark-sql

Return rows with last updated date for different days...


sql pyspark partition days

How to remove all copies of duplicates from pyspark dataframe...


dataframe pyspark duplicates

TypeError: 'JavaPackage' object is not callable for XGBoost in PySpark...


scala pyspark xgboost
