Blog .NET for Apache Spark UDFs Missing Shared State

,

The Problem When you use a UDF in .NET for Apache Spark, something like this code:

class Program { static void Main(string[] args) { var spark = SparkSession.Builder().GetOrCreate(); _logging.AppendLine("Starting Select"); var udf = Functions.Udf<int, string>(theUdf); spark.Range(100).Select(udf(Functions.Col("id"))).Show(); _logging.AppendLine("Ending Select"); Console.WriteLine(_logging.ToString()); } private static readonly StringBuilder _logging = new StringBuilder(); private static string theUdf(int val) { _logging.AppendLine($"udf passed: {val}"); return $"udf passed {val}"; } } Generally, knowing .NET we would expect the following output:

Original post (opens in new tab)

Rate

Share

Share

Rate