This repository was archived by the owner on Dec 20, 2018. It is now read-only.

Conversation

@jon-morra-zefr

As outlined in #235, it is desirable to get Avro serialization working with UserDefinedTypes. This PR addresses that concern and allows serialization of UserDefinedTypes. I wasn't able to write any tests, but I'm sure that can be done easily; this works in my local projects.
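For context, a minimal sketch of the round trip this change is meant to enable. The paths and the Parquet input are hypothetical; the format name "com.databricks.spark.avro" is spark-avro's usual entry point:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("udt-avro-demo").getOrCreate()

// Assume df has at least one column whose Catalyst type is a
// UserDefinedType (e.g. an enum-backed UDT, as discussed below).
val df = spark.read.parquet("/tmp/udt-input")

// Without this PR the Avro converter has no case for UDTs and fails;
// with it, the UDT column is written using its underlying sqlType.
df.write.format("com.databricks.spark.avro").save("/tmp/udt-output")
```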

  */

- package com.databricks.spark.avro
+ package org.apache.spark.avro
@jon-morra-zefr (Author)
In order to reference UserDefinedType, which is private[spark] in Spark 2.x, we have to change the package name so that this file compiles inside the org.apache.spark package tree.
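For reference, the Spark 2.x declaration in org.apache.spark.sql.types is approximately the following, which is why it can only be referenced from code living under the org.apache.spark package:

```scala
private[spark] abstract class UserDefinedType[UserType >: Null]
  extends DataType with Serializable
```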

-      FloatType | DoubleType | StringType | BooleanType => identity
-  case _: DecimalType => (item: Any) => if (item == null) null else item.toString
+      FloatType | DoubleType | BooleanType => identity
+  case _: DecimalType | StringType => (item: Any) => if (item == null) null else item.toString
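For readers following the diff, here is a sketch of the idea behind the UDT handling, as a hypothetical standalone helper rather than the verbatim patch (in spark-avro the actual converter dispatch happens in createConverterToAvro): unwrap a UDT to its underlying sqlType before choosing a converter.

```scala
package org.apache.spark.avro

import org.apache.spark.sql.types.{DataType, UserDefinedType}

// Hypothetical helper: reduce a Catalyst type to the type the Avro
// converter should actually handle, unwrapping nested UDTs if any.
// It must live under org.apache.spark because UserDefinedType is
// private[spark] (hence the package rename above).
object UdtResolution {
  def resolveAvroType(dataType: DataType): DataType = dataType match {
    case udt: UserDefinedType[_] => resolveAvroType(udt.sqlType)
    case other => other
  }
}
```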
@jon-morra-zefr (Author)

My use case for this is serializing Enum[_] types, which I represent as strings on the backend. However, in my tests the writer was still presenting the native enum type to the native Avro reader. By moving StringType out of the identity branch and explicitly calling .toString, I'm able to get around this issue, assuming the UserDefinedType for my enums uses StringType as its sqlType.
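To make that concrete, here is an illustrative UDT of the kind this comment assumes; the class and package names are hypothetical, and it must sit under org.apache.spark for the same visibility reason noted above. Because its sqlType is StringType, the new DecimalType | StringType branch calls .toString and the Avro reader sees a plain string:

```scala
package org.apache.spark.example

import org.apache.spark.sql.types.{DataType, StringType, UserDefinedType}
import org.apache.spark.unsafe.types.UTF8String

// Illustrative UDT storing a Java enum by its name() string.
class EnumUDT[E >: Null <: Enum[E]](clazz: Class[E])
    extends UserDefinedType[E] {

  override def sqlType: DataType = StringType

  // Catalyst's internal representation of StringType is UTF8String.
  override def serialize(obj: E): Any = UTF8String.fromString(obj.name())

  override def deserialize(datum: Any): E =
    Enum.valueOf(clazz, datum.toString)

  override def userClass: Class[E] = clazz
}
```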

@zeitan

zeitan commented Jun 28, 2018

Question: is there any plan to merge this improvement?
I'm hitting what looks like a problem deserializing an Avro enum field in Spark, and I think this is the reason.
