Currently, ImageNet-pretrained encoders are used. It might be a good idea to load encoder weights pretrained on our own dataset instead.
A simple solution would be to use the EfficientUnet++ architecture and cut all skip connections between the encoder and decoder, creating an autoencoder. Its encoder could then be trained on our data by reconstruction, and the resulting weights reused.
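Reusing the weights means pulling only the encoder parameters out of the trained autoencoder's checkpoint. The following is a minimal sketch of that step, not an existing part of this project. It assumes the checkpoint is a flat name-to-tensor mapping (as a PyTorch `state_dict` is) and that the encoder parameters live under an `encoder.` prefix. The function name `extract_encoder_state` and the parameter names in the toy checkpoint are hypothetical.

```python
from collections import OrderedDict
from typing import Any, Mapping


def extract_encoder_state(
    autoencoder_state: Mapping[str, Any], prefix: str = "encoder."
) -> "OrderedDict[str, Any]":
    """Keep only the entries stored under `prefix` and strip the prefix.

    Assumption: encoder parameters are named "<prefix><param name>".
    """
    encoder_state = OrderedDict(
        (name[len(prefix):], value)
        for name, value in autoencoder_state.items()
        if name.startswith(prefix)
    )
    if not encoder_state:
        # Fail loudly instead of silently loading nothing.
        raise KeyError(f"no parameters found under prefix {prefix!r}")
    return encoder_state


# Toy stand-in for an autoencoder checkpoint; real values would be tensors
# and the parameter names below are purely illustrative.
checkpoint = {
    "encoder.conv_stem.weight": "w0",
    "encoder.bn1.bias": "b0",
    "decoder.blocks.0.conv.weight": "w1",
}
encoder_state = extract_encoder_state(checkpoint)
```

With PyTorch, the result could then be passed to the segmentation model's encoder via `load_state_dict`, provided both models use the same encoder architecture. Decoder weights are discarded deliberately: without skip connections the autoencoder's decoder sees different inputs than the segmentation decoder would, so only the encoder is likely to transfer.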