Great article, colleague and friend. The only thing I would add is that, depending on the cluster size (number of cores and executors), the OPTIMIZE command may write files smaller than 1 GB. Rest assured that the resulting file size will be optimal for your cluster configuration.

Nnaemezue Obi-Eyisi

Written by Nnaemezue Obi-Eyisi

I am passionate about empowering, educating, and encouraging individuals pursuing a career in data engineering. I am currently a Senior Data Engineer at Capgemini.
