NetCDF Issues
- Travis Simmons
- Mar 13, 2024
I am running a large-scale pipeline for a NASA/CNES satellite. We use a NetCDF file as our database and access it thousands of times a second, and we are running into locking errors. I need to see whether I can simply disable locking on the files via environment variables.
I have used the unlocking environment variable locally when a file wasn't closed properly (usually after a run errored out), but I would really love to move to a more appropriate database that supports parallel reads and writes.
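For anyone wondering what that looks like in practice, here is a minimal sketch. It assumes the files are NetCDF-4 (HDF5-backed) and that the variable in question is `HDF5_USE_FILE_LOCKING`, which the HDF5 library reads; the post above does not name the variable, so treat that as my assumption. The file name in the comments is hypothetical.

```python
import os


def disable_hdf5_file_locking() -> str:
    """Turn off HDF5 file locking for this process and its children.

    Assumption: the NetCDF files are NetCDF-4 (HDF5-backed), so the
    HDF5 library's HDF5_USE_FILE_LOCKING variable applies to them.
    """
    os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"
    return os.environ["HDF5_USE_FILE_LOCKING"]


# Set the variable first; doing it before any HDF5-backed library is
# imported is the safest ordering.
setting = disable_hdf5_file_locking()
print(f"HDF5_USE_FILE_LOCKING={setting}")

# Only after this point would the reader library be imported, e.g.:
# import netCDF4
# ds = netCDF4.Dataset("rivers.nc", "r")  # hypothetical file name
```

Note that this only removes the lock check. It does not make concurrent writes to the same file safe, which is why a database built for parallel access is still the better long-term answer.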
We are scaling up to tens of thousands of rivers now, so this is an interesting time.
We have had some issues with parallel processing as well. We currently use AWS Batch to launch jobs on Fargate. We have a vCPU cap of 4k, which we hit pretty quickly when running in parallel. You would assume that with a vCPU cap of 4k you could launch 4k 1-vCPU jobs. It is not so! Fargate actually provisions 2x what you ask for no matter what, so you can only run 2k.
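The arithmetic is simple, but it is easy to forget when sizing a batch. A small sketch, where the function name is made up for illustration and the factor of 2 is just what we observed, not a documented constant:

```python
def max_concurrent_jobs(vcpu_cap: int, vcpus_per_job: int,
                        provisioning_factor: int = 2) -> int:
    """Estimate how many jobs fit under an account-level vCPU cap.

    provisioning_factor is the multiplier applied to each request;
    the default of 2 is an observed value, not a documented one.
    """
    effective_vcpus_per_job = vcpus_per_job * provisioning_factor
    return vcpu_cap // effective_vcpus_per_job


# A 4k cap with 1-vCPU jobs: 4000 // (1 * 2)
print(max_concurrent_jobs(4000, 1))  # 2000
```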
We could move to EC2 instances, but that may be more trouble than it is worth.


