Different float precision for the subhalo catalogue from API or hdf5 files

Yan Xiang Lai
  • 16 Dec '25

Hi. I am trying to select spiral galaxies from the TNG50-1 simulation. I was testing my code by loading the data through the API first. Once I finished debugging, I downloaded the TNG catalogue to the local supercomputer and loaded the data directly from the HDF5 file.

However, I discovered that data from the API is in float64 format, whereas the same data from the HDF5 file is in float32 format. Strangely, for gas and stellar particles, both the API and the HDF5 files return float32 data. I have attached my code below.

For galaxy_id = 8 (zero-indexed), the values of "SubhaloPos", "SubhaloVel", and "SubhaloHalfmassRadStars" are [8070.64 24389.4 20882.3], [-913.312 435.268 270.353], and 3.2534 from the API, and [8070.6353 24389.377 20882.332], [-913.3116 435.26785 270.35333], and 3.2533743 from the HDF5 file. The data from the API has a higher float precision (float64) than that from the HDF5 file (float32). I guess I should use the values from the HDF5 file, since the documentation states that all data is saved in float32 precision, but I am not sure what causes the difference. The small offset between the API and HDF5 values can change the inclination by more than 10 degrees when using the formulae from sections III.C and III.D in this paper.

Code for loading in API data:

import requests

# Method of my API client class; self.base_url, self.simulation and
# self.headers are set elsewhere on the instance.
def get_api_data(self, endpoint, params=None):
    """
    General method to get data from TNG API
    Parameters:
    -----------
    endpoint : str
        API endpoint (e.g., 'snapshots/99/subhalos/')
    params : dict
        Query parameters

    Returns:
    --------
    dict or None : API response data, or None if the request failed
    """
    url = f"{self.base_url}{self.simulation}/{endpoint}"

    try:
        response = requests.get(url, headers=self.headers, params=params)
        response.raise_for_status()
        # JSON numbers are parsed into Python floats (float64)
        return response.json()
    except requests.RequestException as e:
        print(f"API request failed: {e}")
        return None

Code for loading in HDF5 files:

import illustris_python as il

# Runs inside a method of the same class (hence self.pardict, self.snapshot)
subhalo_detail = il.groupcat.loadSubhalos(
    self.pardict["TNG_dir"],
    self.snapshot,
    fields=[
        "SubhaloPos",
        "SubhaloVel",
        "SubhaloFlag",
        "SubhaloMass",
        "SubhaloMassType",
        "SubhaloLen",
        "SubhaloLenType",
        "SubhaloHalfmassRad",
        "SubhaloHalfmassRadType",
        "SubhaloSFR",
        "SubhaloVmax",
        "SubhaloVelDisp",
        "SubhaloSpin",
        "SubhaloGasMetallicity",
        "SubhaloStarMetallicity",
    ],
)

Dylan Nelson
  • 24 Dec '25
  1. The data in the HDF5 files is the original and the most accurate. You should always use it if such precision issues are relevant.

  2. The API returns data in many different ways. If you mean the info.json endpoint as in

https://www.tng-project.org/api/TNG50-1/snapshots/99/subhalos/8/

then this endpoint, at the moment you request it, simply loads the corresponding data from the HDF5 file and converts it to JSON. This conversion to txt/json pays no particular attention to precision.

Note that fields like velocities are only stored in float32, so while the info.json may appear to return more significant digits, it is not actually float64.
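
To illustrate the point above, here is a minimal Python sketch using the SubhaloPos x-coordinate quoted earlier in this thread. The six-significant-digit formatting is only an assumption chosen to mimic a lossy text conversion; it is not necessarily what the API does internally. The variable names are hypothetical.

```python
import json
import numpy as np

# Value as stored on disk (float32), taken from the numbers quoted in this thread.
stored = np.float32(8070.6353)

# Upcasting is exact: every float32 value is representable as a float64,
# so casting back to float32 recovers the stored value bit for bit.
as_f64 = np.float64(stored)
roundtrip_ok = np.float32(as_f64) == stored

# ASSUMPTION: a text conversion that keeps only 6 significant digits.
text_6 = f"{as_f64:.6g}"

# Parsing JSON text yields a Python float (float64), but the digits
# dropped by the text conversion are not recovered.
parsed = json.loads(text_6)
offset = abs(parsed - float(stored))

print(roundtrip_ok, text_6, type(parsed).__name__, offset)
```

The parsed value has a float64 dtype, yet it differs from the stored float32 value because digits were lost in the text step, not because the HDF5 data is less accurate.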

Yan Xiang Lai
  • 6 Jan

Hi Dylan. Thanks for your response. The inclination angle calculation is very sensitive to floating-point precision, so I now manually cast the variables from the HDF5 files from float32 to float64 first. After this, the API and HDF5 data return identical results.
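
For anyone wanting to do the same, a minimal sketch of such a cast is below. It assumes the catalogue is a dict of NumPy arrays (plus a "count" entry), which is how the result of loadSubhalos is used above. The helper name upcast_float_fields and the stand-in values (including the SubhaloLen entry) are hypothetical and only there to make the example runnable without the catalogue files.

```python
import numpy as np

def upcast_float_fields(catalog):
    """Return a new dict in which float32 arrays are cast to float64.

    Non-array entries and arrays of other dtypes are passed through unchanged.
    """
    out = {}
    for key, value in catalog.items():
        if isinstance(value, np.ndarray) and value.dtype == np.float32:
            out[key] = value.astype(np.float64)
        else:
            out[key] = value
    return out

# Stand-in for a loaded catalogue (placeholder data, not real TNG output
# apart from the position quoted earlier in this thread).
catalog = {
    "count": 1,
    "SubhaloPos": np.array([[8070.6353, 24389.377, 20882.332]], dtype=np.float32),
    "SubhaloLen": np.array([1000], dtype=np.int32),
}

catalog64 = upcast_float_fields(catalog)
print(catalog64["SubhaloPos"].dtype, catalog64["SubhaloLen"].dtype)
```

The cast does not add information to the stored values; it only ensures that later arithmetic on them is carried out in float64.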
