float_params2

MATLAB Code for Parameters of Floating-Point Arithmetics

Marco Cococcioni

Version 1.0.1 (2.66 KB)

20 downloads

10 Jun 2021

`float_params2` is a MATLAB function for obtaining the parameters of several floating-point arithmetics. The parameters are built into the code and are not computed at run time.

The parameters are

- the unit roundoff,

- the smallest positive (subnormal) floating-point number,

- the smallest positive normalized floating-point number,

- the largest floating-point number,

- the number of binary digits in the significand (including the implicit leading bit)

and the arithmetics supported are

- bfloat8,

- bfloat16,

- IEEE half precision (fp16),

- IEEE single precision (fp32),

- IEEE double precision (fp64),

- IEEE quadruple precision (fp128).
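For readers who want to see how the five parameters relate to each other, here is a minimal Python sketch that derives them from the precision `p` and the maximum exponent `emax`. It is an illustration only: `float_params2` itself stores the values rather than computing them, and the names `FORMATS` and `float_parameters` are hypothetical. The `(p, emax)` pairs are the standard ones for each format; bfloat8 is left out because its parameters are not stated on this page.

```python
from fractions import Fraction

# (p, emax): significand bits including the implicit leading bit, and the
# largest exponent. Standard values; not taken from float_params2.
FORMATS = {
    "bfloat16": (8, 127),
    "fp16": (11, 15),
    "fp32": (24, 127),
    "fp64": (53, 1023),
    "fp128": (113, 16383),
}


def float_parameters(p, emax):
    """Return the five parameters as exact rationals (hypothetical helper)."""
    two = Fraction(2)
    emin = 1 - emax                            # smallest normalized exponent
    u = two ** (-p)                            # unit roundoff
    xmin = two ** emin                         # smallest positive normalized number
    xmins = xmin * two ** (1 - p)              # smallest positive subnormal number
    xmax = two ** emax * (2 - two ** (1 - p))  # largest finite number
    return {"u": u, "xmins": xmins, "xmin": xmin, "xmax": xmax, "p": p}


half = float_parameters(*FORMATS["fp16"])
print(float(half["xmax"]))  # 65504.0
```

Exact rational arithmetic is used because the fp128 values fall outside the range of a double, so ordinary floats would overflow or underflow.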

The code was developed in MATLAB R2020a and works with releases at least as far back as R2016b.

This is a small extension of Nick Higham's float_params, to which I added support for the 8-bit Brain Float (bfloat8), as proposed at Intel by Naveen K. Mellempudi. More details can be found here: https://arxiv.org/abs/1905.12334

I also renamed NVIDIA's tf32 to tf19, to reflect that the format occupies 19 bits in total.
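The 19-bit figure can be checked from the bit layout. The Python sketch below assumes the commonly published field widths for each format (for NVIDIA's TF32: 1 sign bit, 8 exponent bits, 10 stored fraction bits). The names `LAYOUTS` and `storage_bits` are illustrative and are not part of `float_params2`.

```python
# (exponent bits, p), where p counts the implicit leading bit.
# Field widths are the commonly published ones, assumed here.
LAYOUTS = {
    "bfloat16": (8, 8),
    "fp16": (5, 11),
    "tf32": (8, 11),
    "fp32": (8, 24),
}


def storage_bits(exponent_bits, p):
    # 1 sign bit + exponent field + (p - 1) explicitly stored fraction bits.
    return 1 + exponent_bits + (p - 1)


for name, (exponent_bits, p) in LAYOUTS.items():
    print(name, storage_bits(exponent_bits, p))
```

Under these assumptions tf32 comes out at 19 bits, while bfloat16 and fp16 both come out at 16.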

Cite As

Marco Cococcioni (2026). float_params2 (https://it.mathworks.com/matlabcentral/fileexchange/93835-float_params2), MATLAB Central File Exchange. Retrieved June 8, 2026.

Acknowledgements

Inspired by: float_params

MATLAB Release Compatibility

Compatible with any release

Platform Compatibility

Windows
macOS
Linux

Version	Published	Release Notes
1.0.1	10 Jun 2021	very small update
1.0.0	10 Jun 2021	