documentation¶

OCML User Guide¶

What Is OCML¶

OCML is an LLVM-IR bitcode library designed to relieve language compiler and runtime implementers of the burden of implementing efficient and accurate mathematical functions. It is essentially a “libm” in intermediate representation with a fixed, simple API that can be linked in to supply the implementations of most standard low-level mathematical functions provided by the language.

Using OCML¶

Standard Usage¶

OCML is expected to be used in a standard LLVM compilation flow as follows:

Compile source modules to LLVM-IR bitcode (clang)

Link program bitcode, “wrapper” bitcode, OCML bitcode, and OCML control functions (llvm-link)

Generic optimizations (opt)

Code generation (llc)

Here, “wrapper” bitcode denotes a thin library responsible for mapping mangled built-in function calls as produced by clang to the OCML API. An example in C might look like

inline float sqrt(float x) { return __ocml_sqrt_f32(x); }

The next section describes OCML controls and how to make them.

Controls¶

OCML supports a number of controls that are provided by linking in specifically named inline functions. These functions are inlined at optimization time and result in specific paths taken with no control flow overhead. These functions all have the form (in C)

__attribute__((always_inline, const)) int
__oclc_control(void)
{ return 1; } // or 0 to disable

The currently supported control are

finite_only_opt - floating point Inf and NaN are never expected to be consumed or produced unsafe_math_opt - lower accuracy results may be produced with higher performance daz_opt - subnormal values consumed and produced may be flushed to zero correctly_rounded_sqrt32 - float square root must be correctly rounded ISA_version - an integer representation of the ISA version of the target device

Versioning¶

OCML ships as a single LLVM-IR bitcode file named

ocml-{LLVM rev}-{OCLM rev}.bc

where {LLVM rev} is the version of LLVM used to create the file, of the form X.Y, e.g. 3.8, and {OCML rev} is the OCML library version of the form X.Y, currently 0.9.

Tables¶

Some OCML functions require access to tables of constants. These tables are currently named with the prefix __ocmltbl_ and are placed in LLVM address space 2.

Naming convention¶

OCML functions follow a simple naming convention:

__ocml_{function}_{type suffix}

where {function} is generally the familiar libm name of the function, and {type suffix} indicates the type of the floating point arguments or results, and is one of

f16 – 16 bit floating point (half precision) f32 – 32 bit floating point (single precision) f64 – 64 bit floating point (double precision)

For example, __ocml_sqrt_f32 is the name of the OCML single precision square root function.

OCML does not currently support higher than double precision due to the lack of support on most devices.

Supported functions¶

The following table contains a list of {function} currently supported by OCML, a brief description of each, and the maximum relative error in ULPs for each floating point type. A “c” in the last 3 columns indicates that the function is required to be correctly rounded.

{function}	Description	f32 max err	f64 max err	f16 max err
acos	arc cosine	4	4	2
acosh	arc hyperbolic cosine	4	4	2
acospi	arc cosine / π	5	5	2
add_{rm}	add with specific rounding mode	c	c	c
asin	arc sine	4	4	2
asinh	arc hyperbolic sin	4	4	2
asinpi	arc sine / pi	5	5	2
atan2	two argument arc tangent	6	6	2
atan2pi	two argument arc tangent / pi	6	6	2
atan	single argument arc tangent	5	5	2
atanh	arc hyperbolic tangent	5	5	2
atanpi	single argument arc tangent / pi	5	5	2
cbrt	cube root	2	2	2
ceil	round upwards to integer	c	c	c
copysign	copy sign of second argument to absolute value of first	0	0	0
cos	cosine	4	4	2
cosh	hyperbolic cosine	4	4	2
cospi	cosine of argument times pi	4	4	2
div_{rm}	correctly rounded division with specific rounding mode	c	c	c
erf	error function	16	16	4
erfc	complementary error function	16	16	4
erfcinv	inverse complementary error function	7	8	3
erfcx	scaled error function	6	6	2
erfinv	inverse error function	3	8	2
exp10	10x	3	3	2
exp2	2x	3	3	2
exp	ex	3	3	2
expm1	ex - 1, accurate at 0	3	3	2
fabs	absolute value	0	0	0
fdim	positive difference	c	c	c
floor	round downwards to integer	c	c	c
fma[_{rm}]	fused (i.e. singly rounded) multiply-add, with optional specific rounding	c	c	c
fmax	maximum, avoids NaN	0	0	0
fmin	minimum, avoids NaN	0	0	0
fmod	floating point remainder	0	0	0
fpclassify	classify floating point
fract	fractional part	c	c	c
frexp	extract significand and exponent	0	0	0
hypot	length, with overflow control	4	4	2
i0	modified Bessel function of the first kind, order 0, I0	6	6	2
i1	modified Bessel function of the first kind, order 1, I1	6	6	2
ilogb	extract exponent	0	0	0
isfinite	tests finiteness
isinf	test for Inf
isnan	test for NaN
isnormal	test for normal
j0	Bessel function of the first kind, order 0, J0	6 (<12)	6 (<12)	2 (<12)
j1	Bessel function of the first kind, order 1, J1	6 (<12)	6 (<12)	2 (<12)
ldexp	multiply by 2 raised to an integral power	c	c	c
len3	three argument hypot	2	2	2
len4	four argument hypot	2	2	2
lgamma	log Γ function	6(>0)	4(>0)	3(>0)
lgamma_r	log Γ function with sign	6(>0)	4(>0)	3(>0)
log10	log base 10	3	3	2
log1p	log base e accurate near 1	2	2	2
log2	log base 2	3	3	2
log	log base e	3	3	2
logb	extract exponent	0	0	0
mad	multiply-add, implementation defined if fused	c	c	c
max	maximum without special NaN handling	0	0	0
maxmag	maximum magnitude	0	0	0
min	minimum without special NaN handling	0	0	0
minmag	minimum magnitude	0	0	0
modf	extract integer and fraction	0	0	0
mul_{rm}	multiply with specific rounding mode	c	c	c
nan	produce a NaN with a specific payload	0	0	0
ncdf	standard normal cumulateive distribution function	16	16	4
ncdfinv	inverse standard normal cumulative distribution function	16	16	4
nearbyint	round to nearest integer (see also rint)	0	0	0
nextafter	next closest value above or below	0	0	0
pow	general power	16	16	4
pown	power with integral exponent	16	16	4
powr	power with positive floating point exponent	16	16	4
rcbrt	reciprocal cube root	2	2	2
remainder	floating point remainder	0	0	0
remquo	floating point remainder and lowest integral quotient bits	0	0	0
rhypot	reciprocal hypot	2	2	2
rint	round to nearest integer	c	c	c
rlen3	reciprocal len3	2	2	2
rlen4	reciprocal len4	2	2	2
rootn	nth root	16	16	4
round	round to integer, always away from 0	c	c	c
rsqrt	reciprocal square root	2	2	1
scalb	multiply by 2 raised to a power	c	c	c
scalbn	multiply by 2 raised to an integral power (see also ldexp)	c	c	c
signbit	nonzero if argument has sign bit set
sin	sine function	4	4	2
sincos	simultaneous sine and cosine evaluation	4	4	2
sincospi	sincos function of argument times pi	4	4	2
sinh	hyperbolic sin	4	4	2
sinpi	sine of argument times pi	4	4	2
sqrt	square root	3/c	3/c	c
sub_{rm}	subtract with specific rounding mode	c	c	c
tan	tangent	5	5	2
tanh	hyperbolic tangent	5	5	2
tanpi	tangent of argument times pi	6	6	2
tgamma	true Γ function	16	16	4
trunc	round to integer, towards zero	c	c	c
y0	Bessel function of the second kind, order 0, Y0	2 (<12)	6 (<12)	6 (<12)
y1	Bessel function of the second kind, order 1, Y1	2 (<12)	6 (<12)	6 (<12)

For the functions supporting specific roundings, the rounding mode {rm} can be one of

rte – round towards nearest even

rtp – round towards positive infinity

rtn – round towards negative infinity

rtz – round towards zero