TornadoVM Flags

TornadoVM provides runtime and compiler flags to enable experimental features, tuning, and profiling. These flags fall into two categories:

  1. JVM Flags (passed with the -D prefix via the –jvm option)

  2. TornadoVM CLI Flags (passed directly to the tornado Python wrapper)

Note

In the examples below, s0 refers to a task graph and t0 to a specific task within that graph.

Example Usage

$ tornado --jvm "-Dtornado.fullDebug=true" -m tornado.examples/uk.ac.manchester.examples.compute.Montecarlo 1024

Debugging and Logging

CLI Flags

Flag

Description

--fullDebug

Enables full debug mode (maps to -Dtornado.fullDebug=true).

--debug

Enables basic debug output such as compilation status and device info.

--printKernel

Prints generated OpenCL/PTX/SPIR-V kernels.

--threadInfo

Displays the number of threads used.

--devices

Lists available hardware devices.

JVM Flags

Flag

Description

-Dtornado.fullDebug=true

Enables full debug output including bytecode and runtime internals.

-Dtornado.fpgaDumpLog=true

Dumps FPGA HLS compilation logs.

-Dtornado.printKernel=true

Prints generated OpenCL/PTX/SPIR-V kernels.

-Dtornado.print.kernel.dir=FILENAME

Saves generated kernels to the specified file.

-Dtornado.threadInfo=true

Displays the number of threads used.

-Dtornado.print.bytecodes=true

Prints TornadoVM Internal Bytecodes to stdout.

-Dtornado.dump.bytecodes.dir=FILENAME

Dumps TornadoVM Internal Bytecodes to the specified file.

Profiling

CLI Flags

Flag

Description

--enableProfiler console

Prints profiling metrics as JSON to stdout.

--enableProfiler silent

Collects profiling metrics internally (see TornadoVM Profiler API).

--dumpProfiler FILENAME

Saves profiling output to the specified file.

JVM Flags

Flag

Description

-Dtornado.profiler=true

Enables profiling and prints metrics as JSON to sdout.

-Dtornado.log.profiler=true

Collects profiling metrics internally for logging.

-Dtornado.profiler.dump.dir=FILENAME

Saves profiling output to the specified file.

Performance & Scheduling

JVM Flags

Flag

Description

-Dtornado.ns.time=true

Uses nanoseconds for timing instead of milliseconds (default: true).

-Ds0.t0.global.workgroup.size=X,Y,Z

Sets custom global workgroup size.

-Ds0.t0.local.workgroup.size=X,Y,Z

Sets custom local workgroup size.

-Dtornado.concurrent.devices=true

Enables concurrent execution across devices (default: false).

-Dtornado.{ptx,opencl}.priority=X

Sets driver priority (default: PTX=1, OpenCL=0).

Precompiled and FPGA Options

JVM Flags

Flag

Description

-Dtornado.precompiled.binary=PATH

Path to precompiled kernel or FPGA bitstream.

-Dtornado.fpga.conf.file=FILE

Path to the FPGA configuration file.

Optimizations

JVM Flags

Flag

Description

-Dtornado.enable.fma=true

Enables fused multiply-add (default: true). May cause issues on some platforms.

-Dtornado.enable.mathOptimizations=true

Enables math simplifications (e.g., 1/sqrt(x)rsqrt) (default: true).

-Dtornado.experimental.partial.unroll=true

Enables loop partial unrolling (default: false). Use -Dtornado.partial.unroll.factor=FACTOR.

-Dtornado.enable.nativeFunctions=true

Enables native math functions (default: false).

CUDA (PTX Specific)

Flags

Flag

Description

CU_JIT_OPTIMIZATION_LEVEL

Level of optimizations to apply to generated code (0 - 4), with 4 being the highest level of optimizations (default: 4).

CU_JIT_MAX_REGISTERS

Max number of registers that a thread may use (default: none).

CU_JIT_TARGET

Target microarchitecture (default: none). Note that the available target microarchitecture depends on the CUDA version. Currently CUDA 13.0 supports the following: 30, 32, 35, 37, 50, 52, 53, 60, 61, 62, 70, 72, 75, 80, 86, 87, 89, 90, 100, 103, 110, 120, 121. Older version of CUDA might supports less microarchitecture, for example, CUDA 12.0 supports up to 90.

CU_JIT_CACHE_MODE

Specifies whether to enable caching explicitly (-dlcm). 0, compile with no -dlcm flag specified. 1, compile with L1 cache disabled (use only L2 cache). 2, compile with L1 cache enabled (use both L1 and L2 cache) (default: none).

CU_JIT_GENERATE_DEBUG_INFO

Specifies whether to create debug information in output (-g) (0: false) (default: none).

CU_JIT_LOG_VERBOSE

Generate verbose log messages (0: false) (default: none).

CU_JIT_GENERATE_LINE_INFO

Generate line number information (-lineinfo) (0: false) (default: none).

Level Zero (SPIR-V Specific)

JVM Flags

Flag

Description

-Dtornado.spirv.levelzero.alignment=64

Sets memory alignment (in bytes) for Level Zero buffers (default: 64).

-Dtornado.spirv.levelzero.thread.dispatcher=true

Uses Level Zero’s thread block suggestion (default: true).

-Dtornado.spirv.loadstore=false

Optimizes Loads/Stores and simplifies the generated SPIR-V binary (experimental - default: false).

-Dtornado.spirv.levelzero.memoryAlloc.shared=false

Enables shared memory buffers (default: false).

Notes

All Java flags (those beginning with -Dtornado.) are defined in the TornadoOptions.java file.

TornadoVM CLI flags (those beginning with --) are mapped to Java flags by the Python interface for ease of use. For example, --printKernel maps internally to -Dtornado.printKernel=true.