TornadoVM Flags

TornadoVM provides runtime and compiler flags to enable experimental features, tuning, and profiling. These flags fall into two categories:

JVM Flags (passed with the -D prefix via the --jvm option)
TornadoVM CLI Flags (passed directly to the tornado Python wrapper)

Note

In the examples below, s0 refers to a task graph and t0 to a specific task within that graph.

Example Usage

$ tornado --jvm "-Dtornado.fullDebug=true" -m tornado.examples/uk.ac.manchester.examples.compute.Montecarlo 1024

Debugging and Logging

CLI Flags

Flag	Description
`--fullDebug`	Enables full debug mode (maps to `-Dtornado.fullDebug=true`).
`--debug`	Enables basic debug output such as compilation status and device info.
`--printKernel`	Prints generated OpenCL/PTX/SPIR-V kernels.
`--threadInfo`	Displays the number of threads used.
`--devices`	Lists available hardware devices.

JVM Flags

Flag	Description
`-Dtornado.fullDebug=true`	Enables full debug output including bytecode and runtime internals.
`-Dtornado.fpgaDumpLog=true`	Dumps FPGA HLS compilation logs.
`-Dtornado.printKernel=true`	Prints generated OpenCL/PTX/SPIR-V kernels.
`-Dtornado.print.kernel.dir=FILENAME`	Saves generated kernels to the specified file.
`-Dtornado.threadInfo=true`	Displays the number of threads used.
`-Dtornado.print.bytecodes=true`	Prints TornadoVM Internal Bytecodes to stdout.
`-Dtornado.dump.bytecodes.dir=FILENAME`	Dumps TornadoVM Internal Bytecodes to the specified file.
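As a rough sketch of how several of the JVM flags above might be combined, the snippet below builds a single `--jvm` string and prints the resulting command instead of running it. It assumes that several `-D` flags can share one quoted `--jvm` argument, and it reuses the Montecarlo application from Example Usage. The variable names are illustrative only.

```shell
# Debug-related JVM flags taken from the table above.
JVM_FLAGS="-Dtornado.printKernel=true -Dtornado.threadInfo=true"

# Example application reused from the Example Usage section.
APP="tornado.examples/uk.ac.manchester.examples.compute.Montecarlo"

# Dry run: assemble the command as text and print it for inspection.
CMD="tornado --jvm \"$JVM_FLAGS\" -m $APP 1024"
echo "$CMD"
```

To execute the command rather than print it, the same variables could be passed directly, as in `tornado --jvm "$JVM_FLAGS" -m "$APP" 1024`.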

Profiling

CLI Flags

Flag	Description
`--enableProfiler console`	Prints profiling metrics as JSON to stdout.
`--enableProfiler silent`	Collects profiling metrics internally (see TornadoVM Profiler API).
`--dumpProfiler FILENAME`	Saves profiling output to the specified file.

JVM Flags

Flag	Description
`-Dtornado.profiler=true`	Enables profiling and prints metrics as JSON to stdout.
`-Dtornado.log.profiler=true`	Collects profiling metrics internally for logging.
`-Dtornado.profiler.dump.dir=FILENAME`	Saves profiling output to the specified file.

Performance & Scheduling

JVM Flags

Flag	Description
`-Dtornado.ns.time=true`	Uses nanoseconds for timing instead of milliseconds (default: true).
`-Ds0.t0.global.workgroup.size=X,Y,Z`	Sets custom global workgroup size.
`-Ds0.t0.local.workgroup.size=X,Y,Z`	Sets custom local workgroup size.
`-Dtornado.concurrent.devices=true`	Enables concurrent execution across devices (default: false).
`-Dtornado.{ptx,opencl}.priority=X`	Sets driver priority (default: PTX=1, OpenCL=0).
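The workgroup flags are keyed by task graph and task name, so the `s0.t0` prefix has to match the names used in the application. A minimal sketch of assembling both flags follows. The sizes `1024,1,1` and `256,1,1` are illustrative values rather than recommendations, and the variable names are hypothetical.

```shell
# Task graph and task names, following the s0/t0 convention from the Note.
GRAPH="s0"
TASK="t0"

# Illustrative sizes in X,Y,Z order; the local size divides the global size evenly.
GLOBAL_SIZE="1024,1,1"
LOCAL_SIZE="256,1,1"

# Compose the two -D flags for this task.
JVM_FLAGS="-D${GRAPH}.${TASK}.global.workgroup.size=${GLOBAL_SIZE}"
JVM_FLAGS="${JVM_FLAGS} -D${GRAPH}.${TASK}.local.workgroup.size=${LOCAL_SIZE}"

echo "$JVM_FLAGS"
```

The resulting string can then be supplied through the `--jvm` option in the same way as in Example Usage.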

Precompiled and FPGA Options

JVM Flags

Flag	Description
`-Dtornado.precompiled.binary=PATH`	Path to precompiled kernel or FPGA bitstream.
`-Dtornado.fpga.conf.file=FILE`	Path to the FPGA configuration file.

Optimizations

JVM Flags

Flag	Description
`-Dtornado.enable.fma=true`	Enables fused multiply-add (default: true). May cause issues on some platforms.
`-Dtornado.enable.mathOptimizations=true`	Enables math simplifications (e.g., `1/sqrt(x)` → `rsqrt`) (default: true).
`-Dtornado.experimental.partial.unroll=true`	Enables partial loop unrolling (default: false). Set the unroll factor with `-Dtornado.partial.unroll.factor=FACTOR`.
`-Dtornado.enable.nativeFunctions=true`	Enables native math functions (default: false).
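Partial unrolling involves two flags that work together: one to enable the feature and one to choose the factor. The sketch below pairs them in a single flag string. The factor `4` is an arbitrary illustrative value, and `UNROLL_FACTOR` is a hypothetical variable name.

```shell
# Illustrative unroll factor; a suitable value depends on the kernel and device.
UNROLL_FACTOR=4

# Enable partial unrolling and set its factor together.
JVM_FLAGS="-Dtornado.experimental.partial.unroll=true"
JVM_FLAGS="${JVM_FLAGS} -Dtornado.partial.unroll.factor=${UNROLL_FACTOR}"

echo "$JVM_FLAGS"
```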

CUDA (PTX Specific)

Flags

Flag	Description
`CU_JIT_OPTIMIZATION_LEVEL`	Optimization level applied to generated code, from 0 to 4, where 4 is the highest (default: 4).
`CU_JIT_MAX_REGISTERS`	Max number of registers that a thread may use (default: none).
`CU_JIT_TARGET`	Target microarchitecture (default: none). The available targets depend on the CUDA version. CUDA 13.0 currently supports the following: 30, 32, 35, 37, 50, 52, 53, 60, 61, 62, 70, 72, 75, 80, 86, 87, 89, 90, 100, 103, 110, 120, 121. Older versions of CUDA may support fewer microarchitectures; for example, CUDA 12.0 supports targets up to 90.
`CU_JIT_CACHE_MODE`	Sets the cache mode explicitly (-dlcm). 0: compile with no -dlcm flag specified. 1: compile with L1 cache disabled (use only L2 cache). 2: compile with L1 cache enabled (use both L1 and L2 cache). (default: none).
`CU_JIT_GENERATE_DEBUG_INFO`	Specifies whether to create debug information in output (-g) (0: false) (default: none).
`CU_JIT_LOG_VERBOSE`	Generate verbose log messages (0: false) (default: none).
`CU_JIT_GENERATE_LINE_INFO`	Generate line number information (-lineinfo) (0: false) (default: none).

Level Zero (SPIR-V Specific)

JVM Flags

Flag	Description
`-Dtornado.spirv.levelzero.alignment=64`	Sets memory alignment (in bytes) for Level Zero buffers (default: 64).
`-Dtornado.spirv.levelzero.thread.dispatcher=true`	Uses Level Zero’s thread block suggestion (default: true).
`-Dtornado.spirv.loadstore=false`	Optimizes Loads/Stores and simplifies the generated SPIR-V binary (experimental - default: false).
`-Dtornado.spirv.levelzero.memoryAlloc.shared=false`	Enables shared memory buffers (default: false).

Notes

All Java flags (those beginning with -Dtornado.) are defined in the TornadoOptions.java file.

TornadoVM CLI flags (those beginning with --) are mapped to Java flags by the Python interface for ease of use. For example, --printKernel maps internally to -Dtornado.printKernel=true.
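To make the mapping concrete, here is a small illustrative sketch of a CLI-to-JVM translation written as a shell function. It is not the code of the actual Python wrapper. The function name `cli_to_jvm` is hypothetical, and it covers only the two mappings stated on this page.

```shell
# Translate a TornadoVM CLI flag into its JVM equivalent (illustrative only).
cli_to_jvm() {
  case "$1" in
    --fullDebug)   echo "-Dtornado.fullDebug=true" ;;
    --printKernel) echo "-Dtornado.printKernel=true" ;;
    *)             echo "unknown flag: $1" >&2; return 1 ;;
  esac
}

# Print the JVM flag that corresponds to --printKernel.
cli_to_jvm --printKernel
```

Under this mapping, `tornado --printKernel ...` and `tornado --jvm "-Dtornado.printKernel=true" ...` would be expected to behave the same way.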