TornadoVM Changelog
This file summarizes the new features and major changes for each TornadoVM version.
TornadoVM 1.1.0
31/03/25
Improvements
#620: Support of computation with mixed precision FP16 to FP32 for matrix operations.
#622: New API to allow buffer mapping between two different buffers on the hardware accelerator.
#624: Enhanced TornadoVM profiler with correct information for the UNDER_DEMAND transfer to host data.
#627: New feature to persist data on the hardware accelerator, and consume data already allocated on the hardware accelerator.
#630: Support for atomics using the kernel API for OpenCL and PTX backends.
#636: TornadoVM bytecode logging improved.
#642: Math functions extended: acosh and asinh supported for OpenCL and SPIR-V.
#645: Memory deallocations improved. Action by default when closing the TornadoExecutionPlan resource.
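For readers unfamiliar with the two functions added in #642, the sketch below shows their standard closed-form definitions in plain Java. It is only a reference for the mathematics: the class and method names are hypothetical, and it does not use or reproduce the TornadoVM math API or the code the backends generate.

```java
// Reference formulas for the inverse hyperbolic functions (hypothetical helper class,
// not part of TornadoVM).
class InverseHyperbolicSketch {

    // asinh(x) = ln(x + sqrt(x^2 + 1)), defined for all real x.
    public static double asinh(double x) {
        // Evaluate on |x| and restore the sign: asinh is an odd function, and this
        // avoids cancellation in x + sqrt(x^2 + 1) when x is negative.
        double ax = Math.abs(x);
        double r = Math.log(ax + Math.sqrt(ax * ax + 1.0));
        return Math.copySign(r, x);
    }

    // acosh(x) = ln(x + sqrt(x^2 - 1)), defined for x >= 1.
    public static double acosh(double x) {
        if (x < 1.0) {
            return Double.NaN; // outside the domain
        }
        return Math.log(x + Math.sqrt(x * x - 1.0));
    }

    public static void main(String[] args) {
        // Round trips through sinh/cosh should return approximately the input.
        System.out.println(asinh(Math.sinh(0.5)));
        System.out.println(acosh(Math.cosh(2.0)));
    }
}
```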
Compatibility
#625: Documentation to build on RISC-V updated.
#632: Add maven build with Single thread.
#633: Add tests for running multiple task graphs with different grid schedulers.
#638: Add tests to check force copy in buffers and persist buffers on the hardware accelerator.
#640: Rename XPUBuffer to FieldBuffer for all backends.
#649: Update the fast mode to live mode for testing.
#654: Add loop condition test in white list.
Bug Fixes
#626: Fix data accessors when using the UNDER_DEMAND transfer to host invocation from the task-graph.
#628: Device filtering API fixed to use device type and device names.
#635: Update nodes for local memory to be subtype of ValueNode instead of ConstantNode in the TornadoVM IR.
#639: Fix subgraph execution when combining with the GridScheduler.
#644: Fix TornadoVM execution frame setter.
#646: Fix shared memory buffers across task-graphs when no new allocation is present as new parameters for the following task-graphs.
#647: Fix UNDER_DEMAND invocation for the batch processor mode and read-write arrays.
#651: Fix memory mapping regions for the PTX Backend.
#653: Object repetition with shared buffers on ON_DEVICE bytecodes.
TornadoVM 1.0.10
31/01/25
Improvements
Compatibility
Bug Fixes
TornadoVM 1.0.9
20th December 2024
Improvements
#573: Enhanced output of unit-tests with a summary of pass-rates and fail-rates.
#576: Extended support for 3D matrices.
#580: Extended debug information for execution plans.
#584: Added helper menu for the tornado launcher script when no arguments are passed.
#589: Enable partial loop unrolling for all backends.
#594: Added RISC-V 64 CPU port support to run OpenCL with vector instructions RVV 1.0 (using the Codeplay OCK Toolkit).
#598: OpenCL low-level buffers tagged as read, write and read/write based on the data dependency analysis.
#601: Feature to select an immutable task graph to execute from a multi-task graph execution plan.
Compatibility
#570: Extended timeout for all suite of unit-tests.
#579: Removed legacy JDK 8 and JDK 11 build options from the TornadoVM installer.
#582: Restored tornado runner scripts for IntelliJ.
#583: Automatic generation of IDE IntelliJ configuration runner files from the TornadoVM command.
#597: Updated white-list of unit-test and checkstyle improved.
Bug Fixes
#571: Fix issues with bracket closing for if/loops conditions.
#572: Fix for printing default execution plans (execution plans with default parameters).
#575: Fix the Level Zero version used for building the SPIR-V backend.
#577: Fix checkstyle.
#587: Fix thread scheduler for new NVIDIA Drivers.
#592: Fix Float.POSITIVE_INFINITY and Float.NEGATIVE_INFINITY constants for the OpenCL, CUDA and SPIR-V backends.
#596: Fix extra closing bracket during the code-generation for the FPGAs.
Remove the intermediate CUDA pinned memory regions in the JNI code: link
Fix bitwise negation operations for the PTX backend: link
GetBackendImpl::getAllDevices thread-safe: link
Check size elements for memory segments: link.
TornadoVM 1.0.8
30th September 2024
Improvements
#565: New API call in the Execution Plan to log/trace the executed configuration plans.
#563: Expand the TornadoVM profiler with Level Zero Sysman Energy Metrics.
#559: Refactoring Power Metric handlers for PTX and OpenCL.
#548: Benchmarking improvements.
#549: Prebuilt API tests added using multiple backend-setup.
Add internal tests for monitoring memory management (link).
Compatibility
#561: Build for OSx 14.6 and OSx 15 fixed.
Bug Fixes
#564: Jenkins configuration fixed to run KFusion per backend.
#562: Warmup action from the Execution Plan fixed to run with correct internal IDs.
#557: Shared Execution Plans Context fixed.
#553: OpenCL compiler flags for Intel Integrated GPUs fixed.
#552: Fixed runtime to select any device among multiple SPIR-V devices.
Fixed zero extend arithmetic operations: link
TornadoVM 1.0.7
30th August 2024
Improvements
#468: Cleanup Abstract Metadata Class.
#473: Add maven plugin to build TornadoVM source for the releases.
#474: Refactor <X>TornadoDevice to place common methods in the TornadoXPUInterface.
#482: Help messages improved when an out-of-memory exception is raised.
#484: Double-type for the trigonometric functions added in the TornadoMath class.
#487: Prebuilt API simplified.
#494: Add test to trigger unsupported features related to direct use of Memory Segments.
#509: Add a quick pass configuration to skip the heavy tests during active development.
#532: Improve thread scheduler to support RISC-V Accelerators from Codeplay.
#533: Support for scalar values to be passed via lambda expressions as tasks.
#538: README file updated.
#539: Refactor core classes and add new API methods to pass compilation flags to the low-level driver compilers (OpenCL, PTX and Level Zero).
#542: Tagged LevelZero JNI and Beehive Toolkit dependencies added in the build and installer.
Compatibility
#465: Support for JDK 22 and GraalVM 24.0.2.
#486: Temurin for Windows added in the list of supported JDKs.
#525: Revert usage of String Templates in preparation for JDK 23.
#527: SPIR-V version parameter added. TornadoVM may run previous SPIR-V versions (e.g., ComputeAorta from Codeplay).
#513: LevelZero JNI Library updated to v0.1.4.
Bug Fixes
#470: README documentation fixed.
#478: Fix the test names that are present in the white list.
#488: FP64 Kind for radian operations and the PTX backend fixed.
#493: Tests Whitelist for PTX backend fixed.
#502: Fix barrier type in the documentation regarding programmability of reductions.
#514: Installer script fixed.
#540: Fix issue with clean-up execution IDs function.
#541: Fix Data Accessors for the prebuilt API.
#543: Fix checkstyle condition and FP16 error message improved.
TornadoVM 1.0.6
27th June 2024
Improvements
#442: Support for multiple SPIR-V device versions (>= 1.2).
#444: Enabling automatic device memory clean-up after each run from the execution plan.
#448: API extension to query device memory consumption at the TaskGraph granularity.
#451: Option to select the default SPIR-V runtime.
#455: Refactoring the API and documentation updated.
#460: Refactoring all examples to use try-with-resources execution plans by default.
#462: Support for copy array references from private to private memory on the hardware accelerator.
Compatibility
#438: No writes for intermediate files to avoid permissions issues with Jenkins.
#440: Update Jenkinsfile for CI/CD testing.
#443: Level Zero and OpenCL runtimes for SPIR-V included in the Jenkins CI/CD.
#450: TornadoVM benchmark script improved to report dimensions and sizes.
#453: Update Jenkinsfile with regards to the runtime for SPIR-V.
Bug Fixes
TornadoVM 1.0.5
26th May 2024
Improvements
#402: Support for TornadoNativeArrays from FFI buffers.
#403: Clean-up and refactoring for the code analysis of the loop-interchange.
#405: Disable Loop-Interchange for CPU offloading.
#407: Debugging OpenCL Kernels builds improved.
#410: CPU block scheduler disabled by default and option to switch between different thread-schedulers added.
#418: TornadoOptions and TornadoLogger improved.
#423: MxM using ns instead of ms to report performance.
#425: Vector types for Float<Width> and Int<Width> supported.
#429: Documentation of the installation process updated and improved.
#432: Support for SPIR-V code generation and dispatcher using the TornadoVM OpenCL runtime.
Compatibility
#409: Guidelines to build the documentation.
#411: Windows installer improved.
#412: Python installer improved to check and download all Python dependencies before the main installer.
#413: Improved documentation for installing all configurations of backends and OS.
#424: Use Generic GPU Scheduler for some older NVIDIA Drivers for the OpenCL runtime.
#430: Improved the installer by checking that the TornadoVM environment is loaded upfront.
Bug Fixes
#400: Fix batch computation when the global thread indexes are used to compute the outputs.
#414: Recover Test-Field unit-tests using Panama types.
#415: Check style errors fixed.
#416: FPGA execution with multiple tasks in a task-graph fixed.
#417: Lazy-copy out fixed for Java fields.
#420: Fix Mandelbrot example.
#421: OpenCL 2D thread-scheduler fixed for NVIDIA GPUs.
#422: Compilation for NVIDIA Jetson Nano fixed.
#426: Fix Logger for all backends.
#428: Math cos/sin operations supported for vector types.
#431: Jenkins files fixed.
TornadoVM 1.0.4
30th April 2024
Improvements
#369: Introduction of Tensor types in TornadoVM API and interoperability with ONNX Runtime.
#370: Array concatenation operation for TornadoVM native arrays.
#371: TornadoVM installer script ported for Windows 10/11.
#372: Add support for HalfFloat (Float16) in vector types.
#374: Support for TornadoVM array concatenations from the constructor-level.
#375: Support for TornadoVM native arrays using slices from the Panama API.
#376: Support for lazy copy-outs in the batch processing mode.
#377: Expand the TornadoVM profiler with power metrics for NVIDIA GPUs (OpenCL and PTX backends).
#384: Auto-closable Execution Plans for automatic memory management.
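As background for #384, the following plain-Java sketch illustrates the try-with-resources pattern that an auto-closable resource enables. DemoPlan is a hypothetical stand-in that only records what happens to it; it is not the TornadoVM execution plan class and makes no claim about what the real close() releases.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration of the AutoCloseable / try-with-resources pattern.
class AutoCloseSketch {

    // Hypothetical resource that logs its lifecycle instead of touching a device.
    static final class DemoPlan implements AutoCloseable {
        private final List<String> log;

        DemoPlan(List<String> log) {
            this.log = log;
            log.add("allocate"); // stand-in for acquiring device memory
        }

        void execute() {
            log.add("execute"); // stand-in for running the tasks
        }

        @Override
        public void close() {
            log.add("free"); // stand-in for releasing device memory
        }
    }

    public static List<String> run() {
        List<String> log = new ArrayList<>();
        try (DemoPlan plan = new DemoPlan(log)) {
            plan.execute();
        } // close() is invoked here automatically, even if execute() throws
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The point of the pattern is that the release step cannot be forgotten: leaving the try block is what triggers it.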
Compatibility
Bug Fixes
#367: Fix for Graal/Truffle languages in which some Java modules were not visible.
#373: Fix for data copies of the HalfFloat types for all backends.
#378: Fix free memory markers when running multi-thread execution plans.
#379: Refactoring package of vector api unit-tests.
#380: Fix event list sizes to accommodate profiling of large applications.
#385: Fix code check style.
#387: Fix TornadoVM internal events in OpenCL, SPIR-V and PTX for running multi-threaded execution plans.
#388: Fix of expected and actual values of tests.
#392: Fix installer for using existing JDKs.
#389: Fix DataObjectState for multi-thread execution plans.
#396: Fix JNI code for the CUDA NVML library access with OpenCL.
TornadoVM 1.0.3
27th March 2024
Improvements
#344: Support for Multi-threaded Execution Plans.
#347: Enhanced examples.
#350: Obtain internal memory segment for the Tornado Native Arrays without the object header.
#357: API extensions to query and apply filters to backends and devices from the TornadoExecutionPlan.
#359: Support Factory Methods for FFI-based array collections to be used/composed in TornadoVM Task-Graphs.
Compatibility
#351: Compatibility of TornadoVM Native Arrays with the Java Vector API.
#352: Refactor memory limit to take into account primitive types and wrappers.
#354: Add DFT-sample benchmark in FP32.
#356: Initial support for Windows 11 using Visual Studio Development tools.
#361: Compatibility with the SPIR-V toolkit v0.0.4.
#366: Level Zero JNI Dependency updated to 0.1.3.
Bug Fixes
TornadoVM 1.0.2
29/02/2024
Improvements
Compatibility
#337: Initial support for Graal and JDK 21.0.2.
Bug Fixes
#322: Fix duplicate thread-info debug message when the debug option is also enabled.
#325: Set/Get accesses for the MatrixVectorFloat4 type fixed.
#326: Fix installation script for running with Python >= 3.12.
#327: Fix Memory Limits for all supported Panama off-heap types.
#329: Fix timers for the dynamic reconfiguration policies
#330: Fix the profiler logs when silent mode is enabled
#332: Fix Batch processing when having multiple task-graphs in a single execution plan.
TornadoVM 1.0.1
30/01/2024
Improvements
Compatibility/Integration
Bug Fixes
TornadoVM 1.0
05/12/2023
Improvements
Brand-new API for allocating off-heap objects and array collections using the Panama Memory Segment API.
- New Arrays, Matrix and Vector type objects are allocated using the Panama API.
- Migration of existing applications to use the new Panama-based types: https://tornadovm.readthedocs.io/en/latest/offheap-types.html
Handling of the TornadoVM’s internal bytecode improved to avoid write-only copies from host to device.
cospi and sinpi math operations supported for OpenCL, PTX and SPIR-V.
Vector 16 data types supported for float, double and int.
Support for Mesa’s rusticl.
Device default ordering improved based on maximum thread size.
Move all the installation and configuration scripts from Bash to Python.
The installation process has been improved for Linux and OSx with M1/M2 chips.
Documentation improved.
Add profiling information for the testing scripts.
Compatibility/Integration
Integration with the Graal 23.1.0 JIT Compiler.
Integration with OpenJDK 21.
Integration with Truffle Languages (Python, Ruby and JavaScript) using Graal 23.1.0.
TornadoVM API Refactored.
Backport bug-fixes for branch using OpenJDK 17: master-jdk17
Bug fixes:
Multiple SPIR-V Devices fixed.
Runtime Exception when no SPIR-V devices are present.
Issue with the kernel context API when invoking multiple kernels fixed.
MTMD mode is fixed when running multiple backends on the same device.
long type as a constant parameter for a kernel fixed.
FPGA Compilation and Execution fixed for AWS and Xilinx devices.
Batch processing fixed for different data types of the same size.
TornadoVM 0.15.2
26/07/2023
Improvements
Initial Support for Multi-Tasks on Multiple Devices (MTMD): This mode enables the execution of multiple independent tasks on more than one hardware accelerator. Documentation: https://tornadovm.readthedocs.io/en/latest/multi-device.html
Support for trigonometric radian, cospi and sinpi functions for the OpenCL/PTX and SPIR-V backends.
Clean-up Java modules not being used and TornadoVM core classes refactored.
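In OpenCL C, the built-ins cospi(x) and sinpi(x) compute cos(πx) and sin(πx). The plain-Java sketch below shows that definition as a reference. The class and method names are hypothetical, and the code does not call the TornadoVM math API. It is also a naive formulation: multiplying by Math.PI first loses accuracy for large arguments, which dedicated implementations typically avoid.

```java
// Reference definitions of the pi-scaled trigonometric functions
// (hypothetical helper class, not part of TornadoVM).
class PiScaledTrigSketch {

    // cospi(x) = cos(pi * x)
    public static double cospi(double x) {
        return Math.cos(Math.PI * x);
    }

    // sinpi(x) = sin(pi * x)
    public static double sinpi(double x) {
        return Math.sin(Math.PI * x);
    }

    public static void main(String[] args) {
        System.out.println(cospi(1.0)); // close to -1
        System.out.println(sinpi(0.5)); // close to 1
    }
}
```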
Compatibility/Integration
Initial integration with ComputeAorta (part of the Codeplay’s oneAPI Construction Kit for RISC-V) to run on RISC-V with Vector Instructions (OpenCL backend) in emulation mode.
Beehive SPIR-V Toolkit dependency updated.
Tests for prebuilt SPIR-V kernels fixed to dispatch SPIR-V binaries through the Level Zero and OpenCL runtimes.
Deprecated javac.py script removed.
Bug fixes:
TornadoVM OpenCL Runtime throws an exception when the detected hardware does not support FP64.
Fix the installer for the older Apple with the x86 architecture using AMD GPUs.
Installer for ARM based systems fixed.
Installer fixed for Microsoft WSL and NVIDIA GPUs.
OpenCL code generator fixed to avoid using the reserved OpenCL keywords from Java function parameters.
Dump profiler option fixed.
TornadoVM 0.15.1
15/05/2023
Improvements
Introduction of a device selection heuristic based on the computing capabilities of devices. TornadoVM selects, as the default device, the fastest device based on its computing capability.
Optimisation that removes redundant data copies for Read-Only and Write-Only buffers between the host (CPU) and the device (GPU), based on the Tornado Data Flow Graph.
New installation script for TornadoVM.
Option to dump the TornadoVM bytecodes for the unit tests.
Full debug option improved. Use --fullDebug.
Compatibility/Integration
Integration and compatibility with the Graal 22.3.2 JIT Compiler.
Improved compatibility with Apple M1 and Apple M2 through the OpenCL Backend.
GraalVM/Truffle programs integration improved. Use --truffle in the tornado script to run guest programs with Truffle. Example: tornado --truffle python myProgram.py
Full documentation: https://tornadovm.readthedocs.io/en/latest/truffle-languages.html
Bug fixes:
Documentation that resets the device’s memory: https://github.com/beehive-lab/TornadoVM/blob/master/tornado-api/src/main/java/uk/ac/manchester/tornado/api/TornadoExecutionPlan.java#L282
Append the Java CLASSPATH to the cp option from the tornado script.
Dependency for the cmake-maven plugin fixed for the ARM-64 architecture.
Fixed the automatic installation for Apple M1/M2, ARM-64 and NVIDIA Jetson Nano computing systems.
Integration with IGV fixed. Use the --igv option for the tornado and tornado-test scripts.
TornadoVM 0.15
27/01/2023
Improvements
New TornadoVM API:
API refactoring (TaskSchedule has been renamed to TaskGraph)
Introduction of the Immutable TaskGraphs
Introduction of the TornadoVM Execution Plans: (TornadoExecutionPlan)
The documentation of migration of existing TornadoVM applications to the new API can be found here: https://tornadovm.readthedocs.io/en/latest/programming.html#migration-to-tornadovm-v0-15
Launch a new website https://tornadovm.readthedocs.io/en/latest/ for the documentation
Improved documentation
Initial support for Intel ARC discrete GPUs.
Improved TornadoVM installer for Linux
Improved TornadoVM launch script with optional parameters
Support of large buffer allocations with Intel Level Zero. Use: tornado.spirv.levelzero.extended.memory=True
Bug fixes:
Vector and Matrix types
TornadoVM Floating Replacement compiler phase fixed
Fix CMAKE for Intel ARC GPUs
Device query tool fixed for the PTX backend
Documentation for Windows 11 fixed
TornadoVM 0.14.1
29/09/2022
Improvements
The tornado command has been migrated from a Bash script to a Python script. Use tornado --help to check the new options and examples.
Support of native tests for the SPIR-V backend.
Improvement of the OpenCL and PTX tests of the internal APIs.
Compatibility/Integration
Integration and compatibility with the Graal 22.2.0 JIT Compiler.
Compatibility with JDK 18 and JDK 19.
Compatibility with Apple M1 Pro using the OpenCL backend.
Bug Fixes
CUDA PTX generated header fixed to target NVIDIA 30xx GPUs and CUDA 11.7.
The signature of generated PTX kernels fixed for NVIDIA driver >= 510 and 30XX GPUs when using the TornadoVM Kernel API.
Tests of virtual OpenCL devices fixed.
Thread deployment information for the OpenCL backend is fixed.
TornadoVMRuntimeCI moved to TornadoVMRutimeInterface.
TornadoVM 0.14
15/06/2022
New Features
New device memory management for addressing the memory allocation limitations of OpenCL and enabling pinned memory of device buffers.
The execution of task-schedules will still automatically allocate/deallocate memory every time a task-schedule is executed, unless lock/unlock functions are invoked explicitly at the task-schedule level.
One heap per device has been replaced with a device buffer per input variable.
A new API call has been added for releasing memory: unlockObjectFromMemory
A new API call has been added for locking objects to the device: lockObjectInMemory. This requires the user to release memory by invoking unlockObjectFromMemory at the task-schedule level.
Enhanced Live Task migration by supporting multi-backend execution (PTX <-> OpenCL <-> SPIR-V).
Compatibility/Integration
Integration with the Graal 22.1.0 JIT Compiler
JDK 8 deprecated
Azul Zulu JDK supported
OpenCL 2.1 as a default target for the OpenCL Backend
Single Docker Image for Intel XPU platforms, including the SPIR-V backend (using the Intel Integrated Graphics), and OpenCL (using the Intel Integrated Graphics, Intel CPU and Intel FPGA in emulation mode). Image: https://github.com/beehive-lab/docker-tornado#intel-integrated-graphics
Improvements/Bug Fixes
SIGNUM Math Function included for all three backends.
SPIR-V optimizer enabled by default (3x reduction in binary size).
Extended Memory Mode enabled for the SPIR-V Backend via Level Zero.
Phi instructions fixed for the SPIR-V Backend.
SPIR-V Vector Select instructions fixed.
Duplicated IDs for Non-Inlined SPIR-V Functions fixed.
Refactoring of the TornadoVM Math Library.
FPGA Configuration files fixed.
Bitwise operations for OpenCL fixed.
Code Generation Times and Backend information are included in the profiling info.
TornadoVM 0.13
21/03/2022
Integration with JDK 17 and Graal 21.3.0
JDK 11 is the default version and the support for the JDK 8 has been deprecated
Support for extended intrinsics regarding math operations
Native functions are enabled by default
Support for 2D arrays for PTX and SPIR-V backends:
Integer Test Move operation supported:
Improvements in the SPIR-V Backend:
Experimental SPIR-V optimizer. Binary size reduction of up to 3x
Fix malloc functions for Level-Zero
Support for pre-built SPIR-V binary modules using the TornadoVM runtime for OpenCL
Performance increase due to cached buffers on GPUs by default
Disassembler option for SPIR-V binary modules. Use --printKernel
Improved Installation:
Full automatic installer script integrated
Documentation about the installation for Windows 11
Refactoring and several bug fixes
https://github.com/beehive-lab/TornadoVM/commit/57694186b42ec28b16066fb549ab8fcf9bff9753
Vector types fixed:
Fix AtomicInteger get for OpenCL:
Dependencies for Math3 and Lang3 updated
TornadoVM 0.12
17/11/2021
New backend: initial support for SPIR-V and Intel Level Zero
Level-Zero dispatcher for SPIR-V integrated
SPIR-V Code generator framework for Java
Benchmarking framework improved to accommodate all three backends
Driver metrics, such as kernel time and data transfers included in the benchmarking framework
TornadoVM profiler improved:
Command line options added: --enableProfiler <silent|console> and --dumpProfiler <jsonFile>
Logging improved for debugging purposes: JIT Compiler, JNI calls and code generation
New math intrinsics operations supported
Several bug fixes:
Duplicated barriers removed. TornadoVM BARRIER bytecode fixed when running multi-context
Copy in when having multiple reductions fixed
TornadoVM profiler fixed for multiple context switching (device switching)
Pretty printer for device information
TornadoVM 0.11
29/09/2021
TornadoVM JIT Compiler upgrade to work with Graal 21.2.0 and JDK 8 with JVMCI 21.2.0
Refactoring of the Kernel Parallel API for Heterogeneous Programming:
Methods getLocalGroupSize(index) and getGlobalGroupSize moved to public fields to keep consistency with the rest of the thread properties within the KernelContext class.
Compiler update to register the global number of threads: https://github.com/beehive-lab/TornadoVM/pull/133/files
Simplification of the TornadoVM events handler: https://github.com/beehive-lab/TornadoVM/pull/135/files
Renaming the Profiler API method from event.getExecutionTime to event.getElapsedTime: https://github.com/beehive-lab/TornadoVM/pull/134
Deprecating OCLWriteNode and PTXWriteNode and fixing stores for bytes: https://github.com/beehive-lab/TornadoVM/pull/131
Refactoring of the FPGA IR extensions, from the high-tier to the low-tier of the JIT compiler
Utilizing the FPGA Thread-Attributes compiler phase for the FPGA execution
Using the GridScheduler object (if present) or use a default value (e.g., 64, 1, 1) for defining the FPGA OpenCL local workgroup
Several bugs fixed:
Codegen for sequential kernels fixed
Function parameters with non-inlined method calls fixed
TornadoVM 0.10
29/06/2021
TornadoVM JIT Compiler sync with Graal 21.1.0
Experimental support for OpenJDK 16
Tracing the TornadoVM thread distribution and device information with a new option --threadInfo instead of --debug
Refactoring of the new API:
TornadoVMExecutionContext renamed to KernelContext
GridTask renamed to GridScheduler
AWS F1 AMI version upgraded to 1.10.0 and automated the generation of AFI image
Xilinx OpenCL backend expanded with:
- Initial integration of Xilinx OpenCL attributes for loop pipelining in the TornadoVM compiler
Support for multiple compute units
Logging FPGA compilation option added to dump FPGA HLS compilation to a file
TornadoVM profiler enhanced for including data transfers for the stack-frame and kernel dispatch time
Initial support for 2D Arrays added
Several bug fixes and stability support for the OpenCL and PTX backends
TornadoVM 0.9
15/04/2021
Expanded API for expressing kernel parallelism within Java. It can work with the existing loop parallelism in TornadoVM.
Direct access to thread-ids, OpenCL local memory (PTX shared memory), and barriers
TornadoVMContext added:
Code examples:
Documentation:
Profiler integrated with Chrome debug:
Use flags: -Dtornado.chrome.event.tracer.enabled=True -Dtornado.chrome.event.tracer.filename=userFile.json
Added support for Windows 10:
TornadoVM running with Windows JDK 11 supported (Linux & Windows)
Xilinx FPGAs workflow supported for Vitis 2020.2
Pre-compiled tasks for Xilinx/Intel FPGAs fixed
Slambench fixed when compiling for PTX and OpenCL backends
Several bug fixes for the runtime, JIT compiler and data management.
TornadoVM 0.8
19/11/2020
Added PTX backend for NVIDIA GPUs
Build TornadoVM using make BACKEND=ptx,opencl to obtain the two supported backends.
TornadoVM JIT Compiler aligned with Graal 20.2.0
Support for other JDKs:
Red Hat Mandrel 11.0.9
Amazon Corretto 11.0.9
GraalVM LabsJDK 11.0.8
OpenJDK 11.0.8
OpenJDK 12.0.2
OpenJDK 13.0.2
OpenJDK 14.0.2
Support for hybrid (CPU-GPU) parallel reductions
New API for generic kernel dispatch. It introduces the concept of WorkerGrid and GridTask
A WorkerGrid is an object that stores how threads are organized on an OpenCL device:
WorkerGrid1D worker1D = new WorkerGrid1D(4096);
A GridTask is a map that relates a task-name with a worker-grid:
GridTask gridTask = new GridTask();
gridTask.set("s0.t0", worker1D);
A TornadoVM Task-Schedule can be executed using a GridTask:
ts.execute(gridTask);
More info: link
TornadoVM profiler improved
Profiler metrics added
Code features per task-graph
Lazy device initialisation moved to early initialisation of PTX and OpenCL devices
Initial support for Atomics (OpenCL backend)
Task Schedules with 11-14 parameters supported
Documentation improved
Bug fixes for code generation, numeric promotion, basic block traversal, Xilinx FPGA compilation.
TornadoVM 0.7
22/06/2020
Support for ARM Mali GPUs.
Support parallel reductions on FPGAs
Agnostic FPGA vendor compilation via configuration files (Intel & Xilinx)
Support for AWS on Xilinx FPGAs
Recompilation for different input data sizes supported
New TornadoVM API calls:
Update references for re-compilation: taskSchedule.updateReferences(oldRef, newRef);
Use the default OpenCL scheduler: taskSchedule.useDefaultThreadScheduler(true);
Use of JMH for benchmarking
Support for Fused Multiply-Add (FMA) instructions
Easy-selection of different devices for unit-tests
tornado-test.py -V --debug -J"-Dtornado.unittests.device=0:1"
Bailout mechanism improved from parallel to sequential
Improve thread scheduling
Support for private memory allocation
Assertion mode included
Documentation improved
Several bug fixes
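To illustrate what the Fused Multiply-Add (FMA) support listed for this release refers to, the plain-Java sketch below contrasts a separately rounded a * b + c with the standard library's Math.fma (available since Java 9), which rounds only once. The class name and the chosen inputs are illustrative assumptions, and the sketch says nothing about how TornadoVM emits FMA instructions on a device.

```java
// Separate multiply-then-add versus fused multiply-add in plain Java
// (hypothetical helper class, not part of TornadoVM).
class FmaSketch {

    // Two roundings: one after the multiplication, one after the addition.
    public static double separate(double a, double b, double c) {
        return a * b + c;
    }

    // One rounding: a * b + c is computed as if with unlimited precision, then rounded.
    public static double fused(double a, double b, double c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        double a = 1.0 + 0x1p-30;
        double b = 1.0 - 0x1p-30;
        // The exact product is 1 - 2^-60, which rounds to 1.0 in double precision,
        // so the separately rounded version loses the small term entirely.
        System.out.println(separate(a, b, -1.0));
        System.out.println(fused(a, b, -1.0));
    }
}
```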
TornadoVM 0.6
21/02/2020
TornadoVM compatible with GraalVM 19.3.0 using JDK 8 and JDK 11
TornadoVM compiler update for using Graal 19.3.0 compiler API
Support for dynamic languages on top of Truffle
Support for multiple tasks per task-schedule on FPGAs (Intel and Xilinx)
Support for OSX Mojave and Catalina
Task-schedule name handling for FPGAs improved
Exception handling improved
Reductions for long type supported
Bug fixes for ternary conditions, reductions and code generator
Documentation improved
TornadoVM 0.5
16/12/2019
Initial support for Xilinx FPGAs
TornadoVM API classes are now Serializable
Initial support for local memory for reductions
JVMCI built with local annotation patch removed. Now TornadoVM requires unmodified JDK8 with JVMCI support
Support of multiple reductions within the same task-schedules
Emulation mode on Intel FPGAs is fixed
Fix reductions on Intel Integrated Graphics
TornadoVM driver OpenCL initialization and OpenCL code cache improved
Refactoring of the FPGA execution modes (full JIT and emulation modes improved).
TornadoVM 0.4
14/10/2019
Profiler supported
Use -Dtornado.profiler=True to enable profiler
Use -Dtornado.profiler=True -Dtornado.profiler.save=True to dump the profiler logs
Feature extraction added
Use -Dtornado.feature.extraction=True to enable code extraction features
Mac OSx support
Automatic reductions composition (map-reduce) within the same task-schedule
Bug related to a memory leak when running on GPUs solved
Bug fixes and stability improvements
TornadoVM 0.3
22/07/2019
New Matrix 2D and Matrix 3D classes with type specializations.
New API-call TaskSchedule#batch for batch processing. It allows programmers to run with more data than the maximum capacity of the accelerator by creating batches of executions.
FPGA full automatic compilation pipeline.
FPGA options simplified:
-Dtornado.precompiled.binary=<binary> for loading the bitstream.
-Dtornado.opencl.userelative=True for using relative addresses.
-Dtornado.opencl.codecache.loadbin=True removed.
Reductions support enhanced and fully automated on GPUs and CPUs.
Initial support for reductions on FPGAs.
Initial API for profiling tasks integrated.
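The idea behind the batch processing entry above can be sketched in plain Java: an input larger than a fixed capacity is processed as a sequence of chunks that each fit. The class, method and the doubling "kernel" below are hypothetical illustrations; this is not how TaskSchedule#batch is implemented, and no TornadoVM API is used.

```java
// Conceptual sketch of processing a large array in fixed-size batches
// (hypothetical helper class, not part of TornadoVM).
class BatchSketch {

    // Doubles every element of data in place, touching at most batchSize
    // elements per batch, and returns the number of batches that were needed.
    public static int processInBatches(float[] data, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        int batches = 0;
        for (int start = 0; start < data.length; start += batchSize) {
            // The last batch may be shorter than batchSize.
            int end = Math.min(start + batchSize, data.length);
            for (int i = start; i < end; i++) {
                data[i] *= 2.0f; // stand-in for the work done on the accelerator
            }
            batches++;
        }
        return batches;
    }

    public static void main(String[] args) {
        float[] data = new float[10];
        java.util.Arrays.fill(data, 1.0f);
        System.out.println(processInBatches(data, 4)); // 10 elements in chunks of 4
    }
}
```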
TornadoVM 0.2
25/02/2019
Rename to TornadoVM
Device selection for better performance (CPU, multi-core, GPU, FPGA) via an API for Dynamic Reconfiguration
Added methods executeWithProfiler and executeWithProfilerSequential with an input policy.
Policies: Policy.PERFORMANCE, Policy.END_2_END, and Policy.LATENCY implemented.
Basic heuristic for predicting the highest performing target device with Dynamic Reconfiguration
Initial FPGA integration for Altera FPGAs:
Full JIT compilation mode
Ahead of time compilation mode
Emulation/debug mode
FPGA JIT compiler specializations
Added support for Java reductions:
Compiler specializations for CPU and GPU reductions
Performance and stability fixes
Tornado 0.1.0
07/09/2018
Initial Implementation of the Tornado compiler
Initial GPU/CPU code generation for OpenCL
Initial support in the runtime to execute OpenCL programs generated by the Tornado JIT compiler
Initial Tornado-API release (@Parallel Java annotation and TaskSchedule API)
Multi-GPU enabled through multiple tasks-schedules