[JNI Programming] The main design in JNI

This section discusses the main design issues in JNI. Most of them relate to native methods.

1. JNI interface functions and pointers

Native code accesses JVM features by calling JNI functions. JNI functions are available through interface pointers. The interface pointer is a pointer to a pointer. This pointer points to an array of pointers, and each pointer points to an interface function. Each interface function has a predefined offset in the array. Figure 2-1 illustrates the organization of interface pointers.

The organization of the JNI interface resembles a C++ virtual function table or a COM interface. The advantage of using an interface table rather than hard-wired function entries is that the JNI name space is kept separate from the native code. A VM can easily provide multiple versions of the JNI function table. For example, a VM may support two JNI function tables:

  • One performs thorough illegal-argument checks and is suitable for debugging;
  • The other performs only the minimal checking required by the JNI specification and is therefore more efficient.
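The pointer-to-pointer layout and the idea of swappable tables can be sketched in plain C. This is a simplified model, not the real jni.h: the names MiniInterface, MiniEnv, StringLength, fast_table and checked_table are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical miniature of a JNI-style function table (not the real jni.h). */
struct MiniInterface;
typedef const struct MiniInterface *MiniEnv;   /* points at a function table */

struct MiniInterface {
    /* Slot 0: length of s, or -1 if the checked table detects a bad argument. */
    int (*StringLength)(MiniEnv *env, const char *s);
};

/* Fast variant: performs no argument checking. */
static int length_fast(MiniEnv *env, const char *s) {
    (void)env;
    return (int)strlen(s);
}

/* Debug variant: validates the argument before using it. */
static int length_checked(MiniEnv *env, const char *s) {
    (void)env;
    if (s == NULL) return -1;
    return (int)strlen(s);
}

const struct MiniInterface fast_table = { length_fast };
const struct MiniInterface checked_table = { length_checked };

/* "Native" code: written once, it works with whichever table env refers to. */
int native_length(MiniEnv *env, const char *s) {
    assert(env != NULL && *env != NULL);
    return (*env)->StringLength(env, s);   /* two dereferences, then the slot */
}
```

With MiniEnv env = &checked_table, the call native_length(&env, NULL) returns -1; the same call through fast_table would dereference a null pointer. The native code itself does not change, which is the separation the interface table is meant to provide.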

The JNI interface pointer is only valid in the current thread. Therefore, native methods cannot pass interface pointers from one thread to another. A VM that implements JNI can allocate and store thread-local data in the area pointed to by the JNI interface pointer.

Native methods receive the JNI interface pointer as an argument. The VM is guaranteed to pass the same interface pointer to a native method when it calls that method multiple times from the same Java thread. However, a native method can be called from different Java threads, and may therefore receive different JNI interface pointers.

2. Compiling, loading and linking native methods

Since the JVM is multi-threaded, native libraries should also be compiled and linked with a native compiler that supports multi-threading. For example, the -mt flag should be used for C++ code compiled with the Sun Studio compiler. For code compiled with the GNU gcc compiler, the -D_REENTRANT or -D_POSIX_C_SOURCE flag should be used. For more information, refer to the native compiler documentation.

Native methods are loaded with the System.loadLibrary method. In the following example, the class initialization method loads a platform-specific native library in which the native method f is defined:

    package pkg;

    class Cls {
        native double f(int i, String s);

        static {
            System.loadLibrary("pkg_Cls");
        }
    }

The argument to System.loadLibrary is a library name chosen arbitrarily by the programmer. The system follows a standard, but platform-specific, approach to convert the library name to a native library name. For example, a Solaris system converts the name pkg_Cls to libpkg_Cls.so, while a Win32 system converts the same pkg_Cls name to pkg_Cls.dll.
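The two conversions mentioned above can be written out as a small C helper. This is only an illustration of the naming convention: map_library_name and the Platform enum are hypothetical names, and only the two platforms named in the text are covered.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical platform tag covering only the two systems named in the text. */
typedef enum { PLATFORM_SOLARIS, PLATFORM_WIN32 } Platform;

/* Writes the platform-specific file name for a library name into out.
   Returns 0 on success, -1 if the buffer (cap bytes) is too small. */
int map_library_name(Platform platform, const char *name, char *out, size_t cap) {
    assert(name != NULL && out != NULL);
    const char *prefix = (platform == PLATFORM_WIN32) ? "" : "lib";
    const char *suffix = (platform == PLATFORM_WIN32) ? ".dll" : ".so";
    if (strlen(prefix) + strlen(name) + strlen(suffix) + 1 > cap) return -1;
    snprintf(out, cap, "%s%s%s", prefix, name, suffix);
    return 0;
}
```

On the Java side the same mapping is available through System.mapLibraryName, so application code rarely needs to build these names by hand.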

The programmer may use a single library to store all the native methods needed by any number of classes, as long as these classes are loaded with the same class loader. The VM internally maintains a list of loaded native libraries for each class loader. Vendors should choose native library names that minimize the chance of name clashes.

If the underlying operating system does not support dynamic linking, all native methods must be linked to the VM in advance. In this case, the VM completes the System.loadLibrary call without actually loading the library.

The programmer can also call the JNI function RegisterNatives() to register the native methods associated with the class. The RegisterNatives() function is particularly useful for statically linked functions.

3. Resolving native method names

The dynamic linker resolves entries based on their names. A native method name is formed by concatenating the following components:

  • The prefix Java_
  • A mangled fully qualified class name
  • An underscore ("_") separator
  • A mangled method name
  • For overloaded native methods, two underscores ("__") followed by the mangled argument signature

The VM checks for a method name match among the methods that reside in the native library. It looks first for the short name, that is, the name without the argument signature. It then looks for the long name, which is the name with the argument signature. Programmers need to use the long name only when a native method is overloaded with another native method. A native method that shares its name with a non-native method is not a problem, because non-native methods (Java methods) do not reside in native libraries.

In the following example, the native method g does not have to be linked using the long name, because the other method g is not a native method and therefore is not in the native library.

    class Cls1 {
        int g(int i) { return i; }   // ordinary Java method (placeholder body)
        native int g(double d);      // native method, resolved by its short name
    }

We adopted a simple name-mangling scheme to ensure that all Unicode characters translate into valid C function names. We use the underscore ("_") character as the substitute for the slash ("/") in fully qualified class names. Since a name or type descriptor never begins with a number, we can use _0, ..., _9 for escape sequences, as the following table shows:

Escape sequence    Denotes
_0XXXX             A Unicode character XXXX. Note that lower case is used to represent non-ASCII Unicode characters, for example, _0abcd as opposed to _0ABCD.
_1                 The character "_"
_2                 The character ";" in signatures
_3                 The character "[" in signatures
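The mangling rules can be tried out with a short C routine. This is a sketch under stated assumptions, not the VM's own implementation: it handles ASCII input only, expects the class name in slash-separated form, and the function names (append_raw, append_mangled, jni_native_name) are hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Appends text to the NUL-terminated string in dst (capacity cap bytes). */
static int append_raw(char *dst, size_t cap, const char *text) {
    size_t used = strlen(dst), len = strlen(text);
    if (used + len + 1 > cap) return -1;
    memcpy(dst + used, text, len + 1);
    return 0;
}

/* Appends the mangled form of src. ASCII only: other bytes are rejected. */
static int append_mangled(char *dst, size_t cap, const char *src) {
    for (; *src != '\0'; ++src) {
        unsigned char c = (unsigned char)*src;
        char piece[8];
        if (c > 127) return -1;                     /* limitation of this sketch */
        if (isalnum(c))    { piece[0] = (char)c; piece[1] = '\0'; }
        else if (c == '/') { strcpy(piece, "_"); }  /* package separator */
        else if (c == '_') { strcpy(piece, "_1"); }
        else if (c == ';') { strcpy(piece, "_2"); }
        else if (c == '[') { strcpy(piece, "_3"); }
        else { snprintf(piece, sizeof piece, "_0%04x", (unsigned)c); }
        if (append_raw(dst, cap, piece) != 0) return -1;
    }
    return 0;
}

/* Builds Java_<class>_<method>; appends __<args> when args is not NULL.
   Returns 0 on success, -1 on overflow or non-ASCII input. */
int jni_native_name(char *dst, size_t cap, const char *cls,
                    const char *method, const char *args) {
    assert(dst != NULL && cls != NULL && method != NULL);
    if (cap == 0) return -1;
    dst[0] = '\0';
    if (append_raw(dst, cap, "Java_") != 0) return -1;
    if (append_mangled(dst, cap, cls) != 0) return -1;
    if (append_raw(dst, cap, "_") != 0) return -1;
    if (append_mangled(dst, cap, method) != 0) return -1;
    if (args != NULL) {
        if (append_raw(dst, cap, "__") != 0) return -1;
        if (append_mangled(dst, cap, args) != 0) return -1;
    }
    return 0;
}
```

For class pkg/Cls, method f and argument signature ILjava/lang/String; the routine produces Java_pkg_Cls_f__ILjava_lang_String_2. Passing NULL for the signature yields the short name Java_pkg_Cls_f.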

Both native methods and interface APIs follow the standard library calling conventions on a given platform. For example, UNIX systems use the C calling convention, while Win32 systems use __stdcall.

4. Native method arguments

The JNI interface pointer is the first argument to a native method. It is of type JNIEnv. The second argument differs depending on whether the native method is static or non-static. For a non-static native method, the second argument is a reference to the object. For a static native method, it is a reference to the method's Java class. The remaining arguments correspond to the regular Java method arguments. The native method call passes its result back to the calling routine through the return value. The following chapters describe the mapping between Java and C types.

The following code example shows how to implement the native method f with a C function. The native method f is declared as follows:

    package pkg;

    class Cls {
        native double f(int i, String s);
        ...
    }

The C function with the long mangled name Java_pkg_Cls_f__ILjava_lang_String_2 implements the native method f:

Implementing a native method in C

    #include <jni.h>

    jdouble Java_pkg_Cls_f__ILjava_lang_String_2(
        JNIEnv *env,   /* interface pointer */
        jobject obj,   /* "this" pointer */
        jint i,        /* argument #1 */
        jstring s)     /* argument #2 */
    {
        /* Obtain a C-copy of the Java string */
        const char *str = (*env)->GetStringUTFChars(env, s, 0);

        /* process the string */
        ...

        /* Now we are done with str */
        (*env)->ReleaseStringUTFChars(env, s, str);

        return ...
    }

Note that we always use the interface pointer env to manipulate Java objects. Using C++, you can write a slightly cleaner version of the code, as shown in the following code example:

Implementing a native method in C++

    #include <jni.h>

    extern "C" /* specify the C calling convention */
    jdouble Java_pkg_Cls_f__ILjava_lang_String_2(
        JNIEnv *env,   /* interface pointer */
        jobject obj,   /* "this" pointer */
        jint i,        /* argument #1 */
        jstring s)     /* argument #2 */
    {
        const char *str = env->GetStringUTFChars(s, 0);

        ...

        env->ReleaseStringUTFChars(s, str);

        return ...
    }

In C++, the extra level of indirection and the interface pointer argument disappear from the source code. However, the underlying mechanism is exactly the same as in C. In C++, JNI functions are defined as inline member functions that expand to their C counterparts.

5. Referencing Java objects

Primitive types, such as integers and characters, are copied between Java and native code. Arbitrary Java objects, on the other hand, are passed by reference. The VM must keep track of all objects that have been passed to native code, so that the garbage collector does not free them. The native code, in turn, must have a way to inform the VM that it no longer needs an object. In addition, the garbage collector must be able to move an object referred to by native code.

5.1 Global and local references

JNI divides the object references used by native code into two categories: local references and global references. A local reference is valid for the duration of a native method call and is freed automatically after the native method returns. A global reference remains valid until it is explicitly freed.

Objects are passed to native methods as local references. All Java objects returned by JNI functions are local references. JNI allows the programmer to create global references from local references. JNI functions that expect Java objects accept both global and local references. A native method may return either a local or a global reference to the VM as its result.

In most cases, the programmer should rely on the VM to free all local references after the native method returns. However, there are times when the programmer should free a local reference explicitly. Consider, for example, the following situations:

  • A native method accesses a large Java object, thereby creating a local reference to it. The native method then performs additional computation before returning to the caller. The local reference to the large Java object prevents the object from being garbage collected, even if the object is no longer used in the remainder of the computation.
  • A native method creates a large number of local references, although not all of them are used at the same time. Since the VM needs a certain amount of space to keep track of each local reference, creating too many local references may cause the system to run out of memory. For example, a native method loops through a large array of objects, retrieves the elements as local references, and operates on one element in each iteration. After each iteration, the programmer no longer needs the local reference to the array element.

JNI allows the programmer to manually delete local references at any point within a native method. To ensure that programmers can free local references manually, JNI functions are not allowed to create extra local references, except for references they return as their result.

Local references are only valid in the thread that created them. Native code cannot pass local references from one thread to another.

5.2 Implementing local references

To implement local references, the JVM creates a registry for each transition of control from Java to a native method. The registry maps non-movable local references to Java objects and keeps the objects from being garbage collected. All Java objects passed to the native method (including those returned as the result of JNI function calls) are automatically added to the registry. The registry is deleted after the native method returns, allowing all of its entries to be garbage collected.

There are different ways to implement the registry, such as using a table, a linked list, or a hash table. Although reference counting can be used to avoid duplicates in the registry, the JNI implementation is not obliged to detect and collapse duplicates.

Note that local references cannot be faithfully implemented by conservatively scanning the native stack. Native code may store local references into global or heap data structures.
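One way to picture such a registry is a small table of slots, where the address of a slot serves as the non-movable local reference and the slot's content is the (movable) object pointer. The sketch below is a toy model with a fixed capacity; LocalFrame, new_local_ref, delete_local_ref and pop_local_frame are hypothetical names, not JNI functions.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_CAPACITY 16   /* hypothetical fixed size; a real VM may grow it */

/* Toy registry created for one Java-to-native transition. */
typedef struct {
    void *slots[FRAME_CAPACITY];  /* object kept alive by each slot; NULL = free */
    size_t live;                  /* number of occupied slots */
} LocalFrame;

/* Registers obj and returns the slot address, which plays the role of the
   local reference. Returns NULL when the frame is full. */
void **new_local_ref(LocalFrame *frame, void *obj) {
    assert(frame != NULL && obj != NULL);
    for (size_t i = 0; i < FRAME_CAPACITY; ++i) {
        if (frame->slots[i] == NULL) {
            frame->slots[i] = obj;
            frame->live++;
            return &frame->slots[i];
        }
    }
    return NULL;
}

/* Frees one slot early, so the object no longer counts as reachable from here. */
void delete_local_ref(LocalFrame *frame, void **ref) {
    assert(frame != NULL);
    if (ref != NULL && *ref != NULL) {
        *ref = NULL;
        frame->live--;
    }
}

/* Models the native method returning: every remaining entry is released. */
void pop_local_frame(LocalFrame *frame) {
    assert(frame != NULL);
    for (size_t i = 0; i < FRAME_CAPACITY; ++i) frame->slots[i] = NULL;
    frame->live = 0;
}
```

A loop that registers one object per iteration and deletes the reference before the next iteration never occupies more than one slot. Without the explicit delete, the seventeenth registration in this model fails, which mirrors the out-of-memory risk described in section 5.1.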

6. Accessing Java objects

JNI provides a rich set of accessor functions for global and local references. This means that no matter how the Java object is represented inside the VM, the same native method implementation will work. This is a key reason why various VM implementations support JNI.

The overhead of using accessor functions through opaque references is higher than that of direct access to C data structures. We believe that, in most cases, Java programmers use native methods to perform nontrivial tasks that overshadow the overhead of this interface.

6.1 Accessing primitive arrays

This overhead is not acceptable for large Java objects that contain many primitive data types, such as integer arrays and strings. (Consider native methods that perform vector and matrix calculations.) Iterating through a Java array and retrieving every element with a function call would be very inefficient.

One solution introduces the notion of "pinning", so that a native method can ask the VM to pin down the contents of an array. The native method then receives a direct pointer to the elements. This approach, however, has two implications:

  • The garbage collector must support pinning.
  • The VM must lay out primitive arrays contiguously in memory. Although this is the most natural implementation for most primitive arrays, boolean arrays can be implemented as packed or unpacked. Therefore, native code that relies on the exact layout of boolean arrays is not portable.

We adopted a compromise method to overcome the above two problems.

First, we provide a set of functions to copy primitive array elements between a segment of a Java array and a native memory buffer. Use these functions if a native method needs access to only a small number of elements in a large array.

Second, programmers can use another set of functions to retrieve a pinned-down version of the array elements. Keep in mind that these functions may require the JVM to perform storage allocation and copying. Whether these functions in fact copy the array depends on the VM implementation, as follows:

  • If the garbage collector supports pinning, and the layout of the array is the same as expected by the native method, no copying is required.
  • Otherwise, the array is copied to a non-movable memory block (for example, in the C heap) and the necessary format conversion is performed. A pointer to the copy is returned.

Lastly, the interface provides functions to inform the VM that the native code no longer needs to access the array elements. When these functions are called, the system either unpins the array, or reconciles the original array with its non-movable copy and frees the copy.

Our approach provides flexibility. A garbage collector algorithm can make separate decisions about copying or pinning for each given array. For example, the garbage collector may copy small objects, but pin large ones.

A JNI implementation must ensure that native methods running in multiple threads can access the same array simultaneously. For example, the JNI may keep an internal counter for each pinned array, so that one thread does not unpin an array that is also pinned by another thread. Note that the JNI does not need to lock primitive arrays for exclusive access by a native method. Updating a Java array from different threads at the same time leads to nondeterministic results.
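The per-array counter mentioned above might look like the following C11 sketch. It only models the counting; how a VM actually pins memory is outside its scope, and PinState, pin_array, unpin_array and is_pinned are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical per-array bookkeeping: how many pins are outstanding. */
typedef struct {
    atomic_int pin_count;
} PinState;

void pin_state_init(PinState *s) {
    atomic_init(&s->pin_count, 0);
}

/* Returns true if this call is the one that made a movable array pinned. */
bool pin_array(PinState *s) {
    return atomic_fetch_add(&s->pin_count, 1) == 0;
}

/* Returns true if this call released the last pin, so the collector may
   move the array again. Earlier unpins leave the array pinned. */
bool unpin_array(PinState *s) {
    int previous = atomic_fetch_sub(&s->pin_count, 1);
    assert(previous > 0);   /* unpin without a matching pin is a bug */
    return previous == 1;
}

bool is_pinned(PinState *s) {
    return atomic_load(&s->pin_count) > 0;
}
```

If two threads each pin the same array, the first unpin_array call returns false and the array stays pinned; only the second call returns true. The counter protects the pin itself, not the element data, which is consistent with the note that concurrent updates to the array remain nondeterministic.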

6.2 Accessing fields and methods

JNI allows native code to access the fields and call the methods of Java objects. JNI identifies methods and fields by their symbolic names and type signatures. A two-step process factors out the cost of locating the field or method from its name and signature. For example, to call the method f in class cls, the native code first obtains a method ID, as follows:

    jmethodID mid = env->GetMethodID(cls, "f", "(ILjava/lang/String;)D");

Then, the native code can reuse the method ID as follows:

    jdouble result = env->CallDoubleMethod(obj, mid, 10, str);

A field or method ID does not prevent the VM from unloading the class from which the ID was derived. After the class is unloaded, the method or field ID becomes invalid. Therefore, native code that intends to use a method or field ID for an extended period of time must make sure to:

  • Keep a live reference to the underlying class, or
  • Recompute the method or field ID.

JNI does not impose any restrictions on how to implement field and method IDs internally.

7. Reporting programming errors

JNI does not check for programming errors, such as passing NULL pointers or illegal parameter types. Illegal parameter types include the use of ordinary Java objects instead of Java class objects. For the following reasons, JNI does not check for these programming errors:

  • Forcing the JNI function to check all possible error conditions will reduce the performance of normal (correct) native methods.
  • In many cases, there is not enough runtime type information to perform such checks.

Most C library functions do not guard against programming errors. The printf() function, for example, usually causes a runtime error rather than returning an error code when it receives an invalid address. Forcing C library functions to check for all possible error conditions would likely result in such checks being duplicated: once in the user code, and then again in the library.

Programmers must not pass illegal pointers or parameters of the wrong type to JNI functions. Doing so may cause arbitrary consequences, including system state corruption or VM crashes.

8. Java exceptions

JNI allows native methods to raise arbitrary Java exceptions. Native code may also handle outstanding Java exceptions. Java exceptions that are left unhandled are propagated back to the VM.

8.1 Exceptions and error codes

Some JNI functions use the Java exception mechanism to report error conditions. In most cases, JNI functions report error conditions by returning error codes and throwing Java exceptions. The error code is usually a special return value (such as NULL), which is outside the range of the normal return value. Therefore, programmers can:

  • Quickly check the return value of the previous JNI call to determine whether an error has occurred, and
  • Call the function ExceptionOccurred() to obtain the exception object, which contains a more detailed description of the error condition.

There are two cases in which the programmer needs to check for exceptions without being able to check an error code first:

  • JNI functions that invoke a Java method return the result of the Java method. The programmer must call ExceptionOccurred() to check for exceptions that may have occurred during the execution of the Java method.
  • Some JNI array access functions do not return error codes, but may throw ArrayIndexOutOfBoundsException or ArrayStoreException.

In all other cases, a non-error return value guarantees that no exception has been thrown.

8.2 Asynchronous exceptions

When multiple threads are running, threads other than the current thread may post an asynchronous exception. An asynchronous exception does not immediately affect the execution of native code in the current thread, until:

  • Native code calls one of the JNI functions that could raise synchronous exceptions, or
  • Native code uses ExceptionOccurred() to explicitly check for synchronous and asynchronous exceptions.

Please note that only those JNI functions that may cause synchronous exceptions will check for asynchronous exceptions.

Native methods should insert ExceptionOccurred() checks where necessary (for example, in a tight loop without other exception checks) to ensure that the current thread responds to asynchronous exceptions within a reasonable time.

8.3 Exception handling

There are two ways to handle exceptions in native code:

  • The native method can choose to return immediately, causing an exception to be thrown in the Java code that initiated the native method call.
  • Native code can clear the exception by calling ExceptionClear(), and then execute its own exception handling code.

After an exception has been raised, the native code must first clear the exception before making other JNI calls. While an exception is pending, the only JNI functions that are safe to call are:

    ExceptionOccurred()
    ExceptionDescribe()
    ExceptionClear()
    ExceptionCheck()
    ReleaseStringChars()
    ReleaseStringUTFChars()
    ReleaseStringCritical()
    Release<Type>ArrayElements()
    ReleasePrimitiveArrayCritical()
    DeleteLocalRef()
    DeleteGlobalRef()
    DeleteWeakGlobalRef()
    MonitorExit()
    PushLocalFrame()
    PopLocalFrame()