Object Detection and Tracking in Android (Native C++) - Implementation Part 2

Ramesh Pokhrel
6 min read · Aug 18, 2023


This is the high-level implementation part, continuing from the previous post.

In the previous part, I mainly discussed the theoretical topics that I use for detecting and tracking objects.

In this blog, I will try to cover all the concepts I used throughout this journey.

The implementation is on Android using the NDK and C++. Most of the logic is implemented in native C++, so it's essential to have basic knowledge of the C++ programming language, including pointers and vectors. You should also understand what C++ source and header files are.

Header files (.h) contain declarations, usually of functions or classes, while source files (.cpp) contain the implementations that actually get compiled. The implementation of your algorithm lives in a source file; the header that declares it is pulled into other source files with the #include preprocessor directive.
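As a minimal illustration (with hypothetical names), the declaration lives in the header and its definition lives in the source file:

// detector.h -- declares the function for anyone who #includes this header
#ifndef DETECTOR_H
#define DETECTOR_H

int add(int a, int b);

#endif

// detector.cpp -- defines the function; this file gets compiled
#include "detector.h"

int add(int a, int b) {
    return a + b;
}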

Prepare Android Project

Create an Android project with NDK support, add a "jni" folder containing a "CMakeLists.txt" file, and reference it in the app-level "build.gradle" file.

...
android {
    ...
    externalNativeBuild {
        cmake {
            path file('jni/CMakeLists.txt')
            version '3.18.1'
        }
    }
    ...
}
...

The CMakeLists.txt file declares all the included libraries and the source files in our jni folder.

This project will incorporate OpenCV, NCNN (android-vulkan build), and YOLOv7. I will use Git submodules to integrate them into my project, which helps keep my repository size small. OpenCV handles computer vision and image processing tasks. NCNN with Vulkan is a high-performance neural network inference framework, used here in place of TensorFlow Lite; it also provides utilities for image rotation, transformation, and pixel-format conversion. YOLOv7 is the object detection model.

Add the necessary submodules:

git submodule add --name ncnn-android-vulkan-2023 https://github.com/kanxoramesh/ncnn-20230517-android-vulkan.git app/jni/ncnn-android-vulkan-2023
git submodule add --name opencv https://github.com/opencv/opencv.git opencv

This action will include the `/opencv` and `app/jni/ncnn-android-vulkan-2023` folders in the project. You can review the `.gitmodules` file to verify the location and repository URLs of your modules.

Update CMakeLists

Incorporate all necessary dependency libraries into the CMake configuration. Integrate your source files and establish the required linking.

cmake_minimum_required(VERSION 3.10)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -Werror -fno-exceptions -frtti")
set(target library-name)
project(${target} CXX)

# ONNX Runtime: prebuilt per-ABI shared library, imported into the build
set(ONNXRUNTIME_DIR ${CMAKE_SOURCE_DIR}/onnxruntime/${ANDROID_ABI})
add_library(libonnxruntime SHARED IMPORTED)
set_target_properties(libonnxruntime PROPERTIES IMPORTED_LOCATION ${ONNXRUNTIME_DIR}/lib/libonnxruntime.so)
include_directories(${ONNXRUNTIME_DIR}/include)

# OpenCV (from the opencv submodule)
set(ANDROID_OPENCV_COMPONENTS "opencv_java" CACHE STRING "")
message(STATUS "ANDROID_ABI=${ANDROID_ABI}")
find_package(OpenCV REQUIRED COMPONENTS ${ANDROID_OPENCV_COMPONENTS})

# Source and header files in this folder
file(GLOB srcs *.cpp *.c)
file(GLOB hdrs *.hpp *.h)

include_directories("${CMAKE_CURRENT_LIST_DIR}")
add_library(${target} SHARED ${srcs} ${hdrs} yoloncnn.cpp yolo/yolo.cpp ndkcamera.cpp)

find_library(log-lib log)                 # Logging library required by the NDK
find_library(android-lib android)         # For AssetManager functionality
find_library(jnigraphics-lib jnigraphics)

# ncnn is provided by the ncnn-android-vulkan submodule
target_link_libraries(${target} ${ANDROID_OPENCV_COMPONENTS} ncnn camera2ndk mediandk libonnxruntime ${log-lib} ${android-lib} ${jnigraphics-lib})

Assets

Place the model files inside the Android assets folder:

yolov7-tiny.bin
yolov7-tiny.param
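These files are loaded from the APK at runtime through the AAssetManager. ncnn's Net can read .param and .bin files directly from Android assets; below is a minimal sketch of that loading step (the helper name is hypothetical, and the real Yolo::load in this project takes more configuration parameters):

#include <android/asset_manager.h>
#include "net.h" // ncnn

// Sketch: load the YOLOv7-tiny model from the APK assets.
// In the real project the ncnn::Net is a member of the Yolo class.
int load_yolov7_tiny(AAssetManager *mgr, ncnn::Net &net, bool use_gpu)
{
    net.opt.use_vulkan_compute = use_gpu; // run inference on the GPU via Vulkan
    if (net.load_param(mgr, "yolov7-tiny.param") != 0)
        return -1;
    if (net.load_model(mgr, "yolov7-tiny.bin") != 0)
        return -1;
    return 0;
}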

Native Code

The Java Native Interface (JNI) is used to invoke native C++ code from Android Kotlin. I am writing a Kotlin class containing the external function declarations for the JNI methods. This class acts as the central link between Kotlin and C++, housing all the essential functions.


class NcnnYolov7(private val listener: NativeListener) {

    /**
     * Called from the NDK to update the object counts.
     * Every time an unmatched object is found, we need to update
     * the counts in the bottom-sheet view in Kotlin.
     */
    fun countUpdate(counts: IntArray) {
        if (counts.isNotEmpty())
            listener.onCountUpdate(counts)
    }

    /**
     * Load the model with configuration parameters.
     * mgr: AssetManager reference used to access assets from C++
     * modelid: asset path of the YOLO model
     * cpugpu: selects CPU (0) or GPU (1)
     * conf: minimum acceptable confidence
     * trackingFrame: number of frames used to track detected objects after they have been detected in one frame
     */
    external fun loadModel(
        mgr: AssetManager?,
        modelid: Int,
        cpugpu: Int,
        conf: Float,
        trackingFrame: Int,
    ): Boolean

    /**
     * Open the NDK camera with the given facing, i.e. front (0) or back (1).
     */
    external fun openCamera(facing: Int): Boolean

    /**
     * Close the camera.
     */
    external fun closeCamera(): Boolean

    /**
     * Change the camera surface.
     */
    external fun setOutputWindow(surface: Surface?): Boolean

    /**
     * Get events from the NDK: a list of detection object descriptions.
     */
    external fun getEvents(): Array<Event>

    companion object {
        init {
            System.loadLibrary("opencvml")
        }
    }
}

/**
 * Listener to update object counts.
 */
interface NativeListener {
    fun onCountUpdate(counts: IntArray)
}

Develop the native C++ functions within the `/app/jni/yoloncnn.cpp` file. For the `openCamera` function, we define the native implementation as follows:

JNIEXPORT jboolean JNICALL
Java_com_ml_ramesh_camera_NcnnYolov7_openCamera(JNIEnv *env, jobject thiz, jint facing) {
    if (facing < 0 || facing > 1)
        return JNI_FALSE;

    __android_log_print(ANDROID_LOG_DEBUG, "ncnn", "openCamera %d", facing);
    cameraThread = std::thread(cameraThreadd, env, (int) facing);
    return JNI_TRUE;
}
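The body of cameraThreadd is not shown in this post. As a rough sketch of what it might do, assuming a global g_camera instance of the camera wrapper described in the next section (with an open(facing) method, as in the ncnn Android examples) and the g_vm JavaVM pointer used later for the count callback:

// Hypothetical sketch of the camera thread body; g_camera and g_vm are assumed globals.
static void cameraThreadd(JNIEnv * /*callerEnv*/, int facing)
{
    // A JNIEnv is only valid on the thread it was obtained on, so this
    // thread attaches to the JVM and gets its own environment.
    JNIEnv *env = nullptr;
    g_vm->AttachCurrentThread(&env, nullptr);

    // Open the NDK camera (0 = front, 1 = back); frames are then delivered
    // to NdkCameraWindow::on_image as NV21 buffers.
    g_camera->open(facing);
}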

NDK Camera

This provides access to the camera's lower-level functionality from C++ code, enabling efficient processing and manipulation of camera data. I transitioned from Android's Camera2 and CameraX APIs to the NDK camera for enhanced performance in critical areas. It is used to capture frames, process images, and apply transformations to frames.

class NdkCameraWindow : public NdkCamera
{
public:
    NdkCameraWindow();
    virtual ~NdkCameraWindow();

    void set_window(ANativeWindow* win);
    virtual void on_image_render(cv::Mat& rgb) const;
    virtual void on_image(const unsigned char* nv21, int nv21_width, int nv21_height) const;
};
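Camera frames arrive in on_image as NV21 data. One common way to convert them to an RGB cv::Mat with OpenCV is shown below; this is a sketch of the idea, not this project's exact code:

#include <opencv2/imgproc/imgproc.hpp>

// Sketch: convert an NV21 camera frame to RGB for the detector.
void NdkCameraWindow::on_image(const unsigned char* nv21, int nv21_width, int nv21_height) const
{
    // NV21 stores a full-resolution Y plane followed by interleaved VU data,
    // so the buffer is height * 3/2 rows of single-byte values.
    cv::Mat yuv(nv21_height + nv21_height / 2, nv21_width, CV_8UC1, (void*) nv21);
    cv::Mat rgb;
    cv::cvtColor(yuv, rgb, cv::COLOR_YUV2RGB_NV21);

    on_image_render(rgb); // hand the frame to detection and drawing
}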

Object Detection with YOLOv7

The YOLOv7 model is used in conjunction with the NCNN android-vulkan high-performance neural network inference framework for object detection. The Yolo class contains the methods responsible for loading the model, detecting objects, and rendering bounding boxes on the canvas.

#ifndef YOLO_H
#define YOLO_H

#include <vector>
#include <opencv2/core/core.hpp>

// A single detection: bounding box, class label, and confidence.
struct Object {
    cv::Rect_<float> rect;
    int label;
    float prob;
};

class Yolo {
public:
    Yolo();

    int load();
    int detect(cv::Mat &rgb, std::vector<Object> &objects);
    int draw(cv::Mat &rgb, const std::vector<Object> &objects);
};

#endif // YOLO_H

Load the model with the required configuration parameters before starting detection.

if (!g_yolo)
    g_yolo = new Yolo;
g_yolo->load(assetManager, modeltype, target_size, norm_vals[(int) modelid], isGpuEnabled, confidence);

Every time we grab a frame from the NDK camera, we send it to the YOLO detector, which returns a list of detected objects with their bounding boxes, confidence scores, and labels. We then draw the bounding boxes on the frame.

std::vector<Object> objects;
g_yolo->detect(rgb, objects);
g_yolo->draw(rgb, objects);
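The draw step itself is plain OpenCV. A minimal sketch of rendering each detection follows; it is not this project's exact rendering code, but it shows the same idea using the Object fields from the header above:

#include <cstdio>
#include <opencv2/imgproc/imgproc.hpp>

// Sketch: draw a bounding box and "label confidence%" text for each detection.
int Yolo::draw(cv::Mat &rgb, const std::vector<Object> &objects)
{
    for (const Object &obj : objects)
    {
        cv::rectangle(rgb, obj.rect, cv::Scalar(0, 255, 0), 2);

        char text[64];
        snprintf(text, sizeof(text), "%d %.1f%%", obj.label, obj.prob * 100);
        cv::putText(rgb, text, cv::Point((int) obj.rect.x, (int) obj.rect.y - 4),
                    cv::FONT_HERSHEY_SIMPLEX, 0.5, cv::Scalar(255, 255, 255));
    }
    return 0;
}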

OpenCV Object Tracker

We ran detection with YOLOv7 using the NCNN android-vulkan framework and got a list of detections. Now it is time to track them across subsequent frames. One problem is how to assign an id to newly detected objects; this assignment problem can be solved with the Hungarian Algorithm.

I calculate the cost matrix for this problem using Euclidean distance; you can learn more from another post of mine on how to calculate the cost matrix. Based on the results from the Hungarian Algorithm, update the existing tracks and initialize new tracks for new objects, as sketched after the function below.

std::tuple<std::vector<std::pair<int, int>>, std::set<int>, std::set<int>>
hungarian_algorithm(vector<pair<cv::Point, int>> input_centroids, double max_distance,
                    vector<pair<cv::Point, int>> tracks_centroids) {
    ...
    vector<vector<double>> costmatrix(tracks_centroids.size(),
                                      vector<double>(input_centroids.size()));
    ...
    return std::make_tuple(matches, unmatched_tracks, unmatched_detections);
}
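For reference, the Euclidean cost between a track centroid and a detection centroid is just their straight-line distance. Below is a self-contained sketch of filling the cost matrix; the helper name is hypothetical and this is not the project's exact code:

#include <cmath>
#include <utility>
#include <vector>
#include <opencv2/core/core.hpp>

using std::pair;
using std::vector;

// Sketch: build the cost matrix where entry [i][j] is the Euclidean
// distance between track centroid i and detection centroid j.
vector<vector<double>> build_cost_matrix(
        const vector<pair<cv::Point, int>> &tracks_centroids,
        const vector<pair<cv::Point, int>> &input_centroids)
{
    vector<vector<double>> costmatrix(tracks_centroids.size(),
                                      vector<double>(input_centroids.size()));
    for (size_t i = 0; i < tracks_centroids.size(); i++) {
        for (size_t j = 0; j < input_centroids.size(); j++) {
            double dx = tracks_centroids[i].first.x - input_centroids[j].first.x;
            double dy = tracks_centroids[i].first.y - input_centroids[j].first.y;
            costmatrix[i][j] = std::sqrt(dx * dx + dy * dy);
        }
    }
    return costmatrix;
}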

This function will return a tuple of matches, an unmatched_tracks list, and an unmatched_detections list. For the matches, we need to update the tracker. For unmatched_tracks, we will check their time to live and mark them as expired if they meet the condition. For unmatched_detections, new tracks should be created for each of them.

for (auto item : matches) {
    // update the existing track
}

for (auto item : unmatched_tracks) {
    // update the time-to-live parameter for this track
}

for (auto item : unmatched_detections) {
    // initialize a new track
}
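To make the three branches concrete, here is a sketch built around a hypothetical Track struct. The field names, the (track id, detection index) layout of matches, and the TTL handling are assumptions; the real project's bookkeeping may differ:

#include <map>
#include <set>
#include <utility>
#include <vector>
#include <opencv2/core/core.hpp>

// Hypothetical track record; the real project's fields may differ.
struct Track {
    int id;             // stable id assigned at creation
    cv::Point centroid; // last known position
    int ttl;            // frames to live without a match before expiring
};

void update_tracks(std::map<int, Track> &tracks,
                   const std::vector<std::pair<int, int>> &matches, // (track id, detection index)
                   const std::set<int> &unmatched_tracks,
                   const std::vector<cv::Point> &detections,
                   const std::set<int> &unmatched_detections,
                   int &next_id, int tracking_frames)
{
    // Matched pairs: refresh the track's position and reset its TTL.
    for (const auto &m : matches) {
        tracks[m.first].centroid = detections[m.second];
        tracks[m.first].ttl = tracking_frames;
    }
    // Unmatched tracks: age them out; erase once the TTL is spent.
    for (int t : unmatched_tracks) {
        if (--tracks[t].ttl <= 0)
            tracks.erase(t);
    }
    // Unmatched detections: start a new track with a fresh id and full TTL.
    for (int d : unmatched_detections) {
        tracks[next_id] = Track{next_id, detections[d], tracking_frames};
        ++next_id;
    }
}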

Count the Objects

We increment the count for each new object from the unmatched_detections list, and we use JNI to transfer these counts from C++ to Kotlin.

void countObject(std::map<int, int> countmap) {
    if (!countmap.empty()) {
        g_vm->AttachCurrentThread(&tls_env, nullptr);

        jmethodID updateUIMethod = tls_env->GetMethodID(tls_env->GetObjectClass(g_obj),
                                                        "countUpdate",
                                                        "([I)V");

        // Flatten the map into [label, count, label, count, ...] pairs.
        std::vector<int> count;
        for (auto it = countmap.begin(); it != countmap.end(); ++it) {
            count.push_back(it->first);
            count.push_back(it->second);
        }

        // The array must hold every pair, not just the first one.
        jintArray output = tls_env->NewIntArray((jsize) count.size());
        tls_env->SetIntArrayRegion(output, 0, (jsize) count.size(), count.data());
        tls_env->CallVoidMethod(g_obj, updateUIMethod, output);

        g_vm->DetachCurrentThread();
    }
}

This pattern is especially useful: rather than calling native methods from Kotlin, here we are calling a Kotlin method from our C++ code.

Final Demo

In the next part, I will try to cover how I improved the detection and tracking performance using other models.

Download the Android APK file to test it on your device.

Thank you.
