Remote Screen Sharing And Controlling Application: Technical Issues and How I Overcame Them. Part-1
As software engineers, we face numerous difficulties, some of which can be easily solved, while others may require considerable effort. We are always striving to be proactive and productive in our work — it is our life and our duty. :)
Let’s focus on the topic >-<
It has been a while since I last worked on Android development, as my current focus is on backend development. However, I believe it would be beneficial to make some notes regarding the challenges I encountered during a previous Android project. In this discussion, I will provide a brief overview of the technical difficulties I faced and the strategies I employed to overcome them. These efforts ultimately led to the successful completion and safe delivery of the project to the client.
TechStack: Kotlin, Java, Flutter, JavaScript (TypeScript)
Database: Firestore
Framework: WebRTC, Socket Programming, Android Management API
Let’s dive into my project details
The project is an enterprise-level Fleet Management System that reduces manual paperwork obligations. The easy-to-use app, which can be accessed from any mobile device, allows operators to efficiently manage their work diary reporting from one tool, from any place at any time. The company can manage fleet status/location, driver status, and work logs for the vehicles.
My task was to develop an application to help non-technical drivers who didn’t know how to log their status (break, lunch, driving, etc.) in the app. I also had to manage the devices remotely. Furthermore, the client had asked whether we could remotely access a device and do the work ourselves, such as clicking on the remote screen and updating work logs for the device.
I had heard of AnyDesk and TeamViewer, which offer remote access and remote control capabilities. If they can do it, I thought, why can’t I? So I started researching remote access and control.
Initially, I researched how to manage devices and came up with the idea of developing a kiosk launcher. A kiosk launcher restricts users to a specific user interface and lets an administrator customize the device by whitelisting or blacklisting specific apps. Since the launcher is almost always running, it needs to be as bug-free as possible to carry out its intended functions effectively. A launcher seemed perfect for this project, as it would allow only specified packages to be installed and could handle ringtones, brightness, screen timeout, and other features needed to customize the device.
One catch is that a kiosk launcher has to be the device owner, which means manually running an adb command against the device during the initial setup:
adb shell dpm set-device-owner com.youpackage.com/.DemoAdminReceiver
While the suggested approach may address the issue to some extent, it does have certain drawbacks. For instance, in the event of a device crash or the launcher being destroyed, executing the command from the device terminal again would be necessary (although this occurrence is rare). Remote device handling does present some challenges, and it’s worth exploring alternative solutions to mitigate these issues.
This is where the Android Management API came in :)
What is Android Management API?
The Android Management API is a set of tools provided by Google to help developers manage and secure Android devices within their organization. The API allows developers to remotely manage and configure devices over-the-air using cloud-based tools and services.
With the Android Management API, developers can perform tasks such as enrolling new devices, configuring device settings, updating device policies, and remotely wiping devices. The API also allows developers to create custom apps and push them to devices, set up and manage Wi-Fi and other network configurations, and restrict access to certain device features.
The Android Management API is intended for developers who need to manage and secure large numbers of Android devices, such as those used in enterprise or educational settings. It provides a streamlined, scalable way to manage devices, making it easier for developers to ensure the security and compliance of their device fleet.
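To make the policy idea concrete, here is a minimal sketch of what a kiosk-style policy body could look like. It only types the handful of Android Management API policy fields it uses (`applications`, `installType`, `factoryResetDisabled`, `statusBarDisabled`); the helper name `buildKioskPolicy` and the package names are hypothetical, and actually sending the policy to the API is not shown.

```typescript
// Small, locally typed subset of an Android Management API policy.
// Only the fields used in this sketch are declared here.
type InstallType = "FORCE_INSTALLED" | "KIOSK" | "BLOCKED";

interface ApplicationPolicy {
  packageName: string;
  installType: InstallType;
}

interface DevicePolicy {
  applications: ApplicationPolicy[];
  factoryResetDisabled: boolean;
  statusBarDisabled: boolean;
}

// Hypothetical helper: one app runs as the kiosk app, the rest of the
// whitelist is force-installed so drivers can still reach those apps.
function buildKioskPolicy(kioskPackage: string, allowedPackages: string[]): DevicePolicy {
  const others = allowedPackages
    .filter((pkg) => pkg !== kioskPackage)
    .map((pkg): ApplicationPolicy => ({ packageName: pkg, installType: "FORCE_INSTALLED" }));

  return {
    applications: [{ packageName: kioskPackage, installType: "KIOSK" }, ...others],
    factoryResetDisabled: true, // keep drivers from wiping the device
    statusBarDisabled: true,    // hide quick settings and notifications
  };
}
```

For example, `buildKioskPolicy("com.example.fleet.driver", ["com.example.maps"])` would produce a policy with the (made-up) driver app in kiosk mode and the maps app force-installed.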
All we need to do is enroll the device by scanning a QR code :O It replaced the kiosk launcher that I had just made :(
Hmm! Sounds good. But how do we remotely control the device screen? That feature is not provided by the Android Management API.
As mentioned earlier, I conducted research on how AnyDesk and TeamViewer function. I attempted to decompile the AnyDesk application, which gave me a set of .dex files containing executable code for the Android runtime. However, reading the decompiled files is not straightforward, as it often requires going through XML files to gain some understanding. Despite the challenges, even small hints can be highly stimulating during development. Additionally, I came across some block comments within the decompiled code, which provided further insights that I incorporated into my own application. As far as I could tell from this research, both AnyDesk and TeamViewer use WebRTC, so I started learning it, since it facilitates real-time communication.
WebRTC
With WebRTC, you can add real-time communication capabilities to your application that work on top of an open standard. It supports video, voice, and generic data sent between peers, allowing developers to build powerful voice- and video-communication solutions.
How Does WebRTC Work?
- Signaling: Before two devices can establish a connection, they need to exchange information about their session. This signaling process typically involves using a signaling server to exchange metadata, such as network addresses and session descriptions. Signaling can be implemented using various protocols, including WebSocket, HTTP, or other custom solutions.
- Peer-to-Peer Connection Establishment: Once the devices have exchanged the necessary information through signaling, they can establish a direct peer-to-peer connection. WebRTC uses the Interactive Connectivity Establishment (ICE) framework, which employs several techniques to discover and establish the best connection path between peers. ICE uses protocols like STUN (Session Traversal Utilities for NAT) and TURN (Traversal Using Relays around NAT) to traverse firewalls, Network Address Translators (NATs), and other network obstacles.
- Media Stream Capture: WebRTC sends captured audio and video streams over the connection. In a browser, the MediaStream API provides access to the user’s microphone and camera and allows developers to capture and handle those media streams.
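The "best connection path" part of ICE in the list above can be illustrated with the candidate priority formula from RFC 8445. This is only a sketch of the ranking idea, not what a WebRTC stack exposes as an API: the function names are made up for illustration, and the type preference values are the ones the RFC recommends.

```typescript
// Candidate types as they appear after "typ" in an ICE candidate line.
type CandidateType = "host" | "srflx" | "prflx" | "relay";

// Type preferences recommended by RFC 8445: direct (host) paths rank
// highest, relayed (TURN) paths rank lowest.
const TYPE_PREFERENCE: Record<CandidateType, number> = {
  host: 126,
  prflx: 110,
  srflx: 100,
  relay: 0,
};

// priority = 2^24 * typePreference + 2^8 * localPreference + (256 - componentId)
// Multiplication is used instead of bit shifts to avoid 32-bit overflow.
function candidatePriority(
  type: CandidateType,
  localPreference: number = 65535,
  componentId: number = 1
): number {
  return TYPE_PREFERENCE[type] * 16777216 + localPreference * 256 + (256 - componentId);
}

// Pull the candidate type out of a candidate line such as
// "candidate:1 1 udp 1694498815 203.0.113.5 46154 typ srflx raddr 10.0.0.2 rport 46154".
function parseCandidateType(candidateLine: string): CandidateType | null {
  const match = /\btyp (host|srflx|prflx|relay)\b/.exec(candidateLine);
  return match ? (match[1] as CandidateType) : null;
}
```

With the default arguments, a host candidate gets priority 2130706431, a server-reflexive (STUN) candidate 1694498815, and a relayed (TURN) candidate 16777215, which is why a relay is only picked when nothing more direct works.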
I built a server with Node.js for WebRTC and an Android client to share the screen. Because clients on different networks need a TURN server to communicate reliably, I tried to set up my own TURN server on Windows.
TURN Server
Google’s public ICE servers are STUN servers only. They worked fine when both clients were on the same network. However, clients on different networks, especially behind symmetric NATs or strict firewalls, often cannot connect with STUN alone, so you will need to set up your own TURN (Traversal Using Relays around NAT) server. A popular open-source TURN server implementation is coTURN.
By setting up your own TURN server using coTURN or a similar solution, you can enable communication between clients in different networks by relaying their traffic through the server. This lets the clients communicate even when firewalls or NAT devices make a direct connection impossible.
Setting up a TURN server involves configuring coTURN with appropriate network settings and credentials, such as listening IP addresses, ports, and authentication mechanisms. It’s important to ensure that the TURN server is reachable by both clients and properly configured to handle the relayed traffic.
Once the TURN server is set up and running, you will need to configure your client applications to use the TURN server’s credentials (e.g., username, password) when negotiating the ICE candidates. This allows the clients to utilize the TURN server for relaying their traffic when direct communication is not possible.
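As a rough sketch of that client-side configuration, the function below builds the `iceServers` list a peer connection would be given. The `IceServer` interface is a local mirror of the fields of the browser’s `RTCIceServer` dictionary, and `TurnSettings`, `buildIceServers`, the host name, and the credentials are all placeholders rather than values from my project.

```typescript
// Local mirror of the RTCIceServer fields used here, so the sketch
// does not depend on browser type definitions.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

// Hypothetical settings for a self-hosted TURN server (for example coTURN).
interface TurnSettings {
  host: string;
  port: number;
  username: string;
  credential: string;
}

function buildIceServers(turn: TurnSettings): IceServer[] {
  return [
    // Public STUN server: enough when a direct path can be found.
    { urls: "stun:stun.l.google.com:19302" },
    // Self-hosted TURN server: relays traffic when no direct path exists.
    {
      urls: [
        `turn:${turn.host}:${turn.port}?transport=udp`,
        `turn:${turn.host}:${turn.port}?transport=tcp`,
      ],
      username: turn.username,
      credential: turn.credential,
    },
  ];
}
```

In a browser client the result would be passed as `new RTCPeerConnection({ iceServers: buildIceServers(settings) })`. Setting `iceTransportPolicy: "relay"` in the same configuration forces all traffic through TURN, which is a handy way to check that the server really works.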
It’s worth noting that setting up and maintaining a TURN server can have some associated costs and considerations, such as server resources and bandwidth requirements. Additionally, you may need to configure your network infrastructure (e.g., firewalls, routers) to allow the necessary traffic to pass through the TURN server.
Overall, using a self-hosted TURN server like coTURN can facilitate communication between clients in different networks by relaying their traffic, overcoming the limitations posed by network configurations and NAT traversal.
Signaling Server
A signaling server is a crucial component in WebRTC. Its primary purpose is to help peers find each other and exchange the metadata needed to set up a connection. The media itself does not pass through it.
- Connection Establishment: When two peers intend to communicate, they need a way to discover each other before they can establish a direct connection. The signaling server is that meeting point.
- Signaling Protocol: WebRTC does not prescribe a signaling protocol, so the signaling server can use any transport to pass messages between peers, e.g. WebSocket.
- Message Exchange: The peers exchange signaling messages via the signaling server. These messages contain session descriptions, which include information about the media streams (audio, video) and their respective network configurations.
- ICE Candidates: The Interactive Connectivity Establishment (ICE) framework helps establish the most suitable network path for communication. During the signaling process, peers exchange ICE candidates, which describe their network addresses and connectivity options.
- Connection Establishment Completion: Once the peers have exchanged the necessary signaling messages, including session descriptions and ICE candidates, they can establish a direct peer-to-peer connection using the gathered information.
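The relay role described above can be sketched as a tiny in-memory room that forwards offers, answers, and ICE candidates to the other peer. This is not the server from my project: the transport (for example a WebSocket connection per peer) is left out, and `SignalMessage`, `SignalingRoom`, and the peer ids are names invented for this illustration.

```typescript
// The three kinds of signaling messages peers exchange.
type SignalMessage =
  | { kind: "offer"; sdp: string }
  | { kind: "answer"; sdp: string }
  | { kind: "candidate"; candidate: string };

// Callback standing in for "send this over the peer's socket".
type Deliver = (from: string, message: SignalMessage) => void;

class SignalingRoom {
  private peers = new Map<string, Deliver>();

  join(peerId: string, deliver: Deliver): void {
    this.peers.set(peerId, deliver);
  }

  leave(peerId: string): void {
    this.peers.delete(peerId);
  }

  // Forward a message to every other peer in the room and
  // return how many peers received it.
  relay(from: string, message: SignalMessage): number {
    let delivered = 0;
    this.peers.forEach((deliver, peerId) => {
      if (peerId !== from) {
        deliver(from, message);
        delivered++;
      }
    });
    return delivered;
  }
}
```

The server never inspects the SDP or the candidates. It only passes them along, and once both sides have what they need, the media flows peer-to-peer (or through TURN) without touching it.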
Accessibility Service
Accessibility services in Android are primarily designed to assist users with disabilities by providing enhanced functionality and improving the overall accessibility of the device. These services enable users to navigate, interact, and use their Android devices effectively, regardless of their physical or cognitive abilities.
One of the features provided by accessibility services is the ability to perform actions on behalf of the user, such as generating touch events or injecting gestures into the user interface. This functionality is intended to support users who may have difficulty using the standard touch input methods, allowing them to control their devices through alternative means.
However, it’s important to note that the use of accessibility services for purposes other than their intended accessibility functions may not align with the original design principles and guidelines. While certain actions like dispatching gestures through accessibility services may be technically possible, using them for remote event injection or similar purposes deviates from their intended use case.
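To make remote clicking more concrete: before a gesture can be dispatched on the device, a click on the operator’s preview has to be translated into device screen coordinates. The sketch below shows only that scaling step, under the assumption that the preview shows the full device screen with no letterboxing. The `toDevicePoint` name and the types are hypothetical, and the Android side (handing the point to the accessibility service’s `dispatchGesture`) is not shown.

```typescript
interface Size {
  width: number;
  height: number;
}

interface Point {
  x: number;
  y: number;
}

// Scale a click on the viewer's preview to the device's pixel grid,
// clamping so the result always lands on the screen.
function toDevicePoint(viewerPoint: Point, viewerSize: Size, deviceSize: Size): Point {
  const clamp = (value: number, max: number): number => Math.min(Math.max(value, 0), max);
  return {
    x: clamp(
      Math.round((viewerPoint.x / viewerSize.width) * deviceSize.width),
      deviceSize.width - 1
    ),
    y: clamp(
      Math.round((viewerPoint.y / viewerSize.height) * deviceSize.height),
      deviceSize.height - 1
    ),
  };
}
```

For instance, a click at (180, 320) on a 360x640 preview maps to (540, 960) on a 1080x1920 device.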
I believe that everything is possible if you give your 100%. In the end, I was able to implement the remote screen sharing feature. I was happy! The client was happy too! I will now work on the second part, which covers the code and a demo. Cheers, guys!
Find Second part here